alignment-free evolutionary conservation: Topics by Science.gov

Sample records for alignment-free evolutionary conservation

The impact of age, biogenesis, and genomic clustering on Drosophila microRNA evolution

PubMed Central

Mohammed, Jaaved; Flynt, Alex S.; Siepel, Adam; Lai, Eric C.

2013-01-01

The molecular evolutionary signatures of miRNAs inform our understanding of their emergence, biogenesis, and function. The known signatures of miRNA evolution have derived mostly from the analysis of deeply conserved, canonical loci. In this study, we examine the impact of age, biogenesis pathway, and genomic arrangement on the evolutionary properties of Drosophila miRNAs. Crucial to the accuracy of our results was our curation of high-quality miRNA alignments, which included nearly 150 corrections to ortholog calls and nucleotide sequences of the global 12-way Drosophilid alignments currently available. Using these data, we studied primary sequence conservation, normalized free-energy values, and types of structure-preserving substitutions. We expand upon common miRNA evolutionary patterns that reflect fundamental features of miRNAs that are under functional selection. We observe that melanogaster-subgroup-specific miRNAs, although recently emerged and rapidly evolving, nonetheless exhibit evolutionary signatures that are similar to well-conserved miRNAs and distinct from other structured noncoding RNAs and bulk conserved non-miRNA hairpins. This provides evidence that even young miRNAs may be selected for regulatory activities. More strikingly, we observe that mirtrons and clustered miRNAs both exhibit distinct evolutionary properties relative to solo, well-conserved miRNAs, even after controlling for sequence depth. These studies highlight the previously unappreciated impact of biogenesis strategy and genomic location on the evolutionary dynamics of miRNAs, and affirm that miRNAs do not evolve as a unitary class. PMID:23882112
The ConSurf-DB: pre-calculated evolutionary conservation profiles of protein structures

PubMed Central

Goldenberg, Ofir; Erez, Elana; Nimrod, Guy; Ben-Tal, Nir

2009-01-01

ConSurf-DB is a repository for evolutionary conservation analysis of the proteins of known structures in the Protein Data Bank (PDB). Sequence homologues of each of the PDB entries were collected and aligned using standard methods. The evolutionary conservation of each amino acid position in the alignment was calculated using the Rate4Site algorithm, implemented in the ConSurf web server. The algorithm takes into account the phylogenetic relations between the aligned proteins and the stochastic nature of the evolutionary process explicitly. Rate4Site assigns a conservation level for each position in the multiple sequence alignment using an empirical Bayesian inference. Visual inspection of the conservation patterns on the 3D structure often enables the identification of key residues that comprise the functionally important regions of the protein. The repository is updated with the latest PDB entries on a monthly basis and will be rebuilt annually. ConSurf-DB is available online at http://consurfdb.tau.ac.il/ PMID:18971256
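
The abstract above describes per-position conservation scoring with Rate4Site, which is phylogeny-aware and uses empirical Bayesian inference. The sketch below is not Rate4Site and not ConSurf's method; it swaps in a much simpler Shannon-entropy score per alignment column, only to illustrate what "a conservation level for each position" looks like in practice. The function name `column_conservation` and the toy alignment are assumptions made for illustration.

```python
import math
from collections import Counter


def column_conservation(alignment):
    """Score each alignment column: 1.0 = fully conserved, lower = more variable.

    Simplified entropy-based stand-in; it ignores phylogeny entirely.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    max_entropy = math.log2(20)  # 20 standard amino acids
    scores = []
    for col in range(length):
        # Gaps are skipped rather than treated as a residue type.
        residues = [seq[col] for seq in alignment if seq[col] != "-"]
        if not residues:
            scores.append(0.0)
            continue
        total = len(residues)
        entropy = -sum(
            (n / total) * math.log2(n / total)
            for n in Counter(residues).values()
        )
        scores.append(1.0 - entropy / max_entropy)
    return scores


# Hypothetical toy alignment: columns 0 and 3 are invariant.
toy = ["MKVL", "MKIL", "MRVL"]
scores = column_conservation(toy)
```

A score like this treats all sequences as independent, which is exactly the limitation that tree-based methods such as Rate4Site are designed to avoid.
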
A Fast Alignment-Free Approach for De Novo Detection of Protein Conserved Regions

PubMed Central

Abnousi, Armen; Broschat, Shira L.; Kalyanaraman, Ananth

2016-01-01

Background: Identifying conserved regions in protein sequences is a fundamental operation, occurring in numerous sequence-driven analysis pipelines. It is used as a way to decode domain-rich regions within proteins, to compute protein clusters, to annotate sequence function, and to compute evolutionary relationships among protein sequences. A number of approaches exist for identifying and characterizing protein families based on their domains, and because domains represent conserved portions of a protein sequence, the primary computation involved in protein family characterization is identification of such conserved regions. However, identifying conserved regions from large collections (millions) of protein sequences presents significant challenges. Methods: In this paper we present a new, alignment-free method for detecting conserved regions in protein sequences called NADDA (No-Alignment Domain Detection Algorithm). Our method exploits the abundance of exact matching short subsequences (k-mers) to quickly detect conserved regions, and the power of machine learning is used to improve the prediction accuracy of detection. We present a parallel implementation of NADDA using the MapReduce framework and show that our method is highly scalable. Results: We have compared NADDA with Pfam and InterPro databases. For known domains annotated by Pfam, accuracy is 83%, sensitivity 96%, and specificity 44%. For sequences with new domains not present in the training set an average accuracy of 63% is achieved when compared to Pfam. A boost in results in comparison with InterPro demonstrates the ability of NADDA to capture conserved regions beyond those present in Pfam. We have also compared NADDA with ADDA and MKDOM2, assuming Pfam as ground-truth. On average NADDA shows comparable accuracy, more balanced sensitivity and specificity, and being alignment-free, is significantly faster. Excluding the one-time cost of training, runtimes on a single processor were 49s, 10,566s, and 456s for NADDA, ADDA, and MKDOM2, respectively, for a data set comprised of approximately 2500 sequences. PMID:27552220
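
The core idea in the abstract above, using the abundance of exact-matching k-mers to flag conserved regions without an alignment, can be sketched as follows. This is not the NADDA implementation: it omits the machine-learning step and the MapReduce parallelization, and the function name `kmer_support_profile`, the choice `k=3`, and the toy sequences are assumptions made for illustration.

```python
from collections import defaultdict


def kmer_support_profile(sequences, k=3):
    """For each residue, report how many OTHER sequences share a k-mer covering it."""
    # Map each k-mer to the set of sequence indices that contain it.
    owners = defaultdict(set)
    for idx, seq in enumerate(sequences):
        for i in range(len(seq) - k + 1):
            owners[seq[i:i + k]].add(idx)

    profiles = []
    for seq in sequences:
        profile = [0] * len(seq)
        for i in range(len(seq) - k + 1):
            # Subtract 1 so a sequence does not count as supporting itself.
            support = len(owners[seq[i:i + k]]) - 1
            for j in range(i, i + k):
                profile[j] = max(profile[j], support)
        profiles.append(profile)
    return profiles


# Hypothetical toy proteins sharing the segment "GHKL".
toy = ["AAGHKLWQ", "TTGHKLPP", "CCGHKLMM"]
profiles = kmer_support_profile(toy, k=3)
# profiles[0] is [0, 0, 2, 2, 2, 2, 0, 0]: only the shared segment has support.
```

Runs of high support in a profile are candidate conserved regions; a real pipeline would still need a rule or a trained model to decide where a region starts and ends.
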
Incorporating evolution of transcription factor binding sites into annotated alignments.

PubMed

Bais, Abha S; Grossmann, Stefen; Vingron, Martin

2007-08-01

Identifying transcription factor binding sites (TFBSs) is essential to elucidate putative regulatory mechanisms. A common strategy is to combine cross-species conservation with single sequence TFBS annotation to yield "conserved TFBSs". Most current methods in this field adopt a multi-step approach that segregates the two aspects. Again, it is widely accepted that the evolutionary dynamics of binding sites differ from those of the surrounding sequence. Hence, it is desirable to have an approach that explicitly takes this factor into account. Although a plethora of approaches have been proposed for the prediction of conserved TFBSs, very few explicitly model TFBS evolutionary properties, while additionally being multi-step. Recently, we introduced a novel approach to simultaneously align and annotate conserved TFBSs in a pair of sequences. Building upon the standard Smith-Waterman algorithm for local alignments, SimAnn introduces additional states for profiles to output extended alignments or annotated alignments. That is, alignments with parts annotated as gaplessly aligned TFBSs (pair-profile hits) are generated. Moreover, the pair-profile related parameters are derived in a sound statistical framework. In this article, we extend this approach to explicitly incorporate evolution of binding sites in the SimAnn framework. We demonstrate the extension in the theoretical derivations through two position-specific evolutionary models, previously used for modelling TFBS evolution. In a simulated setting, we provide a proof of concept that the approach works given the underlying assumptions, as compared to the original work. Finally, using a real dataset of experimentally verified binding sites in human-mouse sequence pairs, we compare the new approach (eSimAnn) to an existing multi-step tool that also considers TFBS evolution. Although it is widely accepted that binding sites evolve differently from the surrounding sequences, most comparative TFBS identification methods do not explicitly consider this. Additionally, prediction of conserved binding sites is carried out in a multi-step approach that segregates alignment from TFBS annotation. In this paper, we demonstrate how the simultaneous alignment and annotation approach of SimAnn can be further extended to incorporate TFBS evolutionary relationships. We study how alignments and binding site predictions interplay at varying evolutionary distances and for various profile qualities.
MONKEY: Identifying conserved transcription-factor binding sites in multiple alignments using a binding site-specific evolutionary model

DOE Office of Scientific and Technical Information (OSTI.GOV)

Moses, Alan M.; Chiang, Derek Y.; Pollard, Daniel A.

2004-10-28

We introduce a method (MONKEY) to identify conserved transcription-factor binding sites in multispecies alignments. MONKEY employs probabilistic models of factor specificity and binding site evolution, on which basis we compute the likelihood that putative sites are conserved and assign statistical significance to each hit. Using genomes from the genus Saccharomyces, we illustrate how the significance of real sites increases with evolutionary distance and explore the relationship between conservation and function.
Evolutionary dynamics of a conserved sequence motif in the ribosomal genes of the ciliate Paramecium.

PubMed

Catania, Francesco; Lynch, Michael

2010-05-04

In protozoa, the identification of preserved motifs by comparative genomics is often impeded by difficulties to generate reliable alignments for non-coding sequences. Moreover, the evolutionary dynamics of regulatory elements in 3' untranslated regions (both in protozoa and metazoa) remains a virtually unexplored issue. By screening Paramecium tetraurelia's 3' untranslated regions for 8-mers that were previously found to be preserved in mammalian 3' UTRs, we detect and characterize a motif that is distinctly conserved in the ribosomal genes of this ciliate. The motif appears to be conserved across Paramecium aurelia species but is absent from the ribosomal genes of four additional non-Paramecium species surveyed, including another ciliate, Tetrahymena thermophila. Motif-free ribosomal genes retain fewer paralogs in the genome and appear to be lost more rapidly relative to motif-containing genes. Features associated with the discovered preserved motif are consistent with this 8-mer playing a role in post-transcriptional regulation. Our observations 1) shed light on the evolution of a putative regulatory motif across large phylogenetic distances; 2) are expected to facilitate the understanding of the modulation of ribosomal genes expression in Paramecium; and 3) reveal a largely unexplored--and presumably not restricted to Paramecium--association between the presence/absence of a DNA motif and the evolutionary fate of its host genes.
Aligning science and policy to achieve evolutionarily enlightened conservation.

PubMed

Cook, Carly N; Sgrò, Carla M

2017-06-01

There is increasing recognition among conservation scientists that long-term conservation outcomes could be improved through better integration of evolutionary theory into management practices. Despite concerns that the importance of key concepts emerging from evolutionary theory (i.e., evolutionary principles and processes) are not being recognized by managers, there has been little effort to determine the level of integration of evolutionary theory into conservation policy and practice. We assessed conservation policy at 3 scales (international, national, and provincial) on 3 continents to quantify the degree to which key evolutionary concepts, such as genetic diversity and gene flow, are being incorporated into conservation practice. We also evaluated the availability of clear guidance within the applied evolutionary biology literature as to how managers can change their management practices to achieve better conservation outcomes. Despite widespread recognition of the importance of maintaining genetic diversity, conservation policies provide little guidance about how this can be achieved in practice and other relevant evolutionary concepts, such as inbreeding depression, are mentioned rarely. In some cases the poor integration of evolutionary concepts into management reflects a lack of decision-support tools in the literature. Where these tools are available, such as risk-assessment frameworks, they are not being adopted by conservation policy makers, suggesting that the availability of a strong evidence base is not the only barrier to evolutionarily enlightened management. We believe there is a clear need for more engagement by evolutionary biologists with policy makers to develop practical guidelines that will help managers make changes to conservation practice. There is also an urgent need for more research to better understand the barriers to and opportunities for incorporating evolutionary theory into conservation practice. 
Evolutionarily conserved regions and hydrophobic contacts at the superfamily level: The case of the fold-type I, pyridoxal-5′-phosphate-dependent enzymes

PubMed Central

Paiardini, Alessandro; Bossa, Francesco; Pascarella, Stefano

2004-01-01

The wealth of biological information provided by structural and genomic projects opens new prospects of understanding life and evolution at the molecular level. In this work, it is shown how computational approaches can be exploited to pinpoint protein structural features that remain invariant upon long evolutionary periods in the fold-type I, PLP-dependent enzymes. A nonredundant set of 23 superposed crystallographic structures belonging to this superfamily was built. Members of this family typically display high-structural conservation despite low-sequence identity. For each structure, a multiple-sequence alignment of orthologous sequences was obtained, and the 23 alignments were merged using the structural information to obtain a comprehensive multiple alignment of 921 sequences of fold-type I enzymes. The structurally conserved regions (SCRs), the evolutionarily conserved residues, and the conserved hydrophobic contacts (CHCs) were extracted from this data set, using both sequence and structural information. The results of this study identified a structural pattern of hydrophobic contacts shared by all of the superfamily members of fold-type I enzymes and involved in native interactions. This profile highlights the presence of a nucleus for this fold, in which residues participating in the most conserved native interactions exhibit preferential evolutionary conservation, that correlates significantly (r = 0.70) with the extent of mean hydrophobic contact value of their apolar fraction. PMID:15498941
Evolutionary dynamics of a conserved sequence motif in the ribosomal genes of the ciliate Paramecium

PubMed Central

2010-01-01

Background: In protozoa, the identification of preserved motifs by comparative genomics is often impeded by difficulties to generate reliable alignments for non-coding sequences. Moreover, the evolutionary dynamics of regulatory elements in 3' untranslated regions (both in protozoa and metazoa) remains a virtually unexplored issue. Results: By screening Paramecium tetraurelia's 3' untranslated regions for 8-mers that were previously found to be preserved in mammalian 3' UTRs, we detect and characterize a motif that is distinctly conserved in the ribosomal genes of this ciliate. The motif appears to be conserved across Paramecium aurelia species but is absent from the ribosomal genes of four additional non-Paramecium species surveyed, including another ciliate, Tetrahymena thermophila. Motif-free ribosomal genes retain fewer paralogs in the genome and appear to be lost more rapidly relative to motif-containing genes. Features associated with the discovered preserved motif are consistent with this 8-mer playing a role in post-transcriptional regulation. Conclusions: Our observations 1) shed light on the evolution of a putative regulatory motif across large phylogenetic distances; 2) are expected to facilitate the understanding of the modulation of ribosomal genes expression in Paramecium; and 3) reveal a largely unexplored--and presumably not restricted to Paramecium--association between the presence/absence of a DNA motif and the evolutionary fate of its host genes. PMID:20441586
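
Screening a set of 3' UTRs for short motifs without aligning them, as the abstract above describes, comes down to asking which k-mers occur in an unusually large share of the sequences of interest. The sketch below is not the authors' pipeline: it contrasts a target set with a background set using add-one smoothing, and the names `kmer_presence` and `enriched_kmers`, the threshold `min_ratio`, and the toy sequences (with `k=4` rather than 8 for brevity) are all assumptions made for illustration.

```python
from collections import Counter


def kmer_presence(seqs, k):
    """Map each k-mer to the number of sequences that contain it at least once."""
    counts = Counter()
    for s in seqs:
        s = s.upper()
        # A set ensures each sequence contributes at most once per k-mer.
        counts.update({s[i:i + k] for i in range(len(s) - k + 1)})
    return counts


def enriched_kmers(target, background, k=8, min_ratio=2.0):
    """Return k-mers whose presence fraction in target exceeds background by min_ratio."""
    in_target = kmer_presence(target, k)
    in_background = kmer_presence(background, k)
    result = {}
    for kmer, n in in_target.items():
        frac_target = n / len(target)
        # Add-one smoothing avoids division by zero for k-mers unseen in background.
        frac_background = (in_background.get(kmer, 0) + 1) / (len(background) + 1)
        ratio = frac_target / frac_background
        if ratio >= min_ratio:
            result[kmer] = ratio
    return result


# Hypothetical toy data: "ACGT" is present in every target sequence.
target = ["TTACGTAA", "GGACGTCC", "CCACGTGG"]
background = ["TTTTTTTT", "GGGGGGGG", "CCCCAAAA"]
hits = enriched_kmers(target, background, k=4)
```

A presence ratio of this kind says nothing about statistical significance; a real screen would need a proper null model and correction for the number of k-mers tested.
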
ESPERR: learning strong and weak signals in genomic sequence alignments to identify functional elements.

PubMed

Taylor, James; Tyekucheva, Svitlana; King, David C; Hardison, Ross C; Miller, Webb; Chiaromonte, Francesca

2006-12-01

Genomic sequence signals - such as base composition, presence of particular motifs, or evolutionary constraint - have been used effectively to identify functional elements. However, approaches based only on specific signals known to correlate with function can be quite limiting. When training data are available, application of computational learning algorithms to multispecies alignments has the potential to capture broader and more informative sequence and evolutionary patterns that better characterize a class of elements. However, effective exploitation of patterns in multispecies alignments is impeded by the vast number of possible alignment columns and by a limited understanding of which particular strings of columns may characterize a given class. We have developed a computational method, called ESPERR (evolutionary and sequence pattern extraction through reduced representations), which uses training examples to learn encodings of multispecies alignments into reduced forms tailored for the prediction of chosen classes of functional elements. ESPERR produces a greatly improved Regulatory Potential score, which can discriminate regulatory regions from neutral sites with excellent accuracy (approximately 94%). This score captures strong signals (GC content and conservation), as well as subtler signals (with small contributions from many different alignment patterns) that characterize the regulatory elements in our training set. ESPERR is also effective for predicting other classes of functional elements, as we show for DNaseI hypersensitive sites and highly conserved regions with developmental enhancer activity. Our software, training data, and genome-wide predictions are available from our Web site (http://www.bx.psu.edu/projects/esperr).
Viral Phylogenomics Using an Alignment-Free Method: A Three-Step Approach to Determine Optimal Length of k-mer

PubMed Central

Zhang, Qian; Jun, Se-Ran; Leuze, Michael; Ussery, David; Nookaew, Intawat

2017-01-01

The development of rapid, economical genome sequencing has shed new light on the classification of viruses. As of October 2016, the National Center for Biotechnology Information (NCBI) database contained >2 million viral genome sequences and a reference set of ~4000 viral genome sequences that cover a wide range of known viral families. Whole-genome sequences can be used to improve viral classification and provide insight into the viral “tree of life”. However, due to the lack of evolutionary conservation amongst diverse viruses, it is not feasible to build a viral tree of life using traditional phylogenetic methods based on conserved proteins. In this study, we used an alignment-free method that uses k-mers as genomic features for a large-scale comparison of complete viral genomes available in RefSeq. To determine the optimal feature length, k (an essential step in constructing a meaningful dendrogram), we designed a comprehensive strategy that combines three approaches: (1) cumulative relative entropy, (2) average number of common features among genomes, and (3) the Shannon diversity index. This strategy was used to determine k for all 3,905 complete viral genomes in RefSeq. The resulting dendrogram shows consistency with the viral taxonomy of the ICTV and the Baltimore classification of viruses. PMID:28102365
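
Two of the three criteria named in the abstract above, the number of features shared between genomes and the Shannon diversity index, are simple to compute from k-mer counts. The sketch below is only an illustration of those two quantities, not the authors' three-step strategy (cumulative relative entropy is omitted). The function names and the toy sequences are assumptions made for illustration.

```python
import math
from collections import Counter


def kmer_counts(genome, k):
    """Count every overlapping k-mer in a sequence."""
    return Counter(genome[i:i + k] for i in range(len(genome) - k + 1))


def shannon_diversity(counts):
    """Shannon diversity index H = -sum(p * ln p) over k-mer frequencies."""
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values())


def shared_kmers(genome_a, genome_b, k):
    """Number of distinct k-mers present in both sequences."""
    return len(set(kmer_counts(genome_a, k)) & set(kmer_counts(genome_b, k)))


# Hypothetical toy "genomes"; real inputs would be complete viral genomes.
a = "ACGTACGTAC"
b = "ACGTTTTTTT"
for k in (2, 3, 4):
    print(k, shared_kmers(a, b, k), round(shannon_diversity(kmer_counts(a, k)), 3))
```

The trade-off the abstract alludes to is visible even on toy data: as k grows, genomes share fewer features, so k must be long enough to be specific yet short enough that related genomes still have features in common.
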
Revisiting the phylogeny of Zoanthidea (Cnidaria: Anthozoa): Staggered alignment of hypervariable sequences improves species tree inference.

PubMed

Swain, Timothy D

2018-01-01

The recent rapid proliferation of novel taxon identification in the Zoanthidea has been accompanied by a parallel propagation of gene trees as a tool of species discovery, but not a corresponding increase in our understanding of phylogeny. This disparity is caused by the trade-off between the capabilities of automated DNA sequence alignment and data content of genes applied to phylogenetic inference in this group. Conserved genes or segments are easily aligned across the order, but produce poorly resolved trees; hypervariable genes or segments contain the evolutionary signal necessary for resolution and robust support, but sequence alignment is daunting. Staggered alignments are a form of phylogeny-informed sequence alignment composed of a mosaic of local and universal regions that allow phylogenetic inference to be applied to all nucleotides from both hypervariable and conserved gene segments. Comparisons between species tree phylogenies inferred from all data (staggered alignment) and hypervariable-excluded data (standard alignment) demonstrate improved confidence and greater topological agreement with other sources of data for the complete-data tree. This novel phylogeny is the most comprehensive to date (in terms of taxa and data) and can serve as an expandable tool for evolutionary hypothesis testing in the Zoanthidea. Spanish language abstract available in Text S1. Translation by L. O. Swain, DePaul University, Chicago, Illinois, 60604, USA.
BAYESIAN PROTEIN STRUCTURE ALIGNMENT.

PubMed

Rodriguez, Abel; Schmidler, Scott C

The analysis of the three-dimensional structure of proteins is an important topic in molecular biochemistry. Structure plays a critical role in defining the function of proteins and is more strongly conserved than amino acid sequence over evolutionary timescales. A key challenge is the identification and evaluation of structural similarity between proteins; such analysis can aid in understanding the role of newly discovered proteins and help elucidate evolutionary relationships between organisms. Computational biologists have developed many clever algorithmic techniques for comparing protein structures; however, all are based on heuristic optimization criteria, making statistical interpretation somewhat difficult. Here we present a fully probabilistic framework for pairwise structural alignment of proteins. Our approach has several advantages, including the ability to capture alignment uncertainty and to estimate key "gap" parameters which critically affect the quality of the alignment. We show that several existing alignment methods arise as maximum a posteriori estimates under specific choices of prior distributions and error models. Our probabilistic framework is also easily extended to incorporate additional information, which we demonstrate by including primary sequence information to generate simultaneous sequence-structure alignments that can resolve ambiguities obtained using structure alone. This combined model also provides a natural approach for the difficult task of estimating evolutionary distance based on structural alignments. The model is illustrated by comparison with well-established methods on several challenging protein alignment examples.
Evolutionary distances in the twilight zone--a rational kernel approach.

PubMed

Schwarz, Roland F; Fletcher, William; Förster, Frank; Merget, Benjamin; Wolf, Matthias; Schultz, Jörg; Markowetz, Florian

2010-12-31

Phylogenetic tree reconstruction is traditionally based on multiple sequence alignments (MSAs) and heavily depends on the validity of this information bottleneck. With increasing sequence divergence, the quality of MSAs decays quickly. Alignment-free methods, on the other hand, are based on abstract string comparisons and avoid potential alignment problems. However, in general they are not biologically motivated and ignore our knowledge about the evolution of sequences. Thus, it is still a major open question how to define an evolutionary distance metric between divergent sequences that makes use of indel information and known substitution models without the need for a multiple alignment. Here we propose a new evolutionary distance metric to close this gap. It uses finite-state transducers to create a biologically motivated similarity score which models substitutions and indels, and does not depend on a multiple sequence alignment. The sequence similarity score is defined in analogy to pairwise alignments and additionally has the positive semi-definite property. We describe its derivation and show in simulation studies and real-world examples that it is more accurate in reconstructing phylogenies than competing methods. The result is a new and accurate way of determining evolutionary distances in and beyond the twilight zone of sequence alignments that is suitable for large datasets.
rVISTA 2.0: Evolutionary Analysis of Transcription Factor Binding Sites

DOE Office of Scientific and Technical Information (OSTI.GOV)

Loots, G G; Ovcharenko, I

2004-01-28

Identifying and characterizing the patterns of DNA cis-regulatory modules represents a challenge that has the potential to reveal the regulatory language the genome uses to dictate transcriptional dynamics. Several studies have demonstrated that regulatory modules are under positive selection and therefore are often conserved between related species. Using this evolutionary principle we have created a comparative tool, rVISTA, for analyzing the regulatory potential of noncoding sequences. The rVISTA tool combines transcription factor binding site (TFBS) predictions, sequence comparisons and cluster analysis to identify noncoding DNA regions that are highly conserved and present in a specific configuration within an alignment. Here we present the newly developed version 2.0 of the rVISTA tool that can process alignments generated by both zPicture and PipMaker alignment programs or use pre-computed pairwise alignments of seven vertebrate genomes available from the ECR Browser. The rVISTA web server is closely interconnected with the TRANSFAC database, allowing users to either search for matrices present in the TRANSFAC library collection or search for user-defined consensus sequences. rVISTA tool is publicly available at http://rvista.dcode.org/.
Sequence similarities and evolutionary relationships of microbial, plant and animal alpha-amylases.

PubMed

Janecek, S

1994-09-01

Amino acid sequence comparison of 37 alpha-amylases from microbial, plant and animal sources was performed to identify their mutual sequence similarities in addition to the five already described conserved regions. These sequence regions were examined from structure/function and evolutionary perspectives. An unrooted evolutionary tree of alpha-amylases was constructed on a subset of 55 residues from the alignment of sequence similarities along with conserved regions. The most important new information extracted from the tree was as follows: (a) the close evolutionary relationship of Alteromonas haloplanctis alpha-amylase (thermolabile enzyme from an antarctic psychrotroph) with the already known group of homologous alpha-amylases from streptomycetes, Thermomonospora curvata, insects and mammals, and (b) the remarkable 40.1% identity between starch-saccharifying Bacillus subtilis alpha-amylase and the enzyme from the ruminal bacterium Butyrivibrio fibrisolvens, an alpha-amylase with an unusually large polypeptide chain (943 residues in the mature enzyme). Due to a very high degree of similarity, the whole amino acid sequences of three groups of alpha-amylases, namely (a) fungi and yeasts, (b) plants, and (c) A. haloplanctis, streptomycetes, T. curvata, insects and mammals, were aligned independently and their unrooted distance trees were calculated using these alignments. Possible rooting of the trees was also discussed. Based on the knowledge of the location of the five disulfide bonds in the structure of pig pancreatic alpha-amylase, the possible disulfide bridges were established for each of these groups of homologous alpha-amylases.
Alignment-free microbial phylogenomics under scenarios of sequence divergence, genome rearrangement and lateral genetic transfer.

PubMed

Bernard, Guillaume; Chan, Cheong Xin; Ragan, Mark A

2016-07-01

Alignment-free (AF) approaches have recently been highlighted as alternatives to methods based on multiple sequence alignment in phylogenetic inference. However, the sensitivity of AF methods to genome-scale evolutionary scenarios is little known. Here, using simulated microbial genome data we systematically assess the sensitivity of nine AF methods to three important evolutionary scenarios: sequence divergence, lateral genetic transfer (LGT) and genome rearrangement. Among these, AF methods are most sensitive to the extent of sequence divergence, less sensitive to low and moderate frequencies of LGT, and most robust against genome rearrangement. We describe the application of AF methods to three well-studied empirical genome datasets, and introduce a new application of the jackknife to assess node support. Our results demonstrate that AF phylogenomics is computationally scalable to multi-genome data and can generate biologically meaningful phylogenies and insights into microbial evolution.
Alignment-free protein interaction network comparison

PubMed Central

Ali, Waqar; Rito, Tiago; Reinert, Gesine; Sun, Fengzhu; Deane, Charlotte M.

2014-01-01

Motivation: Biological network comparison software largely relies on the concept of alignment where close matches between the nodes of two or more networks are sought. These node matches are based on sequence similarity and/or interaction patterns. However, because of the incomplete and error-prone datasets currently available, such methods have had limited success. Moreover, the results of network alignment are in general not amenable for distance-based evolutionary analysis of sets of networks. In this article, we describe Netdis, a topology-based distance measure between networks, which offers the possibility of network phylogeny reconstruction. Results: We first demonstrate that Netdis is able to correctly separate different random graph model types independent of network size and density. The biological applicability of the method is then shown by its ability to build the correct phylogenetic tree of species based solely on the topology of current protein interaction networks. Our results provide new evidence that the topology of protein interaction networks contains information about evolutionary processes, despite the lack of conservation of individual interactions. As Netdis is applicable to all networks because of its speed and simplicity, we apply it to a large collection of biological and non-biological networks where it clusters diverse networks by type. Availability and implementation: The source code of the program is freely available at http://www.stats.ox.ac.uk/research/proteins/resources. Contact: w.ali@stats.ox.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25161230
Alignment-free genome tree inference by learning group-specific distance metrics.

PubMed

Patil, Kaustubh R; McHardy, Alice C

2013-01-01

Understanding the evolutionary relationships between organisms is vital for their in-depth study. Gene-based methods are often used to infer such relationships, which are not without drawbacks. One can now attempt to use genome-scale information, because of the ever increasing number of genomes available. This opportunity also presents a challenge in terms of computational efficiency. Two fundamentally different methods are often employed for sequence comparisons, namely alignment-based and alignment-free methods. Alignment-free methods rely on the genome signature concept and provide a computationally efficient way that is also applicable to nonhomologous sequences. The genome signature contains evolutionary signal as it is more similar for closely related organisms than for distantly related ones. We used genome-scale sequence information to infer taxonomic distances between organisms without additional information such as gene annotations. We propose a method to improve genome tree inference by learning specific distance metrics over the genome signature for groups of organisms with similar phylogenetic, genomic, or ecological properties. Specifically, our method learns a Mahalanobis metric for a set of genomes and a reference taxonomy to guide the learning process. By applying this method to more than a thousand prokaryotic genomes, we showed that, indeed, better distance metrics could be learned for most of the 18 groups of organisms tested here. Once a group-specific metric is available, it can be used to estimate the taxonomic distances for other sequenced organisms from the group. This study also presents a large scale comparison between 10 methods--9 alignment-free and 1 alignment-based.
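The Mahalanobis metric mentioned above generalizes Euclidean distance by weighting feature differences with a positive semi-definite matrix M. The sketch below only evaluates such a distance for a given M; how the authors learn M from a reference taxonomy is not shown, and the vectors and matrices here are made-up values for illustration.

```python
import math

def mahalanobis(x, y, M):
    """sqrt((x - y)^T M (x - y)) for a positive semi-definite matrix M."""
    diff = [a - b for a, b in zip(x, y)]
    n = len(diff)
    # Matrix-vector product M @ diff, written out with plain lists.
    m_diff = [sum(M[i][j] * diff[j] for j in range(n)) for i in range(n)]
    return math.sqrt(sum(d * m for d, m in zip(diff, m_diff)))

# Hypothetical 3-feature genome signatures (real signatures are much longer).
x = [0.30, 0.20, 0.50]
y = [0.25, 0.35, 0.40]

identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]   # reduces to Euclidean distance
weighted = [[1, 0, 0], [0, 4, 0], [0, 0, 1]]   # up-weights the second feature

d_euclid = mahalanobis(x, y, identity)
d_weighted = mahalanobis(x, y, weighted)
```

With the identity matrix the result equals the ordinary Euclidean distance; a learned M would instead stretch the feature directions that best separate taxa within a group.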

A statistical physics perspective on alignment-independent protein sequence comparison.

PubMed

Chattopadhyay, Amit K; Nasiev, Diar; Flower, Darren R

2015-08-01

Within bioinformatics, the textual alignment of amino acid sequences has long dominated the determination of similarity between proteins, with all that implies for shared structure, function and evolutionary descent. Despite the relative success of modern-day sequence alignment algorithms, so-called alignment-free approaches offer a complementary means of determining and expressing similarity, with potential benefits in certain key applications, such as regression analysis of protein structure-function studies, where alignment-based similarity has performed poorly. Here, we offer a fresh, statistical physics-based perspective focusing on the question of alignment-free comparison, in the process adapting results from 'first passage probability distribution' to summarize statistics of ensemble averaged amino acid propensity values. In this article, we introduce and elaborate this approach. © The Author 2015. Published by Oxford University Press.
Toxin structures as evolutionary tools: Using conserved 3D folds to study the evolution of rapidly evolving peptides.

PubMed

Undheim, Eivind A B; Mobli, Mehdi; King, Glenn F

2016-06-01

Three-dimensional (3D) structures have been used to explore the evolution of proteins for decades, yet they have rarely been utilized to study the molecular evolution of peptides. Here, we highlight areas in which 3D structures can be particularly useful for studying the molecular evolution of peptide toxins. Although we focus our discussion on animal toxins, including one of the most widespread disulfide-rich peptide folds known, the inhibitor cystine knot, our conclusions should be widely applicable to studies of the evolution of disulfide-constrained peptides. We show that conserved 3D folds can be used to identify evolutionary links and test hypotheses regarding the evolutionary origin of peptides with extremely low sequence identity; construct accurate multiple sequence alignments; and better understand the evolutionary forces that drive the molecular evolution of peptides. © 2016 WILEY Periodicals, Inc.
ConSurf 2016: an improved methodology to estimate and visualize evolutionary conservation in macromolecules

PubMed Central

Ashkenazy, Haim; Abadi, Shiran; Martz, Eric; Chay, Ofer; Mayrose, Itay; Pupko, Tal; Ben-Tal, Nir

2016-01-01

The degree of evolutionary conservation of an amino acid in a protein or a nucleic acid in DNA/RNA reflects a balance between its natural tendency to mutate and the overall need to retain the structural integrity and function of the macromolecule. The ConSurf web server (http://consurf.tau.ac.il), established over 15 years ago, analyses the evolutionary pattern of the amino/nucleic acids of the macromolecule to reveal regions that are important for structure and/or function. Starting from a query sequence or structure, the server automatically collects homologues, infers their multiple sequence alignment and reconstructs a phylogenetic tree that reflects their evolutionary relations. These data are then used, within a probabilistic framework, to estimate the evolutionary rates of each sequence position. Here we introduce several new features into ConSurf, including automatic selection of the best evolutionary model used to infer the rates, the ability to homology-model query proteins, prediction of the secondary structure of query RNA molecules from sequence, the ability to view the biological assembly of a query (in addition to the single chain), mapping of the conservation grades onto 2D RNA models and an advanced view of the phylogenetic tree that enables interactively rerunning ConSurf with the taxa of a sub-tree. PMID:27166375
L-GRAAL: Lagrangian graphlet-based network aligner.

PubMed

Malod-Dognin, Noël; Pržulj, Nataša

2015-07-01

Discovering and understanding patterns in networks of protein-protein interactions (PPIs) is a central problem in systems biology. Alignments between these networks aid functional understanding as they uncover important information, such as evolutionary conserved pathways, protein complexes and functional orthologs. A few methods have been proposed for global PPI network alignments, but because of NP-completeness of underlying sub-graph isomorphism problem, producing topologically and biologically accurate alignments remains a challenge. We introduce a novel global network alignment tool, Lagrangian GRAphlet-based ALigner (L-GRAAL), which directly optimizes both the protein and the interaction functional conservations, using a novel alignment search heuristic based on integer programming and Lagrangian relaxation. We compare L-GRAAL with the state-of-the-art network aligners on the largest available PPI networks from BioGRID and observe that L-GRAAL uncovers the largest common sub-graphs between the networks, as measured by edge-correctness and symmetric sub-structures scores, which allow transferring more functional information across networks. We assess the biological quality of the protein mappings using the semantic similarity of their Gene Ontology annotations and observe that L-GRAAL best uncovers functionally conserved proteins. Furthermore, we introduce for the first time a measure of the semantic similarity of the mapped interactions and show that L-GRAAL also uncovers best functionally conserved interactions. In addition, we illustrate on the PPI networks of baker's yeast and human the ability of L-GRAAL to predict new PPIs. Finally, L-GRAAL's results are the first to show that topological information is more important than sequence information for uncovering functionally conserved interactions. L-GRAAL is coded in C++. Software is available at: http://bio-nets.doc.ic.ac.uk/L-GRAAL/. 
Contact: n.malod-dognin@imperial.ac.uk Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.
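Edge correctness, one of the topological scores named in the abstract, is commonly defined as the fraction of edges in the smaller network that are mapped onto edges of the larger network. The sketch below computes it for a given node mapping under that common definition; it does not reproduce L-GRAAL's search heuristic, and the toy networks and mapping are hypothetical.

```python
def edge_correctness(edges_small, edges_large, mapping):
    """Fraction of edges in the smaller network preserved under the mapping."""
    # Store undirected edges as frozensets so (u, v) and (v, u) compare equal.
    large = {frozenset(e) for e in edges_large}
    conserved = sum(
        1 for u, v in edges_small
        if frozenset((mapping[u], mapping[v])) in large
    )
    return conserved / len(edges_small)

# Hypothetical toy networks: a triangle aligned onto a four-node path.
g1 = [("a", "b"), ("b", "c"), ("c", "a")]
g2 = [("x", "y"), ("y", "z"), ("z", "w")]
mapping = {"a": "x", "b": "y", "c": "z"}

ec = edge_correctness(g1, g2, mapping)  # a-b and b-c are conserved, c-a is not
```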
Background Adjusted Alignment-Free Dissimilarity Measures Improve the Detection of Horizontal Gene Transfer.

PubMed

Tang, Kujin; Lu, Yang Young; Sun, Fengzhu

2018-01-01

Horizontal gene transfer (HGT) plays an important role in the evolution of microbial organisms including bacteria. Alignment-free methods based on single genome compositional information have been used to detect HGT. Currently, Manhattan and Euclidean distances based on tetranucleotide frequencies are the most commonly used alignment-free dissimilarity measures to detect HGT. By testing on simulated bacterial sequences and real data sets with known horizontally transferred genomic regions, we found that more advanced alignment-free dissimilarity measures such as CVTree and [Formula: see text] that take into account the background Markov sequences can solve HGT detection problems with significantly improved performance. We also studied the influence of different factors such as evolutionary distance between host and donor sequences, size of sliding window, and host genome composition on the performances of alignment-free methods to detect HGT. Our study showed that alignment-free methods can predict HGT accurately when host and donor genomes are in different order levels. Among all methods, CVTree with word length of 3, [Formula: see text] with word length 3, Markov order 1 and [Formula: see text] with word length 4, Markov order 1 outperform others in terms of their highest F1-score and their robustness under the influence of different factors.
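The baseline approach described above, Manhattan distance on tetranucleotide frequencies within a sliding window, can be sketched as follows. This illustrates only the simple baseline, not the background-adjusted measures the study recommends; the window size, step, and synthetic genome are assumptions chosen to keep the example small.

```python
from collections import Counter
from itertools import product

def tetra_profile(seq, k=4):
    """Frequency vector over all 4**k possible k-mers, in a fixed order."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values()) or 1
    return [counts.get("".join(p), 0) / total for p in product("ACGT", repeat=k)]

def manhattan(p, q):
    """Sum of absolute differences between two equal-length profiles."""
    return sum(abs(a - b) for a, b in zip(p, q))

def window_scores(genome, window, step):
    """Distance of each window's profile from the whole-genome profile."""
    background = tetra_profile(genome)
    return [
        (start, manhattan(tetra_profile(genome[start:start + window]), background))
        for start in range(0, len(genome) - window + 1, step)
    ]

# Synthetic example: a compositionally distinct segment at positions 100-159.
host = "ACGT" * 50
foreign = "GGGCCC" * 10
genome = host[:100] + foreign + host[100:]

scores = window_scores(genome, window=40, step=20)
top_start = max(scores, key=lambda s: s[1])[0]
```

Windows whose composition departs most from the genome-wide profile are candidate transferred regions; in practice a score threshold would have to be calibrated on data with known transfers.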
Viral phylogenomics using an alignment-free method: A three-step approach to determine optimal length of k-mer

DOE PAGES

Zhang, Qian; Jun, Se-Ran; Leuze, Michael; ...

2017-01-19

The development of rapid, economical genome sequencing has shed new light on the classification of viruses. As of October 2016, the National Center for Biotechnology Information (NCBI) database contained >2 million viral genome sequences and a reference set of ~4000 viral genome sequences that cover a wide range of known viral families. Whole-genome sequences can be used to improve viral classification and provide insight into the viral tree of life. However, due to the lack of evolutionary conservation amongst diverse viruses, it is not feasible to build a viral tree of life using traditional phylogenetic methods based on conserved proteins. In this study, we used an alignment-free method that uses k-mers as genomic features for a large-scale comparison of complete viral genomes available in RefSeq. To determine the optimal feature length, k (an essential step in constructing a meaningful dendrogram), we designed a comprehensive strategy that combines three approaches: (1) cumulative relative entropy, (2) average number of common features among genomes, and (3) the Shannon diversity index. This strategy was used to determine k for all 3,905 complete viral genomes in RefSeq. Lastly, the resulting dendrogram shows consistency with the viral taxonomy of the ICTV and the Baltimore classification of viruses.
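Two of the three criteria listed above are simple to compute for a candidate k. The sketch below shows the Shannon diversity index of a k-mer count table and the average number of k-mers shared between pairs of genomes. How the authors combine these with cumulative relative entropy to select k is not stated in the abstract, so that step is omitted; the helper names and toy genomes are assumptions.

```python
import math
from collections import Counter
from itertools import combinations

def kmer_counts(seq, k):
    """Counts of overlapping k-mers in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def shannon_diversity(counts):
    """Shannon index H = -sum(p * ln p) over observed k-mers."""
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values())

def avg_common_features(seqs, k):
    """Mean number of distinct k-mers shared by each pair of sequences."""
    sets = [set(kmer_counts(s, k)) for s in seqs]
    pairs = list(combinations(sets, 2))
    return sum(len(a & b) for a, b in pairs) / len(pairs)

# Hypothetical toy genomes; real viral genomes are thousands of bases long.
genomes = ["ACGTACGTTGCA", "ACGTTCGATGCA", "GGGTACGTAAAC"]
summary = {
    k: (avg_common_features(genomes, k),
        shannon_diversity(kmer_counts("".join(genomes), k)))
    for k in range(1, 6)
}
```

As k grows, fewer k-mers are shared between genomes, so the first value in each pair tends to fall; scanning these quantities across k is one way to see where features stop being shared by chance.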
Invariant glycines and prolines flanking in loops the strand beta 2 of various (alpha/beta)8-barrel enzymes: a hidden homology?

PubMed Central

Janecek, S.

1996-01-01

The question of parallel (alpha/beta)8-barrel fold evolution remains unclear, owing mainly to the lack of sequence homology throughout the amino acid sequences of (alpha/beta)8-barrel enzymes. The "classical" approaches used in the search for homologies among (alpha/beta)8-barrels (e.g., production of structurally based alignments) have yielded alignments perfect from the structural point of view, but the approaches have been unable to reveal the homologies. These are proposed to be "hidden" in (alpha/beta)8-barrel enzymes. The term "hidden homology" means that the alignment of sequence stretches proposed to be homologous need not be structurally fully satisfactory. This is due to the very long evolutionary history of all (alpha/beta)8-barrels. This work identifies so-called hidden homology around the strand beta 2 that is flanked by loops containing invariant glycines and prolines in 17 different (alpha/beta)8-barrel enzymes, i.e., roughly in half of all currently known (alpha/beta)8-barrel proteins. The search was based on the idea that a conserved sequence region of an (alpha/beta)8-barrel enzyme should be more or less conserved also in the equivalent part of the structure of the other enzymes with this folding motif, given their mutual evolutionary relatedness. For this purpose, the sequence region around the well-conserved second beta-strand of alpha-amylase flanked by the invariant glycine and proline (56_GFTAIWITP, Aspergillus oryzae alpha-amylase numbering), was used as the sequence-structural template. The proposal that the second beta-strand of (alpha/beta)8-barrel fold is important from the evolutionary point of view is strongly supported by the increasing trend of the observed beta 2-strand structural similarity for the pairs of (alpha/beta)8-barrel enzymes: alpha-amylase and the alpha-subunit of tryptophan synthase, alpha-amylase and mandelate racemase, and alpha-amylase and cyclodextrin glycosyltransferase. 
This trend is also in agreement with the existing evolutionary division of the entire family of (alpha/beta)8-barrel proteins. PMID:8762144
Disease-associated mitochondrial mutations and the evolution of primate mitogenomes

PubMed Central

Tavares, William Corrêa

2017-01-01

Several human diseases have been associated with mutations in mitochondrial genes comprising a set of confirmed and reported mutations according to the MITOMAP database. An analysis of complete mitogenomes across 139 primate species showed that most confirmed disease-associated mutations occurred in aligned codon positions and gene regions under strong purifying selection resulting in a strong evolutionary conservation. Only two confirmed variants (7.1%), coding for the same amino acids accounting for severe human diseases, were identified without apparent pathogenicity in non-human primates, like the closely related Bornean orangutan. Conversely, reported disease-associated mutations were not especially concentrated in conserved codon positions, and a large fraction of them occurred in highly variable ones. Additionally, 88 (45.8%) of reported mutations showed similar variants in several non-human primates and some of them have been present in extinct species of the genus Homo. Considering that recurrent mutations leading to persistent variants throughout the evolutionary diversification of primates are less likely to be severely damaging to fitness, we suggest that these 88 mutations are less likely to be pathogenic. Conversely, 69 (35.9%) of reported disease-associated mutations occurred in extremely conserved aligned codon positions which makes them more likely to damage the primate mitochondrial physiology. PMID:28510580
Genome Alignment Spanning Major Poaceae Lineages Reveals Heterogeneous Evolutionary Rates and Alters Inferred Dates for Key Evolutionary Events.

PubMed

Wang, Xiyin; Wang, Jingpeng; Jin, Dianchuan; Guo, Hui; Lee, Tae-Ho; Liu, Tao; Paterson, Andrew H

2015-06-01

Multiple comparisons among genomes can clarify their evolution, speciation, and functional innovations. To date, the genome sequences of eight grasses representing the most economically important Poaceae (grass) clades have been published, and their genomic-level comparison is an essential foundation for evolutionary, functional, and translational research. Using a formal and conservative approach, we aligned these genomes. Direct comparison of paralogous gene pairs all duplicated simultaneously reveals striking variation in evolutionary rates among whole genomes, with nucleotide substitution slowest in rice and up to 48% faster in other grasses, adding a new dimension to the value of rice as a grass model. We reconstructed ancestral genome contents for major evolutionary nodes, potentially contributing to understanding the divergence and speciation of grasses. Recent fossil evidence suggests revisions of the estimated dates of key evolutionary events, implying that the pan-grass polyploidization occurred ∼96 million years ago and could not be related to the Cretaceous-Tertiary mass extinction as previously inferred. Adjusted dating to reflect both updated fossil evidence and lineage-specific evolutionary rates suggested that maize subgenome divergence and maize-sorghum divergence were virtually simultaneous, a coincidence that would be explained if polyploidization directly contributed to speciation. This work lays a solid foundation for Poaceae translational genomics. Copyright © 2015 The Author. Published by Elsevier Inc. All rights reserved.
Insights into the phylogenetic positions of photosynthetic bacteria obtained from 5S rRNA and 16S rRNA sequence data

NASA Technical Reports Server (NTRS)

Fox, G. E.

1985-01-01

Comparisons of complete 16S ribosomal ribonucleic acid (rRNA) sequences established that the secondary structure of these molecules is highly conserved. Earlier work with 5S rRNA secondary structure revealed that when structural conservation exists the alignment of sequences is straightforward. The constancy of structure implies minimal functional change. Under these conditions a uniform evolutionary rate can be expected so that conditions are favorable for phylogenetic tree construction.
PASS2: an automated database of protein alignments organised as structural superfamilies.

PubMed

Bhaduri, Anirban; Pugalenthi, Ganesan; Sowdhamini, Ramanathan

2004-04-02

The functional selection and three-dimensional structural constraints of proteins in nature often relate to the retention of significant sequence similarity between proteins of similar fold and function despite poor sequence identity. Organization of structure-based sequence alignments for distantly related proteins provides a map of the conserved and critical regions of the protein universe that is useful for the analysis of folding principles, for the evolutionary unification of protein families and for maximizing the information return from experimental structure determination. The Protein Alignment organised as Structural Superfamily (PASS2) database represents continuously updated, structural alignments for evolutionarily related, sequentially distant proteins. An automated and updated version of PASS2 is in direct correspondence with SCOP 1.63 and consists of sequences having identity below 40% among themselves. Protein domains have been grouped into 628 multi-member superfamilies and 566 single member superfamilies. Structure-based sequence alignments for the superfamilies have been obtained using COMPARER, while initial equivalencies have been derived from a preliminary superposition using LSQMAN or STAMP 4.0. The final sequence alignments have been annotated for structural features using JOY4.0. The database is supplemented with sequence relatives belonging to different genomes, conserved spatially interacting and structural motifs, probabilistic hidden Markov models of superfamilies based on the alignments and useful links to other databases. Probabilistic models and sensitive position specific profiles obtained from reliable superfamily alignments aid annotation of remote homologues and are useful tools in structural and functional genomics. PASS2 presents the phylogeny of its members both based on sequence and structural dissimilarities. Clustering of members allows us to understand diversification of the family members.
The search engine has been improved for simpler browsing of the database. The database resolves alignments among the structural domains consisting of evolutionarily diverged set of sequences. Availability of reliable sequence alignments of distantly related proteins despite poor sequence identity and single-member superfamilies permit better sampling of structures in libraries for fold recognition of new sequences and for the understanding of protein structure-function relationships of individual superfamilies. PASS2 is accessible at http://www.ncbs.res.in/~faculty/mini/campass/pass2.html
Dissipation dynamics of field-free molecular alignment for symmetric-top molecules: Ethane (C2H6)

NASA Astrophysics Data System (ADS)

Zhang, H.; Billard, F.; Yu, X.; Faucher, O.; Lavorel, B.

2018-03-01

The field-free molecular alignment of symmetric-top molecules, ethane, induced by intense non-resonant linearly polarized femtosecond laser pulses is investigated experimentally in the presence of collisional relaxation. The dissipation dynamics of field-free molecular alignment are measured by the balanced detection of ultrafast molecular birefringence of ethane gas samples at high pressures. By separating the molecular alignment into the permanent alignment and the transient alignment, the decay time-constants of both components are quantified at the same pressure. It is observed that the permanent alignment always decays slower compared to the transient alignment within the measured pressure range. This demonstrates that the propensity of molecules to conserve the orientation of angular momentum during collisions, previously observed for linear species, is also applicable to symmetric-top molecules. The results of this work provide valuable information for further theoretical understanding of collisional relaxation within nonlinear polyatomic molecules, which are expected to present interesting and nontrivial features due to an extra rotational degree of freedom.
Delineating slowly and rapidly evolving fractions of the Drosophila genome.

PubMed

Keith, Jonathan M; Adams, Peter; Stephen, Stuart; Mattick, John S

2008-05-01

Evolutionary conservation is an important indicator of function and a major component of bioinformatic methods to identify non-protein-coding genes. We present a new Bayesian method for segmenting pairwise alignments of eukaryotic genomes while simultaneously classifying segments into slowly and rapidly evolving fractions. We also describe an information criterion similar to the Akaike Information Criterion (AIC) for determining the number of classes. Working with pairwise alignments enables detection of differences in conservation patterns among closely related species. We analyzed three whole-genome and three partial-genome pairwise alignments among eight Drosophila species. Three distinct classes of conservation level were detected. Sequences comprising the most slowly evolving component were consistent across a range of species pairs, and constituted approximately 62-66% of the D. melanogaster genome. Almost all (>90%) of the aligned protein-coding sequence is in this fraction, suggesting much of it (comprising the majority of the Drosophila genome, including approximately 56% of non-protein-coding sequences) is functional. The size and content of the most rapidly evolving component was species dependent, and varied from 1.6% to 4.8%. This fraction is also enriched for protein-coding sequence (while containing significant amounts of non-protein-coding sequence), suggesting it is under positive selection. We also classified segments according to conservation and GC content simultaneously. This analysis identified numerous sub-classes of those identified on the basis of conservation alone, but was nevertheless consistent with that classification. Software, data, and results available at www.maths.qut.edu.au/-keithj/. Genomic segments comprising the conservation classes available in BED format.
Multiple sequence alignment using multi-objective based bacterial foraging optimization algorithm.

PubMed

Rani, R Ranjani; Ramyachitra, D

2016-12-01

Multiple sequence alignment (MSA) is a widespread approach in computational biology and bioinformatics. MSA arranges nucleotide or amino acid sequences so that similar positions line up with as few gaps as possible, which points to the functional, evolutionary and structural relationships among the sequences. Computing an MSA that is both accurate and statistically significant remains a challenging task. In this work, the Bacterial Foraging Optimization Algorithm was employed to align the biological sequences, which resulted in a non-dominated optimal solution. It optimizes multiple objectives: maximization of similarity, non-gap percentage and conserved blocks, and minimization of gap penalty. The BAliBASE 3.0 benchmark database was utilized to examine the proposed algorithm against other methods. In this paper, two algorithms have been proposed: Hybrid Genetic Algorithm with Artificial Bee Colony (GA-ABC) and Bacterial Foraging Optimization Algorithm. It was found that Hybrid Genetic Algorithm with Artificial Bee Colony performed better than the existing optimization algorithms. But still the conserved blocks were not obtained using GA-ABC. Then BFO was used for the alignment and the conserved blocks were obtained. The proposed Multi-Objective Bacterial Foraging Optimization Algorithm (MO-BFO) was compared with widely used MSA methods Clustal Omega, Kalign, MUSCLE, MAFFT, Genetic Algorithm (GA), Ant Colony Optimization (ACO), Artificial Bee Colony (ABC), Particle Swarm Optimization (PSO) and Hybrid Genetic Algorithm with Artificial Bee Colony (GA-ABC). The final results show that the proposed MO-BFO algorithm yields better alignment than most widely used methods. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Investigating homology between proteins using energetic profiles.

PubMed

Wrabl, James O; Hilser, Vincent J

2010-03-26

Accumulated experimental observations demonstrate that protein stability is often preserved upon conservative point mutation. In contrast, less is known about the effects of large sequence or structure changes on the stability of a particular fold. Almost completely unknown is the degree to which stability of different regions of a protein is generally preserved throughout evolution. In this work, these questions are addressed through thermodynamic analysis of a large representative sample of protein fold space based on remote, yet accepted, homology. More than 3,000 proteins were computationally analyzed using the structural-thermodynamic algorithm COREX/BEST. Estimated position-specific stability (i.e., local Gibbs free energy of folding) and its component enthalpy and entropy were quantitatively compared between all proteins in the sample according to all-vs.-all pairwise structural alignment. It was discovered that the local stabilities of homologous pairs were significantly more correlated than those of non-homologous pairs, indicating that local stability was indeed generally conserved throughout evolution. However, the position-specific enthalpy and entropy underlying stability were less correlated, suggesting that the overall regional stability of a protein was more important than the thermodynamic mechanism utilized to achieve that stability. Finally, two different types of statistically exceptional evolutionary structure-thermodynamic relationships were noted. First, many homologous proteins contained regions of similar thermodynamics despite localized structure change, suggesting a thermodynamic mechanism enabling evolutionary fold change. Second, some homologous proteins with extremely similar structures nonetheless exhibited different local stabilities, a phenomenon previously observed experimentally in this laboratory. 
These two observations, in conjunction with the principal conclusion that homologous proteins generally conserved local stability, may provide guidance for a future thermodynamically informed classification of protein homology.
PROPER: global protein interaction network alignment through percolation matching.

PubMed

Kazemi, Ehsan; Hassani, Hamed; Grossglauser, Matthias; Pezeshgi Modarres, Hassan

2016-12-12

The alignment of protein-protein interaction (PPI) networks enables us to uncover the relationships between different species, which leads to a deeper understanding of biological systems. Network alignment can be used to transfer biological knowledge between species. Although different PPI-network alignment algorithms were introduced during the last decade, developing an accurate and scalable algorithm that can find alignments with high biological and structural similarities among PPI networks is still challenging. In this paper, we introduce a new global network alignment algorithm for PPI networks called PROPER. Compared to other global network alignment methods, our algorithm shows higher accuracy and speed over real PPI datasets and synthetic networks. We show that the PROPER algorithm can detect large portions of conserved biological pathways between species. Also, using a simple parsimonious evolutionary model, we explain why PROPER performs well based on several different comparison criteria. We highlight that PROPER has high potential in further applications such as detecting biological pathways, finding protein complexes and PPI prediction. The PROPER algorithm is available at http://proper.epfl.ch .
Evolutionary conservation of regulatory elements in vertebrate HOX gene clusters

DOE Office of Scientific and Technical Information (OSTI.GOV)

Santini, Simona; Boore, Jeffrey L.; Meyer, Axel

2003-12-31

Due to their high degree of conservation, functional regions in noncoding DNA can be identified by comparing DNA sequences among evolutionarily distantly related genomes. Hox genes are optimal candidate sequences for comparative genome analyses, because they are extremely conserved in vertebrates and occur in clusters. We aligned (Pipmaker) the nucleotide sequences of HoxA clusters of tilapia, pufferfish, striped bass, zebrafish, horn shark, human and mouse (over 500 million years of evolutionary distance). We identified several highly conserved intergenic sequences, likely to be important in gene regulation. Only a few of these putative regulatory elements have been previously described as being involved in the regulation of Hox genes, while several others are new elements that might have regulatory functions. The majority of these newly identified putative regulatory elements contain short fragments that are almost completely conserved and are identical to known binding sites for regulatory proteins (Transfac). The conserved intergenic regions located between the most rostrally expressed genes in the developing embryo are longer and better retained through evolution. We document that presumed regulatory sequences are retained differentially in either Aα or Aβ clusters resulting from a genome duplication in the fish lineage. This observation supports both the hypothesis that the conserved elements are involved in gene regulation and the Duplication-Deletion-Complementation model.
Conserved antigenic sites between MERS-CoV and Bat-coronavirus are revealed through sequence analysis.

PubMed

Sharmin, Refat; Islam, Abul B M M K

2016-01-01

MERS-CoV is a newly emerged human coronavirus reported to be closely related to the HKU4 and HKU5 bat coronaviruses. Bat and MERS coronaviruses are structurally related. Therefore, it is of interest to estimate the degree of conservation of antigenic sites among them. Elucidating the shared antigenic sites and the extent of their conservation is important for understanding the evolutionary dynamics of MERS-CoV. Multiple sequence alignment of the spike (S), membrane (M), envelope (E) and nucleocapsid (N) proteins was employed to identify the sequence conservation among MERS and bat (HKU4, HKU5) coronaviruses. We used various in silico tools to predict the conserved antigenic sites. We found that MERS-CoV shared 30% of its S protein antigenic sites with HKU4 and 70% with HKU5 bat-CoV, whereas 100% of its E, M and N protein antigenic sites were conserved with those in HKU4 and HKU5. This sharing suggests that, in terms of pathogenicity, MERS-CoV is more closely related to HKU5 bat-CoV than to HKU4 bat-CoV. The conserved epitopes indicate their evolutionary relationship and the ancestry of pathogenicity.

Differential evolution-simulated annealing for multiple sequence alignment

NASA Astrophysics Data System (ADS)

Addawe, R. C.; Addawe, J. M.; Sueño, M. R. K.; Magadia, J. C.

2017-10-01

Multiple sequence alignments (MSA) are used in the analysis of molecular evolution and sequence structure relationships. In this paper, a hybrid algorithm, Differential Evolution - Simulated Annealing (DESA) is applied in optimizing multiple sequence alignments (MSAs) based on structural information, non-gaps percentage and totally conserved columns. DESA is a robust algorithm characterized by self-organization, mutation, crossover, and SA-like selection scheme of the strategy parameters. Here, the MSA problem is treated as a multi-objective optimization problem of the hybrid evolutionary algorithm, DESA. Thus, we name the algorithm as DESA-MSA. Simulated sequences and alignments were generated to evaluate the accuracy and efficiency of DESA-MSA using different indel sizes, sequence lengths, deletion rates and insertion rates. The proposed hybrid algorithm obtained acceptable solutions particularly for the MSA problem evaluated based on the three objectives.
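Two of the three objectives named in this abstract, non-gap percentage and totally conserved columns, can be computed directly from a candidate alignment. The following is a minimal Python sketch, assuming the alignment is a list of equal-length strings with '-' as the gap character. The function names are illustrative and are not taken from DESA-MSA, and the structural-information objective is omitted because the abstract does not define it.

```python
def non_gap_percentage(alignment):
    """Percentage of alignment cells that hold a residue rather than a gap."""
    total = sum(len(row) for row in alignment)
    gaps = sum(row.count("-") for row in alignment)
    return 100.0 * (total - gaps) / total


def totally_conserved_columns(alignment):
    """Number of columns in which every sequence has the same non-gap residue."""
    count = 0
    for column in zip(*alignment):
        if "-" not in column and len(set(column)) == 1:
            count += 1
    return count


# Toy alignment (hypothetical data, for illustration only)
msa = ["ACG-T", "ACGAT", "AC-AT"]
print(non_gap_percentage(msa))
print(totally_conserved_columns(msa))
```

A multi-objective optimizer would evaluate functions like these on each candidate alignment; how DESA-MSA weights or combines them is not stated in the abstract.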
Enzyme sequence similarity improves the reaction alignment method for cross-species pathway comparison

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ovacik, Meric A.; Androulakis, Ioannis P., E-mail: yannis@rci.rutgers.edu; Biomedical Engineering Department, Rutgers University, Piscataway, NJ 08854

2013-09-15

Pathway-based information has become an important source of information for both establishing evolutionary relationships and understanding the mode of action of a chemical or pharmaceutical among species. Cross-species comparison of pathways can address two broad questions: comparison in order to inform evolutionary relationships and to extrapolate species differences used in a number of different applications including drug and toxicity testing. Cross-species comparison of metabolic pathways is complex as there are multiple features of a pathway that can be modeled and compared. Among the various methods that have been proposed, reaction alignment has emerged as the most successful at predicting phylogenetic relationships based on NCBI taxonomy. We propose an improvement of the reaction alignment method by accounting for sequence similarity in addition to reaction alignment. Using nine species, including human and some model organisms and test species, we evaluate the standard and improved comparison methods by analyzing conservation of the glycolysis and citrate cycle pathways. In addition, we demonstrate how organism comparison can be conducted by accounting for the cumulative information retrieved from nine pathways in central metabolism as well as a more complete study involving 36 pathways common in all nine species. Our results indicate that reaction alignment with enzyme sequence similarity results in a more accurate representation of pathway-specific cross-species similarities and differences based on NCBI taxonomy.
RNA-TVcurve: a Web server for RNA secondary structure comparison based on a multi-scale similarity of its triple vector curve representation.

PubMed

Li, Ying; Shi, Xiaohu; Liang, Yanchun; Xie, Juan; Zhang, Yu; Ma, Qin

2017-01-21

RNAs have been found to carry diverse functionalities in nature. Inferring the similarity between two given RNAs is a fundamental step to understand and interpret their functional relationship. The majority of functional RNAs show conserved secondary structures, rather than sequence conservation. Those algorithms relying on sequence-based features usually have limitations in their prediction performance. Hence, integrating RNA structure features is very critical for RNA analysis. Existing algorithms mainly fall into two categories: alignment-based and alignment-free. The alignment-free algorithms of RNA comparison usually have lower time complexity than alignment-based algorithms. An alignment-free RNA comparison algorithm was proposed, in which novel numerical representations RNA-TVcurve (triple vector curve representation) of RNA sequence and corresponding secondary structure features are provided. Then a multi-scale similarity score of two given RNAs was designed based on wavelet decomposition of their numerical representation. In support of RNA mutation and phylogenetic analysis, a web server (RNA-TVcurve) was designed based on this alignment-free RNA comparison algorithm. It provides three functional modules: 1) visualization of numerical representation of RNA secondary structure; 2) detection of single-point mutation based on secondary structure; and 3) comparison of pairwise and multiple RNA secondary structures. The inputs of the web server require RNA primary sequences, while corresponding secondary structures are optional. For the primary sequences alone, the web server can compute the secondary structures using free energy minimization algorithm in terms of RNAfold tool from Vienna RNA package. RNA-TVcurve is the first integrated web server, based on an alignment-free method, to deliver a suite of RNA analysis functions, including visualization, mutation analysis and multiple RNAs structure comparison. 
The comparison results with two popular RNA comparison tools, RNApdist and RNAdistance, showed that RNA-TVcurve can efficiently capture subtle relationships among RNAs for mutation detection and non-coding RNA classification. All the relevant results are shown in an intuitive graphical manner, and can be freely downloaded from this server. RNA-TVcurve, along with test examples and detailed documents, is available at: http://ml.jlu.edu.cn/tvcurve/.
Phylogeny Reconstruction with Alignment-Free Method That Corrects for Horizontal Gene Transfer.

PubMed

Bromberg, Raquel; Grishin, Nick V; Otwinowski, Zbyszek

2016-06-01

Advances in sequencing have generated a large number of complete genomes. Traditionally, phylogenetic analysis relies on alignments of orthologs, but defining orthologs and separating them from paralogs is a complex task that may not always be suited to the large datasets of the future. An alternative to traditional, alignment-based approaches are whole-genome, alignment-free methods. These methods are scalable and require minimal manual intervention. We developed SlopeTree, a new alignment-free method that estimates evolutionary distances by measuring the decay of exact substring matches as a function of match length. SlopeTree corrects for horizontal gene transfer, for composition variation and low complexity sequences, and for branch-length nonlinearity caused by multiple mutations at the same site. We tested SlopeTree on 495 bacteria, 73 archaea, and 72 strains of Escherichia coli and Shigella. We compared our trees to the NCBI taxonomy, to trees based on concatenated alignments, and to trees produced by other alignment-free methods. The results were consistent with current knowledge about prokaryotic evolution. We assessed differences in tree topology over different methods and settings and found that the majority of bacteria and archaea have a core set of proteins that evolves by descent. In trees built from complete genomes rather than sets of core genes, we observed some grouping by phenotype rather than phylogeny, for instance with a cluster of sulfur-reducing thermophilic bacteria coming together irrespective of their phyla. The source-code for SlopeTree is available at: http://prodata.swmed.edu/download/pub/slopetree_v1/slopetree.tar.gz. PMID:27336403
A configuration space of homologous proteins conserving mutual information and allowing a phylogeny inference based on pair-wise Z-score probabilities.

PubMed

Bastien, Olivier; Ortet, Philippe; Roy, Sylvaine; Maréchal, Eric

2005-03-10

Popular methods to reconstruct molecular phylogenies are based on multiple sequence alignments, in which addition or removal of data may change the resulting tree topology. We have sought a representation of homologous proteins that would conserve the information of pair-wise sequence alignments, respect probabilistic properties of Z-scores (Monte Carlo methods applied to pair-wise comparisons) and be the basis for a novel method of consistent and stable phylogenetic reconstruction. We have built up a spatial representation of protein sequences using concepts from particle physics (configuration space) and respecting a frame of constraints deduced from pair-wise alignment score properties in information theory. The obtained configuration space of homologous proteins (CSHP) allows the representation of real and shuffled sequences, and thereupon an expression of the TULIP theorem for Z-score probabilities. Based on the CSHP, we propose a phylogeny reconstruction using Z-scores. Deduced trees, called TULIP trees, are consistent with multiple-alignment based trees. Furthermore, the TULIP tree reconstruction method provides a solution for some previously reported incongruent results, such as the apicomplexan enolase phylogeny. The CSHP is a unified model that conserves mutual information between proteins in the way physical models conserve energy. Applications include the reconstruction of evolutionary consistent and robust trees, the topology of which is based on a spatial representation that is not reordered after addition or removal of sequences. The CSHP and its assigned phylogenetic topology, provide a powerful and easily updated representation for massive pair-wise genome comparisons based on Z-score computations.
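The Z-scores mentioned in this abstract come from Monte Carlo comparison of a real pairwise alignment score against scores obtained with shuffled sequences. The sketch below illustrates only that general idea in Python. It assumes a toy Smith-Waterman scoring scheme (match, mismatch and linear gap values chosen for illustration) rather than a substitution matrix, and the function names `alignment_score` and `z_score` are hypothetical; it is not the TULIP or CSHP implementation.

```python
import random
import statistics


def alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Smith-Waterman local alignment score with a toy scoring scheme."""
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            curr.append(max(0, diag, prev[j] + gap, curr[j - 1] + gap))
        best = max(best, max(curr))
        prev = curr
    return best


def z_score(a, b, n_shuffles=200, seed=0):
    """Z-score of the real score against scores for shuffled copies of b."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    observed = alignment_score(a, b)
    chars = list(b)
    shuffled_scores = []
    for _ in range(n_shuffles):
        rng.shuffle(chars)  # shuffling preserves the composition of b
        shuffled_scores.append(alignment_score(a, "".join(chars)))
    mu = statistics.mean(shuffled_scores)
    sigma = statistics.stdev(shuffled_scores)
    if sigma == 0:
        raise ValueError("shuffled scores have zero variance")
    return (observed - mu) / sigma
```

Shuffling keeps the amino acid composition of the sequence fixed, so a high Z-score indicates similarity beyond what composition alone would produce.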
GOSAP: Gene Ontology-Based Semantic Alignment of Biological Pathways.

PubMed

Gamalielsson, Jonas; Olsson, Bjorn

2008-01-01

We present a new method for semantic comparison of biological pathways, aiming to discover evolutionary conservation of pathways between species. Our method uses all three sub-ontologies of Gene Ontology (GO) and a measure of semantic similarity to calculate match scores between gene products. These scores are used for finding local pairwise pathway alignments. This approach has the advantage of being applicable to all types of pathways where nodes are gene products, e.g., regulatory pathways, signalling pathways and metabolic enzyme-to-enzyme pathways. We demonstrate the usefulness of the method using regulatory and metabolic pathways from E. coli and S. cerevisiae as examples.
Versatility and Invariance in the Evolution of Homologous Heteromeric Interfaces

PubMed Central

Andreani, Jessica; Faure, Guilhem; Guerois, Raphaël

2012-01-01

Evolutionary pressures act on protein complex interfaces so that they preserve their complementarity. Nonetheless, the elementary interactions which compose the interface are highly versatile throughout evolution. Understanding and characterizing interface plasticity across evolution is a fundamental issue which could provide new insights into protein-protein interaction prediction. Using a database of 1,024 couples of close and remote heteromeric structural interologs, we studied protein-protein interactions from a structural and evolutionary point of view. We systematically and quantitatively analyzed the conservation of different types of interface contacts. Our study highlights astonishing plasticity regarding polar contacts at complex interfaces. It also reveals that up to a quarter of the residues switch out of the interface when comparing two homologous complexes. Despite such versatility, we identify two important interface descriptors which correlate with an increased conservation in the evolution of interfaces: apolar patches and contacts surrounding anchor residues. These observations hold true even when restricting the dataset to transiently formed complexes. We show that a combination of six features related either to sequence or to geometric properties of interfaces can be used to rank positions likely to share similar contacts between two interologs. Altogether, our analysis provides important tracks for extracting meaningful information from multiple sequence alignments of conserved binding partners and for discriminating near-native interfaces using evolutionary information. PMID:22952442
Evolutionary analysis of FAM83H in vertebrates.

PubMed

Huang, Wushuang; Yang, Mei; Wang, Changning; Song, Yaling

2017-01-01

Amelogenesis imperfecta is a group of disorders causing abnormalities in enamel formation in various phenotypes. Many mutations in the FAM83H gene have been identified to result in autosomal dominant hypocalcified amelogenesis imperfecta in different populations. However, the structure and function of FAM83H and its pathological mechanism have yet to be further explored. Evolutionary analysis is an alternative for revealing residues or motifs that are important for protein function. In the present study, we chose 50 vertebrate species in public databases representative of approximately 230 million years of evolution, including 1 amphibian, 2 fishes, 7 sauropsids and 40 mammals, and we performed evolutionary analysis on the FAM83H protein. By sequence alignment, conserved residues and motifs were indicated, and the loss of important residues and motifs in five particular species (Malayan pangolin, platypus, minke whale, nine-banded armadillo and aardvark) was discovered. A phylogenetic time tree showed the divergence process of FAM83H. Positive selection sites in the C-terminus suggested that the C-terminus of FAM83H played certain adaptive roles during evolution. The results confirmed some important motifs reported in previous findings and identified some new highly conserved residues and motifs that need further investigation. The results suggest that the C-terminus of FAM83H contains key conserved regions critical to enamel formation and calcification.
Accelerated probabilistic inference of RNA structure evolution

PubMed Central

Holmes, Ian

2005-01-01

Background Pairwise stochastic context-free grammars (Pair SCFGs) are powerful tools for evolutionary analysis of RNA, including simultaneous RNA sequence alignment and secondary structure prediction, but the associated algorithms are intensive in both CPU and memory usage. The same problem is faced by other RNA alignment-and-folding algorithms based on Sankoff's 1985 algorithm. It is therefore desirable to constrain such algorithms, by pre-processing the sequences and using this first pass to limit the range of structures and/or alignments that can be considered. Results We demonstrate how flexible classes of constraint can be imposed, greatly reducing the computational costs while maintaining a high quality of structural homology prediction. Any score-attributed context-free grammar (e.g. energy-based scoring schemes, or conditionally normalized Pair SCFGs) is amenable to this treatment. It is now possible to combine independent structural and alignment constraints of unprecedented general flexibility in Pair SCFG alignment algorithms. We outline several applications to the bioinformatics of RNA sequence and structure, including Waterman-Eggert N-best alignments and progressive multiple alignment. We evaluate the performance of the algorithm on test examples from the RFAM database. Conclusion A program, Stemloc, that implements these algorithms for efficient RNA sequence alignment and structure prediction is available under the GNU General Public License. PMID:15790387
CORAL: aligning conserved core regions across domain families.

PubMed

Fong, Jessica H; Marchler-Bauer, Aron

2009-08-01

Homologous protein families share highly conserved sequence and structure regions that are frequent targets for comparative analysis of related proteins and families. Many protein families, such as the curated domain families in the Conserved Domain Database (CDD), exhibit similar structural cores. To improve accuracy in aligning such protein families, we propose a profile-profile method CORAL that aligns individual core regions as gap-free units. CORAL computes optimal local alignment of two profiles with heuristics to preserve continuity within core regions. We benchmarked its performance on curated domains in CDD, which have pre-defined core regions, against COMPASS, HHalign and PSI-BLAST, using structure superpositions and comprehensive curator-optimized alignments as standards of truth. CORAL improves alignment accuracy on core regions over general profile methods, returning a balanced score of 0.57 for over 80% of all domain families in CDD, compared with the highest balanced score of 0.45 from other methods. Further, CORAL provides E-values to aid in detecting homologous protein families and, by respecting block boundaries, produces alignments with improved 'readability' that facilitate manual refinement. CORAL will be included in future versions of the NCBI Cn3D/CDTree software, which can be downloaded at http://www.ncbi.nlm.nih.gov/Structure/cdtree/cdtree.shtml. Supplementary data are available at Bioinformatics online.
Sequence comparison alignment-free approach based on suffix tree and L-words frequency.

PubMed

Soares, Inês; Goios, Ana; Amorim, António

2012-01-01

The vast majority of methods available for sequence comparison rely on a first sequence alignment step, which requires a number of assumptions on evolutionary history and is sometimes very difficult or impossible to perform due to the abundance of gaps (insertions/deletions). In such cases, an alternative alignment-free method would prove valuable. Our method starts by computing a generalized suffix tree of all sequences, which is completed in linear time. Using this tree, the frequency of all possible words with a preset length L (L-words) in each sequence is rapidly calculated. Based on the L-word frequency profile of each sequence, a pairwise standard Euclidean distance is then computed, producing a symmetric genetic distance matrix, which can be used to generate a neighbor-joining dendrogram or a multidimensional scaling graph. We present an improvement to word-counting alignment-free approaches for sequence comparison, by determining a single optimal word length and combining suffix tree structures with the word-counting tasks. Our approach is, thus, a fast and simple application that proved to be efficient and powerful when applied to mitochondrial genomes. The algorithm was implemented in the Python language and is freely available on the web.
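The L-word profile and Euclidean distance described in this abstract can be illustrated with a short Python sketch. Note the assumptions: words are counted with a plain sliding window instead of the generalized suffix tree the authors use, frequencies are normalized to relative frequencies (the abstract does not say whether counts are normalized), the alphabet is fixed to A, C, G, T, and all function names are illustrative rather than the authors' code.

```python
from collections import Counter
from itertools import product
import math


def lword_profile(seq, L, alphabet="ACGT"):
    """Relative frequency of every possible word of length L in seq."""
    n_windows = max(len(seq) - L + 1, 1)
    counts = Counter(seq[i:i + L] for i in range(len(seq) - L + 1))
    # Fixed word order so that profiles of different sequences are comparable;
    # windows containing characters outside the alphabet are ignored.
    words = ("".join(p) for p in product(alphabet, repeat=L))
    return [counts[w] / n_windows for w in words]


def euclidean(p, q):
    """Standard Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def distance_matrix(seqs, L):
    """Symmetric matrix of pairwise distances between L-word profiles."""
    profiles = [lword_profile(s, L) for s in seqs]
    n = len(profiles)
    return [[euclidean(profiles[i], profiles[j]) for j in range(n)]
            for i in range(n)]


# Toy sequences (hypothetical data, for illustration only)
matrix = distance_matrix(["ACGTACGT", "ACGTACGA", "TTTTTTTT"], L=2)
```

The resulting matrix could then be passed to any neighbor-joining or multidimensional scaling routine. A sliding window costs more than the suffix-tree approach when many word lengths are tried, which is presumably why the authors combine the two.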
Measuring and comparing structural fluctuation patterns in large protein datasets.

PubMed

Fuglebakk, Edvin; Echave, Julián; Reuter, Nathalie

2012-10-01

The function of a protein depends not only on its structure but also on its dynamics. This is at the basis of a large body of experimental and theoretical work on protein dynamics. Further insight into the dynamics-function relationship can be gained by studying the evolutionary divergence of protein motions. To investigate this, we need appropriate comparative dynamics methods. The most used dynamical similarity score is the correlation between the root mean square fluctuations (RMSF) of aligned residues. Despite its usefulness, RMSF is in general less evolutionarily conserved than the native structure. A fundamental issue is whether RMSF is not as conserved as structure because dynamics is less conserved or because RMSF is not the best property to use to study its conservation. We performed a systematic assessment of several scores that quantify the (dis)similarity between protein fluctuation patterns. We show that the best scores perform as well as or better than structural dissimilarity, as assessed by their consistency with the SCOP classification. We conclude that to uncover the full extent of the evolutionary conservation of protein fluctuation patterns, it is important to measure the directions of fluctuations and their correlations between sites. Contact: Nathalie.Reuter@mbi.uib.no. Supplementary data are available at Bioinformatics Online.
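The baseline score discussed in this abstract, the correlation between the RMSF values of aligned residues, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the correlation is taken to be the Pearson coefficient, the structural alignment is assumed to be given as a list of (i, j) residue index pairs, and the names `pearson` and `rmsf_similarity` are hypothetical. The direction-aware scores that the authors recommend are not shown.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def rmsf_similarity(rmsf_a, rmsf_b, aligned_pairs):
    """Correlate per-residue RMSF of two proteins over aligned positions.

    aligned_pairs holds (i, j) tuples: residue i of protein A is aligned
    to residue j of protein B; unaligned residues are simply left out.
    """
    values_a = [rmsf_a[i] for i, _ in aligned_pairs]
    values_b = [rmsf_b[j] for _, j in aligned_pairs]
    return pearson(values_a, values_b)
```

Because RMSF is a scalar per residue, this score ignores the direction of motion and any coupling between sites, which is the limitation the abstract points to.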
Probing binding hot spots at protein-RNA recognition sites.

PubMed

Barik, Amita; Nithin, Chandran; Karampudi, Naga Bhushana Rao; Mukherjee, Sunandan; Bahadur, Ranjit Prasad

2016-01-29

We use evolutionary conservation derived from structure alignment of polypeptide sequences along with structural and physicochemical attributes of protein-RNA interfaces to probe the binding hot spots at protein-RNA recognition sites. We find that the degree of conservation varies across the RNA binding proteins; some evolve rapidly compared to others. Additionally, irrespective of the structural class of the complexes, residues at the RNA binding sites are evolutionarily better conserved than those at the solvent-exposed surfaces. For recognition involving duplex RNA, residues interacting with the major groove are better conserved than those interacting with the minor groove. We identify multi-interface residues participating simultaneously in protein-protein and protein-RNA interfaces in complexes where more than one polypeptide is involved in RNA recognition, and show that they are better conserved compared to any other RNA binding residues. We find that the residues at water preservation sites are better conserved than those at hydrated or dehydrated sites. Finally, we develop a Random Forests model using structural and physicochemical attributes for predicting binding hot spots. The model accurately predicts 80% of the instances of experimental ΔΔG values in a particular class, and provides a stepping-stone towards the engineering of protein-RNA recognition sites with desired affinity. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
G-Anchor: a novel approach for whole-genome comparative mapping utilizing evolutionary conserved DNA sequences.

PubMed

Lenis, Vasileios Panagiotis E; Swain, Martin; Larkin, Denis M

2018-05-01

Cross-species whole-genome sequence alignment is a critical first step for genome comparative analyses, ranging from the detection of sequence variants to studies of chromosome evolution. Animal genomes are large and complex, and whole-genome alignment is a computationally intense process, requiring expensive high-performance computing systems due to the need to explore extensive local alignments. With hundreds of sequenced animal genomes available from multiple projects, there is an increasing demand for genome comparative analyses. Here, we introduce G-Anchor, a new, fast, and efficient pipeline that uses a strictly limited but highly effective set of local sequence alignments to anchor (or map) an animal genome to another species' reference genome. G-Anchor makes novel use of a databank of highly conserved DNA sequence elements. We demonstrate how these elements may be aligned to a pair of genomes, creating anchors. These anchors enable the rapid mapping of scaffolds from a de novo assembled genome to chromosome assemblies of a reference species. Our results demonstrate that G-Anchor can successfully anchor a vertebrate genome onto a phylogenetically related reference species genome using a desktop or laptop computer within a few hours and with comparable accuracy to that achieved by a highly accurate whole-genome alignment tool such as LASTZ. G-Anchor thus makes whole-genome comparisons accessible to researchers with limited computational resources. G-Anchor is a ready-to-use tool for anchoring a pair of vertebrate genomes. It may be used with large genomes that contain a significant fraction of evolutionarily conserved DNA sequences and that are not highly repetitive, polyploid, or excessively fragmented. G-Anchor is not a substitute for whole-genome aligning software but can be used for fast and accurate initial genome comparisons. G-Anchor is a freely available, ready-to-use tool for the pairwise comparison of two genomes.
Conserved structure and inferred evolutionary history of long terminal repeats (LTRs)

PubMed Central

2013-01-01

Background Long terminal repeats (LTRs, consisting of U3-R-U5 portions) are important elements of retroviruses and related retrotransposons. They are difficult to analyse due to their variability. The aim was to obtain a more comprehensive view of structure, diversity and phylogeny of LTRs than hitherto possible. Results Hidden Markov models (HMM) were created for 11 clades of LTRs belonging to Retroviridae (class III retroviruses), animal Metaviridae (Gypsy/Ty3) elements and plant Pseudoviridae (Copia/Ty1) elements, complementing our work with Orthoretrovirus HMMs. The great variation in LTR length of plant Metaviridae and the few divergent animal Pseudoviridae prevented building HMMs from both of these groups. Animal Metaviridae LTRs had the same conserved motifs as retroviral LTRs, confirming that the two groups are closely related. The conserved motifs were the short inverted repeats (SIRs), integrase recognition signals (5´TGTTRNR…YNYAACA 3´); the polyadenylation signal or AATAAA motif; a GT-rich stretch downstream of the polyadenylation signal; and a less conserved AT-rich stretch corresponding to the core promoter element, the TATA box. Plant Pseudoviridae LTRs differed slightly in having a conserved TATA-box, TATATA, but no conserved polyadenylation signal, plus a much shorter R region. The sensitivity of the HMMs for detection in genomic sequences was around 50% for most models, at a relatively high specificity, suitable for genome screening. The HMMs yielded consensus sequences, which were aligned by creating an HMM model (a ‘Superviterbi’ alignment). This yielded a phylogenetic tree that was compared with a Pol-based tree. Both LTR and Pol trees supported monophyly of retroviruses. In both, Pseudoviridae was ancestral to all other LTR retrotransposons. However, the LTR trees showed the chromovirus portion of Metaviridae clustering together with Pseudoviridae, dividing Metaviridae into two portions with distinct phylogeny. 
Conclusion The HMMs clearly demonstrated a unitary conserved structure of LTRs, supporting that they arose once during evolution. We attempted to follow the evolution of LTRs by tracing their functional foundations, that is, acquisition of RNAse H, a combined promoter/ polyadenylation site, integrase, hairpin priming and the primer binding site (PBS). Available information did not support a simple evolutionary chain of events. PMID:23369192
Improving protein-protein interaction prediction using evolutionary information from low-quality MSAs.

PubMed

Várnai, Csilla; Burkoff, Nikolas S; Wild, David L

2017-01-01

Evolutionary information stored in multiple sequence alignments (MSAs) has been used to identify the interaction interface of protein complexes, by measuring either co-conservation or co-mutation of amino acid residues across the interface. Recently, maximum entropy related correlated mutation measures (CMMs) such as direct information, decoupling direct from indirect interactions, have been developed to identify residue pairs interacting across the protein complex interface. These studies have focussed on carefully selected protein complexes with large, good-quality MSAs. In this work, we study protein complexes with a more typical MSA consisting of fewer than 400 sequences, using a set of 79 intramolecular protein complexes. Using a maximum entropy based CMM at the residue level, we develop an interface level CMM score to be used in re-ranking docking decoys. We demonstrate that our interface level CMM score compares favourably to the complementarity trace score, an evolutionary information-based score measuring co-conservation, when combined with the number of interface residues, a knowledge-based potential and the variability score of individual amino acid sites. We also demonstrate that, since co-mutation and co-complementarity in the MSA contain orthogonal information, the best prediction performance using evolutionary information can be achieved by combining the co-mutation information of the CMM with co-conservation information of a complementarity trace score, predicting a near-native structure as the top prediction for 41% of the dataset. The method presented is not restricted to small MSAs, and will likely also improve interface prediction for complexes with large and good-quality MSAs.
Comparison of ZP3 protein sequences among vertebrate species: to obtain a consensus sequence for immunocontraception.

PubMed

Zhu, X; Naz, R K

1999-03-01

The deduced ZP3 amino acid (aa) sequences of 13 vertebrate species namely mouse, hamster, rabbit, pig, porcine, cow, dog, cat, human, bonnet, marmoset, carp, and frog were compared using the PILEUP and PRETTY alignment programs (GCG, Wisconsin, USA). The published aa sequences obtained from 13 vertebrate species indicated overall evolutionary conservation in the N-terminus, central region, and C-terminus of the ZP3 polypeptide. More variations of ZP3 polypeptide sequences were seen in the alignments of carp and frog from the 11 mammalian species making the leader sequence more prominent. The canonical furin proteolytic processing signal at the C-terminus was found in all the ZP3 polypeptide sequences except those of carp and frog. In the central region, the ZP3 deduced aa sequences of all the 13 vertebrate species aligned well, and six relatively conserved sequences were found. There are 11 conserved cysteine residues in the central region across all species including carp and frog, indicating that these residues have a longer evolutionary history. The ZP3 aa sequence similarities were examined using the GAP program (GCG). The highest aa similarities are observed between the members of the same order within the class Mammalia, and also (95.4%) between pig (ungulata) and rabbit (lagomorpha). The deduced ZP3 aa sequences per se may not be enough to build a phylogenetic tree.
AlignNemo: a local network alignment method to integrate homology and topology.

PubMed

Ciriello, Giovanni; Mina, Marco; Guzzi, Pietro H; Cannataro, Mario; Guerra, Concettina

2012-01-01

Local network alignment is an important component of the analysis of protein-protein interaction networks that may lead to the identification of evolutionarily related complexes. We present AlignNemo, a new algorithm that, given the networks of two organisms, uncovers subnetworks of proteins that relate in biological function and topology of interactions. The discovered conserved subnetworks have a general topology and need not correspond to specific interaction patterns, so that they more closely fit the models of functional complexes proposed in the literature. The algorithm is able to handle sparse interaction data with an expansion process that at each step explores the local topology of the networks beyond the proteins directly interacting with the current solution. To assess the performance of AlignNemo, we ran a series of benchmarks using statistical measures as well as biological knowledge. Based on reference datasets of protein complexes, AlignNemo shows better performance than other methods in terms of both precision and recall. We show our solutions to be biologically sound using the concept of semantic similarity applied to Gene Ontology vocabularies. The binaries of AlignNemo and supplementary details about the algorithms and the experiments are available at: sourceforge.net/p/alignnemo.
The Evolutionary Pattern of Glycosylation Sites in Influenza Virus (H5N1) Hemagglutinin and Neuraminidase

PubMed Central

Chen, Wentian; Zhong, Yaogang; Qin, Yannan; Sun, Shisheng; Li, Zheng

2012-01-01

Two glycoproteins, hemagglutinin (HA) and neuraminidase (NA), on the surface of influenza viruses play crucial roles in transfaunation, membrane fusion and the release of progeny virions. To explore the distribution of N-glycosylation sites (glycosites) in these two glycoproteins, we collected and aligned the amino acid sequences of all the HA and NA subtypes. Two glycosites were located at HA0 cleavage sites and fusion peptides and were strikingly conserved in all HA subtypes, while the remaining glycosites were unique to their subtypes. Two to four conserved glycosites were found in the stalk domain of NA, but these are affected by the deletion of specific stalk domain sequences. Another highly conserved glycosite appeared at the top center of the tetrameric global domain, while the other glycosites were distributed around the global domain. Here we present a detailed investigation of the distribution and the evolutionary pattern of the glycosites in the envelope glycoproteins of IVs, and further focus on the H5N1 virus and conclude that the glycosites in H5N1 have become more complicated in HA and less influential in NA in the last five years. PMID:23133677

EvoDB: a database of evolutionary rate profiles, associated protein domains and phylogenetic trees for PFAM-A

PubMed Central

Ndhlovu, Andrew; Durand, Pierre M.; Hazelhurst, Scott

2015-01-01

The evolutionary rate at codon sites across protein-coding nucleotide sequences represents a valuable tier of information for aligning sequences, inferring homology and constructing phylogenetic profiles. However, a comprehensive resource for cataloguing the evolutionary rate at codon sites and their corresponding nucleotide and protein domain sequence alignments has not been developed. To address this gap in knowledge, EvoDB (an Evolutionary rates DataBase) was compiled. Nucleotide sequences and their corresponding protein domain data including the associated seed alignments from the PFAM-A (protein family) database were used to estimate evolutionary rate (ω = dN/dS) profiles at codon sites for each entry. EvoDB contains 98.83% of the gapped nucleotide sequence alignments and 97.1% of the evolutionary rate profiles for the corresponding information in PFAM-A. As the identification of codon sites under positive selection and their position in a sequence profile is usually the most sought after information for molecular evolutionary biologists, evolutionary rate profiles were determined under the M2a model using the CODEML algorithm in the PAML (Phylogenetic Analysis by Maximum Likelihood) suite of software. Validation of nucleotide sequences against amino acid data was implemented to ensure high data quality. EvoDB is a catalogue of the evolutionary rate profiles and provides the corresponding phylogenetic trees, PFAM-A alignments and annotated accession identifier data. In addition, the database can be explored and queried using known evolutionary rate profiles to identify domains under similar evolutionary constraints and pressures. EvoDB is a resource for evolutionary, phylogenetic studies and presents a tier of information untapped by current databases. Database URL: http://www.bioinf.wits.ac.za/software/fire/evodb PMID:26140928
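The ratio ω = dN/dS named in the abstract is conventionally read as follows: values below 1 suggest purifying selection, values near 1 neutral evolution, and values above 1 positive selection. The following is a minimal sketch of that reading only, not of CODEML or the M2a model. The function names, the tolerance `tol`, and the sample rates are hypothetical, and a real analysis would rely on a likelihood-based test rather than a fixed cutoff.

```python
def omega(dn, ds):
    """Return the evolutionary rate ratio dN/dS for one codon site."""
    if ds <= 0:
        raise ValueError("dS must be positive for omega to be defined")
    return dn / ds


def selection_regime(w, tol=0.05):
    """Label a site by its omega value; `tol` is an illustrative cutoff only."""
    if w > 1 + tol:
        return "positive"
    if w < 1 - tol:
        return "purifying"
    return "neutral"


# Hypothetical per-site (dN, dS) estimates, not taken from EvoDB.
sites = [(0.02, 0.10), (0.10, 0.10), (0.30, 0.10)]
regimes = [selection_regime(omega(dn, ds)) for dn, ds in sites]
print(regimes)
```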
Improved measurements of RNA structure conservation with generalized centroid estimators.

PubMed

Okada, Yohei; Saito, Yutaka; Sato, Kengo; Sakakibara, Yasubumi

2011-01-01

Identification of non-protein-coding RNAs (ncRNAs) in genomes is a crucial task for not only molecular cell biology but also bioinformatics. Secondary structures of ncRNAs are employed as a key feature of ncRNA analysis since biological functions of ncRNAs are deeply related to their secondary structures. Although the minimum free energy (MFE) structure of an RNA sequence is regarded as the most stable structure, MFE alone is not an appropriate measure for identifying ncRNAs since the free energy is heavily biased by the nucleotide composition. Therefore, instead of MFE itself, several alternative measures for identifying ncRNAs have been proposed such as the structure conservation index (SCI) and the base pair distance (BPD), both of which employ MFE structures. However, these measurements are unfortunately not suitable for identifying ncRNAs in some cases, including genome-wide searches, and incur a high false discovery rate. In this study, we propose improved measurements based on SCI and BPD, applying generalized centroid estimators to incorporate robustness against low quality multiple alignments. Our experiments show that our proposed methods achieve higher accuracy than the original SCI and BPD for not only human-curated structural alignments but also low quality alignments produced by CLUSTAL W. Furthermore, the centroid-based SCI on CLUSTAL W alignments is more accurate than or comparable with the original SCI on structural alignments generated with RAF, a high quality structural aligner, which requires on average twice the computational time. We conclude that our methods are more suitable than the original SCI and BPD for genome-wide alignments, which are of low quality from the point of view of secondary structures.
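In its usual definition, the structure conservation index is the consensus MFE of the alignment divided by the mean MFE of the individual sequences, so values near 1 indicate a well-conserved structure and values near 0 indicate none. The sketch below illustrates only this original ratio, not the centroid-based variant proposed in the abstract. It assumes the energies have already been computed by an external folding program. The function name and the sample energies are hypothetical.

```python
from statistics import mean


def structure_conservation_index(consensus_mfe, single_mfes):
    """SCI = consensus MFE / mean single-sequence MFE (energies in kcal/mol)."""
    if not single_mfes:
        raise ValueError("need at least one single-sequence MFE")
    average = mean(single_mfes)
    if average == 0:
        raise ValueError("mean single-sequence MFE is zero; SCI is undefined")
    return consensus_mfe / average


# Hypothetical energies for a three-sequence alignment, not taken from the paper.
sci = structure_conservation_index(-18.0, [-20.0, -22.0, -18.0])
print(round(sci, 2))
```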
A Dense Brown Trout (Salmo trutta) Linkage Map Reveals Recent Chromosomal Rearrangements in the Salmo Genus and the Impact of Selection on Linked Neutral Diversity

PubMed Central

Leitwein, Maeva; Guinand, Bruno; Pouzadoux, Juliette; Desmarais, Erick; Berrebi, Patrick; Gagnaire, Pierre-Alexandre

2017-01-01

High-density linkage maps are valuable tools for conservation and eco-evolutionary issues. In salmonids, a complex rediploidization process consecutive to an ancient whole genome duplication event makes linkage maps of prime importance for investigating the evolutionary history of chromosome rearrangements. Here, we developed a high-density consensus linkage map for the brown trout (Salmo trutta), a socioeconomically important species heavily impacted by human activities. A total of 3977 ddRAD markers were mapped and ordered in 40 linkage groups using sex- and lineage-averaged recombination distances obtained from two family crosses. Performing map comparison between S. trutta and its sister species, S. salar, revealed extensive chromosomal rearrangements. Strikingly, all of the fusion and fission events that occurred after the S. salar/S. trutta speciation happened in the Atlantic salmon branch, whereas the brown trout remained closer to the ancestral chromosome structure. Using the strongly conserved synteny within chromosome arms, we aligned the brown trout linkage map to the Atlantic salmon genome sequence to estimate the local recombination rate in S. trutta at 3721 loci. A significant positive correlation between recombination rate and within-population nucleotide diversity (π) was found, indicating that selection constrains variation at linked neutral sites in brown trout. This new high-density linkage map provides a useful genomic resource for future aquaculture, conservation, and eco-evolutionary studies in brown trout. PMID:28235829
Analyses of the radiation of birnaviruses from diverse host phyla and of their evolutionary affinities with other double-stranded RNA and positive strand RNA viruses using robust structure-based multiple sequence alignments and advanced phylogenetic methods

PubMed Central

2013-01-01

Background Birnaviruses form a distinct family of double-stranded RNA viruses infecting animals as different as vertebrates, mollusks, insects and rotifers. With such a wide host range, they constitute a good model for studying the adaptation to the host. Additionally, several lines of evidence link birnaviruses to positive strand RNA viruses and suggest that phylogenetic analyses may provide clues about transition. Results We characterized the genome of a birnavirus from the rotifer Brachionus plicatilis. We used X-ray structures of RNA-dependent RNA polymerases and capsid proteins to obtain multiple structure alignments that allowed us to obtain reliable multiple sequence alignments and we employed “advanced” phylogenetic methods to study the evolutionary relationships between some positive strand and double-stranded RNA viruses. We showed that the rotifer birnavirus genome exhibited an organization remarkably similar to other birnaviruses. As this host was phylogenetically very distant from the other known species targeted by birnaviruses, we revisited the evolutionary pathways within the Birnaviridae family using phylogenetic reconstruction methods. We also applied a number of phylogenetic approaches based on structurally conserved domains/regions of the capsid and RNA-dependent RNA polymerase proteins to study the evolutionary relationships between birnaviruses, other double-stranded RNA viruses and positive strand RNA viruses. Conclusions We show that there is a good correlation between the phylogeny of the birnaviruses and that of their hosts at the phylum level using the RNA-dependent RNA polymerase (genomic segment B) on the one hand and a concatenation of the capsid protein, protease and ribonucleoprotein (genomic segment A) on the other hand. This correlation tends to vanish within phyla. The use of advanced phylogenetic methods and robust structure-based multiple sequence alignments allowed us to obtain a more accurate picture (in terms of probability of the tree topologies) of the evolutionary affinities between double-stranded RNA and positive strand RNA viruses. In particular, we were able to show that there is good statistical support for the claims that dsRNA viruses are not monophyletic and that viruses with permuted RdRps belong to a common evolutionary lineage, as previously proposed by other groups. We also propose a tree topology with good statistical support describing the evolutionary relationships between the Picornaviridae, Caliciviridae, Flaviviridae families and a group including the Alphatetraviridae, Nodaviridae, Permutotetraviridae, Birnaviridae, and Cystoviridae families. PMID:23865988
gmos: Rapid Detection of Genome Mosaicism over Short Evolutionary Distances

PubMed Central

Domazet-Lošo, Mirjana; Domazet-Lošo, Tomislav

2016-01-01

Prokaryotic and viral genomes are often altered by recombination and horizontal gene transfer. The existing methods for detecting recombination are primarily aimed at viral genomes or sets of loci, since the expensive computation of underlying statistical models often hinders the comparison of complete prokaryotic genomes. As an alternative, alignment-free solutions are more efficient, but cannot map (align) a query to subject genomes. To address this problem, we have developed gmos (Genome MOsaic Structure), a new program that determines the mosaic structure of query genomes when compared to a set of closely related subject genomes. The program first computes local alignments between query and subject genomes and then reconstructs the query mosaic structure by choosing the best local alignment for each query region. To accomplish the analysis quickly, the program mostly relies on pairwise alignments and constructs multiple sequence alignments over short overlapping subject regions only when necessary. This fine-tuned implementation achieves an efficiency comparable to an alignment-free tool. The program performs well for simulated and real data sets of closely related genomes and can be used for fast recombination detection; for instance, when a new prokaryotic pathogen is discovered. As an example, gmos was used to detect genome mosaicism in a pathogenic Enterococcus faecium strain compared to seven closely related genomes. The analysis took less than two minutes on a single 2.1 GHz processor. The output is available in fasta format and can be visualized using an accessory program, gmosDraw (freely available with gmos). PMID:27846272
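The step of "choosing the best local alignment for each query region" can be pictured with a toy example. The following is a simplified sketch and not the gmos algorithm: it assumes local alignment hits are already available as (query start, query end, subject name, score) tuples with half-open coordinates, assigns each query position to its highest-scoring hit, and merges neighbouring positions into segments. The function name, the tuple layout, and the sample hits are hypothetical.

```python
def mosaic_structure(query_len, hits):
    """Assign each query position to its best-scoring hit and merge into segments.

    hits: iterable of (q_start, q_end, subject, score), half-open coordinates.
    Returns a list of (start, end, subject); subject is None where nothing aligns.
    """
    best = [(None, float("-inf"))] * query_len
    for q_start, q_end, subject, score in hits:
        for pos in range(max(q_start, 0), min(q_end, query_len)):
            if score > best[pos][1]:
                best[pos] = (subject, score)

    segments = []
    for pos, (subject, _) in enumerate(best):
        if segments and segments[-1][2] == subject:
            # Extend the current segment while the best subject stays the same.
            segments[-1] = (segments[-1][0], pos + 1, subject)
        else:
            segments.append((pos, pos + 1, subject))
    return segments


# Hypothetical hits: subject B aligns better than subject A where they overlap.
hits = [(0, 60, "A", 50.0), (40, 100, "B", 80.0)]
segments = mosaic_structure(100, hits)
print(segments)
```

A per-position table is the simplest way to show the idea. For whole genomes an interval-based sweep would avoid touching every base.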
CORE_TF: a user-friendly interface to identify evolutionary conserved transcription factor binding sites in sets of co-regulated genes

PubMed Central

Hestand, Matthew S; van Galen, Michiel; Villerius, Michel P; van Ommen, Gert-Jan B; den Dunnen, Johan T; 't Hoen, Peter AC

2008-01-01

Background The identification of transcription factor binding sites is difficult since they are only a small number of nucleotides in size, resulting in large numbers of false positives and false negatives in current approaches. Computational methods to reduce false positives are to look for over-representation of transcription factor binding sites in a set of similarly regulated promoters or to look for conservation in orthologous promoter alignments. Results We have developed a novel tool, "CORE_TF" (Conserved and Over-REpresented Transcription Factor binding sites) that identifies common transcription factor binding sites in promoters of co-regulated genes. To improve upon existing binding site predictions, the tool searches for position weight matrices from the TRANSFAC® database that are over-represented in an experimental set compared to a random set of promoters and identifies cross-species conservation of the predicted transcription factor binding sites. The algorithm has been evaluated with expression and chromatin-immunoprecipitation on microarray data. We also implement and demonstrate the importance of matching the random set of promoters to the experimental promoters by GC content, which is a unique feature of our tool. Conclusion The program CORE_TF is accessible in a user friendly web interface at . It provides a table of over-represented transcription factor binding sites in the promoters of the user's input genes and a graphical view of evolutionarily conserved transcription factor binding sites. In our test data sets it successfully predicts target transcription factors and their binding sites. PMID:19036135
Combining protein sequence, structure, and dynamics: A novel approach for functional evolution analysis of PAS domain superfamily.

PubMed

Dong, Zheng; Zhou, Hongyu; Tao, Peng

2018-02-01

PAS domains are widespread in archaea, bacteria, and eukaryota, and play important roles in various functions. In this study, we aim to explore the functional evolutionary relationship among proteins in the PAS domain superfamily in view of the sequence-structure-dynamics-function relationship. We collected protein sequences and crystal structure data from the RCSB Protein Data Bank for PAS domain superfamily members belonging to three biological functions (nucleotide binding, photoreceptor activity, and transferase activity). Protein sequences were aligned and then used to select sequence-conserved residues and build a phylogenetic tree. Three-dimensional structure alignment was also applied to obtain structure-conserved residues. The protein dynamics were analyzed using an elastic network model (ENM) and validated by molecular dynamics (MD) simulation. The results showed that proteins with the same function could be grouped by sequence similarity, and proteins in different functional groups displayed statistically significant differences in their vibrational patterns. Interestingly, in all three functional groups, conserved amino acid residues identified by sequence and structure conservation analysis generally have a lower fluctuation than other residues. In addition, the fluctuation of conserved residues in each biological function group was strongly correlated with the corresponding biological function. This research suggested a direct connection in which the protein sequences were related to various functions through structural dynamics. This is a new attempt to delineate functional evolution of proteins using the integrated information of sequence, structure, and dynamics. © 2017 The Protein Society.
SWPhylo – A Novel Tool for Phylogenomic Inferences by Comparison of Oligonucleotide Patterns and Integration of Genome-Based and Gene-Based Phylogenetic Trees

PubMed Central

Yu, Xiaoyu; Reva, Oleg N

2018-01-01

Modern phylogenetic studies may benefit from the analysis of complete genome sequences of various microorganisms. Evolutionary inferences based on genome-scale analysis are believed to be more accurate than the gene-based alternative. However, the computational complexity of current phylogenomic procedures, inappropriateness of standard phylogenetic tools to process genome-wide data, and lack of reliable substitution models which correlates with alignment-free phylogenomic approaches deter microbiologists from using these opportunities. For example, the super-matrix and super-tree approaches of phylogenomics use multiple integrated genomic loci or individual gene-based trees to infer an overall consensus tree. However, these approaches potentially multiply errors of gene annotation and sequence alignment, not to mention the computational complexity and laboriousness of the methods. In this article, we demonstrate that the annotation- and alignment-free comparison of genome-wide tetranucleotide frequencies, termed oligonucleotide usage patterns (OUPs), allowed a fast and reliable inference of phylogenetic trees. These were congruent with the corresponding whole genome super-matrix trees in terms of tree topology when compared with other known approaches including 16S ribosomal RNA and GyrA protein sequence comparison, complete genome-based MAUVE, and CVTree methods. A Web-based program to perform the alignment-free OUP-based phylogenomic inferences was implemented at http://swphylo.bi.up.ac.za/. Applicability of the tool was tested on different taxa from subspecies to intergeneric levels. Distinguishing between closely related taxonomic units may be enforced by providing the program with alignments of marker protein sequences, eg, GyrA. PMID:29511354
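The core idea of an alignment-free comparison based on tetranucleotide frequencies can be illustrated in a few lines. This is a minimal sketch under simplifying assumptions, not the SWPhylo implementation: it counts overlapping 4-mers on one strand only, ignores words containing ambiguous bases, and compares two frequency vectors with a plain Euclidean distance. The function names and the toy sequences are hypothetical, and the actual OUP statistics may be normalized differently.

```python
import math
from collections import Counter
from itertools import product

# All 256 tetranucleotides in a fixed order, so vectors are comparable.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetranucleotide_frequencies(sequence):
    """Relative frequencies of overlapping 4-mers (single strand, ACGT only)."""
    sequence = sequence.upper()
    counts = Counter(sequence[i:i + 4] for i in range(len(sequence) - 3))
    total = sum(counts[k] for k in KMERS)
    if total == 0:
        raise ValueError("sequence contains no unambiguous tetranucleotide")
    return [counts[k] / total for k in KMERS]


def euclidean_distance(a, b):
    """Euclidean distance between two equally long frequency vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Toy sequences, far too short for real use; genomes would be read from FASTA.
freq_a = tetranucleotide_frequencies("ACGTACGTACGTACGT")
freq_b = tetranucleotide_frequencies("AAAAAAAATTTTTTTT")
distance = euclidean_distance(freq_a, freq_b)
print(round(distance, 3))
```

A matrix of such pairwise distances could then be passed to any distance-based tree-building method, such as neighbor joining.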
PDB-wide identification of biological assemblies from conserved quaternary structure geometry.

PubMed

Dey, Sucharita; Ritchie, David W; Levy, Emmanuel D

2018-01-01

Protein structures are key to understanding biomolecular mechanisms and diseases, yet their interpretation is hampered by limited knowledge of their biologically relevant quaternary structure (QS). A critical challenge in inferring QS information from crystallographic data is distinguishing biological interfaces from fortuitous crystal-packing contacts. Here, we tackled this problem by developing strategies for aligning and comparing QS states across both homologs and data repositories. QS conservation across homologs proved remarkably strong at predicting biological relevance and is implemented in two methods, QSalign and anti-QSalign, for annotating homo-oligomers and monomers, respectively. QS conservation across repositories is implemented in QSbio (http://www.QSbio.org), which approaches the accuracy of manual curation and allowed us to predict >100,000 QS states across the Protein Data Bank. Based on this high-quality data set, we analyzed pairs of structurally conserved interfaces, and this analysis revealed a striking plasticity whereby evolutionarily distant interfaces maintain similar interaction geometries through widely divergent chemical properties.
Flavivirus and Filovirus EvoPrinters: New alignment tools for the comparative analysis of viral evolution.

PubMed

Brody, Thomas; Yavatkar, Amarendra S; Park, Dong Sun; Kuzin, Alexander; Ross, Jermaine; Odenwald, Ward F

2017-06-01

Flavivirus and Filovirus infections are serious epidemic threats to human populations. Multi-genome comparative analysis of these evolving pathogens affords a view of their essential, conserved sequence elements as well as progressive evolutionary changes. While phylogenetic analysis has yielded important insights, the growing number of available genomic sequences makes comparisons between hundreds of viral strains challenging. We report here a new approach for the comparative analysis of these hemorrhagic fever viruses that can superimpose an unlimited number of one-on-one alignments to identify important features within genomes of interest. We have adapted EvoPrinter alignment algorithms for the rapid comparative analysis of Flavivirus or Filovirus sequences including Zika and Ebola strains. The user can input a full genome or partial viral sequence and then view either individual comparisons or generate color-coded readouts that superimpose hundreds of one-on-one alignments to identify unique or shared identity SNPs that reveal ancestral relationships between strains. The user can also opt to select a database genome in order to access a library of pre-aligned genomes of either 1,094 Flaviviruses or 460 Filoviruses for rapid comparative analysis with all database entries or a select subset. Using EvoPrinter search and alignment programs, we show the following: 1) superimposing alignment data from many related strains identifies lineage identity SNPs, which enable the assessment of sublineage complexity within viral outbreaks; 2) whole-genome SNP profile screens uncover novel Dengue2 and Zika recombinant strains and their parental lineages; 3) differential SNP profiling identifies host cell A-to-I hyper-editing within Ebola and Marburg viruses, and 4) hundreds of superimposed one-on-one Ebola genome alignments highlight ultra-conserved regulatory sequences, invariant amino acid codons and evolutionarily variable protein-encoding domains within a single genome. 
EvoPrinter allows for the assessment of lineage complexity within Flavivirus or Filovirus outbreaks and the identification of recombinant strains, highlights sequences that have undergone host cell A-to-I editing, and identifies unique input and database SNPs within highly conserved sequences. EvoPrinter's ability to superimpose alignment data from hundreds of strains onto a single genome has allowed us to identify unique Zika virus sublineages that are currently spreading in South, Central and North America, the Caribbean, and in China. This new set of integrated alignment programs should serve as a useful addition to existing tools for the comparative analysis of these viruses.
DNATagger, colors for codons.

PubMed

Scherer, N M; Basso, D M

2008-09-16

DNATagger is a web-based tool for coloring and editing DNA, RNA and protein sequences and alignments. It is dedicated to the visualization of protein coding sequences and also protein sequence alignments to facilitate the comprehension of evolutionary processes in sequence analysis. The distinctive feature of DNATagger is the use of codons as informative units for coloring DNA and RNA sequences. The codons are colored according to their corresponding amino acids. It is the first program that colors codons in DNA sequences without being affected by "out-of-frame" gaps of alignments. It can handle single gaps and gaps inside the triplets. The program also provides the possibility to edit the alignments and change color patterns and translation tables. DNATagger is a JavaScript application, following the W3C guidelines, designed to work on standards-compliant web browsers. It therefore requires no installation and is platform independent. The web-based DNATagger is available as free and open source software at http://www.inf.ufrgs.br/~dmbasso/dnatagger/.
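Treating codons as units in a gapped alignment row can be sketched as follows. This is an illustration in Python rather than DNATagger's own JavaScript, and the helper name `tag_codons` is hypothetical. It assumes the standard genetic code and simply skips gap characters, so a codon whose bases are separated by gaps is still recognized as one unit. Each codon is reported with the alignment columns it occupies, which is the information a coloring step would need.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def tag_codons(aligned_row, gap="-"):
    """Return [(columns, amino_acid)] for each complete codon in a gapped row.

    Gaps are skipped, so they may fall between or inside triplets.
    Codons with ambiguous bases are tagged "X".
    """
    tags = []
    codon, columns = "", []
    for column, base in enumerate(aligned_row.upper().replace("U", "T")):
        if base == gap:
            continue
        codon += base
        columns.append(column)
        if len(codon) == 3:
            tags.append((tuple(columns), CODON_TABLE.get(codon, "X")))
            codon, columns = "", []
    return tags


# Toy alignment row with gaps inside both triplets.
tags = tag_codons("AT-GA--AA")
print(tags)
```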
ProfileGrids: a sequence alignment visualization paradigm that avoids the limitations of Sequence Logos.

PubMed

Roca, Alberto I

2014-01-01

The 2013 BioVis Contest provided an opportunity to evaluate different paradigms for visualizing protein multiple sequence alignments. Such data sets are becoming extremely large and thus taxing current visualization paradigms. Sequence Logos represent consensus sequences but have limitations for protein alignments. As an alternative, ProfileGrids are a new protein sequence alignment visualization paradigm that represents an alignment as a color-coded matrix of the residue frequency occurring at every homologous position in the aligned protein family. The JProfileGrid software program was used to analyze the BioVis contest data sets to generate figures for comparison with the Sequence Logo reference images. The ProfileGrid representation allows for the clear and effective analysis of protein multiple sequence alignments. This includes both a general overview of the conservation and diversity sequence patterns as well as the interactive ability to query the details of the protein residue distributions in the alignment. The JProfileGrid software is free and available from http://www.ProfileGrid.org.
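The matrix underlying a ProfileGrid, residue frequency at every alignment position, can be sketched in a few lines. This is only an illustration of the data structure, not JProfileGrid code; `profile_grid` is a hypothetical name and the input is assumed to be a list of equal-length aligned sequences.

```python
from collections import Counter

def profile_grid(alignment):
    """Return a list with one dict per column mapping residue -> frequency."""
    n_seqs = len(alignment)
    grid = []
    for column in zip(*alignment):  # iterate over homologous positions
        counts = Counter(column)
        grid.append({res: cnt / n_seqs for res, cnt in counts.items()})
    return grid

alignment = ["MKV", "MRV", "MK-", "MKV"]
for position, freqs in enumerate(profile_grid(alignment), start=1):
    print(position, freqs)
```

Each dictionary corresponds to one column of the grid; a viewer would shade each cell according to its frequency.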
Evolutionary divergence of chloroplast FAD synthetase proteins

PubMed Central

2010-01-01

Background Flavin adenine dinucleotide synthetases (FADSs) - a group of bifunctional enzymes that carry out the dual functions of riboflavin phosphorylation to produce flavin mononucleotide (FMN) and its subsequent adenylation to generate FAD in most prokaryotes - were studied in plants in terms of sequence, structure and evolutionary history. Results Using a variety of bioinformatics methods we have found that FADS enzymes localized to the chloroplasts, which we term as plant-like FADS proteins, are distributed across a variety of green plant lineages and constitute a divergent protein family clearly of cyanobacterial origin. The C-terminal module of these enzymes does not contain the typical riboflavin kinase active site sequence, while the N-terminal module is broadly conserved. These results agree with a previous work reported by Sandoval et al. in 2008. Furthermore, our observations and preliminary experimental results indicate that the C-terminus of plant-like FADS proteins may contain a catalytic activity, but different to that of their prokaryotic counterparts. In fact, homology models predict that plant-specific conserved residues constitute a distinct active site in the C-terminus. Conclusions A structure-based sequence alignment and an in-depth evolutionary survey of FADS proteins, thought to be crucial in plant metabolism, are reported, which will be essential for the correct annotation of plant genomes and further structural and functional studies. This work is a contribution to our understanding of the evolutionary history of plant-like FADS enzymes, which constitute a new family of FADS proteins whose C-terminal module might be involved in a distinct catalytic activity. PMID:20955574
Genomic diversity guides conservation strategies among rare terrestrial orchid species when taxonomy remains uncertain.

PubMed

Ahrens, Collin W; Supple, Megan A; Aitken, Nicola C; Cantrill, David J; Borevitz, Justin O; James, Elizabeth A

2017-06-01

Species are often used as the unit for conservation, but may not be suitable for species complexes where taxa are difficult to distinguish. Under such circumstances, it may be more appropriate to consider species groups or populations as evolutionarily significant units (ESUs). A population genomic approach was employed to investigate the diversity within and among closely related species to create a more robust, lineage-specific conservation strategy for a nationally endangered terrestrial orchid and its relatives from south-eastern Australia. Four putative species were sampled from a total of 16 populations in the Victorian Volcanic Plain (VVP) bioregion and one population of a sub-alpine outgroup in south-eastern Australia. Morphological measurements were taken in situ along with leaf material for genotyping by sequencing (GBS) and microsatellite analyses. Species could not be differentiated using morphological measurements. Microsatellite and GBS markers confirmed the outgroup as distinct, but only GBS markers provided resolution of population genetic structure. The nationally endangered Diuris basaltica was indistinguishable from two related species ( D. chryseopsis and D. behrii ), while the state-protected D. gregaria showed genomic differentiation. Genomic diversity identified among the four Diuris species suggests that conservation of this taxonomically complex group will be best served by considering them as one ESU rather than separately aligned with species as currently recognized. This approach will maximize evolutionary potential among all species during increased isolation and environmental change. The methods used here can be applied generally to conserve evolutionary processes for groups where taxonomic uncertainty hinders the use of species as conservation units. © The Author 2017. Published by Oxford University Press on behalf of the Annals of Botany Company. All rights reserved. For Permissions, please email: journals.permissions@oup.com
Characterising RNA secondary structure space using information entropy

PubMed Central

2013-01-01

Comparative methods for RNA secondary structure prediction use evolutionary information from RNA alignments to increase prediction accuracy. The model is often described in terms of stochastic context-free grammars (SCFGs), which generate a probability distribution over secondary structures. It is, however, unclear how this probability distribution changes as a function of the input alignment. As prediction programs typically only return a single secondary structure, better characterisation of the underlying probability space of RNA secondary structures is of great interest. In this work, we show how to efficiently compute the information entropy of the probability distribution over RNA secondary structures produced for RNA alignments by a phylo-SCFG, and implement it for the PPfold model. We also discuss interpretations and applications of this quantity, including how it can clarify reasons for low prediction reliability scores. PPfold and its source code are available from http://birc.au.dk/software/ppfold/. PMID:23368905
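The quantity in question is the Shannon entropy of the distribution over structures. The sketch below computes it for an explicitly enumerated toy distribution; the paper computes it efficiently from the grammar without enumeration, which this example does not attempt. The function name and the toy probabilities are illustrative assumptions.

```python
import math

def shannon_entropy(probabilities):
    """Entropy in bits of a discrete distribution given as probabilities."""
    total = sum(probabilities)
    if not math.isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError("probabilities must sum to 1")
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# One dominant structure: low entropy, the single prediction is informative.
print(shannon_entropy([0.97, 0.01, 0.01, 0.01]))
# Four equally likely structures: 2 bits, any single prediction is unreliable.
print(shannon_entropy([0.25, 0.25, 0.25, 0.25]))
```

A high entropy indicates that probability mass is spread over many structures, which is one way a low reliability score can arise.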
Reconstructing evolutionary trees in parallel for massive sequences.

PubMed

Zou, Quan; Wan, Shixiang; Zeng, Xiangxiang; Ma, Zhanshan Sam

2017-12-14

Building evolutionary trees for massive unaligned DNA sequences is challenging and crucial. However, reconstructing an evolutionary tree for ultra-large sequence sets is hard. Massive multiple sequence alignment is also challenging and time/space consuming. Hadoop and Spark were developed recently and offer new opportunities for classical computational biology problems. In this paper, we tried to solve multiple sequence alignment and evolutionary reconstruction in parallel. HPTree, which is developed in this paper, can deal with big DNA sequence files quickly. It works well on files larger than 1 GB and performs better than other evolutionary reconstruction tools. Users can use HPTree for reconstructing evolutionary trees on computer clusters or cloud platforms (e.g., Amazon Cloud). HPTree could help with population evolution research and metagenomics analysis. In this paper, we employ the Hadoop and Spark platforms and design an evolutionary tree reconstruction software tool for unaligned massive DNA sequences. Clustering and multiple sequence alignment are done in parallel. The neighbour-joining method was employed for evolutionary tree building. We released our software together with its source code at http://lab.malab.cn/soft/HPtree/.
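For readers unfamiliar with neighbour-joining, the following is a minimal single-machine sketch of the topology-building step. It is not HPTree's parallel implementation: it omits branch lengths, runs serially, and expects a precomputed distance table keyed by `frozenset` pairs, which is an input format assumed here for brevity.

```python
from itertools import combinations

def neighbor_joining(names, distances):
    """Return the tree topology as nested tuples.
    distances maps frozenset({a, b}) -> distance between a and b."""
    d = dict(distances)
    nodes = list(names)

    def dist(a, b):
        return d[frozenset((a, b))]

    while len(nodes) > 2:
        n = len(nodes)
        # Net divergence of every node from all the others.
        r = {a: sum(dist(a, b) for b in nodes if b != a) for a in nodes}
        # Join the pair minimising Q(a, b) = (n - 2) d(a, b) - r(a) - r(b).
        x, y = min(combinations(nodes, 2),
                   key=lambda p: (n - 2) * dist(*p) - r[p[0]] - r[p[1]])
        new = (x, y)
        for c in nodes:
            if c not in (x, y):
                d[frozenset((new, c))] = 0.5 * (dist(x, c) + dist(y, c) - dist(x, y))
        nodes = [c for c in nodes if c not in (x, y)] + [new]
    return tuple(nodes)

pairs = {("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
         ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
         ("c", "d"): 8, ("c", "e"): 7, ("d", "e"): 3}
table = {frozenset(k): float(v) for k, v in pairs.items()}
print(neighbor_joining("abcde", table))
```

On this small five-taxon example the sketch groups a with b and d with e, and attaches c to the (a, b) cluster. The Q-matrix search is quadratic in the number of nodes at every step, which is the kind of cost that motivates distributing the work for very large inputs.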
Insights into the fold organization of TIM barrel from interaction energy based structure networks.

PubMed

Vijayabaskar, M S; Vishveshwara, Saraswathi

2012-01-01

There are many well-known examples of proteins with low sequence similarity, adopting the same structural fold. This aspect of sequence-structure relationship has been extensively studied both experimentally and theoretically, however with limited success. Most of the studies consider remote homology or "sequence conservation" as the basis for their understanding. Recently "interaction energy" based network formalism (Protein Energy Networks (PENs)) was developed to understand the determinants of protein structures. In this paper we have used these PENs to investigate the common non-covalent interactions and their collective features which stabilize the TIM barrel fold. We have also developed a method of aligning PENs in order to understand the spatial conservation of interactions in the fold. We have identified key common interactions responsible for the conservation of the TIM fold, despite high sequence dissimilarity. For instance, the central beta barrel of the TIM fold is stabilized by long-range high energy electrostatic interactions and low-energy contiguous vdW interactions in certain families. The other interfaces like the helix-sheet or the helix-helix seem to be devoid of any high energy conserved interactions. Conserved interactions in the loop regions around the catalytic site of the TIM fold have also been identified, pointing out their significance in both structural and functional evolution. Based on these investigations, we have developed a novel network based phylogenetic analysis for remote homologues, which can perform better than sequence based phylogeny. Such an analysis is more meaningful from both structural and functional evolutionary perspective. We believe that the information obtained through the "interaction conservation" viewpoint and the subsequently developed method of structure network alignment, can shed new light in the fields of fold organization and de novo computational protein design.

MEGA5: Molecular Evolutionary Genetics Analysis Using Maximum Likelihood, Evolutionary Distance, and Maximum Parsimony Methods

PubMed Central

Tamura, Koichiro; Peterson, Daniel; Peterson, Nicholas; Stecher, Glen; Nei, Masatoshi; Kumar, Sudhir

2011-01-01

Comparative analysis of molecular sequence data is essential for reconstructing the evolutionary histories of species and inferring the nature and extent of selective forces shaping the evolution of genes and species. Here, we announce the release of Molecular Evolutionary Genetics Analysis version 5 (MEGA5), which is a user-friendly software for mining online databases, building sequence alignments and phylogenetic trees, and using methods of evolutionary bioinformatics in basic biology, biomedicine, and evolution. The newest addition in MEGA5 is a collection of maximum likelihood (ML) analyses for inferring evolutionary trees, selecting best-fit substitution models (nucleotide or amino acid), inferring ancestral states and sequences (along with probabilities), and estimating evolutionary rates site-by-site. In computer simulation analyses, ML tree inference algorithms in MEGA5 compared favorably with other software packages in terms of computational efficiency and the accuracy of the estimates of phylogenetic trees, substitution parameters, and rate variation among sites. The MEGA user interface has now been enhanced to be activity driven to make it easier for the use of both beginners and experienced scientists. This version of MEGA is intended for the Windows platform, and it has been configured for effective use on Mac OS X and Linux desktops. It is available free of charge from http://www.megasoftware.net. PMID:21546353
EGenBio: A Data Management System for Evolutionary Genomics and Biodiversity

PubMed Central

Nahum, Laila A; Reynolds, Matthew T; Wang, Zhengyuan O; Faith, Jeremiah J; Jonna, Rahul; Jiang, Zhi J; Meyer, Thomas J; Pollock, David D

2006-01-01

Background Evolutionary genomics requires management and filtering of large numbers of diverse genomic sequences for accurate analysis and inference on evolutionary processes of genomic and functional change. We developed Evolutionary Genomics and Biodiversity (EGenBio) to begin to address this. Description EGenBio is a system for manipulation and filtering of large numbers of sequences, integrating curated sequence alignments and phylogenetic trees, managing evolutionary analyses, and visualizing their output. EGenBio is organized into three conceptual divisions, Evolution, Genomics, and Biodiversity. The Genomics division includes tools for selecting pre-aligned sequences from different genes and species, and for modifying and filtering these alignments for further analysis. Species searches are handled through queries that can be modified based on a tree-based navigation system and saved. The Biodiversity division contains tools for analyzing individual sequences or sequence alignments, whereas the Evolution division contains tools involving phylogenetic trees. Alignments are annotated with analytical results and modification history using our PRAED format. A miscellaneous Tools section and Help framework are also available. EGenBio was developed around our comparative genomic research and a prototype database of mtDNA genomes. It utilizes MySQL-relational databases and dynamic page generation, and calls numerous custom programs. Conclusion EGenBio was designed to serve as a platform for tools and resources to ease combined analysis in evolution, genomics, and biodiversity. PMID:17118150
A Novel Framework for the Comparative Analysis of Biological Networks

PubMed Central

Pache, Roland A.; Aloy, Patrick

2012-01-01

Genome sequencing projects provide nearly complete lists of the individual components present in an organism, but reveal little about how they work together. Follow-up initiatives have deciphered thousands of dynamic and context-dependent interrelationships between gene products that need to be analyzed with novel bioinformatics approaches able to capture their complex emerging properties. Here, we present a novel framework for the alignment and comparative analysis of biological networks of arbitrary topology. Our strategy includes the prediction of likely conserved interactions, based on evolutionary distances, to counter the high number of missing interactions in the current interactome networks, and a fast assessment of the statistical significance of individual alignment solutions, which vastly increases its performance with respect to existing tools. Finally, we illustrate the biological significance of the results through the identification of novel complex components and potential cases of cross-talk between pathways and alternative signaling routes. PMID:22363585
Complex Admixture Preceded and Followed the Extinction of Wisent in the Wild

PubMed Central

Hartmann, Stefanie; Paijmans, Johanna L. A.; Taron, Ulrike; Xenikoudakis, Georgios; Cahill, James A.; Heintzman, Peter D.; Shapiro, Beth; Baryshnikov, Gennady; Bunevich, Aleksei N.; Crees, Jennifer J.; Dobosz, Roland; Manaserian, Ninna; Okarma, Henryk; Tokarska, Małgorzata; Turvey, Samuel T.; Wójcik, Jan M.; Żyła, Waldemar; Szymura, Jacek M.; Hofreiter, Michael

2017-01-01

Retracing complex population processes that precede extreme bottlenecks may be impossible using data from living individuals. The wisent (Bison bonasus), Europe’s largest terrestrial mammal, exemplifies such a population history, having gone extinct in the wild but subsequently restored by captive breeding efforts. Using low coverage genomic data from modern and historical individuals, we investigate population processes occurring before and after this extinction. Analysis of aligned genomes supports the division of wisent into two previously recognized subspecies, but almost half of the genomic alignment contradicts this population history as a result of incomplete lineage sorting and admixture. Admixture between subspecies populations occurred prior to extinction and subsequently during the captive breeding program. Admixture with the Bos cattle lineage is also widespread but results from ancient events rather than recent hybridization with domestics. Our study demonstrates the huge potential of historical genomes for both studying evolutionary histories and for guiding conservation strategies. PMID:28007976
Fine-tuning structural RNA alignments in the twilight zone.

PubMed

Bremges, Andreas; Schirmer, Stefanie; Giegerich, Robert

2010-04-30

A widely used method to find conserved secondary structure in RNA is to first construct a multiple sequence alignment, and then fold the alignment, optimizing a score based on thermodynamics and covariance. This method works best around 75% sequence similarity. However, in a "twilight zone" below 55% similarity, the sequence alignment tends to obscure the covariance signal used in the second phase. Therefore, while the overall shape of the consensus structure may still be found, the degree of conservation cannot be estimated reliably. Based on a combination of available methods, we present a method named planACstar for improving structure conservation in structural alignments in the twilight zone. After constructing a consensus structure by alignment folding, planACstar abandons the original sequence alignment, refolds the sequences individually but consistently with the consensus, aligns the structures, irrespective of sequence, by a pure structure alignment method, and derives an improved sequence alignment from the alignment of structures, which is then re-submitted to alignment folding, and so on. This cycle may be iterated as long as structural conservation improves, but normally one step suffices. Employing the tools ClustalW, RNAalifold, and RNAforester, we find that for sequences with 30-55% sequence identity, structural conservation can be improved by 10% on average, with a large variation, measured in terms of RNAalifold's own criterion, the structure conservation index.
Historian: accurate reconstruction of ancestral sequences and evolutionary rates.

PubMed

Holmes, Ian H

2017-04-15

Reconstruction of ancestral sequence histories, and estimation of parameters like indel rates, are improved by using explicit evolutionary models and summing over uncertain alignments. The previous best tool for this purpose (according to simulation benchmarks) was ProtPal, but this tool was too slow for practical use. Historian combines an efficient reimplementation of the ProtPal algorithm with performance-improving heuristics from other alignment tools. Simulation results on fidelity of rate estimation via ancestral reconstruction, along with evaluations on the structurally informed alignment dataset BAliBase 3.0, recommend Historian over other alignment tools for evolutionary applications. Historian is available at https://github.com/evoldoers/historian under the Creative Commons Attribution 3.0 US license. Contact: ihholmes+historian@gmail.com. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
ProfileGrids: a sequence alignment visualization paradigm that avoids the limitations of Sequence Logos

PubMed Central

2014-01-01

Background The 2013 BioVis Contest provided an opportunity to evaluate different paradigms for visualizing protein multiple sequence alignments. Such data sets are becoming extremely large and thus taxing current visualization paradigms. Sequence Logos represent consensus sequences but have limitations for protein alignments. As an alternative, ProfileGrids are a new protein sequence alignment visualization paradigm that represents an alignment as a color-coded matrix of the residue frequency occurring at every homologous position in the aligned protein family. Results The JProfileGrid software program was used to analyze the BioVis contest data sets to generate figures for comparison with the Sequence Logo reference images. Conclusions The ProfileGrid representation allows for the clear and effective analysis of protein multiple sequence alignments. This includes both a general overview of the conservation and diversity sequence patterns as well as the interactive ability to query the details of the protein residue distributions in the alignment. The JProfileGrid software is free and available from http://www.ProfileGrid.org. PMID:25237393
A Stochastic Evolutionary Model for Protein Structure Alignment and Phylogeny

PubMed Central

Challis, Christopher J.; Schmidler, Scott C.

2012-01-01

We present a stochastic process model for the joint evolution of protein primary and tertiary structure, suitable for use in alignment and estimation of phylogeny. Indels arise from a classic Links model, and mutations follow a standard substitution matrix, whereas backbone atoms diffuse in three-dimensional space according to an Ornstein–Uhlenbeck process. The model allows for simultaneous estimation of evolutionary distances, indel rates, structural drift rates, and alignments, while fully accounting for uncertainty. The inclusion of structural information enables phylogenetic inference on time scales not previously attainable with sequence evolution models. The model also provides a tool for testing evolutionary hypotheses and improving our understanding of protein structural evolution. PMID:22723302
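To make the diffusion component concrete, here is a minimal one-dimensional simulation of an Ornstein–Uhlenbeck process using its exact discretisation. The parameter names (`mu`, `theta`, `sigma`) and values are illustrative assumptions and are not taken from the paper, which applies the process to backbone atom coordinates in three dimensions.

```python
import math
import random

def ou_step(x, mu, theta, sigma, dt, rng):
    """Advance dX = -theta (X - mu) dt + sigma dW by one exact step of size dt."""
    decay = math.exp(-theta * dt)
    noise_sd = sigma * math.sqrt((1.0 - decay * decay) / (2.0 * theta))
    return mu + (x - mu) * decay + noise_sd * rng.gauss(0.0, 1.0)

def simulate_ou(x0, mu, theta, sigma, dt, steps, seed=0):
    """Return a path of length steps + 1 starting at x0."""
    rng = random.Random(seed)
    path = [x0]
    for _ in range(steps):
        path.append(ou_step(path[-1], mu, theta, sigma, dt, rng))
    return path

# A coordinate displaced by 5 units relaxes toward mu = 0 while jittering.
path = simulate_ou(x0=5.0, mu=0.0, theta=0.5, sigma=1.0, dt=0.1, steps=200)
print(path[0], path[-1])
```

Unlike plain Brownian motion, the mean-reverting term keeps the variance bounded (it tends to sigma^2 / (2 theta)), so simulated coordinates do not wander off indefinitely.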
Fine-tuning structural RNA alignments in the twilight zone

PubMed Central

2010-01-01

Background A widely used method to find conserved secondary structure in RNA is to first construct a multiple sequence alignment, and then fold the alignment, optimizing a score based on thermodynamics and covariance. This method works best around 75% sequence similarity. However, in a "twilight zone" below 55% similarity, the sequence alignment tends to obscure the covariance signal used in the second phase. Therefore, while the overall shape of the consensus structure may still be found, the degree of conservation cannot be estimated reliably. Results Based on a combination of available methods, we present a method named planACstar for improving structure conservation in structural alignments in the twilight zone. After constructing a consensus structure by alignment folding, planACstar abandons the original sequence alignment, refolds the sequences individually but consistently with the consensus, aligns the structures, irrespective of sequence, by a pure structure alignment method, and derives an improved sequence alignment from the alignment of structures, which is then re-submitted to alignment folding, and so on. This cycle may be iterated as long as structural conservation improves, but normally one step suffices. Conclusions Employing the tools ClustalW, RNAalifold, and RNAforester, we find that for sequences with 30-55% sequence identity, structural conservation can be improved by 10% on average, with a large variation, measured in terms of RNAalifold's own criterion, the structure conservation index. PMID:20433706
MOSAIC: an online database dedicated to the comparative genomics of bacterial strains at the intra-species level.

PubMed

Chiapello, Hélène; Gendrault, Annie; Caron, Christophe; Blum, Jérome; Petit, Marie-Agnès; El Karoui, Meriem

2008-11-27

The recent availability of complete sequences for numerous closely related bacterial genomes opens up new challenges in comparative genomics. Several methods have been developed to align complete genomes at the nucleotide level but their use and the biological interpretation of results are not straightforward. It is therefore necessary to develop new resources to access, analyze, and visualize genome comparisons. Here we present recent developments on MOSAIC, a generalist comparative bacterial genome database. This database provides the bacteriologist community with easy access to comparisons of complete bacterial genomes at the intra-species level. The strategy we developed for comparison allows us to define two types of regions in bacterial genomes: backbone segments (i.e., regions conserved in all compared strains) and variable segments (i.e., regions that are either specific to or variable in one of the aligned genomes). Definition of these segments at the nucleotide level allows precise comparative and evolutionary analyses of both coding and non-coding regions of bacterial genomes. Such work is easily performed using the MOSAIC Web interface, which allows browsing and graphical visualization of genome comparisons. The MOSAIC database now includes 493 pairwise comparisons and 35 multiple maximal comparisons representing 78 bacterial species. Genome conserved regions (backbones) and variable segments are presented in various formats for further analysis. A graphical interface allows visualization of aligned genomes and functional annotations. The MOSAIC database is available online at http://genome.jouy.inra.fr/mosaic.
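The backbone/variable partition can be sketched on a toy multiple alignment. The rule used here, that a column belongs to the backbone when no strain has a gap at that position, is a simplifying assumption for illustration and not MOSAIC's actual segmentation procedure; `segments` is a hypothetical name.

```python
from itertools import groupby

def segments(alignment):
    """Split alignment columns into runs of 'backbone' (present in every
    strain) and 'variable' (absent from at least one strain).
    Returns (kind, start, end) tuples with half-open column ranges."""
    flags = ["variable" if "-" in column else "backbone"
             for column in zip(*alignment)]
    result, position = [], 0
    for kind, run in groupby(flags):
        length = len(list(run))
        result.append((kind, position, position + length))
        position += length
    return result

alignment = ["ACGT--ACGT",
             "ACGTTTACGT",
             "ACG---ACGT"]
print(segments(alignment))
# [('backbone', 0, 3), ('variable', 3, 6), ('backbone', 6, 10)]
```

Defining segments by column coordinates, rather than by gene, is what lets non-coding regions be compared in the same way as coding ones.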
Evol and ProDy for bridging protein sequence evolution and structural dynamics

PubMed Central

Mao, Wenzhi; Liu, Ying; Chennubhotla, Chakra; Lezon, Timothy R.; Bahar, Ivet

2014-01-01

Correlations between sequence evolution and structural dynamics are of utmost importance in understanding the molecular mechanisms of function and their evolution. We have integrated Evol, a new package for fast and efficient comparative analysis of evolutionary patterns and conformational dynamics, into ProDy, a computational toolbox designed for inferring protein dynamics from experimental and theoretical data. Using information-theoretic approaches, Evol coanalyzes conservation and coevolution profiles extracted from multiple sequence alignments of protein families with their inferred dynamics. Availability and implementation: ProDy and Evol are open-source and freely available under MIT License from http://prody.csb.pitt.edu/. Contact: bahar@pitt.edu PMID:24849577
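Two common information-theoretic measures of this kind are per-column Shannon entropy (conservation) and mutual information between column pairs (coevolution). The sketch below computes both from scratch on a toy alignment; it deliberately does not call ProDy or Evol, and the function names are hypothetical.

```python
import math
from collections import Counter

def entropy(symbols):
    """Shannon entropy in bits of the empirical distribution of symbols."""
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in Counter(symbols).values())

def mutual_information(col_i, col_j):
    """MI(i, j) = H(i) + H(j) - H(i, j) for two alignment columns."""
    return entropy(col_i) + entropy(col_j) - entropy(list(zip(col_i, col_j)))

alignment = ["AGK", "AGR", "CTK", "CTR"]
cols = list(zip(*alignment))
print([entropy(c) for c in cols])          # every column has 1 bit of entropy
print(mutual_information(cols[0], cols[1]))  # 1.0: columns 1 and 2 covary
print(mutual_information(cols[0], cols[2]))  # 0.0: columns 1 and 3 are independent
```

Low entropy marks a conserved position, while high mutual information between two variable positions suggests they change together.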
The drug target genes show higher evolutionary conservation than non-target genes.

PubMed

Lv, Wenhua; Xu, Yongdeng; Guo, Yiying; Yu, Ziqi; Feng, Guanglong; Liu, Panpan; Luan, Meiwei; Zhu, Hongjie; Liu, Guiyou; Zhang, Mingming; Lv, Hongchao; Duan, Lian; Shang, Zhenwei; Li, Jin; Jiang, Yongshuai; Zhang, Ruijie

2016-01-26

Although evidence indicates that drug target genes share some common evolutionary features, there have been few studies analyzing evolutionary features of drug targets from an overall level. Therefore, we conducted an analysis which aimed to investigate the evolutionary characteristics of drug target genes. We compared the evolutionary conservation between human drug target genes and non-target genes by combining both the evolutionary features and network topological properties in human protein-protein interaction network. The evolution rate, conservation score and the percentage of orthologous genes of 21 species were included in our study. Meanwhile, four topological features including the average shortest path length, betweenness centrality, clustering coefficient and degree were considered for comparison analysis. Then we got four results as following: compared with non-drug target genes, 1) drug target genes had lower evolutionary rates; 2) drug target genes had higher conservation scores; 3) drug target genes had higher percentages of orthologous genes and 4) drug target genes had a tighter network structure including higher degrees, betweenness centrality, clustering coefficients and lower average shortest path lengths. These results demonstrate that drug target genes are more evolutionarily conserved than non-drug target genes. We hope that our study will provide valuable information for other researchers who are interested in evolutionary conservation of drug targets.
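Two of the topological features mentioned, degree and clustering coefficient, can be computed directly from an adjacency structure. The toy network and the function names below are illustrative assumptions; the study itself used the human protein-protein interaction network.

```python
from itertools import combinations

def build_graph(edges):
    """Build an undirected graph as a dict mapping node -> set of neighbours."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph

def degree(graph, node):
    return len(graph[node])

def clustering_coefficient(graph, node):
    """Fraction of neighbour pairs of `node` that are themselves connected."""
    neighbours = graph[node]
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(neighbours, 2) if v in graph[u])
    return 2.0 * links / (k * (k - 1))

ppi = build_graph([("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")])
for protein in sorted(ppi):
    print(protein, degree(ppi, protein), clustering_coefficient(ppi, protein))
```

In this toy graph, node C has the highest degree, while A and B sit in a fully connected neighbourhood and therefore have a clustering coefficient of 1.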
DNA sequence alignment by microhomology sampling during homologous recombination

PubMed Central

Qi, Zhi; Redding, Sy; Lee, Ja Yil; Gibb, Bryan; Kwon, YoungHo; Niu, Hengyao; Gaines, William A.; Sung, Patrick

2015-01-01

Summary Homologous recombination (HR) mediates the exchange of genetic information between sister or homologous chromatids. During HR, members of the RecA/Rad51 family of recombinases must somehow search through vast quantities of DNA sequence to align and pair ssDNA with a homologous dsDNA template. Here we use single-molecule imaging to visualize Rad51 as it aligns and pairs homologous DNA sequences in real-time. We show that Rad51 uses a length-based recognition mechanism while interrogating dsDNA, enabling robust kinetic selection of 8-nucleotide (nt) tracts of microhomology, which kinetically confines the search to sites with a high probability of being a homologous target. Successful pairing with a 9th nucleotide coincides with an additional reduction in binding free energy and subsequent strand exchange occurs in precise 3-nt steps, reflecting the base triplet organization of the presynaptic complex. These findings provide crucial new insights into the physical and evolutionary underpinnings of DNA recombination. PMID:25684365
Insights from life history theory for an explicit treatment of trade-offs in conservation biology.

PubMed

Charpentier, Anne

2015-06-01

As economic and social contexts become more embedded within biodiversity conservation, it becomes obvious that resources are a limiting factor in conservation. This recognition is leading conservation scientists and practitioners to increasingly frame conservation decisions as trade-offs between conflicting societal objectives. However, this framing is all too often done in an intuitive way, rather than by addressing trade-offs explicitly. In contrast, the concept of trade-off is a keystone in evolutionary biology, where it has been investigated extensively. I argue that insights from evolutionary theory can provide methodological and theoretical support to evaluating and quantifying trade-offs in biodiversity conservation. I reviewed the diverse ways in which trade-offs have emerged within the context of conservation and how advances from evolutionary theory can help avoid the main pitfalls of an implicit approach. When studying both evolutionary trade-offs (e.g., reproduction vs. survival) and conservation trade-offs (e.g., biodiversity conservation vs. agriculture), it is crucial to correctly identify the limiting resource, hold constant the amount of this resource when comparing different scenarios, and choose appropriate metrics to quantify the extent to which the objectives have been achieved. Insights from studies in evolutionary theory also reveal how an inadequate selection of conservation solutions may result from considering suboptimal rather than optimal solutions when examining whether a trade-off exists between 2 objectives. Furthermore, the shape of a trade-off curve (i.e., whether the relationship between 2 objectives follows a concave, convex, or linear form) is known to affect crucially the definition of optimal solutions in evolutionary biology and very likely affects decisions in biodiversity conservation planning too. This interface between evolutionary biology and biodiversity conservation can therefore provide methodological guidance to support decision makers in the difficult task of choosing among conservation solutions. © 2015 Society for Conservation Biology.
African wildlife conservation and the evolution of hunting institutions

NASA Astrophysics Data System (ADS)

't Sas-Rolfes, Michael

2017-11-01

Hunting regulation presents a significant challenge for contemporary global conservation governance. Motivated by various incentives, hunters may act legally or illegally, for or against the interests of conservation. Hunter incentives are shaped by the interactions between unevenly evolving formal and informal institutions, embedded in socio-ecological systems. To work effectively for conservation, regulatory interventions must take these evolving institutional interactions into account. Drawing on analytical tools from evolutionary institutional economics, this article examines the trajectory of African hunting regulation and its consequences. Concepts of institutional dynamics, fit, scale, and interplay are applied to case studies of rhinoceros and lion hunting to highlight issues of significance to conservation outcomes. These include important links between different forms of hunting and dynamic interplay with institutions of trade. The case studies reveal that inappropriate formal regulatory approaches may be undermined by adaptive informal market responses. Poorly regulated hunting may lead to calls for stricter regulations or bans, but such legal restrictions may in turn perversely lead to more intensified and organised illegal hunting activity, further undermining conservation objectives. I conclude by offering insights and recommendations to guide more effective future regulatory interventions and priorities for further research. Specifically, I advocate approaches that move beyond simplistic regulatory interventions toward more complex, but supportive, institutional arrangements that align formal and informal institutions through inclusive stakeholder engagement.
Phenotype–genotype correlation in Hirschsprung disease is illuminated by comparative analysis of the RET protein sequence

PubMed Central

Kashuk, Carl S.; Stone, Eric A.; Grice, Elizabeth A.; Portnoy, Matthew E.; Green, Eric D.; Sidow, Arend; Chakravarti, Aravinda; McCallion, Andrew S.

2005-01-01

The ability to discriminate between deleterious and neutral amino acid substitutions in the genes of patients remains a significant challenge in human genetics. The increasing availability of genomic sequence data from multiple vertebrate species allows sequence conservation and physicochemical properties of residues to be used for functional prediction. In this study, the RET receptor tyrosine kinase serves as a model disease gene in which a broad spectrum (≥116) of disease-associated mutations has been identified among patients with Hirschsprung disease and multiple endocrine neoplasia type 2. We report the alignment of the human RET protein sequence with the orthologous sequences of 12 non-human vertebrates (eight mammalian, one avian, and three teleost species), their comparative analysis, the evolutionary topology of the RET protein, and predicted tolerance for all published missense mutations. We show that, although evolutionary conservation alone provides significant information to predict the effect of a RET mutation, a model that combines comparative sequence data with analysis of physicochemical properties in a quantitative framework provides far greater accuracy. Although the ability to discern the impact of a mutation is imperfect, our analyses permit substantial discrimination between predicted functional classes of RET mutations and disease severity even for a multigenic disease such as Hirschsprung disease. PMID:15956201
Conservation: evolutionary values for all 10,000 birds.

PubMed

Lovette, Irby J

2014-05-19

Many biologists and conservation practitioners believe that preserving evolutionary diversity should be a priority. An innovative new study measures the evolutionary distinctness of all the world's birds and identifies the species and locations that capture the highest fraction of avian evolutionary history. Copyright © 2014 Elsevier Ltd. All rights reserved.
Embryonic Cleavage Cycles: How Is a Mouse Like a Fly?

PubMed Central

O’Farrell, Patrick H.; Stumpff, Jason; Su, Tin Tin

2009-01-01

The evolutionary advent of uterine support of embryonic growth in mammals is relatively recent. Nonetheless, striking differences in the earliest steps of embryogenesis make it difficult to draw parallels even with other chordates. We suggest that use of fertilization as a reference point misaligns the earliest stages and masks parallels that are evident when development is aligned at conserved stages surrounding gastrulation. In externally deposited eggs from representatives of all the major phyla, gastrulation is preceded by specialized extremely rapid cleavage cell cycles. Mammals also exhibit remarkably fast cell cycles in close association with gastrulation, but instead of beginning development with these rapid cycles, the mammalian egg first devotes itself to the production of extraembryonic structures. Previous attempts to identify common features of cleavage cycles focused on post-fertilization divisions of the mammalian egg. We propose that comparison to the rapid peri-gastrulation cycles is more appropriate and suggest that these cycles are related by evolutionary descent to the early cleavage stages of embryos such as those of frog and fly. The deferral of events in mammalian embryogenesis might be due to an evolutionary shift in the timing of fertilization. PMID:14711435
Global priorities for conserving the evolutionary history of sharks, rays and chimaeras.

PubMed

Stein, R William; Mull, Christopher G; Kuhn, Tyler S; Aschliman, Neil C; Davidson, Lindsay N K; Joy, Jeffrey B; Smith, Gordon J; Dulvy, Nicholas K; Mooers, Arne O

2018-02-01

In an era of accelerated biodiversity loss and limited conservation resources, systematic prioritization of species and places is essential. In terrestrial vertebrates, evolutionary distinctness has been used to identify species and locations that embody the greatest share of evolutionary history. We estimate evolutionary distinctness for a large marine vertebrate radiation on a dated taxon-complete tree for all 1,192 chondrichthyan fishes (sharks, rays and chimaeras) by augmenting a new 610-species molecular phylogeny using taxonomic constraints. Chondrichthyans are by far the most evolutionarily distinct of all major radiations of jawed vertebrates: the average species embodies 26 million years of unique evolutionary history. With this metric, we identify 21 countries with the highest richness, endemism and evolutionary distinctness of threatened species as targets for conservation prioritization. On average, threatened chondrichthyans are more evolutionarily distinct, further motivating improved conservation, fisheries management and trade regulation to avoid significant pruning of the chondrichthyan tree of life.
Selection and paucity of phylogenetic signal challenge the utility of alpha-tubulin in reconstruction of evolutionary history of free-living litostomateans (Protista, Ciliophora).

PubMed

Rajter, Ľubomír; Vďačný, Peter

2018-05-12

The class Litostomatea represents a highly diverse but monophyletic group, uniting both free-living and endosymbiotic ciliates. Ribosomal RNA genes and ITS-region sequences helped to recognize and define the main litostomatean lineages, but did not provide enough phylogenetic signal to unambiguously resolve their interrelationships. In this study, we attempted to improve the resolution among main free-living predatory lineages by adding the gene coding for alpha-tubulin. However, our phylogenetic analyses challenged the performance of alpha-tubulin in reconstruction of the evolutionary history of free-living litostomateans. We identified several mutually interconnected problems associated with the ciliate alpha-tubulin gene: the paucity of phylogenetic signal, molecular homoplasies and non-neutral evolution. Positive selection may generate molecular homoplasies (parallel evolution), while negative selection may cause a small number of changes and hence little phylogenetic informativeness. Both problems were encountered in nucleotide and amino acid alpha-tubulin alignments, indicating the action of various selective pressures. Taking into account the involvement of alpha-tubulin in many essential biological processes, this protein could be so strongly affected by purifying selection that it might even have become an inappropriate molecular marker for reconstruction of phylogenetic relationships. Therefore, great caution should be exercised when tubulin genes are included in phylogenetic and/or phylogenomic analyses. Copyright © 2018 Elsevier Inc. All rights reserved.

Free Energy Landscape - Settlements of Key Residues.

NASA Astrophysics Data System (ADS)

Aroutiounian, Svetlana

2007-03-01

The free energy landscape (FEL) perspective in studies of protein folding transitions reflects the notion that, since there are ~10^N conformations to scan in search of the lowest free energy state, a random search is beyond the biological timescale. Protein folding must follow certain FEL pathways, and the folding kinetics of evolutionarily selected proteins dominates kinetic traps. A good model for the functional robustness of natural proteins: a coarse-grained model protein is not very accurate but brings simulations closer to the biological realm; a Go-like potential secures the funnel shape of the FEL; biochemical contacts signify the funnel bottleneck. A Boltzmann-weighted ensemble of protein conformations and the histogram method are used to obtain the approximate probability distribution from MC sampling of protein conformational space. The FEL is F(rmsd) = -(1/β) ln[Hist(rmsd)], where β = 1/(k_B T) and rmsd is the root-mean-square deviation from the native conformation. Sperm whale myoglobin has rich dynamic behavior, is small and large on the computational scale, has a symmetric architecture, and has an unusual sextet of residue pairs. Main idea: there is a mathematical relation between the protein FEL and a set of key residues providing stability to the folding transition. Is the set evolutionarily conserved also for functional reasons? Hypothesis: the primary sequence determines the positions of the key residues conserved as stabilizers, and the FEL is the battlefield for folding stability. Preliminary results: the primary sequence, not the architecture, is indeed the rule settler.
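The histogram method mentioned in this abstract can be illustrated with a minimal sketch. It assumes the usual inverse temperature β = 1/(k_B T), so that F(rmsd) = -k_B T ln[Hist(rmsd)] up to an additive constant. The function name `free_energy_profile`, the bin count, and the synthetic two-state samples are all hypothetical and are not taken from the record; real input would be rmsd values from Monte Carlo sampling.

```python
import math
import random


def free_energy_profile(rmsd_samples, kT=1.0, n_bins=30):
    """Return [(bin_center, F)] with F = -kT * ln(p), shifted so min(F) = 0."""
    lo, hi = min(rmsd_samples), max(rmsd_samples)
    if hi == lo:
        raise ValueError("all samples are identical; no landscape to estimate")
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in rmsd_samples:
        # Clamp so that the maximum value falls into the last bin.
        counts[min(int((x - lo) / width), n_bins - 1)] += 1
    total = len(rmsd_samples)
    profile = [
        (lo + (i + 0.5) * width, -kT * math.log(c / total))
        for i, c in enumerate(counts)
        if c > 0  # empty bins have undefined (infinite) free energy
    ]
    f_min = min(f for _, f in profile)
    return [(center, f - f_min) for center, f in profile]


# Synthetic stand-in for sampled rmsd values (assumption, not real data):
# a heavily populated near-native state and a sparser unfolded state.
rng = random.Random(0)
samples = [rng.gauss(2.0, 0.3) for _ in range(8000)]
samples += [rng.gauss(6.0, 0.8) for _ in range(2000)]

profile = free_energy_profile(samples)
best_center, best_f = min(profile, key=lambda pair: pair[1])
print(f"lowest free energy near rmsd = {best_center:.2f}")
```

Shifting the profile so that its minimum is zero is a convention only; free energies from a histogram are defined up to an additive constant.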
Metabolite concentrations, fluxes and free energies imply efficient enzyme usage

DOE PAGES

Park, Junyoung O.; Rubin, Sara A.; Xu, Yi -Fan; ...

2016-05-02

In metabolism, available free energy is limited and must be divided across pathway steps to maintain a negative ΔG throughout. For each reaction, ΔG is log proportional both to a concentration ratio (reaction quotient to equilibrium constant) and to a flux ratio (backward to forward flux). In this paper, we use isotope labeling to measure absolute metabolite concentrations and fluxes in Escherichia coli, yeast and a mammalian cell line. We then integrate this information to obtain a unified set of concentrations and ΔG for each organism. In glycolysis, we find that free energy is partitioned so as to mitigate unproductive backward fluxes associated with ΔG near zero. Across metabolism, we observe that absolute metabolite concentrations and ΔG are substantially conserved and that most substrate (but not inhibitor) concentrations exceed the associated enzyme binding site dissociation constant (K_m or K_i). Finally, the observed conservation of metabolite concentrations is consistent with an evolutionary drive to utilize enzymes efficiently given thermodynamic and osmotic constraints.
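The two log-proportional relations stated in this abstract correspond to the standard thermodynamic identities ΔG = RT ln(Q/K_eq) = RT ln(J⁻/J⁺). The sketch below evaluates both forms; the function names and the example numbers are illustrative assumptions and do not come from the paper.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol*K)


def delta_g_from_concentrations(q, k_eq, temperature=298.15):
    """Delta G (kJ/mol) from the reaction quotient Q and equilibrium constant K_eq."""
    return R * temperature * math.log(q / k_eq)


def delta_g_from_fluxes(backward, forward, temperature=298.15):
    """Delta G (kJ/mol) from the backward and forward fluxes of one reaction."""
    return R * temperature * math.log(backward / forward)


def backward_to_forward_ratio(delta_g, temperature=298.15):
    """Flux ratio J-/J+ implied by a given Delta G (kJ/mol)."""
    return math.exp(delta_g / (R * temperature))


# Hypothetical reaction whose quotient sits tenfold below equilibrium.
dg = delta_g_from_concentrations(q=0.1, k_eq=1.0)
ratio = backward_to_forward_ratio(dg)
print(f"dG = {dg:.2f} kJ/mol, backward/forward flux = {ratio:.2f}")
```

With these inputs ΔG comes out at about -5.7 kJ/mol and one tenth of the forward flux runs backward, which illustrates why steps with ΔG near zero carry large unproductive backward fluxes.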
Reconsideration of Plant Morphological Traits: From a Structure-Based Perspective to a Function-Based Evolutionary Perspective

PubMed Central

Bai, Shu-Nong

2017-01-01

This opinion article proposes a novel alignment of traits in plant morphogenesis from a function-based evolutionary perspective. As a member species of the ecosystem on Earth, we human beings view our neighbor organisms through our own sensing system. We tend to distinguish forms and structures (i.e., “morphological traits”) mainly through vision. Traditionally, a plant was considered to consist of three parts, i.e., the shoot, the leaves, and the root. Based on such a “structure-based perspective,” evolutionary analyses or comparisons across species were made on particular parts or their derived structures. So far no conceptual framework has been established to incorporate the morphological traits of all three land plant phyta, i.e., bryophyta, pteridophyta and spermatophyta, for evolutionary developmental analysis. Using the tenets of the recently proposed concept of the sexual reproduction cycle, the major morphological traits of land plants can be aligned into five categories from a function-based evolutionary perspective. From this perspective, and the resulting alignment, a new conceptual framework emerges, called “Plant Morphogenesis 123.” This framework views a plant as a colony of integrated plant developmental units that are each produced via one life cycle. This view provides an alternative perspective for evolutionary developmental investigation in plants. PMID:28360919
DCODE.ORG Anthology of Comparative Genomic Tools

DOE Office of Scientific and Technical Information (OSTI.GOV)

Loots, G G; Ovcharenko, I

2005-01-11

Comparative genomics provides the means to demarcate functional regions in anonymous DNA sequences. The successful application of this method to identifying novel genes is currently shifting to deciphering the noncoding encryption of gene regulation across genomes. To facilitate the use of comparative genomics in practical applications in genetics and genomics we have developed several analytical and visualization tools for the analysis of arbitrary sequences and whole genomes. These tools include two alignment tools: zPicture and Mulan; a phylogenetic shadowing tool: eShadow for identifying lineage- and species-specific functional elements; two evolutionary conserved transcription factor analysis tools: rVista and multiTF; a tool for extracting cis-regulatory modules governing the expression of co-regulated genes, CREME; and a dynamic portal to multiple vertebrate and invertebrate genome alignments, the ECR Browser. Here we briefly describe each one of these tools and provide specific examples on their practical applications. All the tools are publicly available at the http://www.dcode.org/ web site.
Dcode.org anthology of comparative genomic tools.

PubMed

Loots, Gabriela G; Ovcharenko, Ivan

2005-07-01

Comparative genomics provides the means to demarcate functional regions in anonymous DNA sequences. The successful application of this method to identifying novel genes is currently shifting to deciphering the non-coding encryption of gene regulation across genomes. To facilitate the practical application of comparative sequence analysis to genetics and genomics, we have developed several analytical and visualization tools for the analysis of arbitrary sequences and whole genomes. These tools include two alignment tools, zPicture and Mulan; a phylogenetic shadowing tool, eShadow for identifying lineage- and species-specific functional elements; two evolutionary conserved transcription factor analysis tools, rVista and multiTF; a tool for extracting cis-regulatory modules governing the expression of co-regulated genes, Creme 2.0; and a dynamic portal to multiple vertebrate and invertebrate genome alignments, the ECR Browser. Here, we briefly describe each one of these tools and provide specific examples on their practical applications. All the tools are publicly available at the http://www.dcode.org/ website.
Multiple network alignment via multiMAGNA++.

PubMed

Vijayan, Vipin; Milenkovic, Tijana

2017-08-21

Network alignment (NA) aims to find a node mapping that identifies topologically or functionally similar network regions between molecular networks of different species. Analogous to genomic sequence alignment, NA can be used to transfer biological knowledge from well- to poorly-studied species between aligned network regions. Pairwise NA (PNA) finds similar regions between two networks while multiple NA (MNA) can align more than two networks. We focus on MNA. Existing MNA methods aim to maximize total similarity over all aligned nodes (node conservation). Then, they evaluate alignment quality by measuring the amount of conserved edges, but only after the alignment is constructed. Directly optimizing edge conservation during alignment construction in addition to node conservation may result in superior alignments. Thus, we present a novel MNA method called multiMAGNA++ that can achieve this. Indeed, multiMAGNA++ outperforms or is on par with existing MNA methods, while often completing faster than existing methods. That is, multiMAGNA++ scales well to larger network data and can be parallelized effectively. During method evaluation, we also introduce new MNA quality measures to allow for more fair MNA method comparison compared to the existing alignment quality measures. MultiMAGNA++ code is available on the method's web page at http://nd.edu/~cone/multiMAGNA++/.
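To make the notion of edge conservation concrete, the following sketch counts how many edges of one network are preserved under a node mapping. It is a deliberately simplified pairwise illustration, not the multiMAGNA++ objective or its multiple-network quality measures; the function name `conserved_edges` and the toy networks are hypothetical.

```python
def conserved_edges(edges_a, edges_b, mapping):
    """Count edges of network A whose mapped endpoints are also linked in B.

    edges_a, edges_b: iterables of undirected edges given as (node, node) pairs.
    mapping: dict from nodes of A to nodes of B (the alignment).
    """
    b_edges = {frozenset(edge) for edge in edges_b}
    conserved = 0
    for u, v in edges_a:
        if u in mapping and v in mapping:
            if frozenset((mapping[u], mapping[v])) in b_edges:
                conserved += 1
    return conserved


# Toy networks (assumption): a triangle in A aligned to a path in B.
network_a = [("a1", "a2"), ("a2", "a3"), ("a1", "a3")]
network_b = [("b1", "b2"), ("b2", "b3")]
alignment = {"a1": "b1", "a2": "b2", "a3": "b3"}

n_conserved = conserved_edges(network_a, network_b, alignment)
fraction = n_conserved / len(network_a)
print(n_conserved, round(fraction, 3))  # 2 of the 3 edges in A are conserved
```

An aligner that optimizes edge conservation directly would evaluate a score of this kind while searching over candidate mappings, rather than only after the alignment is fixed.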
Automated hierarchical classification of protein domain subfamilies based on functionally-divergent residue signatures

PubMed Central

2012-01-01

Background: The NCBI Conserved Domain Database (CDD) consists of a collection of multiple sequence alignments of protein domains that are at various stages of being manually curated into evolutionary hierarchies based on conserved and divergent sequence and structural features. These domain models are annotated to provide insights into the relationships between sequence, structure and function via web-based BLAST searches. Results: Here we automate the generation of conserved domain (CD) hierarchies using a combination of heuristic and Markov chain Monte Carlo (MCMC) sampling procedures and starting from a (typically very large) multiple sequence alignment. This procedure relies on statistical criteria to define each hierarchy based on the conserved and divergent sequence patterns associated with protein functional-specialization. At the same time this facilitates the sequence and structural annotation of residues that are functionally important. These statistical criteria also provide a means to objectively assess the quality of CD hierarchies, a non-trivial task considering that the protein subgroups are often very distantly related, a situation in which standard phylogenetic methods can be unreliable. Our aim here is to automatically generate (typically sub-optimal) hierarchies that, based on statistical criteria and visual comparisons, are comparable to manually curated hierarchies; this serves as the first step toward the ultimate goal of obtaining optimal hierarchical classifications. A plot of runtimes for the most time-intensive (non-parallelizable) part of the algorithm indicates a nearly linear time complexity so that, even for the extremely large Rossmann fold protein class, results were obtained in about a day. Conclusions: This approach automates the rapid creation of protein domain hierarchies and thus will eliminate one of the most time consuming aspects of conserved domain database curation. At the same time, it also facilitates protein domain annotation by identifying those pattern residues that most distinguish each protein domain subgroup from other related subgroups. PMID:22726767
Comprehensive characterization of evolutionary conserved breakpoints in four New World Monkey karyotypes compared to Chlorocebus aethiops and Homo sapiens.

PubMed

Fan, Xiaobo; Supiwong, Weerayuth; Weise, Anja; Mrasek, Kristin; Kosyakova, Nadezda; Tanomtong, Alongkoad; Pinthong, Krit; Trifonov, Vladimir A; Cioffi, Marcelo de Bello; Grothmann, Pierre; Liehr, Thomas; Oliveira, Edivaldo H C de

2015-11-01

Comparative cytogenetic analysis in New World Monkeys (NWMs) using human multicolor banding (MCB) probe sets has not previously been done. Here we report on an MCB-based FISH-banding study complemented with selected locus-specific and heterochromatin-specific probes in four NWMs and one Old World Monkey (OWM) species, i.e. in Alouatta caraya (ACA), Callithrix jacchus (CJA), Cebus apella (CAP), Saimiri sciureus (SSC), and Chlorocebus aethiops (CAE), respectively. In total, 107 individual evolutionary conserved breakpoints (ECBs) among those species were identified and compared with those of other species in previous reports. Especially for chromosomal regions syntenic to human chromosomes 6, 8, 9, 10, 11, 12 and 16, previously cryptic rearrangements could be observed. 50.4% (54/107) of NWM-ECBs were colocalized with those of OWMs, 62.6% (62/99) of NWM-ECBs were related with those of Hylobates lar (HLA) and 66.3% (71/107) of NWM-ECBs corresponded with those known from other mammals. Furthermore, human fragile sites were aligned with the ECBs found in the five studied species and, interestingly, 66.3% of ECBs colocalized with those fragile sites (FS). Overall, this study presents detailed chromosomal maps of one OWM and four NWM species. These data will be helpful for further investigation of chromosome evolution in NWMs and hominoids in general and are a prerequisite for correct interpretation of future sequencing-based genomic studies in those species.
Identification of novel binding partners (annexins) for the cell death signal phosphatidylserine and definition of their recognition motif.

PubMed

Rosenbaum, Sabrina; Kreft, Sandra; Etich, Julia; Frie, Christian; Stermann, Jacek; Grskovic, Ivan; Frey, Benjamin; Mielenz, Dirk; Pöschl, Ernst; Gaipl, Udo; Paulsson, Mats; Brachvogel, Bent

2011-02-18

Identification and clearance of apoptotic cells prevents the release of harmful cell contents, thereby suppressing inflammation and autoimmune reactions. Highly conserved annexins may modulate phagocytic cell removal by acting as bridging molecules to phosphatidylserine, a characteristic phagocytosis signal of dying cells. In this study five members of the structurally and functionally related annexin family were characterized for their capacity to interact with phosphatidylserine and dying cells. The results showed that AnxA3, AnxA4, AnxA13, and the already described interaction partner AnxA5 can bind to phosphatidylserine and apoptotic cells, whereas AnxA8 lacks this ability. Sequence alignment experiments located the essential amino acid residues for the recognition of surface-exposed phosphatidylserine within the calcium binding motifs common to all annexins. These amino acid residues were missing in the evolutionarily young AnxA8, and when they were reintroduced by site-directed mutagenesis AnxA8 gained the capability to interact with phosphatidylserine-containing liposomes and apoptotic cells. By defining the evolutionarily conserved amino acid residues mediating phosphatidylserine binding of annexins we show that the recognition of dying cells represents a common feature of most annexins. Hence, the individual annexin repertoire bound to the cell surface of dying cells may fulfil an opsonin-like function in cell death recognition.
The importance of immune gene variability (MHC) in evolutionary ecology and conservation

PubMed Central

Sommer, Simone

2005-01-01

Genetic studies have typically inferred the effects of human impact by documenting patterns of genetic differentiation and levels of genetic diversity among potentially isolated populations using selectively neutral markers such as mitochondrial control region sequences, microsatellites or single nucleotide polymorphisms (SNPs). However, evolutionarily relevant and adaptive processes within and between populations can only be reflected by coding genes. In vertebrates, growing evidence suggests that genetic diversity is particularly important at the level of the major histocompatibility complex (MHC). MHC variants influence many important biological traits, including immune recognition, susceptibility to infectious and autoimmune diseases, individual odours, mating preferences, kin recognition, cooperation and pregnancy outcome. These diverse functions and characteristics place genes of the MHC among the best candidates for studies of mechanisms and significance of molecular adaptation in vertebrates. MHC variability is believed to be maintained by pathogen-driven selection, mediated either through heterozygote advantage or frequency-dependent selection. Up to now, most of our knowledge has derived from studies in humans or from model organisms under experimental, laboratory conditions. Empirical support for selective mechanisms in free-ranging animal populations in their natural environment is rare. In this review, I first introduce general information about the structure and function of MHC genes, as well as current hypotheses and concepts concerning the role of selection in the maintenance of MHC polymorphism. The evolutionary forces acting on the genetic diversity in coding and non-coding markers are compared. Then, I summarise empirical support for the functional importance of MHC variability in parasite resistance with emphasis on the evidence derived from free-ranging animal populations investigated in their natural habitat. Finally, I discuss the importance of adaptive genetic variability with respect to human impact and conservation, and implications for future studies. PMID:16242022
Genealogical analyses of multiple loci of litostomatean ciliates (Protista, Ciliophora, Litostomatea)

PubMed Central

Vd’ačný, Peter; Bourland, William A.; Orsi, William; Epstein, Slava S.; Foissner, Wilhelm

2012-01-01

The class Litostomatea is a highly diverse ciliate taxon comprising hundreds of free-living and endocommensal species. However, their traditional morphology-based classification conflicts with 18S rRNA gene phylogenies indicating (1) a deep bifurcation of the Litostomatea into Rhynchostomatia and Haptoria + Trichostomatia, and (2) body polarization and simplification of the oral apparatus as main evolutionary trends in the Litostomatea. To test whether 18S rRNA molecules provide a suitable proxy for litostomatean evolutionary history, we used eighteen new ITS1-5.8S rRNA-ITS2 region sequences from various free-living litostomatean orders. These single- and multiple-locus analyses are in agreement with previous 18S rRNA gene phylogenies, supporting that both 18S rRNA gene and ITS region sequences are effective tools for resolving phylogenetic relationships among the litostomateans. Despite insertions, deletions and mutational saturations in the ITS region, the present study shows that ITS1 and ITS2 molecules can be used to infer phylogenetic relationships not only at species level but also at higher taxonomic ranks when their secondary structure information is utilized to aid alignment. PMID:22789763
Genealogical analyses of multiple loci of litostomatean ciliates (Protista, Ciliophora, Litostomatea).

PubMed

Vd'ačný, Peter; Bourland, William A; Orsi, William; Epstein, Slava S; Foissner, Wilhelm

2012-11-01

The class Litostomatea is a highly diverse ciliate taxon comprising hundreds of free-living and endocommensal species. However, their traditional morphology-based classification conflicts with 18S rRNA gene phylogenies indicating (1) a deep bifurcation of the Litostomatea into Rhynchostomatia and Haptoria+Trichostomatia, and (2) body polarization and simplification of the oral apparatus as main evolutionary trends in the Litostomatea. To test whether 18S rRNA molecules provide a suitable proxy for litostomatean evolutionary history, we used eighteen new ITS1-5.8S rRNA-ITS2 region sequences from various free-living litostomatean orders. These single- and multiple-locus analyses are in agreement with previous 18S rRNA gene phylogenies, supporting that both 18S rRNA gene and ITS region sequences are effective tools for resolving phylogenetic relationships among the litostomateans. Despite insertions, deletions and mutational saturations in the ITS region, the present study shows that ITS1 and ITS2 molecules can be used to infer phylogenetic relationships not only at species level but also at higher taxonomic ranks when their secondary structure information is utilized to aid alignment. Copyright © 2012 Elsevier Inc. All rights reserved.
Phylogenetic Relationships within the Opisthokonta Based on Phylogenomic Analyses of Conserved Single-Copy Protein Domains

PubMed Central

Torruella, Guifré; Derelle, Romain; Paps, Jordi; Lang, B. Franz; Roger, Andrew J.; Shalchian-Tabrizi, Kamran; Ruiz-Trillo, Iñaki

2012-01-01

Many of the eukaryotic phylogenomic analyses published to date were based on alignments of hundreds to thousands of genes. In such analyses, the most realistic evolutionary models currently available are often used to minimize the impact of systematic error. However, controversy remains over whether or not idiosyncratic gene family dynamics (i.e., gene duplications and losses) and incorrect orthology assignments are always appropriately taken into account. In this paper, we present an innovative strategy for overcoming orthology assignment problems. Rather than identifying and eliminating genes with paralogy problems, we have constructed a data set comprised exclusively of conserved single-copy protein domains that, unlike most of the commonly used phylogenomic data sets, should be less confounded by orthology misassignments. To evaluate the power of this approach, we performed maximum likelihood and Bayesian analyses to infer the evolutionary relationships within the opisthokonts (which include Metazoa, Fungi, and related unicellular lineages). We used this approach to test 1) whether Filasterea and Ichthyosporea form a clade, 2) the interrelationships of early-branching metazoans, and 3) the relationships among early-branching fungi. We also assessed the impact of some methods that are known to minimize systematic error, including reducing the distance between the outgroup and ingroup taxa or using the CAT evolutionary model. Overall, our analyses support the Filozoa hypothesis in which Ichthyosporea are the first holozoan lineage to emerge followed by Filasterea, Choanoflagellata, and Metazoa. Blastocladiomycota appears as a lineage separate from Chytridiomycota, although this result is not strongly supported. These results represent independent tests of previous phylogenetic hypotheses, highlighting the importance of sophisticated approaches for orthology assignment in phylogenomic analyses. PMID:21771718
OrthoMaM v8: a database of orthologous exons and coding sequences for comparative genomics in mammals.

PubMed

Douzery, Emmanuel J P; Scornavacca, Celine; Romiguier, Jonathan; Belkhir, Khalid; Galtier, Nicolas; Delsuc, Frédéric; Ranwez, Vincent

2014-07-01

Comparative genomic studies rely extensively on alignments of orthologous sequences. Yet, selecting, gathering, and aligning orthologous exons and protein-coding sequences (CDS) that are relevant for a given evolutionary analysis can be a difficult and time-consuming task. In this context, we developed OrthoMaM, a database of ORTHOlogous MAmmalian Markers describing the evolutionary dynamics of orthologous genes in mammalian genomes using a phylogenetic framework. Since its first release in 2007, OrthoMaM has regularly evolved, not only to include newly available genomes but also to incorporate up-to-date software in its analytic pipeline. This eighth release integrates the 40 complete mammalian genomes available in Ensembl v73 and provides alignments, phylogenies, evolutionary descriptor information, and functional annotations for 13,404 single-copy orthologous CDS and 6,953 long exons. The graphical interface allows users to easily explore OrthoMaM to identify markers with specific characteristics (e.g., taxa availability, alignment size, %G+C, evolutionary rate, chromosome location). It hence provides an efficient solution to sample preprocessed markers adapted to user-specific needs. OrthoMaM has proven to be a valuable resource for researchers interested in mammalian phylogenomics and evolutionary genomics, and has served as a source of benchmark empirical data sets in several methodological studies. OrthoMaM is available for browsing, query and complete or filtered downloads at http://www.orthomam.univ-montp2.fr/. © The Author 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
The International Oryza Map Alignment Project: development of a genus-wide comparative genomics platform to help solve the 9 billion-people question.

PubMed

Jacquemin, Julie; Bhatia, Dharminder; Singh, Kuldeep; Wing, Rod A

2013-05-01

The wild relatives of rice contain a virtually untapped reservoir of traits that can be used to help drive the 21st century green revolution aimed at solving world food security issues by 2050. To better understand and exploit the 23 species of the Oryza genus, the rice research community is developing foundational resources composed of: 1) reference genomes and transcriptomes for all 23 species; 2) advanced mapping populations for functional and breeding studies; and 3) in situ conservation sites for ecological, evolutionary and population genomics. To this end, 16 genome sequencing projects are currently underway, and all completed assemblies have been annotated; several advanced mapping populations have been developed, and more will be generated, mapped, and phenotyped, to uncover useful alleles. As wild Oryza populations are threatened by human activity and climate change, we also discuss the urgent need for sustainable in situ conservation of the genus. Copyright © 2013 Elsevier Ltd. All rights reserved.
Evaluating, Comparing, and Interpreting Protein Domain Hierarchies

PubMed Central

2014-01-01

Abstract Arranging protein domain sequences hierarchically into evolutionarily divergent subgroups is important for investigating evolutionary history, for speeding up web-based similarity searches, for identifying sequence determinants of protein function, and for genome annotation. However, whether or not a particular hierarchy is optimal is often unclear, and independently constructed hierarchies for the same domain can often differ significantly. This article describes methods for statistically evaluating specific aspects of a hierarchy, for probing the criteria underlying its construction and for direct comparisons between hierarchies. Information theoretical notions are used to quantify the contributions of specific hierarchical features to the underlying statistical model. Such features include subhierarchies, sequence subgroups, individual sequences, and subgroup-associated signature patterns. Underlying properties are graphically displayed in plots of each specific feature's contributions, in heat maps of pattern residue conservation, in “contrast alignments,” and through cross-mapping of subgroups between hierarchies. Together, these approaches provide a deeper understanding of protein domain functional divergence, reveal uncertainties caused by inconsistent patterns of sequence conservation, and help resolve conflicts between competing hierarchies. PMID:24559108
ATGC database and ATGC-COGs: an updated resource for micro- and macro-evolutionary studies of prokaryotic genomes and protein family annotation

PubMed Central

Kristensen, David M.; Wolf, Yuri I.; Koonin, Eugene V.

2017-01-01

The Alignable Tight Genomic Clusters (ATGCs) database is a collection of closely related bacterial and archaeal genomes that provides several tools to aid research into evolutionary processes in the microbial world. Each ATGC is a taxonomy-independent cluster of 2 or more completely sequenced genomes that meet the objective criteria of a high degree of local gene order (synteny) and a small number of synonymous substitutions in the protein-coding genes. As such, each ATGC is suited for analysis of microevolutionary variations within a cohesive group of organisms (e.g. species), whereas the entire collection of ATGCs is useful for macroevolutionary studies. The ATGC database includes many forms of pre-computed data, in particular ATGC-COGs (Clusters of Orthologous Genes), multiple sequence alignments, a set of ‘index’ orthologs representing the most well-conserved members of each ATGC-COG, the phylogenetic tree of the organisms within each ATGC, etc. Although the ATGC database contains several million proteins from thousands of genomes organized into hundreds of clusters (roughly a 4-fold increase since the last version of the ATGC database), it is now built with completely automated methods and will be regularly updated following new releases of the NCBI RefSeq database. The ATGC database is hosted jointly at the University of Iowa at dmk-brain.ecn.uiowa.edu/ATGC/ and the NCBI at ftp.ncbi.nlm.nih.gov/pub/kristensen/ATGC/atgc_home.html. PMID:28053163
Gene family size conservation is a good indicator of evolutionary rates.

PubMed

Chen, Feng-Chi; Chen, Chiuan-Jung; Li, Wen-Hsiung; Chuang, Trees-Juen

2010-08-01

The evolution of duplicate genes has been a topic of broad interest. Here, we propose that the conservation of gene family size is a good indicator of the rate of sequence evolution and some other biological properties. By comparing the human-chimpanzee-macaque orthologous gene families with and without family size conservation, we demonstrate that genes with family size conservation evolve more slowly than those without family size conservation. Our results further demonstrate that both family expansion and contraction events may accelerate gene evolution, resulting in elevated evolutionary rates in the genes without family size conservation. In addition, we show that the duplicate genes with family size conservation evolve significantly more slowly than those without family size conservation. Interestingly, the median evolutionary rate of singletons falls in between those of the above two types of duplicate gene families. Our results thus suggest that the controversy on whether duplicate genes evolve more slowly than singletons can be resolved when family size conservation is taken into consideration. Furthermore, we also observe that duplicate genes with family size conservation have the highest level of gene expression/expression breadth, the highest proportion of essential genes, and the lowest gene compactness, followed by singletons and then by duplicate genes without family size conservation. Such a trend accords well with our observations of evolutionary rates. Our results thus point to the importance of family size conservation in the evolution of duplicate genes.
WNV Typer: a server for genotyping of West Nile viruses using an alignment-free method based on a return time distribution.

PubMed

Kolekar, Pandurang; Hake, Nilesh; Kale, Mohan; Kulkarni-Kale, Urmila

2014-03-01

West Nile virus (WNV), genus Flavivirus, family Flaviviridae, is a major cause of viral encephalitis with broad host range and global spread. The virus has undergone a series of evolutionary changes with emergence of various genotypic lineages that are known to differ in type and severity of the diseases caused. Currently, genotyping is carried out using molecular phylogeny of complete coding sequences and genotype is assigned based on proximity to reference genotypes in tree topology. Efficient epidemiological surveillance of WNVs demands development of objective criteria for typing. An alignment-free approach based on return time distribution (RTD) of k-mers has been validated for genotyping of WNVs. The RTDs of complete genome sequences at k=7 were found to be optimum for classification of the known lineages of WNVs as well as for genotyping. It provides a time- and computationally efficient alternative for genome-based annotation of WNV lineages. The development of a WNV Typer server based on RTD is described (http://bioinfo.net.in/wnv/homepage.html). Both the method and the server have 100% sensitivity and specificity. Copyright © 2014 The Authors. Published by Elsevier B.V. All rights reserved.
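As a rough illustration of the return-time idea in the record above, the following Python sketch summarizes each k-mer by the mean and standard deviation of the gaps between its successive occurrences, then compares two sequences by Euclidean distance over those summaries. This is a minimal sketch under stated assumptions, not the WNV Typer implementation: the function names `rtd_features` and `rtd_distance`, the choice of start-to-start gaps as "return time", and the use of Euclidean distance are all illustrative choices not taken from the abstract. The small default `k=2` only keeps the example readable; the abstract reports k=7 as optimal.

```python
from collections import defaultdict
from itertools import product
from math import sqrt
from statistics import mean, pstdev


def rtd_features(seq, k=2, alphabet="ACGT"):
    """Map every possible k-mer to (mean, sd) of its return times in seq.

    Return time is taken here (an assumption) as the difference between the
    start positions of two successive occurrences of the same k-mer.
    K-mers seen fewer than twice get (0.0, 0.0).
    """
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)

    features = {}
    for kmer in map("".join, product(alphabet, repeat=k)):
        pos = positions.get(kmer, [])
        gaps = [b - a for a, b in zip(pos, pos[1:])]
        features[kmer] = (mean(gaps), pstdev(gaps)) if gaps else (0.0, 0.0)
    return features


def rtd_distance(seq_a, seq_b, k=2):
    """Euclidean distance between the RTD summaries of two sequences."""
    fa, fb = rtd_features(seq_a, k), rtd_features(seq_b, k)
    return sqrt(sum((fa[m][0] - fb[m][0]) ** 2 + (fa[m][1] - fb[m][1]) ** 2
                    for m in fa))
```

A pairwise matrix of `rtd_distance` values could then be fed to any standard clustering or tree-building routine; no alignment step is involved.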
Sociology of the growth/no-growth debate

DOE Office of Scientific and Technical Information (OSTI.GOV)

Humphrey, C.R.; Buttel, F.H.

The properties of conservative, liberal, and radical patterns in social science are analyzed and applied to the growth/no-growth debate in environmental policy literature. The fact that conservatives work with an evolutionary model of society suggests that environmental problems are imperfections to be remedied by science, technology, and the free market. Liberals recognize the benefits and costs of growth, and they articulate ways to minimize the costs through state regulation and planning. Radicals argue for state ownership of the means of production and new cultural values about growth as the only effective environmental policies. This analysis closes with a discussion of the future of the growth debate in terms of these paradigms. 40 references.

Alignment of dynamic networks.

PubMed

Vijayan, V; Critchlow, D; Milenkovic, T

2017-07-15

Network alignment (NA) aims to find a node mapping that conserves similar regions between compared networks. NA is applicable to many fields, including computational biology, where NA can guide the transfer of biological knowledge from well- to poorly-studied species across aligned network regions. Existing NA methods can only align static networks. However, most complex real-world systems evolve over time and should thus be modeled as dynamic networks. We hypothesize that aligning dynamic network representations of evolving systems will produce superior alignments compared to aligning the systems' static network representations, as is currently done. For this purpose, we introduce the first ever dynamic NA method, DynaMAGNA++. This proof-of-concept dynamic NA method is an extension of a state-of-the-art static NA method, MAGNA++. Even though both MAGNA++ and DynaMAGNA++ optimize edge as well as node conservation across the aligned networks, MAGNA++ conserves static edges and similarity between static node neighborhoods, while DynaMAGNA++ conserves dynamic edges (events) and similarity between evolving node neighborhoods. For this purpose, we introduce the first ever measure of dynamic edge conservation and rely on our recent measure of dynamic node conservation. Importantly, the two dynamic conservation measures can be optimized with any state-of-the-art NA method and not just MAGNA++. We confirm our hypothesis that dynamic NA is superior to static NA, on synthetic and real-world networks, in computational biology and social domains. DynaMAGNA++ is parallelized and has a user-friendly graphical interface. Availability: http://nd.edu/∼cone/DynaMAGNA++/. Contact: tmilenko@nd.edu. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
Alignment of dynamic networks

PubMed Central

Vijayan, V.; Critchlow, D.; Milenković, T.

2017-01-01

Abstract Motivation: Network alignment (NA) aims to find a node mapping that conserves similar regions between compared networks. NA is applicable to many fields, including computational biology, where NA can guide the transfer of biological knowledge from well- to poorly-studied species across aligned network regions. Existing NA methods can only align static networks. However, most complex real-world systems evolve over time and should thus be modeled as dynamic networks. We hypothesize that aligning dynamic network representations of evolving systems will produce superior alignments compared to aligning the systems’ static network representations, as is currently done. Results: For this purpose, we introduce the first ever dynamic NA method, DynaMAGNA++. This proof-of-concept dynamic NA method is an extension of a state-of-the-art static NA method, MAGNA++. Even though both MAGNA++ and DynaMAGNA++ optimize edge as well as node conservation across the aligned networks, MAGNA++ conserves static edges and similarity between static node neighborhoods, while DynaMAGNA++ conserves dynamic edges (events) and similarity between evolving node neighborhoods. For this purpose, we introduce the first ever measure of dynamic edge conservation and rely on our recent measure of dynamic node conservation. Importantly, the two dynamic conservation measures can be optimized with any state-of-the-art NA method and not just MAGNA++. We confirm our hypothesis that dynamic NA is superior to static NA, on synthetic and real-world networks, in computational biology and social domains. DynaMAGNA++ is parallelized and has a user-friendly graphical interface. Availability and implementation: http://nd.edu/∼cone/DynaMAGNA++/. Contact: tmilenko@nd.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:28881980
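To make the notion of conserving dynamic edges (events) more concrete, here is a simplified Python sketch that scores a node mapping by how much of the edges' active time coincides between two dynamic networks. It is not the DynaMAGNA++ code or its exact measure: the data layout (a dictionary from a `frozenset` edge to a list of `(start, end)` activity intervals), the function names, and the ratio "conserved time over conserved plus non-conserved time" are assumptions made for illustration. Intervals on the same edge are assumed not to overlap one another, the mapping is assumed injective, and self-loops are assumed absent.

```python
def active_time(intervals):
    """Total time an edge is active, given non-overlapping (start, end) intervals."""
    return sum(end - start for start, end in intervals)


def overlap_time(a, b):
    """Total time during which an edge with intervals a and one with intervals b are both active."""
    return sum(max(0, min(e1, e2) - max(s1, s2))
               for s1, e1 in a for s2, e2 in b)


def dynamic_edge_conservation(events1, events2, mapping):
    """Fraction of event time that is conserved under a node mapping.

    events1, events2: {frozenset({u, v}): [(start, end), ...]}
    mapping: node of network 1 -> node of network 2 (assumed injective)
    """
    image_nodes = set(mapping.values())
    conserved = 0.0
    total1 = 0.0
    for edge, intervals in events1.items():
        u, v = tuple(edge)
        image_edge = frozenset((mapping[u], mapping[v]))
        conserved += overlap_time(intervals, events2.get(image_edge, []))
        total1 += active_time(intervals)
    # Only edges of network 2 between mapped nodes are counted.
    total2 = sum(active_time(iv) for edge, iv in events2.items()
                 if edge <= image_nodes)
    union = total1 + total2 - conserved  # conserved + non-conserved time
    return conserved / union if union > 0 else 0.0
```

A search procedure would vary `mapping` to maximize such a score, possibly combined with a node-conservation term; that optimization is outside this sketch.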
Structural re-alignment in an immunologic surface region of ricin A chain

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zemla, A T; Zhou, C E

2007-07-24

We compared structure alignments generated by several protein structure comparison programs to determine whether existing methods would satisfactorily align residues at a highly conserved position within an immunogenic loop in ribosome inactivating proteins (RIPs). Using default settings, structure alignments generated by several programs (CE, DaliLite, FATCAT, LGA, MAMMOTH, MATRAS, SHEBA, SSM) failed to align the respective conserved residues, although LGA reported correct residue-residue (R-R) correspondences when the beta-carbon (Cb) position was used as the point of reference in the alignment calculations. Further tests using variable points of reference indicated that points distal from the beta carbon along a vector connecting the alpha and beta carbons yielded rigid structural alignments in which residues known to be highly conserved in RIPs were reported as corresponding residues in structural comparisons between ricin A chain, abrin-A, and other RIPs. Results suggest that approaches to structure alignment employing alternate point representations corresponding to side chain position may yield structure alignments that are more consistent with observed conservation of functional surface residues than do standard alignment programs, which apply uniform criteria for alignment (i.e., alpha carbon (Ca) as point of reference) along the entirety of the peptide chain. We present the results of tests that suggest the utility of allowing user-specified points of reference in generating alternate structural alignments, and we present a web server for automatically generating such alignments.
Evol and ProDy for bridging protein sequence evolution and structural dynamics.

PubMed

Bakan, Ahmet; Dutta, Anindita; Mao, Wenzhi; Liu, Ying; Chennubhotla, Chakra; Lezon, Timothy R; Bahar, Ivet

2014-09-15

Correlations between sequence evolution and structural dynamics are of utmost importance in understanding the molecular mechanisms of function and their evolution. We have integrated Evol, a new package for fast and efficient comparative analysis of evolutionary patterns and conformational dynamics, into ProDy, a computational toolbox designed for inferring protein dynamics from experimental and theoretical data. Using information-theoretic approaches, Evol coanalyzes conservation and coevolution profiles extracted from multiple sequence alignments of protein families with their inferred dynamics. ProDy and Evol are open-source and freely available under MIT License from http://prody.csb.pitt.edu/. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
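The information-theoretic conservation profile mentioned in the record above can be illustrated with per-column Shannon entropy of a multiple sequence alignment. The sketch below is plain Python and deliberately does not call the ProDy or Evol API; the function names, the gap character `-`, and the decision to ignore gaps are assumptions for illustration only. Low entropy indicates a conserved column.

```python
from collections import Counter
from math import log2


def column_entropy(column, gap="-"):
    """Shannon entropy (bits) of one alignment column, ignoring gap characters."""
    residues = [c for c in column if c != gap]
    if not residues:
        return 0.0
    n = len(residues)
    return -sum((count / n) * log2(count / n)
                for count in Counter(residues).values())


def msa_entropy(msa):
    """Per-column entropy for an MSA given as equal-length aligned strings."""
    return [column_entropy(col) for col in zip(*msa)]
```

For example, `msa_entropy(["ACD", "ACE", "AGD", "AGE"])` gives zero entropy for the invariant first column and one bit for each of the two columns that split evenly between two residues.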
Phylogenic inference using alignment-free methods for applications in microbial community surveys using 16s rRNA gene

PubMed Central

2017-01-01

The diversity of microbiota is best explored by understanding the phylogenetic structure of the microbial communities. Traditionally, sequence alignment has been used for phylogenetic inference. However, alignment-based approaches come with significant challenges and limitations when massive amounts of data are analyzed. In the recent decade, alignment-free approaches have enabled genome-scale phylogenetic inference. Here we evaluate three alignment-free methods (ACS, CVTree, and Kr) for phylogenetic inference with 16S rRNA gene data. We use a taxonomic gold standard to compare the accuracy of alignment-free phylogenetic inference with that of common microbiome-wide phylogenetic inference pipelines based on PyNAST and MUSCLE alignments with FastTree and RAxML. We re-simulate fecal communities from Human Microbiome Project data to evaluate the performance of the methods on datasets with properties of real data. Our comparisons show that alignment-free methods are not inferior to alignment-based methods in giving accurate and robust phylogenetic trees. Moreover, consensus ensembles of alignment-free phylogenies are superior to those built from alignment-based methods in their ability to highlight community differences in low power settings. In addition, the overall running times of alignment-based and alignment-free phylogenetic inference are comparable. Taken together, our empirical results suggest that alignment-free methods provide a viable approach for microbiome-wide phylogenetic inference. PMID:29136663
Conservation Evo-Devo: Preserving Biodiversity by Understanding Its Origins.

PubMed

Campbell, Calum S; Adams, Colin E; Bean, Colin W; Parsons, Kevin J

2017-10-01

Unprecedented rates of species extinction increase the urgency for effective conservation biology management practices. Thus, any improvements in practice are vital and we suggest that conservation can be enhanced through recent advances in evolutionary biology, specifically advances put forward by evolutionary developmental biology (i.e., evo-devo). There are strong overlapping conceptual links between conservation and evo-devo whereby both fields focus on evolutionary potential. In particular, benefits to conservation can be derived from some of the main areas of evo-devo research, namely phenotypic plasticity, modularity and integration, and mechanistic investigations of the precise developmental and genetic processes that determine phenotypes. Using examples we outline how evo-devo can expand into conservation biology, an opportunity which holds great promise for advancing both fields. Copyright © 2017 Elsevier Ltd. All rights reserved.
Identification and Characterization of Small Noncoding RNAs in Genome Sequences of the Edible Fungus Pleurotus ostreatus

PubMed Central

Zhao, Mengran; Hsiang, Tom; Feng, Xiaoxing

2016-01-01

Noncoding RNAs (ncRNAs) have been identified in many fungi. However, no genome-scale inventory of ncRNAs has been reported for basidiomycetes. In this research, we detected 254 small noncoding RNAs (sncRNAs) in a genome assembly of an isolate (CCEF00389) of Pleurotus ostreatus, an edible basidiomycetous fungus that is widely cultivated worldwide. The identified sncRNAs include snRNAs, snoRNAs, tRNAs, and miRNAs. BLASTn did not detect snRNA U1 in the CCEF00389 genome assembly or in some other basidiomycetous genomes. This implies that if snRNA U1 of basidiomycetes exists, it has a sequence that varies significantly from other organisms. By analyzing the distribution of sncRNA loci, we found that snRNAs and most tRNAs (88.6%) were located in pseudo-UTR regions, while miRNAs are commonly found in introns. To analyze the evolutionary conservation of the sncRNAs in P. ostreatus, we aligned all 254 sncRNAs to the genome assemblies of some other Agaricomycotina fungi. The results suggest that most sncRNAs (77.56%) were highly conserved in P. ostreatus, and 20% were conserved in Agaricomycotina fungi. These findings indicate that most sncRNAs of P. ostreatus were not conserved across Agaricomycotina fungi. PMID:27703969
The Ditylenchus destructor genome provides new insights into the evolution of plant parasitic nematodes

PubMed Central

Zheng, Jinshui; Peng, Donghai; Chen, Ling; Liu, Hualin; Chen, Feng; Xu, Mengci; Ju, Shouyong; Ruan, Lifang

2016-01-01

Plant-parasitic nematodes are found in 4 of the 12 clades of phylum Nematoda. These nematodes in different clades may have originated independently from their free-living fungivorous ancestors. However, the exact evolutionary process of these parasites is unclear. Here, we sequenced the genome of a migratory plant nematode, Ditylenchus destructor. We performed comparative genomics among the free-living nematode Caenorhabditis elegans and all the plant nematodes with genome sequences available. We found that, compared with C. elegans, the core developmental control processes underwent heavy reduction, though most signal transduction pathways were conserved. We also found that D. destructor contained more homologs of the key genes in the above processes than the other plant nematodes. We suggest that Ditylenchus spp. may represent an intermediate evolutionary stage between free-living nematodes that feed on fungi and obligate plant-parasitic nematodes. Based on the facts that D. destructor can feed on fungi and has a relatively short life cycle, and that it has similar features to both C. elegans and sedentary plant-parasitic nematodes from clade 12, we propose it as a new model to study the biology and biocontrol of plant nematodes and the interaction between nematodes and plants. PMID:27466450
Conservation of coevolving protein interfaces bridges prokaryote–eukaryote homologies in the twilight zone

PubMed Central

Rodriguez-Rivas, Juan; Marsili, Simone; Juan, David; Valencia, Alfonso

2016-01-01

Protein–protein interactions are fundamental for the proper functioning of the cell. As a result, protein interaction surfaces are subject to strong evolutionary constraints. Recent developments have shown that residue coevolution provides accurate predictions of heterodimeric protein interfaces from sequence information. So far these approaches have been limited to the analysis of families of prokaryotic complexes for which large multiple sequence alignments of homologous sequences can be compiled. We explore the hypothesis that coevolution points to structurally conserved contacts at protein–protein interfaces, which can be reliably projected to homologous complexes with distantly related sequences. We introduce a domain-centered protocol to study the interplay between residue coevolution and structural conservation of protein–protein interfaces. We show that sequence-based coevolutionary analysis systematically identifies residue contacts at prokaryotic interfaces that are structurally conserved at the interface of their eukaryotic counterparts. In turn, this allows the prediction of conserved contacts at eukaryotic protein–protein interfaces with high confidence using solely mutational patterns extracted from prokaryotic genomes. Even in the context of high divergence in sequence (the twilight zone), where standard homology modeling of protein complexes is unreliable, our approach provides sequence-based accurate information about specific details of protein interactions at the residue level. Selected examples of the application of prokaryotic coevolutionary analysis to the prediction of eukaryotic interfaces further illustrate the potential of this approach. PMID:27965389
Conservation of coevolving protein interfaces bridges prokaryote-eukaryote homologies in the twilight zone.

PubMed

Rodriguez-Rivas, Juan; Marsili, Simone; Juan, David; Valencia, Alfonso

2016-12-27

Protein-protein interactions are fundamental for the proper functioning of the cell. As a result, protein interaction surfaces are subject to strong evolutionary constraints. Recent developments have shown that residue coevolution provides accurate predictions of heterodimeric protein interfaces from sequence information. So far these approaches have been limited to the analysis of families of prokaryotic complexes for which large multiple sequence alignments of homologous sequences can be compiled. We explore the hypothesis that coevolution points to structurally conserved contacts at protein-protein interfaces, which can be reliably projected to homologous complexes with distantly related sequences. We introduce a domain-centered protocol to study the interplay between residue coevolution and structural conservation of protein-protein interfaces. We show that sequence-based coevolutionary analysis systematically identifies residue contacts at prokaryotic interfaces that are structurally conserved at the interface of their eukaryotic counterparts. In turn, this allows the prediction of conserved contacts at eukaryotic protein-protein interfaces with high confidence using solely mutational patterns extracted from prokaryotic genomes. Even in the context of high divergence in sequence (the twilight zone), where standard homology modeling of protein complexes is unreliable, our approach provides sequence-based accurate information about specific details of protein interactions at the residue level. Selected examples of the application of prokaryotic coevolutionary analysis to the prediction of eukaryotic interfaces further illustrate the potential of this approach.
Functional Sites Induce Long-Range Evolutionary Constraints in Enzymes

PubMed Central

Jack, Benjamin R.; Meyer, Austin G.; Echave, Julian; Wilke, Claus O.

2016-01-01

Functional residues in proteins tend to be highly conserved over evolutionary time. However, to what extent functional sites impose evolutionary constraints on nearby or even more distant residues is not known. Here, we report pervasive conservation gradients toward catalytic residues in a dataset of 524 distinct enzymes: evolutionary conservation decreases approximately linearly with increasing distance to the nearest catalytic residue in the protein structure. This trend encompasses, on average, 80% of the residues in any enzyme, and it is independent of known structural constraints on protein evolution such as residue packing or solvent accessibility. Further, the trend exists in both monomeric and multimeric enzymes and irrespective of enzyme size and/or location of the active site in the enzyme structure. By contrast, sites in protein–protein interfaces, unlike catalytic residues, are only weakly conserved and induce only minor rate gradients. In aggregate, these observations show that functional sites, and in particular catalytic residues, induce long-range evolutionary constraints in enzymes. PMID:27138088
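The conservation gradient described above (site variability rising roughly linearly with distance from the nearest catalytic residue) can be sketched as a distance calculation followed by an ordinary least-squares fit. The Python below is an illustration under stated assumptions, not the authors' pipeline: one 3D coordinate per residue, the function names `distance_to_nearest` and `linear_fit`, and the toy data in the usage note are all hypothetical.

```python
from math import dist


def distance_to_nearest(coords, catalytic_idx):
    """Distance from every residue to its nearest catalytic residue.

    coords: list of (x, y, z) tuples, one per residue
    catalytic_idx: indices into coords marking catalytic residues
    """
    return [min(dist(c, coords[j]) for j in catalytic_idx) for c in coords]


def linear_fit(x, y):
    """Ordinary least-squares slope and intercept of y against x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx
```

With four residues spaced one unit apart on a line, the first one catalytic, and made-up site rates `[0.0, 0.5, 1.0, 1.5]`, `linear_fit(distance_to_nearest(coords, [0]), rates)` yields a positive slope, which is the signature of a gradient toward the active site.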
Modeling the evolution of regulatory elements by simultaneous detection and alignment with phylogenetic pair HMMs.

PubMed

Majoros, William H; Ohler, Uwe

2010-12-16

The computational detection of regulatory elements in DNA is a difficult but important problem impacting our progress in understanding the complex nature of eukaryotic gene regulation. Attempts to utilize cross-species conservation for this task have been hampered both by evolutionary changes of functional sites and poor performance of general-purpose alignment programs when applied to non-coding sequence. We describe a new and flexible framework for modeling binding site evolution in multiple related genomes, based on phylogenetic pair hidden Markov models which explicitly model the gain and loss of binding sites along a phylogeny. We demonstrate the value of this framework for both the alignment of regulatory regions and the inference of precise binding-site locations within those regions. As the underlying formalism is a stochastic, generative model, it can also be used to simulate the evolution of regulatory elements. Our implementation is scalable in terms of numbers of species and sequence lengths and can produce alignments and binding-site predictions with accuracy rivaling or exceeding current systems that specialize in only alignment or only binding-site prediction. We demonstrate the validity and power of various model components on extensive simulations of realistic sequence data and apply a specific model to study Drosophila enhancers in as many as ten related genomes and in the presence of gain and loss of binding sites. Different models and modeling assumptions can be easily specified, thus providing an invaluable tool for the exploration of biological hypotheses that can drive improvements in our understanding of the mechanisms and evolution of gene regulation.
REDIdb 3.0: A Comprehensive Collection of RNA Editing Events in Plant Organellar Genomes.

PubMed

Lo Giudice, Claudio; Pesole, Graziano; Picardi, Ernesto

2018-01-01

RNA editing is an important epigenetic mechanism by which genome-encoded transcripts are modified by substitutions, insertions and/or deletions. It was first discovered in kinetoplastid protozoa and has since been reported in a wide range of organisms. In plants, RNA editing occurs mostly by cytidine (C) to uridine (U) conversion in translated regions of organelle mRNAs and tends to modify affected codons, restoring evolutionarily conserved amino acid residues. RNA editing has also been described in non-protein coding regions such as group II introns and structural RNAs. Despite its impact on organellar transcriptome and proteome complexity, current primary databases still do not provide a specific field for RNA editing events. To overcome these limitations, we developed REDIdb, a specialized database for RNA editing modifications in plant organelles. Here we describe its third release, which contains more than 26,000 events in a completely novel web interface that places RNA editing in its genomic, biological and evolutionary context through whole genome maps and multiple sequence alignments. REDIdb is freely available at http://srv00.recas.ba.infn.it/redidb/index.html.
A novel approach to multiple sequence alignment using hadoop data grids.

PubMed

Sudha Sadasivam, G; Baktavatchalam, G

2010-01-01

Multiple alignment of protein sequences helps to determine evolutionary linkage and to predict molecular structures. The factors to be considered while aligning multiple sequences are speed and accuracy of alignment. Although dynamic programming algorithms produce accurate alignments, they are computationally intensive. In this paper we propose a time-efficient approach to sequence alignment that also produces quality alignments. The dynamic nature of the algorithm coupled with data and computational parallelism of Hadoop data grids improves the accuracy and speed of sequence alignment. The principle of block splitting in Hadoop coupled with its scalability facilitates alignment of very large sequences.
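To show why dynamic programming alignment is accurate but costly, here is a minimal Python sketch of the classic Needleman-Wunsch global alignment score for two sequences. It is a textbook pairwise example, not the Hadoop-based method of the record above; the scoring values (`match=1`, `mismatch=-1`, `gap=-2`) and the function name are assumptions chosen for illustration. The nested loops make the time cost proportional to the product of the two sequence lengths, which is the kind of workload that block splitting and parallel execution aim to distribute.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-2):
    """Optimal global alignment score of a and b with a linear gap penalty.

    Only two rows of the dynamic programming table are kept, so memory is
    O(len(b)) while time remains O(len(a) * len(b)).
    """
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]
```

Recovering the alignment itself, rather than just its score, would require keeping the full table or traceback pointers.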
Identifying functionally informative evolutionary sequence profiles.

PubMed

Gil, Nelson; Fiser, Andras

2018-04-15

Multiple sequence alignments (MSAs) can provide essential input to many bioinformatics applications, including protein structure prediction and functional annotation. However, the optimal selection of sequences to obtain biologically informative MSAs for such purposes is poorly explored, and has traditionally been performed manually. We present Selection of Alignment by Maximal Mutual Information (SAMMI), an automated, sequence-based approach to objectively select an optimal MSA from a large set of alternatives sampled from a general sequence database search. The hypothesis of this approach is that the mutual information among MSA columns will be maximal for those MSAs that contain the most diverse set possible of the most structurally and functionally homogeneous protein sequences. SAMMI was tested to select MSAs for functional site residue prediction by analysis of conservation patterns on a set of 435 proteins obtained from protein-ligand (peptides, nucleic acids and small substrates) and protein-protein interaction databases. Availability and implementation: A freely accessible program, including source code, implementing SAMMI is available at https://github.com/nelsongil92/SAMMI.git. Contact: andras.fiser@einstein.yu.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
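The selection criterion in the record above rests on mutual information among MSA columns. The Python sketch below computes mutual information for a pair of columns, averages it over all column pairs, and picks the candidate alignment with the highest average. It is an illustration only: how SAMMI actually aggregates, normalizes, or corrects mutual information is not stated in the abstract, so the averaging scheme and the names `mutual_information`, `mean_column_mi`, and `pick_max_mi` are assumptions.

```python
from collections import Counter
from itertools import combinations
from math import log2


def mutual_information(col_a, col_b):
    """Mutual information (bits) between two equally long alignment columns."""
    n = len(col_a)
    pa, pb = Counter(col_a), Counter(col_b)
    pab = Counter(zip(col_a, col_b))
    return sum((c / n) * log2((c / n) / ((pa[a] / n) * (pb[b] / n)))
               for (a, b), c in pab.items())


def mean_column_mi(msa):
    """Average mutual information over all pairs of columns in an MSA."""
    pairs = list(combinations(zip(*msa), 2))
    if not pairs:
        return 0.0
    return sum(mutual_information(a, b) for a, b in pairs) / len(pairs)


def pick_max_mi(candidates):
    """Return the candidate MSA with the highest mean column mutual information."""
    return max(candidates, key=mean_column_mi)
```

In practice raw mutual information is biased by column entropy and alignment depth, so a real selection procedure would likely apply some correction; none is attempted here.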
Using genomics to characterize evolutionary potential for conservation of wild populations

PubMed Central

Harrisson, Katherine A; Pavlova, Alexandra; Telonis-Scott, Marina; Sunnucks, Paul

2014-01-01

Genomics promises exciting advances towards the important conservation goal of maximizing evolutionary potential, notwithstanding associated challenges. Here, we explore some of the complexity of adaptation genetics and discuss the strengths and limitations of genomics as a tool for characterizing evolutionary potential in the context of conservation management. Many traits are polygenic and can be strongly influenced by minor differences in regulatory networks and by epigenetic variation not visible in DNA sequence. Much of this critical complexity is difficult to detect using methods commonly used to identify adaptive variation, and this needs appropriate consideration when planning genomic screens, and when basing management decisions on genomic data. When the genomic basis of adaptation and future threats are well understood, it may be appropriate to focus management on particular adaptive traits. For more typical conservation scenarios, we argue that screening genome-wide variation should be a sensible approach that may provide a generalized measure of evolutionary potential that accounts for the contributions of small-effect loci and cryptic variation and is robust to uncertainty about future change and required adaptive response(s). The best conservation outcomes should be achieved when genomic estimates of evolutionary potential are used within an adaptive management framework. PMID:25553064
Fast discovery and visualization of conserved regions in DNA sequences using quasi-alignment

PubMed Central

2013-01-01

Background Next Generation Sequencing techniques are producing enormous amounts of biological sequence data and analysis becomes a major computational problem. Currently, most analysis, especially the identification of conserved regions, relies heavily on Multiple Sequence Alignment and its various heuristics such as progressive alignment, whose run time grows with the square of the number and the length of the aligned sequences and requires significant computational resources. In this work, we present a method to efficiently discover regions of high similarity across multiple sequences without performing expensive sequence alignment. The method is based on approximating edit distance between segments of sequences using p-mer frequency counts. Then, efficient high-throughput data stream clustering is used to group highly similar segments into so called quasi-alignments. Quasi-alignments have numerous applications such as identifying species and their taxonomic class from sequences, comparing sequences for similarities, and, as in this paper, discovering conserved regions across related sequences. Results In this paper, we show that quasi-alignments can be used to discover highly similar segments across multiple sequences from related or different genomes efficiently and accurately. Experiments on a large number of unaligned 16S rRNA sequences obtained from the Greengenes database show that the method is able to identify conserved regions which agree with known hypervariable regions in 16S rRNA. Furthermore, the experiments show that the proposed method scales well for large data sets with a run time that grows only linearly with the number and length of sequences, whereas for existing multiple sequence alignment heuristics the run time grows super-linearly. Conclusion Quasi-alignment-based algorithms can detect highly similar regions and conserved areas across multiple sequences. 
Since the run time is linear and the sequences are converted into a compact clustering model, we are able to identify conserved regions fast or even interactively using a standard PC. Our method has many potential applications such as finding characteristic signature sequences for families of organisms and studying conserved and variable regions in, for example, 16S rRNA. PMID:24564200
Fast discovery and visualization of conserved regions in DNA sequences using quasi-alignment.

PubMed

Nagar, Anurag; Hahsler, Michael

2013-01-01

Next Generation Sequencing techniques are producing enormous amounts of biological sequence data and analysis becomes a major computational problem. Currently, most analysis, especially the identification of conserved regions, relies heavily on Multiple Sequence Alignment and its various heuristics such as progressive alignment, whose run time grows with the square of the number and the length of the aligned sequences and requires significant computational resources. In this work, we present a method to efficiently discover regions of high similarity across multiple sequences without performing expensive sequence alignment. The method is based on approximating edit distance between segments of sequences using p-mer frequency counts. Then, efficient high-throughput data stream clustering is used to group highly similar segments into so called quasi-alignments. Quasi-alignments have numerous applications such as identifying species and their taxonomic class from sequences, comparing sequences for similarities, and, as in this paper, discovering conserved regions across related sequences. In this paper, we show that quasi-alignments can be used to discover highly similar segments across multiple sequences from related or different genomes efficiently and accurately. Experiments on a large number of unaligned 16S rRNA sequences obtained from the Greengenes database show that the method is able to identify conserved regions which agree with known hypervariable regions in 16S rRNA. Furthermore, the experiments show that the proposed method scales well for large data sets with a run time that grows only linearly with the number and length of sequences, whereas for existing multiple sequence alignment heuristics the run time grows super-linearly. Quasi-alignment-based algorithms can detect highly similar regions and conserved areas across multiple sequences. 
Since the run time is linear and the sequences are converted into a compact clustering model, we are able to identify conserved regions fast or even interactively using a standard PC. Our method has many potential applications such as finding characteristic signature sequences for families of organisms and studying conserved and variable regions in, for example, 16S rRNA.
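The core idea of the quasi-alignment records above, comparing segments by p-mer frequency counts instead of aligning them, can be sketched as follows. This is a minimal illustration only: the Manhattan distance, the non-overlapping segmentation, and the function names are assumptions, and the data stream clustering step described in the abstract is not shown.

```python
from collections import Counter


def pmer_counts(segment, p=3):
    """Count every overlapping p-mer in a sequence segment."""
    return Counter(segment[i:i + p] for i in range(len(segment) - p + 1))


def pmer_distance(seg_a, seg_b, p=3):
    """Manhattan distance between the p-mer count vectors of two segments."""
    counts_a, counts_b = pmer_counts(seg_a, p), pmer_counts(seg_b, p)
    return sum(abs(counts_a[k] - counts_b[k]) for k in set(counts_a) | set(counts_b))


def segments(sequence, length):
    """Split a sequence into consecutive, non-overlapping segments."""
    return [sequence[i:i + length] for i in range(0, len(sequence) - length + 1, length)]


# Hypothetical toy sequences: the second differs from the first by one substitution.
seq_1 = "ACGTACGTTTGACCAG"
seq_2 = "ACGTACGTTTGACGAG"
for a, b in zip(segments(seq_1, 8), segments(seq_2, 8)):
    print(a, b, pmer_distance(a, b))
```

A single substitution alters at most p overlapping p-mers, so it changes this distance by at most 2p. That bound is why count-based distances can stand in for edit distance when segments are similar, and computing them takes time linear in segment length.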
Evolutionary profiles from the QR factorization of multiple sequence alignments

PubMed Central

Sethi, Anurag; O'Donoghue, Patrick; Luthey-Schulten, Zaida

2005-01-01

We present an algorithm to generate complete evolutionary profiles that represent the topology of the molecular phylogenetic tree of the homologous group. The method, based on the multidimensional QR factorization of numerically encoded multiple sequence alignments, removes redundancy from the alignments and orders the protein sequences by increasing linear dependence, resulting in the identification of a minimal basis set of sequences that spans the evolutionary space of the homologous group of proteins. We observe a general trend that these smaller, more evolutionarily balanced profiles have comparable and, in many cases, better performance in database searches than conventional profiles containing hundreds of sequences, constructed in an iterative and computationally intensive procedure. For more diverse families or superfamilies, with sequence identity <30%, structural alignments, based purely on the geometry of the protein structures, provide better alignments than pure sequence-based methods. Merging the structure and sequence information allows the construction of accurate profiles for distantly related groups. These structure-based profiles outperformed other sequence-based methods for finding distant homologs and were used to identify a putative class II cysteinyl-tRNA synthetase (CysRS) in several archaea that eluded previous annotation studies. Phylogenetic analysis showed the putative class II CysRSs to be a monophyletic group and homology modeling revealed a constellation of active site residues similar to that in the known class I CysRS. PMID:15741270
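Ordering sequences "by increasing linear dependence" can be illustrated with a greedy pivoted Gram-Schmidt procedure, which selects vectors in the same order as a column-pivoted QR factorization. This is a sketch under stated assumptions: the one-hot encoding and the function names are hypothetical, and the multidimensional QR and numerical encoding used in the paper are not described in the abstract and may differ.

```python
def one_hot(seq, alphabet="ACDEFGHIKLMNPQRSTVWY-"):
    """Encode an aligned sequence as a flat 0/1 vector (assumed encoding)."""
    return [1.0 if residue == symbol else 0.0 for residue in seq for symbol in alphabet]


def pivot_order(vectors):
    """Order vectors so that each next pick is the least explained by earlier picks."""
    residual = [list(v) for v in vectors]
    remaining = list(range(len(vectors)))
    order = []
    while remaining:
        norms = {i: sum(x * x for x in residual[i]) for i in remaining}
        pivot = max(remaining, key=lambda i: norms[i])
        order.append(pivot)
        remaining.remove(pivot)
        if norms[pivot] < 1e-12:
            continue  # fully redundant sequence: nothing left to project out
        q = [x / norms[pivot] ** 0.5 for x in residual[pivot]]
        for i in remaining:
            dot = sum(a * b for a, b in zip(residual[i], q))
            residual[i] = [a - dot * b for a, b in zip(residual[i], q)]
    return order


# Hypothetical toy alignment: sequences 0 and 1 are identical, sequence 2 is unrelated.
msa = ["ACDE", "ACDE", "WYWY"]
print(pivot_order([one_hot(s) for s in msa]))
```

The redundant copy is placed last, so truncating the ordering after the first few entries keeps a small, non-redundant set of sequences.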
Conservation Education in Schools: Aligning Teachers' Perceptions with Students' Attitudes

ERIC Educational Resources Information Center

Sutherland, Melany R.

2017-01-01

As global environmental problems intensify, the importance of providing effective conservation education to young people is increasingly apparent. To accomplish this, teachers' perceptions and students' attitudes about conservation education in schools must align. This article explores students' attitudes via a survey distributed to students from…

BLSSpeller: exhaustive comparative discovery of conserved cis-regulatory elements.

PubMed

De Witte, Dieter; Van de Velde, Jan; Decap, Dries; Van Bel, Michiel; Audenaert, Pieter; Demeester, Piet; Dhoedt, Bart; Vandepoele, Klaas; Fostier, Jan

2015-12-01

The accurate discovery and annotation of regulatory elements remains a challenging problem. The growing number of sequenced genomes creates new opportunities for comparative approaches to motif discovery. Putative binding sites are then considered to be functional if they are conserved in orthologous promoter sequences of multiple related species. Existing methods for comparative motif discovery usually rely on pregenerated multiple sequence alignments, which are difficult to obtain for more diverged species such as plants. As a consequence, misaligned regulatory elements often remain undetected. We present a novel algorithm that supports both alignment-free and alignment-based motif discovery in the promoter sequences of related species. Putative motifs are exhaustively enumerated as words over the IUPAC alphabet and screened for conservation using the branch length score. Additionally, a confidence score is established in a genome-wide fashion. In order to take advantage of a cloud computing infrastructure, the MapReduce programming model is adopted. The method is applied to four monocotyledon plant species and it is shown that high-scoring motifs are significantly enriched for open chromatin regions in Oryza sativa and for transcription factor binding sites inferred through protein-binding microarrays in O. sativa and Zea mays. Furthermore, the method is shown to recover experimentally profiled ga2ox1-like KN1 binding sites in Z. mays. BLSSpeller was written in Java. Source code and manual are available at http://bioinformatics.intec.ugent.be/blsspeller. Contact: Klaas.Vandepoele@psb.vib-ugent.be or jan.fostier@intec.ugent.be. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.
BLSSpeller: exhaustive comparative discovery of conserved cis-regulatory elements

PubMed Central

De Witte, Dieter; Van de Velde, Jan; Decap, Dries; Van Bel, Michiel; Audenaert, Pieter; Demeester, Piet; Dhoedt, Bart; Vandepoele, Klaas; Fostier, Jan

2015-01-01

Motivation: The accurate discovery and annotation of regulatory elements remains a challenging problem. The growing number of sequenced genomes creates new opportunities for comparative approaches to motif discovery. Putative binding sites are then considered to be functional if they are conserved in orthologous promoter sequences of multiple related species. Existing methods for comparative motif discovery usually rely on pregenerated multiple sequence alignments, which are difficult to obtain for more diverged species such as plants. As a consequence, misaligned regulatory elements often remain undetected. Results: We present a novel algorithm that supports both alignment-free and alignment-based motif discovery in the promoter sequences of related species. Putative motifs are exhaustively enumerated as words over the IUPAC alphabet and screened for conservation using the branch length score. Additionally, a confidence score is established in a genome-wide fashion. In order to take advantage of a cloud computing infrastructure, the MapReduce programming model is adopted. The method is applied to four monocotyledon plant species and it is shown that high-scoring motifs are significantly enriched for open chromatin regions in Oryza sativa and for transcription factor binding sites inferred through protein-binding microarrays in O. sativa and Zea mays. Furthermore, the method is shown to recover experimentally profiled ga2ox1-like KN1 binding sites in Z. mays. Availability and implementation: BLSSpeller was written in Java. Source code and manual are available at http://bioinformatics.intec.ugent.be/blsspeller. Contact: Klaas.Vandepoele@psb.vib-ugent.be or jan.fostier@intec.ugent.be. Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26254488
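The alignment-free screening step described above, matching an IUPAC word in each species' promoter and scoring its conservation with a branch length score, can be sketched as below. The score is assumed here to be the branch length of the subtree connecting the species that contain the motif, expressed as a fraction of the total tree length. The tree, its branch lengths, the species labels, and the function names are all hypothetical, and the actual tool is a Java MapReduce implementation that is not reproduced here.

```python
# IUPAC nucleotide codes and the bases each one matches.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Hypothetical toy tree: node -> (parent, branch length to parent).
TREE = {
    "spA": ("n1", 0.2), "spB": ("n1", 0.3),
    "spC": ("n2", 0.1), "spD": ("n2", 0.1),
    "n1": ("root", 0.2), "n2": ("root", 0.3),
}


def occurs(motif, sequence):
    """True if the IUPAC motif matches anywhere in the sequence (one strand only)."""
    k = len(motif)
    return any(
        all(sequence[i + j] in IUPAC[motif[j]] for j in range(k))
        for i in range(len(sequence) - k + 1)
    )


def branch_length_score(tree, species_with_motif):
    """Fraction of the tree's branch length that connects the species containing the motif."""
    hits = set(species_with_motif)
    below = {node: 0 for node in tree}
    for leaf in hits:
        node = leaf
        while node in tree:  # walk up to the root, counting hits under each branch
            below[node] += 1
            node = tree[node][0]
    # A branch lies on the connecting subtree if it separates some hits from others.
    connecting = sum(length for node, (_, length) in tree.items() if 0 < below[node] < len(hits))
    total = sum(length for _, length in tree.values())
    return connecting / total


# Hypothetical promoter fragments for one gene family.
promoters = {"spA": "TTCACGTGAA", "spB": "GGCATTTGCC", "spC": "AAAAAAAAAA", "spD": "TTTTTTTTTT"}
found = [sp for sp, seq in promoters.items() if occurs("CANNTG", seq)]
print(found, branch_length_score(TREE, found))
```

Because every species is searched independently, a motif can be counted as conserved even when it sits at different positions in each promoter, which is the case an alignment-based screen tends to miss.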
SeqFIRE: a web application for automated extraction of indel regions and conserved blocks from protein multiple sequence alignments.

PubMed

Ajawatanawong, Pravech; Atkinson, Gemma C; Watson-Haigh, Nathan S; Mackenzie, Bryony; Baldauf, Sandra L

2012-07-01

Analyses of multiple sequence alignments generally focus on well-defined conserved sequence blocks, while the rest of the alignment is largely ignored or discarded. This is especially true in phylogenomics, where large multigene datasets are produced through automated pipelines. However, some of the most powerful phylogenetic markers have been found in the variable length regions of multiple alignments, particularly insertions/deletions (indels) in protein sequences. We have developed Sequence Feature and Indel Region Extractor (SeqFIRE) to enable the automated identification and extraction of indels from protein sequence alignments. The program can also extract conserved blocks and identify fast evolving sites using a combination of conservation and entropy. All major variables can be adjusted by the user, allowing them to identify the sets of variables most suited to a particular analysis or dataset. Thus, all major tasks in preparing an alignment for further analysis are combined in a single flexible and user-friendly program. The output includes a numbered list of indels, alignments in NEXUS format with indels annotated or removed and indel-only matrices. SeqFIRE is a user-friendly web application, freely available online at www.seqfire.org/.
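Extracting conserved blocks from an alignment with a per-column entropy measure can be sketched as follows. This is an illustrative example only: the thresholds, the rule that any gap ends a block, and the function names are assumptions, and SeqFIRE's actual parameters and scoring are not given in the abstract.

```python
from collections import Counter
from math import log2


def column_entropy(column):
    """Shannon entropy (bits) of the residues in one alignment column."""
    n = len(column)
    return -sum((c / n) * log2(c / n) for c in Counter(column).values())


def conserved_blocks(msa, max_entropy=0.5, min_length=2):
    """Return half-open (start, end) column ranges that are gap-free and low-entropy."""
    columns = list(zip(*msa))
    blocks, start = [], None
    for i, col in enumerate(columns + [None]):  # None is a sentinel that closes the last block
        ok = col is not None and "-" not in col and column_entropy(col) <= max_entropy
        if ok and start is None:
            start = i
        elif not ok and start is not None:
            if i - start >= min_length:
                blocks.append((start, i))
            start = None
    return blocks


# Hypothetical toy protein alignment with one indel column (index 5).
msa = ["MKVLA-GT", "MKVLAAGT", "MKILS-GT"]
print(conserved_blocks(msa))
```

Columns that contain gaps are the complement of this output, so the same scan could be adapted to report indel regions instead of conserved blocks.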
Genetic Testing Registry

MedlinePlus

... Splign Vector Alignment Search Tool (VAST) All Data & Software Resources... Domains & Structures BioSystems Cn3D Conserved Domain Database (CDD) Conserved Domain Search Service (CD Search) Structure (Molecular Modeling Database) Vector Alignment ...
How conservative are evolutionary anthropologists?: a survey of political attitudes.

PubMed

Lyle, Henry F; Smith, Eric A

2012-09-01

The application of evolutionary theory to human behavior has elicited a variety of critiques, some of which charge that this approach expresses or encourages conservative or reactionary political agendas. In a survey of graduate students in psychology, Tybur, Miller, and Gangestad (Human Nature, 18, 313-328, 2007) found that the political attitudes of those who use an evolutionary approach did not differ from those of other psychology grad students. Here, we present results from a directed online survey of a broad sample of graduate students in anthropology that assays political views. We found that evolutionary anthropology graduate students were very liberal in their political beliefs, overwhelmingly voted for a liberal U.S. presidential candidate in the 2008 election, and identified with liberal political parties; in this, they were almost indistinguishable from non-evolutionary anthropology students. Our results contradict the view that evolutionary anthropologists hold conservative or reactionary political views. We discuss some possible reasons for the persistence of this view in terms of the sociology of science.
AlignMiner: a Web-based tool for detection of divergent regions in multiple sequence alignments of conserved sequences

PubMed Central

2010-01-01

Background Multiple sequence alignments are used to study gene or protein function, phylogenetic relations, genome evolution hypotheses and even gene polymorphisms. Virtually without exception, all available tools focus on conserved segments or residues. Small divergent regions, however, are biologically important for specific quantitative polymerase chain reaction, genotyping, molecular markers and preparation of specific antibodies, and yet have received little attention. As a consequence, they must be selected empirically by the researcher. AlignMiner has been developed to fill this gap in bioinformatic analyses. Results AlignMiner is a Web-based application for detection of conserved and divergent regions in alignments of conserved sequences, focusing particularly on divergence. It accepts alignments (protein or nucleic acid) obtained using any of a variety of algorithms, which does not appear to have a significant impact on the final results. AlignMiner uses different scoring methods for assessing conserved/divergent regions, Entropy being the method that provides the highest number of regions with the greatest length, and Weighted being the most restrictive. Conserved/divergent regions can be generated either with respect to the consensus sequence or to one master sequence. The resulting data are presented in a graphical interface developed in AJAX, which provides remarkable user interaction capabilities. Users do not need to wait until execution is complete and can even inspect their results on a different computer. Data can be downloaded onto a user disk, in standard formats. In silico and experimental proof-of-concept cases have shown that AlignMiner can be successfully used to design specific polymerase chain reaction primers as well as potential epitopes for antibodies. Primer design is assisted by a module that deploys several oligonucleotide parameters for designing primers "on the fly". 
Conclusions AlignMiner can be used to reliably detect divergent regions via several scoring methods that provide different levels of selectivity. Its predictions have been verified by experimental means. Hence, it is expected that its usage will save researchers' time and ensure an objective selection of the best-possible divergent region when closely related sequences are analysed. AlignMiner is freely available at http://www.scbi.uma.es/alignminer. PMID:20525162
Early Evolution of Conserved Regulatory Sequences Associated with Development in Vertebrates

PubMed Central

McEwen, Gayle K.; Goode, Debbie K.; Parker, Hugo J.; Woolfe, Adam; Callaway, Heather; Elgar, Greg

2009-01-01

Comparisons between diverse vertebrate genomes have uncovered thousands of highly conserved non-coding sequences, an increasing number of which have been shown to function as enhancers during early development. Despite their extreme conservation over 500 million years from humans to cartilaginous fish, these elements appear to be largely absent in invertebrates, and, to date, there has been little understanding of their mode of action or the evolutionary processes that have modelled them. We have now exploited emerging genomic sequence data for the sea lamprey, Petromyzon marinus, to explore the depth of conservation of this type of element in the earliest diverging extant vertebrate lineage, the jawless fish (agnathans). We searched for conserved non-coding elements (CNEs) at 13 human gene loci and identified lamprey elements associated with all but two of these gene regions. Although markedly shorter and less well conserved than within jawed vertebrates, identified lamprey CNEs are able to drive specific patterns of expression in zebrafish embryos, which are almost identical to those driven by the equivalent human elements. These CNEs are therefore a unique and defining characteristic of all vertebrates. Furthermore, alignment of lamprey and other vertebrate CNEs should permit the identification of persistent sequence signatures that are responsible for common patterns of expression and contribute to the elucidation of the regulatory language in CNEs. Identifying the core regulatory code for development, common to all vertebrates, provides a foundation upon which regulatory networks can be constructed and might also illuminate how large conserved regulatory sequence blocks evolve and become fixed in genomic DNA. PMID:20011110
BEAUTY: an enhanced BLAST-based search tool that integrates multiple biological information resources into sequence similarity search results.

PubMed

Worley, K C; Wiese, B A; Smith, R F

1995-09-01

BEAUTY (BLAST enhanced alignment utility) is an enhanced version of the NCBI's BLAST data base search tool that facilitates identification of the functions of matched sequences. We have created new data bases of conserved regions and functional domains for protein sequences in NCBI's Entrez data base, and BEAUTY allows this information to be incorporated directly into BLAST search results. A Conserved Regions Data Base, containing the locations of conserved regions within Entrez protein sequences, was constructed by (1) clustering the entire data base into families, (2) aligning each family using our PIMA multiple sequence alignment program, and (3) scanning the multiple alignments to locate the conserved regions within each aligned sequence. A separate Annotated Domains Data Base was constructed by extracting the locations of all annotated domains and sites from sequences represented in the Entrez, PROSITE, BLOCKS, and PRINTS data bases. BEAUTY performs a BLAST search of those Entrez sequences with conserved regions and/or annotated domains. BEAUTY then uses the information from the Conserved Regions and Annotated Domains data bases to generate, for each matched sequence, a schematic display that allows one to directly compare the relative locations of (1) the conserved regions, (2) annotated domains and sites, and (3) the locally aligned regions matched in the BLAST search. In addition, BEAUTY search results include World-Wide Web hypertext links to a number of external data bases that provide a variety of additional types of information on the function of matched sequences. This convenient integration of protein families, conserved regions, annotated domains, alignment displays, and World-Wide Web resources greatly enhances the biological informativeness of sequence similarity searches. 
BEAUTY searches can be performed remotely on our system using the "BCM Search Launcher" World-Wide Web pages (URL is <http://gc.bcm.tmc.edu:8088/search-launcher/launcher.html>).
National Center for Biotechnology Information

MedlinePlus

... Splign Vector Alignment Search Tool (VAST) All Data & Software Resources... Domains & Structures BioSystems Cn3D Conserved Domain Database (CDD) Conserved Domain Search Service (CD Search) Structure (Molecular Modeling Database) Vector Alignment ...
Conservation success as a function of good alignment of social and ecological structures and processes.

PubMed

Bodin, Orjan; Crona, Beatrice; Thyresson, Matilda; Golz, Anna-Lea; Tengö, Maria

2014-10-01

How to create and adjust governing institutions so that they align (fit) with complex ecosystem processes and structures across scales is an issue of increasing concern in conservation. It is argued that lack of such social-ecological fit makes governance and conservation difficult, yet progress in explicitly defining and rigorously testing what constitutes a good fit has been limited. We used a novel modeling approach and data from case studies of fishery and forest conservation to empirically test presumed relationships between conservation outcomes and certain patterns of alignment of social-ecological interdependences. Our approach made it possible to analyze conservation outcome on a systems level while also providing information on how individual actors are positioned in the complex web of social-ecological interdependencies. We found that when actors who shared resources were also socially linked, conservation at the level of the whole social-ecological system was positively affected. When the scales at which individual actors used resources and the scale at which ecological resources were interconnected to other ecological resources were aligned through tightened feedback loops, conservation outcome was better than when they were not aligned. The analysis of individual actors' positions in the web of social-ecological interdependencies was helpful in understanding why a system has a certain level of social-ecological fit. Results of analysis of positions showed that different actors contributed in very different ways to achieve a certain fit and revealed some underlying difference between the actors, for example in terms of actors' varying rights to access and use different ecological resources. © 2014 Society for Conservation Biology.
CAB-Align: A Flexible Protein Structure Alignment Method Based on the Residue-Residue Contact Area.

PubMed

Terashi, Genki; Takeda-Shitaka, Mayuko

2015-01-01

Proteins are flexible, and this flexibility has an essential functional role. Flexibility can be observed in loop regions, rearrangements between secondary structure elements, and conformational changes between entire domains. However, most protein structure alignment methods treat protein structures as rigid bodies. Thus, these methods fail to identify the equivalences of residue pairs in regions with flexibility. In this study, we considered that the evolutionary relationship between proteins corresponds directly to the residue-residue physical contacts rather than the three-dimensional (3D) coordinates of proteins. Thus, we developed a new protein structure alignment method, contact area-based alignment (CAB-align), which uses the residue-residue contact area to identify regions of similarity. The main purpose of CAB-align is to identify homologous relationships at the residue level between related protein structures. The CAB-align procedure comprises two main steps: First, a rigid-body alignment method based on local and global 3D structure superposition is employed to generate a sufficient number of initial alignments. Then, iterative dynamic programming is executed to find the optimal alignment. We evaluated the performance and advantages of CAB-align based on four main points: (1) agreement with the gold standard alignment, (2) alignment quality based on an evolutionary relationship without 3D coordinate superposition, (3) consistency of the multiple alignments, and (4) classification agreement with the gold standard classification. 
Comparisons of CAB-align with other state-of-the-art protein structure alignment methods (TM-align, FATCAT, and DaliLite) using our benchmark dataset showed that CAB-align performed robustly in obtaining high-quality alignments and generating consistent multiple alignments with high coverage and accuracy rates, and it performed extremely well when discriminating between homologous and nonhomologous pairs of proteins in both single and multi-domain comparisons. The CAB-align software is freely available to academic users as stand-alone software at http://www.pharm.kitasato-u.ac.jp/bmd/bmd/Publications.html.
Making evolutionary history count: biodiversity planning for coral reef fishes and the conservation of evolutionary processes

NASA Astrophysics Data System (ADS)

von der Heyden, Sophie

2017-03-01

Anthropogenic activities are having devastating impacts on marine systems with numerous knock-on effects on trophic functioning, species interactions and an accelerated loss of biodiversity. Establishing conservation areas can not only protect biodiversity, but also confer resilience against changes to coral reefs and their inhabitants. Planning for protection and conservation in marine systems is complex, but usually focuses on maintaining levels of biodiversity and protecting special and unique landscape features while avoiding negative impacts to socio-economic benefits. Conversely, the integration of evolutionary processes that have shaped extant species assemblages is rarely taken into account. However, it is as important to protect processes as it is to protect patterns for maintaining the evolutionary trajectories of populations and species. This review focuses on different approaches for integrating genetic analyses, such as phylogenetic diversity, phylogeography and the delineation of management units, temporal and spatial monitoring of genetic diversity and quantification of adaptive variation for protecting evolutionary resilience, into marine spatial planning, specifically for coral reef fishes. Many of these concepts are not yet readily applied to coral reef fish studies, but this synthesis highlights their potential and the importance of including historical processes into systematic biodiversity planning for conserving not only extant, but also future, biodiversity and its evolutionary potential.
Ancient genomic architecture for mammalian olfactory receptor clusters

PubMed Central

Aloni, Ronny; Olender, Tsviya; Lancet, Doron

2006-01-01

Background Mammalian olfactory receptor (OR) genes reside in numerous genomic clusters of up to several dozen genes. Whole-genome sequence alignment nets of five mammals allow their comprehensive comparison, aimed at reconstructing the ancestral olfactory subgenome. Results We developed a new and general tool for genome-wide definition of genomic gene clusters conserved in multiple species. Syntenic orthologs, defined as gene pairs showing conservation of both genomic location and coding sequence, were subjected to a graph theory algorithm for discovering CLICs (clusters in conservation). When applied to ORs in five mammals, including the marsupial opossum, more than 90% of the OR genes were found within a framework of 48 multi-species CLICs, invoking a general conservation of gene order and composition. A detailed analysis of individual CLICs revealed multiple differences among species, interpretable through species-specific genomic rearrangements and reflecting complex mammalian evolutionary dynamics. One significant instance involves CLIC #1, which lacks a human member, implying the human-specific deletion of an OR cluster, whose mouse counterpart has been tentatively associated with isovaleric acid odorant detection. Conclusion The identified multi-species CLICs demonstrate that most of the mammalian OR clusters have a common ancestry, preceding the split between marsupials and placental mammals. However, only two of these CLICs were capable of incorporating chicken OR genes, parsimoniously implying that all other CLICs emerged subsequent to the avian-mammalian divergence. PMID:17010214
The Ditylenchus destructor genome provides new insights into the evolution of plant parasitic nematodes.

PubMed

Zheng, Jinshui; Peng, Donghai; Chen, Ling; Liu, Hualin; Chen, Feng; Xu, Mengci; Ju, Shouyong; Ruan, Lifang; Sun, Ming

2016-07-27

Plant-parasitic nematodes were found in 4 of the 12 clades of phylum Nematoda. These nematodes in different clades may have originated independently from their free-living fungivorous ancestors. However, the exact evolutionary process of these parasites is unclear. Here, we sequenced the genome of a migratory plant nematode, Ditylenchus destructor. We performed comparative genomics among the free-living nematode Caenorhabditis elegans and all the plant nematodes with genome sequences available. We found that, compared with C. elegans, the core developmental control processes underwent heavy reduction, though most signal transduction pathways were conserved. We also found that D. destructor contained more homologs of the key genes in the above processes than the other plant nematodes. We suggest that Ditylenchus spp. may represent an intermediate evolutionary stage between free-living nematodes that feed on fungi and obligate plant-parasitic nematodes. Based on the facts that D. destructor can feed on fungi and has a relatively short life cycle, and that it has similar features to both C. elegans and sedentary plant-parasitic nematodes from clade 12, we propose it as a new model to study the biology and biocontrol of plant nematodes and the interaction between nematodes and plants. © 2016 The Author(s).
Global evolutionary isolation measures can capture key local conservation species in Nearctic and Neotropical bird communities

PubMed Central

Redding, David W.; Mooers, Arne O.; Şekercioğlu, Çağan H.; Collen, Ben

2015-01-01

Understanding how to prioritize among the most deserving imperilled species has been a focus of biodiversity science for the past three decades. Though global metrics that integrate evolutionary history and likelihood of loss have been successfully implemented, conservation is typically carried out at sub-global scales on communities of species rather than among members of complete taxonomic assemblages. Whether and how global measures map to a local scale has received little scrutiny. At a local scale, conservation-relevant assemblages of species are likely to be made up of relatively few species spread across a large phylogenetic tree, and as a consequence there are potentially relatively large amounts of evolutionary history at stake. We ask to what extent global metrics of evolutionary history are useful for conservation priority setting at the community level by evaluating the extent to which three global measures of evolutionary isolation (evolutionary distinctiveness (ED), average pairwise distance (APD) and the pendant edge or unique phylogenetic diversity (PD) contribution) capture community-level phylogenetic and trait diversity for a large sample of Neotropical and Nearctic bird communities. We find that prioritizing the most ED species globally safeguards more than twice the total PD of local communities on average, but that this does not translate into increased local trait diversity. By contrast, global APD is strongly related to the APD of those same species at the community level, and prioritizing these species also safeguards local PD and trait diversity. The next step for biologists is to understand the variation in the concordance of global and local level scores and what this means for conservation priorities: we need more directed research on the use of different measures of evolutionary isolation to determine which might best capture desirable aspects of biodiversity. PMID:25561674
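Two of the quantities named above, evolutionary distinctiveness (ED) and phylogenetic diversity (PD), can be computed on a small rooted tree as in the sketch below. It assumes the fair-proportion definition of ED, in which each branch length is shared equally among the tips descending from it, and a rooted definition of PD; the toy tree, branch lengths, and function names are hypothetical and are not taken from the study.

```python
# Hypothetical toy tree: node -> (parent, branch length to parent).
TREE = {
    "a": ("n1", 1.0), "b": ("n1", 1.0),
    "n1": ("root", 2.0), "c": ("root", 3.0),
}


def path_to_root(tree, node):
    """Nodes whose parent branches lie between `node` and the root."""
    path = []
    while node in tree:
        path.append(node)
        node = tree[node][0]
    return path


def tips(tree):
    """Leaves are nodes that never appear as a parent."""
    parents = {parent for parent, _ in tree.values()}
    return sorted(n for n in tree if n not in parents)


def evolutionary_distinctiveness(tree):
    """Fair-proportion ED: each branch is split equally among the tips below it."""
    below = {node: 0 for node in tree}
    for tip in tips(tree):
        for node in path_to_root(tree, tip):
            below[node] += 1
    return {tip: sum(tree[n][1] / below[n] for n in path_to_root(tree, tip)) for tip in tips(tree)}


def phylogenetic_diversity(tree, subset):
    """Rooted PD: total length of all branches on paths from the subset tips to the root."""
    used = {n for tip in subset for n in path_to_root(tree, tip)}
    return sum(tree[n][1] for n in used)


print(evolutionary_distinctiveness(TREE))
print(phylogenetic_diversity(TREE, ["a", "c"]))
```

ED scores sum to the total tree length, so they partition evolutionary history among species. PD of a local community depends on which species co-occur, which is why a global ED ranking need not maximize local PD.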
Mitochondrial DNA haplotype distribution patterns in Pinus ponderosa (Pinaceae): range-wide evolutionary history and implications for conservation.

PubMed

Potter, Kevin M; Hipkins, Valerie D; Mahalovich, Mary F; Means, Robert E

2013-08-01

Ponderosa pine (Pinus ponderosa Douglas ex P. Lawson & C. Lawson) exhibits complicated patterns of morphological and genetic variation across its range in western North America. This study aims to clarify P. ponderosa evolutionary history and phylogeography using a highly polymorphic mitochondrial DNA marker, with results offering insights into how geographical and climatological processes drove the modern evolutionary structure of tree species in the region. We amplified the mtDNA nad1 second intron minisatellite region for 3,100 trees representing 104 populations, and sequenced all length variants. We estimated population-level haplotypic diversity and determined diversity partitioning among varieties, races and populations. After aligning sequences of minisatellite repeat motifs, we evaluated evolutionary relationships among haplotypes. The geographical structuring of the 10 haplotypes corresponded with division between Pacific and Rocky Mountain varieties. Pacific haplotypes clustered with high bootstrap support, and appear to have descended from Rocky Mountain haplotypes. A greater proportion of diversity was partitioned between Rocky Mountain races than between Pacific races. Areas of highest haplotypic diversity were the southern Sierra Nevada mountain range in California, northwestern California, and southern Nevada. Pinus ponderosa haplotype distribution patterns suggest a complex phylogeographic history not revealed by other genetic and morphological data, or by the sparse paleoecological record. The results appear consistent with long-term divergence between the Pacific and Rocky Mountain varieties, along with more recent divergences not well-associated with race. Pleistocene refugia may have existed in areas of high haplotypic diversity, as well as the Great Basin, Southwestern United States/northern Mexico, and the High Plains.
NOBAI: a web server for character coding of geometrical and statistical features in RNA structure

PubMed Central

Knudsen, Vegeir; Caetano-Anollés, Gustavo

2008-01-01

The Numeration of Objects in Biology: Alignment Inferences (NOBAI) web server provides a web interface to the applications in the NOBAI software package. This software codes topological and thermodynamic information related to the secondary structure of RNA molecules as multi-state phylogenetic characters, builds character matrices directly in NEXUS format and provides sequence randomization options. The web server is an effective tool that facilitates the search for evolutionary history embedded in the structure of functional RNA molecules. The NOBAI web server is accessible at ‘http://www.manet.uiuc.edu/nobai/nobai.php’. This web site is free and open to all users and there is no login requirement. PMID:18448469
The invariant cleavage pattern displayed by ascidian embryos depends on spindle positioning along the cell's longest axis in the apical plane and relies on asynchronous cell divisions

PubMed Central

Dumollard, Rémi; Minc, Nicolas; Salez, Gregory; Aicha, Sameh Ben; Bekkouche, Faisal; Hebras, Céline; Besnardeau, Lydia; McDougall, Alex

2017-01-01

The ascidian embryo is an ideal system to investigate how cell position is determined during embryogenesis. Using 3D timelapse imaging and computational methods we analyzed the planar cell divisions in ascidian early embryos and found that spindles in every cell tend to align at metaphase in the long length of the apical surface except in cells undergoing unequal cleavage. Furthermore, the invariant and conserved cleavage pattern of ascidian embryos was found to consist in alternate planar cell divisions between ectoderm and endomesoderm. In order to test the importance of alternate cell divisions we manipulated zygotic transcription induced by β-catenin or downregulated wee1 activity, both of which abolish this cell cycle asynchrony. Crucially, abolishing cell cycle asynchrony consistently disrupted the spindle orienting mechanism underpinning the invariant cleavage pattern. Our results demonstrate how an evolutionarily conserved cell cycle asynchrony maintains the invariant cleavage pattern driving morphogenesis of the ascidian blastula. DOI: http://dx.doi.org/10.7554/eLife.19290.001 PMID:28121291
A comparative genomics strategy for targeted discovery of single-nucleotide polymorphisms and conserved-noncoding sequences in orphan crops.

PubMed

Feltus, F A; Singh, H P; Lohithaswa, H C; Schulze, S R; Silva, T D; Paterson, A H

2006-04-01

Completed genome sequences provide templates for the design of genome analysis tools in orphan species lacking sequence information. To demonstrate this principle, we designed 384 PCR primer pairs to conserved exonic regions flanking introns, using Sorghum/Pennisetum expressed sequence tag alignments to the Oryza genome. Conserved-intron scanning primers (CISPs) amplified single-copy loci at 37% to 80% success rates in taxa that sample much of the approximately 50-million years of Poaceae divergence. While the conserved nature of exons fostered cross-taxon amplification, the lesser evolutionary constraints on introns enhanced single-nucleotide polymorphism detection. For example, in eight rice (Oryza sativa) genotypes, polymorphism averaged 12.1 per kb in introns but only 3.6 per kb in exons. Curiously, among 124 CISPs evaluated across Oryza, Sorghum, Pennisetum, Cynodon, Eragrostis, Zea, Triticum, and Hordeum, 23 (18.5%) seemed to be subject to rigid intron size constraints that were independent of per-nucleotide DNA sequence variation. Furthermore, we identified 487 conserved-noncoding sequence motifs in 129 CISP loci. A large CISP set (6,062 primer pairs, amplifying introns from 1,676 genes) designed using an automated pipeline showed generally higher abundance in recombinogenic than in nonrecombinogenic regions of the rice genome, thus providing relatively even distribution along genetic maps. CISPs are an effective means to explore poorly characterized genomes for both DNA polymorphism and noncoding sequence conservation on a genome-wide or candidate gene basis, and also provide anchor points for comparative genomics across a diverse range of species.
A Comparative Genomics Strategy for Targeted Discovery of Single-Nucleotide Polymorphisms and Conserved-Noncoding Sequences in Orphan Crops

PubMed Central

Feltus, F.A.; Singh, H.P.; Lohithaswa, H.C.; Schulze, S.R.; Silva, T.D.; Paterson, A.H.

2006-01-01

Completed genome sequences provide templates for the design of genome analysis tools in orphan species lacking sequence information. To demonstrate this principle, we designed 384 PCR primer pairs to conserved exonic regions flanking introns, using Sorghum/Pennisetum expressed sequence tag alignments to the Oryza genome. Conserved-intron scanning primers (CISPs) amplified single-copy loci at 37% to 80% success rates in taxa that sample much of the approximately 50-million years of Poaceae divergence. While the conserved nature of exons fostered cross-taxon amplification, the lesser evolutionary constraints on introns enhanced single-nucleotide polymorphism detection. For example, in eight rice (Oryza sativa) genotypes, polymorphism averaged 12.1 per kb in introns but only 3.6 per kb in exons. Curiously, among 124 CISPs evaluated across Oryza, Sorghum, Pennisetum, Cynodon, Eragrostis, Zea, Triticum, and Hordeum, 23 (18.5%) seemed to be subject to rigid intron size constraints that were independent of per-nucleotide DNA sequence variation. Furthermore, we identified 487 conserved-noncoding sequence motifs in 129 CISP loci. A large CISP set (6,062 primer pairs, amplifying introns from 1,676 genes) designed using an automated pipeline showed generally higher abundance in recombinogenic than in nonrecombinogenic regions of the rice genome, thus providing relatively even distribution along genetic maps. CISPs are an effective means to explore poorly characterized genomes for both DNA polymorphism and noncoding sequence conservation on a genome-wide or candidate gene basis, and also provide anchor points for comparative genomics across a diverse range of species. PMID:16607031

Strengths and weaknesses of McNamara's evolutionary psychological model of dreaming.

PubMed

Olliges, Sandra

2010-10-07

This article includes a brief overview of McNamara's (2004) evolutionary model of dreaming. The strengths and weaknesses of this model are then evaluated in terms of its consonance with measurable neurological and biological properties of dreaming, its fit within the tenets of evolutionary theories of dreams, and its alignment with evolutionary concepts of cooperation and spirituality. McNamara's model focuses primarily on dreaming that occurs during rapid eye movement (REM) sleep; therefore this article also focuses on REM dreaming.
JCoDA: a tool for detecting evolutionary selection.

PubMed

Steinway, Steven N; Dannenfelser, Ruth; Laucius, Christopher D; Hayes, James E; Nayak, Sudhir

2010-05-27

The incorporation of annotated sequence information from multiple related species in commonly used databases (Ensembl, Flybase, Saccharomyces Genome Database, Wormbase, etc.) has increased dramatically over the last few years. This influx of information has provided a considerable amount of raw material for evaluation of evolutionary relationships. To aid in the process, we have developed JCoDA (Java Codon Delimited Alignment) as a simple-to-use visualization tool for the detection of site specific and regional positive/negative evolutionary selection amongst homologous coding sequences. JCoDA accepts user-inputted unaligned or pre-aligned coding sequences, performs a codon-delimited alignment using ClustalW, and determines the dN/dS calculations using PAML (Phylogenetic Analysis Using Maximum Likelihood, yn00 and codeml) in order to identify regions and sites under evolutionary selection. The JCoDA package includes a graphical interface for Phylip (Phylogeny Inference Package) to generate phylogenetic trees, manages formatting of all required file types, and streamlines passage of information between underlying programs. The raw data are output to user configurable graphs with sliding window options for straightforward visualization of pairwise or gene family comparisons. Additionally, codon-delimited alignments are output in a variety of common formats and all dN/dS calculations can be output in comma-separated value (CSV) format for downstream analysis. To illustrate the types of analyses that are facilitated by JCoDA, we have taken advantage of the well studied sex determination pathway in nematodes as well as the extensive sequence information available to identify genes under positive selection, examples of regional positive selection, and differences in selection based on the role of genes in the sex determination pathway. 
JCoDA is a configurable, open source, user-friendly visualization tool for performing evolutionary analysis on homologous coding sequences. JCoDA can be used to rapidly screen for genes and regions of genes under selection using PAML. It can be freely downloaded at http://www.tcnj.edu/~nayaklab/jcoda.
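The sliding-window view of dN/dS mentioned in this abstract can be pictured with a few lines of code. This is only an illustration of the general idea, not JCoDA's implementation: the per-codon ratios are made-up values, and the function names, window size and threshold are assumptions. A ratio above 1 is conventionally read as a hint of positive selection and a ratio below 1 as purifying selection.

```python
def sliding_window_mean(values, window=3, step=1):
    """Return (start index, mean) for each full window over the values."""
    if window < 1 or step < 1:
        raise ValueError("window and step must be positive")
    result = []
    for start in range(0, len(values) - window + 1, step):
        chunk = values[start:start + window]
        result.append((start, sum(chunk) / window))
    return result


def windows_above(values, threshold=1.0, window=3, step=1):
    """Start positions of windows whose mean ratio exceeds the threshold."""
    return [start for start, mean in sliding_window_mean(values, window, step)
            if mean > threshold]


if __name__ == "__main__":
    # Hypothetical per-codon dN/dS estimates along a coding sequence.
    ratios = [0.1, 0.2, 0.3, 2.0, 2.5, 3.0]
    for start, mean in sliding_window_mean(ratios):
        print(start, round(mean, 3))
    print("windows with mean dN/dS > 1 start at:", windows_above(ratios))
```

Larger windows smooth out single-codon noise at the cost of blurring short regions under selection, which is why a configurable window is useful for regional screens.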
JCoDA: a tool for detecting evolutionary selection

PubMed Central

2010-01-01

Background The incorporation of annotated sequence information from multiple related species in commonly used databases (Ensembl, Flybase, Saccharomyces Genome Database, Wormbase, etc.) has increased dramatically over the last few years. This influx of information has provided a considerable amount of raw material for evaluation of evolutionary relationships. To aid in the process, we have developed JCoDA (Java Codon Delimited Alignment) as a simple-to-use visualization tool for the detection of site specific and regional positive/negative evolutionary selection amongst homologous coding sequences. Results JCoDA accepts user-inputted unaligned or pre-aligned coding sequences, performs a codon-delimited alignment using ClustalW, and determines the dN/dS calculations using PAML (Phylogenetic Analysis Using Maximum Likelihood, yn00 and codeml) in order to identify regions and sites under evolutionary selection. The JCoDA package includes a graphical interface for Phylip (Phylogeny Inference Package) to generate phylogenetic trees, manages formatting of all required file types, and streamlines passage of information between underlying programs. The raw data are output to user configurable graphs with sliding window options for straightforward visualization of pairwise or gene family comparisons. Additionally, codon-delimited alignments are output in a variety of common formats and all dN/dS calculations can be output in comma-separated value (CSV) format for downstream analysis. To illustrate the types of analyses that are facilitated by JCoDA, we have taken advantage of the well studied sex determination pathway in nematodes as well as the extensive sequence information available to identify genes under positive selection, examples of regional positive selection, and differences in selection based on the role of genes in the sex determination pathway. 
Conclusions JCoDA is a configurable, open source, user-friendly visualization tool for performing evolutionary analysis on homologous coding sequences. JCoDA can be used to rapidly screen for genes and regions of genes under selection using PAML. It can be freely downloaded at http://www.tcnj.edu/~nayaklab/jcoda. PMID:20507581
Ensembl comparative genomics resources.

PubMed

Herrero, Javier; Muffato, Matthieu; Beal, Kathryn; Fitzgerald, Stephen; Gordon, Leo; Pignatelli, Miguel; Vilella, Albert J; Searle, Stephen M J; Amode, Ridwan; Brent, Simon; Spooner, William; Kulesha, Eugene; Yates, Andrew; Flicek, Paul

2016-01-01

Evolution provides the unifying framework with which to understand biology. The coherent investigation of genic and genomic data often requires comparative genomics analyses based on whole-genome alignments, sets of homologous genes and other relevant datasets in order to evaluate and answer evolutionary-related questions. However, the complexity and computational requirements of producing such data are substantial: this has led to only a small number of reference resources that are used for most comparative analyses. The Ensembl comparative genomics resources are one such reference set that facilitates comprehensive and reproducible analysis of chordate genome data. Ensembl computes pairwise and multiple whole-genome alignments from which large-scale synteny, per-base conservation scores and constrained elements are obtained. Gene alignments are used to define Ensembl Protein Families, GeneTrees and homologies for both protein-coding and non-coding RNA genes. These resources are updated frequently and have a consistent informatics infrastructure and data presentation across all supported species. Specialized web-based visualizations are also available including synteny displays, collapsible gene tree plots, a gene family locator and different alignment views. The Ensembl comparative genomics infrastructure is extensively reused for the analysis of non-vertebrate species by other projects including Ensembl Genomes and Gramene and much of the information here is relevant to these projects. The consistency of the annotation across species and the focus on vertebrates makes Ensembl an ideal system to perform and support vertebrate comparative genomic analyses. We use robust software and pipelines to produce reference comparative data and make it freely available. Database URL: http://www.ensembl.org. © The Author(s) 2016. Published by Oxford University Press.
Ensembl comparative genomics resources

PubMed Central

Muffato, Matthieu; Beal, Kathryn; Fitzgerald, Stephen; Gordon, Leo; Pignatelli, Miguel; Vilella, Albert J.; Searle, Stephen M. J.; Amode, Ridwan; Brent, Simon; Spooner, William; Kulesha, Eugene; Yates, Andrew; Flicek, Paul

2016-01-01

Evolution provides the unifying framework with which to understand biology. The coherent investigation of genic and genomic data often requires comparative genomics analyses based on whole-genome alignments, sets of homologous genes and other relevant datasets in order to evaluate and answer evolutionary-related questions. However, the complexity and computational requirements of producing such data are substantial: this has led to only a small number of reference resources that are used for most comparative analyses. The Ensembl comparative genomics resources are one such reference set that facilitates comprehensive and reproducible analysis of chordate genome data. Ensembl computes pairwise and multiple whole-genome alignments from which large-scale synteny, per-base conservation scores and constrained elements are obtained. Gene alignments are used to define Ensembl Protein Families, GeneTrees and homologies for both protein-coding and non-coding RNA genes. These resources are updated frequently and have a consistent informatics infrastructure and data presentation across all supported species. Specialized web-based visualizations are also available including synteny displays, collapsible gene tree plots, a gene family locator and different alignment views. The Ensembl comparative genomics infrastructure is extensively reused for the analysis of non-vertebrate species by other projects including Ensembl Genomes and Gramene and much of the information here is relevant to these projects. The consistency of the annotation across species and the focus on vertebrates makes Ensembl an ideal system to perform and support vertebrate comparative genomic analyses. We use robust software and pipelines to produce reference comparative data and make it freely available. Database URL: http://www.ensembl.org. PMID:26896847
DNA-binding proteins from marine bacteria expand the known sequence diversity of TALE-like repeats

PubMed Central

de Lange, Orlando; Wolf, Christina; Thiel, Philipp; Krüger, Jens; Kleusch, Christian; Kohlbacher, Oliver; Lahaye, Thomas

2015-01-01

Transcription Activator-Like Effectors (TALEs) of Xanthomonas bacteria are programmable DNA binding proteins with unprecedented target specificity. Comparative studies into TALE repeat structure and function are hindered by the limited sequence variation among TALE repeats. More sequence-diverse TALE-like proteins are known from Ralstonia solanacearum (RipTALs) and Burkholderia rhizoxinica (Bats), but RipTAL and Bat repeats are conserved with those of TALEs around the DNA-binding residue. We study two novel marine-organism TALE-like proteins (MOrTL1 and MOrTL2), the first to date of non-terrestrial origin. We have assessed their DNA-binding properties and modelled repeat structures. We found that repeats from these proteins mediate sequence specific DNA binding conforming to the TALE code, despite low sequence similarity to TALE repeats, and with novel residues around the BSR. However, MOrTL1 repeats show greater sequence discriminating power than MOrTL2 repeats. Sequence alignments show that there are only three residues conserved between repeats of all TALE-like proteins including the two new additions. This conserved motif could prove useful as an identifier for future TALE-likes. Additionally, comparing MOrTL repeats with those of other TALE-likes suggests a common evolutionary origin for the TALEs, RipTALs and Bats. PMID:26481363
Genomic sequencing of Pleistocene cave bears

DOE Office of Scientific and Technical Information (OSTI.GOV)

Noonan, James P.; Hofreiter, Michael; Smith, Doug

2005-04-01

Despite the information content of genomic DNA, ancient DNA studies to date have largely been limited to amplification of mitochondrial DNA due to technical hurdles such as contamination and degradation of ancient DNAs. In this study, we describe two metagenomic libraries constructed using unamplified DNA extracted from the bones of two 40,000-year-old extinct cave bears. Analysis of ~1 Mb of sequence from each library showed that, despite significant microbial contamination, 5.8 percent and 1.1 percent of clones in the libraries contain cave bear inserts, yielding 26,861 bp of cave bear genome sequence. Alignment of this sequence to the dog genome, the closest sequenced genome to cave bear in terms of evolutionary distance, revealed roughly the expected ratio of cave bear exons, repeats and conserved noncoding sequences. Only 0.04 percent of all clones sequenced were derived from contamination with modern human DNA. Comparison of cave bear with orthologous sequences from several modern bear species revealed the evolutionary relationship of these lineages. Using the metagenomic approach described here, we have recovered substantial quantities of mammalian genomic sequence more than twice as old as any previously reported, establishing the feasibility of ancient DNA genomic sequencing programs.
Robust prediction of consensus secondary structures using averaged base pairing probability matrices.

PubMed

Kiryu, Hisanori; Kin, Taishin; Asai, Kiyoshi

2007-02-15

Recent transcriptomic studies have revealed the existence of a considerable number of non-protein-coding RNA transcripts in higher eukaryotic cells. To investigate the functional roles of these transcripts, it is of great interest to find conserved secondary structures from multiple alignments on a genomic scale. Since multiple alignments are often created using alignment programs that neglect the special conservation patterns of RNA secondary structures for computational efficiency, alignment failures can cause potential risks of overlooking conserved stem structures. We investigated the dependence of the accuracy of secondary structure prediction on the quality of alignments. We compared three algorithms that maximize the expected accuracy of secondary structures as well as other frequently used algorithms. We found that one of our algorithms, called McCaskill-MEA, was more robust against alignment failures than others. The McCaskill-MEA method first computes the base pairing probability matrices for all the sequences in the alignment and then obtains the base pairing probability matrix of the alignment by averaging over these matrices. The consensus secondary structure is predicted from this matrix such that the expected accuracy of the prediction is maximized. We show that the McCaskill-MEA method performs better than other methods, particularly when the alignment quality is low and when the alignment consists of many sequences. Our model has a parameter that controls the sensitivity and specificity of predictions. We discussed the uses of that parameter for multi-step screening procedures to search for conserved secondary structures and for assigning confidence values to the predicted base pairs. The C++ source code that implements the McCaskill-MEA algorithm and the test dataset used in this paper are available at http://www.ncrna.org/papers/McCaskillMEA/. Supplementary data are available at Bioinformatics online.
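The two steps described in this abstract (average the per-sequence base pairing probability matrices, then pick the structure with maximal expected accuracy) can be sketched as follows. This is a simplified illustration rather than the published McCaskill-MEA code. It assumes the per-sequence matrices are already computed and mapped onto alignment columns, and it borrows a widely used MEA objective in which a pair (i, j) scores 2·gamma·p_ij and an unpaired column scores its probability of being unpaired; the paper's exact objective and parameter may differ. The names `gamma`, `min_loop` and all function names are assumptions.

```python
def pairs_to_matrix(n, pairs):
    """Build a symmetric n x n matrix from {(i, j): pairing probability}."""
    matrix = [[0.0] * n for _ in range(n)]
    for (i, j), prob in pairs.items():
        matrix[i][j] = matrix[j][i] = prob
    return matrix


def average_bpp(matrices):
    """Element-wise mean of equally sized base pairing probability matrices."""
    count, n = len(matrices), len(matrices[0])
    return [[sum(m[i][j] for m in matrices) / count for j in range(n)]
            for i in range(n)]


def mea_structure(p, gamma=1.0, min_loop=3):
    """Nussinov-style dynamic programme maximising an expected-accuracy score."""
    n = len(p)
    unpaired = [max(0.0, 1.0 - sum(row)) for row in p]
    best = [[0.0] * n for _ in range(n)]

    def score(i, j):
        return best[i][j] if i <= j else 0.0  # empty intervals score zero

    def paired(i, j):
        return score(i + 1, j - 1) + 2.0 * gamma * p[i][j]

    for span in range(n):
        for i in range(n - span):
            j = i + span
            if span == 0:
                best[i][j] = unpaired[i]
                continue
            # Splitting at k also covers "column i unpaired" and "column j unpaired".
            options = [score(i, k) + score(k + 1, j) for k in range(i, j)]
            if span > min_loop:
                options.append(paired(i, j))
            best[i][j] = max(options)

    # Traceback: prefer a split when it is optimal, otherwise (i, j) must pair.
    structure = ["."] * n
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if i >= j:
            continue
        for k in range(i, j):
            if abs(best[i][j] - (score(i, k) + score(k + 1, j))) < 1e-12:
                stack.extend([(i, k), (k + 1, j)])
                break
        else:
            structure[i], structure[j] = "(", ")"
            stack.append((i + 1, j - 1))
    return "".join(structure)


if __name__ == "__main__":
    # Two hypothetical sequences in an 8-column alignment.
    seq1 = pairs_to_matrix(8, {(0, 7): 0.9, (1, 6): 0.8})
    seq2 = pairs_to_matrix(8, {(0, 7): 0.7, (1, 6): 0.6})
    consensus = average_bpp([seq1, seq2])
    print(mea_structure(consensus, gamma=1.0))  # prints ((....))
    print(mea_structure(consensus, gamma=0.1))  # prints ........
```

In this sketch a larger `gamma` rewards predicted pairs more strongly (more sensitive), while a smaller value keeps only very confident pairs (more specific), which mirrors the role the abstract assigns to its tuning parameter.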
Evolutionary profiles derived from the QR factorization of multiple structural alignments gives an economy of information.

PubMed

O'Donoghue, Patrick; Luthey-Schulten, Zaida

2005-02-25

We present a new algorithm, based on the multidimensional QR factorization, to remove redundancy from a multiple structural alignment by choosing representative protein structures that best preserve the phylogenetic tree topology of the homologous group. The classical QR factorization with pivoting, developed as a fast numerical solution to eigenvalue and linear least-squares problems of the form Ax=b, was designed to re-order the columns of A by increasing linear dependence. Removing the most linear dependent columns from A leads to the formation of a minimal basis set which well spans the phase space of the problem at hand. By recasting the problem of redundancy in multiple structural alignments into this framework, in which the matrix A now describes the multiple alignment, we adapted the QR factorization to produce a minimal basis set of protein structures which best spans the evolutionary (phase) space. The non-redundant and representative profiles obtained from this procedure, termed evolutionary profiles, are shown in initial results to outperform well-tested profiles in homology detection searches over a large sequence database. A measure of structural similarity between homologous proteins, Q(H), is presented. By properly accounting for the effect and presence of gaps, a phylogenetic tree computed using this metric is shown to be congruent with the maximum-likelihood sequence-based phylogeny. The results indicate that evolutionary information is indeed recoverable from the comparative analysis of protein structure alone. Applications of the QR ordering and this structural similarity metric to analyze the evolution of structure among key, universally distributed proteins involved in translation, and to the selection of representatives from an ensemble of NMR structures are also discussed.
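The column re-ordering at the heart of QR factorization with pivoting can be sketched in plain Python. This is a minimal two-dimensional illustration, not the multidimensional algorithm of the paper: it assumes each structure has already been reduced to a numeric feature vector (a hypothetical encoding), and it uses greedy Gram-Schmidt with column pivoting, which at each step picks the column least explained by those already chosen. Columns that come last, with near-zero residual, are the redundant ones.

```python
import math


def pivoted_qr_order(columns):
    """Greedy column-pivoted Gram-Schmidt.

    Returns (order, residuals): the column indices from most to least
    linearly independent, and each column's residual norm when it was picked
    (the magnitude of the corresponding diagonal entry of R).
    """
    resid = [[float(x) for x in col] for col in columns]
    remaining = list(range(len(columns)))
    order, residuals = [], []
    while remaining:
        norms = {k: math.sqrt(sum(x * x for x in resid[k])) for k in remaining}
        pivot = max(remaining, key=lambda k: norms[k])
        order.append(pivot)
        residuals.append(norms[pivot])
        remaining.remove(pivot)
        if norms[pivot] > 1e-12:
            q = [x / norms[pivot] for x in resid[pivot]]
            # Remove the component along q from every column not yet chosen.
            for k in remaining:
                proj = sum(a * b for a, b in zip(q, resid[k]))
                resid[k] = [b - proj * a for a, b in zip(q, resid[k])]
    return order, residuals


def representatives(columns, threshold=1e-8):
    """Indices of columns that still carry new information when picked."""
    order, residuals = pivoted_qr_order(columns)
    return [k for k, r in zip(order, residuals) if r > threshold]


if __name__ == "__main__":
    # Hypothetical feature vectors; columns 0 and 3 are redundant given 1 and 2.
    structures = [[1, 0, 0], [2, 0, 0], [0, 1, 0], [1, 0.5, 0]]
    print(pivoted_qr_order(structures))
    print("non-redundant set:", representatives(structures))
```

Raising `threshold` yields a smaller, more aggressively pruned set of representatives, which is the kind of trade-off a redundancy filter exposes.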
Independent Evolution of Six Families of Halogenating Enzymes.

PubMed

Xu, Gangming; Wang, Bin-Gui

2016-01-01

Halogenated natural products are widespread in the environment, and the halogen atoms are typically vital to their bioactivities. Thus far, six families of halogenating enzymes have been identified: cofactor-free haloperoxidases (HPO), vanadium-dependent haloperoxidases (V-HPO), heme iron-dependent haloperoxidases (HI-HPO), non-heme iron-dependent halogenases (NI-HG), flavin-dependent halogenases (F-HG), and S-adenosyl-L-methionine (SAM)-dependent halogenases (S-HG). However, these halogenating enzymes with similar biological functions but distinct structures might have evolved independently. Phylogenetic and structural analyses suggest that the HPO, V-HPO, HI-HPO, NI-HG, F-HG, and S-HG enzyme families may have evolutionary relationships to the α/β hydrolases, acid phosphatases, peroxidases, chemotaxis phosphatases, oxidoreductases, and SAM hydroxide adenosyltransferases, respectively. These halogenating enzymes have established sequence homology, structural conservation, and mechanistic features within each family. Understanding the distinct evolutionary history of these halogenating enzymes will provide further insights into the study of their catalytic mechanisms and halogenation specificity.
Surveying alignment-free features for Ortholog detection in related yeast proteomes by using supervised big data classifiers.

PubMed

Galpert, Deborah; Fernández, Alberto; Herrera, Francisco; Antunes, Agostinho; Molina-Ruiz, Reinaldo; Agüero-Chapin, Guillermin

2018-05-03

The development of new ortholog detection algorithms and the improvement of existing ones are of major importance in functional genomics. We have previously introduced a successful supervised pairwise ortholog classification approach implemented in a big data platform that considered several pairwise protein features and the low ortholog pair ratios found between two annotated proteomes (Galpert, D et al., BioMed Research International, 2015). The supervised models were built and tested using a Saccharomycete yeast benchmark dataset proposed by Salichos and Rokas (2011). Although several pairwise protein features were combined in a supervised big data approach, they were all, to some extent, alignment-based features, and the proposed algorithms were evaluated on a unique test set. Here, we aim to evaluate the impact of alignment-free features on the performance of supervised models implemented in the Spark big data platform for pairwise ortholog detection in several related yeast proteomes. The Spark Random Forest and Decision Trees with oversampling and undersampling techniques, and built with only alignment-based similarity measures or combined with several alignment-free pairwise protein features showed the highest classification performance for ortholog detection in three yeast proteome pairs. Although such supervised approaches outperformed traditional methods, there were no significant differences between the exclusive use of alignment-based similarity measures and their combination with alignment-free features, even within the twilight zone of the studied proteomes. Only when alignment-based and alignment-free features were combined in Spark Decision Trees with imbalance management could a higher success rate (98.71%) within the twilight zone be achieved for a yeast proteome pair that underwent a whole genome duplication.
The feature selection study showed that alignment-based features were top-ranked for the best classifiers while the runners-up were alignment-free features related to amino acid composition. The incorporation of alignment-free features in supervised big data models did not significantly improve ortholog detection in yeast proteomes regarding the classification qualities achieved with just alignment-based similarity measures. However, the similarity of their classification performance to that of traditional ortholog detection methods encourages the evaluation of other alignment-free protein pair descriptors in future research.
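To make the idea of alignment-free pairwise protein features concrete, the sketch below computes a few such descriptors for a pair of sequences without aligning them. The specific features (amino acid composition distance, k-mer profile cosine similarity, length ratio) and all names are illustrative assumptions; they are not the feature set used in the study, which only reports that composition-related features ranked well.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """Relative frequency of each of the 20 standard amino acids."""
    counts = Counter(seq.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[a] / total for a in AMINO_ACIDS]


def kmer_profile(seq, k=2):
    """Counts of overlapping k-mers in the sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine_similarity(u, v):
    """Cosine similarity between two sparse count profiles."""
    dot = sum(u[key] * v[key] for key in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def pair_features(seq_a, seq_b, k=2):
    """Alignment-free descriptors for one protein pair."""
    comp_a, comp_b = composition(seq_a), composition(seq_b)
    return {
        "composition_distance": math.sqrt(
            sum((x - y) ** 2 for x, y in zip(comp_a, comp_b))),
        "kmer_cosine": cosine_similarity(kmer_profile(seq_a, k),
                                         kmer_profile(seq_b, k)),
        "length_ratio": min(len(seq_a), len(seq_b)) / max(len(seq_a), len(seq_b)),
    }


if __name__ == "__main__":
    # Hypothetical short peptides standing in for two proteins.
    print(pair_features("MKVLAAGIVK", "MKVLSAGIVR"))
```

A dictionary like this could serve as one row of a feature table for a supervised pairwise classifier, alongside any alignment-based similarity scores.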
Genes with stable DNA methylation levels show higher evolutionary conservation than genes with fluctuant DNA methylation levels.

PubMed

Zhang, Ruijie; Lv, Wenhua; Luan, Meiwei; Zheng, Jiajia; Shi, Miao; Zhu, Hongjie; Li, Jin; Lv, Hongchao; Zhang, Mingming; Shang, Zhenwei; Duan, Lian; Jiang, Yongshuai

2015-11-24

Different human genes often exhibit different degrees of stability in their DNA methylation levels between tissues, samples or cell types. This may be related to the evolution of the human genome. Thus, we compared the evolutionary conservation between two types of genes: genes with stable DNA methylation levels (SM genes) and genes with fluctuant DNA methylation levels (FM genes). For long-term evolutionary characteristics between species, we compared the percentage of the orthologous genes, evolutionary rate dn/ds and protein sequence identity. We found that the SM genes had greater percentages of the orthologous genes, lower dn/ds, and higher protein sequence identities in all the 21 species. These results indicated that the SM genes were more evolutionarily conserved than the FM genes. For short-term evolutionary characteristics among human populations, we compared the single nucleotide polymorphism (SNP) density, and the linkage disequilibrium (LD) degree in HapMap populations and 1000 genomes project populations. We observed that the SM genes had lower SNP densities, and higher degrees of LD in all the 11 HapMap populations and 13 1000 genomes project populations. These results mean that the SM genes had more stable chromosome genetic structures, and were more conserved than the FM genes.
ATGC database and ATGC-COGs: an updated resource for micro- and macro-evolutionary studies of prokaryotic genomes and protein family annotation.

PubMed

Kristensen, David M; Wolf, Yuri I; Koonin, Eugene V

2017-01-04

The Alignable Tight Genomic Clusters (ATGCs) database is a collection of closely related bacterial and archaeal genomes that provides several tools to aid research into evolutionary processes in the microbial world. Each ATGC is a taxonomy-independent cluster of 2 or more completely sequenced genomes that meet the objective criteria of a high degree of local gene order (synteny) and a small number of synonymous substitutions in the protein-coding genes. As such, each ATGC is suited for analysis of microevolutionary variations within a cohesive group of organisms (e.g. species), whereas the entire collection of ATGCs is useful for macroevolutionary studies. The ATGC database includes many forms of pre-computed data, in particular ATGC-COGs (Clusters of Orthologous Genes), multiple sequence alignments, a set of 'index' orthologs representing the most well-conserved members of each ATGC-COG, the phylogenetic tree of the organisms within each ATGC, etc. Although the ATGC database contains several million proteins from thousands of genomes organized into hundreds of clusters (roughly a 4-fold increase since the last version of the ATGC database), it is now built with completely automated methods and will be regularly updated following new releases of the NCBI RefSeq database. The ATGC database is hosted jointly at the University of Iowa at dmk-brain.ecn.uiowa.edu/ATGC/ and the NCBI at ftp.ncbi.nlm.nih.gov/pub/kristensen/ATGC/atgc_home.html. Published by Oxford University Press on behalf of Nucleic Acids Research 2016. This work is written by (a) US Government employee(s) and is in the public domain in the US.
Structure-Based Sequence Alignment of the Transmembrane Domains of All Human GPCRs: Phylogenetic, Structural and Functional Implications

PubMed Central

Cvicek, Vaclav; Goddard, William A.; Abrol, Ravinder

2016-01-01

The understanding of G-protein coupled receptors (GPCRs) is undergoing a revolution due to increased information about their signaling and the experimental determination of structures for more than 25 receptors. The availability of at least one receptor structure for each of the GPCR classes, well separated in sequence space, enables an integrated superfamily-wide analysis to identify signatures involving the role of conserved residues, conserved contacts, and downstream signaling in the context of receptor structures. In this study, we align the transmembrane (TM) domains of all experimental GPCR structures to maximize the conserved inter-helical contacts. The resulting superfamily-wide GpcR Sequence-Structure (GRoSS) alignment of the TM domains for all human GPCR sequences is sufficient to generate a phylogenetic tree that correctly distinguishes all different GPCR classes, suggesting that the class-level differences in the GPCR superfamily are encoded at least partly in the TM domains. The inter-helical contacts conserved across all GPCR classes describe the evolutionarily conserved GPCR structural fold. The corresponding structural alignment of the inactive and active conformations, available for a few GPCRs, identifies activation hot-spot residues in the TM domains that get rewired upon activation. Many GPCR mutations, known to alter receptor signaling and cause disease, are located at these conserved contact and activation hot-spot residue positions. The GRoSS alignment places the chemosensory receptor subfamilies for bitter taste (TAS2R) and pheromones (Vomeronasal, VN1R) in the rhodopsin family, known to contain the chemosensory olfactory receptor subfamily. The GRoSS alignment also enables the quantification of the structural variability in the TM regions of experimental structures, useful for homology modeling and structure prediction of receptors. Furthermore, this alignment identifies structurally and functionally important residues in all human GPCRs. 
These residues can be used to make testable hypotheses about the structural basis of receptor function and about the molecular basis of disease-associated single nucleotide polymorphisms. PMID:27028541
Evolutionary principles and their practical application

PubMed Central

Hendry, Andrew P; Kinnison, Michael T; Heino, Mikko; Day, Troy; Smith, Thomas B; Fitt, Gary; Bergstrom, Carl T; Oakeshott, John; Jørgensen, Peter S; Zalucki, Myron P; Gilchrist, George; Southerton, Simon; Sih, Andrew; Strauss, Sharon; Denison, Robert F; Carroll, Scott P

2011-01-01

Evolutionary principles are now routinely incorporated into medicine and agriculture. Examples include the design of treatments that slow the evolution of resistance by weeds, pests, and pathogens, and the design of breeding programs that maximize crop yield or quality. Evolutionary principles are also increasingly incorporated into conservation biology, natural resource management, and environmental science. Examples include the protection of small and isolated populations from inbreeding depression, the identification of key traits involved in adaptation to climate change, the design of harvesting regimes that minimize unwanted life-history evolution, and the setting of conservation priorities based on populations, species, or communities that harbor the greatest evolutionary diversity and potential. The adoption of evolutionary principles has proceeded somewhat independently in these different fields, even though the underlying fundamental concepts are the same. We explore these fundamental concepts under four main themes: variation, selection, connectivity, and eco-evolutionary dynamics. Within each theme, we present several key evolutionary principles and illustrate their use in addressing applied problems. We hope that the resulting primer of evolutionary concepts and their practical utility helps to advance a unified multidisciplinary field of applied evolutionary biology. PMID:25567966
Evolutionary principles and their practical application.

PubMed

Hendry, Andrew P; Kinnison, Michael T; Heino, Mikko; Day, Troy; Smith, Thomas B; Fitt, Gary; Bergstrom, Carl T; Oakeshott, John; Jørgensen, Peter S; Zalucki, Myron P; Gilchrist, George; Southerton, Simon; Sih, Andrew; Strauss, Sharon; Denison, Robert F; Carroll, Scott P

2011-03-01

Evolutionary principles are now routinely incorporated into medicine and agriculture. Examples include the design of treatments that slow the evolution of resistance by weeds, pests, and pathogens, and the design of breeding programs that maximize crop yield or quality. Evolutionary principles are also increasingly incorporated into conservation biology, natural resource management, and environmental science. Examples include the protection of small and isolated populations from inbreeding depression, the identification of key traits involved in adaptation to climate change, the design of harvesting regimes that minimize unwanted life-history evolution, and the setting of conservation priorities based on populations, species, or communities that harbor the greatest evolutionary diversity and potential. The adoption of evolutionary principles has proceeded somewhat independently in these different fields, even though the underlying fundamental concepts are the same. We explore these fundamental concepts under four main themes: variation, selection, connectivity, and eco-evolutionary dynamics. Within each theme, we present several key evolutionary principles and illustrate their use in addressing applied problems. We hope that the resulting primer of evolutionary concepts and their practical utility helps to advance a unified multidisciplinary field of applied evolutionary biology.
Ranking Mammal Species for Conservation and the Loss of Both Phylogenetic and Trait Diversity.

PubMed

Redding, David W; Mooers, Arne O

2015-01-01

The 'edge of existence' (EDGE) prioritisation scheme is a new approach to rank species for conservation attention that aims to identify species that are both isolated on the tree of life and at imminent risk of extinction as defined by the World Conservation Union (IUCN). The self-stated benefit of the EDGE system is that it effectively captures unusual 'unique' species, and doing so will preserve the total evolutionary history of a group into the future. Given the EDGE metric was not designed to capture total evolutionary history, we tested this claim. Our analyses show that the total evolutionary history of mammals preserved is indeed much higher if EDGE species are protected than if at-risk species are chosen randomly. More of the total tree is also protected by EDGE species than if solely threat status or solely evolutionary distinctiveness were used for prioritisation. When considering how much trait diversity is captured by IUCN and EDGE prioritisation rankings, interestingly, preserving the highest-ranked EDGE species, or indeed just the most threatened species, captures more total trait diversity compared to sets of randomly-selected at-risk species. These results suggest that, as advertised, EDGE mammal species contribute evolutionary history to the evolutionary tree of mammals non-randomly, and EDGE-style rankings among endangered species can also capture important trait diversity. If this pattern holds for other groups, the EDGE prioritisation scheme has greater potential to be an efficient method to allocate scarce conservation effort.
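As an illustration of how an EDGE-style ranking combines the two ingredients named above, the following minimal Python sketch uses the commonly cited form of the score, EDGE = ln(1 + ED) + GE × ln(2), where ED is evolutionary distinctiveness and GE is a weight from 0 (Least Concern) to 4 (Critically Endangered). The formula is not stated in this abstract, and the species names and ED values are hypothetical.

```python
from math import log

# IUCN Red List categories mapped to the GE weight (assumed here, not from the abstract).
GE_WEIGHTS = {"LC": 0, "NT": 1, "VU": 2, "EN": 3, "CR": 4}


def edge_score(ed, category):
    """Return ln(1 + ED) + GE * ln(2) for one species."""
    return log(1.0 + ed) + GE_WEIGHTS[category] * log(2.0)


def rank_by_edge(species):
    """Sort (name, ED, category) tuples from highest to lowest EDGE score."""
    return sorted(species, key=lambda s: edge_score(s[1], s[2]), reverse=True)


# Hypothetical species with made-up ED values (millions of years).
example = [("sp_A", 40.0, "LC"), ("sp_B", 10.0, "CR"), ("sp_C", 25.0, "VU")]

if __name__ == "__main__":
    for name, ed, cat in rank_by_edge(example):
        print(f"{name}\t{edge_score(ed, cat):.3f}")
```

Because each step up in threat category adds ln(2), a moderately distinctive but critically endangered species can outrank a highly distinctive species of least concern.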
Lessons from (co-)evolution in the docking of proteins and peptides for CAPRI Rounds 28-35.

PubMed

Yu, Jinchao; Andreani, Jessica; Ochsenbein, Françoise; Guerois, Raphaël

2017-03-01

Computational protein-protein docking is of great importance for understanding protein interactions at the structural level. Critical assessment of prediction of interactions (CAPRI) experiments provide the protein docking community with a unique opportunity to blindly test methods based on real-life cases and help accelerate methodology development. For CAPRI Rounds 28-35, we used an automatic docking pipeline integrating the coarse-grained co-evolution-based potential InterEvScore. This score was developed to exploit the information contained in the multiple sequence alignments of binding partners and selectively recognize co-evolved interfaces. Together with Zdock/Frodock for rigid-body docking, SOAP-PP for atomic potential and Rosetta applications for structural refinement, this pipeline reached high performance on a majority of targets. For protein-peptide docking and interfacial water position predictions, we also explored different means of taking evolutionary information into account. Overall, our group ranked 1st by correctly predicting 10 targets, composed of 1 High, 7 Medium and 2 Acceptable predictions. Excellent and Outstanding levels of accuracy were reached for each of the two water prediction targets, respectively. Altogether, in 15 out of 18 targets in total, evolutionary information, either through co-evolution or conservation analyses, could provide key constraints to guide modeling towards the most likely assemblies. These results open promising perspectives regarding the way evolutionary information can be valuable to improve docking prediction accuracy. Proteins 2017; 85:378-390. © 2016 Wiley Periodicals, Inc.
Production of a reference transcriptome and transcriptomic database (EdwardsiellaBase) for the lined sea anemone, Edwardsiella lineata, a parasitic cnidarian

PubMed Central

2014-01-01

Background: The lined sea anemone Edwardsiella lineata is an informative model system for evolutionary-developmental studies of parasitism. In this species, it is possible to compare alternate developmental pathways leading from a larva to either a free-living polyp or a vermiform parasite that inhabits the mesoglea of a ctenophore host. Additionally, E. lineata is confamilial with the model cnidarian Nematostella vectensis, providing an opportunity for comparative genomic, molecular and organismal studies. Description: We generated a reference transcriptome for E. lineata via high-throughput sequencing of RNA isolated from five developmental stages (parasite; parasite-to-larva transition; larva; larva-to-adult transition; adult). The transcriptome comprises 90,440 contigs assembled from >15 billion nucleotides of DNA sequence. Using a molecular clock approach, we estimated the divergence between E. lineata and N. vectensis at 215–364 million years ago. Based on gene ontology and metabolic pathway analyses and gene family surveys (bHLH-PAS, deiodinases, Fox genes, LIM homeodomains, minicollagens, nuclear receptors, Sox genes, and Wnts), the transcriptome of E. lineata is comparable in depth and completeness to N. vectensis. Analyses of protein motifs and revealed extensive conservation between the proteins of these two edwardsiid anemones, although we show the NF-κB protein of E. lineata reflects the ancestral structure, while the NF-κB protein of N. vectensis has undergone a split that separates the DNA-binding domain from the inhibitory domain. All contigs have been deposited in a public database (EdwardsiellaBase), where they may be searched according to contig ID, gene ontology, protein family motif (Pfam), enzyme commission number, and BLAST. The alignment of the raw reads to the contigs can also be visualized via JBrowse. 
Conclusions: The transcriptomic data and database described here provide a platform for studying the evolutionary developmental genomics of a derived parasitic life cycle. In addition, these data from E. lineata will aid in the interpretation of evolutionary novelties in gene sequence or structure that have been reported for the model cnidarian N. vectensis (e.g., the split NF-κB locus). Finally, we include custom computational tools to facilitate the annotation of a transcriptome based on high-throughput sequencing data obtained from a “non-model system.” PMID:24467778
Production of a reference transcriptome and transcriptomic database (EdwardsiellaBase) for the lined sea anemone, Edwardsiella lineata, a parasitic cnidarian.

PubMed

Stefanik, Derek J; Lubinski, Tristan J; Granger, Brian R; Byrd, Allyson L; Reitzel, Adam M; DeFilippo, Lukas; Lorenc, Allison; Finnerty, John R

2014-01-28

The lined sea anemone Edwardsiella lineata is an informative model system for evolutionary-developmental studies of parasitism. In this species, it is possible to compare alternate developmental pathways leading from a larva to either a free-living polyp or a vermiform parasite that inhabits the mesoglea of a ctenophore host. Additionally, E. lineata is confamilial with the model cnidarian Nematostella vectensis, providing an opportunity for comparative genomic, molecular and organismal studies. We generated a reference transcriptome for E. lineata via high-throughput sequencing of RNA isolated from five developmental stages (parasite; parasite-to-larva transition; larva; larva-to-adult transition; adult). The transcriptome comprises 90,440 contigs assembled from >15 billion nucleotides of DNA sequence. Using a molecular clock approach, we estimated the divergence between E. lineata and N. vectensis at 215-364 million years ago. Based on gene ontology and metabolic pathway analyses and gene family surveys (bHLH-PAS, deiodinases, Fox genes, LIM homeodomains, minicollagens, nuclear receptors, Sox genes, and Wnts), the transcriptome of E. lineata is comparable in depth and completeness to N. vectensis. Analyses of protein motifs and revealed extensive conservation between the proteins of these two edwardsiid anemones, although we show the NF-κB protein of E. lineata reflects the ancestral structure, while the NF-κB protein of N. vectensis has undergone a split that separates the DNA-binding domain from the inhibitory domain. All contigs have been deposited in a public database (EdwardsiellaBase), where they may be searched according to contig ID, gene ontology, protein family motif (Pfam), enzyme commission number, and BLAST. The alignment of the raw reads to the contigs can also be visualized via JBrowse. The transcriptomic data and database described here provide a platform for studying the evolutionary developmental genomics of a derived parasitic life cycle. 
In addition, these data from E. lineata will aid in the interpretation of evolutionary novelties in gene sequence or structure that have been reported for the model cnidarian N. vectensis (e.g., the split NF-κB locus). Finally, we include custom computational tools to facilitate the annotation of a transcriptome based on high-throughput sequencing data obtained from a "non-model system."

Teaching the Toolkit: A Laboratory Series to Demonstrate the Evolutionary Conservation of Metazoan Cell Signaling Pathways

ERIC Educational Resources Information Center

LeClair, Elizabeth E.

2008-01-01

A major finding of comparative genomics and developmental genetics is that metazoans share certain conserved, embryonically deployed signaling pathways that instruct cells as to their ultimate fate. Because the DNA encoding these pathways predates the evolutionary split of most animal groups, it should in principle be possible to clone…
The use and application of phylogeography for invertebrate conservation research and planning

Treesearch

Ryan C. Garrick; Chester J. Sands; Paul Sunnucks

2006-01-01

To conserve evolutionary processes within taxa as well as local co-evolutionary associations among taxa, habitat reservation and production forestry management needs to take account of natural genetic-geographic patterns. While vertebrates tend to have at least moderate dispersal and gene flow on a landscape-scale, there are good reasons to expect many small,...
Survey of local and global biological network alignment: the need to reconcile the two sides of the same coin.

PubMed

Guzzi, Pietro Hiram; Milenkovic, Tijana

2018-05-01

Analogous to genomic sequence alignment that allows for across-species transfer of biological knowledge between conserved sequence regions, biological network alignment can be used to guide the knowledge transfer between conserved regions of molecular networks of different species. Hence, biological network alignment can be used to redefine the traditional notion of a sequence-based homology to a new notion of network-based homology. Analogous to genomic sequence alignment, there exist local and global biological network alignments. Here, we survey prominent and recent computational approaches of each network alignment type and discuss their (dis)advantages. Then, as it was recently shown that the two approach types are complementary, in the sense that they capture different slices of cellular functioning, we discuss the need to reconcile the two network alignment types and present a recent first step in this direction. We conclude with some open research problems on this topic and comment on the usefulness of network alignment in other domains besides computational biology.
Hotspots and the conservation of evolutionary history

PubMed Central

Sechrest, Wes; Brooks, Thomas M.; da Fonseca, Gustavo A. B.; Konstant, William R.; Mittermeier, Russell A.; Purvis, Andy; Rylands, Anthony B.; Gittleman, John L.

2002-01-01

Species diversity is unevenly distributed across the globe, with terrestrial diversity concentrated in a few restricted biodiversity hotspots. These areas are associated with high losses of primary vegetation and increased human population density, resulting in growing numbers of threatened species. We show that conservation of these hotspots is critical because they harbor even greater amounts of evolutionary history than expected by species numbers alone. We used supertrees for carnivores and primates to estimate that nearly 70% of the total amount of evolutionary history represented in these groups is found in 25 biodiversity hotspots. PMID:11854502
The alignment of agricultural and nature conservation policies in the European Union.

PubMed

Hodge, Ian; Hauck, Jennifer; Bonn, Aletta

2015-08-01

Europe is a region of relatively high population density and productive agriculture subject to substantial government intervention under the Common Agricultural Policy (CAP). Many habitats and species of high conservation interest have been created by the maintenance of agricultural practices over long periods. These practices are often no longer profitable, and nature conservation initiatives require government support to cover the cost for them to be continued. The CAP has been reformed both to reduce production of agricultural commodities at costs in excess of world prices and to establish incentives for landholders to adopt voluntary conservation measures. A separate nature conservation policy has established an extensive series of protected sites (Natura 2000) that has, as yet, failed to halt the loss of biodiversity. Additional broader scale approaches have been advocated for conservation in the wider landscape matrix, including the alignment of agricultural and nature conservation policies, which remains a challenge. Possibilities for alignment include further shifting of funds from general support for farmers toward targeted payments for biodiversity goals at larger scales and adoption of an ecosystem approach. The European response to the competing demands for land resources may offer lessons globally as demands on rural land increase. © 2015 Society for Conservation Biology.
Alignment-Independent Comparisons of Human Gastrointestinal Tract Microbial Communities in a Multidimensional 16S rRNA Gene Evolutionary Space

PubMed Central

Rudi, Knut; Zimonja, Monika; Kvenshagen, Bente; Rugtveit, Jarle; Midtvedt, Tore; Eggesbø, Merete

2007-01-01

We present a novel approach for comparing 16S rRNA gene clone libraries that is independent of both DNA sequence alignment and definition of bacterial phylogroups. These steps are the major bottlenecks in current microbial comparative analyses. We used direct comparisons of taxon density distributions in an absolute evolutionary coordinate space. The coordinate space was generated by using alignment-independent bilinear multivariate modeling. Statistical analyses for clone library comparisons were based on multivariate analysis of variance, partial least-squares regression, and permutations. Clone libraries from both adult and infant gastrointestinal tract microbial communities were used as biological models. We reanalyzed a library consisting of 11,831 clones covering complete colons from three healthy adults in addition to a smaller 390-clone library from infant feces. We show that it is possible to extract detailed information about microbial community structures using our alignment-independent method. Our density distribution analysis is also very efficient with respect to computer operation time, meeting the future requirements of large-scale screenings to understand the diversity and dynamics of microbial communities. PMID:17337554
Fast alignment-free sequence comparison using spaced-word frequencies.

PubMed

Leimeister, Chris-Andre; Boden, Marcus; Horwege, Sebastian; Lindner, Sebastian; Morgenstern, Burkhard

2014-07-15

Alignment-free methods for sequence comparison are increasingly used for genome analysis and phylogeny reconstruction; they circumvent various difficulties of traditional alignment-based approaches. In particular, alignment-free methods are much faster than pairwise or multiple alignments. They are, however, less accurate than methods based on sequence alignment. Most alignment-free approaches work by comparing the word composition of sequences. A well-known problem with these methods is that neighbouring word matches are far from independent. To reduce the statistical dependency between adjacent word matches, we propose to use 'spaced words', defined by patterns of 'match' and 'don't care' positions, for alignment-free sequence comparison. We describe a fast implementation of this approach using recursive hashing and bit operations, and we show that further improvements can be achieved by using multiple patterns instead of single patterns. To evaluate our approach, we use spaced-word frequencies as a basis for fast phylogeny reconstruction. Using real-world and simulated sequence data, we demonstrate that our multiple-pattern approach produces better phylogenies than approaches relying on contiguous words. Our program is freely available at http://spaced.gobics.de/. © The Author 2014. Published by Oxford University Press.
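The spaced-word idea described above can be illustrated with a minimal Python sketch. A pattern string of '1' (match) and '0' (don't care) positions is slid along each sequence, and only the characters at match positions form the word. The function names, the Euclidean distance on relative frequencies, and the averaging over multiple patterns are assumptions made for illustration; the sketch does not reproduce the recursive hashing or bit operations of the authors' implementation.

```python
from collections import Counter
from math import sqrt


def spaced_words(seq, pattern):
    """Yield the spaced word at each window: keep characters at '1' positions only."""
    keep = [i for i, c in enumerate(pattern) if c == "1"]
    span = len(pattern)
    for start in range(len(seq) - span + 1):
        window = seq[start:start + span]
        yield "".join(window[i] for i in keep)


def spaced_word_freqs(seq, pattern):
    """Relative frequency of each spaced word in seq (empty dict if seq is too short)."""
    counts = Counter(spaced_words(seq, pattern))
    total = sum(counts.values())
    return {w: n / total for w, n in counts.items()} if total else {}


def euclidean_distance(f1, f2):
    """Euclidean distance between two sparse frequency vectors."""
    keys = set(f1) | set(f2)
    return sqrt(sum((f1.get(k, 0.0) - f2.get(k, 0.0)) ** 2 for k in keys))


def multi_pattern_distance(a, b, patterns):
    """Average the per-pattern distances (an assumed way of combining patterns)."""
    dists = [
        euclidean_distance(spaced_word_freqs(a, p), spaced_word_freqs(b, p))
        for p in patterns
    ]
    return sum(dists) / len(dists)


if __name__ == "__main__":
    print(list(spaced_words("ACGTA", "101")))
    print(multi_pattern_distance("ACGTACGTAC", "ACGAACGTTC", ["101", "1101"]))
```

With the pattern "101", the sequences "ACG" and "ATG" yield the same spaced word because the mismatch falls on a don't-care position, which is the property that distinguishes spaced words from contiguous words.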
Plant polyadenylation factors: conservation and variety in the polyadenylation complex in plants.

PubMed

Hunt, Arthur G; Xing, Denghui; Li, Qingshun Q

2012-11-20

Polyadenylation, an essential step in eukaryotic gene expression, requires both cis-elements and a plethora of trans-acting polyadenylation factors. The polyadenylation factors are largely conserved across mammals and fungi. Analyses of Arabidopsis polyadenylation factors suggest that this conservation also extends to plants. To extend this observation, we systematically identified the orthologs of yeast and human polyadenylation factors from 10 plant species chosen based on both the availability of their genome sequences and their positions in the evolutionary tree, which render them representatives of different plant lineages. The evolutionary trajectories revealed several interesting features of plant polyadenylation factors. First, the number of genes encoding plant polyadenylation factors clearly increased from "lower" to "higher" plants. Second, the gene expansion in higher plants was biased toward some polyadenylation factors, particularly those involved in RNA binding. Finally, while there are clear commonalities, the differences in the polyadenylation apparatus were obvious across different species, suggesting an ongoing process of evolutionary change. These features lead to a model in which the plant polyadenylation complex consists of a conserved core, which is rather rigid in terms of evolutionary conservation, and a panoply of peripheral subunits, which are less conserved and associated with the core in various combinations, forming a collection of somewhat distinct complex assemblies. The multiple forms of the plant polyadenylation complex, together with the diversified polyA signals, may explain the intensive alternative polyadenylation (APA) and its regulatory role in biological functions of higher plants.
Molecular Evolution of the Non-Coding Eosinophil Granule Ontogeny Transcript

PubMed Central

Rose, Dominic; Stadler, Peter F.

2011-01-01

Eukaryotic genomes are pervasively transcribed. A large fraction of the transcriptional output consists of long, mRNA-like, non-protein-coding transcripts (mlncRNAs). The evolutionary history of mlncRNAs is still largely uncharted territory. In this contribution, we explore in detail the evolutionary traces of the eosinophil granule ontogeny transcript (EGOT), an experimentally confirmed representative of an abundant class of totally intronic non-coding transcripts (TINs). EGOT is located antisense to an intron of the ITPR1 gene. We computationally identify putative EGOT orthologs in the genomes of 32 different amniotes, including orthologs from primates, rodents, ungulates, carnivores, afrotherians, and xenarthrans, as well as putative candidates from basal amniotes, such as opossum or platypus. We investigate the EGOT gene phylogeny and analyze patterns of sequence conservation and the evolutionary conservation of the EGOT gene structure. We show that EGO-B, the spliced isoform, may be present throughout the placental mammals, but most likely dates back even further. We demonstrate here for the first time that the whole EGOT locus is highly structured, containing several evolutionarily conserved and thermodynamically stable secondary structures. Our analyses allow us to postulate novel functional roles of a hitherto poorly understood region at the intron of EGO-B which is highly conserved at the sequence level. The region contains a novel ITPR1 exon and also conserved RNA secondary structures together with a conserved TATA-like element, which putatively acts as a promoter of an independent regulatory element. PMID:22303364
MultiSeq: unifying sequence and structure data for evolutionary analysis

PubMed Central

Roberts, Elijah; Eargle, John; Wright, Dan; Luthey-Schulten, Zaida

2006-01-01

Background: Since the publication of the first draft of the human genome in 2000, bioinformatic data have been accumulating at an overwhelming pace. Currently, more than 3 million sequences and 35 thousand structures of proteins and nucleic acids are available in public databases. Finding correlations in and between these data to answer critical research questions is extremely challenging. This problem needs to be approached from several directions: information science to organize and search the data; information visualization to assist in recognizing correlations; mathematics to formulate statistical inferences; and biology to analyze chemical and physical properties in terms of sequence and structure changes. Results: Here we present MultiSeq, a unified bioinformatics analysis environment that allows one to organize, display, align and analyze both sequence and structure data for proteins and nucleic acids. While special emphasis is placed on analyzing the data within the framework of evolutionary biology, the environment is also flexible enough to accommodate other usage patterns. The evolutionary approach is supported by the use of predefined metadata, adherence to standard ontological mappings, and the ability for the user to adjust these classifications using an electronic notebook. MultiSeq contains a new algorithm to generate complete evolutionary profiles that represent the topology of the molecular phylogenetic tree of a homologous group of distantly related proteins. The method, based on the multidimensional QR factorization of multiple sequence and structure alignments, removes redundancy from the alignments and orders the protein sequences by increasing linear dependence, resulting in the identification of a minimal basis set of sequences that spans the evolutionary space of the homologous group of proteins. 
Conclusion: MultiSeq is a major extension of the Multiple Alignment tool that is provided as part of VMD, a structural visualization program for analyzing molecular dynamics simulations. Both are freely distributed by the NIH Resource for Macromolecular Modeling and Bioinformatics and MultiSeq is included with VMD starting with version 1.8.5. The MultiSeq website has details on how to download and use the software. PMID:16914055
Iterative non-sequential protein structural alignment.

PubMed

Salem, Saeed; Zaki, Mohammed J; Bystroff, Christopher

2009-06-01

Structural similarity between proteins gives us insights into their evolutionary relationships when there is low sequence similarity. In this paper, we present a novel approach called SNAP for non-sequential pair-wise structural alignment. Starting from an initial alignment, our approach iterates over a two-step process consisting of a superposition step and an alignment step, until convergence. We propose a novel greedy algorithm to construct both sequential and non-sequential alignments. The quality of SNAP alignments was assessed by comparing against the manually curated reference alignments in the challenging SISY and RIPC datasets. Moreover, when applied to a dataset of 4410 protein pairs selected from the CATH database, SNAP produced longer alignments with lower RMSD than several state-of-the-art alignment methods. Classification of folds using SNAP alignments was both highly sensitive and highly selective. The SNAP software, along with the datasets, is available online at http://www.cs.rpi.edu/~zaki/software/SNAP.
Structural basis for signaling by exclusive EDS1 heteromeric complexes with SAG101 or PAD4 in plant innate immunity.

PubMed

Wagner, Stephan; Stuttmann, Johannes; Rietz, Steffen; Guerois, Raphael; Brunstein, Elena; Bautor, Jaqueline; Niefind, Karsten; Parker, Jane E

2013-12-11

Biotrophic plant pathogens encounter a postinfection basal resistance layer controlled by the lipase-like protein enhanced disease susceptibility 1 (EDS1) and its sequence-related interaction partners, senescence-associated gene 101 (SAG101) and phytoalexin deficient 4 (PAD4). Maintenance of separate EDS1 family member clades through angiosperm evolution suggests distinct functional attributes. We report the Arabidopsis EDS1-SAG101 heterodimer crystal structure with juxtaposed N-terminal α/β hydrolase and C-terminal α-helical EP domains aligned via a large conserved interface. Mutational analysis of the EDS1-SAG101 heterodimer and a derived EDS1-PAD4 structural model shows that EDS1 signals within mutually exclusive heterocomplexes. Although there is evolutionary conservation of α/β hydrolase topology in all three proteins, a noncatalytic resistance mechanism is indicated. Instead, the respective N-terminal domains appear to facilitate binding of the essential EP domains to create novel interaction surfaces on the heterodimer. Transitions between distinct functional EDS1 heterodimers might explain the central importance and versatility of this regulatory node in plant immunity. Copyright © 2013 Elsevier Inc. All rights reserved.
Conservation of sex chromosomes in lacertid lizards.

PubMed

Rovatsos, Michail; Vukić, Jasna; Altmanová, Marie; Johnson Pokorná, Martina; Moravec, Jiří; Kratochvíl, Lukáš

2016-07-01

Sex chromosomes are believed to be stable in endotherms, but young and evolutionarily unstable in most ectothermic vertebrates. Within lacertids, a widely radiated lizard group, sex chromosomes have been reported to vary in morphology and heterochromatinization, which may suggest turnovers during the evolution of the group. We compared the partial gene content of the Z-specific part of sex chromosomes across major lineages of lacertids and discovered a strong evolutionary stability of sex chromosomes. We can conclude that the common ancestor of lacertids, living around 70 million years ago (Mya), already had the same highly differentiated sex chromosomes. Molecular data demonstrating an evolutionary conservation of sex chromosomes have also been documented for iguanas and caenophidian snakes. It seems that differences in the evolutionary conservation of sex chromosomes in vertebrates do not reflect the distinction between endotherms and ectotherms, but rather between amniotes and anamniotes, or generally, the differences in the life history of particular lineages. © 2016 John Wiley & Sons Ltd.
Ramachandran analysis of conserved glycyl residues in homologous proteins of known structure.

PubMed

Lakshmi, Balasubramanian; Sinduja, Chandrasekaran; Archunan, Govind; Srinivasan, Narayanaswamy

2014-06-01

High conservation of glycyl residues in homologous proteins is fairly frequent. It is commonly understood that glycine tends to be highly conserved either because of its unique Ramachandran angles or to avoid steric clash that would arise with a larger side chain. Using a database of aligned 3D structures of homologous proteins we identified conserved Gly in 288 alignment positions from 85 families. Ninety-six of these alignment positions correspond to conserved Gly residues with (φ, ψ) values allowed for non-glycyl residues. Reasons for this observation were investigated by in-silico mutation of these glycyl residues to Ala. We found that in 94% of the cases a short contact exists between the C(β) atom of the introduced Ala and atoms that are often distant in the primary structure. This suggests a lack of space even for a short side chain, thereby explaining the high conservation of glycyl residues even when they adopt (φ, ψ) values allowed for Ala. In 189 alignment positions, the conserved glycyl residues adopt (φ, ψ) values which are disallowed for Ala. In-silico mutation of these Gly residues to Ala almost always results in steric hindrance involving the C(β) atom of Ala, as one would expect by comparing Ramachandran maps for Ala and Gly. Rare occurrences of the disallowed glycyl conformations, even in ultrahigh resolution protein structures, are accompanied by short contacts in the crystal structures, and such disallowed conformations are not conserved in the homologues. These observations raise doubts about the accuracy of such glycyl conformations in proteins. © 2014 The Protein Society.
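The (φ, ψ) values discussed in this abstract are backbone dihedral angles: φ is defined by the atoms C(i-1), N(i), CA(i), C(i) and ψ by N(i), CA(i), C(i), N(i+1). As a minimal sketch of how such an angle can be computed from four atom coordinates, the following uses only the Python standard library. The function name `dihedral` and the tuple-based coordinate format are assumptions made for illustration; they are not taken from the paper.

```python
from math import atan2, degrees, sqrt


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for four points given as (x, y, z).

    For phi pass C(i-1), N(i), CA(i), C(i); for psi pass N(i), CA(i), C(i), N(i+1).
    """
    b0 = _sub(p0, p1)          # vector from the central bond back to p0
    b1 = _sub(p2, p1)          # central bond
    b2 = _sub(p3, p2)          # vector from the central bond on to p3
    norm = sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Remove the components of b0 and b2 that lie along the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return degrees(atan2(_dot(_cross(b1, v), w), _dot(v, w)))
```

A cis arrangement of the four points gives 0°, a trans arrangement gives 180°, and the sign distinguishes the two possible directions of twist.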
Is multiple-sequence alignment required for accurate inference of phylogeny?

PubMed

Höhl, Michael; Ragan, Mark A

2007-04-01

The process of inferring phylogenetic trees from molecular sequences almost always starts with a multiple alignment of these sequences but can also be based on methods that do not involve multiple sequence alignment. Very little is known about the accuracy with which such alignment-free methods recover the correct phylogeny or about the potential for increasing their accuracy. We conducted a large-scale comparison of ten alignment-free methods, among them one new approach that does not calculate distances and a faster variant of our pattern-based approach; all distance-based alignment-free methods are freely available from http://www.bioinformatics.org.au (as Python package decaf+py). We show that most methods exhibit a higher overall reconstruction accuracy in the presence of high among-site rate variation. Under all conditions that we considered, variants of the pattern-based approach were significantly better than the other alignment-free methods. The new pattern-based variant achieved a speed-up of an order of magnitude in the distance calculation step, accompanied by a small loss of tree reconstruction accuracy. A method of Bayesian inference from k-mers did not improve on classical alignment-free (and distance-based) methods but may still offer other advantages due to its Bayesian nature. We found the optimal word length k of word-based methods to be stable across various data sets, and we provide parameter ranges for two different alphabets. The influence of these alphabets was analyzed to reveal a trade-off in reconstruction accuracy between long and short branches. We have mapped the phylogenetic accuracy for many alignment-free methods, among them several recently introduced ones, and increased our understanding of their behavior in response to biologically important parameters. In all experiments, the pattern-based approach emerged as superior, at the expense of higher resource consumption. 
Nonetheless, no alignment-free method that we examined recovers the correct phylogeny as accurately as does an approach based on maximum-likelihood distance estimates of multiply aligned sequences.
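To make the idea of a word-based (k-mer) alignment-free distance concrete, here is a minimal sketch in Python. It computes a Euclidean distance between relative k-mer frequencies, which is one generic word-frequency measure; it is not the pattern-based approach evaluated in the paper and does not reproduce the decaf+py package. The function names and the default k = 3 are assumptions chosen for illustration.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq, k):
    """Count every overlapping word of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_distance(a, b, k=3):
    """Euclidean distance between the relative k-mer frequencies of a and b."""
    ca, cb = kmer_counts(a, k), kmer_counts(b, k)
    na, nb = sum(ca.values()), sum(cb.values())
    if na == 0 or nb == 0:
        raise ValueError("each sequence must contain at least k characters")
    words = set(ca) | set(cb)
    return sqrt(sum((ca[w] / na - cb[w] / nb) ** 2 for w in words))


def distance_matrix(seqs, k=3):
    """Pairwise distances for a dict of name -> sequence, keyed by name pairs."""
    names = sorted(seqs)
    return {(x, y): kmer_distance(seqs[x], seqs[y], k)
            for x in names for y in names}
```

A matrix produced this way could be passed to any distance-based tree-building method, such as neighbor joining. The word length k is the parameter the abstract describes as stable across data sets.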
Preserving genes, species, or ecosystems? Healing the fractured foundations of conservation policy.

PubMed

Bowen, B W

1999-12-01

The scientific foundations of conservation policy are the subject of a recent tripolar debate, with systematists arguing for the primacy of phylogenetic rankings, ecologists arguing for protection at the level of populations or ecosystems, and evolutionary biologists urging more attention for the factors that enhance adaptation and biodiversity. In the field of conservation genetics, this controversy is manifested in the diverse viewpoints of molecular systematists, population biologists, and evolutionary (and quantitative) geneticists. A resolution of these viewpoints is proposed here, based on the premise that preserving particular objects (genes, species, or ecosystems) is not the ultimate goal of conservation. In order to be successful, conservation efforts must preserve the processes of life. This task requires the identification and protection of diverse branches in the tree of life (phylogenetics), the maintenance of life-support systems for organisms (ecology), and the continued adaptation of organisms to changing environments (evolution). None of these objectives alone is sufficient to preserve the threads of life across time. Under this temporal perspective, molecular genetic technologies have applications in all three conservation agendas; DNA sequence comparisons serve the phylogenetic goals, population genetic markers serve the ecological goals, quantitative genetics and genome explorations serve the evolutionary goals.
Identification and Potential Regulatory Properties of Evolutionary Conserved Regions (ECRs) at the Schizophrenia-Associated MIR137 Locus.

PubMed

Gianfrancesco, Olympia; Griffiths, Daniel; Myers, Paul; Collier, David A; Bubb, Vivien J; Quinn, John P

2016-10-01

Genome-wide association studies (GWAS) have identified a region at chromosome 1p21.3, containing the microRNA MIR137, to be among the most significant associations for schizophrenia. However, the mechanism by which genetic variation at this locus increases risk of schizophrenia is unknown. Identifying key regulatory regions around MIR137 is crucial to understanding the potential role of this gene in the aetiology of psychiatric disorders. Through alignment of vertebrate genomes, we identified seven non-coding regions at the MIR137 locus with conservation comparable to exons (>70%). Bioinformatic analysis using the Psychiatric Genomics Consortium GWAS dataset for schizophrenia showed five of the ECRs to have genome-wide significant SNPs in or adjacent to their sequence. Analysis of available datasets on chromatin marks and histone modification data showed that three of the ECRs were predicted to be functional in the human brain, and three in development. In vitro analysis of ECR activity using reporter gene assays showed that all seven of the selected ECRs displayed transcriptional regulatory activity in the SH-SY5Y neuroblastoma cell line. These data suggest a regulatory role in the developing and adult brain for these highly conserved regions at the MIR137 schizophrenia-associated locus and further that these domains could act individually or synergistically to regulate levels of MIR137 expression.
DNA-binding proteins from marine bacteria expand the known sequence diversity of TALE-like repeats.

PubMed

de Lange, Orlando; Wolf, Christina; Thiel, Philipp; Krüger, Jens; Kleusch, Christian; Kohlbacher, Oliver; Lahaye, Thomas

2015-11-16

Transcription Activator-Like Effectors (TALEs) of Xanthomonas bacteria are programmable DNA binding proteins with unprecedented target specificity. Comparative studies into TALE repeat structure and function are hindered by the limited sequence variation among TALE repeats. More sequence-diverse TALE-like proteins are known from Ralstonia solanacearum (RipTALs) and Burkholderia rhizoxinica (Bats), but RipTAL and Bat repeats are conserved with those of TALEs around the DNA-binding residue. We study two novel marine-organism TALE-like proteins (MOrTL1 and MOrTL2), the first to date of non-terrestrial origin. We have assessed their DNA-binding properties and modelled repeat structures. We found that repeats from these proteins mediate sequence specific DNA binding conforming to the TALE code, despite low sequence similarity to TALE repeats, and with novel residues around the BSR. However, MOrTL1 repeats show greater sequence discriminating power than MOrTL2 repeats. Sequence alignments show that there are only three residues conserved between repeats of all TALE-like proteins including the two new additions. This conserved motif could prove useful as an identifier for future TALE-likes. Additionally, comparing MOrTL repeats with those of other TALE-likes suggests a common evolutionary origin for the TALEs, RipTALs and Bats. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Conservation of transcription factor binding events predicts gene expression across species

PubMed Central

Hemberg, Martin; Kreiman, Gabriel

2011-01-01

Recent technological advances have made it possible to determine the genome-wide binding sites of transcription factors (TFs). Comparisons across species have suggested a relatively low degree of evolutionary conservation of experimentally defined TF binding events (TFBEs). Using binding data for six different TFs in hepatocytes and embryonic stem cells from human and mouse, we demonstrate that evolutionary conservation of TFBEs within orthologous proximal promoters is closely linked to function, defined as expression of the target genes. We show that (i) there is a significantly higher degree of conservation of TFBEs when the target gene is expressed in both species; (ii) there is increased conservation of binding events for groups of TFs compared to individual TFs; and (iii) conserved TFBEs have a greater impact on the expression of their target genes than non-conserved ones. These results link conservation of structural elements (TFBEs) to conservation of function (gene expression) and suggest a higher degree of functional conservation than implied by previous studies. PMID:21622661
An efficient algorithm for pairwise local alignment of protein interaction networks

DOE PAGES

Chen, Wenbin; Schmidt, Matthew; Tian, Wenhong; ...

2015-04-01

Recently, researchers seeking to understand, modify, and create beneficial traits in organisms have looked for evolutionarily conserved patterns of protein interactions. Their conservation likely means that the proteins of these conserved functional modules are important to the trait's expression. In this paper, we formulate the problem of identifying these conserved patterns as a graph optimization problem, and develop a fast heuristic algorithm for this problem. We compare the performance of our network alignment algorithm to that of the MaWISh algorithm [Koyuturk M, Kim Y, Topkara U, Subramaniam S, Szpankowski W, Grama A, Pairwise alignment of protein interaction networks, J Comput Biol 13(2): 182-199, 2006.], which bases its search algorithm on a related decision problem formulation. We find that our algorithm discovers conserved modules with a larger number of proteins in an order of magnitude less time. In conclusion, the protein sets found by our algorithm correspond to known conserved functional modules at comparable precision and recall rates as those produced by the MaWISh algorithm.

Advanced evolutionary molecular engineering to produce thermostable cellulase by using a small but efficient library.

PubMed

Ito, Y; Ikeuchi, A; Imamura, C

2013-01-01

We aimed at constructing thermostable cellulase variants of cellobiohydrolase II, derived from the mesophilic fungus Phanerochaete chrysosporium, by using an advanced evolutionary molecular engineering method. By aligning the amino acid sequences of the catalytic domains of five thermophilic fungal CBH2 and PcCBH2 proteins, we identified 45 positions where the PcCBH2 genes differ from the consensus sequence of two to five thermophilic fungal CBH2s. PcCBH2 variants with the consensus mutations were obtained by a cell-free translation system that was chosen for easy evaluation of thermostability. From the small library of consensus mutations, advantageous mutations for improving thermostability were found to occur with much higher frequency relative to a random library. To further improve thermostability, advantageous mutations were accumulated within the wild-type gene. Finally, we obtained the most thermostable variant Mall4, which contained all 15 advantageous mutations found in this study. This variant had the same specific cellulase activity as the wild type and retained sufficient activity at 50°C for >72 h, whereas wild-type PcCBH2 retained much less activity under the same conditions. The history of the accumulation process indicated that evolution of PcCBH2 toward improved thermostability was ideally and rapidly accomplished through the evolutionary process employed in this study.
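The first step described above, finding positions where a target sequence differs from the consensus of aligned homologs, can be sketched as follows. This is an illustrative reading of the abstract, not the authors' pipeline: the function name, the gap character "-", and the `min_support` parameter (set to 2 to mirror "two to five" thermophilic sequences) are assumptions, and the input rows are expected to come from an alignment that has already been computed.

```python
from collections import Counter


def consensus_differences(target, homologs, min_support=2):
    """List positions where target differs from the homolog consensus.

    target and homologs are rows of one gap-padded alignment, so all strings
    must have equal length. A position is reported as a 1-based tuple
    (position, target_residue, consensus_residue) when the most common
    homolog residue occurs in at least min_support homologs and differs
    from the target residue.
    """
    if any(len(h) != len(target) for h in homologs):
        raise ValueError("all aligned rows must have the same length")
    hits = []
    for pos, residue in enumerate(target):
        if residue == "-":
            continue  # target has a gap here, nothing to mutate
        column = [h[pos] for h in homologs if h[pos] != "-"]
        if not column:
            continue  # no homolog covers this position
        consensus, support = Counter(column).most_common(1)[0]
        if support >= min_support and consensus != residue:
            hits.append((pos + 1, residue, consensus))
    return hits
```

Each reported tuple corresponds to one candidate consensus mutation, which would then be tested experimentally for its effect on thermostability.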
GenomicusPlants: a web resource to study genome evolution in flowering plants.

PubMed

Louis, Alexandra; Murat, Florent; Salse, Jérôme; Crollius, Hugues Roest

2015-01-01

Comparative genomics combined with phylogenetic reconstructions are powerful approaches to study the evolution of genes and genomes. However, the current rapid expansion of the volume of genomic information makes it increasingly difficult to interrogate, integrate and synthesize comparative genome data while taking into account the maximum breadth of information available. GenomicusPlants (http://www.genomicus.biologie.ens.fr/genomicus-plants) is an extension of the Genomicus webserver that addresses this issue by allowing users to explore flowering plant genomes in an intuitive way, across the broadest evolutionary scales. Extant genomes of 26 flowering plants can be analyzed, as well as 23 ancestral reconstructed genomes. Ancestral gene order provides a long-term chronological view of gene order evolution, greatly facilitating comparative genomics and evolutionary studies. Four main interfaces ('views') are available where: (i) PhyloView combines phylogenetic trees with comparisons of genomic loci across any number of genomes; (ii) AlignView projects loci of interest against all other genomes to visualize its topological conservation; (iii) MatrixView compares two genomes in a classical dotplot representation; and (iv) Karyoview visualizes chromosome karyotypes 'painted' with colours of another genome of interest. All four views are interconnected and benefit from many customizable features. © The Author 2014. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists.
Dali server update.

PubMed

Holm, Liisa; Laakso, Laura M

2016-07-08

The Dali server (http://ekhidna2.biocenter.helsinki.fi/dali) is a network service for comparing protein structures in 3D. In favourable cases, comparing 3D structures may reveal biologically interesting similarities that are not detectable by comparing sequences. The Dali server has been running in various places for over 20 years and is used routinely by crystallographers on newly solved structures. The latest update of the server provides enhanced analytics for the study of sequence and structure conservation. The server performs three types of structure comparisons: (i) Protein Data Bank (PDB) search compares one query structure against those in the PDB and returns a list of similar structures; (ii) pairwise comparison compares one query structure against a list of structures specified by the user; and (iii) all against all structure comparison returns a structural similarity matrix, a dendrogram and a multidimensional scaling projection of a set of structures specified by the user. Structural superimpositions are visualized using the Java-free WebGL viewer PV. The structural alignment view is enhanced by sequence similarity searches against Uniprot. The combined structure-sequence alignment information is compressed to a stack of aligned sequence logos. In the stack, each structure is structurally aligned to the query protein and represented by a sequence logo. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Comparative Genome Sequence Analysis of the Bpa/Str Region in Mouse and Man

PubMed Central

Mallon, A.-M.; Platzer, M.; Bate, R.; Gloeckner, G.; Botcherby, M.R.M.; Nordsiek, G.; Strivens, M.A.; Kioschis, P.; Dangel, A.; Cunningham, D.; Straw, R.N.A.; Weston, P.; Gilbert, M.; Fernando, S.; Goodall, K.; Hunter, G.; Greystrong, J.S.; Clarke, D.; Kimberley, C.; Goerdes, M.; Blechschmidt, K.; Rump, A.; Hinzmann, B.; Mundy, C.R.; Miller, W.; Poustka, A.; Herman, G.E.; Rhodes, M.; Denny, P.; Rosenthal, A.; Brown, S.D.M.

2000-01-01

The progress of human and mouse genome sequencing programs presages the possibility of systematic cross-species comparison of the two genomes as a powerful tool for gene and regulatory element identification. As the opportunities to perform comparative sequence analysis emerge, it is important to develop parameters for such analyses and to examine the outcomes of cross-species comparison. Our analysis used gene prediction and a database search of 430 kb of genomic sequence covering the Bpa/Str region of the mouse X chromosome, and 745 kb of genomic sequence from the homologous human X chromosome region. We identified 11 genes in mouse and 13 genes and two pseudogenes in human. In addition, we compared the mouse and human sequences using pairwise alignment and searches for evolutionarily conserved regions (ECRs) exceeding a defined threshold of sequence identity. This approach aided the identification of at least four further putative conserved genes in the region. Comparative sequencing revealed that this region is a mosaic in evolutionary terms, with considerably more rearrangement between the two species than realized previously from comparative mapping studies. Surprisingly, this region showed an extremely high LINE and low SINE content, low G+C content, and yet a relatively high gene density, in contrast to the low gene density usually associated with such regions. [The sequence data described in this paper have been submitted to EMBL under the following accession nos.: Mouse Genomic Sequence: Mouse contig A (AL021127), Mouse contig B (AL049866), BAC41M10 (AL136328), PAC303O11 (AL136329). Human Genomic Sequence: Human contig 1 (U82671, U82670), Human contig 2 (U82695).] PMID:10854409
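One simple way to search a pairwise alignment for regions exceeding an identity threshold is a sliding window over the aligned columns. The sketch below assumes two gap-padded, upper-case alignment rows of equal length; the window size of 100 columns and the threshold of 0.7 are placeholder values, since the abstract does not state the parameters that were used.

```python
def conserved_windows(aln_a, aln_b, window=100, min_identity=0.7):
    """Find alignment regions whose windowed identity meets a threshold.

    Returns merged (start, end) column ranges, 0-based and end-exclusive.
    A column counts as identical only when both rows carry the same
    non-gap character.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("alignment rows must have the same length")
    n = len(aln_a)
    if n < window:
        return []
    match = [1 if a == b and a != "-" else 0 for a, b in zip(aln_a, aln_b)]
    regions = []
    running = sum(match[:window])
    for start in range(n - window + 1):
        if start > 0:
            # Slide by one column: add the new column, drop the old one.
            running += match[start + window - 1] - match[start - 1]
        if running / window >= min_identity:
            end = start + window
            if regions and start <= regions[-1][1]:
                regions[-1] = (regions[-1][0], end)  # extend previous region
            else:
                regions.append((start, end))
    return regions
```

Regions returned by such a scan that fall outside annotated exons are the candidates an analysis like the one above would inspect further.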
Revealing the paradox of drug reward in human evolution

PubMed Central

Sullivan, Roger J; Hagen, Edward H; Hammerstein, Peter

2008-01-01

Neurobiological models of drug abuse propose that drug use is initiated and maintained by rewarding feedback mechanisms. However, the most commonly used drugs are plant neurotoxins that evolved to punish, not reward, consumption by animal herbivores. Reward models therefore implicitly assume an evolutionary mismatch between recent drug-profligate environments and a relatively drug-free past in which a reward centre, incidentally vulnerable to neurotoxins, could evolve. By contrast, emerging insights from plant evolutionary ecology and the genetics of hepatic enzymes, particularly cytochrome P450, indicate that animal and hominid taxa have been exposed to plant toxins throughout their evolution. Specifically, evidence of conserved function, stabilizing selection, and population-specific selection of human cytochrome P450 genes indicate recent evolutionary exposure to plant toxins, including those that affect animal nervous systems. Thus, the human propensity to seek out and consume plant neurotoxins is a paradox with far-reaching implications for current drug-reward theory. We sketch some potential resolutions of the paradox, including the possibility that humans may have evolved to counter-exploit plant neurotoxins. Resolving the paradox of drug reward will require a synthesis of ecological and neurobiological perspectives of drug seeking and use. PMID:18353749
Predicting loss of evolutionary history: Where are we?

PubMed

Veron, Simon; Davies, T Jonathan; Cadotte, Marc W; Clergeau, Philippe; Pavoine, Sandrine

2017-02-01

The Earth's evolutionary history is threatened by species loss in the current sixth mass extinction event in Earth's history. Such extinction events not only eliminate species but also their unique evolutionary histories. Here we review the expected loss of Earth's evolutionary history quantified by phylogenetic diversity (PD) and evolutionary distinctiveness (ED) at risk. Due to the general paucity of data, global evolutionary history losses have been predicted for only a few groups, such as mammals, birds, amphibians, plants, corals and fishes. Among these groups, there is now empirical support that extinction threats are clustered on the phylogeny; however this is not always a sufficient condition to cause higher loss of phylogenetic diversity in comparison to a scenario of random extinctions. Extinctions of the most evolutionarily distinct species and the shape of phylogenetic trees are additional factors that can elevate losses of evolutionary history. Consequently, impacts of species extinctions differ among groups and regions, and even if global losses are low within large groups, losses can be high among subgroups or within some regions. Further, we show that PD and ED are poorly protected by current conservation practices. While evolutionary history can be indirectly protected by current conservation schemes, optimizing its preservation requires integrating phylogenetic indices with those that capture rarity and extinction risk. Measures based on PD and ED could bring solutions to conservation issues, however they are still rarely used in practice, probably because the reasons to protect evolutionary history are not clear for practitioners or due to a lack of data. However, important advances have been made in the availability of phylogenetic trees and methods for their construction, as well as assessments of extinction risk. 
Some challenges remain, and looking forward, research should prioritize the assessment of expected PD and ED loss for more taxonomic groups and test the assumption that preserving ED and PD also protects rare species and ecosystem services. Such research will be useful to inform and guide the conservation of Earth's biodiversity and the services it provides. © 2015 Cambridge Philosophical Society.
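The two indices reviewed here can be illustrated on a toy tree. In this sketch, phylogenetic diversity (PD) is the summed length of all branches connecting a set of species to the root, and evolutionary distinctiveness (ED) uses the fair-proportion rule, in which each branch length is shared equally among the tips descending from it. The tree encoding (a dict mapping each node to its parent and branch length), the function names, and the example tree are all hypothetical and serve only as illustration.

```python
def path_to_root(tree, node):
    """Yield each node on the path from node up to, but excluding, the root."""
    while tree[node][0] is not None:
        yield node
        node = tree[node][0]


def phylogenetic_diversity(tree, tips):
    """Sum of branch lengths on the union of root-to-tip paths for tips."""
    branches = set()
    for tip in tips:
        branches.update(path_to_root(tree, tip))
    return sum(tree[b][1] for b in branches)


def evolutionary_distinctiveness(tree, tips):
    """Fair-proportion ED: each branch is divided among its descendant tips."""
    n_desc = {}
    for tip in tips:
        for node in path_to_root(tree, tip):
            n_desc[node] = n_desc.get(node, 0) + 1
    return {tip: sum(tree[n][1] / n_desc[n] for n in path_to_root(tree, tip))
            for tip in tips}


# Hypothetical tree: node -> (parent, length of the branch above the node).
TREE = {
    "root": (None, 0.0),
    "n1": ("root", 2.0),
    "A": ("n1", 1.0),
    "B": ("n1", 1.0),
    "C": ("root", 3.0),
}

all_tips = ["A", "B", "C"]
pd_all = phylogenetic_diversity(TREE, all_tips)             # 7.0
pd_loss_if_b_extinct = pd_all - phylogenetic_diversity(TREE, ["A", "C"])  # 1.0
ed = evolutionary_distinctiveness(TREE, all_tips)           # A: 2.0, B: 2.0, C: 3.0
```

In the example, losing B removes only its terminal branch because A still preserves the shared internal branch, whereas losing C would remove 3.0 units. This is the sense in which the loss of evolutionary history depends on which species go extinct and on tree shape.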
Preserving the tree of life.

PubMed

Mace, Georgina M; Gittleman, John L; Purvis, Andy

2003-06-13

Phylogenies provide new ways to measure biodiversity, to assess conservation priorities, and to quantify the evolutionary history in any set of species. Methodological problems and a lack of knowledge about most species have so far hampered their use. In the future, as techniques improve and more data become accessible, we will have an expanded set of conservation options, including ways to prioritize outcomes from evolutionary and ecological processes.
Implementation of a Parallel Protein Structure Alignment Service on Cloud

PubMed Central

Hung, Che-Lun; Lin, Yaw-Ling

2013-01-01

Protein structure alignment has become an important strategy by which to identify evolutionary relationships between protein sequences. Several alignment tools are currently available for online comparison of protein structures. In this paper, we propose a parallel protein structure alignment service based on the Hadoop distribution framework. This service includes a protein structure alignment algorithm, a refinement algorithm, and a MapReduce programming model. The refinement algorithm refines the result of alignment. To process vast numbers of protein structures in parallel, the alignment and refinement algorithms are implemented using MapReduce. We analyzed and compared the structure alignments produced by different methods using a dataset randomly selected from the PDB database. The experimental results verify that the proposed algorithm refines the resulting alignments more accurately than existing algorithms. Meanwhile, the computational performance of the proposed service is proportional to the number of processors used in our cloud platform. PMID:23671842
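The MapReduce programming model mentioned in this abstract can be sketched in plain Python without Hadoop. In this illustration the map step scores one pair of structures and emits a record for each member of the pair, and the reduce step keeps the best-scoring partner per structure. Everything here is an assumption for illustration: `toy_score` is a stand-in that compares secondary-structure strings and is not the paper's alignment or refinement algorithm, and `run_job` simulates the shuffle phase in memory rather than using any Hadoop API.

```python
from collections import defaultdict
from itertools import combinations


def toy_score(a, b):
    """Placeholder for a real structure aligner (hypothetical).

    Returns the fraction of positions at which two secondary-structure
    strings agree, relative to the longer string.
    """
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / longest


def map_phase(pair):
    """Score one pair and emit (key, value) records for both members."""
    (id_a, s_a), (id_b, s_b) = pair
    score = toy_score(s_a, s_b)
    yield id_a, (id_b, score)
    yield id_b, (id_a, score)


def reduce_phase(key, values):
    """Keep the highest-scoring partner for one structure."""
    return key, max(values, key=lambda v: v[1])


def run_job(structures):
    """Simulate map, shuffle (group by key) and reduce in a single process."""
    grouped = defaultdict(list)
    for pair in combinations(sorted(structures.items()), 2):
        for key, value in map_phase(pair):
            grouped[key].append(value)
    return dict(reduce_phase(k, v) for k, v in grouped.items())
```

Because each call to `map_phase` depends only on its own pair, the pairs can be distributed across workers independently, which is the property that lets throughput grow with the number of processors.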
Forelimb kinematics and motor patterns of swimming loggerhead sea turtles (Caretta caretta): are motor patterns conserved in the evolution of new locomotor strategies?

PubMed

Rivera, Angela R V; Wyneken, Jeanette; Blob, Richard W

2011-10-01

Novel functions in animals may evolve through changes in morphology, muscle activity or a combination of both. The idea that new functions or behavior can arise solely through changes in structure, without concurrent changes in the patterns of muscle activity that control movement of those structures, has been formalized as the neuromotor conservation hypothesis. In vertebrate locomotor systems, evidence for neuromotor conservation is found across evolutionary transitions in the behavior of terrestrial species, and in evolutionary transitions from terrestrial species to flying species. However, evolutionary transitions in the locomotion of aquatic species have received little comparable study to determine whether changes in morphology and muscle function were coordinated through the evolution of new locomotor behavior. To evaluate the potential for neuromotor conservation in an ancient aquatic system, we quantified forelimb kinematics and muscle activity during swimming in the loggerhead sea turtle, Caretta caretta. Loggerhead forelimbs are hypertrophied into wing-like flippers that produce thrust via dorsoventral forelimb flapping. We compared kinematic and motor patterns from loggerheads with previous data from the red-eared slider, Trachemys scripta, a generalized freshwater species exhibiting unspecialized forelimb morphology and anteroposterior rowing motions during swimming. For some forelimb muscles, comparisons between C. caretta and T. scripta support neuromotor conservation; for example, the coracobrachialis and the latissimus dorsi show similar activation patterns. However, other muscles (deltoideus, pectoralis and triceps) do not show neuromotor conservation; for example, the deltoideus changes dramatically from a limb protractor/elevator in sliders to a joint stabilizer in loggerheads. 
Thus, during the evolution of flapping in sea turtles, drastic restructuring of the forelimb was accompanied by both conservation and evolutionary novelty in limb motor patterns.
Epsin deficiency impairs endocytosis by stalling the actin-dependent invagination of endocytic clathrin-coated pits

PubMed Central

Messa, Mirko; Fernández-Busnadiego, Rubén; Sun, Elizabeth Wen; Chen, Hong; Czapla, Heather; Wrasman, Kristie; Wu, Yumei; Ko, Genevieve; Ross, Theodora; Wendland, Beverly; De Camilli, Pietro

2014-01-01

Epsin is an evolutionarily conserved endocytic clathrin adaptor whose most critical function(s) in clathrin coat dynamics remain(s) elusive. To elucidate such function(s), we generated embryonic fibroblasts from conditional epsin triple KO mice. Triple KO cells displayed a dramatic cell division defect. Additionally, a robust impairment in clathrin-mediated endocytosis was observed, with an accumulation of early and U-shaped pits. This defect correlated with a perturbation of the coupling between the clathrin coat and the actin cytoskeleton, which we confirmed in a cell-free assay of endocytosis. Our results indicate that a key evolutionarily conserved function of epsin, in addition to other roles that include, as we show here, a low affinity interaction with SNAREs, is to help generate the force that leads to invagination and then fission of clathrin-coated pits. DOI: http://dx.doi.org/10.7554/eLife.03311.001 PMID:25122462
Adaptive evolutionary conservation: towards a unified concept for defining conservation units.

PubMed

Fraser, D J; Bernatchez, L

2001-12-01

Recent years have seen a debate over various methods that could objectively prioritize conservation value below the species level. Most prominent among these has been the evolutionarily significant unit (ESU). We reviewed ESU concepts with the aim of proposing a more unified concept that would reconcile opposing views. Like species concepts, conflicting ESU concepts are all essentially aiming to define the same thing: segments of species whose divergence can be measured or evaluated by putting differential emphasis on the role of evolutionary forces at varied temporal scales. Thus, differences between ESU concepts lie more in the criteria used to define the ESUs themselves than in their fundamental essence. We provide a context-based framework for delineating ESUs which circumvents much of this situation. Rather than becoming embroiled in a befuddled debate over an optimal criterion, the key to a solution is accepting that differing criteria will work more dynamically than others and can be used alone or in combination depending on the situation. These assertions constitute the impetus behind adaptive evolutionary conservation.
Fixism and conservation science.

PubMed

Robert, Alexandre; Fontaine, Colin; Veron, Simon; Monnet, Anne-Christine; Legrand, Marine; Clavel, Joanne; Chantepie, Stéphane; Couvet, Denis; Ducarme, Frédéric; Fontaine, Benoît; Jiguet, Frédéric; le Viol, Isabelle; Rolland, Jonathan; Sarrazin, François; Teplitsky, Céline; Mouchet, Maud

2017-08-01

The field of biodiversity conservation has recently been criticized as relying on a fixist view of the living world in which existing species constitute at the same time targets of conservation efforts and static states of reference, which is in apparent disagreement with evolutionary dynamics. We reviewed the prominent role of species as conservation units and the common benchmark approach to conservation that aims to use past biodiversity as a reference to conserve current biodiversity. We found that the species approach is justified by the discrepancy between the time scales of macroevolution and human influence and that biodiversity benchmarks are based on reference processes rather than fixed reference states. Overall, we argue that the ethical and theoretical frameworks underlying conservation research are based on macroevolutionary processes, such as extinction dynamics. Current species, phylogenetic, community, and functional conservation approaches constitute short-term responses to short-term human effects on these reference processes, and these approaches are consistent with evolutionary principles. © 2016 Society for Conservation Biology.
The sagittal stem alignment and the stem version clearly influence the impingement-free range of motion in total hip arthroplasty: a computer model-based analysis.

PubMed

Müller, Michael; Duda, Georg; Perka, Carsten; Tohtz, Stephan

2016-03-01

The component alignment in total hip arthroplasty influences the impingement-free range of motion (ROM). While substantiated data are available for cup positioning, little is known about stem alignment. Stem rotation and sagittal alignment in particular influence the position of the cone in relation to the edge of the socket and thus the impingement-free functioning. Hence, the question arises as to what influence these parameters have on the impingement-free ROM. With the help of a computer model, the influence of the sagittal stem alignment and rotation on the impingement-free ROM was investigated. The computer model was based on the CT dataset of a patient with a non-cemented THA. In the model the stem version was set at 10°/0°/-10° and the sagittal alignment at 5°/0°/-5°, which resulted in nine alternative stem positions. For each position, the maximum impingement-free ROM was investigated. Both stem version and sagittal stem alignment have a relevant influence on the impingement-free ROM. In particular, flexion and extension as well as internal and external rotation capability present evident differences. Across the intervals of 10° in sagittal stem alignment and 20° in stem version, differences of about 80° in flexion and 50° in extension capability were found. Likewise, differences of up to 72° in internal and up to 36° in external rotation were found. The sagittal stem alignment and the stem torsion have a relevant influence on the impingement-free ROM. To clarify the causes of an impingement or accompanying problems, both parameters should be examined and, if possible, a combined assessment of these factors should be made.
Accurate Simulation and Detection of Coevolution Signals in Multiple Sequence Alignments

PubMed Central

Ackerman, Sharon H.; Tillier, Elisabeth R.; Gatti, Domenico L.

2012-01-01

Background: While the conserved positions of a multiple sequence alignment (MSA) are clearly of interest, non-conserved positions can also be important because, for example, destabilizing effects at one position can be compensated by stabilizing effects at another position. Different methods have been developed to recognize the evolutionary relationship between amino acid sites, and to disentangle functional/structural dependencies from historical/phylogenetic ones. Methodology/Principal Findings: We have used two complementary approaches to test the efficacy of these methods. In the first approach, we have used a new program, MSAvolve, for the in silico evolution of MSAs, which records a detailed history of all covarying positions, and builds a global coevolution matrix as the accumulated sum of individual matrices for the positions forced to co-vary, the recombinant coevolution, and the stochastic coevolution. We have simulated over 1600 MSAs for 8 protein families, which reflect sequences of different sizes and proteins with widely different functions. The calculated coevolution matrices were compared with the coevolution matrices obtained for the same evolved MSAs with different coevolution detection methods. In a second approach we have evaluated the capacity of the different methods to predict close contacts in the representative X-ray structures of an additional 150 protein families using only experimental MSAs. Conclusions/Significance: Methods based on the identification of global correlations between pairs were found to be generally superior to methods based only on local correlations in their capacity to identify coevolving residues using either simulated or experimental MSAs. However, the significant variability in the performance of different methods with different proteins suggests that the simulation of MSAs that replicate the statistical properties of the experimental MSA can be a valuable tool to identify the coevolution detection method that is most effective in each case. PMID:23091608
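To make the idea of a local pairwise covariation score concrete, the following minimal sketch computes the mutual information between two alignment columns. It is not MSAvolve and not one of the methods evaluated in the study; the function name `column_mutual_information` and the toy alignment are assumptions made for illustration only.

```python
from collections import Counter
from math import log2


def column_mutual_information(msa, i, j):
    """Mutual information (in bits) between columns i and j of an MSA.

    `msa` is a list of equal-length strings, one per aligned sequence.
    Hypothetical helper for illustration, not taken from any published tool.
    """
    n = len(msa)
    pair_counts = Counter((seq[i], seq[j]) for seq in msa)
    counts_i = Counter(seq[i] for seq in msa)
    counts_j = Counter(seq[j] for seq in msa)
    mi = 0.0
    for (a, b), count in pair_counts.items():
        p_ab = count / n
        p_a = counts_i[a] / n
        p_b = counts_j[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


# Toy alignment: columns 0 and 1 covary perfectly, column 2 is invariant.
toy_msa = ["AKD", "AKD", "GED", "GED"]
mi_covarying = column_mutual_information(toy_msa, 0, 1)
mi_invariant = column_mutual_information(toy_msa, 0, 2)
```

A score of this kind looks only at one pair of columns at a time, which is the sense in which the abstract contrasts local correlations with global ones.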
Conserved Nonexonic Elements: A Novel Class of Marker for Phylogenomics.

PubMed

Edwards, Scott V; Cloutier, Alison; Baker, Allan J

2017-11-01

Noncoding markers have a particular appeal as tools for phylogenomic analysis because, at least in vertebrates, they appear less subject to strong variation in GC content among lineages. Thus far, ultraconserved elements (UCEs) and introns have been the most widely used noncoding markers. Here we analyze and study the evolutionary properties of a new type of noncoding marker, conserved nonexonic elements (CNEEs), which consists of noncoding elements that are estimated to evolve slower than the neutral rate across a set of species. Although they often include UCEs, CNEEs are distinct from UCEs because they are not ultraconserved, and, most importantly, the core region alone is analyzed, rather than both the core and its flanking regions. Using a data set of 16 birds plus an alligator outgroup, and ∼3600-∼3800 loci per marker type, we found that although CNEEs were less variable than bioinformatically derived UCEs or introns and in some cases exhibited a slower approach to branch resolution as determined by phylogenomic subsampling, the quality of CNEE alignments was superior to those of the other markers, with fewer gaps and missing species. Phylogenetic resolution using coalescent approaches was comparable among the three marker types, with most nodes being fully and congruently resolved. Comparison of phylogenetic results across the three marker types indicated that one branch, the sister group to the passerine + falcon clade, was resolved differently and with moderate (>70%) bootstrap support between CNEEs and UCEs or introns. Overall, CNEEs appear to be promising as phylogenomic markers, yielding phylogenetic resolution as high as for UCEs and introns but with fewer gaps, less ambiguity in alignments and with patterns of nucleotide substitution more consistent with the assumptions of commonly used methods of phylogenetic analysis. © The Author(s) 2017. Published by Oxford University Press on behalf of the Systematic Biologists.
Conserved Nonexonic Elements: A Novel Class of Marker for Phylogenomics

PubMed Central

Cloutier, Alison; Baker, Allan J.

2017-01-01

Noncoding markers have a particular appeal as tools for phylogenomic analysis because, at least in vertebrates, they appear less subject to strong variation in GC content among lineages. Thus far, ultraconserved elements (UCEs) and introns have been the most widely used noncoding markers. Here we analyze and study the evolutionary properties of a new type of noncoding marker, conserved nonexonic elements (CNEEs), which consists of noncoding elements that are estimated to evolve slower than the neutral rate across a set of species. Although they often include UCEs, CNEEs are distinct from UCEs because they are not ultraconserved, and, most importantly, the core region alone is analyzed, rather than both the core and its flanking regions. Using a data set of 16 birds plus an alligator outgroup, and ∼3600–∼3800 loci per marker type, we found that although CNEEs were less variable than bioinformatically derived UCEs or introns and in some cases exhibited a slower approach to branch resolution as determined by phylogenomic subsampling, the quality of CNEE alignments was superior to those of the other markers, with fewer gaps and missing species. Phylogenetic resolution using coalescent approaches was comparable among the three marker types, with most nodes being fully and congruently resolved. Comparison of phylogenetic results across the three marker types indicated that one branch, the sister group to the passerine + falcon clade, was resolved differently and with moderate (>70%) bootstrap support between CNEEs and UCEs or introns. Overall, CNEEs appear to be promising as phylogenomic markers, yielding phylogenetic resolution as high as for UCEs and introns but with fewer gaps, less ambiguity in alignments and with patterns of nucleotide substitution more consistent with the assumptions of commonly used methods of phylogenetic analysis. PMID:28637293
Cost-effective conservation of amphibian ecology and evolution

PubMed Central

Campos, Felipe S.; Lourenço-de-Moraes, Ricardo; Llorente, Gustavo A.; Solé, Mirco

2017-01-01

Habitat loss is the most important threat to species survival, and the efficient selection of priority areas is fundamental for good systematic conservation planning. Using amphibians as a conservation target, we designed an innovative assessment strategy, showing that prioritization models focused on functional, phylogenetic, and taxonomic diversity can include cost-effectiveness–based assessments of land values. We report new key conservation sites within the Brazilian Atlantic Forest hot spot, revealing a congruence of ecological and evolutionary patterns. We suggest payment for ecosystem services through environmental set-asides on private land, establishing potential trade-offs for ecological and evolutionary processes. Our findings introduce additional effective area-based conservation parameters that set new priorities for biodiversity assessment in the Atlantic Forest, validating the usefulness of a novel approach to cost-effectiveness–based assessments of conservation value for other species-rich regions. PMID:28691084
Does the evolutionary conservation of microsatellite loci imply function?

DOE Office of Scientific and Technical Information (OSTI.GOV)

Shriver, M.D.; Deka, R.; Ferrell, R.E.

Microsatellites are highly polymorphic tandem arrays of short (1-6 bp) sequence motifs which have been found widely distributed in the genomes of all eukaryotes. We have analyzed allele frequency data on 16 microsatellite loci typed in the great apes (human, chimp, orangutan, and gorilla). The majority of these loci (13) were isolated from human genomic libraries; three were cloned from chimpanzee genomic DNA. Most of these loci are not only present in all ape species, but are polymorphic with comparable levels of heterozygosity and have alleles which overlap in size. The extent of divergence of allele frequencies among these four species was studied using the stepwise-weighted genetic distance (Dsw), which was previously shown to conform to linearity with evolutionary time since divergence for loci where mutations occur in a stepwise fashion. The phylogenetic tree of the great apes constructed from this distance matrix was consistent with the expected topology, with a high bootstrap confidence (82%) for the human/chimp clade. However, the allele frequency distributions of these species are 10 times more similar to each other than expected when they were calibrated with a conservative estimate of the time since separation of humans and the apes. These results are in agreement with sequence-based surveys of microsatellites which have demonstrated that they are highly (90%) conserved over short periods of evolutionary time (< 10 million years) and moderately (30%) conserved over long periods of evolutionary time (> 60-80 million years). This evolutionary conservation has prompted some authors to speculate that there are functional constraints on microsatellite loci. In contrast, the presence of directional bias of mutations with constraints and/or selection against aberrant sized alleles can explain these results.
Modular and configurable optimal sequence alignment software: Cola.

PubMed

Zamani, Neda; Sundström, Görel; Höppner, Marc P; Grabherr, Manfred G

2014-01-01

The fundamental challenge in optimally aligning homologous sequences is to define a scoring scheme that best reflects the underlying biological processes. Maximising the overall number of matches in the alignment does not always reflect the patterns by which nucleotides mutate. Efficiently implemented algorithms that can be parameterised to accommodate more complex non-linear scoring schemes are thus desirable. We present Cola, alignment software that implements different optimal alignment algorithms, also allowing for scoring contiguous matches of nucleotides in a nonlinear manner. The latter places more emphasis on short, highly conserved motifs, and less on the surrounding nucleotides, which can be more diverged. To illustrate the differences, we report results from aligning 14,100 sequences from 3' untranslated regions of human genes to 25 of their mammalian counterparts, where we found that a nonlinear scoring scheme is more consistent than a linear scheme in detecting short, conserved motifs. Cola is freely available under LGPL from https://github.com/nedaz/cola.
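The contrast between linear and nonlinear scoring of contiguous matches can be sketched as follows. This is not Cola's scoring function, which the abstract does not specify; the run-length power rule, the `exponent` parameter, and all function names are assumptions chosen only to show why contiguous matches can be rewarded more than scattered ones.

```python
def match_runs(a, b):
    """Lengths of maximal runs of identical, ungapped positions in two aligned strings."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    runs, current = [], 0
    for x, y in zip(a, b):
        if x == y and x != "-":
            current += 1
        else:
            if current:
                runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs


def linear_score(a, b):
    """Each matching position counts once, regardless of its neighbours."""
    return sum(match_runs(a, b))


def nonlinear_score(a, b, exponent=2):
    """Assumed rule: a run of k contiguous matches scores k ** exponent."""
    return sum(run ** exponent for run in match_runs(a, b))


reference = "ACGTACGT"
clustered = "ACGTCATG"  # four matches in one contiguous block
scattered = "AGGAAAGA"  # four matches, each isolated
```

With these toy sequences both candidates have the same linear score, while the nonlinear score favours the candidate whose matches form a single motif-like block.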

Recombinant transfer in the basic genome of E. coli

DOE PAGES

Dixit, Purushottam; Studier, F. William; Pang, Tin Yau; ...

2015-07-07

An approximation to the ~4-Mbp basic genome shared by 32 strains of E. coli representing six evolutionary groups has been derived and analyzed computationally. A multiple-alignment of the 32 complete genome sequences was filtered to remove mobile elements and identify the most reliable ~90% of the aligned length of each of the resulting 496 basic-genome pairs. Patterns of single bp mutations (SNPs) in aligned pairs distinguish clonally inherited regions from regions where either genome has acquired DNA fragments from diverged genomes by homologous recombination since their last common ancestor. Such recombinant transfer is pervasive across the basic genome, mostly between genomes in the same evolutionary group, and generates many unique mosaic patterns. The six least-diverged genome-pairs have one or two recombinant transfers of length ~40–115 kbp (and few if any other transfers), each containing one or more gene clusters known to confer strong selective advantage in some environments. Moderately diverged genome pairs (0.4–1% SNPs) show mosaic patterns of interspersed clonal and recombinant regions of varying lengths throughout the basic genome, whereas more highly diverged pairs within an evolutionary group or pairs between evolutionary groups having >1.3% SNPs have few clonal matches longer than a few kbp. Many recombinant transfers appear to incorporate fragments of the entering DNA produced by restriction systems of the recipient cell. A simple computational model can closely fit the data. As a result, most recombinant transfers seem likely to be due to generalized transduction by co-evolving populations of phages, which could efficiently distribute variability throughout bacterial genomes.
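The figure of 496 genome pairs is simply the number of unordered pairs among 32 genomes, 32 × 31 / 2. The sketch below checks that count and shows one plausible way to compute a pairwise SNP fraction from two aligned sequences; the function `snp_fraction` and its handling of gap columns are assumptions for illustration, not the authors' pipeline.

```python
from math import comb


def snp_fraction(a, b):
    """Fraction of ungapped aligned positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differences = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue  # skip columns where either sequence has a gap
        compared += 1
        differences += x != y
    return differences / compared if compared else 0.0


n_pairs = comb(32, 2)  # unordered pairs among 32 genomes
toy_fraction = snp_fraction("ACGT-A", "ACTTAA")  # 1 difference in 5 compared columns
```

Multiplying the returned fraction by 100 gives a percentage that can be compared with the 0.4–1% and >1.3% SNP ranges quoted in the abstract.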
Recombinant transfer in the basic genome of E. coli

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dixit, Purushottam; Studier, F. William; Pang, Tin Yau

An approximation to the ~4-Mbp basic genome shared by 32 strains of E. coli representing six evolutionary groups has been derived and analyzed computationally. A multiple-alignment of the 32 complete genome sequences was filtered to remove mobile elements and identify the most reliable ~90% of the aligned length of each of the resulting 496 basic-genome pairs. Patterns of single bp mutations (SNPs) in aligned pairs distinguish clonally inherited regions from regions where either genome has acquired DNA fragments from diverged genomes by homologous recombination since their last common ancestor. Such recombinant transfer is pervasive across the basic genome, mostly between genomes in the same evolutionary group, and generates many unique mosaic patterns. The six least-diverged genome-pairs have one or two recombinant transfers of length ~40–115 kbp (and few if any other transfers), each containing one or more gene clusters known to confer strong selective advantage in some environments. Moderately diverged genome pairs (0.4–1% SNPs) show mosaic patterns of interspersed clonal and recombinant regions of varying lengths throughout the basic genome, whereas more highly diverged pairs within an evolutionary group or pairs between evolutionary groups having >1.3% SNPs have few clonal matches longer than a few kbp. Many recombinant transfers appear to incorporate fragments of the entering DNA produced by restriction systems of the recipient cell. A simple computational model can closely fit the data. As a result, most recombinant transfers seem likely to be due to generalized transduction by co-evolving populations of phages, which could efficiently distribute variability throughout bacterial genomes.
Evolutionary growth process of highly conserved sequences in vertebrate genomes.

PubMed

Ishibashi, Minaka; Noda, Akiko Ogura; Sakate, Ryuichi; Imanishi, Tadashi

2012-08-01

Genome sequence comparison between evolutionarily distant species revealed ultraconserved elements (UCEs) among mammals under strong purifying selection. Most of them were also conserved among vertebrates. Because they tend to be located in the flanking regions of developmental genes, they would have fundamental roles in creating vertebrate body plans. However, the evolutionary origin and selection mechanism of these UCEs remain unclear. Here we report that UCEs arose in primitive vertebrates, and gradually grew in vertebrate evolution. We searched for UCEs in two teleost fishes, Tetraodon nigroviridis and Oryzias latipes, and found 554 UCEs with 100% identity over 100 bps. Comparison of teleost and mammalian UCEs revealed 43 pairs of common, jawed-vertebrate UCEs (jUCE) with high sequence identities, ranging from 83.1% to 99.2%. Ten of them retain lower similarities to the Petromyzon marinus genome, and the substitution rates of four non-exonic jUCEs were reduced after the teleost-mammal divergence, suggesting that robust conservation had been acquired in the jawed vertebrate lineage. Our results indicate that prototypical UCEs originated before the divergence of jawed and jawless vertebrates and have been frozen as perfect conserved sequences in the jawed vertebrate lineage. In addition, our comparative sequence analyses of UCEs and neighboring regions resulted in a discovery of lineage-specific conserved sequences. They were added progressively to prototypical UCEs, suggesting step-wise acquisition of novel regulatory roles. Our results indicate that conserved non-coding elements (CNEs) consist of blocks with distinct evolutionary history, each having been frozen since different evolutionary era along the vertebrate lineage. Copyright © 2012 Elsevier B.V. All rights reserved.
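The criterion of 100% identity over at least 100 bp can be illustrated with a short scan over a pairwise alignment. This is only a sketch of the criterion as stated in the abstract; the function name `identical_runs`, the half-open coordinate convention, and the synthetic sequences are assumptions, and the authors' actual search procedure is not described here.

```python
def identical_runs(a, b, min_len=100):
    """Half-open (start, end) alignment intervals where a and b are identical and ungapped.

    Only runs of at least `min_len` columns are reported.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    runs = []
    start = None
    for pos, (x, y) in enumerate(zip(a, b)):
        same = x == y and x != "-"
        if same and start is None:
            start = pos  # a new identical run begins here
        elif not same and start is not None:
            if pos - start >= min_len:
                runs.append((start, pos))
            start = None
    if start is not None and len(a) - start >= min_len:
        runs.append((start, len(a)))  # run extends to the end of the alignment
    return runs


# Synthetic example: 120 identical columns, one mismatch, then 50 identical columns.
seq_a = "A" * 120 + "C" + "G" * 50
seq_b = "A" * 120 + "T" + "G" * 50
uce_like = identical_runs(seq_a, seq_b)
shorter = identical_runs(seq_a, seq_b, min_len=50)
```

Lowering `min_len` shows how sensitive the number of reported elements is to the length threshold.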
The covariance between genetic and environmental influences across ecological gradients: reassessing the evolutionary significance of countergradient and cogradient variation.

PubMed

Conover, David O; Duffy, Tara A; Hice, Lyndie A

2009-06-01

Patterns of phenotypic change across environmental gradients (e.g., latitude, altitude) have long captivated the interest of evolutionary ecologists. The pattern and magnitude of phenotypic change is determined by the covariance between genetic and environmental influences across a gradient. Cogradient variation (CoGV) occurs when covariance is positive: that is, genetic and environmental influences on phenotypic expression are aligned and their joint influence accentuates the change in mean trait value across the gradient. Conversely, countergradient variation (CnGV) occurs when covariance is negative: that is, genetic and environmental influences on phenotypes oppose one another, thereby diminishing the change in mean trait expression across the gradient. CnGV has so far been found in at least 60 species, with most examples coming from fishes, amphibians, and insects across latitudinal or altitudinal gradients. Traits that display CnGV most often involve metabolic compensation, that is, the elevation of various physiological rates processes (development, growth, feeding, metabolism, activity) to counteract the dampening effect of reduced temperature, growing season length, or food supply. Far fewer examples of CoGV have been identified (11 species), and these most often involve morphological characters. Increased knowledge of spatial covariance patterns has furthered our understanding of Bergmann size clines, phenotypic plasticity, species range limits, tradeoffs in juvenile growth rate, and the design of conservation strategies for wild species. Moreover, temporal CnGV explains some cases of an apparent lack of phenotypic response to directional selection and provides a framework for predicting evolutionary responses to climate change.
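The role of the covariance term can be made explicit with the standard variance identity, under the simplifying assumption (not stated in the abstract) that the mean trait value at each point along the gradient is the sum of a genetic contribution $G$ and an environmental contribution $E$:

```latex
\begin{align}
P &= G + E, \\
\operatorname{Var}(P) &= \operatorname{Var}(G) + \operatorname{Var}(E) + 2\,\operatorname{Cov}(G, E).
\end{align}
```

When $\operatorname{Cov}(G, E) > 0$ (CoGV), the phenotypic variance across the gradient exceeds $\operatorname{Var}(G) + \operatorname{Var}(E)$; when $\operatorname{Cov}(G, E) < 0$ (CnGV), it is smaller, and in the limiting case the two influences can offset each other so that little phenotypic change is observed along the gradient.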
Covariant Evolutionary Event Analysis for Base Interaction Prediction Using a Relational Database Management System for RNA.

PubMed

Xu, Weijia; Ozer, Stuart; Gutell, Robin R

2009-01-01

With an increasingly large amount of sequences properly aligned, comparative sequence analysis can accurately identify not only common structures formed by standard base pairing but also new types of structural elements and constraints. However, traditional methods are too computationally expensive to perform well on large scale alignment and less effective with the sequences from diversified phylogenetic classifications. We propose a new approach that utilizes coevolutional rates among pairs of nucleotide positions using phylogenetic and evolutionary relationships of the organisms of aligned sequences. With a novel data schema to manage relevant information within a relational database, our method, implemented with a Microsoft SQL Server 2005, showed 90% sensitivity in identifying base pair interactions among 16S ribosomal RNA sequences from Bacteria, at a scale 40 times bigger and 50% better sensitivity than a previous study. The results also indicated covariation signals for a few sets of cross-strand base stacking pairs in secondary structure helices, and other subtle constraints in the RNA structure.
Covariant Evolutionary Event Analysis for Base Interaction Prediction Using a Relational Database Management System for RNA

PubMed Central

Xu, Weijia; Ozer, Stuart; Gutell, Robin R.

2010-01-01

With an increasingly large amount of sequences properly aligned, comparative sequence analysis can accurately identify not only common structures formed by standard base pairing but also new types of structural elements and constraints. However, traditional methods are too computationally expensive to perform well on large scale alignment and less effective with the sequences from diversified phylogenetic classifications. We propose a new approach that utilizes coevolutional rates among pairs of nucleotide positions using phylogenetic and evolutionary relationships of the organisms of aligned sequences. With a novel data schema to manage relevant information within a relational database, our method, implemented with a Microsoft SQL Server 2005, showed 90% sensitivity in identifying base pair interactions among 16S ribosomal RNA sequences from Bacteria, at a scale 40 times bigger and 50% better sensitivity than a previous study. The results also indicated covariation signals for a few sets of cross-strand base stacking pairs in secondary structure helices, and other subtle constraints in the RNA structure. PMID:20502534
Optimizing multiple sequence alignments using a genetic algorithm based on three objectives: structural information, non-gaps percentage and totally conserved columns.

PubMed

Ortuño, Francisco M; Valenzuela, Olga; Rojas, Fernando; Pomares, Hector; Florido, Javier P; Urquiza, Jose M; Rojas, Ignacio

2013-09-01

Multiple sequence alignments (MSAs) are widely used approaches in bioinformatics to carry out other tasks such as structure predictions, biological function analyses or phylogenetic modeling. However, current tools usually provide partially optimal alignments, as each one is focused on specific biological features. Thus, the same set of sequences can produce different alignments, above all when sequences are less similar. Consequently, researchers and biologists do not agree about which is the most suitable way to evaluate MSAs. Recent evaluations tend to use more complex scores including further biological features. Among them, 3D structures are increasingly being used to evaluate alignments. Because structures are more conserved in proteins than sequences, scores with structural information are better suited to evaluate more distant relationships between sequences. The proposed multiobjective algorithm, based on the non-dominated sorting genetic algorithm, aims to jointly optimize three objectives: STRIKE score, non-gaps percentage and totally conserved columns. It was assessed on the BAliBASE benchmark, with significant differences according to the Kruskal-Wallis test (P < 0.01). This algorithm also outperforms other aligners, such as ClustalW, Multiple Sequence Alignment Genetic Algorithm (MSA-GA), PRRP, DIALIGN, Hidden Markov Model Training (HMMT), Pattern-Induced Multi-sequence Alignment (PIMA), MULTIALIGN, Sequence Alignment Genetic Algorithm (SAGA), PILEUP, Rubber Band Technique Genetic Algorithm (RBT-GA) and Vertical Decomposition Genetic Algorithm (VDGA), according to the Wilcoxon signed-rank test (P < 0.05), whereas it shows results not significantly different from 3D-COFFEE (P > 0.05) with the advantage of being able to use fewer structures. Structural information is included within the objective function to evaluate more accurately the obtained alignments. The source code is available at http://www.ugr.es/~fortuno/MOSAStrE/MO-SAStrE.zip.
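Two of the three objectives named above can be computed directly from an alignment. The sketch below uses a straightforward reading of their names (share of non-gap characters, and number of gap-free columns in which every sequence has the same residue); these definitions and the function names are assumptions and may differ in detail from those in MO-SAStrE. The STRIKE score is omitted because it requires structural data.

```python
def non_gap_percentage(msa, gap="-"):
    """Percentage of alignment cells that hold a residue rather than a gap."""
    total = sum(len(seq) for seq in msa)
    gaps = sum(seq.count(gap) for seq in msa)
    return 100.0 * (total - gaps) / total


def totally_conserved_columns(msa, gap="-"):
    """Number of columns in which all sequences share the same non-gap residue."""
    return sum(
        1 for column in zip(*msa)
        if len(set(column)) == 1 and column[0] != gap
    )


toy_alignment = ["MK-LV", "MKALV", "MR-LV"]
ng = non_gap_percentage(toy_alignment)          # 13 of 15 cells are residues
tc = totally_conserved_columns(toy_alignment)   # columns M, L and V
```

Because the two quantities can pull in different directions (adding gaps may create conserved columns while lowering the non-gap percentage), treating them as separate objectives rather than one weighted sum is a natural fit for a multiobjective optimizer.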
Conserving the functional and phylogenetic trees of life of European tetrapods

PubMed Central

Thuiller, Wilfried; Maiorano, Luigi; Mazel, Florent; Guilhaumon, François; Ficetola, Gentile Francesco; Lavergne, Sébastien; Renaud, Julien; Roquet, Cristina; Mouillot, David

2015-01-01

Protected areas (PAs) are pivotal tools for biodiversity conservation on the Earth. Europe has had an extensive protection system since Natura 2000 areas were created in parallel with traditional parks and reserves. However, the extent to which this system covers not only taxonomic diversity but also other biodiversity facets, such as evolutionary history and functional diversity, has never been evaluated. Using high-resolution distribution data of all European tetrapods together with dated molecular phylogenies and detailed trait information, we first tested whether the existing European protection system effectively covers all species and in particular, those with the highest evolutionary or functional distinctiveness. We then tested the ability of PAs to protect the entire tetrapod phylogenetic and functional trees of life by mapping species' target achievements along the internal branches of these two trees. We found that the current system is adequately representative in terms of the evolutionary history of amphibians while it fails for the rest. However, the most functionally distinct species were better represented than they would be under random conservation efforts. These results imply better protection of the tetrapod functional tree of life, which could help to ensure long-term functioning of the ecosystem, potentially at the expense of conserving evolutionary history. PMID:25561666
HubAlign: an accurate and efficient method for global alignment of protein-protein interaction networks.

PubMed

Hashemifar, Somaye; Xu, Jinbo

2014-09-01

High-throughput experimental techniques have produced a large amount of protein-protein interaction (PPI) data. The study of PPI networks, such as comparative analysis, should benefit the understanding of life processes and diseases at the molecular level. One way of comparative analysis is to align PPI networks to identify conserved or species-specific subnetwork motifs. A few methods have been developed for global PPI network alignment, but it remains challenging in terms of both accuracy and efficiency. This paper presents a novel global network alignment algorithm, denoted as HubAlign, that makes use of both network topology and sequence homology information, based upon the observation that topologically important proteins in a PPI network usually are much more conserved and thus more likely to be aligned. HubAlign uses a minimum-degree heuristic algorithm to estimate the topological and functional importance of a protein from the global network topology information. Then HubAlign aligns topologically important proteins first and gradually extends the alignment to the whole network. Extensive tests indicate that HubAlign greatly outperforms several popular methods in terms of both accuracy and efficiency, especially in detecting functionally similar proteins. HubAlign is available freely for non-commercial purposes at http://ttic.uchicago.edu/∼hashemifar/software/HubAlign.zip. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.
Independent Evolution of Six Families of Halogenating Enzymes

PubMed Central

Xu, Gangming; Wang, Bin-Gui

2016-01-01

Halogenated natural products are widespread in the environment, and the halogen atoms are typically vital to their bioactivities. Thus far, six families of halogenating enzymes have been identified: cofactor-free haloperoxidases (HPO), vanadium-dependent haloperoxidases (V-HPO), heme iron-dependent haloperoxidases (HI-HPO), non-heme iron-dependent halogenases (NI-HG), flavin-dependent halogenases (F-HG), and S-adenosyl-L-methionine (SAM)-dependent halogenases (S-HG). However, these halogenating enzymes with similar biological functions but distinct structures might have evolved independently. Phylogenetic and structural analyses suggest that the HPO, V-HPO, HI-HPO, NI-HG, F-HG, and S-HG enzyme families may have evolutionary relationships to the α/β hydrolases, acid phosphatases, peroxidases, chemotaxis phosphatases, oxidoreductases, and SAM hydroxide adenosyltransferases, respectively. These halogenating enzymes have established sequence homology, structural conservation, and mechanistic features within each family. Understanding the distinct evolutionary history of these halogenating enzymes will provide further insights into the study of their catalytic mechanisms and halogenation specificity. PMID:27153321
An Evolutionary Landscape of A-to-I RNA Editome across Metazoan Species

PubMed Central

Hung, Li-Yuan; Chen, Yen-Ju; Mai, Te-Lun; Chen, Chia-Ying; Yang, Min-Yu; Chiang, Tai-Wei; Wang, Yi-Da

2018-01-01

Adenosine-to-inosine (A-to-I) editing is widespread across the kingdom Metazoa. However, owing to the lack of comprehensive analysis in nonmodel animals, the evolutionary history of A-to-I editing remains largely unexplored. Here, we detect high-confidence editing sites using clustering and conservation strategies based on RNA sequencing data alone, without using single-nucleotide polymorphism information or genome sequencing data from the same sample. We thereby unveil the first evolutionary landscape of A-to-I editing maps across 20 metazoan species (from worm to human), providing unprecedented evidence on how the editing mechanism gradually expands its territory and increases its influence along the history of evolution. Our results revealed that highly clustered and conserved editing sites tended to have a higher editing level and a higher magnitude of the ADAR motif. The ratio of the frequency of nonsynonymous editing to that of synonymous editing increased remarkably with increasing conservation level of A-to-I editing. These results thus suggest a potential functional benefit of highly clustered and conserved editing sites. In addition, spatiotemporal dynamics analyses reveal a conserved enrichment of editing and ADAR expression in the central nervous system throughout more than 300 Myr of divergent evolution in complex animals and the comparability of editing patterns between invertebrates and between vertebrates during development. This study provides evolutionary and dynamic aspects of A-to-I editome across metazoan species, expanding this important but understudied class of nongenomically encoded events for comprehensive characterization. PMID:29294013
Free energy analysis of cell spreading.

PubMed

McEvoy, Eóin; Deshpande, Vikram S; McGarry, Patrick

2017-10-01

In this study we present a steady-state adaptation of the thermodynamically motivated stress fiber (SF) model of Vigliotti et al. (2015). We implement this steady-state formulation in a non-local finite element setting where we also consider global conservation of the total number of cytoskeletal proteins within the cell, global conservation of the number of binding integrins on the cell membrane, and adhesion limiting ligand density on the substrate surface. We present a number of simulations of cell spreading in which we consider a limited subset of the possible deformed spread-states assumed by the cell in order to examine the hypothesis that free energy minimization drives the process of cell spreading. Simulations suggest that cell spreading can be viewed as a competition between (i) decreasing cytoskeletal free energy due to strain induced assembly of cytoskeletal proteins into contractile SFs, and (ii) increasing elastic free energy due to stretching of the mechanically passive components of the cell. The computed minimum free energy spread area is shown to be lower for a cell on a compliant substrate than on a rigid substrate. Furthermore, a low substrate ligand density is found to limit cell spreading. The predicted dependence of cell spread area on substrate stiffness and ligand density is in agreement with the experiments of Engler et al. (2003). We also simulate the experiments of Théry et al. (2006), whereby initially circular cells deform and adhere to "V-shaped" and "Y-shaped" ligand patches. Analysis of a number of different spread states reveals that deformed configurations with the lowest free energy exhibit a SF distribution that corresponds to experimental observations, i.e. a high concentration of highly aligned SFs occurs along free edges, with lower SF concentrations in the interior of the cell. 
In summary, the results of this study suggest that cell spreading is driven by free energy minimization based on a competition between decreasing cytoskeletal free energy and increasing passive elastic free energy. Copyright © 2017 Elsevier Ltd. All rights reserved.
Phylogeny, extinction and conservation: embracing uncertainties in a time of urgency

PubMed Central

Forest, Félix; Crandall, Keith A.; Chase, Mark W.; Faith, Daniel P.

2015-01-01

Evolutionary studies have played a fundamental role in our understanding of life, but until recently, they had only a relatively modest involvement in addressing conservation issues. The main goal of the present discussion meeting issue is to offer a platform to present the available methods allowing the integration of phylogenetic and extinction risk data in conservation planning. Here, we identify the main knowledge gaps in biodiversity science, which include incomplete sampling, reconstruction biases in phylogenetic analyses, partly known species distribution ranges, and the difficulty in producing conservation assessments for all known species, not to mention that much of the effective biological diversity remains to be discovered. Given the impact that human activities have on biodiversity and the urgency with which we need to address these issues, imperfect assumptions need to be sanctioned and surrogates used in the race to salvage as much as possible of our natural and evolutionary heritage. We discuss some aspects of the uncertainties found in biodiversity science, such as the ideal surrogates for biodiversity, the gaps in our knowledge and the numerous available phylogenetic diversity-based methods. We also introduce a series of case studies that demonstrate how evolutionary biology can effectively contribute to biodiversity conservation science. PMID:25561663
CAFE: aCcelerated Alignment-FrEe sequence analysis.

PubMed

Lu, Yang Young; Tang, Kujin; Ren, Jie; Fuhrman, Jed A; Waterman, Michael S; Sun, Fengzhu

2017-07-03

Alignment-free genome and metagenome comparisons are increasingly important with the development of next generation sequencing (NGS) technologies. Recently developed state-of-the-art k-mer based alignment-free dissimilarity measures including CVTree, $d_2^*$ and $d_2^S$ are more computationally expensive than measures based solely on the k-mer frequencies. Here, we report a standalone software, aCcelerated Alignment-FrEe sequence analysis (CAFE), for efficient calculation of 28 alignment-free dissimilarity measures. CAFE allows for both assembled genome sequences and unassembled NGS shotgun reads as input, and wraps the output in a standard PHYLIP format. In downstream analyses, CAFE can also be used to visualize the pairwise dissimilarity measures, including dendrograms, heatmap, principal coordinate analysis and network display. CAFE serves as a general k-mer based alignment-free analysis platform for studying the relationships among genomes and metagenomes, and is freely available at https://github.com/younglululu/CAFE. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
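The k-mer based comparison described in this record can be illustrated with a minimal Python sketch. It is not CAFE's code: the function names `kmer_counts` and `d2_dissimilarity` are hypothetical, and the measure shown is an assumed cosine-style normalisation of the plain k-mer count product. The background-corrected measures named in the abstract ($d_2^*$ and $d_2^S$) additionally centre counts against a sequence model, which is not shown here.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq, k):
    """Count every overlapping k-mer in seq (case-insensitive)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2_dissimilarity(a, b, k):
    """Hypothetical helper: 0.5 * (1 - cosine similarity of k-mer counts).

    Returns 0 for identical k-mer profiles and 0.5 when the two
    sequences share no k-mer at all.
    """
    ca, cb = kmer_counts(a, k), kmer_counts(b, k)
    # Inner product over k-mers present in both sequences.
    dot = sum(ca[w] * cb[w] for w in ca.keys() & cb.keys())
    norm_a = sqrt(sum(v * v for v in ca.values()))
    norm_b = sqrt(sum(v * v for v in cb.values()))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("each sequence must contain at least one k-mer")
    return 0.5 * (1.0 - dot / (norm_a * norm_b))
```

Applying `d2_dissimilarity` to every pair of genomes would give a symmetric matrix of the kind that the abstract says CAFE writes in PHYLIP format.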
Clustering of reads with alignment-free measures and quality values.

PubMed

Comin, Matteo; Leoni, Andrea; Schimd, Michele

2015-01-01

The data volume generated by Next-Generation Sequencing (NGS) technologies is growing at a pace that is now challenging the storage and data processing capacities of modern computer systems. In this context an important aspect is the reduction of data complexity by collapsing redundant reads in a single cluster to improve the run time, memory requirements, and quality of post-processing steps like assembly and error correction. Several alignment-free measures, based on k-mer counts, have been used to cluster reads. Quality scores produced by NGS platforms are fundamental for various analyses of NGS data, such as read mapping and error detection. Moreover, future-generation sequencing platforms will produce long reads but with a large number of erroneous bases (up to 15%). In this scenario it will be fundamental to exploit quality value information within the alignment-free framework. To the best of our knowledge this is the first study that incorporates quality value information and k-mer counts, in the context of alignment-free measures, for the comparison of read data. Based on these principles, in this paper we present a family of alignment-free measures called D^q-type. A set of experiments on simulated and real read data confirms that the new measures are superior to other classical alignment-free statistics, especially when erroneous reads are considered. Results on de novo assembly and metagenomic read classification also show that the introduction of quality values improves over standard alignment-free measures. These statistics are implemented in software called QCluster (http://www.dei.unipd.it/~ciompin/main/qcluster.html).
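One way to picture combining quality values with k-mer counts is sketched below in Python. This is an assumption-laden illustration, not QCluster's implementation: the abstract does not give the weighting scheme, so the sketch assumes Phred+33 encoded qualities and weights each k-mer occurrence by the product of the probabilities that its bases were called correctly. The names `phred_to_correct_prob` and `quality_weighted_kmers` are hypothetical.

```python
from collections import defaultdict


def phred_to_correct_prob(q_char, offset=33):
    """Probability that a base call is correct, from one quality character.

    Assumes the standard Phred relation: error probability = 10^(-Q/10).
    """
    q = ord(q_char) - offset
    return 1.0 - 10 ** (-q / 10)


def quality_weighted_kmers(read, qual, k):
    """Sum, per k-mer, the probability that each occurrence is error-free.

    With perfect qualities this approaches the plain k-mer count;
    low-quality occurrences contribute less (assumed weighting scheme).
    """
    if len(read) != len(qual):
        raise ValueError("read and quality string must have equal length")
    probs = [phred_to_correct_prob(c) for c in qual]
    weights = defaultdict(float)
    for i in range(len(read) - k + 1):
        w = 1.0
        for p in probs[i:i + k]:
            w *= p
        weights[read[i:i + k]] += w
    return dict(weights)
```

The resulting weighted profiles could then be fed to any count-based dissimilarity in place of raw counts, so that likely sequencing errors have less influence on the clustering.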
Genomic Analysis of the Chicken Infectious Anemia Virus in a Specific Pathogen-Free Chicken Population in China

PubMed Central

Li, Yang; Wang, Yixin; Fang, Lichun; Fu, Jiayuan; Cui, Shuai; Zhao, Yingjie; Cui, Zhizhong; Chang, Shuang; Zhao, Peng

2016-01-01

An ELISA test in our previous inspection detected antibodies to chicken infectious anemia virus (CIAV) in a specific pathogen-free (SPF) chicken population, indicating a possible infection with CIAV. In this study, blood samples collected from the SPF chickens were used to isolate CIAV by inoculation into MSB1 cells and PCR amplification. A CIAV strain (SD1403) was isolated and successfully identified. Three overlapping genomic fragments were obtained by PCR amplification and sequencing. The full genome sequence of the SD1403 strain was obtained by aligning the sequences. The genome of the SD1403 strain was 2293 bp, with a nucleotide identity of 94.8% to 98.5% when compared with 30 reference CIAV strains. The viral proteins VP2 and VP3 were highly conserved, whereas VP1 was less conserved. Both amino acids 139 and 144 of VP1 were glutamine, which is in accord with low pathogenicity. This study is the first to report that CIAV exists in Chinese SPF chicken populations, which may be an important reason why attenuated vaccines can be contaminated with CIAV. PMID:27298822
Markov model plus k-word distributions: a synergy that produces novel statistical measures for sequence comparison.

PubMed

Dai, Qi; Yang, Yanchun; Wang, Tianming

2008-10-15

Many proposed statistical measures can efficiently compare biological sequences to further infer their structures, functions and evolutionary information. They are related in spirit because all the ideas for sequence comparison try to use the information on the k-word distributions, a Markov model, or both. Motivated by adding k-word distributions to the Markov model directly, we investigated two novel statistical measures for sequence comparison, called wre.k.r and S2.k.r. The proposed measures were tested by similarity search, evaluation on functionally related regulatory sequences and phylogenetic analysis. This offers a systematic and quantitative experimental assessment of our measures. Moreover, we compared our results with those of alignment-based and alignment-free methods. We grouped our experiments into two sets. The first one, performed via ROC (receiver operating characteristic) analysis, aims at assessing the intrinsic ability of our statistical measures to search for similar sequences in a database and to discriminate functionally related regulatory sequences from unrelated sequences. The second one aims at assessing how well our statistical measures perform in phylogenetic analysis. The experimental assessment demonstrates that our similarity measures, which incorporate k-word distributions into the Markov model, are more efficient.
Coordinating Multi-Rover Systems: Evaluation Functions for Dynamic and Noisy Environments

NASA Technical Reports Server (NTRS)

Tumer, Kagan; Agogino, Adrian

2005-01-01

This paper addresses the evolution of control strategies for a collective: a set of entities that collectively strives to maximize a global evaluation function that rates the performance of the full system. Directly addressing such problems by having a population of collectives and applying the evolutionary algorithm to that population is appealing, but the search space is prohibitively large in most cases. Instead, we focus on evolving control policies for each member of the collective. The fundamental issue in this approach is how to create an evaluation function for each member of the collective that is both aligned with the global evaluation function and is sensitive to the fitness changes of the member, while relatively insensitive to the fitness changes of other members. We show how to construct evaluation functions in dynamic, noisy and communication-limited collective environments. On a rover coordination problem, a control policy evolved using aligned and member-sensitive evaluations outperforms global evaluation methods by up to 400%. More notably, in the presence of a larger number of rovers or rovers with noisy and communication-limited sensors, the proposed method outperforms global evaluation by a higher percentage than in noise-free conditions with a small number of rovers.
Limitations of outsourcing on-the-ground biodiversity conservation.

PubMed

Iacona, Gwenllian D; Bode, Michael; Armsworth, Paul R

2016-12-01

To counteract global species decline, modern biodiversity conservation engages in large projects, spends billions of dollars, and includes many organizations working simultaneously within regions. To add to this complexity, the conservation sector has a hierarchical structure, where conservation actions are often outsourced by funders (foundations, government, etc.) to local organizations that work on-the-ground. In contrast, conservation science usually assumes that a single organization makes resource allocation decisions. This discrepancy calls for theory to understand how the expected biodiversity outcomes change when interactions between organizations are accounted for. Here, we used a game theoretic model to explore how biodiversity outcomes are affected by vertical and horizontal interactions between 3 conservation organizations: a funder that outsourced its actions and 2 local conservation organizations that work on-the-ground. Interactions between the organizations changed the spending decisions made by individual organizations, and thereby the magnitude and direction of the conservation benefits. We showed that funders would struggle to incentivize recipient organizations with set priorities to perform desired actions, even when they control substantial amounts of the funding and employ common contracting approaches to enhance outcomes. Instead, biodiversity outcomes depended on priority alignment across the organizations. Conservation outcomes for the funder were improved by strategic interactions when organizational priorities were well aligned, but decreased when priorities were misaligned. Meanwhile, local organizations had improved outcomes regardless of alignment due to additional funding in the system.
Given that conservation often involves the aggregate actions of multiple organizations with different objectives, strategic interactions between organizations need to be considered if we are to predict possible outcomes of conservation programs or costs of achieving conservation targets. © 2016 Society for Conservation Biology.

Evolutionary Creation: Moving beyond the Evolution versus Creation Debate

ERIC Educational Resources Information Center

Lamoureux, Denis O.

2010-01-01

Evolutionary creation offers a conservative Christian approach to evolution. It explores biblical faith and evolutionary science through a Two Divine Books model and proposes a complementary relationship between Scripture and science. The Book of God's Words discloses the spiritual character of the world, while the Book of God's Works reveals the…
K2 and K2*: efficient alignment-free sequence similarity measurement based on Kendall statistics.

PubMed

Lin, Jie; Adjeroh, Donald A; Jiang, Bing-Hua; Jiang, Yue

2018-05-15

Alignment-free sequence comparison methods can compute the pairwise similarity between a huge number of sequences much faster than sequence-alignment based methods. We propose a new non-parametric alignment-free sequence comparison method, called K2, based on the Kendall statistics. Compared to the other state-of-the-art alignment-free comparison methods, K2 demonstrates competitive performance in generating the phylogenetic tree, in evaluating functionally related regulatory sequences, and in computing the edit distance (similarity/dissimilarity) between sequences. Furthermore, the K2 approach is much faster than the other methods. An improved method, K2*, is also proposed, which is able to determine the appropriate algorithmic parameter (length) automatically, without first considering different values. Comparative analysis with the state-of-the-art alignment-free sequence similarity methods demonstrates the superiority of the proposed approaches, especially with increasing sequence length or increasing dataset size. The K2 and K2* approaches are implemented in the R language as a package that is freely available for open access (http://community.wvu.edu/daadjeroh/projects/K2/K2_1.0.tar.gz). Contact: yueljiang@163.com. Supplementary data are available at Bioinformatics online.
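The general idea of a rank-based, non-parametric comparison of k-mer profiles can be sketched in Python as follows. This is only an illustration of Kendall's rank correlation applied to k-mer count vectors; it is not the K2 or K2* definition from the R package, whose exact statistic is not given in the abstract. The helper names `kmer_vector` and `kendall_tau_a` are hypothetical, and the simple tau-a variant shown counts tied pairs as neither concordant nor discordant.

```python
from itertools import combinations, product


def kmer_vector(seq, k, alphabet="ACGT"):
    """Counts of all len(alphabet)**k possible k-mers, in a fixed order."""
    words = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(words, 0)
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip k-mers with symbols outside the alphabet
            counts[word] += 1
    return [counts[w] for w in words]


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two vectors of equal length >= 2")
    score = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        score += (sign > 0) - (sign < 0)
    n = len(x)
    return score / (n * (n - 1) / 2)
```

A value near 1 means the two sequences rank their k-mers by abundance in the same order. The quadratic pair loop is kept for clarity; a practical implementation would use an O(n log n) algorithm.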
Conservation genetics of high elevation five-needle white pines

Treesearch

Andrew D. Bower; Sierra C. McLane; Andrew Eckert; Stacy Jorgensen; Anna Schoettle; Sally Aitken

2011-01-01

Conservation genetics examines the biophysical factors influencing genetic processes and uses that information to conserve and maintain the evolutionary potential of species and populations. Here we review published and unpublished literature on the conservation genetics of seven North American high-elevation five-needle pines. Although these species are widely...
Radiation and the regulatory landscape of neo2-Darwinism.

PubMed

Rollo, C David

2006-05-11

Several recently revealed features of eukaryotic genomes were not predicted by earlier evolutionary paradigms, including the relatively small number of genes, the very large amounts of non-functional code and its quarantine in heterochromatin, the remarkable conservation of many functionally important genes across relatively enormous phylogenetic distances, and the prevalence of extra-genomic information associated with chromatin structure and histone proteins. All of these emphasize a paramount role for regulatory evolution, which is further reinforced by recent perspectives highlighting even higher-order regulation governing epigenetics and development (EVO-DEVO). Modern neo2-Darwinism, with its emphasis on regulatory mechanisms and regulatory evolution provides new vision for understanding radiation biology, particularly because free radicals and redox states are central to many regulatory mechanisms and free radicals generated by radiation mimic and amplify endogenous signalling. This paper explores some of these aspects and their implications for low-dose radiation biology.
A likelihood ratio test for evolutionary rate shifts and functional divergence among proteins

PubMed Central

Knudsen, Bjarne; Miyamoto, Michael M.

2001-01-01

Changes in protein function can lead to changes in the selection acting on specific residues. This can often be detected as evolutionary rate changes at the sites in question. A maximum-likelihood method for detecting evolutionary rate shifts at specific protein positions is presented. The method determines significance values of the rate differences to give a sound statistical foundation for the conclusions drawn from the analyses. A statistical test for detecting slowly evolving sites is also described. The methods are applied to a set of Myc proteins for the identification of both conserved sites and those with changing evolutionary rates. Those positions with conserved and changing rates are related to the structures and functions of their proteins. The results are compared with an earlier Bayesian method, thereby highlighting the advantages of the new likelihood ratio tests. PMID:11734650
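The generic form of a likelihood ratio test can be sketched in Python as below. This shows only the textbook statistic, twice the log-likelihood gain of the richer model, and is not the authors' implementation. The abstract does not state the reference distribution, so the chi-square distribution with one degree of freedom used here is an assumption, and both function names are hypothetical.

```python
from math import erfc, sqrt


def likelihood_ratio_statistic(loglik_null, loglik_alt):
    """2 * (ln L_alt - ln L_null), floored at zero for nested models."""
    return max(2.0 * (loglik_alt - loglik_null), 0.0)


def chi2_sf_df1(stat):
    """Upper-tail probability of a chi-square variable with 1 d.f.

    Uses the identity P(X > x) = erfc(sqrt(x / 2)) for one degree of
    freedom. Assumed reference distribution, not taken from the paper.
    """
    if stat < 0:
        raise ValueError("test statistic must be non-negative")
    return erfc(sqrt(stat / 2.0))
```

For a site where a single-rate model gives a log-likelihood of -105.0 and a two-rate model gives -100.0 (made-up numbers), the statistic is 10.0 and `chi2_sf_df1(10.0)` gives the corresponding p-value under the stated assumption.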
Differences in evolutionary pressure acting within highly conserved ortholog groups.

PubMed

Przytycka, Teresa M; Jothi, Raja; Aravind, L; Lipman, David J

2008-07-17

In highly conserved, widely distributed ortholog groups, the main evolutionary force is assumed to be purifying selection that enforces sequence conservation, with most divergence occurring by accumulation of neutral substitutions. Using a set of ortholog groups from prokaryotes, with a single representative in each studied organism, we asked whether this evolutionary pressure acts similarly on different subgroups of orthologs defined as major lineages (e.g. Proteobacteria or Firmicutes). Using correlations in entropy measures as a proxy for evolutionary pressure, we observed two distinct behaviors within our ortholog collection. The first subset of ortholog groups, called here informational, consisted mostly of proteins associated with information processing (i.e. translation, transcription, DNA replication) and the second, the non-informational ortholog groups, mostly comprised proteins involved in metabolic pathways. The evolutionary pressure acting on non-informational proteins is more uniform relative to their informational counterparts. The non-informational proteins show a higher level of correlation between entropy profiles and more uniformity across subgroups. The low correlation of entropy profiles in the informational ortholog groups suggests that the evolutionary pressure acting on the informational ortholog groups is not uniform across the different clades considered in this study. This might suggest "fine-tuning" of informational proteins in each lineage, leading to lineage-specific differences in selection. This, in turn, could make these proteins less exchangeable between lineages. In contrast, the uniformity of the selective pressure acting on the non-informational groups might allow the exchange of genetic material via lateral gene transfer.
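The use of correlated entropy profiles as a proxy for shared evolutionary pressure can be illustrated with a short Python sketch. It assumes per-column Shannon entropy and a Pearson correlation between the profiles of two lineage subgroups. The abstract does not specify the exact entropy measure or correlation used, so both choices and all function names are assumptions.

```python
from collections import Counter
from math import log2, sqrt


def column_entropy(column):
    """Shannon entropy (bits) of the residues in one alignment column."""
    n = len(column)
    return -sum((c / n) * log2(c / n) for c in Counter(column).values())


def entropy_profile(alignment):
    """Entropy of every column of an alignment given as equal-length rows.

    Gap characters are treated like any other symbol in this sketch.
    """
    return [column_entropy(col) for col in zip(*alignment)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation undefined for a constant profile")
    return cov / sqrt(vx * vy)
```

Splitting one ortholog alignment into the rows of two lineages and calling `pearson(entropy_profile(rows_a), entropy_profile(rows_b))` gives a single number per ortholog group; under the abstract's interpretation, high values indicate that the same positions are constrained in both lineages.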
Preserving the evolutionary potential of floras in biodiversity hotspots.

PubMed

Forest, Félix; Grenyer, Richard; Rouget, Mathieu; Davies, T Jonathan; Cowling, Richard M; Faith, Daniel P; Balmford, Andrew; Manning, John C; Procheş, Serban; van der Bank, Michelle; Reeves, Gail; Hedderson, Terry A J; Savolainen, Vincent

2007-02-15

One of the biggest challenges for conservation biology is to provide conservation planners with ways to prioritize effort. Much attention has been focused on biodiversity hotspots. However, the conservation of evolutionary process is now also acknowledged as a priority in the face of global change. Phylogenetic diversity (PD) is a biodiversity index that measures the length of evolutionary pathways that connect a given set of taxa. PD therefore identifies sets of taxa that maximize the accumulation of 'feature diversity'. Recent studies, however, concluded that taxon richness is a good surrogate for PD. Here we show taxon richness to be decoupled from PD, using a biome-wide phylogenetic analysis of the flora of an undisputed biodiversity hotspot--the Cape of South Africa. We demonstrate that this decoupling has real-world importance for conservation planning. Finally, using a database of medicinal and economic plant use, we demonstrate that PD protection is the best strategy for preserving feature diversity in the Cape. We should be able to use PD to identify those key regions that maximize future options, both for the continuing evolution of life on Earth and for the benefit of society.
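The definition of phylogenetic diversity given in this record, the length of the evolutionary pathways connecting a set of taxa, can be made concrete with a small Python sketch. It assumes the rooted variant in which the path to the root is included, and it represents the tree as a hypothetical mapping from each node to its parent and branch length. The toy tree and the function name are invented for illustration.

```python
def phylogenetic_diversity(parent, taxa):
    """Total branch length on the paths from the given taxa to the root.

    parent maps node -> (parent_node, branch_length); the root has no
    entry. Shared branches are counted once (assumed rooted PD).
    """
    counted = set()
    total = 0.0
    for taxon in taxa:
        if taxon not in parent:
            raise KeyError(f"unknown taxon: {taxon}")
        node = taxon
        # Walk towards the root until reaching it or an already counted branch.
        while node in parent and node not in counted:
            counted.add(node)
            node, length = parent[node][0], parent[node][1]
            total += length
    return total


# Toy tree (made-up lengths): ((A:1, B:1):2, C:3)
TOY_TREE = {
    "A": ("n1", 1.0),
    "B": ("n1", 1.0),
    "n1": ("root", 2.0),
    "C": ("root", 3.0),
}
```

In this toy tree the sets {A, B} and {A, C} have the same taxon richness but different PD, a miniature version of the decoupling the abstract reports for the Cape flora.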
I-HEDGE: determining the optimum complementary sets of taxa for conservation using evolutionary isolation

PubMed Central

Jensen, Evelyn L.; Mooers, Arne Ø.; Caccone, Adalgisa; Russello, Michael A.

2016-01-01

In the midst of the current biodiversity crisis, conservation efforts might profitably be directed towards ensuring that extinctions do not result in inordinate losses of evolutionary history. Numerous methods have been developed to evaluate the importance of species based on their contribution to total phylogenetic diversity on trees and networks, but existing methods fail to take complementarity into account, and thus cannot identify the best order or subset of taxa to protect. Here, we develop a novel iterative calculation of the heightened evolutionary distinctiveness and globally endangered metric (I-HEDGE) that produces the optimal ranked list for conservation prioritization, taking into account complementarity and based on both phylogenetic diversity and extinction probability. We applied this metric to a phylogenetic network based on mitochondrial control region data from extant and recently extinct giant Galápagos tortoises, a highly endangered group of closely related species. We found that the restoration of two extinct species (a project currently underway) will contribute the greatest gain in phylogenetic diversity, and present an ordered list of rankings that is the optimum complementarity set for conservation prioritization. PMID:27635324
Functionally essential, invariant glutamate near the C-terminus of strand beta 5 in various (alpha/beta)8-barrel enzymes as a possible indicator of their evolutionary relatedness.

PubMed

Janecek, S; Baláz, S

1995-08-01

Twelve different (alpha/beta)8-barrel enzymes belonging to three structurally distinct families were found to contain, near the C-terminus of their strand beta 5, a conserved invariant glutamic acid residue that plays an important functional role in each of these enzymes. The search was based on the idea that a conserved sequence region of an (alpha/beta)8-barrel enzyme should be more or less conserved also in the equivalent part of the structure of the other enzymes with this folding motif owing to their mutual evolutionary relatedness. For this purpose, the sequence region around the well conserved fifth beta-strand of alpha-amylase containing catalytic glutamate (Glu230, Aspergillus oryzae alpha-amylase numbering), was used as the sequence-structural template. The isolated sequence stretches of the 12 (alpha/beta)8-barrels are discussed from both the sequence-structural and the evolutionary point of view, the invariant glutamate residue being proposed to be a joining feature of the studied group of enzymes remaining from their ancestral (alpha/beta)8-barrel.
Biological intuition in alignment-free methods: response to Posada.

PubMed

Ragan, Mark A; Chan, Cheong Xin

2013-08-01

A recent editorial in Journal of Molecular Evolution highlights opportunities and challenges facing molecular evolution in the era of next-generation sequencing. Abundant sequence data should allow more-complex models to be fit at higher confidence, making phylogenetic inference more reliable and improving our understanding of evolution at the molecular level. However, concern that approaches based on multiple sequence alignment may be computationally infeasible for large datasets is driving the development of so-called alignment-free methods for sequence comparison and phylogenetic inference. The recent editorial characterized these approaches as model-free, not based on the concept of homology, and lacking in biological intuition. We argue here that alignment-free methods have not abandoned models or homology, and can be biologically intuitive.
Global patterns of evolutionary distinct and globally endangered amphibians and mammals.

PubMed

Safi, Kamran; Armour-Marshall, Katrina; Baillie, Jonathan E M; Isaac, Nick J B

2013-01-01

Conservation of phylogenetic diversity allows maximising evolutionary information preserved within fauna and flora. The "EDGE of Existence" programme is the first institutional conservation initiative that prioritises species based on phylogenetic information. Species are ranked in two ways: one according to their evolutionary distinctiveness (ED) and second, by including IUCN extinction status, their evolutionary distinctiveness and global endangerment (EDGE). Here, we describe the global patterns in the spatial distribution of priority ED and EDGE species, in order to identify conservation areas for mammalian and amphibian communities. In addition, we investigate whether environmental conditions can predict the observed spatial pattern in ED and EDGE globally. Priority zones with high concentrations of ED and EDGE scores were defined using two different methods. The overlap between mammal and amphibian zones was very small, reflecting the different phylo-biogeographic histories. Mammal ED zones were predominantly found on the African continent and the neotropical forests, whereas in amphibians, ED zones were concentrated in North America. Mammal EDGE zones were mainly in South-East Asia, southern Africa and Madagascar; for amphibians they were in central and south America. The spatial pattern of ED and EDGE was poorly described by a suite of environmental variables. Mapping the spatial distribution of ED and EDGE provides an important step towards identifying priority areas for the conservation of mammalian and amphibian phylogenetic diversity in the EDGE of existence programme.
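How an ED score and an IUCN status combine into an EDGE score can be sketched in Python. The formula used, EDGE = ln(1 + ED) + GE × ln 2 with GE running from 0 (Least Concern) to 4 (Critically Endangered), is the one commonly published for the EDGE of Existence programme; it is not stated in this abstract and is brought in here as an assumption. The species labels and ED values are made up.

```python
from math import log

# Assumed IUCN Red List weights (GE) used by the EDGE formula.
GE_WEIGHT = {"LC": 0, "NT": 1, "VU": 2, "EN": 3, "CR": 4}


def edge_score(ed, iucn_status):
    """EDGE = ln(1 + ED) + GE * ln(2); ED is in units of branch length."""
    if ed < 0:
        raise ValueError("ED must be non-negative")
    return log(1.0 + ed) + GE_WEIGHT[iucn_status] * log(2.0)


def rank_by_edge(species):
    """Sort (name, ed, iucn_status) tuples from highest to lowest EDGE."""
    return sorted(species, key=lambda s: edge_score(s[1], s[2]), reverse=True)


# Hypothetical input: a distinct but secure species versus a less
# distinct species that is critically endangered.
EXAMPLE = [("sp_distinct", 30.0, "LC"), ("sp_threatened", 10.0, "CR")]
```

Each step up in threat category doubles the implied extinction risk, which is why `sp_threatened` outranks `sp_distinct` in `rank_by_edge(EXAMPLE)` despite having the lower ED.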
JDet: interactive calculation and visualization of function-related conservation patterns in multiple sequence alignments and structures.

PubMed

Muth, Thilo; García-Martín, Juan A; Rausell, Antonio; Juan, David; Valencia, Alfonso; Pazos, Florencio

2012-02-15

We have implemented in a single package all the features required for extracting, visualizing and manipulating fully conserved positions as well as those with a family-dependent conservation pattern in multiple sequence alignments. Among other things, the program allows users to run different methods for extracting these positions, combine the results and visualize them in protein 3D structures and sequence spaces. JDet is a multiplatform application written in Java. It is freely available, including the source code, at http://csbg.cnb.csic.es/JDet. The package includes two of our recently developed programs for detecting functional positions in protein alignments (Xdet and S3Det), and support for other methods can be added as plug-ins. A help file and a guided tutorial for JDet are also available.
Investigation of the two-quasiparticle bands in the doubly-odd nucleus 166Ta using a particle-number conserving cranked shell model

NASA Astrophysics Data System (ADS)

Zhang, ZhenHua

2016-07-01

The high-spin rotational properties of two-quasiparticle bands in the doubly-odd 166Ta are analyzed using the cranked shell model with pairing correlations treated by a particle-number conserving method, in which the blocking effects are taken into account exactly. The experimental moments of inertia and alignments and their variations with the rotational frequency ħω are reproduced very well by the particle-number conserving calculations, which provides reliable support for the configuration assignments in previous works for these bands. The backbendings in these two-quasiparticle bands are analyzed by the calculated occupation probabilities and the contributions of each orbital to the total angular momentum alignments. The moments of inertia and alignments for the Gallagher-Moszkowski partners of these observed two-quasiparticle rotational bands are also predicted.
Information theory applications for biological sequence analysis.

PubMed

Vinga, Susana

2014-05-01

Information theory (IT) addresses the analysis of communication systems and has been widely applied in molecular biology. In particular, alignment-free sequence analysis and comparison greatly benefited from concepts derived from IT, such as entropy and mutual information. This review covers several aspects of IT applications, ranging from genome global analysis and comparison, including block-entropy estimation and resolution-free metrics based on iterative maps, to local analysis, comprising the classification of motifs, prediction of transcription factor binding sites and sequence characterization based on linguistic complexity and entropic profiles. IT has also been applied to high-level correlations that combine DNA, RNA or protein features with sequence-independent properties, such as gene mapping and phenotype analysis, and has also provided models based on communication systems theory to describe information transmission channels at the cell level and also during evolutionary processes. While not exhaustive, this review attempts to categorize existing methods and to indicate their relation with broader transversal topics such as genomic signatures, data compression and complexity, time series analysis and phylogenetic classification, providing a resource for future developments in this promising area.
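As a rough illustration of the entropy-based, alignment-free comparison this review describes, the sketch below computes the block entropy of a sequence from its k-mer counts and a Jensen-Shannon divergence between the k-mer distributions of two sequences. The function names and the choice of Jensen-Shannon divergence as the comparison measure are assumptions made for this example; they are not taken from the review.

```python
import math
from collections import Counter


def kmer_counts(seq, k):
    """Count overlapping k-mers (words of length k) in a sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def shannon_entropy(weights):
    """Shannon entropy in bits of the distribution given by non-negative weights."""
    total = sum(weights.values())
    return -sum((w / total) * math.log2(w / total) for w in weights.values() if w > 0)


def block_entropy(seq, k):
    """Entropy of the empirical k-mer distribution of seq (hypothetical helper)."""
    return shannon_entropy(kmer_counts(seq, k))


def jensen_shannon(seq_a, seq_b, k):
    """Jensen-Shannon divergence (bits) between the k-mer distributions of two sequences."""
    ca, cb = kmer_counts(seq_a, k), kmer_counts(seq_b, k)
    na, nb = sum(ca.values()), sum(cb.values())
    # Equal-weight mixture of the two empirical distributions.
    mix = {w: 0.5 * ca[w] / na + 0.5 * cb[w] / nb for w in set(ca) | set(cb)}
    return shannon_entropy(mix) - 0.5 * shannon_entropy(ca) - 0.5 * shannon_entropy(cb)
```

No alignment is computed at any point: two sequences are compared only through their word statistics, which is what makes measures of this kind fast on whole genomes.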
NetCoffee: a fast and accurate global alignment approach to identify functionally conserved proteins in multiple networks.

PubMed

Hu, Jialu; Kehr, Birte; Reinert, Knut

2014-02-15

Owing to recent advancements in high-throughput technologies, protein-protein interaction networks of more and more species are becoming available in public databases. The question of how to identify functionally conserved proteins across species attracts a lot of attention in computational biology. Network alignments provide a systematic way to solve this problem. However, most existing alignment tools encounter limitations in tackling this problem. Therefore, the demand for faster and more efficient alignment tools is growing. We present a fast and accurate algorithm, NetCoffee, which finds a global alignment of multiple protein-protein interaction networks. NetCoffee searches for a global alignment by maximizing a target function using simulated annealing on a set of weighted bipartite graphs that are constructed using a triplet approach similar to T-Coffee. To assess its performance, NetCoffee was applied to four real datasets. Our results suggest that NetCoffee remedies several limitations of previous algorithms, outperforms all existing alignment tools in terms of speed and nevertheless identifies biologically meaningful alignments. The source code and data are freely available for download under the GNU GPL v3 license at https://code.google.com/p/netcoffee/.
An experimental and computational evolution-based method to study a mode of co-evolution of overlapping open reading frames in the AAV2 viral genome.

PubMed

Kawano, Yasuhiro; Neeley, Shane; Adachi, Kei; Nakai, Hiroyuki

2013-01-01

Overlapping open reading frames (ORFs) in viral genomes undergo co-evolution; however, how individual amino acids coded by overlapping ORFs are structurally, functionally, and co-evolutionarily constrained remains difficult to address by conventional homologous sequence alignment approaches. We report here a new experimental and computational evolution-based methodology to address this question and report its preliminary application to elucidating a mode of co-evolution of the frame-shifted overlapping ORFs in the adeno-associated virus (AAV) serotype 2 viral genome. These ORFs encode both capsid VP protein and non-structural assembly-activating protein (AAP). To show proof of principle of the new method, we focused on the evolutionarily conserved QVKEVTQ and KSKRSRR motifs, a pair of overlapping heptapeptides in VP and AAP, respectively. In the new method, we first identified a large number of capsid-forming VP3 mutants and functionally competent AAP mutants of these motifs from mutant libraries by experimental directed evolution under no co-evolutionary constraints. We used Illumina sequencing to obtain a large dataset and then statistically assessed the viability of VP and AAP heptapeptide mutants. The obtained heptapeptide information was then integrated into an evolutionary algorithm, with which VP and AAP were co-evolved from random or native nucleotide sequences in silico. As a result, we demonstrate that these two heptapeptide motifs could exhibit high degeneracy if coded by separate nucleotide sequences, and elucidate how overlap-evoked co-evolutionary constraints play a role in making the VP and AAP heptapeptide sequences into the present shape. 
Specifically, we demonstrate that two valine (V) residues and β-strand propensity in QVKEVTQ are structurally important, the strongly basic and hydrophilic nature of KSKRSRR is functionally important, and overlap-evoked co-evolution imposes strong constraints on serine (S) residues in KSKRSRR, despite high degeneracy of the motifs in the absence of co-evolutionary constraints.
Evolution of aminoacyl-tRNA synthetases--analysis of unique domain architectures and phylogenetic trees reveals a complex history of horizontal gene transfer events.

PubMed

Wolf, Y I; Aravind, L; Grishin, N V; Koonin, E V

1999-08-01

Phylogenetic analysis of aminoacyl-tRNA synthetases (aaRSs) of all 20 specificities from completely sequenced bacterial, archaeal, and eukaryotic genomes reveals a complex evolutionary picture. Detailed examination of the domain architecture of aaRSs using sequence profile searches delineated a network of partially conserved domains that is even more elaborate than previously suspected. Several unexpected evolutionary connections were identified, including the apparent origin of the beta-subunit of bacterial GlyRS from the HD superfamily of hydrolases, a domain shared by bacterial AspRS and the B subunit of archaeal glutamyl-tRNA amidotransferases, and another previously undetected domain that is conserved in a subset of ThrRS, guanosine polyphosphate hydrolases and synthetases, and a family of GTPases. Comparison of domain architectures and multiple alignments resulted in the delineation of synapomorphies (shared derived characters, such as extra domains or inserts) for most of the aaRSs specificities. These synapomorphies partition sets of aaRSs with the same specificity into two or more distinct and apparently monophyletic groups. In conjunction with cluster analysis and a modification of the midpoint-rooting procedure, this partitioning was used to infer the likely root position in phylogenetic trees. The topologies of the resulting rooted trees for most of the aaRSs specificities are compatible with the evolutionary "standard model" whereby the earliest radiation event separated bacteria from the common ancestor of archaea and eukaryotes as opposed to the two other possible evolutionary scenarios for the three major divisions of life. For almost all aaRSs specificities, however, this simple scheme is confounded by displacement of some of the bacterial aaRSs by their eukaryotic or, less frequently, archaeal counterparts. Displacement of ancestral eukaryotic aaRS genes by bacterial ones, presumably of mitochondrial origin, was observed for three aaRSs. 
In contrast, there was no convincing evidence of displacement of archaeal aaRSs by bacterial ones. Displacement of aaRS genes by eukaryotic counterparts is most common among parasitic and symbiotic bacteria, particularly the spirochaetes, in which 10 of the 19 aaRSs seem to have been displaced by the respective eukaryotic genes and two by the archaeal counterpart. Unlike the primary radiation events between the three main divisions of life, that were readily traceable through the phylogenetic analysis of aaRSs, no consistent large-scale bacterial phylogeny could be established. In part, this may be due to additional gene displacement events among bacterial lineages. Argument is presented that, although lineage-specific gene loss might have contributed to the evolution of some of the aaRSs, this is not a viable alternative to horizontal gene transfer as the principal evolutionary phenomenon in this gene class.
Neutral Theory is the Foundation of Conservation Genetics.

PubMed

Yoder, Anne D; Poelstra, Jelmer; Tiley, George P; Williams, Rachel

2018-04-16

Kimura's neutral theory of molecular evolution has been essential to virtually every advance in evolutionary genetics, and by extension, is foundational to the field of conservation genetics. Conservation genetics utilizes the key concepts of neutral theory to identify species and populations at risk of losing evolutionary potential by detecting patterns of inbreeding depression and low effective population size. In turn, this information can inform the management of organisms and their habitat providing hope for the long-term preservation of both. We expand upon Avise's "inventorial" and "functional" categories of conservation genetics by proposing a third category that is linked to the coalescent and that we refer to as "process-driven." It is here that connections between Kimura's theory and conservation genetics are strongest. Process-driven conservation genetics can be especially applied to large genomic datasets to identify patterns of historical risk, such as population bottlenecks, and accordingly, yield informed intuitions for future outcomes. By examining inventorial, functional, and process-driven conservation genetics in sequence, we assess the progression from theory, to data collection and analysis, and ultimately, to the production of hypotheses that can inform conservation policies.
Genome-wide comparative analysis reveals human-mouse regulatory landscape and evolution.

PubMed

Denas, Olgert; Sandstrom, Richard; Cheng, Yong; Beal, Kathryn; Herrero, Javier; Hardison, Ross C; Taylor, James

2015-02-14

Because species-specific gene expression is driven by species-specific regulation, understanding the relationship between sequence and function of the regulatory regions in different species will help elucidate how differences among species arise. Despite active experimental and computational research, relationships among sequence, conservation, and function are still poorly understood. We compared transcription factor occupied segments (TFos) for 116 human and 35 mouse TFs in 546 human and 125 mouse cell types and tissues from the Human and the Mouse ENCODE projects. We based the map between human and mouse TFos on a one-to-one nucleotide cross-species mapper, bnMapper, that utilizes whole genome alignments (WGA). Our analysis shows that TFos are under evolutionary constraint, but a substantial portion (25.1% of mouse and 25.85% of human on average) of the TFos does not have a homologous sequence on the other species; this portion varies among cell types and TFs. Furthermore, 47.67% and 57.01% of the homologous TFos sequence shows binding activity on the other species for human and mouse respectively. However, 79.87% and 69.22% is repurposed such that it binds the same TF in different cells or different TFs in the same cells. Remarkably, within the set of repurposed TFos, the corresponding genome regions in the other species are preferred locations of novel TFos. These events suggest exaptation of some functional regulatory sequences into new function. Despite TFos repurposing, we did not find substantial changes in their predicted target genes, suggesting that CRMs buffer evolutionary events allowing little or no change in the TFos - target gene associations. Thus, the small portion of TFos with strictly conserved occupancy underestimates the degree of conservation of regulatory interactions. We mapped regulatory sequences from an extensive number of TFs and cell types between human and mouse using WGA. 
A comparative analysis of this correspondence unveiled the extent of the shared regulatory sequence across TFs and cell types under study. Importantly, a large part of the shared regulatory sequence is repurposed on the other species. This sequence, fueled by turnover events, provides a strong case for exaptation in regulatory elements.

Genetic and structural analyses of cytochrome P450 hydroxylases in sex hormone biosynthesis: Sequential origin and subsequent coevolution.

PubMed

Goldstone, Jared V; Sundaramoorthy, Munirathinam; Zhao, Bin; Waterman, Michael R; Stegeman, John J; Lamb, David C

2016-01-01

Biosynthesis of steroid hormones in vertebrates involves three cytochrome P450 hydroxylases, CYP11A1, CYP17A1 and CYP19A1, which catalyze sequential steps in steroidogenesis. These enzymes are conserved in the vertebrates, but their origin and existence in other chordate subphyla (Tunicata and Cephalochordata) have not been clearly established. In this study, selected protein sequences of CYP11A1, CYP17A1 and CYP19A1 were compiled and analyzed using multiple sequence alignment and phylogenetic analysis. Our analyses show that cephalochordates have sequences orthologous to vertebrate CYP11A1, CYP17A1 or CYP19A1, and that echinoderms and hemichordates possess CYP11-like but not CYP19 genes. While the cephalochordate sequences have low identity with the vertebrate sequences, reflecting evolutionary distance, the data show apparent origin of CYP11 prior to the evolution of CYP19 and possibly CYP17, thus indicating a sequential origin of these functionally related steroidogenic CYPs. Co-occurrence of the three CYPs in early chordates suggests that the three genes may have coevolved thereafter, and that functional conservation should be reflected in functionally important residues in the proteins. CYP19A1 has the largest number of conserved residues while CYP11A1 sequences are less conserved. Structural analyses of human CYP11A1, CYP17A1 and CYP19A1 show that critical substrate binding site residues are highly conserved in each enzyme family. The results emphasize that the steroidogenic pathways producing glucocorticoids and reproductive steroids are several hundred million years old and that the catalytic structural elements of the enzymes have been conserved over the same period of time. Analysis of these elements may help to identify when precursor functions linked to these enzymes first arose. Copyright © 2015 Elsevier Inc. All rights reserved.
Free energy minimization to predict RNA secondary structures and computational RNA design.

PubMed

Churkin, Alexander; Weinbrand, Lina; Barash, Danny

2015-01-01

Determining the RNA secondary structure from sequence data by computational predictions is a long-standing problem. Its solution has been approached in two distinctive ways. If a multiple sequence alignment of a collection of homologous sequences is available, the comparative method uses phylogeny to determine conserved base pairs that are more likely to form as a result of billions of years of evolution than by chance. In the case of single sequences, recursive algorithms that compute free energy structures by using empirically derived energy parameters have been developed. This latter approach of RNA folding prediction by energy minimization is widely used to predict RNA secondary structure from sequence. For a significant number of RNA molecules, the secondary structure of the RNA molecule is indicative of its function and its computational prediction by minimizing its free energy is important for its functional analysis. A general method for free energy minimization to predict RNA secondary structures is dynamic programming, although other optimization methods have been developed as well along with empirically derived energy parameters. In this chapter, we introduce and illustrate by examples the approach of free energy minimization to predict RNA secondary structures.
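To make the dynamic-programming idea concrete, here is a minimal sketch of the recursion in its simplest form. It is a deliberate simplification and not the method of the chapter: instead of minimizing free energy with empirically derived energy parameters, it maximizes the number of base pairs (the classic Nussinov-style scoring), which keeps the same table-filling structure. The function name, the allowed pairs (Watson-Crick plus G-U wobble), and the minimum hairpin loop length of 3 are assumptions chosen for illustration.

```python
def nussinov(seq, min_loop=3):
    """Return (max_pairs, dot_bracket) for seq under a simple pair-counting score.

    Simplified stand-in for free energy minimization: every allowed pair
    scores 1, and at least min_loop unpaired bases must separate a pair.
    """
    pairs = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0, ""
    # best[i][j] = maximum number of pairs within seq[i..j] (inclusive).
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case 1: base j is unpaired
            for k in range(i, j - min_loop):  # case 2: base j pairs with base k
                if (seq[k], seq[j]) in pairs:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score

    # Traceback: recover one optimal structure in dot-bracket notation.
    structure = ["."] * n
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue  # too short to contain any pair
        if best[i][j] == best[i][j - 1]:
            stack.append((i, j - 1))
            continue
        for k in range(i, j - min_loop):
            if (seq[k], seq[j]) in pairs:
                left = best[i][k - 1] if k > i else 0
                if left + 1 + best[k + 1][j - 1] == best[i][j]:
                    structure[k], structure[j] = "(", ")"
                    if k > i:
                        stack.append((i, k - 1))
                    stack.append((k + 1, j - 1))
                    break
    return best[0][n - 1], "".join(structure)
```

Energy-based folding replaces the "+1 per pair" score with loop-dependent energy terms and minimizes instead of maximizing, but the fill order (short subsequences first) and the traceback follow the same pattern.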
Evolutionary refugia and ecological refuges: key concepts for conserving Australian arid zone freshwater biodiversity under climate change

PubMed Central

Davis, Jenny; Pavlova, Alexandra; Thompson, Ross; Sunnucks, Paul

2013-01-01

Refugia have been suggested as priority sites for conservation under climate change because of their ability to facilitate survival of biota under adverse conditions. Here, we review the likely role of refugial habitats in conserving freshwater biota in arid Australian aquatic systems where the major long-term climatic influence has been aridification. We introduce a conceptual model that characterizes evolutionary refugia and ecological refuges based on our review of the attributes of aquatic habitats and freshwater taxa (fishes and aquatic invertebrates) in arid Australia. We also identify methods of recognizing likely future refugia and approaches to assessing the vulnerability of arid-adapted freshwater biota to a warming and drying climate. Evolutionary refugia in arid areas are characterized as permanent, groundwater-dependent habitats (subterranean aquifers and springs) supporting vicariant relicts and short-range endemics. Ecological refuges can vary across space and time, depending on the dispersal abilities of aquatic taxa and the geographical proximity and hydrological connectivity of aquatic habitats. The most important are the perennial waterbodies (both groundwater and surface water fed) that support obligate aquatic organisms. These species will persist where suitable habitats are available and dispersal pathways are maintained. For very mobile species (invertebrates with an aerial dispersal phase) evolutionary refugia may also act as ecological refuges. Evolutionary refugia are likely future refugia because their water source (groundwater) is decoupled from local precipitation. However, their biota is extremely vulnerable to changes in local conditions because population extinction risks cannot be abated by the dispersal of individuals from other sites. Conservation planning must incorporate a high level of protection for aquifers that support refugial sites. 
Ecological refuges are vulnerable to changes in regional climate because they have little thermal or hydrological buffering. Accordingly, conservation planning must focus on maintaining meta-population processes, especially through dynamic connectivity between aquatic habitats at a landscape scale. PMID:23526791
Evolutionary refugia and ecological refuges: key concepts for conserving Australian arid zone freshwater biodiversity under climate change.

PubMed

Davis, Jenny; Pavlova, Alexandra; Thompson, Ross; Sunnucks, Paul

2013-07-01

Refugia have been suggested as priority sites for conservation under climate change because of their ability to facilitate survival of biota under adverse conditions. Here, we review the likely role of refugial habitats in conserving freshwater biota in arid Australian aquatic systems where the major long-term climatic influence has been aridification. We introduce a conceptual model that characterizes evolutionary refugia and ecological refuges based on our review of the attributes of aquatic habitats and freshwater taxa (fishes and aquatic invertebrates) in arid Australia. We also identify methods of recognizing likely future refugia and approaches to assessing the vulnerability of arid-adapted freshwater biota to a warming and drying climate. Evolutionary refugia in arid areas are characterized as permanent, groundwater-dependent habitats (subterranean aquifers and springs) supporting vicariant relicts and short-range endemics. Ecological refuges can vary across space and time, depending on the dispersal abilities of aquatic taxa and the geographical proximity and hydrological connectivity of aquatic habitats. The most important are the perennial waterbodies (both groundwater and surface water fed) that support obligate aquatic organisms. These species will persist where suitable habitats are available and dispersal pathways are maintained. For very mobile species (invertebrates with an aerial dispersal phase) evolutionary refugia may also act as ecological refuges. Evolutionary refugia are likely future refugia because their water source (groundwater) is decoupled from local precipitation. However, their biota is extremely vulnerable to changes in local conditions because population extinction risks cannot be abated by the dispersal of individuals from other sites. Conservation planning must incorporate a high level of protection for aquifers that support refugial sites. 
Ecological refuges are vulnerable to changes in regional climate because they have little thermal or hydrological buffering. Accordingly, conservation planning must focus on maintaining meta-population processes, especially through dynamic connectivity between aquatic habitats at a landscape scale. © 2013 Blackwell Publishing Ltd.
High-precision morphology: bifocal 4D-microscopy enables the comparison of detailed cell lineages of two chordate species separated for more than 525 million years.

PubMed

Stach, Thomas; Anselmi, Chiara

2015-12-23

Understanding the evolution of divergent developmental trajectories requires detailed comparisons of embryologies at appropriate levels. Cell lineages, the accurate visualization of cleavage patterns, tissue fate restrictions, and morphogenetic movements that occur during the development of individual embryos are currently available for few disparate animal taxa, encumbering evolutionarily meaningful comparisons. Tunicates, considered to be close relatives of vertebrates, are marine invertebrates whose fossil record dates back to 525 million years ago. Life-history strategies across this subphylum are radically different, and include biphasic ascidians with free swimming larvae and a sessile adult stage, and the holoplanktonic larvaceans. Despite considerable progress, notably on the molecular level, the exact extent of evolutionary conservation and innovation during embryology remains obscure. Here, using the innovative technique of bifocal 4D-microscopy, we demonstrate exactly which characteristics in the cell lineages of the ascidian Phallusia mammillata and the larvacean Oikopleura dioica were conserved and which were altered during evolution. Our accurate cell lineage trees in combination with detailed three-dimensional representations clearly identify conserved correspondence in relative cell position, cell identity, and fate restriction in several lines from all prospective larval tissues. At the same time, we precisely pinpoint differences observable at all levels of development. These differences comprise fate restrictions, tissue types, complex morphogenetic movement patterns, numerous cases of heterochronous acceleration in the larvacean embryo, and differences in bilateral symmetry. Our results demonstrate in extraordinary detail the multitude of developmental levels amenable to evolutionary innovation, including subtle changes in the timing of fate restrictions as well as dramatic alterations in complex morphogenetic movements. 
We anticipate that the precise spatial and temporal cell lineage data will moreover serve as a high-precision guide to devise experimental investigations of other levels, such as molecular interactions between cells or changes in gene expression underlying the documented structural evolutionary changes. Finally, the quantitative amount of digital high-precision morphological data will enable and necessitate software-based similarity assessments as the basis of homology hypotheses.
Fast and accurate phylogeny reconstruction using filtered spaced-word matches.

PubMed

Leimeister, Chris-André; Sohrabi-Jahromi, Salma; Morgenstern, Burkhard

2017-04-01

Word-based or 'alignment-free' algorithms are increasingly used for phylogeny reconstruction and genome comparison, since they are much faster than traditional approaches that are based on full sequence alignments. Existing alignment-free programs, however, are less accurate than alignment-based methods. We propose Filtered Spaced Word Matches (FSWM), a fast alignment-free approach to estimate phylogenetic distances between large genomic sequences. For a pre-defined binary pattern of match and don't-care positions, FSWM rapidly identifies spaced-word matches between input sequences, i.e. gap-free local alignments with matching nucleotides at the match positions and with mismatches allowed at the don't-care positions. We then estimate the number of nucleotide substitutions per site by considering the nucleotides aligned at the don't-care positions of the identified spaced-word matches. To reduce the noise from spurious random matches, we use a filtering procedure where we discard all spaced-word matches for which the overall similarity between the aligned segments is below a threshold. We show that our approach can accurately estimate substitution frequencies even for distantly related sequences that cannot be analyzed with existing alignment-free methods; phylogenetic trees constructed with FSWM distances are of high quality. A program run on a pair of eukaryotic genomes of a few hundred Mb each takes a few minutes. The program source code for FSWM, including documentation, as well as the software that we used to generate artificial genome sequences are freely available at http://fswm.gobics.de/. Contact: chris.leimeister@stud.uni-goettingen.de. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press.
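The spaced-word idea in this abstract can be sketched in a few lines. The code below is an illustrative toy, not the FSWM program: all function names are hypothetical, the filter is a plain identity threshold over the whole window (the abstract only says "overall similarity ... below a threshold"), and the Jukes-Cantor correction used to turn the mismatch fraction into substitutions per site is an assumption of this example.

```python
import math
from collections import defaultdict


def spaced_word(seq, start, pattern):
    """Characters of seq at the match ('1') positions of pattern, from start."""
    return "".join(seq[start + i] for i, p in enumerate(pattern) if p == "1")


def spaced_word_matches(seq_a, seq_b, pattern):
    """Yield (pos_a, pos_b) for windows that agree at every match position."""
    width = len(pattern)
    index = defaultdict(list)
    for i in range(len(seq_a) - width + 1):
        index[spaced_word(seq_a, i, pattern)].append(i)
    for j in range(len(seq_b) - width + 1):
        for i in index.get(spaced_word(seq_b, j, pattern), []):
            yield i, j


def estimate_distance(seq_a, seq_b, pattern, min_identity=0.0):
    """Estimate substitutions per site from the don't-care ('0') positions.

    Matches whose window identity is below min_identity are discarded
    (a simplified stand-in for the filtering step). Returns None when no
    don't-care position was compared.
    """
    dont_care = [i for i, p in enumerate(pattern) if p == "0"]
    mismatches = compared = 0
    for i, j in spaced_word_matches(seq_a, seq_b, pattern):
        diffs = sum(seq_a[i + d] != seq_b[j + d] for d in dont_care)
        # Match positions agree by construction, so only don't-care
        # positions can lower the identity of the window.
        if 1 - diffs / len(pattern) < min_identity:
            continue
        mismatches += diffs
        compared += len(dont_care)
    if compared == 0:
        return None
    p = mismatches / compared
    if p >= 0.75:
        return math.inf  # Jukes-Cantor correction is undefined here
    return -0.75 * math.log(1 - 4 * p / 3)  # assumed Jukes-Cantor correction
```

For example, with the pattern "1011" the sequences "ACGTAC" and "ATGTAC" share two spaced-word matches, at positions (0, 0) and (2, 2); one of the two compared don't-care positions differs, giving a mismatch fraction of 0.5 before correction.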
Interferon regulatory factor 10 (IRF10): Cloning in orange spotted grouper, Epinephelus coioides, and evolutionary analysis in vertebrates.

PubMed

Huang, Bei; Jia, Qin Qin; Liang, Ying; Huang, Wen Shu; Nie, P

2015-10-01

IRF10 gene was cloned in orange spotted grouper, Epinephelus coioides, and its expression was examined following poly(I:C) stimulation and bacterial infection. The cDNA sequence of grouper IRF10 contains an open reading frame of 1197 bp, flanked by 99 bp 5'-untranslated region and 480 bp 3'-untranslated region. Multiple alignments showed that the grouper IRF10 has a highly conserved DNA binding domain in the N terminus with characteristic motif containing five tryptophan residues. Quantitative real-time PCR analysis revealed that the expression of IRF10 was responsive to both poly(I:C) stimulation and Vibrio parahemolyticus infection, with a higher increase to poly(I:C), indicating an important role of IRF10 in host immune response during infection. A phyletic distribution of IRF members was also examined in vertebrates, and IRF10 was found in most lineages of vertebrates, but not in modern primates and rodents. It is suggested that the first divergence of IRF members might have occurred before the evolutionary split of vertebrate and cephalochordates, producing ancestors of IRF (1/2/11) and IRF (4/8/9/10)[(3/7) (5/6)], and that the second and/or third divergence of IRF members occurred following the split, thus leading to the subsets of the IRF family in vertebrates. Copyright © 2015 Elsevier Ltd. All rights reserved.
Comparative transcriptome analysis reveals vertebrate phylotypic period during organogenesis

PubMed Central

Irie, Naoki; Kuratani, Shigeru

2011-01-01

One of the central issues in evolutionary developmental biology is how we can formulate the relationships between evolutionary and developmental processes. Two major models have been proposed: the 'funnel-like' model, in which the earliest embryo shows the most conserved morphological pattern, followed by diversifying later stages, and the 'hourglass' model, in which constraints are imposed to conserve organogenesis stages, which is called the phylotypic period. Here we perform a quantitative comparative transcriptome analysis of several model vertebrate embryos and show that the pharyngula stage is most conserved, whereas earlier and later stages are rather divergent. These results allow us to predict approximate developmental timetables between different species, and indicate that pharyngula embryos have the most conserved gene expression profiles, which may be the source of the basic body plan of vertebrates. PMID:21427719
Mouse regulatory DNA landscapes reveal global principles of cis-regulatory evolution.

PubMed

Vierstra, Jeff; Rynes, Eric; Sandstrom, Richard; Zhang, Miaohua; Canfield, Theresa; Hansen, R Scott; Stehling-Sun, Sandra; Sabo, Peter J; Byron, Rachel; Humbert, Richard; Thurman, Robert E; Johnson, Audra K; Vong, Shinny; Lee, Kristen; Bates, Daniel; Neri, Fidencio; Diegel, Morgan; Giste, Erika; Haugen, Eric; Dunn, Douglas; Wilken, Matthew S; Josefowicz, Steven; Samstein, Robert; Chang, Kai-Hsin; Eichler, Evan E; De Bruijn, Marella; Reh, Thomas A; Skoultchi, Arthur; Rudensky, Alexander; Orkin, Stuart H; Papayannopoulou, Thalia; Treuting, Piper M; Selleri, Licia; Kaul, Rajinder; Groudine, Mark; Bender, M A; Stamatoyannopoulos, John A

2014-11-21

To study the evolutionary dynamics of regulatory DNA, we mapped >1.3 million deoxyribonuclease I-hypersensitive sites (DHSs) in 45 mouse cell and tissue types, and systematically compared these with human DHS maps from orthologous compartments. We found that the mouse and human genomes have undergone extensive cis-regulatory rewiring that combines branch-specific evolutionary innovation and loss with widespread repurposing of conserved DHSs to alternative cell fates, and that this process is mediated by turnover of transcription factor (TF) recognition elements. Despite pervasive evolutionary remodeling of the location and content of individual cis-regulatory regions, within orthologous mouse and human cell types the global fraction of regulatory DNA bases encoding recognition sites for each TF has been strictly conserved. Our findings provide new insights into the evolutionary forces shaping mammalian regulatory DNA landscapes. Copyright © 2014, American Association for the Advancement of Science.
CodonLogo: a sequence logo-based viewer for codon patterns.

PubMed

Sharma, Virag; Murphy, David P; Provan, Gregory; Baranov, Pavel V

2012-07-15

Conserved patterns across a multiple sequence alignment can be visualized by generating sequence logos. Sequence logos show each column in the alignment as stacks of symbol(s) where the height of a stack is proportional to its informational content, whereas the height of each symbol within the stack is proportional to its frequency in the column. Sequence logos use symbols of either nucleotide or amino acid alphabets. However, certain regulatory signals in messenger RNA (mRNA) act as combinations of codons. Yet no tool is available for visualization of conserved codon patterns. We present the first application which allows visualization of conserved regions in a multiple sequence alignment in the context of codons. CodonLogo is based on WebLogo3 and uses the same heuristics but treats codons as inseparable units of a 64-letter alphabet. CodonLogo can discriminate patterns of codon conservation from patterns of nucleotide conservation that appear indistinguishable in standard sequence logos. The CodonLogo source code and its implementation (in a local version of the Galaxy Browser) are available at http://recode.ucc.ie/CodonLogo and through the Galaxy Tool Shed at http://toolshed.g2.bx.psu.edu/.
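The two height rules stated above (stack height proportional to information content, symbol height proportional to frequency) can be written down directly. The sketch below is an assumption-laden illustration rather than CodonLogo or WebLogo code: the function names are hypothetical, gaps are not handled, and no small-sample correction is applied. Treating codons as units of a 64-letter alphabet simply means passing columns of triplets with `alphabet_size=64`.

```python
import math
from collections import Counter


def column_information(column, alphabet_size=4):
    """Information content in bits: log2(alphabet_size) minus Shannon entropy."""
    total = len(column)
    entropy = -sum((c / total) * math.log2(c / total) for c in Counter(column).values())
    return math.log2(alphabet_size) - entropy


def logo_heights(column, alphabet_size=4):
    """Height of each symbol: its frequency times the column's information content."""
    info = column_information(column, alphabet_size)
    total = len(column)
    return {sym: info * c / total for sym, c in Counter(column).items()}


def codon_columns(aligned_seqs):
    """Split equal-length, in-frame, gap-free sequences into columns of codons."""
    length = len(aligned_seqs[0])
    usable = length - length % 3  # ignore a trailing partial codon
    return [[s[i:i + 3] for s in aligned_seqs] for i in range(0, usable, 3)]
```

A fully conserved codon column such as ["ATG", "ATG"] carries log2(64) = 6 bits, whereas a fully conserved nucleotide column carries at most 2 bits, which is how a codon-level logo can separate patterns that look alike at the nucleotide level.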
A conserved endocrine mechanism controls the formation of dauer and infective larvae in nematodes.

PubMed

Ogawa, Akira; Streit, Adrian; Antebi, Adam; Sommer, Ralf J

2009-01-13

Under harsh environmental conditions, Caenorhabditis elegans larvae undergo arrest and form dauer larvae that can attach to other animals to facilitate dispersal. It has been argued that this phenomenon, called phoresy, represents an intermediate step toward parasitism. Indeed, parasitic nematodes invade their hosts as infective larvae, a stage that shows striking morphological similarities to dauer larvae. Although the molecular regulation of dauer entry in C. elegans involves insulin and TGF-beta signaling, studies of TGF-beta orthologs in parasitic nematodes did not provide evidence for a common origin of dauer and infective larvae. To identify conserved regulators between Caenorhabditis and parasitic nematodes, we used an evolutionary approach involving Pristionchus pacificus as an intermediate. We show by mutational and pharmacological analysis that Pristionchus and Caenorhabditis share the dafachronic acid-DAF-12 system as the core endocrine module for dauer formation. One dafachronic acid, Delta7-DA, has a conserved role in the mammalian parasite Strongyloides papillosus by controlling entry into the infective stage. Application of Delta7-DA blocks formation of infective larvae and results in free-living animals. Conservation of this small molecule ligand represents a fundamental link between dauer and infective larvae and might provide a general strategy for nematode parasitism.
A case study of bats and white-nose syndrome demonstrating how to model population viability with evolutionary effects.

PubMed

Maslo, Brooke; Fefferman, Nina H

2015-08-01

Ecological factors generally affect population viability on rapid time scales. Traditional population viability analyses (PVA) therefore focus on alleviating ecological pressures, discounting potential evolutionary impacts on individual phenotypes. Recent studies of evolutionary rescue (ER) focus on cases in which severe, environmentally induced population bottlenecks trigger a rapid evolutionary response that can potentially reverse demographic threats. ER models have focused on shifting genetics and resulting population recovery, but no one has explored how to incorporate those findings into PVA. We integrated ER into PVA to identify the critical decision interval for evolutionary rescue (DIER) under which targeted conservation action should be applied to buffer populations undergoing ER against extinction from stochastic events and to determine the most appropriate vital rate to target to promote population recovery. We applied this model to little brown bats (Myotis lucifugus) affected by white-nose syndrome (WNS), a fungal disease causing massive declines in several North American bat populations. Under the ER scenario, the model predicted that the DIER period for little brown bats was within 11 years of initial WNS emergence, after which they stabilized at a positive growth rate (λ = 1.05). By comparing our model results with population trajectories of multiple infected hibernacula across the WNS range, we concluded that ER is a potential explanation of observed little brown bat population trajectories across multiple hibernacula within the affected range. Our approach provides a tool that can be used by all managers to provide testable hypotheses regarding the occurrence of ER in declining populations, suggest empirical studies to better parameterize the population genetics and conservation-relevant vital rates, and identify the DIER period during which management strategies will be most effective for species conservation. 
© 2015 Society for Conservation Biology.
Spatial multiobjective optimization of agricultural conservation practices using a SWAT model and an evolutionary algorithm.

PubMed

Rabotyagov, Sergey; Campbell, Todd; Valcu, Adriana; Gassman, Philip; Jha, Manoj; Schilling, Keith; Wolter, Calvin; Kling, Catherine

2012-12-09

Finding the cost-efficient (i.e., lowest-cost) ways of targeting conservation practice investments for the achievement of specific water quality goals across the landscape is of primary importance in watershed management. Traditional economics methods of finding the lowest-cost solution in the watershed context (e.g.,(5,12,20)) assume that off-site impacts can be accurately described as a proportion of on-site pollution generated. Such approaches are unlikely to be representative of the actual pollution process in a watershed, where the impacts of polluting sources are often determined by complex biophysical processes. The use of modern physically-based, spatially distributed hydrologic simulation models allows for a greater degree of realism in terms of process representation but requires a development of a simulation-optimization framework where the model becomes an integral part of optimization. Evolutionary algorithms appear to be a particularly useful optimization tool, able to deal with the combinatorial nature of a watershed simulation-optimization problem and allowing the use of the full water quality model. Evolutionary algorithms treat a particular spatial allocation of conservation practices in a watershed as a candidate solution and utilize sets (populations) of candidate solutions iteratively applying stochastic operators of selection, recombination, and mutation to find improvements with respect to the optimization objectives. The optimization objectives in this case are to minimize nonpoint-source pollution in the watershed, simultaneously minimizing the cost of conservation practices. A recent and expanding set of research is attempting to use similar methods and integrates water quality models with broadly defined evolutionary optimization methods(3,4,9,10,13-15,17-19,22,23,25). 
In this application, we demonstrate a program which follows Rabotyagov et al.'s approach and integrates a modern and commonly used SWAT water quality model(7) with a multiobjective evolutionary algorithm SPEA2(26), and user-specified set of conservation practices and their costs to search for the complete tradeoff frontiers between costs of conservation practices and user-specified water quality objectives. The frontiers quantify the tradeoffs faced by the watershed managers by presenting the full range of costs associated with various water quality improvement goals. The program allows for a selection of watershed configurations achieving specified water quality improvement goals and a production of maps of optimized placement of conservation practices.
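The selection, recombination, and mutation loop and the cost versus water-quality tradeoff frontier described above can be sketched as follows. This is a minimal illustration only: it is not SPEA2, and the hypothetical `toy_evaluate` function stands in for a SWAT model run by assuming that each practice has a fixed cost and a fixed, additive load reduction. All names and numbers are assumptions made for the example.

```python
import random


def dominates(a, b):
    """True if objective vector a is no worse than b everywhere and strictly better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(population, evaluate):
    """Return the (individual, objectives) pairs that no other individual dominates."""
    scored = [(ind, evaluate(ind)) for ind in population]
    return [(ind, obj) for ind, obj in scored
            if not any(dominates(other, obj) for _, other in scored)]


def toy_evaluate(allocation, costs, reductions, baseline_load):
    """Hypothetical stand-in for a water quality model: returns (cost, pollutant load)."""
    cost = sum(c for c, on in zip(costs, allocation) if on)
    load = baseline_load - sum(r for r, on in zip(reductions, allocation) if on)
    return (cost, load)


def evolve(costs, reductions, baseline_load, generations=50, pop_size=20, seed=0):
    """Evolve on/off allocations of practices to sites (needs at least two sites)."""
    rng = random.Random(seed)
    n = len(costs)

    def evaluate(ind):
        return toy_evaluate(ind, costs, reductions, baseline_load)

    population = [tuple(rng.random() < 0.5 for _ in range(n)) for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: only non-dominated allocations become parents.
        front = [ind for ind, _ in pareto_front(population, evaluate)]
        children = []
        while len(children) < pop_size:
            p1, p2 = rng.choice(front), rng.choice(front)
            cut = rng.randrange(1, n)          # one-point recombination
            child = list(p1[:cut] + p2[cut:])
            i = rng.randrange(n)               # single bit-flip mutation
            child[i] = not child[i]
            children.append(tuple(child))
        population = list(set(front + children))
    return pareto_front(population, evaluate)
```

Each returned pair is one point on the approximate tradeoff frontier: an allocation together with its (cost, load) objectives. The front is kept whole between generations, which is workable only for small toy problems; archive-based algorithms such as SPEA2 bound its size.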
EFFECTS OF CHEMICAL CONTAMINANTS ON GENETIC DIVERSITY IN NATURAL POPULATIONS: IMPLICATIONS FOR BIOMONITORING AND ECOTOXICOLOGY

EPA Science Inventory

The conservation of genetic diversity has emerged as one of the central issues in conservation biology. Although researchers in the areas of evolutionary biology, population management, and conservation biology routinely investigate genetic variability in natural populations, onl...
Quadrupole Alignment and Trajectory Correction for Future Linear Colliders: SLC Tests of a Dispersion-Free Steering Algorithm

DOE Office of Scientific and Technical Information (OSTI.GOV)

Assmann, R

2004-06-08

The feasibility of future linear colliders depends on achieving very tight alignment and steering tolerances. All proposals (NLC, JLC, CLIC, TESLA and S-BAND) currently require a total emittance growth in the main linac of less than 30-100% [1]. This should be compared with a 100% emittance growth in the much smaller SLC linac [2]. Major advances in alignment and beam steering techniques beyond those used in the SLC are necessary for the next generation of linear colliders. In this paper, we present an experimental study of quadrupole alignment with a dispersion-free steering algorithm. A closely related method (wakefield-free steering) takes into account wakefield effects [3]. However, this method cannot be studied at the SLC. The requirements for future linear colliders lead to new and unconventional ideas about alignment and beam steering. For example, no dipole correctors are foreseen for the standard trajectory correction in the NLC [4]; beam steering will be done by moving the quadrupole positions with magnet movers. This illustrates the close symbiosis between alignment, beam steering and beam dynamics that will emerge. It is no longer possible to consider the accelerator alignment as static with only a few surveys and realignments per year. The alignment in future linear colliders will be a dynamic process in which the whole linac, with thousands of beam-line elements, is aligned in a few hours or minutes, while the required accuracy of about 5 pm for the NLC quadrupole alignment [4] is a factor of 20 higher than in existing accelerators. The major task in alignment and steering is the accurate determination of the optimum beam-line position. Ideally one would like all elements to be aligned along a straight line. However, this is not practical. Instead a "smooth curve" is acceptable as long as its wavelength is much longer than the betatron wavelength of the accelerated beam. 
Conventional alignment methods are limited in accuracy by errors in the survey and the fiducials. Beam-based alignment methods ideally only depend upon the BPM resolution and generally provide much better precision. Many of those techniques are described in other contributions to this workshop. In this paper we describe our experiences with a dispersion-free steering algorithm for linacs. This algorithm was first suggested by Raubenheimer and Ruth in 1990 [5]. It has been studied in simulations for NLC [5], TESLA [6], the S-BAND proposal [7] and CLIC [8]. The dispersion-free steering technique can be applied to the whole linac at once and returns the alignment (or trajectory) that minimizes the dispersive emittance growth of the beam. Thus it allows an extremely fast alignment of the beam-line. As we will show, dispersion-free steering is only sensitive to quadrupole misalignments. Wakefield-free steering [3], as mentioned before, is a closely related technique that minimizes the emittance growth caused by both dispersion and wakefields. Due to hardware limitations (i.e. insufficient relative range of power supplies) we could not study this method experimentally in the SLC. However, its systematics are very similar to those of dispersion-free steering. The studies of dispersion-free steering which are presented made extensive use of the unique potential of the SLC as the only operating linear collider. We used it to study the performance and problems of advanced beam-based optimization tools in a real beam-line environment and on a large scale. We should mention that the SLC has utilized beam-based alignment for years [9], using the difference of electron and positron trajectories. This method, however, cannot be used in future linear colliders. The goal of our work is to demonstrate the performance of advanced beam-based alignment techniques in linear colliders and to anticipate possible reality-related problems. 
Those can then be solved in the design state for the next generation of linear colliders.
Insights into the evolution of Darwin’s finches from comparative analysis of the Geospiza magnirostris genome sequence

PubMed Central

2013-01-01

Background A classical example of repeated speciation coupled with ecological diversification is the evolution of 14 closely related species of Darwin’s (Galápagos) finches (Thraupidae, Passeriformes). Their adaptive radiation in the Galápagos archipelago took place in the last 2–3 million years and some of the molecular mechanisms that led to their diversification are now being elucidated. Here we report evolutionary analyses of genome of the large ground finch, Geospiza magnirostris. Results 13,291 protein-coding genes were predicted from a 991.0 Mb G. magnirostris genome assembly. We then defined gene orthology relationships and constructed whole genome alignments between the G. magnirostris and other vertebrate genomes. We estimate that 15% of genomic sequence is functionally constrained between G. magnirostris and zebra finch. Genic evolutionary rate comparisons indicate that similar selective pressures acted along the G. magnirostris and zebra finch lineages suggesting that historical effective population size values have been similar in both lineages. 21 otherwise highly conserved genes were identified that each show evidence for positive selection on amino acid changes in the Darwin's finch lineage. Two of these genes (Igf2r and Pou1f1) have been implicated in beak morphology changes in Darwin’s finches. Five of 47 genes showing evidence of positive selection in early passerine evolution have cilia related functions, and may be examples of adaptively evolving reproductive proteins. Conclusions These results provide insights into past evolutionary processes that have shaped G. magnirostris genes and its genome, and provide the necessary foundation upon which to build population genomics resources that will shed light on more contemporaneous adaptive and non-adaptive processes that have contributed to the evolution of the Darwin’s finches. PMID:23402223
DDRprot: a database of DNA damage response-related proteins.

PubMed

Andrés-León, Eduardo; Cases, Ildefonso; Arcas, Aida; Rojas, Ana M

2016-01-01

The DNA Damage Response (DDR) signalling network is an essential system that protects the genome's integrity. The DDRprot database presented here is a resource that integrates manually curated information on the human DDR network and its sub-pathways. For each particular DDR protein, we present detailed information about its function. If involved in post-translational modifications (PTMs) with each other, we depict the position of the modified residue/s in the three-dimensional structures, when resolved structures are available for the proteins. All this information is linked to the original publication from where it was obtained. Phylogenetic information is also shown, including time of emergence and conservation across 47 selected species, family trees and sequence alignments of homologues. The DDRprot database can be queried by different criteria: pathways, species, evolutionary age or involvement in PTMs. Sequence searches using hidden Markov models can also be used. Database URL: http://ddr.cbbio.es. © The Author(s) 2016. Published by Oxford University Press.
Establishing homology between mitochondrial calcium uniporters, prokaryotic magnesium channels and chlamydial IncA proteins.

PubMed

Lee, Andre; Vastermark, Ake; Saier, Milton H

2014-08-01

Mitochondrial calcium uniporters (MCUs) (TC no. 1.A.77) are oligomeric channel proteins found in the mitochondrial inner membrane. MCUs have two well-conserved transmembrane segments (TMSs), connected by a linker, similar to bacterial MCU homologues. These proteins and chlamydial IncA proteins (of unknown function; TC no. 9.B.159) are homologous to prokaryotic Mg(2+) transporters, AtpI and AtpZ, based on comparison scores of up to 14.5 sds. A phylogenetic tree containing all of these proteins showed that the AtpZ proteins cluster coherently as a subset within the large and diverse AtpI cluster, which branches separately from the MCUs and IncAs, both of which cluster coherently. The MCUs and AtpZs share the same two TMS topology, but the AtpIs have four TMSs, and IncAs can have either two (most frequent) or four (less frequent) TMSs. Binary alignments, comparison scores and motif analyses showed that TMSs 1 and 2 align with TMSs 3 and 4 of the AtpIs, suggesting that the four TMS AtpI proteins arose via an intragenic duplication event. These findings establish an evolutionary link interconnecting eukaryotic and prokaryotic Ca(2+) and Mg(2+) transporters with chlamydial IncAs, and lead us to suggest that all members of the MCU superfamily, including IncAs, function as divalent cation channels. © 2014 The Authors.
Establishing homology between mitochondrial calcium uniporters, prokaryotic magnesium channels and chlamydial IncA proteins

PubMed Central

Lee, Andre; Vastermark, Ake

2014-01-01

Mitochondrial calcium uniporters (MCUs) (TC no. 1.A.77) are oligomeric channel proteins found in the mitochondrial inner membrane. MCUs have two well-conserved transmembrane segments (TMSs), connected by a linker, similar to bacterial MCU homologues. These proteins and chlamydial IncA proteins (of unknown function; TC no. 9.B.159) are homologous to prokaryotic Mg2+ transporters, AtpI and AtpZ, based on comparison scores of up to 14.5 sds. A phylogenetic tree containing all of these proteins showed that the AtpZ proteins cluster coherently as a subset within the large and diverse AtpI cluster, which branches separately from the MCUs and IncAs, both of which cluster coherently. The MCUs and AtpZs share the same two TMS topology, but the AtpIs have four TMSs, and IncAs can have either two (most frequent) or four (less frequent) TMSs. Binary alignments, comparison scores and motif analyses showed that TMSs 1 and 2 align with TMSs 3 and 4 of the AtpIs, suggesting that the four TMS AtpI proteins arose via an intragenic duplication event. These findings establish an evolutionary link interconnecting eukaryotic and prokaryotic Ca2+ and Mg2+ transporters with chlamydial IncAs, and lead us to suggest that all members of the MCU superfamily, including IncAs, function as divalent cation channels. PMID:24869855
Phylogenetic analyses of the genus Aeromonas based on housekeeping gene sequencing and its influence on systematics.

PubMed

Navarro, Aaron; Martínez-Murcia, Antonio

2018-04-19

The phylogenies derived from housekeeping gene sequence alignments, although mere evolutionary hypotheses, have increased our knowledge about Aeromonas genetic diversity, providing a robust species delineation framework invaluable for reliable, easy and fast species identification. Previous classifications of Aeromonas have been fully surpassed by the recently developed phylogenetic (natural) classification obtained from the analysis of so-called "molecular chronometers". Although ribosomal RNAs cannot split all known Aeromonas species, the conserved nature of 16S rRNA offers reliable alignments containing mosaics of sequence signatures which may serve as targets of genus-specific oligonucleotides for subsequent identification/detection tests in samples without culturing. In contrast, some housekeeping genes coding for proteins show a much better chronometric capacity to discriminate highly related strains. Although species and loci do not all evolve at exactly the same rate, published Aeromonas phylogenies were congruent with each other, indicating that phylogenetic markers are synchronized and that a concatenated multi-gene phylogeny may be "the mirror" of the entire genomic relationships. Thanks to MLPA approaches, the discovery of new Aeromonas species and strains of rarely isolated species is today more frequent and, consequently, these approaches should be extensively promoted for isolate screening and species identification. However, accumulated data should still be carefully catalogued so that a reliable database is inherited. This article is protected by copyright. All rights reserved.

cisprimertool: software to implement a comparative genomics strategy for the development of conserved intron scanning (CIS) markers.

PubMed

Jayashree, B; Jagadeesh, V T; Hoisington, D

2008-05-01

The availability of complete, annotated genomic sequence information in model organisms is a rich resource that can be extended to understudied orphan crops through comparative genomic approaches. We report here a software tool (cisprimertool) for the identification of conserved intron scanning regions using expressed sequence tag alignments to a completely sequenced model crop genome. The method used is based on earlier studies reporting the assessment of conserved intron scanning primers (called CISP) within relatively conserved exons located near exon-intron boundaries from onion, banana, sorghum and pearl millet alignments with rice. The tool is freely available to academic users at http://www.icrisat.org/gt-bt/CISPTool.htm. © 2007 ICRISAT.
flyDIVaS: A Comparative Genomics Resource for Drosophila Divergence and Selection

PubMed Central

Stanley, Craig E.; Kulathinal, Rob J.

2016-01-01

With arguably the best finished and expertly annotated genome assembly, Drosophila melanogaster is a formidable genetics model to study all aspects of biology. Nearly a decade ago, the 12 Drosophila genomes project expanded D. melanogaster’s breadth as a comparative model through the community-development of an unprecedented genus- and genome-wide comparative resource. However, since its inception, these datasets for evolutionary inference and biological discovery have become increasingly outdated, outmoded, and inaccessible. Here, we provide an updated and upgradable comparative genomics resource of Drosophila divergence and selection, flyDIVaS, based on the latest genomic assemblies, curated FlyBase annotations, and recent OrthoDB orthology calls. flyDIVaS is an online database containing D. melanogaster-centric orthologous gene sets, CDS and protein alignments, divergence statistics (% gaps, dN, dS, dN/dS), and codon-based tests of positive Darwinian selection. Out of 13,920 protein-coding D. melanogaster genes, ∼80% have one aligned ortholog in the closely related species, D. simulans, and ∼50% have 1–1 12-way alignments in the original 12 sequenced species that span over 80 million yr of divergence. Genes and their orthologs can be chosen from four different taxonomic datasets differing in phylogenetic depth and coverage density, and visualized via interactive alignments and phylogenetic trees. Users can also batch download entire comparative datasets. A functional survey finds conserved mitotic and neural genes, highly diverged immune and reproduction-related genes, more conspicuous signals of divergence across tissue-specific genes, and an enrichment of positive selection among highly diverged genes. flyDIVaS will be regularly updated and can be freely accessed at www.flydivas.info. 
We encourage researchers to regularly use this resource as a tool for biological inference and discovery, and in their classrooms to help train the next generation of biologists to creatively use such genomic big data resources in an integrative manner. PMID:27226167
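Two of the simpler divergence statistics listed in this record can be illustrated with a short sketch. It assumes a pairwise, in-frame CDS alignment with `-` as the gap character and the standard genetic code. The counts of synonymous and nonsynonymous codon differences are only a crude proxy: a real dN/dS estimate also normalizes by the number of synonymous and nonsynonymous sites and corrects for multiple substitutions, which is not attempted here. The function names are hypothetical and do not come from flyDIVaS.

```python
from itertools import product

# Standard genetic code in TCAG order (third codon position varies fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def percent_gaps(seq_a, seq_b):
    """Percentage of alignment columns where either sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    gapped = sum(1 for a, b in zip(seq_a, seq_b) if a == "-" or b == "-")
    return 100.0 * gapped / len(seq_a)


def classify_codon_differences(cds_a, cds_b):
    """Count differing codon pairs as (synonymous, nonsynonymous).

    Codon pairs containing gaps or ambiguous bases are skipped.
    """
    syn = nonsyn = 0
    for i in range(0, len(cds_a) - 2, 3):
        ca, cb = cds_a[i:i + 3].upper(), cds_b[i:i + 3].upper()
        if ca == cb or ca not in CODON_TABLE or cb not in CODON_TABLE:
            continue
        if CODON_TABLE[ca] == CODON_TABLE[cb]:
            syn += 1
        else:
            nonsyn += 1
    return syn, nonsyn
```

For example, `classify_codon_differences("GCC", "GCT")` counts one synonymous difference because both codons encode alanine.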
flyDIVaS: A Comparative Genomics Resource for Drosophila Divergence and Selection.

PubMed

Stanley, Craig E; Kulathinal, Rob J

2016-08-09

With arguably the best finished and expertly annotated genome assembly, Drosophila melanogaster is a formidable genetics model to study all aspects of biology. Nearly a decade ago, the 12 Drosophila genomes project expanded D. melanogaster's breadth as a comparative model through the community-development of an unprecedented genus- and genome-wide comparative resource. However, since its inception, these datasets for evolutionary inference and biological discovery have become increasingly outdated, outmoded, and inaccessible. Here, we provide an updated and upgradable comparative genomics resource of Drosophila divergence and selection, flyDIVaS, based on the latest genomic assemblies, curated FlyBase annotations, and recent OrthoDB orthology calls. flyDIVaS is an online database containing D. melanogaster-centric orthologous gene sets, CDS and protein alignments, divergence statistics (% gaps, dN, dS, dN/dS), and codon-based tests of positive Darwinian selection. Out of 13,920 protein-coding D. melanogaster genes, ∼80% have one aligned ortholog in the closely related species, D. simulans, and ∼50% have 1-1 12-way alignments in the original 12 sequenced species that span over 80 million yr of divergence. Genes and their orthologs can be chosen from four different taxonomic datasets differing in phylogenetic depth and coverage density, and visualized via interactive alignments and phylogenetic trees. Users can also batch download entire comparative datasets. A functional survey finds conserved mitotic and neural genes, highly diverged immune and reproduction-related genes, more conspicuous signals of divergence across tissue-specific genes, and an enrichment of positive selection among highly diverged genes. 
flyDIVaS will be regularly updated and can be freely accessed at www.flydivas.info. We encourage researchers to regularly use this resource as a tool for biological inference and discovery, and in their classrooms to help train the next generation of biologists to creatively use such genomic big data resources in an integrative manner. Copyright © 2016 Stanley and Kulathinal.
Different phylogenomic approaches to resolve the evolutionary relationships among model fish species.

PubMed

Negrisolo, Enrico; Kuhl, Heiner; Forcato, Claudio; Vitulo, Nicola; Reinhardt, Richard; Patarnello, Tomaso; Bargelloni, Luca

2010-12-01

Comparative genomics holds the promise to magnify the information obtained from individual genome sequencing projects, revealing common features conserved across genomes and identifying lineage-specific characteristics. To implement such a comparative approach, a robust phylogenetic framework is required to accurately reconstruct evolution at the genome level. Among vertebrate taxa, teleosts represent the second best characterized group, with high-quality draft genome sequences for five model species (Danio rerio, Gasterosteus aculeatus, Oryzias latipes, Takifugu rubripes, and Tetraodon nigroviridis), and several others are in the finishing lane. However, the relationships among the acanthomorph teleost model fishes remain an unresolved taxonomic issue. Here, a genomic region spanning over 1.2 million base pairs was sequenced in the teleost fish Dicentrarchus labrax. Together with genomic data available for the above fish models, the new sequence was used to identify unique orthologous genomic regions shared across all target taxa. Different strategies were applied to produce robust multiple gene and genomic alignments spanning from 11,802 to 186,474 amino acid/nucleotide positions. Ten data sets were analyzed according to Bayesian inference, maximum likelihood, maximum parsimony, and neighbor joining methods. Extensive analyses were performed to explore the influence of several factors (e.g., alignment methodology, substitution model, data set partitions, and long-branch attraction) on the tree topology. Although a general consensus was observed for a closer relationship between G. aculeatus (Gasterosteidae) and Di. labrax (Moronidae) with the atherinomorph O. latipes (Beloniformes) sister taxon of this clade, with the tetraodontiform group Ta. rubripes and Te. nigroviridis (Tetraodontiformes) representing a more distantly related taxon among acanthomorph model fish species, conflicting results were obtained between data sets and methods, especially with respect to the choice of alignment methodology applied to noncoding parts of the genomic region under study. This may limit the use of intergenic/noncoding sequences in phylogenomics until more robust alignment algorithms are developed.
Exploring the Genomic Roadmap and Molecular Phylogenetics Associated with MODY Cascades Using Computational Biology.

PubMed

Chakraborty, Chiranjib; Bandyopadhyay, Sanghamitra; Doss, C George Priya; Agoramoorthy, Govindasamy

2015-04-01

Maturity onset diabetes of the young (MODY) is a metabolic and genetic disorder. It is different from type 1 and type 2 diabetes with low occurrence level (1-2%) among all diabetes. This disorder is a consequence of β-cell dysfunction. Till date, 11 subtypes of MODY have been identified, and all of them can cause gene mutations. However, very little is known about the gene mapping, molecular phylogenetics, and co-expression among MODY genes and networking between cascades. This study has used latest servers and software such as VarioWatch, ClustalW, MUSCLE, G Blocks, Phylogeny.fr, iTOL, WebLogo, STRING, and KEGG PATHWAY to perform comprehensive analyses of gene mapping, multiple sequences alignment, molecular phylogenetics, protein-protein network design, co-expression analysis of MODY genes, and pathway development. The MODY genes are located in chromosomes-2, 7, 8, 9, 11, 12, 13, 17, and 20. Highly aligned block shows Pro, Gly, Leu, Arg, and Pro residues are highly aligned in the positions of 296, 386, 437, 455, 456 and 598, respectively. Alignment scores inform us that HNF1A and HNF1B proteins have shown high sequence similarity among MODY proteins. Protein-protein network design shows that HNF1A, HNF1B, HNF4A, NEUROD1, PDX1, PAX4, INS, and GCK are strongly connected, and the co-expression analyses between MODY genes also show distinct association between HNF1A and HNF4A genes. This study has used latest tools of bioinformatics to develop a rapid method to assess the evolutionary relationship, the network development, and the associations among eleven MODY genes and cascades. The prediction of sequence conservation, molecular phylogenetics, protein-protein network and the association between the MODY cascades enhances opportunities to get more insights into the less-known MODY disease.
Dry habitats were crucibles of domestication in the evolution of agriculture in ants.

PubMed

Branstetter, Michael G; Ješovnik, Ana; Sosa-Calvo, Jeffrey; Lloyd, Michael W; Faircloth, Brant C; Brady, Seán G; Schultz, Ted R

2017-04-12

The evolution of ant agriculture, as practised by the fungus-farming 'attine' ants, is thought to have arisen in the wet rainforests of South America about 55-65 Ma. Most subsequent attine agricultural evolution, including the domestication event that produced the ancestor of higher attine cultivars, is likewise hypothesized to have occurred in South American rainforests. The 'out-of-the-rainforest' hypothesis, while generally accepted, has never been tested in a phylogenetic context. It also presents a problem for explaining how fungal domestication might have occurred, given that isolation from free-living populations is required. Here, we use phylogenomic data from ultra-conserved element (UCE) loci to reconstruct the evolutionary history of fungus-farming ants, reduce topological uncertainty, and identify the closest non-fungus-growing ant relative. Using the phylogeny we infer the history of attine agricultural systems, habitat preference and biogeography. Our results show that the out-of-the-rainforest hypothesis is correct with regard to the origin of attine ant agriculture; however, contrary to expectation, we find that the transition from lower to higher agriculture is very likely to have occurred in a seasonally dry habitat, inhospitable to the growth of free-living populations of attine fungal cultivars. We suggest that dry habitats favoured the isolation of attine cultivars over the evolutionary time spans necessary for domestication to occur. © 2017 The Authors.
Dry habitats were crucibles of domestication in the evolution of agriculture in ants

PubMed Central

2017-01-01

The evolution of ant agriculture, as practised by the fungus-farming ‘attine’ ants, is thought to have arisen in the wet rainforests of South America about 55–65 Ma. Most subsequent attine agricultural evolution, including the domestication event that produced the ancestor of higher attine cultivars, is likewise hypothesized to have occurred in South American rainforests. The ‘out-of-the-rainforest’ hypothesis, while generally accepted, has never been tested in a phylogenetic context. It also presents a problem for explaining how fungal domestication might have occurred, given that isolation from free-living populations is required. Here, we use phylogenomic data from ultra-conserved element (UCE) loci to reconstruct the evolutionary history of fungus-farming ants, reduce topological uncertainty, and identify the closest non-fungus-growing ant relative. Using the phylogeny we infer the history of attine agricultural systems, habitat preference and biogeography. Our results show that the out-of-the-rainforest hypothesis is correct with regard to the origin of attine ant agriculture; however, contrary to expectation, we find that the transition from lower to higher agriculture is very likely to have occurred in a seasonally dry habitat, inhospitable to the growth of free-living populations of attine fungal cultivars. We suggest that dry habitats favoured the isolation of attine cultivars over the evolutionary time spans necessary for domestication to occur. PMID:28404776
Analysis of evolutionary conservation patterns and their influence on identifying protein functional sites.

PubMed

Fang, Chun; Noguchi, Tamotsu; Yamana, Hayato

2014-10-01

Evolutionary conservation information included in position-specific scoring matrix (PSSM) has been widely adopted by sequence-based methods for identifying protein functional sites, because all functional sites, whether in ordered or disordered proteins, are found to be conserved at some extent. However, different functional sites have different conservation patterns, some of them are linear contextual, some of them are mingled with highly variable residues, and some others seem to be conserved independently. Every value in PSSMs is calculated independently of each other, without carrying the contextual information of residues in the sequence. Therefore, adopting the direct output of PSSM for prediction fails to consider the relationship between conservation patterns of residues and the distribution of conservation scores in PSSMs. In order to demonstrate the importance of combining PSSMs with the specific conservation patterns of functional sites for prediction, three different PSSM-based methods for identifying three kinds of functional sites have been analyzed. Results suggest that, different PSSM-based methods differ in their capability to identify different patterns of functional sites, and better combining PSSMs with the specific conservation patterns of residues would largely facilitate the prediction.
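The point that each PSSM value is computed independently of its neighbours can be made concrete with a small sketch. It builds a log-odds matrix column by column from a toy alignment, then adds context by averaging the query's scores over a sliding window. The uniform background, the pseudocount of 1, and the window average are assumptions chosen for illustration; they are not the scheme used by PSI-BLAST or by the methods analyzed in this abstract, and the function names are hypothetical.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(alignment, pseudocount=1.0):
    """Return one {residue: log2 odds} row per column; columns are scored independently."""
    n_cols = len(alignment[0])
    background = 1.0 / len(AMINO_ACIDS)  # assumed uniform background frequency
    pssm = []
    for col in range(n_cols):
        # Gaps and non-standard residues are ignored when counting.
        residues = [seq[col] for seq in alignment if seq[col] in AMINO_ACIDS]
        total = len(residues) + pseudocount * len(AMINO_ACIDS)
        row = {}
        for aa in AMINO_ACIDS:
            freq = (residues.count(aa) + pseudocount) / total
            row[aa] = math.log2(freq / background)
        pssm.append(row)
    return pssm


def window_features(pssm, query, half_width=1):
    """Return (raw per-position scores, window-averaged scores) for the query.

    The query must contain only the 20 standard residues.
    """
    scores = [pssm[i][aa] for i, aa in enumerate(query)]
    smoothed = []
    for i in range(len(scores)):
        lo, hi = max(0, i - half_width), min(len(scores), i + half_width + 1)
        smoothed.append(sum(scores[lo:hi]) / (hi - lo))
    return scores, smoothed
```

The raw scores treat every position in isolation, whereas the windowed scores let a conserved residue raise the value of its neighbours. That is one simple way to expose linear contextual conservation patterns to a downstream predictor.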
Structure of CPV17 polyhedrin determined by the improved analysis of serial femtosecond crystallographic data

DOE PAGES

Ginn, Helen M.; Messerschmidt, Marc; Ji, Xiaoyun; ...

2015-03-09

The X-ray free-electron laser (XFEL) allows the analysis of small weakly diffracting protein crystals, but has required very many crystals to obtain good data. Here we use an XFEL to determine the room temperature atomic structure for the smallest cytoplasmic polyhedrosis virus polyhedra yet characterized, which we failed to solve at a synchrotron. These protein microcrystals, roughly a micron across, accrue within infected cells. We use a new physical model for XFEL diffraction, which better estimates the experimental signal, delivering a high-resolution XFEL structure (1.75 Å), using fewer crystals than previously required for this resolution. The crystal lattice and protein core are conserved compared with a polyhedrin with less than 10% sequence identity. We explain how the conserved biological phenotype, the crystal lattice, is maintained in the face of extreme environmental challenge and massive evolutionary divergence. Our improved methods should open up more challenging biological samples to XFEL analysis.
Vpr Promotes Macrophage-Dependent HIV-1 Infection of CD4+ T Lymphocytes

PubMed Central

Collins, David R.; Lubow, Jay; Lukic, Zana; Mashiba, Michael; Collins, Kathleen L.

2015-01-01

Vpr is a conserved primate lentiviral protein that promotes infection of T lymphocytes in vivo by an unknown mechanism. Here we demonstrate that Vpr and its cellular co-factor, DCAF1, are necessary for efficient cell-to-cell spread of HIV-1 from macrophages to CD4+ T lymphocytes when there is inadequate cell-free virus to support direct T lymphocyte infection. Remarkably, Vpr functioned to counteract a macrophage-specific intrinsic antiviral pathway that targeted Env-containing virions to LAMP1+ lysosomal compartments. This restriction of Env also impaired virological synapses formed through interactions between HIV-1 Env on infected macrophages and CD4 on T lymphocytes. Treatment of infected macrophages with exogenous interferon-alpha induced virion degradation and blocked synapse formation, overcoming the effects of Vpr. These results provide a mechanism that helps explain the in vivo requirement for Vpr and suggest that a macrophage-dependent stage of HIV-1 infection drives the evolutionary conservation of Vpr. PMID:26186441
New powerful statistics for alignment-free sequence comparison under a pattern transfer model.

PubMed

Liu, Xuemei; Wan, Lin; Li, Jing; Reinert, Gesine; Waterman, Michael S; Sun, Fengzhu

2011-09-07

Alignment-free sequence comparison is widely used for comparing gene regulatory regions and for identifying horizontally transferred genes. Recent studies on the power of a widely used alignment-free comparison statistic D2 and its variants D2* and D2S showed that their power approximates a limit smaller than 1 as the sequence length tends to infinity under a pattern transfer model. We develop new alignment-free statistics based on D2, D2* and D2S by comparing local sequence pairs and then summing over all the local sequence pairs of certain length. We show that the new statistics are much more powerful than the corresponding statistics and the power tends to 1 as the sequence length tends to infinity under the pattern transfer model. Copyright © 2011 Elsevier Ltd. All rights reserved.
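As a rough illustration of the base statistic named in the abstract above, D2 is commonly defined as the sum, over all words (k-mers) of a fixed length k, of the product of that word's counts in the two sequences. The sketch below computes only this plain D2; it does not implement the normalized variants or the new local-pair statistics the paper develops, and the function names are hypothetical.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping k-mer (word of length k) in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2(seq_a, seq_b, k):
    """Plain D2: sum over k-mers w of count_a(w) * count_b(w)."""
    counts_a = kmer_counts(seq_a, k)
    counts_b = kmer_counts(seq_b, k)
    # Counter returns 0 for k-mers absent from seq_b, so only shared words contribute.
    return sum(n * counts_b[w] for w, n in counts_a.items())


print(d2("ACGTACGT", "TACGTT", 3))  # prints 5
```

Because the statistic depends only on word counts, it needs no alignment of the two sequences, which is the sense in which the comparison is alignment-free.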
New Powerful Statistics for Alignment-free Sequence Comparison Under a Pattern Transfer Model

PubMed Central

Liu, Xuemei; Wan, Lin; Li, Jing; Reinert, Gesine; Waterman, Michael S.; Sun, Fengzhu

2011-01-01

Alignment-free sequence comparison is widely used for comparing gene regulatory regions and for identifying horizontally transferred genes. Recent studies on the power of a widely used alignment-free comparison statistic D2 and its variants D2* and D2S showed that their power approximates a limit smaller than 1 as the sequence length tends to infinity under a pattern transfer model. We develop new alignment-free statistics based on D2, D2* and D2S by comparing local sequence pairs and then summing over all the local sequence pairs of certain length. We show that the new statistics are much more powerful than the corresponding statistics and the power tends to 1 as the sequence length tends to infinity under the pattern transfer model. PMID:21723298
Phylogenetic diversity meets conservation policy: small areas are key to preserving eucalypt lineages.

PubMed

Pollock, Laura J; Rosauer, Dan F; Thornhill, Andrew H; Kujala, Heini; Crisp, Michael D; Miller, Joseph T; McCarthy, Michael A

2015-02-19

Evolutionary and genetic knowledge is increasingly being valued in conservation theory, but is rarely considered in conservation planning and policy. Here, we integrate phylogenetic diversity (PD) with spatial reserve prioritization to evaluate how well the existing reserve system in Victoria, Australia captures the evolutionary lineages of eucalypts, which dominate forest canopies across the state. Forty-three per cent of remaining native woody vegetation in Victoria is located in protected areas (mostly national parks) representing 48% of the extant PD found in the state. A modest expansion in protected areas of 5% (less than 1% of the state area) would increase protected PD by 33% over current levels. In a recent policy change, portions of the national parks were opened for development. These tourism development zones hold over half the PD found in national parks with some species and clades falling entirely outside of protected zones within the national parks. This approach of using PD in spatial prioritization could be extended to any clade or area that has spatial and phylogenetic data. Our results demonstrate the relevance of PD to regional conservation policy by highlighting that small but strategically located areas disproportionally impact the preservation of evolutionary lineages.
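Phylogenetic diversity (PD), as used in the abstract above, is usually computed as the total branch length of the subtree that connects a set of taxa. The following is a minimal sketch under that assumption, with paths measured back to the root; the toy tree, its branch lengths, and the function name are hypothetical and are not taken from the study.

```python
def phylogenetic_diversity(parent, branch_length, taxa):
    """Sum branch lengths on the union of tip-to-root paths for the given taxa."""
    visited = set()
    total = 0.0
    for node in taxa:
        # Walk towards the root; stop at the first branch already counted,
        # because everything above it has been counted as well.
        while node in parent and node not in visited:
            visited.add(node)
            total += branch_length[node]
            node = parent[node]
    return total


# Hypothetical toy tree: root -> (X -> (A, B), C)
parent = {"A": "X", "B": "X", "X": "root", "C": "root"}
branch_length = {"A": 1.0, "B": 2.0, "X": 0.5, "C": 3.0}

protected = phylogenetic_diversity(parent, branch_length, ["A", "B"])
total = phylogenetic_diversity(parent, branch_length, ["A", "B", "C"])
print(protected / total)  # share of total PD captured by the protected taxa
```

A percentage such as "48% of the extant PD" corresponds to this kind of ratio, with the protected taxa in the numerator and all taxa in the region in the denominator.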
Evolutionary response of landraces to climate change in centers of crop diversity

PubMed Central

Mercer, Kristin L; Perales, Hugo R

2010-01-01

Landraces cultivated in centers of crop diversity result from past and contemporary patterns of natural and farmer-mediated evolutionary forces. Successful in situ conservation of crop genetic resources depends on continuity of these evolutionary processes. Climate change is projected to affect agricultural production, yet analyses of impacts on in situ conservation of crop genetic diversity and farmers who conserve it have been absent. How will crop landraces respond to alterations in climate? We review the roles that phenotypic plasticity, evolution, and gene flow might play in sustaining production, although we might expect erosion of genetic diversity if landrace populations or entire races lose productivity. For example, highland maize landraces in southern Mexico do not express the plasticity necessary to sustain productivity under climate change, but may evolve in response to altered conditions. The outcome for any given crop in a given region will depend on the distribution of genetic variation that affects fitness and patterns of climate change. Understanding patterns of neutral and adaptive diversity from the population to the landscape scale is essential to clarify how landraces conserved in situ will continue to evolve and how to minimize genetic erosion of this essential natural resource. PMID:25567941
Evolutionary response of landraces to climate change in centers of crop diversity.

PubMed

Mercer, Kristin L; Perales, Hugo R

2010-09-01

Landraces cultivated in centers of crop diversity result from past and contemporary patterns of natural and farmer-mediated evolutionary forces. Successful in situ conservation of crop genetic resources depends on continuity of these evolutionary processes. Climate change is projected to affect agricultural production, yet analyses of impacts on in situ conservation of crop genetic diversity and farmers who conserve it have been absent. How will crop landraces respond to alterations in climate? We review the roles that phenotypic plasticity, evolution, and gene flow might play in sustaining production, although we might expect erosion of genetic diversity if landrace populations or entire races lose productivity. For example, highland maize landraces in southern Mexico do not express the plasticity necessary to sustain productivity under climate change, but may evolve in response to altered conditions. The outcome for any given crop in a given region will depend on the distribution of genetic variation that affects fitness and patterns of climate change. Understanding patterns of neutral and adaptive diversity from the population to the landscape scale is essential to clarify how landraces conserved in situ will continue to evolve and how to minimize genetic erosion of this essential natural resource.
Maintaining replication origins in the face of genomic change.

PubMed

Di Rienzi, Sara C; Lindstrom, Kimberly C; Mann, Tobias; Noble, William S; Raghuraman, M K; Brewer, Bonita J

2012-10-01

Origins of replication present a paradox to evolutionary biologists. As a collection, they are absolutely essential genomic features, but individually are highly redundant and nonessential. It is therefore difficult to predict to what extent and in what regard origins are conserved over evolutionary time. Here, through a comparative genomic analysis of replication origins and chromosomal replication patterns in the budding yeasts Saccharomyces cerevisiae and Lachancea waltii, we assess to what extent replication origins survived genomic change produced from 150 million years of evolution. We find that L. waltii origins exhibit a core consensus sequence and nucleosome occupancy pattern highly similar to those of S. cerevisiae origins. We further observe that the overall progression of chromosomal replication is similar between L. waltii and S. cerevisiae. Nevertheless, few origins show evidence of being conserved in location between the two species. Among the conserved origins are those surrounding centromeres and adjacent to histone genes, suggesting that proximity to an origin may be important for their regulation. We conclude that, over evolutionary time, origins maintain sequence, structure, and regulation, but are continually being created and destroyed, with the result that their locations are generally not conserved.
Maintaining replication origins in the face of genomic change

PubMed Central

Di Rienzi, Sara C.; Lindstrom, Kimberly C.; Mann, Tobias; Noble, William S.; Raghuraman, M.K.; Brewer, Bonita J.

2012-01-01

Origins of replication present a paradox to evolutionary biologists. As a collection, they are absolutely essential genomic features, but individually are highly redundant and nonessential. It is therefore difficult to predict to what extent and in what regard origins are conserved over evolutionary time. Here, through a comparative genomic analysis of replication origins and chromosomal replication patterns in the budding yeasts Saccharomyces cerevisiae and Lachancea waltii, we assess to what extent replication origins survived genomic change produced from 150 million years of evolution. We find that L. waltii origins exhibit a core consensus sequence and nucleosome occupancy pattern highly similar to those of S. cerevisiae origins. We further observe that the overall progression of chromosomal replication is similar between L. waltii and S. cerevisiae. Nevertheless, few origins show evidence of being conserved in location between the two species. Among the conserved origins are those surrounding centromeres and adjacent to histone genes, suggesting that proximity to an origin may be important for their regulation. We conclude that, over evolutionary time, origins maintain sequence, structure, and regulation, but are continually being created and destroyed, with the result that their locations are generally not conserved. PMID:22665441
Molecular Evolutionary Constraints that Determine the Avirulence State of Clostridium botulinum C2 Toxin.

PubMed

Prisilla, A; Prathiviraj, R; Chellapandi, P

2017-04-01

Clostridium botulinum (group-III) is an anaerobic bacterium that produces C2 toxin along with botulinum neurotoxins. C2 toxin belongs to the binary toxin A family in the bacterial ADP-ribosylation superfamily. The structural and functional diversity of the binary toxin A family was inferred from different evolutionary constraints to determine the avirulence state of C2 toxin. Evolutionary genetic analyses revealed evidence of C2 toxin cluster evolution through horizontal gene transfer from phage or plasmid origins, site-specific insertion by gene divergence, and homologous recombination events. Residues in the conserved NAD-binding core, the family-specific domain structure, and functional motifs were found to predetermine its virulence state. Any mutational change in these residues destabilized its structure-function relationship. Avirulent mutants of C2 toxin were screened and selected from crucial sites required for the catalytic function of C2I and the pore-forming function of C2II. We found coevolved amino acid pairs that play an essential role in stabilizing the local structural environment. Avirulent toxins selected in this study were evaluated by detecting evolutionary constraints on the stability of the protein backbone structure, folding and conformational dynamic space, and antigenic peptides. We found 4 avirulent mutants of C2I and 5 mutants of C2II showing greater stability in their local structural environment and backbone structure, with a rapid fold rate and low conformational flexibility at the mutated sites. Mutants free of evolutionary constraints and lacking catalytic and pore-forming function are therefore suggested as potential immunogenic candidates for treating C. botulinum-infected poultry and veterinary animals. Single amino acid substitutions in C2 toxin are thus of major importance for understanding the structure-function link, not only of the molecule but also of its pathogenesis.
Transcription Factor Map Alignment of Promoter Regions

PubMed Central

Blanco, Enrique; Messeguer, Xavier; Smith, Temple F; Guigó, Roderic

2006-01-01

We address the problem of comparing and characterizing the promoter regions of genes with similar expression patterns. This remains a challenging problem in sequence analysis, because often the promoter regions of co-expressed genes do not show discernible sequence conservation. In our approach, thus, we have not directly compared the nucleotide sequence of promoters. Instead, we have obtained predictions of transcription factor binding sites, annotated the predicted sites with the labels of the corresponding binding factors, and aligned the resulting sequences of labels—to which we refer here as transcription factor maps (TF-maps). To obtain the global pairwise alignment of two TF-maps, we have adapted an algorithm initially developed to align restriction enzyme maps. We have optimized the parameters of the algorithm in a small, but well-curated, collection of human–mouse orthologous gene pairs. Results in this dataset, as well as in an independent much larger dataset from the CISRED database, indicate that TF-map alignments are able to uncover conserved regulatory elements, which cannot be detected by the typical sequence alignments. PMID:16733547
Global Alignment of Pairwise Protein Interaction Networks for Maximal Common Conserved Patterns

DOE PAGES

Tian, Wenhong; Samatova, Nagiza F.

2013-01-01

A number of tools for the alignment of protein-protein interaction (PPI) networks have laid the foundation for PPI network analysis. Most alignment tools focus on finding conserved interaction regions across the PPI networks through either local or global mapping of similar sequences. Researchers are still trying to improve the speed, scalability, and accuracy of network alignment. In view of this, we introduce a connected-components based fast algorithm, HopeMap, for network alignment. Observing that the number of true orthologs across species is small compared to the total number of proteins in all species, we take a different approach based on a precompiled list of homologs identified by KO terms. Applying this approach to S. cerevisiae (yeast) and D. melanogaster (fly), E. coli K12 and S. typhimurium, E. coli K12 and C. crescentus, we analyze all clusters identified in the alignment. The results are evaluated through up-to-date known gene annotations, gene ontology (GO), and KEGG ortholog groups (KO). Compared to existing tools, our approach is fast with linear computational cost, highly accurate in terms of KO and GO term specificity and sensitivity, and can be extended to multiple alignments easily.

Capturing neutral and adaptive genetic diversity for conservation in a highly structured tree species.

PubMed

Rodríguez-Quilón, Isabel; Santos-Del-Blanco, Luis; Serra-Varela, María Jesús; Koskela, Jarkko; González-Martínez, Santiago C; Alía, Ricardo

2016-10-01

Preserving intraspecific genetic diversity is essential for long-term forest sustainability in a climate change scenario. Despite that, genetic information is largely neglected in conservation planning, and how conservation units should be defined is still heatedly debated. Here, we use maritime pine (Pinus pinaster Ait.), an outcrossing long-lived tree with a highly fragmented distribution in the Mediterranean biodiversity hotspot, to prove the importance of accounting for genetic variation, of both neutral molecular markers and quantitative traits, to define useful conservation units. Six gene pools associated to distinct evolutionary histories were identified within the species using 12 microsatellites and 266 single nucleotide polymorphisms (SNPs). In addition, height and survival standing variation, their genetic control, and plasticity were assessed in a multisite clonal common garden experiment (16 544 trees). We found high levels of quantitative genetic differentiation within previously defined neutral gene pools. Subsequent cluster analysis and post hoc trait distribution comparisons allowed us to define 10 genetically homogeneous population groups with high evolutionary potential. They constitute the minimum number of units to be represented in a maritime pine dynamic conservation program. Our results uphold that the identification of conservation units below the species level should account for key neutral and adaptive components of genetic diversity, especially in species with strong population structure and complex evolutionary histories. The environmental zonation approach currently used by the pan-European genetic conservation strategy for forest trees would be largely improved by gradually integrating molecular and quantitative trait information, as data become available. © 2016 by the Ecological Society of America.
An optimization-based approach for high-order accurate discretization of conservation laws with discontinuous solutions

NASA Astrophysics Data System (ADS)

Zahr, M. J.; Persson, P.-O.

2018-07-01

This work introduces a novel discontinuity-tracking framework for resolving discontinuous solutions of conservation laws with high-order numerical discretizations that support inter-element solution discontinuities, such as discontinuous Galerkin or finite volume methods. The proposed method aims to align inter-element boundaries with discontinuities in the solution by deforming the computational mesh. A discontinuity-aligned mesh ensures the discontinuity is represented through inter-element jumps while smooth basis functions interior to elements are only used to approximate smooth regions of the solution, thereby avoiding Gibbs' phenomena that create well-known stability issues. Therefore, very coarse high-order discretizations accurately resolve the piecewise smooth solution throughout the domain, provided the discontinuity is tracked. Central to the proposed discontinuity-tracking framework is a discrete PDE-constrained optimization formulation that simultaneously aligns the computational mesh with discontinuities in the solution and solves the discretized conservation law on this mesh. The optimization objective is taken as a combination of the deviation of the finite-dimensional solution from its element-wise average and a mesh distortion metric to simultaneously penalize Gibbs' phenomena and distorted meshes. It will be shown that our objective function satisfies two critical properties that are required for this discontinuity-tracking framework to be practical: (1) it possesses a local minimum at a discontinuity-aligned mesh and (2) it decreases monotonically to this minimum in a neighborhood of radius approximately $h/2$, whereas other popular discontinuity indicators fail to satisfy the latter.
Another important contribution of this work is the observation that traditional reduced space PDE-constrained optimization solvers that repeatedly solve the conservation law at various mesh configurations are not viable in this context since severe overshoot and undershoot in the solution, i.e., Gibbs' phenomena, may make it impossible to solve the discrete conservation law on non-aligned meshes. Therefore, we advocate a gradient-based, full space solver where the mesh and conservation law solution converge to their optimal values simultaneously and therefore never require the solution of the discrete conservation law on a non-aligned mesh. The merit of the proposed method is demonstrated on a number of one- and two-dimensional model problems including the $L^2$ projection of discontinuous functions, Burgers' equation with a discontinuous source term, transonic flow through a nozzle, and supersonic flow around a bluff body. We demonstrate optimal $O(h^{p+1})$ convergence rates in the $L^1$ norm for up to polynomial order $p = 6$ and show that accurate solutions can be obtained on extremely coarse meshes.
CAFE: aCcelerated Alignment-FrEe sequence analysis

PubMed Central

Lu, Yang Young; Tang, Kujin; Ren, Jie; Fuhrman, Jed A.; Waterman, Michael S.

2017-01-01

Alignment-free genome and metagenome comparisons are increasingly important with the development of next generation sequencing (NGS) technologies. Recently developed state-of-the-art k-mer based alignment-free dissimilarity measures including CVTree, $d_2^*$ and $d_2^S$ are more computationally expensive than measures based solely on the k-mer frequencies. Here, we report a standalone software, aCcelerated Alignment-FrEe sequence analysis (CAFE), for efficient calculation of 28 alignment-free dissimilarity measures. CAFE allows for both assembled genome sequences and unassembled NGS shotgun reads as input, and wraps the output in a standard PHYLIP format. In downstream analyses, CAFE can also be used to visualize the pairwise dissimilarity measures, including dendrograms, heatmap, principal coordinate analysis and network display. CAFE serves as a general k-mer based alignment-free analysis platform for studying the relationships among genomes and metagenomes, and is freely available at https://github.com/younglululu/CAFE. PMID:28472388
Domain architecture conservation in orthologs

PubMed Central

2011-01-01

Background: As orthologous proteins are expected to retain function more often than other homologs, they are often used for functional annotation transfer between species. However, ortholog identification methods do not take into account changes in domain architecture, which are likely to modify a protein's function. By domain architecture we refer to the sequential arrangement of domains along a protein sequence. To assess the level of domain architecture conservation among orthologs, we carried out a large-scale study of such events between human and 40 other species spanning the entire evolutionary range. We designed a score to measure domain architecture similarity and used it to analyze differences in domain architecture conservation between orthologs and paralogs relative to the conservation of primary sequence. We also statistically characterized the extents of different types of domain swapping events across pairs of orthologs and paralogs. Results: The analysis shows that orthologs exhibit greater domain architecture conservation than paralogous homologs, even when differences in average sequence divergence are compensated for, for homologs that have diverged beyond a certain threshold. We interpret this as an indication of a stronger selective pressure on orthologs than paralogs to retain the domain architecture required for the proteins to perform a specific function. In general, orthologs as well as the closest paralogous homologs have very similar domain architectures, even at large evolutionary separation. The most common domain architecture changes observed in both ortholog and paralog pairs involved insertion/deletion of new domains, while domain shuffling and segment duplication/deletion were very infrequent. Conclusions: On the whole, our results support the hypothesis that function conservation between orthologs demands higher domain architecture conservation than other types of homologs, relative to primary sequence conservation.
This supports the notion that orthologs are functionally more similar than other types of homologs at the same evolutionary distance. PMID:21819573
USING ECO-EVOLUTIONARY INDIVIDUAL-BASED MODELS TO INVESTIGATE SPATIALLY-DEPENDENT PROCESSES IN CONSERVATION GENETICS

EPA Science Inventory

Eco-evolutionary population simulation models are powerful new forecasting tools for exploring management strategies for climate change and other dynamic disturbance regimes. Additionally, eco-evo individual-based models (IBMs) are useful for investigating theoretical feedbacks ...
MOCASSIN-prot: A multi-objective clustering approach for protein similarity networks

USDA-ARS's Scientific Manuscript database

Motivation: Proteins often include multiple conserved domains. Various evolutionary events including duplication and loss of domains, domain shuffling, as well as sequence divergence contribute to generating complexities in protein structures, and consequently, in their functions. The evolutionary h...
Wetting of nonconserved residue-backbones: A feature indicative of aggregation associated regions of proteins.

PubMed

Pradhan, Mohan R; Pal, Arumay; Hu, Zhongqiao; Kannan, Srinivasaraghavan; Chee Keong, Kwoh; Lane, David P; Verma, Chandra S

2016-02-01

Aggregation is an irreversible form of protein complexation and often toxic to cells. The process entails partial or major unfolding that is largely driven by hydration. We model the role of hydration in aggregation using "Dehydrons." "Dehydrons" are unsatisfied backbone hydrogen bonds in proteins that seek shielding from water molecules by associating with ligands or proteins. We find that the residues at aggregation interfaces have hydrated backbones, and in contrast to other forms of protein-protein interactions, are under less evolutionary pressure to be conserved. Combining evolutionary conservation of residues and extent of backbone hydration allows us to distinguish regions on proteins associated with aggregation (non-conserved dehydron-residues) from other interaction interfaces (conserved dehydron-residues). This novel feature can complement the existing strategies used to investigate protein aggregation/complexation. © 2015 Wiley Periodicals, Inc.
Evolutionary emergence and maintenance of horizontally transmitted mutualism that do not rely on the supply of standing variation in symbiont quality.

PubMed

Uchiumi, Y; Ohtsuki, H; Sasaki, A

2017-12-01

Mutualism based on reciprocal exchange of costly services must avoid exploitation by 'free-riders'. Accordingly, hosts discriminate against free-riding symbionts in many mutualistic relationships. However, as the selective advantage of discriminators comes from the presence of variability in symbiont quality that they eliminate, discrimination and thus mutualism have been considered to be maintained with exogenous supply of free-riders. In this study, we tried to resolve the 'paradoxical' co-evolution of discrimination by hosts and cooperation by symbionts, by comparing two different types of discrimination: 'one-shot' discrimination, where a host does not reacquire new symbionts after evicting free-riders, and 'resampling' discrimination, where a host does reacquire them from the environment. Our study shows that this apparently minor difference in discrimination types leads to qualitatively different evolutionary outcomes. First, although it has been usually considered that the benefit of discriminators is derived from the variability of symbiont quality, the benefit of a certain type of discriminators (e.g. one-shot discrimination) is proportional to the frequency of free-riders, which is in stark contrast to the case of resampling discrimination. As a result, one-shot discriminators can invade the free-rider/nondiscriminator population, even if standing variation for symbiont quality is absent. Second, our one-shot discriminators can also be maintained without exogenous supply of free-riders and hence are free from the paradox of discrimination. Therefore, our result indicates that the paradox is not a common feature of evolution of discrimination but is a problem of specific types of discrimination. © 2017 European Society For Evolutionary Biology. Journal of Evolutionary Biology © 2017 European Society For Evolutionary Biology.
Conservation of native Pacific trout diversity in Western North America

Treesearch

Brooke E. Penaluna; Alicia Abadía-Cardoso; Jason B. Dunham; Francisco J. García-Dé León; Robert E. Gresswell; Arturo Ruiz Luna; Eric B. Taylor; Bradley B. Shepard; Robert Al-Chokhachy; Clint C. Muhlfeld; Kevin R. Bestgen; Kevin Rogers; Marco A. Escalante; Ernest R. Keeley; Gabriel M. Temple; Jack E. Williams; Kathleen R. Matthews; Ron Pierce; Richard L. Mayden; Ryan P. Kovach; John Carlos Garza; Kurt D. Fausch

2016-01-01

Pacific trout Oncorhynchus spp. in western North America are strongly valued in ecological, socioeconomic, and cultural views, and have been the subject of substantial research and conservation efforts. Despite this, the understanding of their evolutionary histories, overall diversity, and challenges to their conservation is incomplete. We review...
Optimization of sequence alignment for simple sequence repeat regions.

PubMed

Jighly, Abdulqader; Hamwieh, Aladdin; Ogbonnaya, Francis C

2011-07-20

Microsatellites, or simple sequence repeats (SSRs), are tandemly repeated DNA sequences, including tandem copies of specific sequences no longer than six bases, that are distributed in the genome. SSRs have been used as molecular markers because they are easy to detect and are used in a range of applications, including genetic diversity, genome mapping, and marker-assisted selection. They are also very mutable because of slippage of the DNA polymerase during DNA replication. This unique mutation increases the insertion/deletion (INDEL) mutation frequency to a high ratio, more than other types of molecular markers such as single nucleotide polymorphisms (SNPs). SNPs are more frequent than INDELs. Therefore, all designed algorithms for sequence alignment fit the vast majority of the genomic sequence without considering microsatellite regions as unique sequences that require special consideration. The old algorithm is limited in its application because there are many overlaps between different repeat units, which result in false evolutionary relationships. To overcome the limitation of the aligning algorithm when dealing with SSR loci, a new algorithm was developed using a PERL script with a Tk graphical interface. This program is based on aligning sequences after determining the repeated units first, and the last SSR nucleotide positions. This results in a shifting process according to the inserted repeated unit type. When studying the phylogenetic relations before and after applying the new algorithm, many differences in the trees were obtained by increasing the SSR length and complexity. However, less distance between different lineages was observed after applying the new algorithm. The new algorithm produces better estimates for aligning SSR loci because it reflects more reliable evolutionary relations between different lineages. It reduces overlapping during SSR alignment, which results in a more realistic phylogenetic relationship.
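The first step described in the abstract above, locating each repeated unit and the positions where the SSR starts and ends, can be sketched with a regular expression. This is only an illustrative stand-in and not the authors' Perl program: the function name, the `min_repeats` threshold, and the restriction to uppercase A/C/G/T are assumptions made here.

```python
import re


def find_ssrs(seq, min_repeats=3):
    """Return (start, end, unit, copies) for tandem repeats with units of 1-6 bases.

    `end` is exclusive, following Python slice conventions.
    """
    # A lazily matched unit of 1-6 bases followed by at least
    # (min_repeats - 1) further copies of exactly that unit.
    pattern = re.compile(r"([ACGT]{1,6}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for m in pattern.finditer(seq):
        unit = m.group(1)
        copies = (m.end() - m.start()) // len(unit)
        hits.append((m.start(), m.end(), unit, copies))
    return hits


print(find_ssrs("TTGACACACACGG"))  # prints [(3, 11, 'AC', 4)]
```

Knowing the unit and the boundaries of each repeat is what would allow an aligner to shift sequences by whole repeat units rather than by single bases.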
Multispecies genetic objectives in spatial conservation planning.

PubMed

Nielsen, Erica S; Beger, Maria; Henriques, Romina; Selkoe, Kimberly A; von der Heyden, Sophie

2017-08-01

Growing threats to biodiversity and global alteration of habitats and species distributions make it increasingly necessary to consider evolutionary patterns in conservation decision making. Yet, there is no clear-cut guidance on how genetic features can be incorporated into conservation-planning processes, despite multiple molecular markers and several genetic metrics for each marker type to choose from. Genetic patterns differ between species, but the potential tradeoffs among genetic objectives for multiple species in conservation planning are currently understudied. We compared spatial conservation prioritizations derived from 2 metrics of genetic diversity (nucleotide and haplotype diversity) and 2 metrics of genetic isolation (private haplotypes and local genetic differentiation) in mitochondrial DNA of 5 marine species. We compared outcomes of conservation plans based only on habitat representation with plans based on genetic data and habitat representation. Fewer priority areas were selected for conservation plans based solely on habitat representation than on plans that included habitat and genetic data. All 4 genetic metrics selected approximately similar conservation-priority areas, which is likely a result of prioritizing genetic patterns across a genetically diverse array of species. Largely, our results suggest that multispecies genetic conservation objectives are vital to creating protected-area networks that appropriately preserve community-level evolutionary patterns. © 2016 Society for Conservation Biology.
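The two diversity metrics named above can be made concrete with a small sketch. The formulas are the standard textbook definitions (haplotype diversity as n/(n-1) times one minus the sum of squared haplotype frequencies; nucleotide diversity as the mean proportion of differing sites over all sequence pairs); the function names and the toy sequences are assumptions, and nothing here reproduces the study's actual pipeline.

```python
from collections import Counter
from itertools import combinations

def haplotype_diversity(seqs):
    """h = n/(n-1) * (1 - sum(p_i^2)), with p_i the haplotype frequencies."""
    n = len(seqs)
    freqs = [count / n for count in Counter(seqs).values()]
    return n / (n - 1) * (1 - sum(p * p for p in freqs))

def nucleotide_diversity(seqs):
    """Mean proportion of differing sites over all pairs of aligned sequences."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / (len(pairs) * length)

# Toy sample of four equal-length mitochondrial haplotype sequences (made up).
sample = ["ACGT", "ACGT", "ACGA", "TCGA"]
print(haplotype_diversity(sample), nucleotide_diversity(sample))
```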
Diversity and evolutionary origins of fungi associated with seeds of a neotropical pioneer tree: a case study for analysing fungal environmental samples.

PubMed

U'ren, Jana M; Dalling, James W; Gallery, Rachel E; Maddison, David R; Davis, E Christine; Gibson, Cara M; Arnold, A Elizabeth

2009-04-01

Fungi associated with seeds of tropical trees pervasively affect seed survival and germination, and thus are an important, but understudied, component of forest ecology. Here, we examine the diversity and evolutionary origins of fungi isolated from seeds of an important pioneer tree (Cecropia insignis, Cecropiaceae) following burial in soil for five months in a tropical moist forest in Panama. Our approach, which relied on molecular sequence data because most isolates did not sporulate in culture, provides an opportunity to evaluate several methods currently used to analyse environmental samples of fungi. First, intra- and interspecific divergence were estimated for the nu-rITS and 5.8S gene for four genera of Ascomycota that are commonly recovered from seeds. Using these values we estimated species boundaries for 527 isolates, showing that seed-associated fungi are highly diverse, horizontally transmitted, and genotypically congruent with some foliar endophytes from the same site. We then examined methods for inferring the taxonomic placement and phylogenetic relationships of these fungi, evaluating the effects of manual versus automated alignment, model selection, and inference methods, as well as the quality of BLAST-based identification using GenBank. We found that common methods such as neighbor-joining and Bayesian inference differ in their sensitivity to alignment methods; analyses of particular fungal genera differ in their sensitivity to alignments; and numerous and sometimes intricate disparities exist between BLAST-based versus phylogeny-based identification methods. Lastly, we used our most robust methods to infer phylogenetic relationships of seed-associated fungi in four focal genera, and reconstructed ancestral states to generate preliminary hypotheses regarding the evolutionary origins of this guild. Our results illustrate the dynamic evolutionary relationships among endophytic fungi, pathogens, and seed-associated fungi, and the apparent evolutionary distinctiveness of saprotrophs. Our study also elucidates the diversity, taxonomy, and ecology of an important group of plant-associated fungi and highlights some of the advantages and challenges inherent in the use of ITS data for environmental sampling of fungi.
Behavioral fever in ectothermic vertebrates.

PubMed

Rakus, Krzysztof; Ronsmans, Maygane; Vanderplasschen, Alain

2017-01-01

Fever is an evolutionarily conserved defense mechanism that is present in both endothermic and ectothermic vertebrates. In response to infection, ectotherms can increase their body temperature by moving to warmer places. This process is known as behavioral fever. In this review, we summarize the current knowledge on the mechanisms of induction of fever in mammals. We further discuss the evolutionarily conserved mechanisms shared by fever in mammals and behavioral fever in ectothermic vertebrates. Finally, the experimental evidence supporting an adaptive value of behavioral fever expressed by ectothermic vertebrates is summarized. Copyright © 2016 Elsevier Ltd. All rights reserved.
The beta-diversity of species interactions: Untangling the drivers of geographic variation in plant-pollinator diversity and function across scales.

PubMed

Burkle, Laura A; Myers, Jonathan A; Belote, R Travis

2016-01-01

Geographic patterns of biodiversity have long inspired interest in processes that shape the assembly, diversity, and dynamics of communities at different spatial scales. To study mechanisms of community assembly, ecologists often compare spatial variation in community composition (beta-diversity) across environmental and spatial gradients. These same patterns inspired evolutionary biologists to investigate how micro- and macro-evolutionary processes create gradients in biodiversity. Central to these perspectives are species interactions, which contribute to community assembly and geographic variation in evolutionary processes. However, studies of beta-diversity have predominantly focused on single trophic levels, resulting in gaps in our understanding of variation in species-interaction networks (interaction beta-diversity), especially at scales most relevant to evolutionary studies of geographic variation. We outline two challenges and their consequences in scaling-up studies of interaction beta-diversity from local to biogeographic scales using plant-pollinator interactions as a model system in ecology, evolution, and conservation. First, we highlight how variation in regional species pools may contribute to variation in interaction beta-diversity among biogeographic regions with dissimilar evolutionary history. Second, we highlight how pollinator behavior (host-switching) links ecological networks to geographic patterns of plant-pollinator interactions and evolutionary processes. Third, we outline key unanswered questions regarding the role of geographic variation in plant-pollinator interactions for conservation and ecosystem services (pollination) in changing environments. We conclude that the largest advances in the burgeoning field of interaction beta-diversity will come from studies that integrate frameworks in ecology, evolution, and conservation to understand the causes and consequences of interaction beta-diversity across scales. © 2016 Botanical Society of America.
DNA Barcode Sequence Identification Incorporating Taxonomic Hierarchy and within Taxon Variability

PubMed Central

Little, Damon P.

2011-01-01

For DNA barcoding to succeed as a scientific endeavor an accurate and expeditious query sequence identification method is needed. Although a global multiple-sequence alignment can be generated for some barcoding markers (e.g. COI, rbcL), not all barcoding markers are as structurally conserved (e.g. matK). Thus, algorithms that depend on global multiple-sequence alignments are not universally applicable. Some sequence identification methods that use local pairwise alignments (e.g. BLAST) are unable to accurately differentiate between highly similar sequences and are not designed to cope with hierarchic phylogenetic relationships or within taxon variability. Here, I present a novel alignment-free sequence identification algorithm, BRONX, that accounts for observed within taxon variability and hierarchic relationships among taxa. BRONX identifies short variable segments and corresponding invariant flanking regions in reference sequences. These flanking regions are used to score variable regions in the query sequence without the production of a global multiple-sequence alignment. By incorporating observed within taxon variability into the scoring procedure, misidentifications arising from shared alleles/haplotypes are minimized. An explicit treatment of more inclusive terminals allows for separate identifications to be made for each taxonomic level and/or for user-defined terminals. BRONX performs better than all other methods when there is imperfect overlap between query and reference sequences (e.g. mini-barcode queries against a full-length barcode database). BRONX consistently produced better identifications at the genus-level for all query types. PMID:21857897
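The general idea of using invariant flanking regions to locate and score a variable segment without a global alignment can be sketched as follows. This is a deliberately simplified Python illustration and not the BRONX algorithm itself: exact string matching of flanks, the set-membership scoring, and all names (`extract_variable`, `identify`, the toy reference table) are assumptions made for the example.

```python
def extract_variable(query, left_flank, right_flank):
    """Return the segment between two invariant flanks, or None if a flank is absent."""
    i = query.find(left_flank)
    if i < 0:
        return None
    start = i + len(left_flank)
    j = query.find(right_flank, start)
    if j < 0:
        return None
    return query[start:j]

def identify(query, left_flank, right_flank, reference):
    """List taxa whose observed variants include the query's variable segment.

    reference maps taxon name -> set of variable segments seen in that taxon,
    so within-taxon variability is represented by multiple entries per taxon.
    """
    segment = extract_variable(query, left_flank, right_flank)
    if segment is None:
        return []
    return sorted(t for t, variants in reference.items() if segment in variants)

# Hypothetical reference data: two variants observed in taxon A, one in taxon B.
reference = {"Taxon A": {"ACCA", "ACGA"}, "Taxon B": {"TTTT"}}
print(identify("GGGATTCACGAGGCTAAA", "ATTC", "GGCT", reference))
```

Because only the flanks need to be found, a short query that covers a single variable region can still be scored, which is the situation the abstract describes for mini-barcode queries.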
Open Reading Frame Phylogenetic Analysis on the Cloud

PubMed Central

2013-01-01

Phylogenetic analysis has become essential in researching the evolutionary relationships between viruses. These relationships are depicted on phylogenetic trees, in which viruses are grouped based on sequence similarity. Viral evolutionary relationships are identified from open reading frames rather than from complete sequences. Recently, cloud computing has become popular for developing internet-based bioinformatics tools. Biocloud is an efficient, scalable, and robust bioinformatics computing service. In this paper, we propose a cloud-based open reading frame phylogenetic analysis service. The proposed service integrates the Hadoop framework, virtualization technology, and phylogenetic analysis methods to provide a high-availability, large-scale bioservice. In a case study, we analyze the phylogenetic relationships among Norovirus. Evolutionary relationships are elucidated by aligning different open reading frame sequences. The proposed platform correctly identifies the evolutionary relationships between members of Norovirus. PMID:23671843
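Since the service described above works from open reading frames rather than complete sequences, a first step is extracting ORFs. The following is a minimal Python sketch that scans only the three forward-strand frames for ATG-to-stop stretches; it is not the Hadoop-based service from the paper, and the function name `find_orfs` and the `min_codons` parameter are assumptions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_codons=2):
    """Return sorted (start, end) pairs, 0-based and end-exclusive, stop codon included.

    Only the forward strand is scanned; min_codons counts codons before the stop.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a new ORF at the first in-frame start codon
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # close the ORF and look for the next start
    return sorted(orfs)

print(find_orfs("CCATGAAATTTTAGGG"))
```

The extracted ORF sequences would then be the inputs to alignment and tree building.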
Extinction Risk and Overfishing: Reconciling Conservation and Fisheries Perspectives on the Status of Marine Fishes

PubMed Central

Davies, Trevor D.; Baum, Julia K.

2012-01-01

Anthropogenic disturbances are ubiquitous in the ocean, but their impacts on marine species are hotly debated. We evaluated marine fish statuses using conservation (Red List threatened or not) and fisheries (above or below reference points) metrics, compared their alignment, and diagnosed why discrepancies arise. Whereas only 13.5% of Red Listed marine fishes (n = 2952) are threatened, 40% and 21% of populations with stock assessments (n = 166) currently are below their more conservative and riskier reference points, respectively. Conservation and fisheries metrics aligned well (70.5% to 80.7%), despite their mathematical disconnect. Red Listings were not biased towards exaggerating threat status, and egregious errors, where populations were categorized at opposite extremes of fisheries and conservation metrics, were rare. Our analyses suggest conservation and fisheries scientists will agree on the statuses of exploited marine fishes in most cases, leaving only the question of appropriate management responses for populations of mutual concern still unresolved. PMID:22872806
Extinction risk and overfishing: reconciling conservation and fisheries perspectives on the status of marine fishes.

PubMed

Davies, Trevor D; Baum, Julia K

2012-01-01

Anthropogenic disturbances are ubiquitous in the ocean, but their impacts on marine species are hotly debated. We evaluated marine fish statuses using conservation (Red List threatened or not) and fisheries (above or below reference points) metrics, compared their alignment, and diagnosed why discrepancies arise. Whereas only 13.5% of Red Listed marine fishes (n = 2952) are threatened, 40% and 21% of populations with stock assessments (n = 166) currently are below their more conservative and riskier reference points, respectively. Conservation and fisheries metrics aligned well (70.5% to 80.7%), despite their mathematical disconnect. Red Listings were not biased towards exaggerating threat status, and egregious errors, where populations were categorized at opposite extremes of fisheries and conservation metrics, were rare. Our analyses suggest conservation and fisheries scientists will agree on the statuses of exploited marine fishes in most cases, leaving only the question of appropriate management responses for populations of mutual concern still unresolved.
Are hotspots of evolutionary potential adequately protected in southern California?

USGS Publications Warehouse

Vandergast, A.G.; Bohonak, A.J.; Hathaway, S.A.; Boys, J.; Fisher, R.N.

2008-01-01

Reserves are often designed to protect rare habitats, or "typical" exemplars of ecoregions and geomorphic provinces. This approach focuses on current patterns of organismal and ecosystem-level biodiversity, but typically ignores the evolutionary processes that control the gain and loss of biodiversity at these and other levels (e.g., genetic, ecological). In order to include evolutionary processes in conservation planning efforts, their spatial components must first be identified and mapped. We describe a GIS-based approach for explicitly mapping patterns of genetic divergence and diversity for multiple species (a "multi-species genetic landscape"). Using this approach, we analyzed mitochondrial DNA datasets from 21 vertebrate and invertebrate species in southern California to identify areas with common phylogeographic breaks and high intrapopulation diversity. The result is an evolutionary framework for southern California within which patterns of genetic diversity can be analyzed in the context of historical processes, future evolutionary potential and current reserve design. Our multi-species genetic landscapes pinpoint six hotspots where interpopulation genetic divergence is consistently high, five evolutionary hotspots within which genetic connectivity is high, and three hotspots where intrapopulation genetic diversity is high. These 14 hotspots can be grouped into eight geographic areas, of which five largely are unprotected at this time. The multi-species genetic landscape approach may provide an avenue to readily incorporate measures of evolutionary process into GIS-based systematic conservation assessment and land-use planning.
A Nascent Peptide Signal Responsive to Endogenous Levels of Polyamines Acts to Stimulate Regulatory Frameshifting on Antizyme mRNA.

PubMed

Yordanova, Martina M; Wu, Cheng; Andreev, Dmitry E; Sachs, Matthew S; Atkins, John F

2015-07-17

The protein antizyme is a negative regulator of cellular polyamine concentrations from yeast to mammals. Synthesis of functional antizyme requires programmed +1 ribosomal frameshifting at the 3' end of the first of two partially overlapping ORFs. The frameshift is the sensor and effector in an autoregulatory circuit. Except for Saccharomyces cerevisiae antizyme mRNA, the frameshift site alone only supports low levels of frameshifting. The high levels usually observed depend on the presence of cis-acting stimulatory elements located 5' and 3' of the frameshift site. Antizyme genes from different evolutionary branches have evolved different stimulatory elements. Prior and new multiple alignments of fungal antizyme mRNA sequences from the Agaricomycetes class of Basidiomycota show a distinct pattern of conservation 5' of the frameshift site consistent with a function at the amino acid level. As shown here when tested in Schizosaccharomyces pombe and mammalian HEK293T cells, the 5' part of this conserved sequence acts at the nascent peptide level to stimulate the frameshifting, without involving stalling detectable by toe-printing. However, the peptide is only part of the signal. The 3' part of the stimulator functions largely independently and acts at least mostly at the nucleotide level. When polyamine levels were varied, the stimulatory effect was seen to be especially responsive in the endogenous polyamine concentration range, and this effect may be more general. A conserved RNA secondary structure 3' of the frameshift site has weaker stimulatory and polyamine sensitizing effects on frameshifting. © 2015 by The American Society for Biochemistry and Molecular Biology, Inc.

Length variation and sequence divergence in mitochondrial control region of Schizothoracine (Teleostei: Cyperinidae) species.

PubMed

Syed, Mudasir Ahmad; Bhat, Farooz Ahmad; Balkhi, Masood-ul Hassan; Bhat, Bilal Ahmad

2016-01-01

Schizothoracine fish, commonly called snow trouts, inhabit the entire network of snow- and spring-fed cool waters of Kashmir, India. Of over 10 species reported earlier, only five have been found: Schizothorax niger, Schizothorax esocinus, Schizothorax plagiostomus, Schizothorax curvifrons and Schizothorax labiatus. Reports on the relationships among these species are contradictory. To understand the evolutionary relationships of these species, we examined the sequence information of the mitochondrial D-loop of 25 individuals representing the five species. Sequence alignment showed that the D-loop region is highly variable, and length variation was observed in a di-nucleotide (TA)n microsatellite between and within species. Interestingly, all these species have a (TA)n microsatellite not associated with longer tandem repeats at the 3' end of the mitochondrial control region and do not show heteroplasmy. Our analysis also indicates the presence of four conserved sequence blocks (CSBs), CSB-D, CSB-1, CSB-II and CSB-III, four termination-associated sequence (TAS) motifs and a 15-bp pyrimidine block within the mitochondrial control region that are highly conserved within the genus Schizothorax when compared with other species. The phylogenetic analyses carried out by maximum likelihood (ML), neighbor joining (NJ) and Bayesian inference (BI) generated almost identical results. The resultant BI tree showed a close genetic relationship among all five species and supports two distinct groupings of S. esocinus. Besides the species relationships, the presence of length variation in tandem repeats is attributed to differences in the predicted stability of secondary structures. The role of CSBs and TASs, reported so far as main regulatory signals, would explain the conservation of these elements in evolution.
ACP5 (Uteroferrin): Phylogeny of an Ancient and Conserved Gene Expressed in the Endometrium of Mammals

PubMed Central

Padua, Maria B.; Lynch, Vincent J.; Alvarez, Natalia V.; Garthwaite, Mark A.; Golos, Thaddeus G.; Bazer, Fuller W.; Kalkunte, Satyan; Sharma, Surendra; Wagner, Gunter P.; Hansen, Peter J.

2012-01-01

Type 5 acid phosphatase (ACP5; also known as tartrate-resistant acid phosphatase or uteroferrin) is a metalloprotein secreted by the endometrial glandular epithelium of pigs, mares, sheep, and water buffalo. In this paper, we describe the phylogenetic distribution of endometrial expression of ACP5 and demonstrate that endometrial expression arose early in evolution (i.e., before divergence of prototherian and therian mammals ∼166 million years ago). To determine expression of ACP5 in the pregnant endometrium, RNA was isolated from rhesus, mouse, rat, dog, sheep, cow, horse, armadillo, opossum, and duck-billed platypus. Results from RT-PCR and RNA-Seq experiments confirmed that ACP5 is expressed in all species examined. ACP5 was also demonstrated immunochemically in endometrium of rhesus, marmoset, sheep, cow, goat, and opossum. Alignment of inferred amino acid sequences shows a high conservation of ACP5 throughout speciation, with species-specific differences most extensive in the N-terminal and C-terminal regions of the protein. Analysis by Selecton indicated that most of the sites in ACP5 are undergoing purifying selection, and no sites undergoing positive selection were found. In conclusion, endometrial expression of ACP5 is a common feature in all orders of mammals and has been subjected to purifying selection. Expression of ACP5 in the uterus predates the divergence of therians and prototherians. ACP5 is an evolutionarily conserved gene that likely exerts a common function important for pregnancy in mammals using a wide range of reproductive strategies. PMID:22278982
ELMO Domains, Evolutionary and Functional Characterization of a Novel GTPase-activating Protein (GAP) Domain for Arf Protein Family GTPases

PubMed Central

East, Michael P.; Bowzard, J. Bradford; Dacks, Joel B.; Kahn, Richard A.

2012-01-01

The human family of ELMO domain-containing proteins (ELMODs) consists of six members and is defined by the presence of the ELMO domain. Within this family are two subclassifications of proteins, based on primary sequence conservation, protein size, and domain architecture, deemed ELMOD and ELMO. In this study, we used homology searching and phylogenetics to identify ELMOD family homologs in genomes from across eukaryotic diversity. This demonstrated not only that the protein family is ancient but also that ELMOs are potentially restricted to the supergroup Opisthokonta (Metazoa and Fungi), whereas proteins with the ELMOD organization are found in diverse eukaryotes and thus were likely the form present in the last eukaryotic common ancestor. The segregation of the ELMO clade from the larger ELMOD group is consistent with their contrasting functions as unconventional Rac1 guanine nucleotide exchange factors and the Arf family GTPase-activating proteins, respectively. We used unbiased, phylogenetic sorting and sequence alignments to identify the most highly conserved residues within the ELMO domain to identify a putative GAP domain within the ELMODs. Three independent but complementary assays were used to provide an initial characterization of this domain. We identified a highly conserved arginine residue critical for both the biochemical and cellular GAP activity of ELMODs. We also provide initial evidence of the function of human ELMOD1 as an Arf family GAP at the Golgi. These findings provide the basis for the future study of the ELMOD family of proteins and a new avenue for the study of Arf family GTPases. PMID:23014990
Characterisation of circadian rhythms of various duckweeds.

PubMed

Muranaka, T; Okada, M; Yomo, J; Kubota, S; Oyama, T

2015-01-01

The plant circadian clock controls various physiological phenomena that are important for adaptation to natural day-night cycles. Many components of the circadian clock have been identified in Arabidopsis thaliana, the model plant for molecular genetic studies. Recent studies revealed evolutionary conservation of clock components in green plants. Homologues of clock-related genes have been isolated from Lemna gibba and Lemna aequinoctialis, and it has been demonstrated that these homologues function in the clock system in a manner similar to their functioning in Arabidopsis. While clock components are widely conserved, circadian phenomena display diversity even within the Lemna genus. In order to survey the full extent of diversity in circadian rhythms among duckweed plants, we characterised the circadian rhythms of duckweed by employing a semi-transient bioluminescent reporter system. Using a particle bombardment method, circadian bioluminescent reporters were introduced into nine strains representing five duckweed species: Spirodela polyrhiza, Landoltia punctata, Lemna gibba, L. aequinoctialis and Wolffia columbiana. We then monitored luciferase (luc+) reporter activities driven by AtCCA1, ZmUBQ1 or CaMV35S promoters under entrainment and free-running conditions. Under entrainment, AtCCA1::luc+ showed similar diurnal rhythms in all strains. This suggests that the mechanism of biological timing under day-night cycles is conserved throughout the evolution of duckweeds. Under free-running conditions, we observed circadian rhythms of AtCCA1::luc+, ZmUBQ1::luc+ and CaMV35S::luc+. These circadian rhythms showed diversity in period length and sustainability, suggesting that circadian clock mechanisms are somewhat diversified among duckweeds. © 2014 German Botanical Society and The Royal Botanical Society of the Netherlands.
Evolutionary biology: microsporidia sex--a missing link to fungi.

PubMed

Dyer, Paul S

2008-11-11

The evolutionary origins of the microsporidia, a group of intracellular eukaryotic pathogens, have been unclear. Genome analysis of a sex locus and other gene clusters has now revealed conserved synteny with zygomycete fungi, indicating that microsporidia are true fungi descended from a zygomycete ancestor.
Test of Von Baer's law of the conservation of early development.

PubMed

Poe, Steven

2006-11-01

One of the oldest and most pervasive ideas in comparative embryology is the perceived evolutionary conservation of early ontogeny relative to late ontogeny. Karl Von Baer first noted the similarity of early ontogeny across taxa, and Ernst Haeckel and Charles Darwin gave evolutionary interpretation to this phenomenon. In spite of a resurgence of interest in comparative embryology and the development of mechanistic explanations for Von Baer's law, the pattern itself has been largely untested. Here, I use statistical phylogenetic approaches to show that Von Baer's law is an unnecessarily complex explanation of the patterns of ontogenetic timing in several clades of vertebrates. Von Baer's law suggests a positive correlation between ontogenetic time and amount of evolutionary change. I compare ranked position in ontogeny to frequency of evolutionary change in rank for developmental events and find that these measures are not correlated, thus failing to support Von Baer's model. An alternative model that postulates that small changes in ontogenetic rank are evolutionarily easier than large changes is tentatively supported.
Different Evolutionary Modifications as a Guide to Rewire Two-Component Systems

PubMed Central

Krueger, Beate; Friedrich, Torben; Förster, Frank; Bernhardt, Jörg; Gross, Roy; Dandekar, Thomas

2012-01-01

Two-component systems (TCS) are short signalling pathways generally occurring in prokaryotes. They frequently regulate prokaryotic stimulus responses and thus are also of interest for engineering in biotechnology and synthetic biology. The aim of this study is to better understand and describe rewiring of TCS while investigating different evolutionary scenarios. Based on large-scale screens of TCS in different organisms, this study gives detailed data, concrete alignments, and structure analysis on three general modification scenarios, where TCS were rewired for new responses and functions: (i) exchanges in the sequence within single TCS domains; (ii) exchange of whole TCS domains; (iii) addition of new components modulating TCS function. As a result, the replacement of stimulus and promoter cassettes to rewire TCS is well defined by exploiting the alignments given here. The diverged TCS examples are non-trivial and the design is challenging. Designed connector proteins may also be useful to modify TCS in selected cases. PMID:22586357
Variation in MHC class II B genes in marbled murrelets: implications for delineating conservation units

Treesearch

C. Vásquez-Carrillo; V. Friesen; L. Hall; M.Z. Peery

2013-01-01

Conserving genetic variation is critical for maintaining the evolutionary potential and viability of a species. Genetic studies seeking to delineate conservation units, however, typically focus on characterizing neutral genetic variation and may not identify populations harboring local adaptations. Here, variation at two major histocompatibility complex (MHC) class II...
Evolutionary inference via the Poisson Indel Process.

PubMed

Bouchard-Côté, Alexandre; Jordan, Michael I

2013-01-22

We address the problem of the joint statistical inference of phylogenetic trees and multiple sequence alignments from unaligned molecular sequences. This problem is generally formulated in terms of string-valued evolutionary processes along the branches of a phylogenetic tree. The classic evolutionary process, the TKF91 model [Thorne JL, Kishino H, Felsenstein J (1991) J Mol Evol 33(2):114-124] is a continuous-time Markov chain model composed of insertion, deletion, and substitution events. Unfortunately, this model gives rise to an intractable computational problem: The computation of the marginal likelihood under the TKF91 model is exponential in the number of taxa. In this work, we present a stochastic process, the Poisson Indel Process (PIP), in which the complexity of this computation is reduced to linear. The Poisson Indel Process is closely related to the TKF91 model, differing only in its treatment of insertions, but it has a global characterization as a Poisson process on the phylogeny. Standard results for Poisson processes allow key computations to be decoupled, which yields the favorable computational profile of inference under the PIP model. We present illustrative experiments in which Bayesian inference under the PIP model is compared with separate inference of phylogenies and alignments.
Evolutionary inference via the Poisson Indel Process

PubMed Central

Bouchard-Côté, Alexandre; Jordan, Michael I.

2013-01-01

We address the problem of the joint statistical inference of phylogenetic trees and multiple sequence alignments from unaligned molecular sequences. This problem is generally formulated in terms of string-valued evolutionary processes along the branches of a phylogenetic tree. The classic evolutionary process, the TKF91 model [Thorne JL, Kishino H, Felsenstein J (1991) J Mol Evol 33(2):114–124] is a continuous-time Markov chain model composed of insertion, deletion, and substitution events. Unfortunately, this model gives rise to an intractable computational problem: The computation of the marginal likelihood under the TKF91 model is exponential in the number of taxa. In this work, we present a stochastic process, the Poisson Indel Process (PIP), in which the complexity of this computation is reduced to linear. The Poisson Indel Process is closely related to the TKF91 model, differing only in its treatment of insertions, but it has a global characterization as a Poisson process on the phylogeny. Standard results for Poisson processes allow key computations to be decoupled, which yields the favorable computational profile of inference under the PIP model. We present illustrative experiments in which Bayesian inference under the PIP model is compared with separate inference of phylogenies and alignments. PMID:23275296
ChIP-seq Identification of Weakly Conserved Heart Enhancers

PubMed Central

Blow, Matthew J.; McCulley, David J.; Li, Zirong; Zhang, Tao; Akiyama, Jennifer A.; Holt, Amy; Plajzer-Frick, Ingrid; Shoukry, Malak; Wright, Crystal; Chen, Feng; Afzal, Veena; Bristow, James; Ren, Bing; Black, Brian L.; Rubin, Edward M.; Visel, Axel; Pennacchio, Len A.

2011-01-01

Accurate control of tissue-specific gene expression plays a pivotal role in heart development, but few cardiac transcriptional enhancers have thus far been identified. Extreme non-coding sequence conservation successfully predicts enhancers active in many tissues, but fails to identify substantial numbers of heart enhancers. Here we used ChIP-seq with the enhancer-associated protein p300 from mouse embryonic day 11.5 heart tissue to identify over three thousand candidate heart enhancers genome-wide. Compared to other tissues studied at this time-point, most candidate heart enhancers are less deeply conserved in vertebrate evolution. Nevertheless, the testing of 130 candidate regions in a transgenic mouse assay revealed that most of them reproducibly function as enhancers active in the heart, irrespective of their degree of evolutionary constraint. These results provide evidence for a large population of poorly conserved heart enhancers and suggest that the evolutionary constraint of embryonic enhancers can vary depending on tissue type. PMID:20729851
Fast and accurate phylogeny reconstruction using filtered spaced-word matches

PubMed Central

Sohrabi-Jahromi, Salma; Morgenstern, Burkhard

2017-01-01

Motivation: Word-based or ‘alignment-free’ algorithms are increasingly used for phylogeny reconstruction and genome comparison, since they are much faster than traditional approaches that are based on full sequence alignments. Existing alignment-free programs, however, are less accurate than alignment-based methods. Results: We propose Filtered Spaced Word Matches (FSWM), a fast alignment-free approach to estimate phylogenetic distances between large genomic sequences. For a pre-defined binary pattern of match and don’t-care positions, FSWM rapidly identifies spaced word-matches between input sequences, i.e. gap-free local alignments with matching nucleotides at the match positions and with mismatches allowed at the don’t-care positions. We then estimate the number of nucleotide substitutions per site by considering the nucleotides aligned at the don’t-care positions of the identified spaced-word matches. To reduce the noise from spurious random matches, we use a filtering procedure where we discard all spaced-word matches for which the overall similarity between the aligned segments is below a threshold. We show that our approach can accurately estimate substitution frequencies even for distantly related sequences that cannot be analyzed with existing alignment-free methods; phylogenetic trees constructed with FSWM distances are of high quality. A program run on a pair of eukaryotic genomes of a few hundred Mb each takes a few minutes. Availability and Implementation: The program source code for FSWM, including documentation, as well as the software that we used to generate artificial genome sequences, are freely available at http://fswm.gobics.de/ Contact: chris.leimeister@stud.uni-goettingen.de Supplementary information: Supplementary data are available at Bioinformatics online. PMID:28073754
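The procedure described in this abstract can be illustrated with a minimal Python sketch. It is not the FSWM program itself: the function names, the '1'/'0' pattern notation, the +1/-1 scoring of don't-care positions, and the Jukes-Cantor correction applied at the end are all assumptions made for illustration, and every pair of windows sharing a spaced word is compared without further selection.

```python
import math
from collections import defaultdict


def spaced_words(seq, pattern):
    """Yield (key, start) for every window of seq.

    The key is built from the characters at the match positions ('1')
    of the pattern; '0' marks a don't-care position (assumed notation).
    """
    match_pos = [i for i, c in enumerate(pattern) if c == "1"]
    for start in range(len(seq) - len(pattern) + 1):
        yield "".join(seq[start + i] for i in match_pos), start


def spaced_word_distance(seq_a, seq_b, pattern, min_score=0):
    """Estimate substitutions per site from filtered spaced-word matches.

    Toy scoring (an assumption): +1 per identical and -1 per differing
    don't-care position. Matches scoring below min_score are discarded.
    Returns None when no match survives the filter.
    """
    dont_care = [i for i, c in enumerate(pattern) if c == "0"]

    # Index all spaced words of the first sequence by their key.
    index = defaultdict(list)
    for key, start in spaced_words(seq_a, pattern):
        index[key].append(start)

    mismatches = 0
    compared = 0
    for key, start_b in spaced_words(seq_b, pattern):
        for start_a in index.get(key, ()):
            diff = sum(
                seq_a[start_a + i] != seq_b[start_b + i] for i in dont_care
            )
            score = (len(dont_care) - diff) - diff
            if score < min_score:
                continue  # filter out likely spurious matches
            mismatches += diff
            compared += len(dont_care)

    if compared == 0:
        return None
    p = mismatches / compared  # observed mismatch fraction
    if p >= 0.75:
        return math.inf  # Jukes-Cantor correction is undefined here
    # Jukes-Cantor correction (assumed here as the substitution model).
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)
```

Calling spaced_word_distance("ACGTTGCA", "ACATTGCA", "10001") compares the three don't-care positions of each surviving match. A real implementation would use a nucleotide substitution matrix for the filter and a much longer pattern.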
MISFITS: evaluating the goodness of fit between a phylogenetic model and an alignment.

PubMed

Nguyen, Minh Anh Thi; Klaere, Steffen; von Haeseler, Arndt

2011-01-01

As models of sequence evolution become more and more complicated, many criteria for model selection have been proposed, and tools are available to select the best model for an alignment under a particular criterion. However, in many instances the selected model fails to explain the data adequately as reflected by large deviations between observed pattern frequencies and the corresponding expectation. We present MISFITS, an approach to evaluate the goodness of fit (http://www.cibiv.at/software/misfits). MISFITS introduces a minimum number of "extra substitutions" on the inferred tree to provide a biologically motivated explanation why the alignment may deviate from expectation. These extra substitutions plus the evolutionary model then fully explain the alignment. We illustrate the method on several examples and then give a survey about the goodness of fit of the selected models to the alignments in the PANDIT database.
Self-organized sorting limits behavioral variability in swarms

PubMed Central

Copenhagen, Katherine; Quint, David A.; Gopinathan, Ajay

2016-01-01

Swarming is a phenomenon where collective motion arises from simple local interactions between typically identical individuals. Here, we investigate the effects of variability in behavior among the agents in finite swarms with both alignment and cohesive interactions. We show that swarming is abolished above a critical fraction of non-aligners who do not participate in alignment. In certain regimes, however, swarms above the critical threshold can dynamically reorganize and sort out excess non-aligners to maintain the average fraction close to the critical value. This persists even in swarms with a distribution of alignment interactions, suggesting a simple, robust and efficient mechanism that allows heterogeneously mixed populations to naturally regulate their composition and remain in a collective swarming state or even differentiate among behavioral phenotypes. We show that, for evolving swarms, this self-organized sorting behavior can couple to the evolutionary dynamics leading to new evolutionarily stable equilibrium populations set by the physical swarm parameters. PMID:27550316
Self-organized sorting limits behavioral variability in swarms

NASA Astrophysics Data System (ADS)

Copenhagen, Katherine; Quint, David A.; Gopinathan, Ajay

2016-08-01

Swarming is a phenomenon where collective motion arises from simple local interactions between typically identical individuals. Here, we investigate the effects of variability in behavior among the agents in finite swarms with both alignment and cohesive interactions. We show that swarming is abolished above a critical fraction of non-aligners who do not participate in alignment. In certain regimes, however, swarms above the critical threshold can dynamically reorganize and sort out excess non-aligners to maintain the average fraction close to the critical value. This persists even in swarms with a distribution of alignment interactions, suggesting a simple, robust and efficient mechanism that allows heterogeneously mixed populations to naturally regulate their composition and remain in a collective swarming state or even differentiate among behavioral phenotypes. We show that, for evolving swarms, this self-organized sorting behavior can couple to the evolutionary dynamics leading to new evolutionarily stable equilibrium populations set by the physical swarm parameters.
The emergence of human prosociality: aligning with others through feelings, concerns, and norms

PubMed Central

Jensen, Keith; Vaish, Amrisha; Schmidt, Marco F. H.

2014-01-01

The fact that humans cooperate with nonkin is something we take for granted, but this is an anomaly in the animal kingdom. Our species’ ability to behave prosocially may be based on human-unique psychological mechanisms. We argue here that these mechanisms include the ability to care about the welfare of others (other-regarding concerns), to “feel into” others (empathy), and to understand, adhere to, and enforce social norms (normativity). We consider how these motivational, emotional, and normative substrates of prosociality develop in childhood and emerged in our evolutionary history. Moreover, we suggest that these three mechanisms all serve the critical function of aligning individuals with others: Empathy and other-regarding concerns align individuals with one another, and norms align individuals with their group. Such alignment allows us to engage in the kind of large-scale cooperation seen uniquely in humans. PMID:25120521
A vulnerability assessment of 300 species in Florida: threats from sea level rise, land use, and climate change.

PubMed

Reece, Joshua Steven; Noss, Reed F; Oetting, Jon; Hoctor, Tom; Volk, Michael

2013-01-01

Species face many threats, including accelerated climate change, sea level rise, and conversion and degradation of habitat from human land uses. Vulnerability assessments and prioritization protocols have been proposed to assess these threats, often in combination with information such as species rarity; ecological, evolutionary or economic value; and likelihood of success. Nevertheless, few vulnerability assessments or prioritization protocols simultaneously account for multiple threats or conservation values. We applied a novel vulnerability assessment tool, the Standardized Index of Vulnerability and Value, to assess the conservation priority of 300 species of plants and animals in Florida given projections of climate change, human land-use patterns, and sea level rise by the year 2100. We account for multiple sources of uncertainty and prioritize species under five different systems of value, ranging from a primary emphasis on vulnerability to threats to an emphasis on metrics of conservation value such as phylogenetic distinctiveness. Our results reveal remarkable consistency in the prioritization of species across different conservation value systems. Species of high priority include the Miami blue butterfly (Cyclargus thomasi bethunebakeri), Key tree cactus (Pilosocereus robinii), Florida duskywing butterfly (Ephyriades brunnea floridensis), and Key deer (Odocoileus virginianus clavium). We also identify sources of uncertainty and the types of life history information consistently missing across taxonomic groups. This study characterizes the vulnerabilities to major threats of a broad swath of Florida's biodiversity and provides a system for prioritizing conservation efforts that is quantitative, flexible, and free from hidden value judgments.
Scanning wave photopolymerization enables dye-free alignment patterning of liquid crystals

PubMed Central

Hisano, Kyohei; Aizawa, Miho; Ishizu, Masaki; Kurata, Yosuke; Nakano, Wataru; Akamatsu, Norihisa; Barrett, Christopher J.; Shishido, Atsushi

2017-01-01

Hierarchical control of two-dimensional (2D) molecular alignment patterns over large areas is essential for designing high-functional organic materials and devices. However, even by the most powerful current methods, dye molecules that discolor and destabilize the materials need to be doped in, complicating the process. We present a dye-free alignment patterning technique, based on a scanning wave photopolymerization (SWaP) concept, that achieves a spatial light–triggered mass flow to direct molecular order using scanning light to propagate the wavefront. This enables one to generate macroscopic, arbitrary 2D alignment patterns in a wide variety of optically transparent polymer films from various polymerizable mesogens with sufficiently high birefringence (>0.1) merely by single-step photopolymerization, without alignment layers or polarized light sources. A set of 150,000 arrays of a radial alignment pattern with a size of 27.4 μm × 27.4 μm were successfully inscribed by SWaP, in which each individual pattern is smaller by a factor of 10^4 than that achievable by conventional photoalignment methods. This dye-free inscription of microscopic, complex alignment patterns over large areas provides a new pathway for designing higher-performance optical and mechanical devices. PMID:29152567
Polarization of Magnetic Dipole Emission and Spinning Dust Emission from Magnetic Nanoparticles

NASA Astrophysics Data System (ADS)

Hoang, Thiem; Lazarian, Alex

2016-04-01

Magnetic dipole emission (MDE) from interstellar magnetic nanoparticles is potentially an important Galactic foreground in the microwave frequencies, and its polarization level may pose great challenges for achieving reliable measurements of cosmic microwave background B-mode signal. To obtain realistic predictions for the polarization of MDE, we first compute the degree of alignment of big silicate grains incorporated with magnetic inclusions. We find that thermally rotating big grains with magnetic inclusions are weakly aligned and can achieve alignment saturation when the magnetic alignment rate becomes much faster than the rotational damping rate. We then compute the degree of alignment for free-flying magnetic nanoparticles, taking into account various interaction processes of grains with the ambient gas and radiation field, including neutral collisions, ion collisions, and infrared emission. We find that the rotational damping by infrared emission can significantly decrease the degree of alignment of small particles from the saturation level, whereas the excitation by ion collisions can enhance the alignment of ultrasmall particles. Using the computed degrees of alignment, we predict the polarization level of MDE from free-flying magnetic nanoparticles to be rather low. Such a polarization level is within the upper limits measured for anomalous microwave emission (AME), which indicates that MDE from free-flying iron particles may not be ruled out as a source of AME. We also quantify rotational emission from free-flying iron nanoparticles with permanent magnetic moments and find that its emissivity is about one order of magnitude lower than that from spinning polycyclic aromatic hydrocarbons.
POLARIZATION OF MAGNETIC DIPOLE EMISSION AND SPINNING DUST EMISSION FROM MAGNETIC NANOPARTICLES

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hoang, Thiem; Lazarian, Alex

2016-04-20

Magnetic dipole emission (MDE) from interstellar magnetic nanoparticles is potentially an important Galactic foreground in the microwave frequencies, and its polarization level may pose great challenges for achieving reliable measurements of cosmic microwave background B-mode signal. To obtain realistic predictions for the polarization of MDE, we first compute the degree of alignment of big silicate grains incorporated with magnetic inclusions. We find that thermally rotating big grains with magnetic inclusions are weakly aligned and can achieve alignment saturation when the magnetic alignment rate becomes much faster than the rotational damping rate. We then compute the degree of alignment for free-flying magnetic nanoparticles, taking into account various interaction processes of grains with the ambient gas and radiation field, including neutral collisions, ion collisions, and infrared emission. We find that the rotational damping by infrared emission can significantly decrease the degree of alignment of small particles from the saturation level, whereas the excitation by ion collisions can enhance the alignment of ultrasmall particles. Using the computed degrees of alignment, we predict the polarization level of MDE from free-flying magnetic nanoparticles to be rather low. Such a polarization level is within the upper limits measured for anomalous microwave emission (AME), which indicates that MDE from free-flying iron particles may not be ruled out as a source of AME. We also quantify rotational emission from free-flying iron nanoparticles with permanent magnetic moments and find that its emissivity is about one order of magnitude lower than that from spinning polycyclic aromatic hydrocarbons.

The Most Deeply Conserved Noncoding Sequences in Plants Serve Similar Functions to Those in Vertebrates Despite Large Differences in Evolutionary Rates

PubMed Central

Burgess, Diane; Freeling, Michael

2014-01-01

In vertebrates, conserved noncoding elements (CNEs) are functionally constrained sequences that can show striking conservation over >400 million years of evolutionary distance and frequently are located megabases away from target developmental genes. Conserved noncoding sequences (CNSs) in plants are much shorter, and it has been difficult to detect conservation among distantly related genomes. In this article, we show not only that CNS sequences can be detected throughout the eudicot clade of flowering plants, but also that a subset of 37 CNSs can be found in all flowering plants (diverging ∼170 million years ago). These CNSs are functionally similar to vertebrate CNEs, being highly associated with transcription factor and development genes and enriched in transcription factor binding sites. Some of the most highly conserved sequences occur in genes encoding RNA binding proteins, particularly the RNA splicing–associated SR genes. Differences in sequence conservation between plants and animals are likely to reflect differences in the biology of the organisms, with plants being much more able to tolerate genomic deletions and whole-genome duplication events due, in part, to their far greater fecundity compared with vertebrates. PMID:24681619
Increased alignment sensitivity improves the usage of genome alignments for comparative gene annotation.

PubMed

Sharma, Virag; Hiller, Michael

2017-08-21

Genome alignments provide a powerful basis to transfer gene annotations from a well-annotated reference genome to many other aligned genomes. The completeness of these annotations crucially depends on the sensitivity of the underlying genome alignment. Here, we investigated the impact of the genome alignment parameters and found that parameters with a higher sensitivity allow the detection of thousands of novel alignments between orthologous exons that have been missed before. In particular, comparisons between species separated by an evolutionary distance of >0.75 substitutions per neutral site, like human and other non-placental vertebrates, benefit from increased sensitivity. To systematically test if increased sensitivity improves comparative gene annotations, we built a multiple alignment of 144 vertebrate genomes and used this alignment to map human genes to the other 143 vertebrates with CESAR. We found that higher alignment sensitivity substantially improves the completeness of comparative gene annotations by adding on average 2382 and 7440 novel exons and 117 and 317 novel genes for mammalian and non-mammalian species, respectively. Our results suggest a more sensitive alignment strategy that should generally be used for genome alignments between distantly-related species. Our 144-vertebrate genome alignment and the comparative gene annotations (https://bds.mpi-cbg.de/hillerlab/144VertebrateAlignment_CESAR/) are a valuable resource for comparative genomics. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Regulation of G-protein coupled receptor traffic by an evolutionary conserved hydrophobic signal.

PubMed

Angelotti, Tim; Daunt, David; Shcherbakova, Olga G; Kobilka, Brian; Hurt, Carl M

2010-04-01

Plasma membrane (PM) expression of G-protein coupled receptors (GPCRs) is required for activation by extracellular ligands; however, mechanisms that regulate PM expression of GPCRs are poorly understood. For some GPCRs, such as alpha2c-adrenergic receptors (alpha(2c)-ARs), heterologous expression in non-native cells results in limited PM expression and extensive endoplasmic reticulum (ER) retention. Recently, ER export/retention signals have been proposed to regulate cellular trafficking of several GPCRs. By utilizing a chimeric alpha(2a)/alpha(2c)-AR strategy, we identified an evolutionarily conserved hydrophobic sequence (ALAAALAAAAA) in the extracellular amino terminal region that is responsible in part for alpha(2c)-AR subtype-specific trafficking. To our knowledge, this is the first luminal ER retention signal reported for a GPCR. Removal or disruption of the ER retention signal dramatically increased PM expression and decreased ER retention. Conversely, transplantation of this hydrophobic sequence into alpha(2a)-ARs reduced their PM expression and increased ER retention. This evolutionarily conserved hydrophobic trafficking signal within alpha(2c)-ARs serves as a regulator of GPCR trafficking.
The origin, current diversity and future conservation of the modern lion (Panthera leo)

PubMed Central

Barnett, Ross; Yamaguchi, Nobuyuki; Barnes, Ian; Cooper, Alan

2006-01-01

Understanding the phylogeographic processes affecting endangered species is crucial both to interpreting their evolutionary history and to the establishment of conservation strategies. Lions provide a key opportunity to explore such processes; however, a lack of genetic diversity and shortage of suitable samples has until now hindered such investigation. We used mitochondrial control region DNA (mtDNA) sequences to investigate the phylogeographic history of modern lions, using samples from across their entire range. We find the sub-Saharan African lions are basal among modern lions, supporting a single African origin model of modern lion evolution, equivalent to the ‘recent African origin’ model of modern human evolution. We also find the greatest variety of mtDNA haplotypes in the centre of Africa, which may be due to the distribution of physical barriers and continental-scale habitat changes caused by Pleistocene glacial oscillations. Our results suggest that the modern lion may currently consist of three geographic populations on the basis of their recent evolutionary history: North African–Asian, southern African and middle African. Future conservation strategies should take these evolutionary subdivisions into consideration. PMID:16901830
Spontaneous magnetic alignment behaviour in free-living lizards.

PubMed

Diego-Rasilla, Francisco J; Pérez-Mellado, Valentín; Pérez-Cembranos, Ana

2017-04-01

Several species of vertebrates exhibit spontaneous longitudinal body axis alignment relative to the Earth's magnetic field (i.e., magnetic alignment) while they are performing different behavioural tasks. Since magnetoreception is still not fully understood, studying magnetic alignment provides evidence for magnetoreception and broadens current knowledge of magnetic sense in animals. Furthermore, magnetic alignment widens the roles of magnetic sensitivity in animals and may contribute to shed new light on magnetoreception. In this context, spontaneous alignment in two species of lacertid lizards (Podarcis muralis and Podarcis lilfordi) during basking periods was monitored. Alignments in 255 P. muralis and 456 P. lilfordi were measured over a 5-year period. The possible influence of the sun's position (i.e., altitude and azimuth) and geomagnetic field values corresponding to the moment in which a particular lizard was observed on lizards' body axis orientation was evaluated. Both species exhibited a highly significant bimodal orientation along the north-northeast and south-southwest magnetic axis. The evidence from this study suggests that free-living lacertid lizards exhibit magnetic alignment behaviour, since their body alignments cannot be explained by an effect of the sun's position. On the contrary, lizard orientations were significantly correlated with geomagnetic field values at the time of each observation. We suggest that this behaviour might provide lizards with a constant directional reference while they are sun basking. This directional reference might improve their mental map of space to accomplish efficient escape behaviour. This study is the first to provide spontaneous magnetic alignment behaviour in free-living reptiles.
Spontaneous magnetic alignment behaviour in free-living lizards

NASA Astrophysics Data System (ADS)

Diego-Rasilla, Francisco J.; Pérez-Mellado, Valentín; Pérez-Cembranos, Ana

2017-04-01

Several species of vertebrates exhibit spontaneous longitudinal body axis alignment relative to the Earth's magnetic field (i.e., magnetic alignment) while they are performing different behavioural tasks. Since magnetoreception is still not fully understood, studying magnetic alignment provides evidence for magnetoreception and broadens current knowledge of magnetic sense in animals. Furthermore, magnetic alignment widens the roles of magnetic sensitivity in animals and may contribute to shed new light on magnetoreception. In this context, spontaneous alignment in two species of lacertid lizards (Podarcis muralis and Podarcis lilfordi) during basking periods was monitored. Alignments in 255 P. muralis and 456 P. lilfordi were measured over a 5-year period. The possible influence of the sun's position (i.e., altitude and azimuth) and geomagnetic field values corresponding to the moment in which a particular lizard was observed on lizards' body axis orientation was evaluated. Both species exhibited a highly significant bimodal orientation along the north-northeast and south-southwest magnetic axis. The evidence from this study suggests that free-living lacertid lizards exhibit magnetic alignment behaviour, since their body alignments cannot be explained by an effect of the sun's position. On the contrary, lizard orientations were significantly correlated with geomagnetic field values at the time of each observation. We suggest that this behaviour might provide lizards with a constant directional reference while they are sun basking. This directional reference might improve their mental map of space to accomplish efficient escape behaviour. This study is the first to provide spontaneous magnetic alignment behaviour in free-living reptiles.
Structural analysis of key gap junction domains--Lessons from genome data and disease-linked mutants.

PubMed

Bai, Donglin

2016-02-01

A gap junction (GJ) channel is formed by docking of two GJ hemichannels and each of these hemichannels is a hexamer of connexins. All connexin genes have been identified in human, mouse, and rat genomes and their homologous genes in many other vertebrates are available in public databases. The protein sequences of these connexins align well with high sequence identity in the same connexin across different species. Domains in closely related connexins and several residues in all known connexins are also well-conserved. These conserved residues form signatures (also known as sequence logos) in these domains and are likely to play important biological functions. In this review, the sequence logos of individual connexins, groups of connexins with common ancestors, and all connexins are analyzed to visualize natural evolutionary variations and the hot spots for human disease-linked mutations. Several gap junction domains are homologous, likely forming similar structures essential for their function. The availability of a high resolution Cx26 GJ structure and the subsequently-derived homology structure models for other connexin GJ channels elevated our understanding of sequence logos at the three-dimensional GJ structure level, thus facilitating the understanding of how disease-linked connexin mutants might impair GJ structure and function. This knowledge will enable the design of complementary variants to rescue disease-linked mutants. Copyright © 2015 Elsevier Ltd. All rights reserved.
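As a small illustration of what a sequence logo summarizes, the Python sketch below computes the information content of each alignment column, which determines the total stack height at that position in a logo. It is a generic calculation, not the review's own pipeline: the function names are hypothetical, gaps are simply ignored, and no small-sample correction is applied.

```python
import math
from collections import Counter


def column_information(column, alphabet_size=20):
    """Information content (bits) of one alignment column.

    Computed as log2(alphabet_size) minus the Shannon entropy of the
    residue frequencies. Gap characters ('-') are skipped (an assumption).
    """
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return math.log2(alphabet_size) - entropy


def logo_heights(aligned_seqs, alphabet_size=20):
    """Return the information content of every column of an alignment."""
    return [column_information(col, alphabet_size) for col in zip(*aligned_seqs)]
```

A fully conserved protein column reaches the maximum of log2(20), about 4.32 bits, while a column with all residues equally frequent carries no information.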
Adiabatic Field-Free Alignment of Asymmetric Top Molecules with an Optical Centrifuge.

PubMed

Korobenko, A; Milner, V

2016-05-06

We use an optical centrifuge to align asymmetric top SO₂ molecules by adiabatically spinning their most polarizable O-O axis. The effective centrifugal potential in the rotating frame confines the sulfur atoms to the plane of the laser-induced rotation, leading to the planar molecular alignment that persists after the molecules are released from the centrifuge. The periodic appearance of the full three-dimensional alignment, typically observed only with linear and symmetric top molecules, is also detected. Together with strong in-plane centrifugal forces, which bend the molecules by up to 10 deg, permanent field-free alignment offers new ways of controlling molecules with laser light.
A Conserved Endocrine Mechanism Controls the Formation of Dauer and Infective Larvae in Nematodes

PubMed Central

Ogawa, Akira; Streit, Adrian; Antebi, Adam; Sommer, Ralf J.

2009-01-01

Summary: Under harsh environmental conditions Caenorhabditis elegans larvae undergo arrest and form dauer larvae that can attach to other animals to facilitate dispersal[1]. It has been argued that this phenomenon, called phoresy, represents an intermediate step towards parasitism[2, 3]. Indeed, parasitic nematodes invade their hosts as infective larvae, a stage that shows striking morphological similarities to dauer larvae[1]. While the molecular regulation of dauer entry in C. elegans involves insulin and TGF-ß signaling[4-8], studies of TGF-ß orthologues in parasitic nematodes did not provide evidence for a common origin of dauer and infective larvae[9-14]. To identify conserved candidate regulators between Caenorhabditis and parasitic nematodes we used an evolutionary approach involving Pristionchus pacificus as intermediate. We show by mutational and pharmacological analysis that Pristionchus and Caenorhabditis share the dafachronic acid-DAF-12 system as core endocrine module for dauer formation. One of the dafachronic acids, Δ7-DA, has a conserved role in the mammalian parasite Strongyloides papillosus where it controls entry into the infective stage. Application of Δ7-DA blocks formation of infective larvae and results in the generation of free-living animals. The conservation of this small molecule ligand represents a fundamental link between dauer and infective larvae and might provide a general strategy for nematode parasitism. PMID:19110431
Nuclear rDNA pseudogenes in Chagas disease vectors: evolutionary implications of a new 5.8S+ITS-2 paralogous sequence marker in triatomines of North, Central and northern South America.

PubMed

Bargues, M Dolores; Zuriaga, M Angeles; Mas-Coma, Santiago

2014-01-01

A pseudogene, paralogous to rDNA 5.8S and ITS-2, is described in Meccus dimidiata dimidiata, M. d. capitata, M. d. maculipennis, M. d. hegneri, M. sp. aff. dimidiata, M. p. phyllosoma, M. p. longipennis, M. p. pallidipennis, M. p. picturata, M. p. mazzottii, Triatoma mexicana, Triatoma nitida and Triatoma sanguisuga, covering North America, Central America and northern South America. Such a nuclear rDNA pseudogene is very rare. In the 5.8S gene, criteria for pseudogene identification included length variability, lower GC content, mutations relative to the functional uniform sequence, and relatively high base substitutions in evolutionarily conserved sites. At ITS-2 level, criteria were the shorter sequence and large proportion of insertions and deletions (indels). Pseudogenic 5.8S and ITS-2 secondary structures were different from the functional foldings and from one another, showing less negative values for minimum free energy (mfe) and centroid predictions, and lower fit between mfe, partition function, and centroid structures. A complete characterization indicated a processed pseudogenic unit of the ghost type, escaping from rDNA concerted evolution and with functionality subject to constraints instead of evolving freely by neutral drift. Despite a high indel number, low mutation number and an evolutionary rate similar to the functional ITS-2, that pseudogene distinguishes different taxa and furnishes coherent phylogenetic topologies with resolution similar to the functional ITS-2. The discovery of a pseudogene in many phylogenetically related species is unique in animals and allowed for an estimation of its palaeobiogeographical origin based on molecular clock data, inheritance pathways, evolutionary rate and pattern, and geographical spread. In addition to the technical risk to be considered henceforth, this relict pseudogene, designated as "ps(5.8S+ITS-2)", proves to be a valuable marker for specimen classification, phylogenetic analyses, and systematic/taxonomic studies. It opens a new research field, Chagas disease epidemiology and control included, given its potential relationships with triatomine fitness, behaviour and adaptability. Copyright © 2013 Elsevier B.V. All rights reserved.
Informational Gene Phylogenies Do Not Support a Fourth Domain of Life for Nucleocytoplasmic Large DNA Viruses

PubMed Central

Williams, Tom A.; Embley, T. Martin; Heinz, Eva

2011-01-01

Mimivirus is a nucleocytoplasmic large DNA virus (NCLDV) with a genome size (1.2 Mb) and coding capacity (∼1000 genes) comparable to that of some cellular organisms. Unlike other viruses, Mimivirus and its NCLDV relatives encode homologs of broadly conserved informational genes found in Bacteria, Archaea, and Eukaryotes, raising the possibility that they could be placed on the tree of life. A recent phylogenetic analysis of these genes showed the NCLDVs emerging as a monophyletic group branching between Eukaryotes and Archaea. These trees were interpreted as evidence for an independent “fourth domain” of life that may have contributed DNA processing genes to the ancestral eukaryote. However, the analysis of ancient evolutionary events is challenging, and tree reconstruction is susceptible to bias resulting from non-phylogenetic signals in the data. These include compositional heterogeneity and homoplasy, which can lead to the spurious grouping of compositionally-similar or fast-evolving sequences. Here, we show that these informational gene alignments contain both significant compositional heterogeneity and homoplasy, which were not adequately modelled in the original analysis. When we use more realistic evolutionary models that better fit the data, the resulting trees are unable to reject a simple null hypothesis in which these informational genes, like many other NCLDV genes, were acquired by horizontal transfer from eukaryotic hosts. Our results suggest that a fourth domain is not required to explain the available sequence data. PMID:21698163
Evolutionary Insights from a Genetically Divergent Hantavirus Harbored by the European Common Mole (Talpa europaea)

PubMed Central

Kang, Hae Ji; Bennett, Shannon N.; Sumibcay, Laarni; Arai, Satoru; Hope, Andrew G.; Mocz, Gabor; Song, Jin-Won; Cook, Joseph A.; Yanagihara, Richard

2009-01-01

Background: The discovery of genetically distinct hantaviruses in shrews (Order Soricomorpha, Family Soricidae) from widely separated geographic regions challenges the hypothesis that rodents (Order Rodentia, Family Muridae and Cricetidae) are the primordial reservoir hosts of hantaviruses and also predicts that other soricomorphs harbor hantaviruses. Recently, novel hantavirus genomes have been detected in moles of the Family Talpidae, including the Japanese shrew mole (Urotrichus talpoides) and American shrew mole (Neurotrichus gibbsii). We present new insights into the evolutionary history of hantaviruses gained from a highly divergent hantavirus, designated Nova virus (NVAV), identified in the European common mole (Talpa europaea) captured in Hungary. Methodology/Principal Findings: Pair-wise alignment and comparison of the full-length S- and L-genomic segments indicated moderately low sequence similarity of 54–65% and 46–63% at the nucleotide and amino acid levels, respectively, between NVAV and representative rodent- and soricid-borne hantaviruses. Despite the high degree of sequence divergence, the predicted secondary structure of the NVAV nucleocapsid protein exhibited the characteristic coiled-coil domains at the amino-terminal end, and the L-segment motifs, typically found in hantaviruses, were well conserved. Phylogenetic analyses, using maximum-likelihood and Bayesian methods, showed that NVAV formed a distinct clade that was evolutionarily distant from all other hantaviruses. Conclusions: Newly identified hantaviruses harbored by shrews and moles support long-standing virus-host relationships and suggest that ancestral soricomorphs, rather than rodents, may have been the early or original mammalian hosts. PMID:19582155
Saving seeds: Optimally planning our Ex Situ conservation collections to ensure species' evolutionary potential

Treesearch

Sean M. Hoban

2017-01-01

In the face of ongoing environmental change, conservation and natural resource agencies are initiating or expanding ex situ seed collections from natural plant populations. Seed collections have many uses, including in provenance trials, breeding programs, seed orchards, gene banks for long-term conservation (live plants or seeds), restoration, reforestation, and...
The post-genomic era of biological network alignment.

PubMed

Faisal, Fazle E; Meng, Lei; Crawford, Joseph; Milenković, Tijana

2015-12-01

Biological network alignment aims to find regions of topological and functional (dis)similarities between molecular networks of different species. Then, network alignment can guide the transfer of biological knowledge from well-studied model species to less well-studied species between conserved (aligned) network regions, thus complementing valuable insights that have already been provided by genomic sequence alignment. Here, we review computational challenges behind the network alignment problem, existing approaches for solving the problem, ways of evaluating their alignment quality, and the approaches' biomedical applications. We discuss recent innovative efforts of improving the existing view of network alignment. We conclude with open research questions in comparative biological network research that could further our understanding of principles of life, evolution, disease, and therapeutics.
FEAST: sensitive local alignment with multiple rates of evolution.

PubMed

Hudek, Alexander K; Brown, Daniel G

2011-01-01

We present a pairwise local aligner, FEAST, which uses two new techniques: a sensitive extension algorithm for identifying homologous subsequences, and a descriptive probabilistic alignment model. We also present a new procedure for training alignment parameters and apply it to the human and mouse genomes, producing a better parameter set for these sequences. Our extension algorithm identifies homologous subsequences by considering all evolutionary histories. It has higher maximum sensitivity than Viterbi extensions, and better balances specificity. We model alignments with several submodels, each with unique statistical properties, describing strongly similar and weakly similar regions of homologous DNA. Training parameters using two submodels produces superior alignments, even when we align with only the parameters from the weaker submodel. Our extension algorithm combined with our new parameter set achieves sensitivity 0.59 on synthetic tests. In contrast, LASTZ with default settings achieves sensitivity 0.35 with the same false positive rate. Using the weak submodel as parameters for LASTZ increases its sensitivity to 0.59 with high error. FEAST is available at http://monod.uwaterloo.ca/feast/.
Fabrication and Characterization of Aligned Flexible Lead-Free Piezoelectric Nanofibers for Wearable Device Applications

PubMed Central

Ji, Sang Hyun; Yun, Ji Sun

2018-01-01

Flexible lead-free piezoelectric nanofibers, based on BNT-ST (0.78Bi0.5Na0.5TiO3-0.22SrTiO3) ceramic and poly(vinylidene fluoride-trifluoroethylene) (PVDF-TrFE) copolymers, were fabricated by an electrospinning method and the effects of the degree of alignment in the nanofibers on the piezoelectric characteristics were investigated. The microstructure of the lead-free piezoelectric nanofibers was observed by field emission scanning electron microscope (FE-SEM) and the orientation was analyzed by fast Fourier transform (FFT) images. X-ray diffraction (XRD) analysis confirmed that the phase was not changed by the electrospinning process and maintained a perovskite phase. Polarization-electric field (P-E) loops and piezoresponse force microscopy (PFM) were used to investigate the piezoelectric properties of the piezoelectric nanofibers, according to the degree of alignment—the well aligned piezoelectric nanofibers had higher piezoelectric properties. Furthermore, the output voltage of the aligned lead-free piezoelectric nanofibers was measured according to the vibration frequency and the bending motion and the aligned piezoelectric nanofibers with a collector rotation speed of 1500 rpm performed the best. PMID:29596372
Some assembly required: evolutionary and systems perspectives on the mammalian reproductive system.

PubMed

Mordhorst, Bethany R; Wilson, Miranda L; Conant, Gavin C

2016-01-01

In this review, we discuss the way that insights from evolutionary theory and systems biology shed light on form and function in mammalian reproductive systems. In the first part of the review, we contrast the rapid evolution seen in some reproductive genes with the generally conservative nature of development. We discuss directional selection and coevolution as potential drivers of rapid evolution in sperm and egg proteins. Such rapid change is very different from the highly conservative nature of later embryo development. However, it is not unique, as some regions of the sex chromosomes also show elevated rates of evolutionary change. To explain these contradictory trends, we argue that it is not reproductive functions per se that induce rapid evolution. Rather, it is the fact that biotic interactions, such as speciation events and sexual conflict, have no evolutionary endpoint and hence can drive continuous evolutionary changes. Returning to the question of sex chromosome evolution, we discuss the way that recent advances in evolutionary genomics and systems biology and, in particular, the development of a theory of gene balance provide a better understanding of the evolutionary patterns seen on these chromosomes. We end the review with a discussion of a surprising and incompletely understood phenomenon observed in early embryos: namely the Warburg effect, whereby glucose is fermented to lactate and alanine rather than respired to carbon dioxide. We argue that evolutionary insights, from both yeasts and tumor cells, help to explain the Warburg effect, and that new metabolic modeling approaches are useful in assessing the potential sources of the effect.
Alignment-free detection of horizontal gene transfer between closely related bacterial genomes.

PubMed

Domazet-Lošo, Mirjana; Haubold, Bernhard

2011-09-01

Bacterial epidemics are often caused by strains that have acquired their increased virulence through horizontal gene transfer. Due to this association with disease, the detection of horizontal gene transfer continues to receive attention from microbiologists and bioinformaticians alike. Most software for detecting transfer events is based on alignments of sets of genes or of entire genomes. But despite great advances in the design of algorithms and computer programs, genome alignment remains computationally challenging. We have therefore developed an alignment-free algorithm for rapidly detecting horizontal gene transfer between closely related bacterial genomes. Our implementation of this algorithm is called alfy for "ALignment Free local homologY" and is freely available from http://guanine.evolbio.mpg.de/alfy/. In this comment we demonstrate the application of alfy to the genomes of Staphylococcus aureus. We also argue that, contrary to popular belief and in spite of increasing computer speed, algorithmic optimization is becoming more, not less, important if genome data continues to accumulate at the present rate.
Evolutionary Meta-Analysis of Association Studies Reveals Ancient Constraints Affecting Disease Marker Discovery

PubMed Central

Dudley, Joel T.; Chen, Rong; Sanderford, Maxwell; Butte, Atul J.; Kumar, Sudhir

2012-01-01

Genome-wide disease association studies contrast genetic variation between disease cohorts and healthy populations to discover single nucleotide polymorphisms (SNPs) and other genetic markers revealing underlying genetic architectures of human diseases. Despite scores of efforts over the past decade, many reproducible genetic variants that explain substantial proportions of the heritable risk of common human diseases remain undiscovered. We have conducted a multispecies genomic analysis of 5,831 putative human risk variants for more than 230 disease phenotypes reported in 2,021 studies. We find that the current approaches show a propensity for discovering disease-associated SNPs (dSNPs) at conserved genomic positions because the effect size (odds ratio) and allelic P value of genetic association of an SNP relates strongly to the evolutionary conservation of their genomic position. We propose a new measure for ranking SNPs that integrates evolutionary conservation scores and the P value (E-rank). Using published data from a large case-control study, we demonstrate that E-rank method prioritizes SNPs with a greater likelihood of bona fide and reproducible genetic disease associations, many of which may explain greater proportions of genetic variance. Therefore, long-term evolutionary histories of genomic positions offer key practical utility in reassessing data from existing disease association studies, and in the design and analysis of future studies aimed at revealing the genetic basis of common human diseases. PMID:22389448
Identifying Genetic Hotspots by Mapping Molecular Diversity of Widespread Trees: When Commonness Matters.

PubMed

Souto, Cintia P; Mathiasen, Paula; Acosta, María Cristina; Quiroga, María Paula; Vidal-Russell, Romina; Echeverría, Cristian; Premoli, Andrea C

2015-01-01

Conservation planning requires setting priorities at the same spatial scale at which decision-making processes are undertaken considering all levels of biodiversity, but current methods for identifying biodiversity hotspots ignore its genetic component. We developed a fine-scale approach based on the definition of genetic hotspots, which have high genetic diversity and unique variants that represent their evolutionary potential and evolutionary novelties. Our hypothesis is that wide-ranging taxa with similar ecological tolerances, yet of phylogenetically independent lineages, have been and currently are shaped by ecological and evolutionary forces that result in geographically concordant genetic patterns. We mapped previously published genetic diversity and unique variants of biparentally inherited markers and chloroplast sequences for 9 species from 188 and 275 populations, respectively, of the 4 woody dominant families of the austral temperate forest, an area considered a biodiversity hotspot. Spatial distribution patterns of genetic polymorphisms differed among taxa according to their ecological tolerances. Eight genetic hotspots were detected and we recommend conservation actions for some in the southern Coastal Range in Chile. Existing spatially explicit genetic data from multiple populations and species can help to identify biodiversity hotspots and guide conservation actions to establish science-based protected areas that will preserve the evolutionary potential of key habitats and species. © The American Genetic Association 2015. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

Evolutionary effects of alternative artificial propagation programs: implications for viability of endangered anadromous salmonids

PubMed Central

McClure, Michelle M; Utter, Fred M; Baldwin, Casey; Carmichael, Richard W; Hassemer, Peter F; Howell, Philip J; Spruell, Paul; Cooney, Thomas D; Schaller, Howard A; Petrosky, Charles E

2008-01-01

Most hatchery programs for anadromous salmonids have been initiated to increase the numbers of fish for harvest, to mitigate for habitat losses, or to increase abundance in populations at low abundance. However, the manner in which these programs are implemented can have significant impacts on the evolutionary trajectory and long-term viability of populations. In this paper, we review the potential benefits and risks of hatchery programs relative to the conservation of species listed under the US Endangered Species Act. To illustrate, we present the range of potential effects within a population as well as among populations of Chinook salmon (Oncorhynchus tshawytscha) where changes to major hatchery programs are being considered. We apply evolutionary considerations emerging from these examples to suggest broader principles for hatchery uses that are consistent with conservation goals. We conclude that because of the evolutionary risks posed by artificial propagation programs, they should not be viewed as a substitute for addressing other limiting factors that prevent achieving viability. At the population level, artificial propagation programs that are implemented as a short-term approach to avoid imminent extinction are more likely to achieve long-term population viability than approaches that rely on long-term supplementation. In addition, artificial propagation programs can have out-of-population impacts that should be considered in conservation planning. PMID:25567637
Structural analysis of polarizing indels: an emerging consensus on the root of the tree of life

PubMed Central

2009-01-01

Background: The root of the tree of life has been a holy grail ever since Darwin first used the tree as a metaphor for evolution. New methods seek to narrow down the location of the root by excluding it from branches of the tree of life. This is done by finding traits that must be derived, and excluding the root from the taxa those traits cover. However, the two most comprehensive attempts at this strategy, performed by Cavalier-Smith and Lake et al., have excluded each other's rootings. Results: The indel polarizations of Lake et al. rely on high quality alignments between paralogs that diverged before the last universal common ancestor (LUCA). Therefore, sequence alignment artifacts may skew their conclusions. We have reviewed their data using protein structure information where available. Several of the conclusions are quite different when viewed in the light of structure, which is conserved over longer evolutionary time scales than sequence. We argue there is no polarization that excludes the root from all Gram-negatives, and that polarizations robustly exclude the root from the Archaea. Conclusion: We conclude that there is no contradiction between the polarization datasets. The combination of these datasets excludes the root from every possible position except near the Chloroflexi. Reviewers: This article was reviewed by Greg Fournier (nominated by J. Peter Gogarten), Purificación López-García, and Eugene Koonin. PMID:19706177
Differentiated evolutionary relationships among chordates from comparative alignments of multiple sequences of MyoD and MyoG myogenic regulatory factors.

PubMed

Oliani, L C; Lidani, K C F; Gabriel, J E

2015-10-16

MyoD and MyoG are transcription factors that have essential roles in myogenic lineage determination and muscle differentiation. The purpose of this study was to compare multiple amino acid sequences of myogenic regulatory proteins to infer evolutionary relationships among chordates. Protein sequences from mouse Mus musculus (P10085 and P12979), human Homo sapiens (P15172 and P15173), bovine Bos taurus (Q7YS82 and Q7YS81), wild pig Sus scrofa (P49811 and P49812), quail Coturnix coturnix (P21572 and P34060), chicken Gallus gallus (P16075 and P17920), rat Rattus norvegicus (Q02346 and P20428), domestic water buffalo Bubalus bubalis (D2SP11 and A7L034), and sheep Ovis aries (Q90477 and D3YKV7) were retrieved from the non-redundant protein sequence database UniProtKB/Swiss-Prot, and subsequently analyzed using the Mega6.0 software. MyoD evolutionary analyses revealed the presence of three main clusters with all mammals branched in one cluster, members of the order Rodentia (mouse and rat) in a second branch linked to the first, and birds of the order Galliformes (chicken and quail) remaining isolated in a third. MyoG evolutionary analyses aligned sequences in two main clusters, all mammalian specimens grouped in different sub-branches, and birds clustered in a second branch. These analyses suggest that the evolution of MyoD and MyoG was driven by different pathways.
Social Media: Menagerie of Metrics

DTIC Science & Technology

2010-01-27

intelligence, an evolutionary algorithm (EA) is a subset of evolutionary computation, a generic population-based metaheuristic optimization algorithm. An EA ... Cloning - 22 Animals were cloned to date; genetic algorithms can help prediction (e.g. “elitism” - attempts to ensure selection by including performers ... 28, 2010 Evolutionary Algorithm • Evolutionary algorithm From Wikipedia, the free encyclopedia Artificial intelligence portal In artificial
Multiple alignment-free sequence comparison

PubMed Central

Ren, Jie; Song, Kai; Sun, Fengzhu; Deng, Minghua; Reinert, Gesine

2013-01-01

Motivation: Recently, a range of new statistics have become available for the alignment-free comparison of two sequences based on k-tuple word content. Here, we extend these statistics to the simultaneous comparison of more than two sequences. Our suite of statistics contains, first, two extensions of statistics for pairwise comparison of the joint k-tuple content of all the sequences, and second, three averages of sums of pairwise comparison statistics. The two tasks we consider are, first, to identify sequences that are similar to a set of target sequences, and, second, to measure the similarity within a set of sequences. Results: Our investigation uses both simulated data as well as cis-regulatory module data where the task is to identify cis-regulatory modules with similar transcription factor binding sites. We find that although for real data, all of our statistics show a similar performance, on simulated data the Shepp-type statistics are in some instances outperformed by star-type statistics. The multiple alignment-free statistics are more sensitive to contamination in the data than the pairwise average statistics. Availability: Our implementation of the five statistics is available as an R package named ‘multiAlignFree’ at http://www-rcf.usc.edu/∼fsun/Programs/multiAlignFree/multiAlignFreemain.html. Contact: reinert@stats.ox.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online. PMID:23990418
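The abstract above does not give formulas for its five statistics, so the following is only a minimal sketch of the idea they build on: counting k-tuple words and comparing count vectors. It shows the classic unnormalized D2 statistic (the inner product of two k-tuple count vectors) and a plain average over all sequence pairs. The function names are hypothetical, and the star-type and Shepp-type normalizations used in the paper are not reproduced here.

```python
from collections import Counter
from itertools import combinations


def kmer_counts(seq, k):
    """Count every overlapping k-tuple (word) in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2(x, y):
    """Unnormalized D2: inner product of two k-tuple count vectors."""
    return sum(n * y[word] for word, n in x.items())


def mean_pairwise_d2(seqs, k):
    """Average D2 over all unordered pairs of sequences (illustrative only)."""
    counts = [kmer_counts(s, k) for s in seqs]
    pairs = list(combinations(counts, 2))
    return sum(d2(a, b) for a, b in pairs) / len(pairs)
```

A larger value indicates more shared word content; in practice the counts are usually centred and scaled by expected word frequencies before comparison.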
Tetrapods on the EDGE: Overcoming data limitations to identify phylogenetic conservation priorities

PubMed Central

Gray, Claudia L.; Wearn, Oliver R.; Owen, Nisha R.

2018-01-01

The scale of the ongoing biodiversity crisis requires both effective conservation prioritisation and urgent action. As extinction is non-random across the tree of life, it is important to prioritise threatened species which represent large amounts of evolutionary history. The EDGE metric prioritises species based on their Evolutionary Distinctiveness (ED), which measures the relative contribution of a species to the total evolutionary history of their taxonomic group, and Global Endangerment (GE), or extinction risk. EDGE prioritisations rely on adequate phylogenetic and extinction risk data to generate meaningful priorities for conservation. However, comprehensive phylogenetic trees of large taxonomic groups are extremely rare and, even when available, become quickly out-of-date due to the rapid rate of species descriptions and taxonomic revisions. Thus, it is important that conservationists can use the available data to incorporate evolutionary history into conservation prioritisation. We compared published and new methods to estimate missing ED scores for species absent from a phylogenetic tree whilst simultaneously correcting the ED scores of their close taxonomic relatives. We found that following artificial removal of species from a phylogenetic tree, the new method provided the closest estimates of their “true” ED score, differing from the true ED score by an average of less than 1%, compared to the 31% and 38% difference of the previous methods. The previous methods also substantially under- and over-estimated scores as more species were artificially removed from a phylogenetic tree. We therefore used the new method to estimate ED scores for all tetrapods. From these scores we updated EDGE prioritisation rankings for all tetrapod species with IUCN Red List assessments, including the first EDGE prioritisation for reptiles. Further, we identified criteria to identify robust priority species in an effort to further inform conservation action whilst limiting uncertainty and anticipating future phylogenetic advances. PMID:29641585
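To make the ED and GE components concrete, here is a small sketch under stated assumptions. The abstract does not spell out the formulas, so the code assumes the commonly used "fair proportion" definition of ED (each branch length is shared equally among the tips below it) and the widely cited combination EDGE = ln(1 + ED) + GE × ln(2), with GE running from 0 (Least Concern) to 4 (Critically Endangered). The toy tree, branch lengths and function names are hypothetical, and the paper's new method for imputing missing ED scores is not shown.

```python
import math

# Hypothetical toy tree: child -> (parent, branch length in million years).
TREE = {
    "A": ("n1", 2.0),
    "B": ("n1", 2.0),
    "n1": ("root", 3.0),
    "C": ("root", 5.0),
}

# Assumed Red List weights: Least Concern .. Critically Endangered.
GE_WEIGHT = {"LC": 0, "NT": 1, "VU": 2, "EN": 3, "CR": 4}


def tips_below(tree):
    """Return the tip names and, for each node, the number of tips below it."""
    parents = {parent for parent, _ in tree.values()}
    tips = [node for node in tree if node not in parents]
    counts = {}
    for tip in tips:
        node = tip
        while node in tree:  # walk up until the root is reached
            counts[node] = counts.get(node, 0) + 1
            node = tree[node][0]
    return tips, counts


def ed_scores(tree):
    """Fair-proportion ED: sum of (branch length / tips below) from tip to root."""
    tips, counts = tips_below(tree)
    scores = {}
    for tip in tips:
        node, total = tip, 0.0
        while node in tree:
            parent, length = tree[node]
            total += length / counts[node]
            node = parent
        scores[tip] = total
    return scores


def edge_score(ed, category):
    """Assumed combination: ln(1 + ED) + GE * ln(2)."""
    return math.log(1 + ed) + GE_WEIGHT[category] * math.log(2)
```

With this definition the ED scores of all tips sum to the total branch length of the tree, which is a quick sanity check on any implementation.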
Efficient alignment-free DNA barcode analytics.

PubMed

Kuksa, Pavel; Pavlovic, Vladimir

2009-11-10

In this work we consider barcode DNA analysis problems and address them using alternative, alignment-free methods and representations which model sequences as collections of short sequence fragments (features). The methods use fixed-length representations (spectrum) for barcode sequences to measure similarities or dissimilarities between sequences coming from the same or different species. The spectrum-based representation not only allows for accurate and computationally efficient species classification, but also opens the possibility of accurate clustering analysis of putative species barcodes and identification of critical within-barcode loci distinguishing barcodes of different sample groups. New alignment-free methods provide highly accurate and fast DNA barcode-based identification and classification of species with substantial improvements in accuracy and speed over state-of-the-art barcode analysis methods. We evaluate our methods on problems of species classification and identification using barcodes, important and relevant analytical tasks in many practical applications (adverse species movement monitoring, sampling surveys for unknown or pathogenic species identification, biodiversity assessment, etc.). On several benchmark barcode datasets, including ACG, Astraptes, Hesperiidae, Fish larvae, and Birds of North America, proposed alignment-free methods considerably improve prediction accuracy compared to prior results. We also observe significant running time improvements over the state-of-the-art methods. Our results show that newly developed alignment-free methods for DNA barcoding can efficiently and with high accuracy identify specimens by examining only a few barcode features, resulting in increased scalability and interpretability of current computational approaches to barcoding.
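As a rough illustration of spectrum-based classification, the sketch below represents each barcode by its k-mer counts and assigns a query to the reference with the highest cosine similarity. This is an assumption-laden stand-in, not the authors' method: the abstract does not specify the kernel, classifier or feature selection used, and the function names and the default k are hypothetical.

```python
import math
from collections import Counter


def spectrum(seq, k):
    """k-mer spectrum: counts of every overlapping k-mer in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(n * b[word] for word, n in a.items())
    norm_a = math.sqrt(sum(n * n for n in a.values()))
    norm_b = math.sqrt(sum(n * n for n in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(query, references, k=3):
    """Return the label of the most similar reference barcode.

    references is a list of (species_label, barcode_sequence) pairs.
    """
    q = spectrum(query, k)
    best = max(references, key=lambda ref: cosine(q, spectrum(ref[1], k)))
    return best[0]
```

Because no alignment is computed, the cost per comparison is linear in sequence length, which is the source of the speed advantage the abstract describes.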
EAPhy: A Flexible Tool for High-throughput Quality Filtering of Exon-alignments and Data Processing for Phylogenetic Methods.

PubMed

Blom, Mozes P K

2015-08-05

Recently developed molecular methods enable geneticists to target and sequence thousands of orthologous loci and infer evolutionary relationships across the tree of life. Large numbers of genetic markers benefit species tree inference but visual inspection of alignment quality, as traditionally conducted, is challenging with thousands of loci. Furthermore, due to the impracticality of repeated visual inspection with alternative filtering criteria, the potential consequences of using datasets with different degrees of missing data remain nominally explored in most empirical phylogenomic studies. In this short communication, I describe a flexible high-throughput pipeline designed to assess alignment quality and filter exonic sequence data for subsequent inference. The stringency criteria for alignment quality and missing data can be adapted based on the expected level of sequence divergence. Each alignment is automatically evaluated based on the stringency criteria specified, significantly reducing the number of alignments that require visual inspection. By developing a rapid method for alignment filtering and quality assessment, the consistency of phylogenetic estimation based on exonic sequence alignments can be further explored across distinct inference methods, while accounting for different degrees of missing data.
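The kind of automated screening described above can be sketched as follows. This is not EAPhy's interface: the thresholds, parameter names and data layout are hypothetical, and the only criterion shown is the share of missing characters per sequence plus a minimum number of retained taxa per locus.

```python
def missing_fraction(seq, missing_chars="-N?"):
    """Fraction of positions in an aligned sequence that are gaps or unknown."""
    if not seq:
        return 1.0
    return sum(ch in missing_chars for ch in seq.upper()) / len(seq)


def filter_alignment(alignment, max_missing=0.5, min_taxa=3):
    """Drop sequences with too much missing data; reject sparse alignments.

    alignment maps taxon name -> aligned sequence. Returns the retained
    sequences, or None if fewer than min_taxa survive.
    """
    kept = {taxon: seq for taxon, seq in alignment.items()
            if missing_fraction(seq) <= max_missing}
    return kept if len(kept) >= min_taxa else None


def filter_loci(loci, **criteria):
    """Apply the same stringency criteria to every locus."""
    passed = {}
    for name, alignment in loci.items():
        kept = filter_alignment(alignment, **criteria)
        if kept is not None:
            passed[name] = kept
    return passed
```

Re-running filter_loci with different values of max_missing and min_taxa is one way to explore how missing data affects downstream inference without inspecting each alignment by eye.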
Protein Sectors: Statistical Coupling Analysis versus Conservation

PubMed Central

Teşileanu, Tiberiu; Colwell, Lucy J.; Leibler, Stanislas

2015-01-01

Statistical coupling analysis (SCA) is a method for analyzing multiple sequence alignments that was used to identify groups of coevolving residues termed “sectors”. The method applies spectral analysis to a matrix obtained by combining correlation information with sequence conservation. It has been asserted that the protein sectors identified by SCA are functionally significant, with different sectors controlling different biochemical properties of the protein. Here we reconsider the available experimental data and note that it involves almost exclusively proteins with a single sector. We show that in this case sequence conservation is the dominating factor in SCA, and can alone be used to make statistically equivalent functional predictions. Therefore, we suggest shifting the experimental focus to proteins for which SCA identifies several sectors. Correlations in protein alignments, which have been shown to be informative in a number of independent studies, would then be less dominated by sequence conservation. PMID:25723535
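Since the argument above turns on per-position sequence conservation, a minimal sketch of one way to score it may help. The code computes a Shannon-entropy-based conservation value for each alignment column, scaled so that 1.0 means fully conserved. This is an assumed, simplified measure: SCA itself uses a different, background-weighted conservation term, and the function names and the 20-letter alphabet size are illustrative choices.

```python
import math
from collections import Counter


def column_conservation(column, alphabet_size=20):
    """1 - H/log2(alphabet_size), ignoring gap characters; 1.0 = invariant."""
    residues = [ch for ch in column if ch != "-"]
    if not residues:
        return 0.0
    n = len(residues)
    entropy = -sum((c / n) * math.log2(c / n)
                   for c in Counter(residues).values())
    return 1.0 - entropy / math.log2(alphabet_size)


def conservation_profile(msa):
    """Conservation score for every column of an alignment (list of strings)."""
    return [column_conservation(col) for col in zip(*msa)]
```

Ranking positions by such a profile gives a conservation-only baseline against which predictions from coupling-based methods can be compared.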
LenVarDB: database of length-variant protein domains.

PubMed

Mutt, Eshita; Mathew, Oommen K; Sowdhamini, Ramanathan

2014-01-01

Protein domains are functionally and structurally independent modules, which add to the functional variety of proteins. This array of functional diversity has been enabled by evolutionary changes, such as amino acid substitutions or insertions or deletions, occurring in these protein domains. Length variations (indels) can introduce changes at structural, functional and interaction levels. LenVarDB (freely available at http://caps.ncbs.res.in/lenvardb/) traces these length variations, starting from structure-based sequence alignments in our Protein Alignments organized as Structural Superfamilies (PASS2) database, across 731 structural classification of proteins (SCOP)-based protein domain superfamilies connected to 2 730 625 sequence homologues. Alignment of sequence homologues corresponding to a structural domain is available, starting from a structure-based sequence alignment of the superfamily. Orientation of the length-variant (indel) regions in protein domains can be visualized by mapping them on the structure and on the alignment. Knowledge about location of length variations within protein domains and their visual representation will be useful in predicting changes within structurally or functionally relevant sites, which may ultimately regulate protein function. Non-technical summary: Evolutionary changes bring about natural changes to proteins that may be found in many organisms. Such changes could be reflected as amino acid substitutions or insertions-deletions (indels) in protein sequences. LenVarDB is a database that provides an early overview of observed length variations that were set among 731 protein families and after examining >2 million sequences. Indels are followed up to observe if they are close to the active site such that they can affect the activity of proteins. Inclusion of such information can aid the design of bioengineering experiments.
Disentangling methodological and biological sources of gene tree discordance on Oryza (Poaceae) chromosome 3.

PubMed

Zwickl, Derrick J; Stein, Joshua C; Wing, Rod A; Ware, Doreen; Sanderson, Michael J

2014-09-01

We describe new methods for characterizing gene tree discordance in phylogenomic data sets, which screen for deviations from neutral expectations, summarize variation in statistical support among gene trees, and allow comparison of the patterns of discordance induced by various analysis choices. Using an exceptionally complete set of genome sequences for the short arm of chromosome 3 in Oryza (rice) species, we applied these methods to identify the causes and consequences of differing patterns of discordance in the sets of gene trees inferred using a panel of 20 distinct analysis pipelines. We found that discordance patterns were strongly affected by aspects of data selection, alignment, and alignment masking. Unusual patterns of discordance evident when using certain pipelines were reduced or eliminated by using alternative pipelines, suggesting that they were the product of methodological biases rather than evolutionary processes. In some cases, once such biases were eliminated, evolutionary processes such as introgression could be implicated. Additionally, patterns of gene tree discordance had significant downstream impacts on species tree inference. For example, inference from supermatrices was positively misleading when pipelines that led to biased gene trees were used. Several results may generalize to other data sets: we found that gene tree and species tree inference gave more reasonable results when intron sequence was included during sequence alignment and tree inference, the alignment software PRANK was used, and detectable "block-shift" alignment artifacts were removed. We discuss our findings in the context of well-established relationships in Oryza and continuing controversies regarding the domestication history of O. sativa. © The Author(s) 2014. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Genomic analysis of the symbiotic marine crenarchaeon, Cenarchaeum symbiosum

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hallam, Steven J.; Konstantinidis, Konstantinos T.; Brochier, Celine

2006-06-24

Crenarchaea are ubiquitous and abundant microbial constituents of soils, sediments, lakes and ocean waters, yet relatively little is known about their fundamental evolutionary, ecological, and physiological properties. To better describe the ubiquitous nonthermophilic Crenarchaea, we analyzed the genome sequence of one representative, the uncultivated sponge symbiont, Cenarchaeum symbiosum. C. symbiosum genotypes coinhabiting the same host partitioned into two dominant populations, corresponding to previously described a- and b-type ribosomal RNA variants. Although syntenic, overlapping a- and b-type ribotypes harbored significant genetic variability. A single tiling path comprising the dominant a-type genotype was assembled, and used to explore the biological properties of C. symbiosum and its planktonic relatives. Out of a total of 2,066 predicted open reading frames, 36% were more highly conserved with other Archaea. The remainder partitioned between bacteria (18%), eukaryotes (1.5%) and viruses (0.1%). A total of 525 open reading frames were more highly conserved with sequences derived from marine environmental genomic surveys, most probably representing orthologous genes found in free-living planktonic Crenarchaea. The remaining genes partitioned between functional RNAs (2.4%), and hypotheticals (42%) with limited homology to known functional genes. The latter category likely contains genes specifically involved in mediated archaeal-sponge symbiosis. Phylogenetic analyses placed C. symbiosum as a basal crenarchaeon, sharing specific genomic features in common with either Crenarchaea, Euryarchaea, or both. The genome sequence of C. symbiosum reflects a unique and unusual evolutionary, physiological, and ecological history, one remarkably distinct from that of any other previously known microbial lineage.
Endemicity and evolutionary value: a study of Chilean endemic vascular plant genera

PubMed Central

Scherson, Rosa A; Albornoz, Abraham A; Moreira-Muñoz, Andrés S; Urbina-Casanova, Rafael

2014-01-01

This study uses phylogeny-based measures of evolutionary potential (phylogenetic diversity and community structure) to evaluate the evolutionary value of vascular plant genera endemic to Chile. Endemicity is regarded as a very important consideration for conservation purposes. Taxa that are endemic to a single country are valuable conservation targets, as their protection depends upon a single government policy. This is especially relevant in developing countries in which conservation is not always a high resource allocation priority. Phylogeny-based measures of evolutionary potential such as phylogenetic diversity (PD) have been regarded as meaningful measures of the “value” of taxa and ecosystems, as they are able to account for the attributes that could allow taxa to recover from environmental changes. Chile is an area of remarkable endemism, harboring a flora that shows the highest number of endemic genera in South America. We studied PD and community structure of this flora using a previously available supertree at the genus level, to which we added DNA sequences of 53 genera endemic to Chile. Using discrepancy values and a null model approach, we decoupled PD from taxon richness, in order to compare their geographic distribution over a one-degree grid. An interesting pattern was observed in which areas to the southwest appear to harbor more PD than expected from their generic richness, in contrast to areas to the north of the country. In addition, some southern areas showed more PD than expected by chance, as calculated with the null model approach. Geological history as documented by the study of ancient floras as well as glacial refuges in the coastal range of southern Chile during the Quaternary seem to be consistent with the observed pattern, highlighting the importance of this area for conservation purposes. PMID:24683462
ADOMA: A Command Line Tool to Modify ClustalW Multiple Alignment Output.

PubMed

Zaal, Dionne; Nota, Benjamin

2016-01-01

We present ADOMA, a command line tool that produces alternative outputs from ClustalW multiple alignments of nucleotide or protein sequences. ADOMA can simplify the output of alignments by showing only the different residues between sequences, which is often desirable when only small differences such as single nucleotide polymorphisms are present (e.g., between different alleles). Another feature of ADOMA is that it can enhance the ClustalW output by coloring the residues in the alignment. This tool is easily integrated into automated Linux pipelines for next-generation sequencing data analysis, and may be useful for researchers in a broad range of scientific disciplines including evolutionary biology and biomedical sciences. The source code is freely available at https://sourceforge.net/projects/adoma/. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
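The "show only the different residues" view described above can be illustrated with a minimal Python sketch. This is not ADOMA's code or command-line interface; the function name show_differences, the use of the first sequence as the reference, and the "." placeholder are all assumptions made for illustration.

```python
def show_differences(aligned, match_char="."):
    """Mask residues that are identical to the first (reference) sequence.

    `aligned` maps sequence names to already-aligned, equal-length strings.
    The reference row is returned unchanged; every other row keeps only the
    residues (or gaps) that differ from the reference.
    """
    names = list(aligned)
    reference = aligned[names[0]]
    masked = {names[0]: reference}
    for name in names[1:]:
        seq = aligned[name]
        if len(seq) != len(reference):
            raise ValueError("aligned sequences must have equal length")
        masked[name] = "".join(
            match_char if ref_res == res else res
            for ref_res, res in zip(reference, seq)
        )
    return masked


# Hypothetical toy alignment of three alleles.
alignment = {
    "allele_A": "ATGCCGTA",
    "allele_B": "ATGCTGTA",
    "allele_C": "ATG-CGTA",
}
for name, row in show_differences(alignment).items():
    print(f"{name:10s}{row}")
```

With the toy input, allele_B is reduced to a single visible T and allele_C to a single gap, which is the kind of compact view that makes single nucleotide differences easy to spot.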
Whole-genome alignment.

PubMed

Dewey, Colin N

2012-01-01

Whole-genome alignment (WGA) is the prediction of evolutionary relationships at the nucleotide level between two or more genomes. It combines aspects of both colinear sequence alignment and gene orthology prediction, and is typically more challenging to address than either of these tasks due to the size and complexity of whole genomes. Despite the difficulty of this problem, numerous methods have been developed for its solution because WGAs are valuable for genome-wide analyses, such as phylogenetic inference, genome annotation, and function prediction. In this chapter, we discuss the meaning and significance of WGA and present an overview of the methods that address it. We also examine the problem of evaluating whole-genome aligners and offer a set of methodological challenges that need to be tackled in order to make the most effective use of our rapidly growing databases of whole genomes.
The Three Domains of Conservation Genetics: Case Histories from Hawaiian Waters

PubMed Central

2016-01-01

The scientific field of conservation biology is dominated by 3 specialties: phylogenetics, ecology, and evolution. Under this triad, phylogenetics is oriented towards the past history of biodiversity, conserving the divergent branches in the tree of life. The ecological component is rooted in the present, maintaining the contemporary life support systems for biodiversity. Evolutionary conservation (as defined here) is concerned with preserving the raw materials for generating future biodiversity. All 3 domains can be documented with genetic case histories in the waters of the Hawaiian Archipelago, an isolated chain of volcanic islands with 2 types of biodiversity: colonists, and new species that arose from colonists. This review demonstrates that 1) phylogenetic studies have identified previously unknown branches in the tree of life that are endemic to Hawaiian waters; 2) population genetic surveys define isolated marine ecosystems as management units, and 3) phylogeographic analyses illustrate the pathways of colonization that can enhance future biodiversity. Conventional molecular markers have advanced all 3 domains in conservation biology over the last 3 decades, and recent advances in genomics are especially valuable for understanding the foundations of future evolutionary diversity. PMID:27001936
Conservation genetics of the genus Martes: Assessing within-species movements, units to conserve, and connectivity across ecological and evolutionary time [Chapter 17]

Treesearch

Michael K. Schwartz; Aritz Ruiz-Gonzalez; Ryuchi Masuda; Cino Pertoldi

2012-01-01

Understanding the physical and temporal factors that structure Martes populations is essential to the conservation and management of the 8 recognized Martes species. Recently, advances in 3 distinct subdisciplines in molecular ecology have provided insights into historical and contemporary environmental factors that have created population substructure and influenced...
An evolutionary analysis identifies a conserved pentapeptide stretch containing the two essential lysine residues for rice L-myo-inositol 1-phosphate synthase catalytic activity

PubMed Central

Basak, Papri; Maitra-Majee, Susmita; Das, Jayanta Kumar; Mukherjee, Abhishek; Ghosh Dastidar, Shubhra; Pal Choudhury, Pabitra

2017-01-01

A molecular evolutionary analysis of a well conserved protein helps to determine the essential amino acids in the core catalytic region. Based on the chemical properties of amino acid residues, phylogenetic analysis of a total of 172 homologous sequences of a highly conserved enzyme, L-myo-inositol 1-phosphate synthase or MIPS from evolutionarily diverse organisms was performed. This study revealed the presence of six phylogenetically conserved blocks, out of which four embrace the catalytic core of the functional protein. Further, specific amino acid modifications targeting the lysine residues, known to be important for MIPS catalysis, were performed at the catalytic site of a MIPS from monocotyledonous model plant, Oryza sativa (OsMIPS1). Following this study, OsMIPS mutants with deletion or replacement of lysine residues in the conserved blocks were made. Based on the enzyme kinetics performed on the deletion/replacement mutants, phylogenetic and structural comparison with the already established crystal structures from non-plant sources, an evolutionarily conserved peptide stretch was identified at the active pocket which contains the two most important lysine residues essential for catalytic activity. PMID:28950028
Functional Targets of the Monogenic Diabetes Transcription Factors HNF-1α and HNF-4α Are Highly Conserved Between Mice and Humans

PubMed Central

Boj, Sylvia F.; Servitja, Joan Marc; Martin, David; Rios, Martin; Talianidis, Iannis; Guigo, Roderic; Ferrer, Jorge

2009-01-01

OBJECTIVE The evolutionary conservation of transcriptional mechanisms has been widely exploited to understand human biology and disease. Recent findings, however, unexpectedly showed that the transcriptional regulators hepatocyte nuclear factor (HNF)-1α and -4α rarely bind to the same genes in mice and humans, leading to the proposal that tissue-specific transcriptional regulation has undergone extensive divergence in the two species. Such observations have major implications for the use of mouse models to understand HNF-1α– and HNF-4α–deficient diabetes. However, the significance of studies that assess binding without considering regulatory function is poorly understood. RESEARCH DESIGN AND METHODS We compared previously reported mouse and human HNF-1α and HNF-4α binding studies with independent binding experiments. We also integrated binding studies with mouse and human loss-of-function gene expression datasets. RESULTS First, we confirmed the existence of species-specific HNF-1α and -4α binding, yet observed incomplete detection of binding in the different datasets, causing an underestimation of binding conservation. Second, only a minor fraction of HNF-1α– and HNF-4α–bound genes were downregulated in the absence of these regulators. This subset of functional targets did not show evidence for evolutionary divergence of binding or binding sequence motifs. Finally, we observed differences between conserved and species-specific binding properties. For example, conserved binding was more frequently located near transcriptional start sites and was more likely to involve multiple binding events in the same gene. CONCLUSIONS Despite evolutionary changes in binding, essential direct transcriptional functions of HNF-1α and -4α are largely conserved between mice and humans. PMID:19188435
Finding a common path: predicting gene function using inferred evolutionary trees.

PubMed

Reynolds, Kimberly A

2014-07-14

Reporting in Cell, Li and colleagues (2014) describe an innovative method to functionally classify genes using evolutionary information. This approach demonstrates broad utility for eukaryotic gene annotation and suggests an intriguing new decomposition of pathways and complexes into evolutionarily conserved modules. Copyright © 2014 Elsevier Inc. All rights reserved.

Mitochondrial DNA haplotype distribution patterns in Pinus ponderosa (Pinaceae): range-wide evolutionary history and implications for conservation

Treesearch

Kevin M. Potter; Valerie D. Hipkins; Mary F. Mahalovich; Robert E. Means

2013-01-01

Premise of the study: Ponderosa pine (Pinus ponderosa Douglas ex P. Lawson & C. Lawson) exhibits complicated patterns of morphological and genetic variation across its range in western North America. This study aims to clarify P. ponderosa evolutionary history and phylogeography using a highly polymorphic...
Comparative In silico Study of Sex-Determining Region Y (SRY) Protein Sequences Involved in Sex-Determining.

PubMed

Vakili Azghandi, Masoume; Nasiri, Mohammadreza; Shamsa, Ali; Jalali, Mohsen; Shariati, Mohammad Mahdi

2016-04-01

The SRY gene (SRY) provides instructions for making a transcription factor called the sex-determining region Y protein. The sex-determining region Y protein causes a fetus to develop as a male. In this study, SRY sequences from 15 species (human, chimpanzee, dog, pig, rat, cattle, buffalo, goat, sheep, horse, zebra, frog, urial, dolphin and killer whale) were compared to identify bioinformatic differences. Nucleotide sequences of SRY were retrieved from the NCBI databank. Bioinformatic analysis of SRY was done with CLC Main Workbench version 5.5, ClustalW (http://www.ebi.ac.uk/clustalw/) and MEGA6. The multiple sequence alignment results indicated that SRY protein sequences from Orcinus orca (killer whale) and Tursiops aduncus (dolphin) have the lowest genetic distance among these 15 species, 0.33, and are 99.67% identical at the amino acid level. Homo sapiens and Pan troglodytes (chimpanzee) have the next lowest genetic distance, 1.35, and are 98.65% identical at the amino acid level. These findings indicate that the SRY proteins are conserved in the 15 species, and their evolutionary relationships are similar.
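In the figures above, each reported distance and identity sum to 100 (0.33 + 99.67, 1.35 + 98.65), which suggests the distance is expressed as a percent difference. A minimal Python sketch of that relationship follows. The function names, the toy sequences, and the choice to skip gapped columns are assumptions for illustration; they are not taken from CLC Main Workbench, ClustalW, or MEGA6, which may treat gaps differently.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity between two aligned sequences of equal length.

    Columns where either sequence has a gap are skipped (an assumed
    convention; other tools count gaps as mismatches).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(seq_a, seq_b):
        if res_a == gap or res_b == gap:
            continue
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def percent_distance(seq_a, seq_b, gap="-"):
    """Percent difference, defined here simply as 100 minus percent identity."""
    return 100.0 - percent_identity(seq_a, seq_b, gap)


# Hypothetical aligned protein fragments (not real SRY sequences).
fragment_1 = "MQSYASAMLS"
fragment_2 = "MQSYASAMLN"
print(percent_identity(fragment_1, fragment_2), percent_distance(fragment_1, fragment_2))
```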
Recent hybrid origin of three rare Chinese turtles

DOE Office of Scientific and Technical Information (OSTI.GOV)

Stuart, Bryan L.; Parham, James F.

Three rare geoemydid turtles described from Chinese trade specimens in the early 1990s, Ocadia glyphistoma, O. philippeni, and Sacalia pseudocellata, are suspected to be hybrids because they are known only from their original descriptions and because they have morphologies intermediate between other, better-known species. We cloned the alleles of a bi-parentally inherited nuclear intron from samples of these three species. The two aligned parental alleles of O. glyphistoma, O. philippeni, and S. pseudocellata have 5-11.5 times more heterozygous positions than do 13 other geoemydid species. Phylogenetic analysis shows that the two alleles from each turtle are strongly paraphyletic, but correctly match sequences of other species that were hypothesized from morphology to be their parental species. We conclude that these rare turtles represent recent hybrids rather than valid species. Specifically, "O. glyphistoma" is a hybrid of Mauremys sinensis and M. cf. annamensis, "O. philippeni" is a hybrid of M. sinensis and Cuora trifasciata, and "S. pseudocellata" is a hybrid of C. trifasciata and S. quadriocellata. Conservation resources are better directed toward finding and protecting populations of other rare Southeast Asian turtles that do represent distinct evolutionary lineages.
Genomicus update 2015: KaryoView and MatrixView provide a genome-wide perspective to multispecies comparative genomics

PubMed Central

Louis, Alexandra; Nguyen, Nga Thi Thuy; Muffato, Matthieu; Roest Crollius, Hugues

2015-01-01

The Genomicus web server (http://www.genomicus.biologie.ens.fr/genomicus) is a visualization tool allowing comparative genomics in four different phyla (Vertebrate, Fungi, Metazoan and Plants). It provides access to genomic information from extant species, as well as ancestral gene content and gene order for vertebrates and flowering plants. Here we present the new features available for vertebrate genomes, with a focus on new graphical tools. The interface to enter the database has been improved, two pairwise genome comparison tools are now available (KaryoView and MatrixView) and the multiple genome comparison tools (PhyloView and AlignView) propose three new kinds of representation and a more intuitive menu. These new developments have been implemented for the Genomicus portal dedicated to vertebrates. This allows the analysis of 68 extant animal genomes, as well as 58 ancestral reconstructed genomes. The Genomicus server also provides access to ancestral gene orders, to facilitate evolutionary and comparative genomics studies, as well as computationally predicted regulatory interactions, thanks to the representation of conserved non-coding elements with their putative gene targets. PMID:25378326
Reliability of image-free navigation to monitor lower-limb alignment.

PubMed

Pearle, Andrew D; Goleski, Patrick; Musahl, Volker; Kendoff, Daniel

2009-02-01

Proper alignment of the mechanical axis of the lower limb is the principal goal of a high tibial osteotomy. A well-accepted and relevant technical specification is the coronal plane lower-limb alignment. Target values for coronal plane alignment after high tibial osteotomy include 2 degrees of overcorrection, while tolerances for this specification have been established as 2 degrees to 4 degrees. However, the role of axial plane and sagittal plane realignment after high tibial osteotomy is poorly understood; consequently, targets and tolerance for this technical specification remain undefined. This article reviews the literature concerning the reliability and precision of navigation in monitoring the clinically relevant specification of lower-limb alignment in high tibial osteotomy. We conclude that image-free navigation registration may be clinically useful for intraoperative monitoring of the coronal plane only. Only fair and poor results for the axial and sagittal planes can be obtained by image-free navigation systems. In the future, combined image-based data, such as those from radiographs, magnetic resonance imaging, and gait analysis, may be used to help to improve the accuracy and reproducibility of quantitative intraoperative monitoring of lower-limb alignment.
Evolutionary conserved microRNAs are ubiquitously expressed compared to tick-specific miRNAs in the cattle tick Rhipicephalus (Boophilus) microplus

PubMed Central

2011-01-01

Background MicroRNAs (miRNAs) are small non-coding RNAs that act as regulators of gene expression in eukaryotes modulating a large diversity of biological processes. The discovery of miRNAs has provided new opportunities to understand the biology of a number of species. The cattle tick, Rhipicephalus (Boophilus) microplus, causes significant economic losses in cattle production worldwide and this drives us to further understand their biology so that effective control measures can be developed. To be able to provide new insights into the biology of cattle ticks and to expand the repertoire of tick miRNAs we utilized Illumina technology to sequence the small RNA transcriptomes derived from various life stages and selected organs of R. microplus. Results To discover and profile cattle tick miRNAs we employed two complementary approaches, one aiming to find evolutionary conserved miRNAs and another focused on the discovery of novel cattle-tick specific miRNAs. We found 51 evolutionary conserved R. microplus miRNA loci, with 36 of these previously found in the tick Ixodes scapularis. The majority of the R. microplus miRNAs are perfectly conserved throughout evolution with 11, 5 and 15 of these conserved since the Nephrozoan (640 MYA), Protostomian (620 MYA) and Arthropoda (540 MYA) ancestor, respectively. We then employed a de novo computational screening for novel tick miRNAs using the draft genome of I. scapularis and genomic contigs of R. microplus as templates. This identified 36 novel R. microplus miRNA loci of which 12 were conserved in I. scapularis. Overall we found 87 R. microplus miRNA loci; of these, 15 showed the expression of both miRNA and miRNA* sequences. R. microplus miRNAs showed a variety of expression profiles, with the evolutionary-conserved miRNAs mainly expressed in all life stages at various levels, while the expression of novel tick-specific miRNAs was mostly limited to particular life stages and/or tick organs. Conclusions Anciently acquired miRNAs in the R. microplus lineage not only tend to accumulate the fewest nucleotide substitutions as compared to those recently acquired miRNAs, but also show ubiquitous expression profiles throughout tick life stages and organs contrasting with the restricted expression profiles of novel tick-specific miRNAs. PMID:21699734
GenomeVista

DOE Office of Scientific and Technical Information (OSTI.GOV)

Poliakov, Alexander; Couronne, Olivier

2002-11-04

Aligning large vertebrate genomes that are structurally complex poses a variety of problems not encountered on smaller scales. Such genomes are rich in repetitive elements and contain multiple segmental duplications, which increases the difficulty of identifying true orthologous DNA segments in alignments. The sizes of the sequences make many alignment algorithms designed for comparing single proteins extremely inefficient when processing large genomic intervals. We integrated both local and global alignment tools and developed a suite of programs for automatically aligning large vertebrate genomes and identifying conserved non-coding regions in the alignments. Our method uses the BLAT local alignment program to find anchors on the base genome to identify regions of possible homology for a query sequence. These regions are postprocessed to find the best candidates which are then globally aligned using the AVID global alignment program. In the last step conserved non-coding segments are identified using VISTA. Our methods are fast and the resulting alignments exhibit a high degree of sensitivity, covering more than 90% of known coding exons in the human genome. The GenomeVISTA software is a suite of Perl programs that is built on a MySQL database platform. The scheduler gets control data from the database, builds a queue of jobs, and dispatches them to a PC cluster for execution. The main program, running on each node of the cluster, processes individual sequences. A Perl library acts as an interface between the database and the above programs. The use of a separate library allows the programs to function independently of the database schema. The library also improves on the standard Perl MySQL database interface package by providing auto-reconnect functionality and improved error handling.
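The last step of the pipeline above, finding conserved segments in a finished pairwise alignment, can be sketched in Python with a sliding identity window. This is an illustration of the general idea only, not the VISTA or GenomeVISTA code (which the abstract says is written in Perl); the function name conserved_segments, the window size, and the identity threshold are assumed parameters, and the sketch does not distinguish coding from non-coding sequence.

```python
def conserved_segments(aln_a, aln_b, window=100, min_identity=0.70):
    """Return merged (start, end) alignment intervals whose windows are conserved.

    A window is called conserved when the fraction of identical, ungapped
    columns is at least `min_identity`. Overlapping or touching conserved
    windows are merged. Coordinates are 0-based, end-exclusive alignment
    columns. Both thresholds are illustrative assumptions.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    n = len(aln_a)
    if n < window:
        return []
    # 1 where the column is an ungapped match, 0 otherwise.
    match = [int(a == b and a != "-") for a, b in zip(aln_a, aln_b)]
    segments = []
    count = sum(match[:window])
    for start in range(n - window + 1):
        if start > 0:
            # Slide the window one column to the right.
            count += match[start + window - 1] - match[start - 1]
        if count / window >= min_identity:
            end = start + window
            if segments and start <= segments[-1][1]:
                segments[-1] = (segments[-1][0], end)
            else:
                segments.append((start, end))
    return segments


# Hypothetical toy alignment: two identical blocks around a divergent block.
base = "ACGTACGTAC" + "TTTTTTTTTT" + "GGCCGGCCGG"
query = "ACGTACGTAC" + "AAAAAAAAAA" + "GGCCGGCCGG"
print(conserved_segments(base, query, window=10, min_identity=0.75))
```

On the toy input the two flanking blocks are reported as separate segments, each extended slightly into the divergent middle because a window only needs 75% identity to qualify.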
Body shape transformation along a shared axis of anatomical evolution in labyrinth fishes (Anabantoidei).

PubMed

Collar, David C; Quintero, Michelle; Buttler, Bernardo; Ward, Andrea B; Mehta, Rita S

2016-03-01

Major morphological transformations, such as the evolution of elongate body shape in vertebrates, punctuate evolutionary history. A fundamental step in understanding the processes that give rise to such transformations is identification of the underlying anatomical changes. But as we demonstrate in this study, important insights can also be gained by comparing these changes to those that occur in ancestral and closely related lineages. In labyrinth fishes (Anabantoidei), rapid evolution of a highly derived torpedo-shaped body in the common ancestor of the pikehead (Luciocephalus aura and L. pulcher) occurred primarily through exceptional elongation of the head, with secondary contributions involving reduction in body depth and lengthening of the precaudal vertebral region. This combination of changes aligns closely with the primary axis of anatomical diversification in other anabantoids, revealing that pikehead evolution involved extraordinarily rapid change in structures that were ancestrally labile. Finer-scale examination of the anatomical components that determine head elongation also shows alignment between the pikehead evolutionary trajectory and the primary axis of cranial diversification in anabantoids, with much higher evolutionary rates leading to the pikehead. Altogether, our results show major morphological transformation stemming from extreme change along a shared morphological axis in labyrinth fishes. © 2016 The Author(s). Evolution © 2016 The Society for the Study of Evolution.
CoSMoS: Conserved Sequence Motif Search in the proteome

PubMed Central

Liu, Xiao I; Korde, Neeraj; Jakob, Ursula; Leichert, Lars I

2006-01-01

Background With the ever-increasing number of gene sequences in the public databases, generating and analyzing multiple sequence alignments becomes increasingly time consuming. Nevertheless it is a task performed on a regular basis by researchers in many labs. Results We have now created a database called CoSMoS to find the occurrences and at the same time evaluate the significance of sequence motifs and amino acids encoded in the whole genome of the model organism Escherichia coli K12. We provide a precomputed set of multiple sequence alignments for each individual E. coli protein with all of its homologues in the RefSeq database. The alignments themselves, information about the occurrence of sequence motifs together with information on the conservation of each of the more than 1.3 million amino acids encoded in the E. coli genome can be accessed via the web interface of CoSMoS. Conclusion CoSMoS is a valuable tool to identify highly conserved sequence motifs, to find regions suitable for mutational studies in functional analyses and to predict important structural features in E. coli proteins. PMID:16433915
Reconstruction of chromosome rearrangements between the two most ancestral duckweed species Spirodela polyrhiza and S. intermedia.

PubMed

Hoang, Phuong T N; Schubert, Ingo

2017-12-01

The monophyletic duckweeds comprising five genera within the monocot order Alismatales are neotenic, free-floating, aquatic organisms with fast vegetative propagation. Some species are considered for efficient biomass production, for livestock feeding, and for (simultaneous) wastewater phytoremediation. The ancestral genus Spirodela consists of only two species, Spirodela polyrhiza and Spirodela intermedia, both with a similar small genome (~160 Mbp/1C). Reference genome drafts and a physical map of 96 BACs on the 20 chromosome pairs of S. polyrhiza strain 7498 are available and provide useful tools for further evolutionary studies within and between duckweed genera. Here we applied sequential comparative multicolor fluorescence in situ hybridization (mcFISH) to address homeologous chromosomes in S. intermedia (2n = 36), to detect chromosome rearrangements between both species and to elucidate the mechanisms which may have led to the chromosome number alteration after their evolutionary separation. Ten chromosome pairs proved to be conserved between S. polyrhiza and S. intermedia, the remaining ones experienced, depending on the assumed direction of evolution, translocations, inversion, and fissions, respectively. These results represent a first step to unravel karyotype evolution among duckweeds and are anchor points for future genome assembly of S. intermedia.
Partial sequence homogenization in the 5S multigene families may generate sequence chimeras and spurious results in phylogenetic reconstructions.

PubMed

Galián, José A; Rosato, Marcela; Rosselló, Josep A

2014-03-01

Multigene families have provided opportunities for evolutionary biologists to assess molecular evolution processes and phylogenetic reconstructions at deep and shallow systematic levels. However, the use of these markers is not free of technical and analytical challenges. Many evolutionary studies that used the nuclear 5S rDNA gene family rarely used contiguous 5S coding sequences due to the routine use of head-to-tail polymerase chain reaction primers that are anchored to the coding region. Moreover, the 5S coding sequences have been concatenated with independent, adjacent gene units in many studies, creating simulated chimeric genes as the raw data for evolutionary analysis. This practice is based on the tacitly assumed, but rarely tested, hypothesis that strict intra-locus concerted evolution processes are operating in 5S rDNA genes, without any empirical evidence as to whether it holds for the recovered data. The potential pitfalls of analysing the patterns of molecular evolution and reconstructing phylogenies based on these chimeric genes have not been assessed to date. Here, we compared the sequence integrity and phylogenetic behavior of entire versus concatenated 5S coding regions from a real data set obtained from closely related plant species (Medicago, Fabaceae). Our results suggest that within arrays sequence homogenization is partially operating in the 5S coding region, which is traditionally assumed to be highly conserved. Consequently, concatenating 5S genes increases haplotype diversity, generating novel chimeric genotypes that most likely do not exist within the genome. In addition, the patterns of gene evolution are distorted, leading to incorrect haplotype relationships in some evolutionary reconstructions.
Fabrication of Free-Standing, Self-Aligned, High-Aspect-Ratio Synthetic Ommatidia.

PubMed

Jun, Brian M; Serra, Francesca; Xia, Yu; Kang, Hong Suk; Yang, Shu

2016-11-16

Free-standing, self-aligned, high-aspect-ratio (length to cross-section, up to 15.5) waveguides that mimic insects' ommatidia are fabricated. Self-aligned waveguides under the lenses are created after exposing photoresist SU-8 film through the negative polydimethylsiloxane (PDMS) lens array. Instead of drying from the developer, the waveguides are coated with poly(vinyl alcohol) and then immersed into a mixture of PDMS precursor and diethyl ether. The slow drying of diethyl ether, followed by curing and peeling off PDMS, allows for the fabrication of free-standing waveguides without collapse. We show that the synthetic ommatidia can confine light and propagate it all the way to the tips.
On the Evolution of the Cardiac Pacemaker

PubMed Central

Burkhard, Silja; van Eif, Vincent; Garric, Laurence; Christoffels, Vincent M.; Bakkers, Jeroen

2017-01-01

The rhythmic contraction of the heart is initiated and controlled by an intrinsic pacemaker system. Cardiac contractions commence at very early embryonic stages and coordination remains crucial for survival. The underlying molecular mechanisms of pacemaker cell development and function are still not fully understood. Heart form and function show high evolutionary conservation. Even in simple contractile cardiac tubes in primitive invertebrates, cardiac function is controlled by intrinsic, autonomous pacemaker cells. Understanding the evolutionary origin and development of cardiac pacemaker cells will help us outline the important pathways and factors involved. Key patterning factors, such as the homeodomain transcription factors Nkx2.5 and Shox2, and the LIM-homeodomain transcription factor Islet-1, components of the T-box (Tbx), and bone morphogenic protein (Bmp) families are well conserved. Here we compare the dominant pacemaking systems in various organisms with respect to the underlying molecular regulation. Comparative analysis of the pathways involved in patterning the pacemaker domain in an evolutionary context might help us outline a common fundamental pacemaker cell gene programme. Special focus is given to pacemaker development in zebrafish, an extensively used model for vertebrate development. Finally, we conclude with a summary of highly conserved key factors in pacemaker cell development and function. PMID:29367536
Simultaneous phylogeny reconstruction and multiple sequence alignment

PubMed Central

Yue, Feng; Shi, Jian; Tang, Jijun

2009-01-01

Background A phylogeny is the evolutionary history of a group of organisms. To date, sequence data is still the most used data type for phylogenetic reconstruction. Before any sequences can be used for phylogeny reconstruction, they must be aligned, and the quality of the multiple sequence alignment has been shown to affect the quality of the inferred phylogeny. At the same time, all the current multiple sequence alignment programs use a guide tree to produce the alignment and experiments showed that good guide trees can significantly improve the multiple alignment quality. Results We devise a new algorithm to simultaneously align multiple sequences and search for the phylogenetic tree that leads to the best alignment. We also implemented the algorithm as a C program package, which can handle both DNA and protein data and can take a simple cost model as well as complex substitution matrices, such as PAM250 or BLOSUM62. The performance of the new method is compared with that of other popular multiple sequence alignment tools, including widely used programs such as ClustalW and T-Coffee. Experimental results suggest that this method has good performance in terms of both phylogeny accuracy and alignment quality. Conclusion We present an algorithm to align multiple sequences and reconstruct the phylogenies that minimize the alignment score, which is based on an efficient algorithm to solve the median problems for three sequences. Our extensive experiments suggest that this method is very promising and can produce high quality phylogenies and alignments. PMID:19208110
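The "simple cost model" mentioned above can be made concrete with a minimal pairwise sketch in Python: a dynamic program that returns the minimum global alignment cost under fixed mismatch and gap costs. It only illustrates cost-minimizing alignment between two sequences. It is not the authors' C package, and it does not implement their three-sequence median step or tree search; the function name and the unit costs are assumptions.

```python
def alignment_cost(seq_a, seq_b, mismatch=1, gap=1):
    """Minimum global alignment cost of two sequences under a simple cost model.

    Matches cost 0, substitutions cost `mismatch`, and each gapped column
    costs `gap` (linear gap cost). Uses two rows of the dynamic programming
    table, so memory is proportional to len(seq_b).
    """
    previous = [j * gap for j in range(len(seq_b) + 1)]
    for i, res_a in enumerate(seq_a, start=1):
        current = [i * gap]
        for j, res_b in enumerate(seq_b, start=1):
            substitute = previous[j - 1] + (0 if res_a == res_b else mismatch)
            delete = previous[j] + gap       # res_a aligned to a gap
            insert = current[j - 1] + gap    # res_b aligned to a gap
            current.append(min(substitute, delete, insert))
        previous = current
    return previous[-1]


print(alignment_cost("ACGT", "AGT"))
```

Replacing the fixed mismatch cost with a lookup into a substitution matrix such as PAM250 or BLOSUM62 would require turning similarity scores into costs (or maximizing instead of minimizing), which this sketch leaves out.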
Evolutionary rescue in vertebrates: evidence, applications and uncertainty

PubMed Central

Vander Wal, E.; Garant, D.; Festa-Bianchet, M.; Pelletier, F.

2013-01-01

The current rapid rate of human-driven environmental change presents wild populations with novel conditions and stresses. Theory and experimental evidence for evolutionary rescue present a promising case for species facing environmental change persisting via adaptation. Here, we assess the potential for evolutionary rescue in wild vertebrates. Available information on evolutionary rescue was rare and restricted to abundant and highly fecund species that faced severe intentional anthropogenic selective pressures. However, examples from adaptive tracking in common species and genetic rescues in species of conservation concern provide convincing evidence in favour of the mechanisms of evolutionary rescue. We conclude that low population size, long generation times and limited genetic variability will result in evolutionary rescue occurring rarely for endangered species without intervention. Owing to the risks presented by current environmental change and the possibility of evolutionary rescue in nature, we suggest means to study evolutionary rescue by mapping genotype → phenotype → demography → fitness relationships, and priorities for applying evolutionary rescue to wild populations. PMID:23209171
Descriptive Statistics of the Genome: Phylogenetic Classification of Viruses.

PubMed

Hernandez, Troy; Yang, Jie

2016-10-01

The typical process for classifying and submitting a newly sequenced virus to the NCBI database involves two steps. First, a BLAST search is performed to determine likely family candidates. That is followed by checking the candidate families with the pairwise sequence alignment tool for similar species. The submitter's judgment is then used to determine the most likely species classification. The aim of this article is to show that this process can be automated into a fast, accurate, one-step process using the proposed alignment-free method and properly implemented machine learning techniques. We present a new family of alignment-free vectorizations of the genome, the generalized vector, that maintains the speed of existing alignment-free methods while outperforming all available methods. This new alignment-free vectorization uses the frequency of genomic words (k-mers), as is done in the composition vector, and incorporates descriptive statistics of those k-mers' positional information, as inspired by the natural vector. We analyze five different characterizations of genome similarity using k-nearest neighbor classification and evaluate these on two collections of viruses totaling over 10,000 viruses. We show that our proposed method performs better than, or as well as, other methods at every level of the phylogenetic hierarchy. The data and R code are available upon request.
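To make the idea of combining k-mer frequencies with positional statistics more concrete, here is a minimal Python sketch. It is not the authors' generalized vector (their code is in R and its exact definition is not given in the abstract); the function name `kmer_position_vector` and the choice of count, mean position, and position variance as the per-word statistics are assumptions made for illustration only.

```python
from collections import defaultdict
from itertools import product
from statistics import fmean, pvariance


def kmer_position_vector(seq, k=2, alphabet="ACGT"):
    """Return [count, mean position, position variance] for every k-mer.

    Hypothetical illustration: k-mers are listed in lexicographic order
    over `alphabet`, and positions are 0-based start indices.
    """
    # Record every start position at which each k-mer occurs.
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)

    vector = []
    for word in map("".join, product(alphabet, repeat=k)):
        pos = positions.get(word, [])
        if pos:
            vector.extend([len(pos), fmean(pos), pvariance(pos)])
        else:
            # Absent k-mers contribute zeros so all vectors share one length.
            vector.extend([0, 0.0, 0.0])
    return vector


example = kmer_position_vector("ACGTACGT", k=2)
```

Vectors built this way have a fixed length (3 × 4^k for DNA), so they could be passed to any standard k-nearest neighbor classifier without aligning the genomes first.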
A Vulnerability Assessment of 300 Species in Florida: Threats from Sea Level Rise, Land Use, and Climate Change

PubMed Central

Reece, Joshua Steven; Noss, Reed F.; Oetting, Jon; Hoctor, Tom; Volk, Michael

2013-01-01

Species face many threats, including accelerated climate change, sea level rise, and conversion and degradation of habitat from human land uses. Vulnerability assessments and prioritization protocols have been proposed to assess these threats, often in combination with information such as species rarity; ecological, evolutionary or economic value; and likelihood of success. Nevertheless, few vulnerability assessments or prioritization protocols simultaneously account for multiple threats or conservation values. We applied a novel vulnerability assessment tool, the Standardized Index of Vulnerability and Value, to assess the conservation priority of 300 species of plants and animals in Florida given projections of climate change, human land-use patterns, and sea level rise by the year 2100. We account for multiple sources of uncertainty and prioritize species under five different systems of value, ranging from a primary emphasis on vulnerability to threats to an emphasis on metrics of conservation value such as phylogenetic distinctiveness. Our results reveal remarkable consistency in the prioritization of species across different conservation value systems. Species of high priority include the Miami blue butterfly (Cyclargus thomasi bethunebakeri), Key tree cactus (Pilosocereus robinii), Florida duskywing butterfly (Ephyriades brunnea floridensis), and Key deer (Odocoileus virginianus clavium). We also identify sources of uncertainty and the types of life history information consistently missing across taxonomic groups. This study characterizes the vulnerabilities to major threats of a broad swath of Florida’s biodiversity and provides a system for prioritizing conservation efforts that is quantitative, flexible, and free from hidden value judgments. PMID:24260447
Multiple alignment analysis on phylogenetic tree of the spread of SARS epidemic using distance method

NASA Astrophysics Data System (ADS)

Amiroch, S.; Pradana, M. S.; Irawan, M. I.; Mukhlash, I.

2017-09-01

Multiple alignment (MA) is a particularly important tool for studying viral genomes and determining the evolutionary process of a specific virus. Applying MA to the spread of the severe acute respiratory syndrome (SARS) epidemic is of interest because this epidemic spread so quickly a few years ago that it drew medical attention in many countries. Although much software exists for processing multiple sequences, the use of pairwise alignment to build an MA remains important to consider. Previous research used the Super Pairwise Alignment algorithm to align the sequences for MA, whereas this study uses the Needleman-Wunsch dynamic programming algorithm, simulated in Matlab. Analysis of the resulting MA yields stable and unstable regions, which indicate the positions where mutations occur; the network topology that produces the phylogenetic tree of the SARS epidemic by the distance method; and the network of mutation areas.
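For readers unfamiliar with the Needleman-Wunsch algorithm named in this abstract, the following Python sketch shows the standard global pairwise alignment by dynamic programming. It is not the authors' Matlab implementation; the scoring values (match +1, mismatch -1, linear gap -1) and the function name are assumptions chosen for illustration.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align strings a and b; return (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


best_score, row_a, row_b = needleman_wunsch("ACGT", "AGT")
```

Columns where the two aligned rows agree correspond to what the abstract calls stable positions, and mismatch or gap columns mark candidate mutation sites.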
Comparative functional pan-genome analyses to build connections between genomic dynamics and phenotypic evolution in polycyclic aromatic hydrocarbon metabolism in the genus Mycobacterium.

PubMed

Kweon, Ohgew; Kim, Seong-Jae; Blom, Jochen; Kim, Sung-Kwan; Kim, Bong-Soo; Baek, Dong-Heon; Park, Su Inn; Sutherland, John B; Cerniglia, Carl E

2015-02-14

The bacterial genus Mycobacterium is of great interest in the medical and biotechnological fields. Despite a flood of genome sequencing and functional genomics data, significant gaps in knowledge between genome and phenome seriously hinder efforts toward the treatment of mycobacterial diseases and practical biotechnological applications. In this study, we propose the use of systematic, comparative functional pan-genomic analysis to build connections between genomic dynamics and phenotypic evolution in polycyclic aromatic hydrocarbon (PAH) metabolism in the genus Mycobacterium. Phylogenetic, phenotypic, and genomic information for 27 completely genome-sequenced mycobacteria was systematically integrated to reconstruct a mycobacterial phenotype network (MPN) with a pan-genomic concept at a network level. In the MPN, mycobacterial phenotypes show typical scale-free relationships. PAH degradation is an isolated phenotype with the lowest connection degree, consistent with phylogenetic and environmental isolation of PAH degraders. A series of functional pan-genomic analyses provide conserved and unique types of genomic evidence for strong epistatic and pleiotropic impacts on evolutionary trajectories of the PAH-degrading phenotype. Under strong natural selection, the detailed gene gain/loss patterns from horizontal gene transfer (HGT)/deletion events hypothesize a plausible evolutionary path, an epistasis-based birth and pleiotropy-dependent death, for PAH metabolism in the genus Mycobacterium. This study generated a practical mycobacterial compendium of phenotypic and genomic changes, focusing on the PAH-degrading phenotype, with a pan-genomic perspective of the evolutionary events and the environmental challenges. 
Our findings suggest that when selection acts on PAH metabolism, only a small fraction of possible trajectories is likely to be observed, owing mainly to a combination of the ambiguous phenotypic effects of PAHs and the corresponding pleiotropy- and epistasis-dependent evolutionary adaptation. Evolutionary constraints on the selection of trajectories, like those seen in PAH-degrading phenotypes, are likely to apply to the evolution of other phenotypes in the genus Mycobacterium.
Reproductive isolation, evolutionary distinctiveness and setting conservation priorities: The case of European lake whitefish and the endangered North Sea houting (Coregonus spp.)

PubMed Central

2008-01-01

Background Adaptive radiation within fishes of the Coregonus lavaretus complex has created numerous morphs, posing significant challenges for taxonomy and conservation priorities. The highly endangered North Sea houting (C. oxyrhynchus; abbreviated NSH) has been considered a separate species from European lake whitefish (C. lavaretus; abbreviated ELW) due to morphological divergence and adaptation to oceanic salinities. However, its evolutionary and taxonomic status is controversial. We analysed microsatellite DNA polymorphism in nine populations from the Jutland Peninsula and the Baltic Sea, representing NSH (three populations, two of which are reintroduced) and ELW (six populations). The objectives were to: 1) analyse postglacial recolonization of whitefish in the region; 2) assess the evolutionary distinctiveness of NSH, and 3) apply several approaches for defining conservation units towards setting conservation priorities for NSH. Results Bayesian cluster analyses of genetic differentiation identified four major groups, corresponding to NSH and three groups of ELW (Western Jutland, Central Jutland, Baltic Sea). Estimates of historical migration rates indicated recolonization in a north-eastern direction, suggesting that all except the Baltic Sea population predominantly represent postglacial recolonization via the ancient Elbe River. Contemporary gene flow has not occurred between NSH and ELW, with a divergence time within the last 4,000 years suggested from coalescence methods. NSH showed interbreeding with ELW when brought into contact by stocking. Thus, reproductive isolation of NSH was not absolute, although possible interbreeding beyond the F1 level could not be resolved. Conclusion Fishes of the C. lavaretus complex in the Jutland Peninsula originate from the same recolonization event. 
NSH has evolved recently and its species status may be questioned due to incomplete reproductive isolation from ELW, but it was shown to merit consideration as an independent conservation unit. Yet, application of several approaches for defining conservation units generated mixed outcomes regarding its conservation priority. Within the total species complex, it remains one among many recently evolved unique forms. Its uniqueness and high conservation priority is more evident at a local geographical scale, where conservation efforts will also benefit populations of a number of other endangered species. PMID:18471278

Efficient alignment-free DNA barcode analytics

PubMed Central

Kuksa, Pavel; Pavlovic, Vladimir

2009-01-01

Background In this work we consider barcode DNA analysis problems and address them using alternative, alignment-free methods and representations which model sequences as collections of short sequence fragments (features). The methods use fixed-length representations (spectrum) for barcode sequences to measure similarities or dissimilarities between sequences coming from the same or different species. The spectrum-based representation not only allows for accurate and computationally efficient species classification, but also opens the possibility of accurate clustering analysis of putative species barcodes and identification of critical within-barcode loci distinguishing barcodes of different sample groups. Results New alignment-free methods provide highly accurate and fast DNA barcode-based identification and classification of species with substantial improvements in accuracy and speed over state-of-the-art barcode analysis methods. We evaluate our methods on problems of species classification and identification using barcodes, important and relevant analytical tasks in many practical applications (adverse species movement monitoring, sampling surveys for unknown or pathogenic species identification, biodiversity assessment, etc.). On several benchmark barcode datasets, including ACG, Astraptes, Hesperiidae, Fish larvae, and Birds of North America, the proposed alignment-free methods considerably improve prediction accuracy compared to prior results. We also observe significant running time improvements over the state-of-the-art methods. Conclusion Our results show that newly developed alignment-free methods for DNA barcoding can efficiently and with high accuracy identify specimens by examining only a few barcode features, resulting in increased scalability and interpretability of current computational approaches to barcoding. PMID:19900305
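The spectrum representation described above can be sketched in a few lines of Python. This is an illustrative toy, not the authors' method: the cosine similarity, the nearest-reference rule, the default k = 3, and the reference sequences and labels are all assumptions.

```python
from collections import Counter
from math import sqrt


def spectrum(seq, k=3):
    """Count every overlapping k-mer in seq (the k-spectrum)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine_similarity(s1, s2):
    """Cosine of the angle between two k-mer count vectors."""
    dot = sum(count * s2[word] for word, count in s1.items())
    norm = sqrt(sum(c * c for c in s1.values())) * sqrt(sum(c * c for c in s2.values()))
    return dot / norm if norm else 0.0


def classify(query, references, k=3):
    """Return the label of the reference barcode most similar to the query."""
    q = spectrum(query, k)
    return max(references, key=lambda label: cosine_similarity(q, spectrum(references[label], k)))


# Hypothetical reference barcodes, far shorter than real ones.
references = {"species_A": "ACGTACGTACGT", "species_B": "TTTTGGGGCCCC"}
predicted = classify("ACGTACGA", references)
```

Because each sequence is reduced to k-mer counts, no alignment is computed at any point, and the k-mers that contribute most to the similarity can be inspected directly.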
Insights into the origin and distribution of biodiversity in the Brazilian Atlantic forest hot spot: a statistical phylogeographic study using a low-dispersal organism.

PubMed

Álvarez-Presas, M; Sánchez-Gracia, A; Carbayo, F; Rozas, J; Riutort, M

2014-06-01

The relative importance of the processes that generate and maintain biodiversity is a major and controversial topic in evolutionary biology with large implications for conservation management. The Atlantic Forest of Brazil, one of the world's richest biodiversity hot spots, is severely damaged by human activities. To formulate an efficient conservation policy, a good understanding of spatial and temporal biodiversity patterns and their underlying evolutionary mechanisms is required. With this aim, we performed a comprehensive phylogeographic study using a low-dispersal organism, the land planarian species Cephaloflexa bergi (Platyhelminthes, Tricladida). Analysing multi-locus DNA sequence variation under the Approximate Bayesian Computation framework, we evaluated two scenarios proposed to explain the diversity of Southern Atlantic Forest (SAF) region. We found that most sampled localities harbour high levels of genetic diversity, with lineages sharing common ancestors that predate the Pleistocene. Remarkably, we detected the molecular hallmark of the isolation-by-distance effect and little evidence of a recent colonization of SAF localities; nevertheless, some populations might result from very recent secondary contacts. We conclude that extant SAF biodiversity originated and has been shaped by complex interactions between ancient geological events and more recent evolutionary processes, whereas Pleistocene climate changes had a minor influence in generating present-day diversity. We also demonstrate that land planarians are an advantageous biological model for making phylogeographic and, particularly, fine-scale evolutionary inferences, and propose appropriate conservation policies.
Phylodiversity to inform conservation policy: An Australian example.

PubMed

Laity, Tania; Laffan, Shawn W; González-Orozco, Carlos E; Faith, Daniel P; Rosauer, Dan F; Byrne, Margaret; Miller, Joseph T; Crayn, Darren; Costion, Craig; Moritz, Craig C; Newport, Karl

2015-11-15

Phylodiversity measures summarise the phylogenetic diversity patterns of groups of organisms. By using branches of the tree of life, rather than its tips (e.g., species), phylodiversity measures provide important additional information about biodiversity that can improve conservation policy and outcomes. As a biodiverse nation with a strong legislative and policy framework, Australia provides an opportunity to use phylogenetic information to inform conservation decision-making. We explored the application of phylodiversity measures across Australia with a focus on two highly biodiverse regions, the south west of Western Australia (SWWA) and the South East Queensland bioregion (SEQ). We analysed seven diverse groups of organisms spanning five separate phyla on the evolutionary tree of life, the plant genera Acacia and Daviesia, mammals, hylid frogs, myobatrachid frogs, passerine birds, and camaenid land snails. We measured species richness, weighted species endemism (WE) and two phylodiversity measures, phylogenetic diversity (PD) and phylogenetic endemism (PE), as well as their respective complementarity scores (a measure of gains and losses) at 20 km resolution. Higher PD was identified within SEQ for all fauna groups, whereas more PD was found in SWWA for both plant groups. PD and PD complementarity were strongly correlated with species richness and species complementarity for most groups but less so for plants. PD and PE were found to complement traditional species-based measures for all groups studied: PD and PE follow similar spatial patterns to richness and WE, but highlighted different areas that would not be identified by conventional species-based biodiversity analyses alone. The application of phylodiversity measures, particularly the novel weighted complementary measures considered here, in conservation can enhance protection of the evolutionary history that contributes to present day biodiversity values of areas. 
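As a rough illustration of how a branch-based measure differs from a species count, the Python sketch below computes phylogenetic diversity (PD) as the total length of the branches connecting a set of taxa to the root. The tree encoding (child mapped to parent and branch length), the toy tree, and the function name are assumptions for this example; the study's own software and its endemism and complementarity measures are not reproduced here.

```python
def phylogenetic_diversity(tree, taxa):
    """Sum the lengths of all branches on the paths from the given taxa to the root.

    `tree` maps each non-root node to a (parent, branch_length) pair.
    Each branch is counted once even if several taxa share it.
    """
    visited = set()
    total = 0.0
    for tip in taxa:
        node = tip
        # Walk towards the root, stopping at the root or at a branch already counted.
        while node in tree and node not in visited:
            parent, length = tree[node]
            visited.add(node)
            total += length
            node = parent
    return total


# Hypothetical toy tree: ((A:1, B:1):2, C:3)
toy_tree = {
    "A": ("n1", 1.0),
    "B": ("n1", 1.0),
    "n1": ("root", 2.0),
    "C": ("root", 3.0),
}
pd_sisters = phylogenetic_diversity(toy_tree, ["A", "B"])
pd_distant = phylogenetic_diversity(toy_tree, ["A", "C"])
```

Both sets contain two species, yet the pair spanning the deeper split accumulates more branch length, which is the kind of information a species richness count alone cannot capture.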
Phylogenetic measures in conservation can include important elements of biodiversity in conservation planning, such as evolutionary potential and feature diversity that will improve decision-making and lead to better biodiversity conservation outcomes. Crown Copyright © 2015. Published by Elsevier B.V. All rights reserved.
Alignment methods: strategies, challenges, benchmarking, and comparative overview.

PubMed

Löytynoja, Ari

2012-01-01

Comparative evolutionary analyses of molecular sequences are solely based on the identities and differences detected between homologous characters. Errors in this homology statement, that is errors in the alignment of the sequences, are likely to lead to errors in the downstream analyses. Sequence alignment and phylogenetic inference are tightly connected and many popular alignment programs use the phylogeny to divide the alignment problem into smaller tasks. They then neglect the phylogenetic tree, however, and produce alignments that are not evolutionarily meaningful. The use of phylogeny-aware methods reduces the error but the resulting alignments, with evolutionarily correct representation of homology, can challenge the existing practices and methods for viewing and visualising the sequences. The inter-dependency of alignment and phylogeny can be resolved by joint estimation of the two; methods based on statistical models allow for inferring the alignment parameters from the data and correctly take into account the uncertainty of the solution but remain computationally challenging. Widely used alignment methods are based on heuristic algorithms and unlikely to find globally optimal solutions. The whole concept of one correct alignment for the sequences is questionable, however, as there typically exist vast numbers of alternative, roughly equally good alignments that should also be considered. This uncertainty is hidden by many popular alignment programs and is rarely correctly taken into account in the downstream analyses. The quest for finding and improving the alignment solution is complicated by the lack of suitable measures of alignment goodness. The difficulty of comparing alternative solutions also affects benchmarks of alignment methods and the results strongly depend on the measure used. As the effects of alignment error cannot be predicted, comparing the alignments' performance in downstream analyses is recommended.
Evolutionary re-wiring of p63 and the epigenomic regulatory landscape in keratinocytes and its potential implications on species-specific gene expression and phenotypes

PubMed Central

Sethi, Isha; Gluck, Christian; Zhou, Huiqing

2017-01-01

Abstract Although epidermal keratinocyte development and differentiation proceed in a similar fashion between humans and mice, evolutionary pressures have also wrought significant species-specific physiological differences. These differences between species could arise, in part, by the rewiring of regulatory networks due to changes in the global targets of lineage-specific transcriptional master regulators such as p63. Here we have performed a systematic and comparative analysis of the p63 target gene network within the integrated framework of the transcriptomic and epigenomic landscape of mouse and human keratinocytes. We determined that there exists a core set of ∼1600 genomic regions distributed among enhancers and super-enhancers, which are conserved and occupied by p63 in keratinocytes from both species. Notably, these DNA segments are typified by consensus p63 binding motifs under purifying selection and are associated with genes involved in key keratinocyte and skin-centric biological processes. However, the majority of the p63-bound mouse target regions consist of either murine-specific DNA elements that are not alignable to the human genome or exhibit no p63 binding in the orthologous syntenic regions, typifying an occupancy lost subset. Our results suggest that these evolutionarily divergent regions have undergone significant turnover of p63 binding sites and are associated with an underlying inactive and inaccessible chromatin state, indicative of their selective functional activity in the transcriptional regulatory network in mouse but not human. Furthermore, we demonstrate that this selective targeting of genes by p63 correlates with subtle, but measurable transcriptional differences in mouse and human keratinocytes that converge on major metabolic processes, which often exhibit species-specific trends. 
Collectively, our study offers a possible molecular explanation for the observable phenotypic differences between mouse and human skin and broadly informs on the prevailing principles that govern the tug-of-war between evolutionary forces of rigidity and plasticity over transcriptional regulatory programs. PMID:28505376
Phylogenetic Network Analysis Revealed the Occurrence of Horizontal Gene Transfer of 16S rRNA in the Genus Enterobacter

PubMed Central

Sato, Mitsuharu; Miyazaki, Kentaro

2017-01-01

Horizontal gene transfer (HGT) is a ubiquitous genetic event in bacterial evolution, but it seldom occurs for genes involved in highly complex supramolecules (or biosystems), which consist of many gene products. The ribosome is one such supramolecule, but several bacteria harbor dissimilar and/or chimeric 16S rRNAs in their genomes, suggesting the occurrence of HGT of this gene. However, we know little about whether the genes actually experience HGT and, if so, the frequency of such a transfer. This is primarily because the methods currently employed for phylogenetic analysis (e.g., neighbor-joining, maximum likelihood, and maximum parsimony) of 16S rRNA genes assume point mutation-driven tree-shape evolution as an evolutionary model, which is intrinsically inappropriate to decipher the evolutionary history for genes driven by recombination. To address this issue, we applied a phylogenetic network analysis, which has been used previously for detection of genetic recombination in homologous alleles, to the 16S rRNA gene. We focused on the genus Enterobacter, whose phylogenetic relationships inferred by multi-locus sequence alignment analysis and 16S rRNA sequences are incompatible. All 10 complete genomic sequences were retrieved from the NCBI database, in which 71 16S rRNA genes were included. Neighbor-joining analysis demonstrated that the genes residing in the same genomes clustered, indicating the occurrence of intragenomic recombination. However, as suggested by the low bootstrap values, evolutionary relationships between the clusters were uncertain. We then applied phylogenetic network analysis to representative sequences from each cluster. We found three ancestral 16S rRNA groups; the others were likely created through recursive recombination between the ancestors and chimeric descendants. Despite the large sequence changes caused by the recombination events, the RNA secondary structures were conserved. 
Successive intergenomic and intragenomic recombination thus shaped the evolution of 16S rRNA genes in the genus Enterobacter. PMID:29180992
Skylign: a tool for creating informative, interactive logos representing sequence alignments and profile hidden Markov models

PubMed Central

2014-01-01

Background Logos are commonly used in molecular biology to provide a compact graphical representation of the conservation pattern of a set of sequences. They render the information contained in sequence alignments or profile hidden Markov models by drawing a stack of letters for each position, where the height of the stack corresponds to the conservation at that position, and the height of each letter within a stack depends on the frequency of that letter at that position. Results We present a new tool and web server, called Skylign, which provides a unified framework for creating logos for both sequence alignments and profile hidden Markov models. In addition to static image files, Skylign creates a novel interactive logo plot for inclusion in web pages. These interactive logos enable scrolling, zooming, and inspection of underlying values. Skylign can avoid sampling bias in sequence alignments by down-weighting redundant sequences and by combining observed counts with informed priors. It also simplifies the representation of gap parameters, and can optionally scale letter heights based on alternate calculations of the conservation of a position. Conclusion Skylign is available as a website, a scriptable web service with a RESTful interface, and as a software package for download. Skylign’s interactive logos are easily incorporated into a web page with just a few lines of HTML markup. Skylign may be found at http://skylign.org. PMID:24410852
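The relationship between stack height and conservation can be illustrated with the classic information-content calculation for sequence logos. The Python sketch below is a generic example under simple assumptions (DNA alphabet, uniform background, no sequence weighting, no priors, no small-sample correction); it does not reproduce Skylign's own calculations or its web service interface, and the function name is hypothetical.

```python
from collections import Counter
from math import log2


def logo_heights(alignment, alphabet="ACGT"):
    """Return, for each alignment column, a dict of letter -> height in bits.

    Stack height is the column's information content,
    log2(len(alphabet)) minus the Shannon entropy, and each letter's
    share of the stack is proportional to its frequency.
    """
    heights = []
    for column in zip(*alignment):
        # Ignore gaps and any character outside the alphabet.
        counts = Counter(c for c in column if c in alphabet)
        total = sum(counts.values())
        if total == 0:
            heights.append({})
            continue
        freqs = {c: counts[c] / total for c in alphabet if counts[c]}
        entropy = -sum(f * log2(f) for f in freqs.values())
        info = log2(len(alphabet)) - entropy
        heights.append({c: f * info for c, f in freqs.items()})
    return heights


# Hypothetical four-sequence alignment.
columns = logo_heights(["ACGT", "ACGA", "ACCA", "ATCA"])
```

A fully conserved column reaches the maximum of 2 bits for DNA, while a column split evenly between two letters yields a 1-bit stack shared equally by both.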
Evaluation of the Role of Functional Constraints on the Integrity of an Ultraconserved Region in the Genus Drosophila

PubMed Central

Díaz-Castillo, Carlos; Xia, Xiao-Qin; Ranz, José M.

2012-01-01

Why gene order is conserved over long evolutionary timespans remains elusive. A common interpretation is that gene order conservation might reflect the existence of functional constraints that are important for organismal performance. Alteration of the integrity of genomic regions, and therefore of those constraints, would result in detrimental effects. This notion seems especially plausible in those genomes that can easily accommodate gene reshuffling via chromosomal inversions since genomic regions free of constraints are likely to have been disrupted in one or more lineages. Nevertheless, no empirical test of this notion has been performed. Here, we disrupt one of the largest conserved genomic regions of the Drosophila genome by chromosome engineering and examine the phenotypic consequences derived from such disruption. The targeted region exhibits multiple patterns of functional enrichment suggestive of the presence of constraints. The carriers of the disrupted collinear block show no defects in their viability, fertility, and parameters of general homeostasis, although their odorant perception is altered. This change in odorant perception does not correlate with modifications of the level of expression and sex bias of the genes within the genomic region disrupted. Our results indicate that even in highly rearranged genomes, like those of Diptera, unusually high levels of gene order conservation cannot be systematically attributed to functional constraints, which raises the possibility that other mechanisms can be in place and therefore the underpinnings of the maintenance of gene organization might be more diverse than previously thought. PMID:22319453
Conserved thioredoxin fold is present in Pisum sativum L. sieve element occlusion-1 protein

PubMed Central

Umate, Pavan; Tuteja, Renu

2010-01-01

A homology-based three-dimensional model of the Pisum sativum sieve element occlusion 1 (Ps.SEO1) (forisome) protein was constructed. A stretch of amino acids (residues 320 to 456) that is well conserved in all known members of the forisome proteins was used to model the 3D structure of Ps.SEO1. The structural prediction was done using the Protein Homology/analogY Recognition Engine (PHYRE) web server. Based on studies of local sequence alignment, the thioredoxin-fold-containing protein [Structural Classification of Proteins (SCOP) code d1o73a_], a member of the glutathione peroxidase family, was selected as a template for modeling the spatial structure of Ps.SEO1. Selection was based on comparison of primary sequence, higher match quality and alignment accuracy. Motif 1 (EVF) is conserved in Ps.SEO1, Vicia faba (Vf.For1) and Medicago truncatula (MT.SEO3); motif 2 (KKED) is well conserved across all forisome proteins; and motif 3 (IGYIGNP) is conserved in Ps.SEO1 and Vf.For1. PMID:20404566
Visualization of genome signatures of eukaryote genomes by batch-learning self-organizing map with a special emphasis on Drosophila genomes.

PubMed

Abe, Takashi; Hamano, Yuta; Ikemura, Toshimichi

2014-01-01

A strategy of evolutionary studies that can compare vast numbers of genome sequences is becoming increasingly important with the remarkable progress of high-throughput DNA sequencing methods. We previously established a sequence alignment-free clustering method "BLSOM" for di-, tri-, and tetranucleotide compositions in genome sequences, which can characterize sequence characteristics (genome signatures) of a wide range of species. In the present study, we generated BLSOMs for tetra- and pentanucleotide compositions in approximately one million sequence fragments derived from 101 eukaryotes, for which almost complete genome sequences were available. BLSOM recognized phylotype-specific characteristics (e.g., key combinations of oligonucleotide frequencies) in the genome sequences, permitting phylotype-specific clustering of the sequences without any information regarding the species. In our detailed examination of 12 Drosophila species, the correlation between their phylogenetic classification and the classification on the BLSOMs was observed to visualize oligonucleotides diagnostic for species-specific clustering.
Facilitating Constructive Alignment in Power Systems Engineering Education Using Free and Open-Source Software

ERIC Educational Resources Information Center

Vanfretti, L.; Milano, F.

2012-01-01

This paper describes how the use of free and open-source software (FOSS) can facilitate the application of constructive alignment theory in power systems engineering education by enabling the deep learning approach in power system analysis courses. With this aim, this paper describes the authors' approach in using the Power System Analysis Toolbox…
Evolutionary conservation of codon optimality reveals hidden signatures of cotranslational folding.

PubMed

Pechmann, Sebastian; Frydman, Judith

2013-02-01

The choice of codons can influence local translation kinetics during protein synthesis. Whether codon preference is linked to cotranslational regulation of polypeptide folding remains unclear. Here, we derive a revised translational efficiency scale that incorporates the competition between tRNA supply and demand. Applying this scale to ten closely related yeast species, we uncover the evolutionary conservation of codon optimality in eukaryotes. This analysis reveals universal patterns of conserved optimal and nonoptimal codons, often in clusters, which associate with the secondary structure of the translated polypeptides independent of the levels of expression. Our analysis suggests an evolved function for codon optimality in regulating the rhythm of elongation to facilitate cotranslational polypeptide folding, beyond its previously proposed role of adapting to the cost of expression. These findings establish how mRNA sequences are generally under selection to optimize the cotranslational folding of corresponding polypeptides.
Quantitation of base substitutions in eukaryotic 5S rRNA: selection for the maintenance of RNA secondary structure.

PubMed

Curtiss, W C; Vournakis, J N

1984-01-01

Eukaryotic 5S rRNA sequences from 34 diverse species were compared by the following method: (1) The sequences were aligned; (2) the positions of substitutions were located by comparison of all possible pairs of sequences; (3) the substitution sites were mapped to an assumed general base pairing model; and (4) the R-Y model of base stacking was used to study stacking pattern relationships in the structure. An analysis of the sequence and structure variability in each region of the molecule is presented. It was found that the degree of base substitution varies over a wide range, from absolute conservation to occurrence of over 90% of the possible observable substitutions. The substitutions are located primarily in stem regions of the 5S rRNA secondary structure. More than 88% of the substitutions in helical regions maintain base pairing. The disruptive substitutions are primarily located at the edges of helical regions, resulting in shortening of the helical regions and lengthening of the adjacent nonpaired regions. Base stacking patterns determined by the R-Y model are mapped onto the general secondary structure. Intrastrand and interstrand stacking could stabilize alternative coaxial structures and limit the conformational flexibility of nonpaired regions. Two short contiguous regions are 100% conserved in all species. This may reflect evolutionary constraints imposed at the DNA level by the requirement for binding of a 5S gene transcription initiation factor during gene expression.
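Steps (2) and (3) of the method, locating substitutions and mapping them onto a base-pairing model, can be sketched as follows in Python. This is only a schematic illustration: it scores substituted base pairs rather than individual substitutions, it treats G-U wobble pairs as maintaining pairing, and the sequences and pairing positions are invented; none of these choices is taken from the paper.

```python
# Pairs treated as maintaining a helix: Watson-Crick plus G-U wobble (an assumption).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def pairing_preserved_fraction(ref, other, paired_positions):
    """Among paired positions that differ between two aligned sequences,
    return the fraction that still form a base pair in `other`.

    Returns None when no paired position carries a substitution.
    """
    changed = preserved = 0
    for i, j in paired_positions:
        if (ref[i], ref[j]) == (other[i], other[j]):
            continue  # no substitution in this pair
        changed += 1
        if (other[i], other[j]) in PAIRS:
            preserved += 1
    return preserved / changed if changed else None


# Hypothetical hairpin: positions 0-2 pair with positions 8-6.
helix = [(0, 8), (1, 7), (2, 6)]
compensated = pairing_preserved_fraction("GGCAAAGCC", "GACAAAGUC", helix)
disrupted = pairing_preserved_fraction("GGCAAAGCC", "GACAAAGCC", helix)
```

In the first comparison a G-C pair is replaced by A-U, so pairing is maintained; in the second only one side changes and the pair is lost.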
Identifying designatable units for intraspecific conservation prioritization: a hierarchical approach applied to the lake whitefish species complex (Coregonus spp.)

PubMed Central

Mee, Jonathan A; Bernatchez, Louis; Reist, Jim D; Rogers, Sean M; Taylor, Eric B

2015-01-01

The concept of the designatable unit (DU) affords a practical approach to identifying diversity below the species level for conservation prioritization. However, its suitability for defining conservation units in ecologically diverse, geographically widespread and taxonomically challenging species complexes has not been broadly evaluated. The lake whitefish species complex (Coregonus spp.) is geographically widespread in the Northern Hemisphere, and it contains a great deal of variability in ecology and evolutionary legacy within and among populations, as well as a great deal of taxonomic ambiguity. Here, we employ a set of hierarchical criteria to identify DUs within the Canadian distribution of the lake whitefish species complex. We identified 36 DUs based on (i) reproductive isolation, (ii) phylogeographic groupings, (iii) local adaptation and (iv) biogeographic regions. The identification of DUs is required for clear discussion regarding the conservation prioritization of lake whitefish populations. We suggest conservation priorities among lake whitefish DUs based on biological consequences of extinction, risk of extinction and distinctiveness. Our results exemplify the need for extensive genetic and biogeographic analyses for any species with broad geographic distributions and the need for detailed evaluation of evolutionary history and adaptive ecological divergence when defining intraspecific conservation units. PMID:26029257
Genomics and evolutionary aspect of calcium signaling event in calmodulin and calmodulin-like proteins in plants.

PubMed

Mohanta, Tapan Kumar; Kumar, Pradeep; Bae, Hanhong

2017-02-03

The Ca2+ ion is a versatile second messenger that operates in a wide range of cellular processes that impact nearly every aspect of life. Ca2+ regulates gene expression and biotic and abiotic stress responses in organisms ranging from unicellular algae to multicellular higher plants through cascades of calcium signaling processes. In this study, we deciphered the genomic and evolutionary aspects of calcium signaling by calmodulin (CaM) and calmodulin-like (CML) proteins. We studied the CaM and CML gene families of 41 different species across the plant lineages. Genomic analysis showed that plants encode more calmodulin-like proteins than calmodulins. Further analyses showed that the majority of CMLs were intronless, while CaMs were intron rich. Multiple sequence alignment showed that the EF-hand domain of CaM contains four conserved D-x-D motifs, one in each EF-hand, while CMLs contain only one D-x-D-x-D motif, in the fourth EF-hand. Phylogenetic analysis revealed that the CMLs evolved earlier than CaM and later diversified. Gene expression analysis demonstrated that different CaM and CML genes were expressed differentially in different tissues in a spatio-temporal manner. In this study we provide detailed genome-wide identification and characterization of the CaM and CML protein families, their phylogenetic relationships, and domain structure. Expression studies of CaM and CML genes were conducted in Glycine max and Phaseolus vulgaris. Our study provides a strong foundation for future functional research on the CaM and CML gene families in the plant kingdom.
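The D-x-D and D-x-D-x-D motifs mentioned in this abstract (where x is any residue) are simple enough to locate with a regular expression. The sketch below is a minimal illustration, not the authors' pipeline: the helper name `motif_positions` and the toy fragment are invented, and the patterns take the motif notation literally.

```python
import re

# 'x' in D-x-D stands for any residue. The lookahead lets overlapping hits be reported.
DXD = re.compile(r"(?=(D.D))")
DXDXD = re.compile(r"(?=(D.D.D))")

def motif_positions(pattern, protein):
    """Return 0-based start positions of every (possibly overlapping) match."""
    return [m.start() for m in pattern.finditer(protein)]

# Hypothetical fragment, not a real CaM or CML sequence.
toy_protein = "MKDADGNGTIDKDGDGS"
dxd_hits = motif_positions(DXD, toy_protein)
dxdxd_hits = motif_positions(DXDXD, toy_protein)
```

In practice the search would be restricted to annotated EF-hand regions; that annotation step is outside this sketch.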
Exceptional Evolutionary Expansion of Prefrontal Cortex in Great Apes and Humans.

PubMed

Smaers, Jeroen B; Gómez-Robles, Aida; Parks, Ashley N; Sherwood, Chet C

2017-03-06

One of the enduring questions that has driven neuroscientific enquiry in the last century has been the nature of differences in the prefrontal cortex of humans versus other animals [1]. The prefrontal cortex has drawn particular interest due to its role in a range of evolutionarily specialized cognitive capacities such as language [2], imagination [3], and complex decision making [4]. Both cytoarchitectonic [5] and comparative neuroimaging [6] studies have converged on the conclusion that the proportion of prefrontal cortex in the human brain is greatly increased relative to that of other primates. However, considering the tremendous overall expansion of the neocortex in human evolution, it has proven difficult to ascertain whether this extent of prefrontal enlargement follows general allometric growth patterns, or whether it is exceptional [1]. Species' adherence to a common allometric relationship suggests conservation through phenotypic integration, while species' deviations point toward the occurrence of shifts in genetic and/or developmental mechanisms. Here we investigate prefrontal cortex scaling across anthropoid primates and find that great ape and human prefrontal cortex expansion are non-allometrically derived features of cortical organization. This result aligns with evidence for a developmental heterochronic shift in human prefrontal growth [7, 8], suggesting an association between neurodevelopmental changes and cortical organization on a macroevolutionary scale. The evolutionary origin of non-allometric prefrontal enlargement is estimated to lie at the root of great apes (∼19-15 mya), indicating that selection for changes in executive cognitive functions characterized both great ape and human cortical organization. Copyright © 2017 Elsevier Ltd. All rights reserved.
Amino acid positions subject to multiple coevolutionary constraints can be robustly identified by their eigenvector network centrality scores.

PubMed

Parente, Daniel J; Ray, J Christian J; Swint-Kruse, Liskin

2015-12-01

As proteins evolve, amino acid positions key to protein structure or function are subject to mutational constraints. These positions can be detected by analyzing sequence families for amino acid conservation or for coevolution between pairs of positions. Coevolutionary scores are usually rank-ordered and thresholded to reveal the top pairwise scores, but they also can be treated as weighted networks. Here, we used network analyses to bypass a major complication of coevolution studies: For a given sequence alignment, alternative algorithms usually identify different top pairwise scores. We reconciled results from five commonly-used, mathematically divergent algorithms (ELSC, McBASC, OMES, SCA, and ZNMI), using the LacI/GalR and 1,6-bisphosphate aldolase protein families as models. Calculations used unthresholded coevolution scores from which column-specific properties such as sequence entropy and random noise were subtracted; "central" positions were identified by calculating various network centrality scores. When compared among algorithms, network centrality methods, particularly eigenvector centrality, showed markedly better agreement than comparisons of the top pairwise scores. Positions with large centrality scores occurred at key structural locations and/or were functionally sensitive to mutations. Further, the top central positions often differed from those with top pairwise coevolution scores: instead of a few strong scores, central positions often had multiple, moderate scores. We conclude that eigenvector centrality calculations reveal a robust evolutionary pattern of constraints, detectable by divergent algorithms, that occur at key protein locations. Finally, we discuss the fact that multiple patterns coexist in evolutionary data that, together, give rise to emergent protein functions. © 2015 Wiley Periodicals, Inc.
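To make the contrast between "top pairwise score" and "central position" concrete, eigenvector centrality can be computed by power iteration on a symmetric matrix of pairwise scores. The sketch below is a generic textbook version, not the authors' code: it assumes non-negative scores, and the five-position matrix is invented so that position 0 has several moderate scores while positions 1 and 2 share the single strongest score.

```python
def eigenvector_centrality(weights, iterations=200, tol=1e-10):
    """Power iteration on a symmetric, non-negative weight matrix (list of lists)."""
    n = len(weights)
    vec = [1.0 / n] * n
    for _ in range(iterations):
        nxt = [sum(weights[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = sum(v * v for v in nxt) ** 0.5
        if norm == 0:
            raise ValueError("weight matrix has no non-zero entries")
        nxt = [v / norm for v in nxt]
        if max(abs(a - b) for a, b in zip(nxt, vec)) < tol:
            return nxt
        vec = nxt
    return vec

# Hypothetical coevolution scores between five alignment columns.
toy_scores = [
    [0.0, 0.5, 0.5, 0.5, 0.5],
    [0.5, 0.0, 0.7, 0.0, 0.0],
    [0.5, 0.7, 0.0, 0.0, 0.0],
    [0.5, 0.0, 0.0, 0.0, 0.0],
    [0.5, 0.0, 0.0, 0.0, 0.0],
]
centrality = eigenvector_centrality(toy_scores)
most_central = max(range(len(centrality)), key=centrality.__getitem__)
```

In this toy matrix the largest single score links positions 1 and 2, yet position 0 comes out as the most central, mirroring the abstract's observation that central positions often carry multiple moderate scores.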
Analysis of septins across kingdoms reveals orthology and new motifs.

PubMed

Pan, Fangfang; Malmberg, Russell L; Momany, Michelle

2007-07-01

Septins are cytoskeletal GTPase proteins first discovered in the fungus Saccharomyces cerevisiae where they organize the septum and link nuclear division with cell division. More recently septins have been found in animals where they are important in processes ranging from actin and microtubule organization to embryonic patterning and where defects in septins have been implicated in human disease. Previous studies suggested that many animal septins fell into independent evolutionary groups, confounding cross-kingdom comparison. In the current work, we identified 162 septins from fungi, microsporidia and animals and analyzed their phylogenetic relationships. There was support for five groups of septins with orthology between kingdoms. Group 1 (which includes S. cerevisiae Cdc10p and human Sept9) and Group 2 (which includes S. cerevisiae Cdc3p and human Sept7) contain sequences from fungi and animals. Group 3 (which includes S. cerevisiae Cdc11p) and Group 4 (which includes S. cerevisiae Cdc12p) contain sequences from fungi and microsporidia. Group 5 (which includes Aspergillus nidulans AspE) contains sequences from filamentous fungi. We suggest a modified nomenclature based on these phylogenetic relationships. Comparative sequence alignments revealed septin derivatives of already known G1, G3 and G4 GTPase motifs, four new motifs from two to twelve amino acids long and six conserved single amino acid positions. One of these new motifs is septin-specific and several are group specific. Our studies provide an evolutionary history for this important family of proteins and a framework and consistent nomenclature for comparison of septin orthologs across kingdoms.
The Three Domains of Conservation Genetics: Case Histories from Hawaiian Waters.

PubMed

Bowen, Brian W

2016-07-01

The scientific field of conservation biology is dominated by 3 specialties: phylogenetics, ecology, and evolution. Under this triad, phylogenetics is oriented towards the past history of biodiversity, conserving the divergent branches in the tree of life. The ecological component is rooted in the present, maintaining the contemporary life support systems for biodiversity. Evolutionary conservation (as defined here) is concerned with preserving the raw materials for generating future biodiversity. All 3 domains can be documented with genetic case histories in the waters of the Hawaiian Archipelago, an isolated chain of volcanic islands with 2 types of biodiversity: colonists, and new species that arose from colonists. This review demonstrates that 1) phylogenetic studies have identified previously unknown branches in the tree of life that are endemic to Hawaiian waters; 2) population genetic surveys define isolated marine ecosystems as management units, and 3) phylogeographic analyses illustrate the pathways of colonization that can enhance future biodiversity. Conventional molecular markers have advanced all 3 domains in conservation biology over the last 3 decades, and recent advances in genomics are especially valuable for understanding the foundations of future evolutionary diversity. © The American Genetic Association. 2016. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
The evolutionarily conserved transcription factor PRDM12 controls sensory neuron development and pain perception.

PubMed

Nagy, Vanja; Cole, Tiffany; Van Campenhout, Claude; Khoung, Thang M; Leung, Calvin; Vermeiren, Simon; Novatchkova, Maria; Wenzel, Daniel; Cikes, Domagoj; Polyansky, Anton A; Kozieradzki, Ivona; Meixner, Arabella; Bellefroid, Eric J; Neely, G Gregory; Penninger, Josef M

2015-01-01

PR homology domain-containing member 12 (PRDM12) belongs to a family of conserved transcription factors implicated in cell fate decisions. Here we show that PRDM12 is a key regulator of sensory neuronal specification in Xenopus. Modeling of human PRDM12 mutations that cause hereditary sensory and autonomic neuropathy (HSAN) revealed remarkable conservation of the mutated residues in evolution. Expression of wild-type human PRDM12 in Xenopus induced the expression of sensory neuronal markers, which was reduced using various human PRDM12 mutants. In Drosophila, we identified Hamlet as the functional PRDM12 homolog that controls nociceptive behavior in sensory neurons. Furthermore, expression analysis of human patient fibroblasts with PRDM12 mutations uncovered possible downstream target genes. Knockdown of several of these target genes including thyrotropin-releasing hormone degrading enzyme (TRHDE) in Drosophila sensory neurons resulted in altered cellular morphology and impaired nociception. These data show that PRDM12 and its functional fly homolog Hamlet are evolutionarily conserved master regulators of sensory neuronal specification and play a critical role in pain perception. Our data also uncover novel pathways in multiple species that regulate evolutionarily conserved nociception.

Bergmann's Rule rules body size in an ectotherm: heat conservation in a lizard along a 2200-metre elevational gradient.

PubMed

Zamora-Camacho, F J; Reguera, S; Moreno-Rueda, G

2014-12-01

Bergmann's Rule predicts larger body sizes in colder habitats, increasing organisms' ability to conserve heat. Originally formulated for endotherms, it is controversial whether Bergmann's Rule may be applicable to ectotherms, given that larger ectotherms show diminished capacity for heating up. We predict that Bergmann's Rule will be applicable to ectotherms when the benefits of a higher conservation of heat due to a larger body size overcompensate for decreased capacity for heating up. We test this hypothesis in the lizard Psammodromus algirus, which shows increased body size with elevation in Sierra Nevada (SE Spain). We measured heating and cooling rates of lizards from different elevations (from 300 to 2500 m above sea level) under controlled conditions. We found no significant differences in the heating rate along an elevational gradient. However, the cooling rate diminished with elevation and body size: highland lizards, with larger masses, have a higher thermal inertia for cooling, which allows them to maintain heat for more time and keep a high body temperature despite the lower thermal availability. Consequently, the net gain of heat increased with elevation and body size. This study highlights that the heat conservation mechanism for explaining Bergmann's Rule works and is applicable to ectotherms, depending on the thermal benefits and costs associated with larger body sizes. Journal of Evolutionary Biology © 2014 European Society For Evolutionary Biology.
Evolutionary traps as keys to understanding behavioral maladaptation

USGS Publications Warehouse

Robertson, Bruce A.; Chalfoun, Anna

2016-01-01

Evolutionary traps are severe cases of behavioral maladaptation that occur when, due to human activity, the cues animals use to guide their behavior become uncoupled from their fitness consequences. The result is that animals can prefer the most dangerous resources or behaviors, even when better options are available. Traps are increasingly common and represent a significant wildlife conservation problem. Understanding of the more proximate sensory-cognitive mechanisms underpinning traps remains poor, which highlights the need for interdisciplinary and collaborative approaches to investigating traps. Key to advancing basic trap theory and its conservation applications will be the development of appropriate and tractable model systems to investigate the mechanisms that cause traps within species, and how mechanisms vary across species.
Are lowland rainforests really evolutionary museums? Phylogeography of the green hylia (Hylia prasina) in the Afrotropics.

PubMed

Marks, Ben D

2010-04-01

A recent trend in the literature highlights the special role that tropical montane regions and habitat transitions peripheral to large blocks of lowland rainforest play in the diversification process. The emerging view is one of lowland rainforests as evolutionary 'museums', where biotic diversity is maintained over evolutionary time and additional diversity is accrued from peripheral areas, but where there has been little recent diversification. This leads to the prediction of genetic diversity without geographic structure in widespread taxa. Here, I assess the notion of the lowland rainforest 'museum' with a phylogeographic study of the green hylia (Aves: Sylviidae: Hylia prasina) using 1132 bp of mtDNA sequence data. The distribution of genetic diversity within the mainland subspecies of Hylia reveals five highly divergent haplotype groups distributed in accordance with broad-scale areas of endemism in the Afrotropics. This pattern of genetic diversity within a currently described subspecies refutes the characterization of lowland forests as evolutionary museums. If the pattern of geographic variation in Hylia occurs broadly in widespread rainforest species, conservation policy makers may need to rethink their priorities for conservation in the Afrotropics. (c) 2009 Elsevier Inc. All rights reserved.
Classification and evolutionary analysis of the basic helix-loop-helix gene family in the green anole lizard, Anolis carolinensis.

PubMed

Liu, Ake; Wang, Yong; Zhang, Debao; Wang, Xuhua; Song, Huifang; Dang, Chunwang; Yao, Qin; Chen, Keping

2013-08-01

Basic helix-loop-helix (bHLH) proteins play essential regulatory roles in a variety of biological processes. These highly conserved proteins form a large transcription factor superfamily, and are commonly identified in large numbers within animal, plant, and fungal genomes. The bHLH domain has been well studied in many animal species, but has not yet been characterized in non-avian reptiles. In this study, we identified 102 putative bHLH genes in the genome of the green anole lizard, Anolis carolinensis. Based on phylogenetic analysis, these genes were classified into 43 families, with 43, 24, 16, 3, 10, and 3 members assigned to groups A, B, C, D, E, and F, respectively, and 3 members categorized as "orphans". Within-group evolutionary relationships inferred from the phylogenetic analysis were consistent with highly conserved patterns observed for introns and additional domains. Results from phylogenetic analysis of the H/E(spl) family suggest that genome and tandem gene duplications have contributed to this family's expansion. Our classification and evolutionary analysis has provided insights into the evolutionary diversification of animal bHLH genes, and should aid future studies on bHLH protein regulation of key growth and developmental processes.
Sequence Diversity Diagram for comparative analysis of multiple sequence alignments.

PubMed

Sakai, Ryo; Aerts, Jan

2014-01-01

The sequence logo is a graphical representation of a set of aligned sequences, commonly used to depict conservation of amino acid or nucleotide sequences. Although it effectively communicates the amount of information present at every position, this visual representation falls short when the domain task is to compare between two or more sets of aligned sequences. We present a new visual presentation called a Sequence Diversity Diagram and validate our design choices with a case study. Our software was developed using the open-source program called Processing. It loads multiple sequence alignment FASTA files and a configuration file, which can be modified as needed to change the visualization. The redesigned figure improves on the visual comparison of two or more sets, and it additionally encodes information on sequential position conservation. In our case study of the adenylate kinase lid domain, the Sequence Diversity Diagram reveals unexpected patterns and new insights, for example the identification of subgroups within the protein subfamily. Our future work will integrate this visual encoding into interactive visualization tools to support higher level data exploration tasks.
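The "amount of information present at every position" that a sequence logo conveys is conventionally the maximum possible entropy minus the observed Shannon entropy of each alignment column. The sketch below computes that quantity. It is a generic illustration in Python, not the authors' Processing software: the function names and toy alignment are invented, gaps are ignored, and no small-sample correction is applied.

```python
import math
from collections import Counter

def column_information(column, alphabet_size=4):
    """Information content in bits of one column: log2(alphabet_size) - Shannon entropy.

    Assumptions: gaps ('-') are ignored and no small-sample correction is applied.
    """
    residues = [c for c in column if c != '-']
    if not residues:
        return 0.0
    total = len(residues)
    entropy = -sum((n / total) * math.log2(n / total)
                   for n in Counter(residues).values())
    return math.log2(alphabet_size) - entropy

def logo_profile(aligned, alphabet_size=4):
    """Per-column information content for a list of equal-length aligned sequences."""
    return [column_information(col, alphabet_size) for col in zip(*aligned)]

# Hypothetical nucleotide alignment; use alphabet_size=20 for proteins.
toy = ["ACGT", "ACGA", "ACTC", "AGTG"]
profile = logo_profile(toy)
```

Computing one such profile per set of aligned sequences gives the raw numbers that a comparison between two or more sets would start from; how the Sequence Diversity Diagram encodes them visually is not reproduced here.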
Race as a moderator of the relationship between religiosity and political alignment.

PubMed

Cohen, Adam B; Malka, Ariel; Hill, Eric D; Thoemmes, Felix; Hill, Peter C; Sundie, Jill M

2009-03-01

Religiosity, especially religious fundamentalism, is often assumed to have an inherent connection with conservative politics. This article proposes that the relationship varies by race in the United States. In Study 1, race moderated the relationships between religiosity indicators and political alignment in a nationally representative sample. In Study 2, the effect replicated in a student sample with more reliable measures. Among both Black and Latino Americans, the relationship between religiosity and conservative politics is far weaker than it is among White Americans, and it is sometimes altogether absent. In Study 3, a tradition-focused view of religion was found to more strongly mediate the link between religiosity and political attitudes among Whites than it did among Blacks and Latinos. It is argued that the relationship between religiosity and political alignment is best understood as a product of cultural-historical conditions associated with group memberships.
Strategies and tools for whole genome alignments

DOE Office of Scientific and Technical Information (OSTI.GOV)

Couronne, Olivier; Poliakov, Alexander; Bray, Nicolas

2002-11-25

The availability of the assembled mouse genome makes possible, for the first time, an alignment and comparison of two large vertebrate genomes. We have investigated different strategies of alignment for the subsequent analysis of conservation of genomes that are effective for different quality assemblies. These strategies were applied to the comparison of the working draft of the human genome with the Mouse Genome Sequencing Consortium assembly, as well as other intermediate mouse assemblies. Our methods are fast and the resulting alignments exhibit a high degree of sensitivity, covering more than 90 percent of known coding exons in the human genome. We have obtained such coverage while preserving specificity. With a view towards the end user, we have developed a suite of tools and websites for automatically aligning, and subsequently browsing and working with whole genome comparisons. We describe the use of these tools to identify conserved non-coding regions between the human and mouse genomes, some of which have not been identified by other methods.
A possible biochemical missing link among archaebacteria

NASA Technical Reports Server (NTRS)

Achenbach-Richter, Laurie; Woese, Carl R.; Stetter, Karl O.

1987-01-01

The characteristics of the newly discovered strain of archaebacteria, VC-16, the only archaebacterium known to reduce sulfate, suggest that VC-16 might represent a transitional form between an anaerobic thermophilic sulfur-based type of metabolism and methanogenesis. It is shown here, using a matrix of evolutionary distances derived from an alignment of various archaebacterial 16S rRNAs and the phylogenetic tree derived from these evolutionary distances, that the lineage represented by strain VC-16 arises from the archaebacterial tree precisely where such an interpretation would predict that it would, between the Methanococcus lineage and that of Thermococcus.
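A matrix of evolutionary distances derived from an alignment, as used in this record, can be illustrated with the Jukes-Cantor correction. The abstract does not say which distance measure the authors used, so Jukes-Cantor is a stand-in chosen here for simplicity; the function names and toy sequences are likewise invented.

```python
import math

def jukes_cantor(a, b):
    """Jukes-Cantor distance between two aligned sequences.

    Assumption: columns where either sequence has a gap ('-') are skipped.
    """
    pairs = [(x, y) for x, y in zip(a, b) if x != '-' and y != '-']
    if not pairs:
        raise ValueError("no comparable sites")
    p = sum(x != y for x, y in pairs) / len(pairs)  # observed fraction of differences
    if p >= 0.75:
        return float("inf")  # the correction is undefined at saturation
    return -0.75 * math.log(1 - 4 * p / 3)

def distance_matrix(aligned):
    """Symmetric matrix of pairwise Jukes-Cantor distances."""
    return [[jukes_cantor(a, b) for b in aligned] for a in aligned]

# Hypothetical toy alignment (not real 16S rRNA data).
toy = ["ACGUACGUAC", "ACGUACGUAU", "ACGAACGUUU"]
dist = distance_matrix(toy)
```

A tree-building method such as neighbor joining would then take `dist` as input; that step is not shown.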
Toward a method for tracking virus evolutionary trajectory applied to the pandemic H1N1 2009 influenza virus.

PubMed

Squires, R Burke; Pickett, Brett E; Das, Sajal; Scheuermann, Richard H

2014-12-01

In 2009 a novel pandemic H1N1 influenza virus (H1N1pdm09) emerged as the first official influenza pandemic of the 21st century. Early genomic sequence analysis pointed to the swine origin of the virus. Here we report a novel computational approach to determine the evolutionary trajectory of viral sequences that uses data-driven estimations of nucleotide substitution rates to track the gradual accumulation of observed sequence alterations over time. Phylogenetic analysis and multiple sequence alignments show that sequences belonging to the resulting evolutionary trajectory of the H1N1pdm09 lineage exhibit a gradual accumulation of sequence variations and tight temporal correlations in the topological structure of the phylogenetic trees. These results suggest that our evolutionary trajectory analysis (ETA) can more effectively pinpoint the evolutionary history of viruses, including the host and geographical location traversed by each segment, when compared against either BLAST or traditional phylogenetic analysis alone. Copyright © 2014 Elsevier B.V. All rights reserved.
Analysis, Characterization, and Loci of the tuf Genes in Lactobacillus and Bifidobacterium Species and Their Direct Application for Species Identification

PubMed Central

Ventura, Marco; Canchaya, Carlos; Meylan, Valèrie; Klaenhammer, Todd R.; Zink, Ralf

2003-01-01

We analyzed the tuf gene, encoding elongation factor Tu, from 33 strains representing 17 Lactobacillus species and 8 Bifidobacterium species. The tuf sequences were aligned and used to infer phylogenesis among species of lactobacilli and bifidobacteria. We demonstrated that the synonymous substitution affecting this gene renders elongation factor Tu a reliable molecular clock for investigating evolutionary distances of lactobacilli and bifidobacteria. In fact, the phylogeny generated by these tuf sequences is consistent with that derived from 16S rRNA analysis. The investigation of a multiple alignment of tuf sequences revealed regions conserved among strains belonging to the same species but distinct from those of other species. PCR primers complementary to these regions allowed species-specific identification of closely related species, such as Lactobacillus casei group members. The tuf gene-based assays developed in this study provide an alternative to present methods for the identification of lactic acid bacterial species. Since a variable number of tuf genes have been described for bacteria, the presence of multiple genes was examined. Southern analysis revealed one tuf gene in the genomes of lactobacilli and bifidobacteria, but the tuf gene was arranged differently in the genomes of these two taxa. Our results revealed that the tuf gene in bifidobacteria is flanked by the same gene constellation as the str operon, as originally reported for Escherichia coli. In contrast, bioinformatic and transcriptional analyses of the DNA region flanking the tuf gene in four Lactobacillus species indicated the same four-gene unit and suggested a novel tuf operon specific for the genus Lactobacillus. PMID:14602655
Characterization of tannase protein sequences of bacteria and fungi: an in silico study.

PubMed

Banerjee, Amrita; Jana, Arijit; Pati, Bikash R; Mondal, Keshab C; Das Mohapatra, Pradeep K

2012-04-01

The tannase protein sequences of 149 bacteria and 36 fungi were retrieved from the NCBI database. Of these, only the 77 bacterial and 31 fungal tannase sequences with distinct amino acid compositions were retained. These sequences were analysed for physical and chemical properties, superfamily membership, multiple sequence alignment, phylogenetic tree construction and motif finding, to identify functional motifs and the evolutionary relationships among them. The superfamily search for these tannases revealed the proline iminopeptidase-like, biotin biosynthesis protein BioH, O-acetyltransferase, carboxylesterase/thioesterase 1, carbon-carbon bond hydrolase, haloperoxidase, prolyl oligopeptidase, C-terminal domain and mycobacterial antigens families, and the alpha/beta hydrolase superfamily. Some bacterial and fungal sequences showed similarity with different families individually. The multiple sequence alignment of these tannase protein sequences showed conserved regions at different stretches, with maximum homology at amino acid residues 389-469 and 482-523, which could be used for designing degenerate primers or probes specific for tannase-producing bacterial and fungal species. The phylogenetic tree showed two clusters: one contains only bacteria and the other contains both fungi and bacteria, indicating some relationship between these genera. In the second cluster, however, nearly all fungal species grouped together, which indicates sequence-level similarity among fungal genera. Analysis of the distribution of fourteen motifs revealed that Motif 1, with a signature sequence of 29 amino acids (GCSTGGREALKQAQRWPHDYDGIIANNPA), was uniformly observed in 83.3% of the studied tannase sequences, suggesting its participation in the structure and enzymatic function.
Alignment of RNA molecules: Binding energy and statistical properties of random sequences

DOE Office of Scientific and Technical Information (OSTI.GOV)

Valba, O. V., E-mail: valbaolga@gmail.com; Nechaev, S. K., E-mail: sergei.nechaev@gmail.com; Tamm, M. V., E-mail: thumm.m@gmail.com

2012-02-15

A new statistical approach to the problem of pairwise alignment of RNA sequences is proposed. The problem is analyzed for a pair of interacting polymers forming RNA-like hierarchical cloverleaf structures. An alignment is characterized by the numbers of matches, mismatches, and gaps. A weight function is assigned to each alignment; this function is interpreted as a free energy taking into account both direct monomer-monomer interactions and a combinatorial contribution due to formation of various cloverleaf secondary structures. The binding free energy is determined for a pair of RNA molecules. Statistical properties are discussed, including fluctuations of the binding energy between a pair of RNA molecules and loop length distribution in a complex. Based on an analysis of the free energy per nucleotide pair complexes of random RNAs as a function of the number of nucleotide types c, a hypothesis is put forward about the exclusivity of the alphabet c = 4 used by nature.
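The idea of characterizing an alignment by its numbers of matches, mismatches, and gaps and then assigning it a weight can be sketched as follows. This is a deliberately simplified illustration, not the model of the paper: it treats a match as two identical symbols, uses a plain linear weight with arbitrary coefficients, and omits the combinatorial contribution from cloverleaf secondary structures entirely. All names and values are invented.

```python
def alignment_stats(top, bottom):
    """Count (matches, mismatches, gap columns) in a gapped pairwise alignment."""
    if len(top) != len(bottom):
        raise ValueError("aligned strings must have equal length")
    matches = mismatches = gaps = 0
    for x, y in zip(top, bottom):
        if x == '-' or y == '-':
            gaps += 1
        elif x == y:  # assumption: a match means identical symbols
            matches += 1
        else:
            mismatches += 1
    return matches, mismatches, gaps

def alignment_weight(top, bottom, match=1.0, mismatch=-1.0, gap=-2.0):
    """Linear weight of an alignment; the coefficients are arbitrary placeholders."""
    m, x, g = alignment_stats(top, bottom)
    return match * m + mismatch * x + gap * g

# Hypothetical gapped alignment of two short RNA strings.
stats = alignment_stats("GAUC-A", "GACCUA")
weight = alignment_weight("GAUC-A", "GACCUA")
```

An optimal alignment under such a weight would normally be found by dynamic programming over all gap placements; here the alignment is taken as given.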
Functional characterization of p53 pathway components in the ancient metazoan Trichoplax adhaerens

NASA Astrophysics Data System (ADS)

Siau, Jia Wei; Coffill, Cynthia R.; Zhang, Weiyun Villien; Tan, Yaw Sing; Hundt, Juliane; Lane, David; Verma, Chandra; Ghadessy, Farid

2016-09-01

The identification of genes encoding a p53 family member and an Mdm2 ortholog in the ancient placozoan Trichoplax adhaerens advocates for the evolutionary conservation of a pivotal stress-response pathway observed in all higher eukaryotes. Here, we recapitulate several key functionalities ascribed to this known interacting protein pair by analysis of the placozoan proteins (Tap53 and TaMdm2) using both in vitro and cellular assays. In addition to interacting with each other, the Tap53 and TaMdm2 proteins are also able to respectively bind human Mdm2 and p53, providing strong evidence for functional conservation. The key p53-degrading function of Mdm2 is also conserved in TaMdm2. Tap53 retained DNA binding associated with p53 transcription activation function. However, it lacked transactivation function in reporter gene assays using a heterologous cell line, suggesting a cofactor incompatibility. Overall, the data support functional roles for TaMdm2 and Tap53, and further define the p53 pathway as an evolutionarily conserved fulcrum mediating cellular response to stress.
Evolutionary Conserved Regulation of HIF-1β by NF-κB

PubMed Central

van Uden, Patrick; Kenneth, Niall S.; Webster, Ryan; Müller, H. Arno; Mudie, Sharon; Rocha, Sonia

2011-01-01

Hypoxia Inducible Factor-1 (HIF-1) is essential for mammalian development and is the principal transcription factor activated by low oxygen tensions. HIF-α subunit quantities and their associated activity are regulated in a post-translational manner, through the concerted action of a class of enzymes called Prolyl Hydroxylases (PHDs) and Factor Inhibiting HIF (FIH) respectively. However, alternative modes of HIF-α regulation such as translation or transcription are under-investigated, and their importance has not been firmly established. Here, we demonstrate that NF-κB regulates the HIF pathway in a significant and evolutionarily conserved manner. We demonstrate that NF-κB directly regulates HIF-1β mRNA and protein. In addition, we found that NF-κB–mediated changes in HIF-1β result in modulation of HIF-2α protein. HIF-1β overexpression can rescue HIF-2α protein levels following NF-κB depletion. Significantly, NF-κB regulates HIF-1β (tango) and HIF-α (sima) levels and activity (Hph/fatiga, ImpL3/ldha) in Drosophila, both in normoxia and hypoxia, indicating an evolutionarily conserved mode of regulation. These results reveal a novel mechanism of HIF regulation, with implications for the development of novel therapeutic strategies for HIF–related pathologies including ageing, ischemia, and cancer. PMID:21298084
The eastern migratory caribou: the role of genetic introgression in ecotype evolution.

PubMed

Klütsch, Cornelya F C; Manseau, Micheline; Trim, Vicki; Polfus, Jean; Wilson, Paul J

2016-02-01

Understanding the evolutionary history of contemporary animal groups is essential for conservation and management of endangered species like caribou (Rangifer tarandus). In central Canada, the ranges of two caribou subspecies (barren-ground/woodland caribou) and two woodland caribou ecotypes (boreal/eastern migratory) overlap. Our objectives were to reconstruct the evolutionary history of the eastern migratory ecotype and to assess the potential role of introgression in ecotype evolution. STRUCTURE analyses identified five higher order groups (i.e. three boreal caribou populations, eastern migratory ecotype and barren-ground). The evolutionary history of the eastern migratory ecotype was best explained by an early genetic introgression from barren-ground into a woodland caribou lineage during the Late Pleistocene and subsequent divergence of the eastern migratory ecotype during the Holocene. These results are consistent with the retreat of the Laurentide ice sheet and the colonization of the Hudson Bay coastal areas subsequent to the establishment of forest tundra vegetation approximately 7000 years ago. This historical reconstruction of the eastern migratory ecotype further supports its current classification as a conservation unit, specifically a Designatable Unit, under Canada's Species at Risk Act. These findings have implications for other sub-specific contact zones for caribou and other North American species in conservation unit delineation.
Evolution of disorder in Mediator complex and its functional relevance

PubMed Central

Nagulapalli, Malini; Maji, Sourobh; Dwivedi, Nidhi; Dahiya, Pradeep; Thakur, Jitendra K.

2016-01-01

Mediator, an important component of eukaryotic transcriptional machinery, is a huge multisubunit complex. Though the complex is known to be conserved across all the eukaryotic kingdoms, the evolutionary topology of its subunits has never been studied. In this study, we profiled disorder in the Mediator subunits of 146 eukaryotes belonging to three kingdoms, viz. metazoans, plants and fungi, and attempted to find a correlation between the evolution of the Mediator complex and its disorder. Our analysis suggests that disorder in the Mediator complex has played a crucial role in the evolutionary diversification of complexity of eukaryotic organisms. Conserved intrinsically disordered regions (IDRs) were identified in only six subunits in the three kingdoms, whereas unique patterns of IDRs were identified in other Mediator subunits. Acquisition of novel molecular recognition features (MoRFs) through evolution of new subunits or through elongation of the existing subunits was evident in metazoans and plants. A new concept of ‘junction-MoRF’ is introduced. An evolutionary link between CBP and Med15 is proposed, which explains the evolution of the extended IDR in CBP from the Med15 KIX-IDR junction-MoRF and suggests a role for junction-MoRFs in the evolution and modulation of the protein–protein interaction repertoire. This study can be informative and helpful in understanding the conserved and flexible nature of the Mediator complex across eukaryotic kingdoms. PMID:26590257
The eastern migratory caribou: the role of genetic introgression in ecotype evolution

PubMed Central

Klütsch, Cornelya F. C.; Manseau, Micheline; Trim, Vicki; Polfus, Jean; Wilson, Paul J.

2016-01-01

Understanding the evolutionary history of contemporary animal groups is essential for conservation and management of endangered species like caribou (Rangifer tarandus). In central Canada, the ranges of two caribou subspecies (barren-ground/woodland caribou) and two woodland caribou ecotypes (boreal/eastern migratory) overlap. Our objectives were to reconstruct the evolutionary history of the eastern migratory ecotype and to assess the potential role of introgression in ecotype evolution. STRUCTURE analyses identified five higher order groups (i.e. three boreal caribou populations, eastern migratory ecotype and barren-ground). The evolutionary history of the eastern migratory ecotype was best explained by an early genetic introgression from barren-ground into a woodland caribou lineage during the Late Pleistocene and subsequent divergence of the eastern migratory ecotype during the Holocene. These results are consistent with the retreat of the Laurentide ice sheet and the colonization of the Hudson Bay coastal areas subsequent to the establishment of forest tundra vegetation approximately 7000 years ago. This historical reconstruction of the eastern migratory ecotype further supports its current classification as a conservation unit, specifically a Designatable Unit, under Canada’s Species at Risk Act. These findings have implications for other sub-specific contact zones for caribou and other North American species in conservation unit delineation. PMID:26998320
Gymnosperms on the EDGE.

PubMed

Forest, Félix; Moat, Justin; Baloch, Elisabeth; Brummitt, Neil A; Bachman, Steve P; Ickert-Bond, Steffi; Hollingsworth, Peter M; Liston, Aaron; Little, Damon P; Mathews, Sarah; Rai, Hardeep; Rydin, Catarina; Stevenson, Dennis W; Thomas, Philip; Buerki, Sven

2018-04-16

Driven by limited resources and a sense of urgency, the prioritization of species for conservation has been a persistent concern in conservation science. Gymnosperms (comprising ginkgo, conifers, cycads, and gnetophytes) are one of the most threatened groups of living organisms, with 40% of the species at high risk of extinction, about twice as many as the most recent estimates for all plants (i.e. 21.4%). This high proportion of species facing extinction highlights the urgent action required to secure their future through an objective prioritization approach. The Evolutionary Distinct and Globally Endangered (EDGE) method rapidly ranks species based on their evolutionary distinctiveness and the extinction risks they face. EDGE is applied to gymnosperms using a phylogenetic tree comprising DNA sequence data for 85% of gymnosperm species (923 out of 1090 species), to which the 167 missing species were added, and IUCN Red List assessments available for 92% of species. The effect of different extinction probability transformations and the handling of IUCN data deficient species on the resulting rankings is investigated. Although top entries in our ranking comprise species that were expected to score well (e.g. Wollemia nobilis, Ginkgo biloba), many were unexpected (e.g. Araucaria araucana). These results highlight the necessity of using approaches that integrate evolutionary information in conservation science.
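The ranking step described in the abstract above can be illustrated with a minimal sketch. It assumes the commonly cited EDGE formulation, EDGE = ln(1 + ED) + GE × ln(2), with Red List categories mapped to weights 0 to 4; the abstract does not state which extinction probability transformation the authors finally adopted, and the species names and ED values below are hypothetical. Data Deficient species are not handled.

```python
import math

# Assumed Red List category weights (Least Concern .. Critically Endangered).
# This doubling-per-category scheme is one possible transformation, not
# necessarily the one used in the study.
GE_WEIGHTS = {"LC": 0, "NT": 1, "VU": 2, "EN": 3, "CR": 4}


def edge_score(ed, category):
    """Return ln(1 + ED) + GE * ln(2) for one species.

    ed       -- evolutionary distinctiveness (e.g. in million years)
    category -- Red List category code, a key of GE_WEIGHTS
    """
    return math.log(1.0 + ed) + GE_WEIGHTS[category] * math.log(2.0)


def rank_species(records):
    """Sort (name, ED, category) records by descending EDGE score."""
    scored = [(name, edge_score(ed, cat)) for name, ed, cat in records]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical input: names and ED values are invented for illustration.
example = [
    ("species_a", 50.0, "CR"),
    ("species_b", 120.0, "LC"),
    ("species_c", 10.0, "EN"),
]
ranking = rank_species(example)
for name, score in ranking:
    print(f"{name}\t{score:.3f}")
```

In this toy input a moderately distinct but Critically Endangered species outranks a highly distinct species of Least Concern, which is the trade-off an EDGE-style ranking is meant to capture.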
Evolutionary perspectives on wildlife disease: concepts and applications

PubMed Central

Vander Wal, Eric; Garant, Dany; Pelletier, Fanie

2014-01-01

Wildlife disease has the potential to cause significant ecological, socioeconomic, and health impacts. As a result, all tools available need to be employed when host–pathogen dynamics merit conservation or management interventions. Evolutionary principles, such as evolutionary history, phenotypic and genetic variation, and selection, have the potential to unravel many of the complex ecological realities of infectious disease in the wild. Despite this, their application to wildlife disease ecology and management remains in its infancy. In this article, we outline the impetus behind applying evolutionary principles to disease ecology and management issues in the wild. We then introduce articles from this special issue on Evolutionary Perspectives on Wildlife Disease: Concepts and Applications, outlining how each is exemplar of a practical wildlife disease challenge that can be enlightened by applied evolution. Ultimately, we aim to bring new insights to wildlife disease ecology and its management using tools and techniques commonly employed in evolutionary ecology. PMID:25469154
The novel cytochrome c6 of chloroplasts: a case of evolutionary bricolage?

PubMed

Howe, Christopher J; Schlarb-Ridley, Beatrix G; Wastl, Juergen; Purton, Saul; Bendall, Derek S

2006-01-01

Cytochrome c6 has long been known as a redox carrier of the thylakoid lumen of cyanobacteria and some eukaryotic algae that can substitute for plastocyanin in electron transfer. Until recently, it was widely accepted that land plants lack a cytochrome c6. However, a homologue of the protein has now been identified in several plant species together with an additional isoform in the green alga Chlamydomonas reinhardtii. This form of the protein, designated cytochrome c6A, differs from the 'conventional' cytochrome c6 in possessing a conserved insertion of 12 amino acids that includes two absolutely conserved cysteine residues. There are conflicting reports of whether cytochrome c6A can substitute for plastocyanin in photosynthetic electron transfer. The evidence for and against this is reviewed and the likely evolutionary history of cytochrome c6A is discussed. It is suggested that it has been converted from a primary role in electron transfer to one in regulation within the chloroplast, and is an example of evolutionary 'bricolage'.

Local Geometry and Evolutionary Conservation of Protein Surfaces Reveal the Multiple Recognition Patches in Protein-Protein Interactions

PubMed Central

Laine, Elodie; Carbone, Alessandra

2015-01-01

Protein-protein interactions (PPIs) are essential to all biological processes and they represent increasingly important therapeutic targets. Here, we present a new method for accurately predicting protein-protein interfaces, understanding their properties, origins and binding to multiple partners. Contrary to machine learning approaches, our method combines in a rational and very straightforward way three sequence- and structure-based descriptors of protein residues: evolutionary conservation, physico-chemical properties and local geometry. The implemented strategy yields very precise predictions for a wide range of protein-protein interfaces and discriminates them from small-molecule binding sites. Beyond its predictive power, the approach permits to dissect interaction surfaces and unravel their complexity. We show how the analysis of the predicted patches can foster new strategies for PPIs modulation and interaction surface redesign. The approach is implemented in JET2, an automated tool based on the Joint Evolutionary Trees (JET) method for sequence-based protein interface prediction. JET2 is freely available at www.lcqb.upmc.fr/JET2. PMID:26690684
Conservation of Endo16 expression in sea urchins despite evolutionary divergence in both cis and trans-acting components of transcriptional regulation

NASA Technical Reports Server (NTRS)

Romano, Laura A.; Wray, Gregory A.

2003-01-01

Evolutionary changes in transcriptional regulation undoubtedly play an important role in creating morphological diversity. However, there is little information about the evolutionary dynamics of cis-regulatory sequences. This study examines the functional consequence of evolutionary changes in the Endo16 promoter of sea urchins. The Endo16 gene encodes a large extracellular protein that is expressed in the endoderm and may play a role in cell adhesion. Its promoter has been characterized in exceptional detail in the purple sea urchin, Strongylocentrotus purpuratus. We have characterized the structure and function of the Endo16 promoter from a second sea urchin species, Lytechinus variegatus. The Endo16 promoter sequences have evolved in a strongly mosaic manner since these species diverged approximately 35 million years ago: the most proximal region (module A) is conserved, but the remaining modules (B-G) are unalignable. Despite extensive divergence in promoter sequences, the pattern of Endo16 transcription is largely conserved during embryonic and larval development. Transient expression assays demonstrate that 2.2 kb of upstream sequence in either species is sufficient to drive GFP reporter expression that correctly mimics this pattern of Endo16 transcription. Reciprocal cross-species transient expression assays imply that changes have also evolved in the set of transcription factors that interact with the Endo16 promoter. Taken together, these results suggest that stabilizing selection on the transcriptional output may have operated to maintain a similar pattern of Endo16 expression in S. purpuratus and L. variegatus, despite dramatic divergence in promoter sequence and mechanisms of transcriptional regulation.
Wasabi: An Integrated Platform for Evolutionary Sequence Analysis and Data Visualization.

PubMed

Veidenberg, Andres; Medlar, Alan; Löytynoja, Ari

2016-04-01

Wasabi is an open source, web-based environment for evolutionary sequence analysis. Wasabi visualizes sequence data together with a phylogenetic tree within a modern, user-friendly interface: The interface hides extraneous options, supports context sensitive menus, drag-and-drop editing, and displays additional information, such as ancestral sequences, associated with specific tree nodes. The Wasabi environment supports reproducibility by automatically storing intermediate analysis steps and includes built-in functions to share data between users and publish analysis results. For computational analysis, Wasabi supports PRANK and PAGAN for phylogeny-aware alignment and alignment extension, and it can be easily extended with other tools. Along with drag-and-drop import of local files, Wasabi can access remote data through URL and import sequence data, GeneTrees and EPO alignments directly from Ensembl. To demonstrate a typical workflow using Wasabi, we reproduce key findings from recent comparative genomics studies, including a reanalysis of the EGLN1 gene from the tiger genome study: These case studies can be browsed within Wasabi at http://wasabiapp.org:8000?id=usecases. Wasabi runs inside a web browser and does not require any installation. One can start using it at http://wasabiapp.org. All source code is licensed under the AGPLv3. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Biclustering as a method for RNA local multiple sequence alignment.

PubMed

Wang, Shu; Gutell, Robin R; Miranker, Daniel P

2007-12-15

Biclustering is a clustering method that simultaneously clusters both the domain and range of a relation. A challenge in multiple sequence alignment (MSA) is that the alignment of sequences is often intended to reveal groups of conserved functional subsequences. Simultaneously, the grouping of the sequences can impact the alignment; precisely the kind of dual situation biclustering is intended to address. We define a representation of the MSA problem enabling the application of biclustering algorithms. We develop a computer program for local MSA, BlockMSA, that combines biclustering with divide-and-conquer. BlockMSA simultaneously finds groups of similar sequences and locally aligns subsequences within them. Further alignment is accomplished by dividing both the set of sequences and their contents. The net result is both a multiple sequence alignment and a hierarchical clustering of the sequences. BlockMSA was tested on the subsets of the BRAliBase 2.1 benchmark suite that display high variability and on an extension to that suite to larger problem sizes. Also, alignments of two large datasets of current biological interest, T box sequences and Group IC1 Introns, were evaluated. The results were compared with alignments computed by the ClustalW, MAFFT, MUSCLE and PROBCONS alignment programs using Sum of Pairs (SPS) and Consensus Count. Results for the benchmark suite are sensitive to problem size. On problems of 15 or more sequences, BlockMSA is consistently the best. On none of the problems in the test suite are there appreciable differences in scores among BlockMSA, MAFFT and PROBCONS. On the T box sequences, BlockMSA does the most faithful job of reproducing known annotations. MAFFT and PROBCONS do not. On the Intron sequences, BlockMSA, MAFFT and MUSCLE are comparable at identifying conserved regions. BlockMSA is implemented in Java. Source code and supplementary datasets are available at http://aug.csres.utexas.edu/msa/
Human Variation in Short Regions Predisposed to Deep Evolutionary Conservation

PubMed Central

Loots, Gabriela G.; Ovcharenko, Ivan

2010-01-01

The landscape of the human genome consists of millions of short islands of conservation that are 100% conserved across multiple vertebrate genomes (termed “bricks”), the majority of which are located in noncoding regions. Several hundred thousand bricks are deeply conserved, reaching the genomes of amphibians and fish. Deep phylogenetic conservation of noncoding DNA has been reported to be strongly associated with the presence of gene regulatory elements, introducing bricks as a proxy to the functional noncoding landscape of the human genome. Here, we report a significant overrepresentation of bricks in the promoters of transcription factors and developmental genes, where the high level of phylogenetic conservation correlates with an increase in brick overrepresentation. We also found that the presence of a brick dictates a predisposition to evolutionary constraint, with only 0.7% of the amniota brick central nucleotides being diverged within the primate lineage, an 11-fold reduction in the divergence rate compared with random expectation. Human single-nucleotide polymorphism (SNP) data explain only 3% of primate-specific variation in amniota bricks, thus arguing for a widespread fixation of brick mutations within the primate lineage and prior to human radiation. This variation, in turn, might have been utilized as a driving force for primate- and hominoid-specific adaptation. We also discovered a pronounced deviation from the evolutionary predisposition in the human lineage, with an over 20-fold increase in the substitution rate at brick SNP sites over expected values. In addition, contrary to typical brick mutations, brick variation commonly encountered in the human population displays limited, if any, signatures of negative selection as measured by minor allele frequency and population differentiation (F-statistics). These observations argue for the plasticity of gene regulatory mechanisms in vertebrates, with evidence of strong purifying selection acting on the gene regulatory landscape of the human genome, where widespread advantageous mutations in putative regulatory elements are likely utilized in functional diversification and adaptation of species. PMID:20093432
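To make the notion of a "brick" (a short stretch that is 100% conserved across several genomes) concrete, the following sketch scans a multiple alignment for maximal runs of identical, gap-free columns. It is an illustration only: the function name, the `min_len` threshold and the toy alignment are assumptions, and the abstract does not describe the authors' actual detection pipeline or length cutoff.

```python
def conserved_runs(aligned, min_len=3):
    """Return half-open (start, end) column ranges that are identical in all rows.

    aligned -- list of equal-length aligned sequences ('-' marks a gap)
    min_len -- hypothetical minimum run length to report
    """
    if not aligned:
        return []
    width = len(aligned[0])
    if any(len(row) != width for row in aligned):
        raise ValueError("all aligned rows must have the same length")

    runs = []
    start = None
    # Iterate one column past the end so a run touching the last column is closed.
    for col in range(width + 1):
        conserved = False
        if col < width:
            column = {row[col].upper() for row in aligned}
            conserved = len(column) == 1 and "-" not in column
        if conserved:
            if start is None:
                start = col
        elif start is not None:
            if col - start >= min_len:
                runs.append((start, col))
            start = None
    return runs


# Toy three-way alignment, invented for illustration.
toy = [
    "ACGTACGTTTGA",
    "ACGTTCGTTTGA",
    "ACGTACGT-TGA",
]
print(conserved_runs(toy, min_len=3))
```

Raising `min_len` keeps only the longer islands; for the toy alignment, `min_len=4` leaves just the first run.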
Stereospecific suppression of active site mutants by methylphosphonate substituted substrates reveals the stereochemical course of site-specific DNA recombination

PubMed Central

Rowley, Paul A.; Kachroo, Aashiq H.; Ma, Chien-Hui; Maciaszek, Anna D.; Guga, Piotr; Jayaram, Makkuni

2015-01-01

Tyrosine site-specific recombinases, which promote one class of biologically important phosphoryl transfer reactions in DNA, exemplify active site mechanisms for stabilizing the phosphate transition state. A highly conserved arginine duo (Arg-I; Arg-II) of the recombinase active site plays a crucial role in this function. Cre and Flp recombinase mutants lacking either arginine can be rescued by compensatory charge neutralization of the scissile phosphate via methylphosphonate (MeP) modification. The chemical chirality of MeP, in conjunction with mutant recombinases, reveals the stereochemical contributions of Arg-I and Arg-II. The SP preference of the native reaction is specified primarily by Arg-I. MeP reaction supported by Arg-II is nearly bias-free or RP-biased, depending on the Arg-I substituent. Positional conservation of the arginines does not translate into strict functional conservation. Charge reversal by glutamic acid substitution at Arg-I or Arg-II has opposite effects on Cre and Flp in MeP reactions. In Flp, the base immediately 5′ to the scissile MeP strongly influences the choice between the catalytic tyrosine and water as the nucleophile for strand scission, thus between productive recombination and futile hydrolysis. The recombinase active site embodies the evolutionary optimization of interactions that not only favor the normal reaction but also proscribe antithetical side reactions. PMID:25999343
Sequence Alignment to Predict Across Species Susceptibility (SeqAPASS): A web-based tool for addressing the challenges of cross-species extrapolation of chemical toxicity

EPA Science Inventory

Conservation of a molecular target across species can be used as a line-of-evidence to predict the likelihood of chemical susceptibility. The web-based Sequence Alignment to Predict Across Species Susceptibility (SeqAPASS) tool was developed to simplify, streamline, and quantitat...
USAF Hearing Conservation Program, DOEHRS Data Repository Annual Report: CY2014

DTIC Science & Technology

2016-02-01

tinnitus . The goal was to align the DOEHRS-HC DR data with DoD Hearing Conservation and Readiness Working Group initiatives and Government...Accountability Office recommendations [3]. The data collected from the standardized tinnitus questions are projected to be mined by the DoD in future studies
EvoluCode: Evolutionary Barcodes as a Unifying Framework for Multilevel Evolutionary Data.

PubMed

Linard, Benjamin; Nguyen, Ngoc Hoan; Prosdocimi, Francisco; Poch, Olivier; Thompson, Julie D

2012-01-01

Evolutionary systems biology aims to uncover the general trends and principles governing the evolution of biological networks. An essential part of this process is the reconstruction and analysis of the evolutionary histories of these complex, dynamic networks. Unfortunately, the methodologies for representing and exploiting such complex evolutionary histories in large scale studies are currently limited. Here, we propose a new formalism, called EvoluCode (Evolutionary barCode), which allows the integration of different evolutionary parameters (eg, sequence conservation, orthology, synteny …) in a unifying format and facilitates the multilevel analysis and visualization of complex evolutionary histories at the genome scale. The advantages of the approach are demonstrated by constructing barcodes representing the evolution of the complete human proteome. Two large-scale studies are then described: (i) the mapping and visualization of the barcodes on the human chromosomes and (ii) automatic clustering of the barcodes to highlight protein subsets sharing similar evolutionary histories and their functional analysis. The methodologies developed here open the way to the efficient application of other data mining and knowledge extraction techniques in evolutionary systems biology studies. A database containing all EvoluCode data is available at: http://lbgi.igbmc.fr/barcodes.
Gene context conservation of a higher order than operons.

PubMed

Lathe, W C; Snel, B; Bork, P

2000-10-01

Operons, co-transcribed and co-regulated contiguous sets of genes, are poorly conserved over short periods of evolutionary time. The gene order, gene content and regulatory mechanisms of operons can be very different, even in closely related species. Here, we present several lines of evidence which suggest that, although an operon and its individual genes and regulatory structures are rearranged when comparing the genomes of different species, this rearrangement is a conservative process. Genomic rearrangements invariably maintain individual genes in very specific functional and regulatory contexts. We call this conserved context an uber-operon.
Cognitive Adaptations for n-person Exchange: The Evolutionary Roots of Organizational Behavior.

PubMed

Tooby, John; Cosmides, Leda; Price, Michael E

2006-03-01

Organizations are composed of stable, predominantly cooperative interactions or n-person exchanges. Humans have been engaging in n-person exchanges for a great enough period of evolutionary time that we appear to have evolved a distinct constellation of species-typical mechanisms specialized to solve the adaptive problems posed by this form of social interaction. These mechanisms appear to have been evolutionarily elaborated out of the cognitive infrastructure that initially evolved for dyadic exchange. Key adaptive problems that these mechanisms are designed to solve include coordination among individuals, and defense against exploitation by free riders. Multi-individual cooperation could not have been maintained over evolutionary time if free riders reliably benefited more than contributors to collective enterprises, and so outcompeted them. As a result, humans evolved mechanisms that implement an aversion to exploitation by free riding, and a strategy of conditional cooperation, supplemented by punitive sentiment towards free riders. Because of the design of these mechanisms, how free riding is treated is a central determinant of the survival and health of cooperative organizations. The mapping of the evolved psychology of n-party exchange cooperation may contribute to the construction of a principled theoretical foundation for the understanding of human behavior in organizations.
Cognitive Adaptations for n-person Exchange: The Evolutionary Roots of Organizational Behavior

PubMed Central

Tooby, John; Cosmides, Leda; Price, Michael E.

2013-01-01

Organizations are composed of stable, predominantly cooperative interactions or n-person exchanges. Humans have been engaging in n-person exchanges for a great enough period of evolutionary time that we appear to have evolved a distinct constellation of species-typical mechanisms specialized to solve the adaptive problems posed by this form of social interaction. These mechanisms appear to have been evolutionarily elaborated out of the cognitive infrastructure that initially evolved for dyadic exchange. Key adaptive problems that these mechanisms are designed to solve include coordination among individuals, and defense against exploitation by free riders. Multi-individual cooperation could not have been maintained over evolutionary time if free riders reliably benefited more than contributors to collective enterprises, and so outcompeted them. As a result, humans evolved mechanisms that implement an aversion to exploitation by free riding, and a strategy of conditional cooperation, supplemented by punitive sentiment towards free riders. Because of the design of these mechanisms, how free riding is treated is a central determinant of the survival and health of cooperative organizations. The mapping of the evolved psychology of n-party exchange cooperation may contribute to the construction of a principled theoretical foundation for the understanding of human behavior in organizations. PMID:23814325
Role of Conserved Proline Residues in Human Apolipoprotein A-IV Structure and Function*

PubMed Central

Deng, Xiaodi; Walker, Ryan G.; Morris, Jamie; Davidson, W. Sean; Thompson, Thomas B.

2015-01-01

Apolipoprotein (apo)A-IV is a lipid emulsifying protein linked to a range of protective roles in obesity, diabetes, and cardiovascular disease. It exists in several states in plasma including lipid-bound in HDL and chylomicrons and as monomeric and dimeric lipid-free/poor forms. Our recent x-ray crystal structure of the central domain of apoA-IV shows that it adopts an elongated helical structure that dimerizes via two long reciprocating helices. A striking feature is the alignment of conserved proline residues across the dimer interface. We speculated that this plays important roles in the structure of the lipid-free protein and its ability to bind lipid. Here we show that the systematic conversion of these prolines to alanine increased the thermodynamic stability of apoA-IV and its propensity to oligomerize. Despite the structural stabilization, we noted an increase in the ability to bind and reorganize lipids and to promote cholesterol efflux from cells. The novel properties of these mutants allowed us to isolate the first trimeric form of an exchangeable apolipoprotein and characterize it by small-angle x-ray scattering and chemical cross-linking. The results suggest that the reciprocating helix interaction is a common feature of all apoA-IV oligomers. We propose a model of how self-association of apoA-IV can result in spherical lipoprotein particles, a model that may have broader applications to other exchangeable apolipoprotein family members. PMID:25733664
A survey and evaluations of histogram-based statistics in alignment-free sequence comparison.

PubMed

Luczak, Brian B; James, Benjamin T; Girgis, Hani Z

2017-12-06

Since the dawn of the bioinformatics field, sequence alignment scores have been the main method for comparing sequences. However, alignment algorithms are quadratic, requiring long execution time. As alternatives, scientists have developed tens of alignment-free statistics for measuring the similarity between two sequences. We surveyed tens of alignment-free k-mer statistics. Additionally, we evaluated 33 statistics and multiplicative combinations between the statistics and/or their squares. These statistics are calculated on two k-mer histograms representing two sequences. Our evaluations using global alignment scores revealed that the majority of the statistics are sensitive and capable of finding similar sequences to a query sequence. Therefore, any of these statistics can filter out dissimilar sequences quickly. Further, we observed that multiplicative combinations of the statistics are highly correlated with the identity score. Furthermore, combinations involving sequence length difference or Earth Mover's distance, which takes the length difference into account, are always among the highest correlated paired statistics with identity scores. Similarly, paired statistics including length difference or Earth Mover's distance are among the best performers in finding the K-closest sequences. Interestingly, similar performance can be obtained using histograms of shorter words, resulting in reducing the memory requirement and increasing the speed remarkably. Moreover, we found that simple single statistics are sufficient for processing next-generation sequencing reads and for applications relying on local alignment. Finally, we measured the time requirement of each statistic. The survey and the evaluations will help scientists with identifying efficient alternatives to the costly alignment algorithm, saving thousands of computational hours. The source code of the benchmarking tool is available as Supplementary Materials. © The Author 2017. Published by Oxford University Press.
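The abstract above describes statistics "calculated on two k-mer histograms representing two sequences". A minimal sketch of that idea follows, using two generic histogram statistics (Manhattan distance and cosine similarity) chosen here for illustration; the abstract names only length difference and Earth Mover's distance, so which of the 33 evaluated statistics these correspond to is an assumption. The function names and test sequences are hypothetical.

```python
from collections import Counter
import math


def kmer_histogram(seq, k):
    """Count every overlapping word of length k in seq."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def manhattan(h1, h2):
    """Sum of absolute count differences over all words seen in either histogram."""
    return sum(abs(h1[w] - h2[w]) for w in set(h1) | set(h2))


def cosine_similarity(h1, h2):
    """Cosine of the angle between the two count vectors (0.0 if either is empty)."""
    dot = sum(h1[w] * h2[w] for w in set(h1) & set(h2))
    norm1 = math.sqrt(sum(c * c for c in h1.values()))
    norm2 = math.sqrt(sum(c * c for c in h2.values()))
    if norm1 == 0 or norm2 == 0:
        return 0.0
    return dot / (norm1 * norm2)


# Two short invented sequences differing at one position.
a = kmer_histogram("ACGTACGT", 2)
b = kmer_histogram("ACGTTCGT", 2)
print(manhattan(a, b))
print(round(cosine_similarity(a, b), 3))
```

Building each histogram takes time linear in the sequence length, which is why such statistics can serve as a fast filter before a quadratic alignment step.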
Systems-level feedback regulation of cell cycle transitions in Ostreococcus tauri.

PubMed

Kapuy, Orsolya; Vinod, P K; Bánhegyi, Gábor; Novák, Béla

2018-05-01

Ostreococcus tauri is the smallest free-living unicellular organism, with one copy of each core cell cycle gene in its genome. There is a growing interest in this green alga due to its evolutionary origin. Since O. tauri diverged early in the green lineage, relatively close to the ancestral eukaryotic cell, it might hold a key phylogenetic position in the eukaryotic tree of life. In this study, we focus on the regulatory network of its cell division cycle. We propose a mathematical modelling framework to integrate the existing knowledge of the cell cycle network of O. tauri. We observe that feedback loop regulation of both G1/S and G2/M transitions in O. tauri is conserved, which can make the transition bistable. This is essential to make the transition irreversible, as shown in other eukaryotic organisms. By performing sequence analysis, we also predict the presence of the Greatwall/PP2A pathway in the cell cycle of O. tauri. Since the O. tauri cell cycle machinery is conserved, the exploration of the dynamical characteristics of the cell division cycle will help in further understanding the regulation of the cell cycle in higher eukaryotes. Copyright © 2018 Elsevier Masson SAS. All rights reserved.
Conservation of mRNA secondary structures may filter out mutations in Escherichia coli evolution

PubMed Central

Chursov, Andrey; Frishman, Dmitrij; Shneider, Alexander

2013-01-01

Recent reports indicate that mutations in viral genomes tend to preserve RNA secondary structure, and those mutations that disrupt secondary structural elements may reduce gene expression levels, thereby serving as a functional knockout. In this article, we explore the conservation of secondary structures of mRNA coding regions, a previously unknown factor in bacterial evolution, by comparing the structural consequences of mutations in essential and nonessential Escherichia coli genes accumulated over 40 000 generations in the course of the ‘long-term evolution experiment’. We monitored the extent to which mutations influence minimum free energy (MFE) values, assuming that a substantial change in MFE is indicative of structural perturbation. Our principal finding is that purifying selection tends to eliminate those mutations in essential genes that lead to greater changes of MFE values and, therefore, may be more disruptive for the corresponding mRNA secondary structures. This effect implies that synonymous mutations disrupting mRNA secondary structures may directly affect the fitness of the organism. These results demonstrate that the need to maintain intact mRNA structures imposes additional evolutionary constraints on bacterial genomes, which go beyond preservation of structure and function of the encoded proteins. PMID:23783573
Evolution across the Curriculum: Microbiology

PubMed Central

Burmeister, Alita R.; Smith, James J.

2016-01-01

An integrated understanding of microbiology and evolutionary biology is essential for students pursuing careers in microbiology and healthcare fields. In this Perspective, we discuss the usefulness of evolutionary concepts and an overall evolutionary framework for students enrolled in microbiology courses. Further, we propose a set of learning goals for students studying microbial evolution concepts. We then describe some barriers to microbial evolution teaching and learning and encourage the continued incorporation of evidence-based teaching practices into microbiology courses at all levels. Next, we review the current status of microbial evolution assessment tools and describe some education resources available for teaching microbial evolution. Successful microbial evolution education will require that evolution be taught across the undergraduate biology curriculum, with a continued focus on applications and applied careers, while aligning with national biology education reform initiatives. Journal of Microbiology & Biology Education PMID:27158306
Detecting and Analyzing Genetic Recombination Using RDP4.

PubMed

Martin, Darren P; Murrell, Ben; Khoosal, Arjun; Muhire, Brejnev

2017-01-01

Recombination between nucleotide sequences is a major process influencing the evolution of most species on Earth. The evolutionary value of recombination has been widely debated and so too has its influence on evolutionary analysis methods that assume nucleotide sequences replicate without recombining. When nucleic acids recombine, the evolution of the daughter or recombinant molecule cannot be accurately described by a single phylogeny. This simple fact can seriously undermine the accuracy of any phylogenetics-based analytical approach which assumes that the evolutionary history of a set of recombining sequences can be adequately described by a single phylogenetic tree. There are presently a large number of available methods and associated computer programs for analyzing and characterizing recombination in various classes of nucleotide sequence datasets. Here we examine the use of some of these methods to derive and test recombination hypotheses using multiple sequence alignments.
Planar solar concentrator featuring alignment-free total-internal-reflection collectors and an innovative compound tracker.

PubMed

Teng, Tun-Chien; Lai, Wei-Che

2014-12-15

This study proposed a planar solar concentrator featuring alignment-free total-internal-reflection (TIR) collectors and an innovative compound tracker. The compound tracker, combining a mechanical single-axis tracker and scrollable prism sheets, can achieve a performance on a par with dual-axis tracking while reducing the cost of the tracking system and increasing its robustness. The alignment-free TIR collectors are assembled on the waveguide without requiring alignment, so the planar concentrator is relatively easily manufactured and markedly increases the feasibility for use in large concentrators. Further, the identical TIR collector is applicable to various-sized waveguide slab without requiring modification, which facilitates flexibility regarding the size of the waveguide slab. In the simulation model, the thickness of the slab was 2 mm, and its maximal length reached 6 m. With an average angular tolerance of ±0.6°, and after considering both the Fresnel loss and the angular spread of the sun, the simulation indicates that the waveguide concentrator of a 1000-mm length provides the optical efficiencies of 62-77% at the irradiance concentrations of 387-688, and the one of a 2000-mm length provides the optical efficiencies of 52-64.5% at the irradiance concentrations of 645-1148. Alternatively, if a 100-mm horizontally staggered waveguide slab is collocated with the alignment-free TIR collectors, the optical efficiency would be greatly improved up to 91.5% at an irradiance concentration of 1098 (C(geo) = 1200X).
Matrix metalloproteinases: structures, evolution, and diversification.

PubMed

Massova, I; Kotra, L P; Fridman, R; Mobashery, S

1998-09-01

A comprehensive sequence alignment of 64 members of the family of matrix metalloproteinases (MMPs) for the entire sequences, and subsequently the catalytic and the hemopexin-like domains, has been performed. The 64 MMPs were selected from plants, invertebrates, and vertebrates. The analyses disclosed that as many as 23 distinct subfamilies of these proteins are known to exist. Information from the sequence alignments was correlated with structures, both crystallographic as well as computational, of the catalytic domains for the 23 representative members of the MMP family. A survey of the metal binding sites and two loops containing variable sequences of amino acids, which are important for substrate interactions, is discussed. The collective data support the proposal that the assembly of the domains into multidomain enzymes was likely to be an early evolutionary event. This was followed by diversification, perhaps in parallel among the MMPs, in a subsequent evolutionary time scale. Analysis indicates that a retrograde structure simplification may have accounted for the evolution of MMPs with simple domain constituents, such as matrilysin, from the larger and more elaborate enzymes.

Systematic Error in Seed Plant Phylogenomics

PubMed Central

Zhong, Bojian; Deusch, Oliver; Goremykin, Vadim V.; Penny, David; Biggs, Patrick J.; Atherton, Robin A.; Nikiforova, Svetlana V.; Lockhart, Peter James

2011-01-01

Resolving the closest relatives of Gnetales has been an enigmatic problem in seed plant phylogeny. The problem is known to be difficult because of the extent of divergence between this diverse group of gymnosperms and their closest phylogenetic relatives. Here, we investigate the evolutionary properties of conifer chloroplast DNA sequences. To improve taxon sampling of Cupressophyta (non-Pinaceae conifers), we report sequences from three new chloroplast (cp) genomes of Southern Hemisphere conifers. We have applied a site pattern sorting criterion to study compositional heterogeneity, heterotachy, and the fit of conifer chloroplast genome sequences to a general time reversible + G substitution model. We show that non-time reversible properties of aligned sequence positions in the chloroplast genomes of Gnetales mislead phylogenetic reconstruction of these seed plants. When 2,250 of the most varied sites in our concatenated alignment are excluded, phylogenetic analyses favor a close evolutionary relationship between the Gnetales and Pinaceae—the Gnepine hypothesis. Our analytical protocol provides a useful approach for evaluating the robustness of phylogenomic inferences. Our findings highlight the importance of goodness of fit between substitution model and data for understanding seed plant phylogeny. PMID:22016337
Evolutionary History of LINE-1 in the Major Clades of Placental Mammals

PubMed Central

Waters, Paul D.; Dobigny, Gauthier; Waddell, Peter J.; Robinson, Terence J.

2007-01-01

Background: LINE-1 constitutes an important component of mammalian genomes. It has a dynamic evolutionary history characterized by the rise, fall and replacement of subfamilies. Most data concerning LINE-1 biology and evolution are derived from the human and mouse genomes and are often assumed to hold for all placentals. Methodology: To examine LINE-1 relationships, sequences from the 3′ region of the reverse transcriptase from 21 species (representing 13 orders across Afrotheria, Xenarthra, Supraprimates and Laurasiatheria) were obtained from whole genome sequence assemblies, or by PCR with degenerate primers. These sequences were aligned and analysed. Principal Findings: Our analysis reflects accepted placental relationships suggesting mostly lineage-specific LINE-1 families. The data provide clear support for several clades including Glires, Supraprimates, Laurasiatheria, Boreoeutheria, Xenarthra and Afrotheria. Within the afrotherian LINE-1 (AfroLINE) clade, our tree supports Paenungulata, Afroinsectivora and Afroinsectiphilia. Xenarthran LINE-1 (XenaLINE) falls sister to AfroLINE, providing some support for the Atlantogenata (Xenarthra+Afrotheria) hypothesis. Significance: LINEs and SINEs make up approximately half of all placental genomes, so understanding their dynamics is an essential aspect of comparative genomics. Importantly, a tree of LINE-1 offers a different view of the root, as long edges (branches) such as that to marsupials are shortened and/or broken up. Additionally, a robust phylogeny of diverse LINE-1 is essential in testing that site-specific LINE-1 insertions, often regarded as homoplasy-free phylogenetic markers, are indeed unique and not convergent. PMID:17225861
Genetic recapture identifies long-distance breeding dispersal in Greater Sage-Grouse (Centrocercus urophasianus)

Treesearch

Todd B. Cross; David E. Naugle; John C. Carlson; Michael K. Schwartz

2017-01-01

Dispersal can strongly influence the demographic and evolutionary trajectory of populations. For many species, little is known about dispersal, despite its importance to conservation. The Greater Sage-Grouse (Centrocercus urophasianus) is a species of conservation concern that ranges across 11 western U.S. states and 2 Canadian provinces. To investigate dispersal...
The application of genetic indicators in wild populations: Potential and pitfalls for genetic monitoring [Chapter 15]

Treesearch

Jennifer Pierson; Gordon Luikart; Michael Schwartz

2015-01-01

The genetic aspects of biodiversity and conservation have been long recognised as important to the viability of populations and evolutionary potential of species (Lande 1988). Yet incorporating genetic considerations into conservation, management, and decision making has lagged behind this recognition (Mace et al. 2003; Laikre et al. 2010). Gene-level (genetic...
Ex situ gene conservation for conifers in the Pacific Northwest.

Treesearch

Sara R. Lipow; J. Bradley St. Clair; G.R. Johnson

2002-01-01

Recently, a group of public and private organizations responsible for managing much of the timberland in western Oregon and Washington formed the Pacific Northwest forest tree Gene Conservation Group (GCG) to ensure that the evolutionary potential of important regional tree species is maintained. The group is first compiling data to evaluate the genetic resource status...
Molecular Genetic Equipment for Improved Inventory and Monitoring of Species of Conservation Concern on Department of Defense Lands

DTIC Science & Technology

2015-11-18

University of Idaho was tasked with designing methods to monitor species of concern on DoD lands as part of four DoD grants. With the funds granted we were...The Laboratory for Ecological, Evolutionary and Conservation
Conservation and diversification of the miR166 family in soybean and potential roles of newly identified miR166s.

PubMed

Li, Xuyan; Xie, Xin; Li, Ji; Cui, Yuhai; Hou, Yanming; Zhai, Lulu; Wang, Xiao; Fu, Yanli; Liu, Ranran; Bian, Shaomin

2017-02-01

microRNA166 (miR166) is a highly conserved family of miRNAs implicated in a wide range of cellular and physiological processes in plants. The miR166 family generally comprises multiple miR166 members in plants, which might exhibit functional redundancy and specificity. The soybean miR166 family consists of 21 members according to the miRBase database. However, the evolutionary conservation and functional diversification of miR166 family members in soybean remain poorly understood. We identified five novel miR166s in soybean by a data mining approach, thus enlarging the size of the miR166 family from 21 to 26 members. Phylogenetic analyses of the 26 miR166s and their precursors indicated that the soybean miR166 family exhibited both evolutionary conservation and diversification, and ten pairs of miR166 precursors with high sequence identity were individually grouped into a discrete clade in the phylogenetic tree. The analysis of genomic organization and evolution of the MIR166 gene family revealed that eight segmental duplications and four tandem duplications might have occurred during evolution of the miR166 family in soybean. The cis-elements in promoters of MIR166 family genes and their putative targets pointed to their possible contributions to the functional conservation and diversification. The targets of soybean miR166s were predicted, and the cleavage of ATHB14-LIKE transcript was experimentally validated by RACE PCR. Further, the expression patterns of the five newly identified MIR166s and 12 target genes were examined during seed development and in response to abiotic stresses, which provided important clues for dissecting their functions and isoform specificity. This study enlarged the size of the soybean miR166 family from 21 to 26 members, and the 26 soybean miR166s exhibited evolutionary conservation and diversification. These findings have laid a foundation for elucidating functional conservation and diversification of miR166 family members, especially during seed development or under abiotic stresses.
The Role of Integrative Taxonomy in the Conservation Management of Cryptic Species: The Taxonomic Status of Endangered Earless Dragons (Agamidae: Tympanocryptis) in the Grasslands of Queensland, Australia

PubMed Central

Melville, Jane; Smith, Katie; Hobson, Rod; Hunjan, Sumitha; Shoo, Luke

2014-01-01

Molecular phylogenetics is increasingly highlighting the prevalence of cryptic species, where morphologically similar organisms have long independent evolutionary histories. When such cryptic species are known to be declining in numbers and are at risk of extinction due to a range of threatening processes, the disjunction between molecular systematics research and conservation policy becomes a significant problem. We investigate the taxonomic status of Tympanocryptis populations in Queensland, which have previously been assigned to T. tetraporophora, using three species delimitation approaches. The taxonomic uncertainties in this species-group are of particular importance in the Darling Downs Earless Dragon (T. cf. tetraporophora), which is ranked as an endangered ‘species’ of high priority for conservation by the Queensland Department of Environment and Heritage Protection. We undertook a morphological study, integrated with a comprehensive genetic study and species delimitation analyses, to investigate the species status of populations in the region. Phylogenetic analyses of two gene regions (mtDNA: ND2; nuclear: RAG1) revealed high levels of genetic divergence between populations, indicating isolation over long evolutionary time frames, and strongly supporting two independent evolutionary lineages in southeastern Queensland, from the Darling Downs, and a third in the Gulf Region of northern Queensland. Of the three species delimitation protocols used, we found integrative taxonomy the most applicable to this cryptic species complex. Our study demonstrates the utility of integrative taxonomy as a species delimitation approach in cryptic complexes of species with conservation significance, where limited numbers of specimens are available. PMID:25076129
Evolutionary genetics of insect innate immunity.

PubMed

Viljakainen, Lumi

2015-11-01

Patterns of evolution in immune defense genes help to understand the evolutionary dynamics between hosts and pathogens. Multiple insect genomes have been sequenced, with many of them having annotated immune genes, which paves the way for a comparative genomic analysis of insect immunity. In this review, I summarize the current state of comparative and evolutionary genomics of insect innate immune defense. The focus is on the conserved and divergent components of immunity with an emphasis on gene family evolution and evolution at the sequence level; both population genetics and molecular evolution frameworks are considered. © The Author 2015. Published by Oxford University Press.
Evolutionary conservation and regulation of particular alternative splicing events in plant SR proteins

PubMed Central

Kalyna, Maria; Lopato, Sergiy; Voronin, Viktor; Barta, Andrea

2006-01-01

Alternative splicing is an important mechanism for fine tuning of gene expression at the post-transcriptional level. SR proteins govern splice site selection and spliceosome assembly. The Arabidopsis genome encodes 19 SR proteins, several of which have no orthologues in metazoans. Three of the plant-specific subfamilies are characterized by the presence of a relatively long alternatively spliced intron located in their first RNA recognition motif, which potentially results in an extremely truncated protein. In atRSZ33, a member of the RS2Z subfamily, this alternative splicing event was shown to be autoregulated. Here we show that atRSp31, a member of the RS subfamily, does not autoregulate alternative splicing of its similarly positioned intron. Interestingly, this alternative splicing event is regulated by atRSZ33. We demonstrate that the positions of these long introns and their capability for alternative splicing are conserved from green algae to flowering plants. Moreover, in particular alternative splicing events the splicing signals are embedded into highly conserved sequences. In different taxa, these conserved sequences occur in at least one gene within a subfamily. The evolutionary preservation of alternative splice forms together with highly conserved intron features argues for additional functions hidden in the genes of these plant-specific SR proteins. PMID:16936312
Structure based alignment and clustering of proteins (STRALCP)

DOEpatents

Zemla, Adam T.; Zhou, Carol E.; Smith, Jason R.; Lam, Marisa W.

2013-06-18

Disclosed are computational methods of clustering a set of protein structures based on local and pair-wise global similarity values. Pair-wise local and global similarity values are generated based on pair-wise structural alignments for each protein in the set of protein structures. Initially, the protein structures are clustered based on pair-wise local similarity values. The protein structures are then clustered based on pair-wise global similarity values. For each given cluster both a representative structure and spans of conserved residues are identified. The representative protein structure is used to assign newly-solved protein structures to a group. The spans are used to characterize conservation and assign a "structural footprint" to the cluster.
Triangular Alignment (TAME). A Tensor-based Approach for Higher-order Network Alignment

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mohammadi, Shahin; Gleich, David F.; Kolda, Tamara G.

2015-11-01

Network alignment is an important tool with extensive applications in comparative interactomics. Traditional approaches aim to simultaneously maximize the number of conserved edges and the underlying similarity of aligned entities. We propose a novel formulation of the network alignment problem that extends topological similarity to higher-order structures and provide a new objective function that maximizes the number of aligned substructures. This objective function corresponds to an integer programming problem, which is NP-hard. Consequently, we approximate this objective function as a surrogate function whose maximization results in a tensor eigenvalue problem. Based on this formulation, we present an algorithm called Triangular AlignMEnt (TAME), which attempts to maximize the number of aligned triangles across networks. We focus on alignment of triangles because of their enrichment in complex networks; however, our formulation and resulting algorithms can be applied to general motifs. Using a case study on the NAPABench dataset, we show that TAME is capable of producing alignments with up to 99% accuracy in terms of aligned nodes. We further evaluate our method by aligning yeast and human interactomes. Our results indicate that TAME outperforms the state-of-art alignment methods both in terms of biological and topological quality of the alignments.
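The quantity being maximized, the number of aligned triangles, can be illustrated with a small scoring routine. The sketch below only evaluates a given node mapping; it does not implement TAME's tensor-based optimization. The function names and the toy graphs are assumptions made for this example, and the graphs are assumed to be simple and undirected.

```python
def triangles(edges):
    """Return the set of triangles (frozensets of three nodes) in an undirected graph."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    found = set()
    for u, v in edges:
        # Any common neighbour of u and v closes a triangle on the edge (u, v).
        for w in adjacency[u] & adjacency[v]:
            found.add(frozenset((u, v, w)))
    return found


def aligned_triangles(edges_g, edges_h, mapping):
    """Count triangles of G whose images under mapping are also triangles of H."""
    triangles_h = triangles(edges_h)
    count = 0
    for triangle in triangles(edges_g):
        if all(node in mapping for node in triangle):
            image = frozenset(mapping[node] for node in triangle)
            if image in triangles_h:
                count += 1
    return count


if __name__ == "__main__":
    g = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
    h = [(1, 2), (2, 3), (1, 3), (3, 4), (2, 4)]
    print(aligned_triangles(g, h, {"a": 1, "b": 2, "c": 3, "d": 4}))
    print(aligned_triangles(g, h, {"a": 1, "b": 2, "c": 4, "d": 3}))
```

A search procedure would compare candidate mappings by this count; enumerating mappings directly is infeasible for real interactomes, which is why the abstract describes a surrogate objective instead.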
Unified Alignment of Protein-Protein Interaction Networks.

PubMed

Malod-Dognin, Noël; Ban, Kristina; Pržulj, Nataša

2017-04-19

Paralleling the increasing availability of protein-protein interaction (PPI) network data, several network alignment methods have been proposed. Network alignments have been used to uncover functionally conserved network parts and to transfer annotations. However, due to the computational intractability of the network alignment problem, aligners are heuristics providing divergent solutions and no consensus exists on a gold standard, or which scoring scheme should be used to evaluate them. We comprehensively evaluate the alignment scoring schemes and global network aligners on large scale PPI data and observe that three methods, HUBALIGN, L-GRAAL and NATALIE, regularly produce the most topologically and biologically coherent alignments. We study the collective behaviour of network aligners and observe that PPI networks are almost entirely aligned with a handful of aligners that we unify into a new tool, Ulign. Ulign enables complete alignment of two networks, which traditional global and local aligners fail to do. Also, multiple mappings of Ulign define biologically relevant soft clusterings of proteins in PPI networks, which may be used for refining the transfer of annotations across networks. Hence, PPI networks are already well investigated by current aligners, so to gain additional biological insights, a paradigm shift is needed. We propose such a shift come from aligning all available data types collectively rather than any particular data type in isolation from others.
Functional Characterization of a Syntaxin Involved in Tomato (Solanum lycopersicum) Resistance against Powdery Mildew.

PubMed

Bracuto, Valentina; Appiano, Michela; Zheng, Zheng; Wolters, Anne-Marie A; Yan, Zhe; Ricciardi, Luigi; Visser, Richard G F; Pavan, Stefano; Bai, Yuling

2017-01-01

Specific syntaxins, such as Arabidopsis AtPEN1 and its barley ortholog ROR2, play a major role in plant defense against powdery mildews. Indeed, the impairment of these genes results in increased fungal penetration in both host and non-host interactions. In this study, a genome-wide survey allowed the identification of 21 tomato syntaxins. Two of them, named SlPEN1a and SlPEN1b, are closely related to AtPEN1. RNAi-based silencing of SlPEN1a in a tomato line carrying a loss-of-function mutation of the susceptibility gene SlMLO1 led to compromised resistance toward the tomato powdery mildew fungus Oidium neolycopersici. Moreover, it resulted in a significant increase in the penetration rate of the non-adapted powdery mildew fungus Blumeria graminis f. sp. hordei. Codon-based evolutionary analysis and multiple alignments allowed the detection of amino acid residues that are under purifying selection and are specifically conserved in syntaxins involved in plant-powdery mildew interactions. Our findings provide both insights on the evolution of syntaxins and information about their function which is of interest for future studies on plant-pathogen interactions and tomato breeding.
Is the Link Between the Observed Velocities of Neutron Stars and their Progenitors a Simple Mass Relationship?

NASA Astrophysics Data System (ADS)

Bray, J. C.

2017-11-01

While the imparting of velocity 'kicks' to compact remnants from supernovae is widely accepted, the relationship of the 'kick' to the progenitor is not. We propose the 'kick' is predominantly a result of conservation of momentum between the ejected and compact remnant masses. We propose the 'kick' velocity is given by $v_{\mathrm{kick}} = \alpha (M_{\mathrm{ejecta}}/M_{\mathrm{remnant}}) + \beta$, where $\alpha$ and $\beta$ are constants we wish to determine. To test this we use the BPASS v2 (Binary Population and Spectral Synthesis) code to create stellar populations from both single star and binary star evolutionary pathways. We then use our Remnant Ejecta and Progenitor Explosion Relationship (REAPER) code to apply 'kicks' to neutron stars from supernovae in these models using a grid of $\alpha$ and $\beta$ values (from 0 to 200 km s$^{-1}$ in steps of 10 km s$^{-1}$), in three different 'kick' orientations (isotropic, spin-axis aligned and orthogonal to spin-axis) and weighted by three different Salpeter initial mass functions (IMFs), with slopes of -2.0, -2.35 and -2.70. We compare our synthetic 2D and 3D velocity probability distributions to the distributions provided by Hobbs et al. (1995).
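The proposed relation and the parameter grid are simple enough to write down directly. The sketch below evaluates the linear kick relation over a grid of α and β from 0 to 200 km/s in steps of 10 km/s, as stated in the abstract. It is not the REAPER code: the function name and the example masses are assumptions for illustration only.

```python
def kick_velocity(m_ejecta, m_remnant, alpha, beta):
    """Kick speed in km/s from the linear relation alpha * (M_ejecta / M_remnant) + beta.

    alpha and beta are in km/s; the two masses only need to share a unit.
    """
    return alpha * (m_ejecta / m_remnant) + beta


# Grid of (alpha, beta) pairs: 0 to 200 km/s inclusive, in steps of 10 km/s.
GRID = [(alpha, beta) for alpha in range(0, 201, 10) for beta in range(0, 201, 10)]


if __name__ == "__main__":
    # Hypothetical supernova: 2.8 solar masses ejected, 1.4 solar-mass remnant.
    for alpha, beta in GRID[:5]:
        print(alpha, beta, kick_velocity(2.8, 1.4, alpha, beta))
```

Each grid point gives one candidate pair of constants; in the study these candidates are compared through the velocity distributions they produce across a whole synthetic population, which this sketch does not attempt.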
A Method to Find Longevity-Selected Positions in the Mammalian Proteome

PubMed Central

Semeiks, Jeremy; Grishin, Nick V.

2012-01-01

Evolutionary theory suggests that the force of natural selection decreases with age. To explore the extent to which this prediction directly affects protein structure and function, we used multiple regression to find longevity-selected positions, defined as the columns of a sequence alignment conserved in long-lived but not short-lived mammal species. We analyzed 7,590 orthologous protein families in 33 mammalian species, accounting for body mass, phylogeny, and species-specific mutation rate. Overall, we found that the number of longevity-selected positions in the mammalian proteome is much higher than would be expected by chance. Further, these positions are enriched in domains of several proteins that interact with one another in inflammation and other aging-related processes, as well as in organismal development. We present as an example the kinase domain of anti-Müllerian hormone type-2 receptor (AMHR2). AMHR2 inhibits ovarian follicle recruitment and growth, and a homology model of the kinase domain shows that its longevity-selected positions cluster near a SNP associated with delayed human menopause. Distinct from its canonical role in development, this region of AMHR2 may function to regulate the protein’s activity in a lifespan-specific manner. PMID:22701678
The gene coding for small ribosomal subunit RNA in the basidiomycete Ustilago maydis contains a group I intron.

PubMed Central

De Wachter, R; Neefs, J M; Goris, A; Van de Peer, Y

1992-01-01

The nucleotide sequence of the gene coding for small ribosomal subunit RNA in the basidiomycete Ustilago maydis was determined. It revealed the presence of a group I intron with a length of 411 nucleotides. This is the third occurrence of such an intron discovered in a small subunit rRNA gene encoded by a eukaryotic nuclear genome. The other two occurrences are in Pneumocystis carinii, a fungus of uncertain taxonomic status, and Ankistrodesmus stipitatus, a green alga. The nucleotides of the conserved core structure of 101 group I intron sequences present in different genes and genome types were aligned and their evolutionary relatedness was examined. This revealed a cluster including all group I introns hitherto found in eukaryotic nuclear genes coding for small and large subunit rRNAs. A secondary structure model was designed for the area of the Ustilago maydis small ribosomal subunit RNA precursor where the intron is situated. It shows that the internal guide sequence pairing with the intron boundaries fits between two helices of the small subunit rRNA, and that minimal rearrangement of base pairs suffices to achieve the definitive secondary structure of the 18S rRNA upon splicing. PMID:1561081
A fully automatic evolutionary classification of protein folds: Dali Domain Dictionary version 3

PubMed Central

Dietmann, Sabine; Park, Jong; Notredame, Cedric; Heger, Andreas; Lappe, Michael; Holm, Liisa

2001-01-01

The Dali Domain Dictionary (http://www.ebi.ac.uk/dali/domain) is a numerical taxonomy of all known structures in the Protein Data Bank (PDB). The taxonomy is derived fully automatically from measurements of structural, functional and sequence similarities. Here, we report the extension of the classification to match the traditional four hierarchical levels corresponding to: (i) supersecondary structural motifs (attractors in fold space), (ii) the topology of globular domains (fold types), (iii) remote homologues (functional families) and (iv) homologues with sequence identity above 25% (sequence families). The computational definitions of attractors and functional families are new. In September 2000, the Dali classification contained 10 531 PDB entries comprising 17 101 chains, which were partitioned into five attractor regions, 1375 fold types, 2582 functional families and 3724 domain sequence families. Sequence families were further associated with 99 582 unique homologous sequences in the HSSP database, which increases the number of effectively known structures several-fold. The resulting database contains the description of protein domain architecture, the definition of structural neighbours around each known structure, the definition of structurally conserved cores and a comprehensive library of explicit multiple alignments of distantly related protein families. PMID:11125048
Genomicus update 2015: KaryoView and MatrixView provide a genome-wide perspective to multispecies comparative genomics.

PubMed

Louis, Alexandra; Nguyen, Nga Thi Thuy; Muffato, Matthieu; Roest Crollius, Hugues

2015-01-01

The Genomicus web server (http://www.genomicus.biologie.ens.fr/genomicus) is a visualization tool allowing comparative genomics in four different phyla (Vertebrate, Fungi, Metazoan and Plants). It provides access to genomic information from extant species, as well as ancestral gene content and gene order for vertebrates and flowering plants. Here we present the new features available for vertebrate genome with a focus on new graphical tools. The interface to enter the database has been improved, two pairwise genome comparison tools are now available (KaryoView and MatrixView) and the multiple genome comparison tools (PhyloView and AlignView) propose three new kinds of representation and a more intuitive menu. These new developments have been implemented for Genomicus portal dedicated to vertebrates. This allows the analysis of 68 extant animal genomes, as well as 58 ancestral reconstructed genomes. The Genomicus server also provides access to ancestral gene orders, to facilitate evolutionary and comparative genomics studies, as well as computationally predicted regulatory interactions, thanks to the representation of conserved non-coding elements with their putative gene targets. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

The primary structure of the thymidine kinase gene of fish lymphocystis disease virus.

PubMed

Schnitzler, P; Handermann, M; Szépe, O; Darai, G

1991-06-01

The DNA nucleotide sequence of the thymidine kinase (TK) gene of fish lymphocystis disease virus (FLDV) which has been localized between the coordinates 0.678 to 0.688 of the viral genome was determined. The analysis of the DNA nucleotide sequence located between the recognition sites of HindIII (0.669 map unit; nucleotide position 1) and AccI (nucleotide position 2032) revealed the presence of an open reading frame of 954 bp on the lower strand of this region between nucleotide positions 1868 (ATG) and 915 (TAA). It encodes for a protein of 318 amino acid residues. The evolutionary relationships of the TK gene of FLDV to the other known TK genes was investigated using the method of progressive sequence alignment. These analyses revealed a high degree of diversity between the protein sequence of FLDV TK gene and the amino acid composition of other TKs tested. However, significant conservations were detected at several regions of amino acid residues of the FLDV TK protein when compared to the amino acid sequence of TKs of African swine fever virus, fowlpox virus, shope fibroma virus, and vaccinia virus and to the amino acid sequences of the cellular cytoplasmic TK of chicken, mouse, and man.
Constructing a 'Chromonome' of Yellowtail (Seriola quinqueradiata) for Comparative Analysis of Chromosomal Rearrangements

PubMed Central

Kawase, Junya; Aoki, Jun-ya; Araki, Kazuo

2018-01-01

To investigate chromosome evolution in fish species, we newly mapped 181 markers that allowed us to construct a yellowtail (Seriola quinqueradiata) radiation hybrid (RH) physical map with 1,713 DNA markers, which was far denser than a previous map, and we anchored the de novo assembled sequences onto the RH physical map. Finally, we mapped a total of 13,977 expressed sequence tags (ESTs) on a genome sequence assembly aligned with the physical map. Using the high-density physical map and anchored genome sequences, we accurately compared the yellowtail genome structure with the genome structures of five model fishes to identify characteristics of the yellowtail genome. Between yellowtail and Japanese medaka (Oryzias latipes), almost all regions of the chromosomes were conserved and some blocks comprising several markers were translocated. Using the genome information of the spotted gar (Lepisosteus oculatus) as a reference, we further documented syntenic relationships and chromosomal rearrangements that occurred during evolution in four other acanthopterygian species (Japanese medaka, zebrafish, spotted green pufferfish and three-spined stickleback). The evolutionary chromosome translocation frequency was 1.5-2-times higher in yellowtail than in medaka, pufferfish, and stickleback. PMID:29290830
Applying molecular genetic tools to the conservation and action plan for the critically endangered Far Eastern leopard (Panthera pardus orientalis).

PubMed

Uphyrkina, Olga; O'Brien, Stephen J

2003-08-01

A role for molecular genetic approaches in conservation of endangered taxa is now commonly recognized. Because conservation genetic analyses provide essential insights on taxonomic status, recent evolutionary history and current health of endangered taxa, they are considered in nearly all conservation programs. Genetic analyses of the critically endangered Far Eastern, or Amur leopard, Panthera pardus orientalis, have been done recently to address all of these questions and develop strategies for survival of the leopard in the wild. The genetic status and implication for conservation management of the Far Eastern leopard subspecies are discussed.
America and the Containment of Arab Radical Nationalism: The Eisenhower Years

DTIC Science & Technology

1994-05-01

evolutionary process in the transformation and defense of the Arab East. The emergence of Nasser and radical nationalism throughout the area required a... processes were evolutionary and optimistic. It would require decades to accomplish what had taken centuries in their own societies. It also required... numerous foreign technicians and progressive political leaders, the latter being excluded from the political process by the conservatives. Many of these
Evolution in biodiversity policy – current gaps and future needs

PubMed Central

Santamaría, Luis; Méndez, Pablo F

2012-01-01

The intensity and speed of human alterations to the planet's ecosystems are rendering our static, ahistorical view of biodiversity obsolete. Human actions frequently trigger fast evolutionary responses, affect extant genetic variation and result in the establishment of new communities and co-evolutionary networks for which we lack past analogues. Contemporary evolution interplays with ecological changes to determine the response of organisms and ecosystems to anthropogenic pressures. Examples in wild species include responses to harvest (e.g. fisheries, hunting, angling), habitat loss and fragmentation (e.g. genetic effects of isolation), biotic exchange (e.g. evolutionary responses to control measures), climate change (e.g. local adaptation and its interplay with dispersal processes) and the responses of endangered species to conservation measures. A review of international and EU biodiversity policies showed numerous opportunities for the integration of evolutionary knowledge, with the realistic prospect of improving their efficacy. Such opportunities should be extended to other sectoral policies of direct relevance for biodiversity – notably nature conservation, fisheries, agriculture, water resources, spatial planning and climate change. These avenues for improvement are, however, challenged by the low level of enforcement of biodiversity policies, linked to the nonbinding nature of most biodiversity-policy documents, and the decreasing representation of biodiversity in the EU's research policy. PMID:25568042
SSAW: A new sequence similarity analysis method based on the stationary discrete wavelet transform.

PubMed

Lin, Jie; Wei, Jing; Adjeroh, Donald; Jiang, Bing-Hua; Jiang, Yue

2018-05-02

Alignment-free sequence similarity analysis methods often lead to significant savings in computational time over alignment-based counterparts. A new alignment-free sequence similarity analysis method, called SSAW, is proposed. SSAW stands for Sequence Similarity Analysis using the Stationary Discrete Wavelet Transform (SDWT). It extracts k-mers from a sequence, then maps each k-mer to a complex number field. Then, the series of complex numbers formed is transformed into feature vectors using the stationary discrete wavelet transform. After these steps, the original sequence is turned into a feature vector with numeric values, which can then be used for clustering and/or classification. Using two different types of applications, namely, clustering and classification, we compared SSAW against the state-of-the-art alignment-free sequence analysis methods. SSAW demonstrates competitive or superior performance in terms of standard indicators, such as accuracy, F-score, precision, and recall. The running time was significantly better in most cases. These make SSAW a suitable method for sequence analysis, especially given the rapidly increasing volumes of sequence data required by most modern applications.
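The pipeline outlined in this abstract (k-mers, then complex numbers, then a stationary wavelet transform, then a fixed-length feature vector) can be illustrated with a minimal sketch. This is not the authors' implementation: the base-to-complex mapping `BASE_TO_COMPLEX`, the choice of an undecimated Haar filter with circular boundaries, and the use of per-level coefficient energies as features are all assumptions made here for illustration, since the abstract does not specify them.

```python
import math

# Hypothetical base-to-complex mapping (an assumption, not the published SSAW mapping).
BASE_TO_COMPLEX = {"A": 1 + 1j, "C": -1 - 1j, "G": -1 + 1j, "T": 1 - 1j}


def kmer_signal(seq, k):
    """Map each overlapping k-mer to one complex number (mean of its base values)."""
    seq = seq.upper()
    signal = []
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        # Skip k-mers containing ambiguous symbols such as N.
        if all(base in BASE_TO_COMPLEX for base in kmer):
            signal.append(sum(BASE_TO_COMPLEX[base] for base in kmer) / k)
    return signal


def haar_swt_level(signal, level):
    """One level of an undecimated (stationary) Haar transform, circular boundary.

    The filter is dilated by 2**(level - 1) instead of downsampling the signal,
    so the output has the same length as the input at every level.
    """
    step = 2 ** (level - 1)
    n = len(signal)
    approx = [(signal[i] + signal[(i + step) % n]) / math.sqrt(2) for i in range(n)]
    detail = [(signal[i] - signal[(i + step) % n]) / math.sqrt(2) for i in range(n)]
    return approx, detail


def features(seq, k=3, levels=3):
    """Fixed-length feature vector: mean detail energy per level, then final approximation energy."""
    approx = kmer_signal(seq, k)
    if not approx:
        raise ValueError("sequence too short or contains no valid k-mers")
    feats = []
    for level in range(1, levels + 1):
        approx, detail = haar_swt_level(approx, level)
        feats.append(sum(abs(d) ** 2 for d in detail) / len(detail))
    feats.append(sum(abs(a) ** 2 for a in approx) / len(approx))
    return feats


def euclidean(u, v):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
```

Because the vector length depends only on `levels`, sequences of different lengths yield comparable vectors that could be passed to any standard clustering or classification routine, for example by comparing `euclidean(features(s1), features(s2))` across pairs.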
Phylogenetic distribution and evolutionary pattern of an α-proteobacterial small RNA gene that controls polyhydroxybutyrate accumulation in Sinorhizobium meliloti.

PubMed

Lagares, Antonio; Roux, Indra; Valverde, Claudio

2016-06-01

It has become clear that sRNAs play relevant regulatory functions in bacteria. However, a comprehensive understanding of their biological roles considering evolutionary aspects has not been achieved for most of them. Thus, we have characterized the evolutionary and phylogenetic aspects of the Sinorhizobium meliloti mmgR gene encoding the small RNA MmgR, which has been recently reported to be involved in the regulation of polyhydroxybutyrate accumulation in this bacterium. We constructed a covariance model from a multiple sequence and structure alignment of mmgR close homologs that allowed us to extend the search and to detect further remote homologs of the sRNA gene. From our results, mmgR seemed to evolve from a common ancestor of the α-proteobacteria that diverged from the order of Rickettsiales. We have found mmgR homologs in most current species of α-proteobacteria, with a few exceptions in which genomic reduction events or gene rearrangements seem to explain its absence. Furthermore, a strong microsyntenic relationship was found between a large set of mmgR homologs and homologs of a gene encoding a putative N-formyl glutamate amidohydrolase (NFGAH) that allowed us to trace back the evolutionary path of this group of mmgR orthologs. Among them, structure and sequence traits have been completely conserved throughout evolution, namely a Rho-independent terminator and a 10-mer (5'-UUUCCUCCCU-3') that is predicted to remain in a single-stranded region of the sRNA. We thus propose the definition of the new family of α-proteobacterial sRNAs αr8, as well as the subfamily αr8s1 which encompass S. meliloti mmgR orthologs physically linked with the downstream open reading frame encoding a putative NFGAH. So far, mmgR is the trans-encoded small RNA with the widest phylogenetic distribution of well recognized orthologs among α-proteobacteria. Expression of the expected MmgR transcript in rhizobiales other than S. meliloti (Sinorhizobium fredii, Rhizobium leguminosarum and Rhizobium etli) was confirmed by Northern blot. These findings will contribute to the understanding of the biological role(s) of mmgR in the α-proteobacteria. Copyright © 2016 Elsevier Inc. All rights reserved.
Conserved enzymes mediate the early reactions of carotenoid biosynthesis in nonphotosynthetic and photosynthetic prokaryotes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Armstrong, G.A.; Hearst, J.E.; Alberti, M.

1990-12-01

Carotenoids comprise one of the most widespread classes of pigments found in nature. The first reactions of C40 carotenoid biosynthesis proceed through common intermediates in all organisms, suggesting the evolutionary conservation of early enzymes from this pathway. The authors report here the nucleotide sequence of three genes from the carotenoid biosynthesis gene cluster of Erwinia herbicola, a nonphotosynthetic epiphytic bacterium, which encode homologs of the CrtB, CrtE, and CrtI proteins of Rhodobacter capsulatus, a purple nonsulfur photosynthetic bacterium. CrtB (prephytoene pyrophosphate synthase), CrtE (phytoene synthase), and CrtI (phytoene dehydrogenase) are required for the first three reactions specific to the carotenoid branch of general isoprenoid metabolism. All three dehydrogenases possess a hydrophobic N-terminal domain containing a putative ADP-binding βαβ fold characteristic of enzymes known to bind FAD or NAD(P) cofactors. These data indicate the structural conservation of early carotenoid biosynthesis enzymes in evolutionary diverse organisms.
Conservatism of lizard thermal tolerances and body temperatures across evolutionary history and geography.

PubMed

Grigg, Joseph W; Buckley, Lauren B

2013-04-23

Species may exhibit similar thermal tolerances via either common ancestry or environmental filtering and local adaptation, if the species inhabit similar environments. We ask whether upper and lower thermal limits (critical thermal maxima and minima) and body temperatures are more strongly conserved across evolutionary history or geography for lizard populations distributed globally. We find that critical thermal maxima are highly conserved with location accounting for a higher proportion of the variation than phylogeny. Notably, thermal tolerance breadth is conserved across the phylogeny despite critical thermal minima showing little niche conservatism. Body temperatures observed during activity in the field show the greatest degree of conservatism, with phylogeny accounting for most of the variation. This suggests that propensities for thermoregulatory behaviour, which can buffer body temperatures from environmental variation, are similar within lineages. Phylogeny and geography constrain thermal tolerances similarly within continents, but variably within clades. Conservatism of thermal tolerances across lineages suggests that the potential for local adaptation to alleviate the impacts of climate change on lizards may be limited.
Lysosomal enzymes and their receptors in invertebrates: an evolutionary perspective.

PubMed

Kumar, Nadimpalli Siva; Bhamidimarri, Poorna M

2015-01-01

Lysosomal biogenesis is an important process in eukaryotic cells to maintain cellular homeostasis. The key components that are involved in the biogenesis such as the lysosomal enzymes, their modifications and the mannose 6-phosphate receptors have been well studied and their evolutionary conservation across mammalian and non-mammalian vertebrates is clearly established. Invertebrate lysosomal biogenesis pathway on the other hand is not well studied. Although, details on mannose 6-phosphate receptors and enzymes involved in lysosomal enzyme modifications were reported earlier, a clear cut pathway has not been established. Recent research on the invertebrate species involving biogenesis of lysosomal enzymes suggests a possible conserved pathway in invertebrates. This review presents certain observations based on these processes that include biochemical, immunological and functional studies. Major conclusions include conservation of MPR-dependent pathway in higher invertebrates and recent evidence suggests that MPR-independent pathway might have been more prominent among lower invertebrates. The possible components of MPR-independent pathway that may play a role in lysosomal enzyme targeting are also discussed here.
On the interconnection of stable protein complexes: inter-complex hubs and their conservation in Saccharomyces cerevisiae and Homo sapiens networks.

PubMed

Guerra, Concettina

2015-01-01

Protein complexes are key molecular entities that perform a variety of essential cellular functions. The connectivity of proteins within a complex has been widely investigated with both experimental and computational techniques. We developed a computational approach to identify and characterise proteins that play a role in interconnecting complexes. We computed a measure of inter-complex centrality, the crossroad index, based on disjoint paths connecting proteins in distinct complexes and identified inter-complex hubs as proteins with a high value of the crossroad index. We applied the approach to a set of stable complexes in Saccharomyces cerevisiae and in Homo sapiens. As has been done for hubs, we evaluated the topological and biological properties of inter-complex hubs, addressing the following questions. Do inter-complex hubs tend to be evolutionarily conserved? What is the relation between crossroad index and essentiality? We found a good correlation between inter-complex hubs and both evolutionary conservation and essentiality.
Evolutionary conservation of cold-induced antisense RNAs of FLOWERING LOCUS C in Arabidopsis thaliana perennial relatives.

PubMed

Castaings, Loren; Bergonzi, Sara; Albani, Maria C; Kemi, Ulla; Savolainen, Outi; Coupland, George

2014-07-17

Antisense RNA (asRNA) COOLAIR is expressed at A. thaliana FLOWERING LOCUS C (FLC) in response to winter temperatures. Its contribution to cold-induced silencing of FLC was proposed but its functional and evolutionary significance remain unclear. Here we identify a highly conserved block containing the COOLAIR first exon and core promoter at the 3' end of several FLC orthologues. Furthermore, asRNAs related to COOLAIR are expressed at FLC loci in the perennials A. alpina and A. lyrata, although some splicing variants differ from A. thaliana. Study of the A. alpina orthologue, PERPETUAL FLOWERING 1 (PEP1), demonstrates that AaCOOLAIR is induced each winter of the perennial life cycle. Introduction of PEP1 into A. thaliana reveals that AaCOOLAIR cis-elements confer cold-inducibility in this heterologous species while the difference between PEP1 and FLC mRNA patterns depends on both cis-elements and species-specific trans-acting factors. Thus, expression of COOLAIR is highly conserved, supporting its importance in FLC regulation.
Distinct retroelement classes define evolutionary breakpoints demarcating sites of evolutionary novelty

PubMed Central

Longo, Mark S; Carone, Dawn M; Green, Eric D; O'Neill, Michael J; O'Neill, Rachel J

2009-01-01

Background Large-scale genome rearrangements brought about by chromosome breaks underlie numerous inherited diseases, initiate or promote many cancers and are also associated with karyotype diversification during species evolution. Recent research has shown that these breakpoints are nonrandomly distributed throughout the mammalian genome and many, termed "evolutionary breakpoints" (EB), are specific genomic locations that are "reused" during karyotypic evolution. When the phylogenetic trajectory of orthologous chromosome segments is considered, many of these EB are coincident with ancient centromere activity as well as new centromere formation. While EB have been characterized as repeat-rich regions, it has not been determined whether specific sequences have been retained during evolution that would indicate previous centromere activity or a propensity for new centromere formation. Likewise, the conservation of specific sequence motifs or classes at EBs among divergent mammalian taxa has not been determined. Results To define conserved sequence features of EBs associated with centromere evolution, we performed comparative sequence analysis of more than 4.8 Mb within the tammar wallaby, Macropus eugenii, derived from centromeric regions (CEN), euchromatic regions (EU), and an evolutionary breakpoint (EB) that has undergone convergent breakpoint reuse and past centromere activity in marsupials. We found a dramatic enrichment for long interspersed nucleotide elements (LINE1s) and endogenous retroviruses (ERVs) and a depletion of short interspersed nucleotide elements (SINEs) shared between CEN and EBs. We analyzed the orthologous human EB (14q32.33), known to be associated with translocations in many cancers including multiple myelomas and plasma cell leukemias, and found a conserved distribution of similar repetitive elements. 
Conclusion Our data indicate that EBs tracked within the class Mammalia harbor sequence features retained since the divergence of marsupials and eutherians that may have predisposed these genomic regions to large-scale chromosomal instability. PMID:19630942
Evolutionary dynamics of protein domain architecture in plants

PubMed Central

2012-01-01

Background Protein domains are the structural, functional and evolutionary units of the protein. Protein domain architectures are the linear arrangements of domain(s) in individual proteins. Although the evolutionary history of protein domain architecture has been extensively studied in microorganisms, the evolutionary dynamics of domain architecture in the plant kingdom remains largely undefined. To address this question, we analyzed the lineage-based protein domain architecture content in 14 completed green plant genomes. Results Our analyses show that all 14 plant genomes maintain similar distributions of species-specific, single-domain, and multi-domain architectures. Approximately 65% of plant domain architectures are universally present in all plant lineages, while the remaining architectures are lineage-specific. Clear examples are seen of both the loss and gain of specific protein architectures in higher plants. There has been a dynamic, lineage-wise expansion of domain architectures during plant evolution. The data suggest that this expansion can be largely explained by changes in nuclear ploidy resulting from rounds of whole genome duplications. Indeed, there has been a decrease in the number of unique domain architectures when the genomes were normalized into a presumed ancestral genome that has not undergone whole genome duplications. Conclusions Our data show the conservation of universal domain architectures in all available plant genomes, indicating the presence of an evolutionarily conserved, core set of protein components. However, the occurrence of lineage-specific domain architectures indicates that domain architecture diversity has been maintained beyond these core components in plant genomes. 
Although several features of genome-wide domain architecture content are conserved in plants, the data clearly demonstrate lineage-wise, progressive changes and expansions of individual protein domain architectures, reinforcing the notion that plant genomes have undergone dynamic evolution. PMID:22252370
Strongly aligned gas-phase molecules at free-electron lasers

DOE PAGES

Kierspel, Thomas; Wiese, Joss; Mullins, Terry; ...

2015-09-16

Here, we demonstrate a novel experimental implementation to strongly align molecules at full repetition rates of free-electron lasers. We utilized the available in-house laser system at the coherent x-ray imaging beamline at the linac coherent light source. Chirped laser pulses, i.e., the direct output from the regenerative amplifier of the Ti:Sa chirped pulse amplification laser system, were used to strongly align 2,5-diiodothiophene molecules in a molecular beam. The alignment laser pulses had pulse energies of a few mJ and a pulse duration of 94 ps. A degree of alignment of $\langle \cos^{2}\theta_{2\mathrm{D}} \rangle = 0.85$ was measured, limited by the intrinsic temperature of the molecular beam rather than by the available laser system. With the general availability of synchronized chirped-pulse-amplified near-infrared laser systems at short-wavelength laser facilities, our approach allows for the universal preparation of molecules tightly fixed in space for experiments with x-ray pulses.
Expression conservation within the circadian clock of a monocot: natural variation at barley Ppd-H1 affects circadian expression of flowering time genes, but not clock orthologs.

PubMed

Campoli, Chiara; Shtaya, Munqez; Davis, Seth J; von Korff, Maria

2012-06-21

The circadian clock is an endogenous mechanism that coordinates biological processes with daily changes in the environment. In plants, circadian rhythms contribute to both agricultural productivity and evolutionary fitness. In barley, the photoperiod response regulator and flowering-time gene Ppd-H1 is orthologous to the Arabidopsis core-clock gene PRR7. However, relatively little is known about the role of Ppd-H1 and other components of the circadian clock in temperate crop species. In this study, we identified barley clock orthologs and tested the effects of natural genetic variation at Ppd-H1 on diurnal and circadian expression of clock and output genes from the photoperiod-response pathway. Barley clock orthologs HvCCA1, HvGI, HvPRR1, HvPRR37 (Ppd-H1), HvPRR73, HvPRR59 and HvPRR95 showed a high level of sequence similarity and conservation of diurnal and circadian expression patterns, when compared to Arabidopsis. The natural mutation at Ppd-H1 did not affect diurnal or circadian cycling of barley clock genes. However, the Ppd-H1 mutant was found to be arrhythmic under free-running conditions for the photoperiod-response genes HvCO1, HvCO2, and the MADS-box transcription factor and vernalization responsive gene Vrn-H1. We suggest that the described eudicot clock is largely conserved in the monocot barley. However, genetic differentiation within gene families and differences in the function of Ppd-H1 suggest evolutionary modification in the angiosperm clock. Our data indicates that natural variation at Ppd-H1 does not affect the expression level of clock genes, but controls photoperiodic output genes. Circadian control of Vrn-H1 in barley suggests that this vernalization responsive gene is also controlled by the photoperiod-response pathway. Structural and functional characterization of the barley circadian clock will set the basis for future studies of the adaptive significance of the circadian clock in Triticeae species.
High-harmonic spectroscopy of aligned molecules

NASA Astrophysics Data System (ADS)

Yun, Hyeok; Yun, Sang Jae; Lee, Gae Hwang; Nam, Chang Hee

2017-01-01

High harmonics emitted from aligned molecules driven by intense femtosecond laser pulses provide the opportunity to explore the structural information of molecules. The field-free molecular alignment technique is an expedient tool for investigating the structural characteristics of linear molecules. The underlying physics of field-free alignment, showing the characteristic revival structure specific to molecular species, is clearly explained from the quantum-phase analysis of molecular rotational states. The anisotropic nature of molecules is shown from the harmonic polarization measurement performed with spatial interferometry. The multi-orbital characteristics of molecules are investigated using high-harmonic spectroscopy, applied to molecules of N2 and CO2. In the latter case the two-dimensional high-harmonic spectroscopy, implemented using a two-color laser field, is applied to distinguish harmonics from different orbitals. Molecular high-harmonic spectroscopy will open a new route to investigate ultrafast dynamics of molecules.
Whole-proteome phylogeny of large dsDNA viruses and parvoviruses through a composition vector method related to dynamical language model

PubMed Central

2010-01-01

Background The vast sequence divergence among different virus groups has presented a great challenge to alignment-based analysis of virus phylogeny. Due to the problems caused by the uncertainty in alignment, existing tools for phylogenetic analysis based on multiple alignment could not be directly applied to the whole-genome comparison and phylogenomic studies of viruses. There has been a growing interest in alignment-free methods for phylogenetic analysis using complete genome data. Among the alignment-free methods, a dynamical language (DL) method proposed by our group has successfully been applied to the phylogenetic analysis of bacteria and chloroplast genomes. Results In this paper, the DL method is used to analyze the whole-proteome phylogeny of 124 large dsDNA viruses and 30 parvoviruses, two data sets with large difference in genome size. The trees from our analyses are in good agreement to the latest classification of large dsDNA viruses and parvoviruses by the International Committee on Taxonomy of Viruses (ICTV). Conclusions The present method provides a new way for recovering the phylogeny of large dsDNA viruses and parvoviruses, and also some insights on the affiliation of a number of unclassified viruses. In comparison, some alignment-free methods such as the CV Tree method can be used for recovering the phylogeny of large dsDNA viruses, but they are not suitable for resolving the phylogeny of parvoviruses with a much smaller genome size. PMID:20565983
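As a rough illustration of what "alignment-free" comparison means in this context, the sketch below builds k-string frequency profiles and a pairwise distance matrix from them. It is a generic composition-based example, not the dynamical language (DL) method or CVTree: the background-subtraction step those methods apply is omitted, and the function names and the (1 - cosine) / 2 distance are choices made here for illustration.

```python
import math
from collections import Counter


def kmer_frequencies(seq, k):
    """Relative frequencies of all overlapping k-strings in a sequence."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence shorter than k")
    return {kmer: c / total for kmer, c in counts.items()}


def cosine_distance(p, q):
    """Distance in [0, 0.5] for non-negative profiles: (1 - cosine similarity) / 2."""
    dot = sum(value * q.get(kmer, 0.0) for kmer, value in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    return (1.0 - dot / (norm_p * norm_q)) / 2.0


def distance_matrix(seqs, k):
    """Pairwise distances between named sequences; rows and columns follow sorted names."""
    names = sorted(seqs)
    profiles = {name: kmer_frequencies(seqs[name], k) for name in names}
    matrix = [[cosine_distance(profiles[a], profiles[b]) for b in names] for a in names]
    return names, matrix
```

A matrix of this kind can be handed to any distance-based tree-building procedure such as neighbour joining. No alignment is computed at any point, which is why such methods remain applicable when sequences are too divergent to align reliably.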
A flexible statistical model for alignment of label-free proteomics data – incorporating ion mobility and product ion information

PubMed Central

2013-01-01

Background The goal of many proteomics experiments is to determine the abundance of proteins in biological samples, and the variation thereof in various physiological conditions. High-throughput quantitative proteomics, specifically label-free LC-MS/MS, allows rapid measurement of thousands of proteins, enabling large-scale studies of various biological systems. Prior to analyzing these information-rich datasets, raw data must undergo several computational processing steps. We present a method to address one of the essential steps in proteomics data processing - the matching of peptide measurements across samples. Results We describe a novel method for label-free proteomics data alignment with the ability to incorporate previously unused aspects of the data, particularly ion mobility drift times and product ion information. We compare the results of our alignment method to PEPPeR and OpenMS, and compare alignment accuracy achieved by different versions of our method utilizing various data characteristics. Our method results in increased match recall rates and similar or improved mismatch rates compared to PEPPeR and OpenMS feature-based alignment. We also show that the inclusion of drift time and product ion information results in higher recall rates and more confident matches, without increases in error rates. Conclusions Based on the results presented here, we argue that the incorporation of ion mobility drift time and product ion information are worthy pursuits. Alignment methods should be flexible enough to utilize all available data, particularly with recent advancements in experimental separation methods. PMID:24341404
A flexible statistical model for alignment of label-free proteomics data--incorporating ion mobility and product ion information.

PubMed

Benjamin, Ashlee M; Thompson, J Will; Soderblom, Erik J; Geromanos, Scott J; Henao, Ricardo; Kraus, Virginia B; Moseley, M Arthur; Lucas, Joseph E

2013-12-16

The goal of many proteomics experiments is to determine the abundance of proteins in biological samples, and the variation thereof in various physiological conditions. High-throughput quantitative proteomics, specifically label-free LC-MS/MS, allows rapid measurement of thousands of proteins, enabling large-scale studies of various biological systems. Prior to analyzing these information-rich datasets, raw data must undergo several computational processing steps. We present a method to address one of the essential steps in proteomics data processing--the matching of peptide measurements across samples. We describe a novel method for label-free proteomics data alignment with the ability to incorporate previously unused aspects of the data, particularly ion mobility drift times and product ion information. We compare the results of our alignment method to PEPPeR and OpenMS, and compare alignment accuracy achieved by different versions of our method utilizing various data characteristics. Our method results in increased match recall rates and similar or improved mismatch rates compared to PEPPeR and OpenMS feature-based alignment. We also show that the inclusion of drift time and product ion information results in higher recall rates and more confident matches, without increases in error rates. Based on the results presented here, we argue that the incorporation of ion mobility drift time and product ion information are worthy pursuits. Alignment methods should be flexible enough to utilize all available data, particularly with recent advancements in experimental separation methods.
