Gayral, Philippe; Iskra-Caruana, Marie-Line
2009-07-01
Banana streak virus (BSV) is a plant dsDNA pararetrovirus (family Caulimoviridae, genus badnavirus). Although integration is not an essential step in the BSV replication cycle, the nuclear genome of banana (Musa sp.) contains BSV endogenous pararetrovirus sequences (BSV EPRVs). Some BSV EPRVs are infectious by reconstituting a functional viral genome. Recent studies revealed a large molecular diversity of episomal BSV viruses (i.e., nonintegrated) while others focused on BSV EPRV sequences only. In this study, the evolutionary history of badnavirus integration in banana was inferred from phylogenetic relationships between BSV and BSV EPRVs. The relative evolution rates and selective pressures (d(N)/d(S) ratio) were also compared between endogenous and episomal viral sequences. At least 27 recent independent integration events occurred after the divergence of three banana species, indicating that viral integration is a recent and frequent phenomenon. Relaxation of selective pressure on badnaviral sequences that experienced neutral evolution after integration in the plant genome was recorded. Additionally, a significant decrease (35%) in the EPRV evolution rate was observed compared to BSV, reflecting the difference in the evolution rate between episomal dsDNA viruses and plant genome. The comparison of our results with the evolution rate of the Musa genome and other reverse-transcribing viruses suggests that EPRVs play an active role in episomal BSV diversity and evolution.
Fort, Philippe; Albertini, Aurélie; Van-Hua, Aurélie; Berthomieu, Arnaud; Roche, Stéphane; Delsuc, Frédéric; Pasteur, Nicole; Capy, Pierre; Gaudin, Yves; Weill, Mylène
2012-01-01
Retroelements represent a considerable fraction of many eukaryotic genomes and are considered major drivers for adaptive genetic innovations. Recent discoveries showed that despite not normally using DNA intermediates like retroviruses do, Mononegaviruses (i.e., viruses with nonsegmented, negative-sense RNA genomes) can integrate gene fragments into the genomes of their hosts. This was shown for Bornaviridae and Filoviridae, the sequences of which have been found integrated into the germ line cells of many vertebrate hosts. Here, we show that Rhabdoviridae sequences, the major Mononegavirales family, have integrated only into the genomes of arthropod species. We identified 185 integrated rhabdoviral elements (IREs) coding for nucleoproteins, glycoproteins, or RNA-dependent RNA polymerases; they were mostly found in the genomes of the mosquito Aedes aegypti and the blacklegged tick Ixodes scapularis. Phylogenetic analyses showed that most IREs in A. aegypti derived from multiple independent integration events. Since RNA viruses are subject to much higher substitution rates than their hosts, IREs thus represent fossil traces of the diversity of extinct Rhabdoviruses. Furthermore, analyses of orthologous IREs in A. aegypti field mosquitoes sampled worldwide identified an integrated polymerase IRE fragment that appeared under purifying selection within several million years, which supports a functional role in the host's biology. These results show that A. aegypti was subjected to repeated Rhabdovirus infectious episodes during its evolutionary history, which led to the accumulation of many integrated sequences. They also suggest that like retroviruses, integrated rhabdoviral sequences may participate actively in the evolution of their hosts.
Emerging Concepts of Data Integration in Pathogen Phylodynamics.
Baele, Guy; Suchard, Marc A; Rambaut, Andrew; Lemey, Philippe
2017-01-01
Phylodynamics has become an increasingly popular statistical framework to extract evolutionary and epidemiological information from pathogen genomes. By harnessing such information, epidemiologists aim to shed light on the spatio-temporal patterns of spread and to test hypotheses about the underlying interaction of evolutionary and ecological dynamics in pathogen populations. Although the field has witnessed a rich development of statistical inference tools with increasing levels of sophistication, these tools initially focused on sequences as their sole primary data source. Integrating various sources of information, however, promises to deliver more precise insights in infectious diseases and to increase opportunities for statistical hypothesis testing. Here, we review how the emerging concept of data integration is stimulating new advances in Bayesian evolutionary inference methodology which formalize a marriage of statistical thinking and evolutionary biology. These approaches include connecting sequence to trait evolution, such as for host, phenotypic and geographic sampling information, but also the incorporation of covariates of evolutionary and epidemic processes in the reconstruction procedures. We highlight how a full Bayesian approach to covariate modeling and testing can generate further insights into sequence evolution, trait evolution, and population dynamics in pathogen populations. Specific examples demonstrate how such approaches can be used to test the impact of host on rabies and HIV evolutionary rates, to identify the drivers of influenza dispersal as well as the determinants of rabies cross-species transmissions, and to quantify the evolutionary dynamics of influenza antigenicity. Finally, we briefly discuss how data integration is now also permeating through the inference of transmission dynamics, leading to novel insights into tree-generative processes and detailed reconstructions of transmission trees. 
[Bayesian inference; birth–death models; coalescent models; continuous trait evolution; covariates; data integration; discrete trait evolution; pathogen phylodynamics.]
Emerging Concepts of Data Integration in Pathogen Phylodynamics
Baele, Guy; Suchard, Marc A.; Rambaut, Andrew; Lemey, Philippe
2017-01-01
Phylodynamics has become an increasingly popular statistical framework to extract evolutionary and epidemiological information from pathogen genomes. By harnessing such information, epidemiologists aim to shed light on the spatio-temporal patterns of spread and to test hypotheses about the underlying interaction of evolutionary and ecological dynamics in pathogen populations. Although the field has witnessed a rich development of statistical inference tools with increasing levels of sophistication, these tools initially focused on sequences as their sole primary data source. Integrating various sources of information, however, promises to deliver more precise insights in infectious diseases and to increase opportunities for statistical hypothesis testing. Here, we review how the emerging concept of data integration is stimulating new advances in Bayesian evolutionary inference methodology which formalize a marriage of statistical thinking and evolutionary biology. These approaches include connecting sequence to trait evolution, such as for host, phenotypic and geographic sampling information, but also the incorporation of covariates of evolutionary and epidemic processes in the reconstruction procedures. We highlight how a full Bayesian approach to covariate modeling and testing can generate further insights into sequence evolution, trait evolution, and population dynamics in pathogen populations. Specific examples demonstrate how such approaches can be used to test the impact of host on rabies and HIV evolutionary rates, to identify the drivers of influenza dispersal as well as the determinants of rabies cross-species transmissions, and to quantify the evolutionary dynamics of influenza antigenicity. Finally, we briefly discuss how data integration is now also permeating through the inference of transmission dynamics, leading to novel insights into tree-generative processes and detailed reconstructions of transmission trees. 
[Bayesian inference; birth–death models; coalescent models; continuous trait evolution; covariates; data integration; discrete trait evolution; pathogen phylodynamics.] PMID:28173504
Seeking Synthesis: The Integrative Problem in Understanding Language and Its Evolution.
Dale, Rick; Kello, Christopher T; Schoenemann, P Thomas
2016-04-01
We discuss two problems for a general scientific understanding of language, sequences and synergies: how language is an intricately sequenced behavior and how language is manifested as a multidimensionally structured behavior. Though both are central in our understanding, we observe that the former tends to be studied more than the latter. We consider very general conditions that hold in human brain evolution and its computational implications, and identify multimodal and multiscale organization as two key characteristics of emerging cognitive function in our species. This suggests that human brains, and cognitive function specifically, became more adept at integrating diverse information sources and operating at multiple levels for linguistic performance. We argue that framing language evolution, learning, and use in terms of synergies suggests new research questions, and it may be a fruitful direction for new developments in theory and modeling of language as an integrated system. Copyright © 2016 Cognitive Science Society, Inc.
Evol and ProDy for bridging protein sequence evolution and structural dynamics
Mao, Wenzhi; Liu, Ying; Chennubhotla, Chakra; Lezon, Timothy R.; Bahar, Ivet
2014-01-01
Correlations between sequence evolution and structural dynamics are of utmost importance in understanding the molecular mechanisms of function and their evolution. We have integrated Evol, a new package for fast and efficient comparative analysis of evolutionary patterns and conformational dynamics, into ProDy, a computational toolbox designed for inferring protein dynamics from experimental and theoretical data. Using information-theoretic approaches, Evol coanalyzes conservation and coevolution profiles extracted from multiple sequence alignments of protein families with their inferred dynamics. Availability and implementation: ProDy and Evol are open-source and freely available under MIT License from http://prody.csb.pitt.edu/. Contact: bahar@pitt.edu PMID:24849577
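As a rough illustration of the information-theoretic quantities named in the abstract above, the sketch below computes a per-column Shannon entropy (a conservation measure) and a pairwise mutual information (a coevolution measure) for a toy alignment. It is written in plain Python and does not use the Evol or ProDy API; the function names and the four-sequence alignment are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def column_entropy(msa, col):
    """Shannon entropy (in nats) of one alignment column; 0 means fully conserved."""
    counts = Counter(seq[col] for seq in msa)
    n = len(msa)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def mutual_information(msa, i, j):
    """Mutual information (in nats) between alignment columns i and j."""
    n = len(msa)
    p_i = Counter(seq[i] for seq in msa)
    p_j = Counter(seq[j] for seq in msa)
    p_ij = Counter((seq[i], seq[j]) for seq in msa)
    return sum(
        (c / n) * math.log((c / n) / ((p_i[a] / n) * (p_j[b] / n)))
        for (a, b), c in p_ij.items()
    )


# Hypothetical toy alignment: column 0 is invariant, columns 1 and 3
# co-vary perfectly, and column 2 varies independently of column 1.
toy_msa = ["ACDA", "ACEA", "AGDC", "AGEC"]

conservation = [column_entropy(toy_msa, c) for c in range(4)]
coupled = mutual_information(toy_msa, 1, 3)
uncoupled = mutual_information(toy_msa, 1, 2)
```

In this toy case the invariant column has zero entropy, while the perfectly co-varying pair reaches the maximum mutual information possible for two equally frequent states. Real alignments would also need gap handling and corrections for finite sample size, which this sketch omits.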
MicRhoDE: a curated database for the analysis of microbial rhodopsin diversity and evolution
Boeuf, Dominique; Audic, Stéphane; Brillet-Guéguen, Loraine; Caron, Christophe; Jeanthon, Christian
2015-01-01
Microbial rhodopsins are a diverse group of photoactive transmembrane proteins found in all three domains of life and in viruses. Today, microbial rhodopsin research is a flourishing research field in which new understandings of rhodopsin diversity, function and evolution are contributing to broader microbiological and molecular knowledge. Here, we describe MicRhoDE, a comprehensive, high-quality and freely accessible database that facilitates analysis of the diversity and evolution of microbial rhodopsins. Rhodopsin sequences isolated from a vast array of marine and terrestrial environments were manually collected and curated. To each rhodopsin sequence are associated related metadata, including predicted spectral tuning of the protein, putative activity and function, taxonomy for sequences that can be linked to a 16S rRNA gene, sampling date and location, and supporting literature. The database currently covers 7857 aligned sequences from more than 450 environmental samples or organisms. Based on a robust phylogenetic analysis, we introduce an operational classification system with multiple phylogenetic levels ranging from superclusters to species-level operational taxonomic units. An integrated pipeline for online sequence alignment and phylogenetic tree construction is also provided. With a user-friendly interface and integrated online bioinformatics tools, this unique resource should be highly valuable for upcoming studies of the biogeography, diversity, distribution and evolution of microbial rhodopsins. Database URL: http://micrhode.sb-roscoff.fr. PMID:26286928
MicRhoDE: a curated database for the analysis of microbial rhodopsin diversity and evolution.
Boeuf, Dominique; Audic, Stéphane; Brillet-Guéguen, Loraine; Caron, Christophe; Jeanthon, Christian
2015-01-01
Microbial rhodopsins are a diverse group of photoactive transmembrane proteins found in all three domains of life and in viruses. Today, microbial rhodopsin research is a flourishing research field in which new understandings of rhodopsin diversity, function and evolution are contributing to broader microbiological and molecular knowledge. Here, we describe MicRhoDE, a comprehensive, high-quality and freely accessible database that facilitates analysis of the diversity and evolution of microbial rhodopsins. Rhodopsin sequences isolated from a vast array of marine and terrestrial environments were manually collected and curated. To each rhodopsin sequence are associated related metadata, including predicted spectral tuning of the protein, putative activity and function, taxonomy for sequences that can be linked to a 16S rRNA gene, sampling date and location, and supporting literature. The database currently covers 7857 aligned sequences from more than 450 environmental samples or organisms. Based on a robust phylogenetic analysis, we introduce an operational classification system with multiple phylogenetic levels ranging from superclusters to species-level operational taxonomic units. An integrated pipeline for online sequence alignment and phylogenetic tree construction is also provided. With a user-friendly interface and integrated online bioinformatics tools, this unique resource should be highly valuable for upcoming studies of the biogeography, diversity, distribution and evolution of microbial rhodopsins. Database URL: http://micrhode.sb-roscoff.fr. © The Author(s) 2015. Published by Oxford University Press.
Gayral, Philippe; Blondin, Laurence; Guidolin, Olivier; Carreel, Françoise; Hippolyte, Isabelle; Perrier, Xavier; Iskra-Caruana, Marie-Line
2010-07-01
Endogenous plant pararetroviruses (EPRVs) are viral sequences of the family Caulimoviridae integrated into the nuclear genome of numerous plant species. The ability of some endogenous sequences of Banana streak viruses (eBSVs) in the genome of banana (Musa sp.) to induce infections just like the virus itself was recently demonstrated (P. Gayral et al., J. Virol. 83:6697-6710, 2008). Although eBSVs probably arose from accidental events, infectious eBSVs constitute an extreme case of parasitism, as well as a newly described strategy for vertical virus transmission in plants. We investigated the early evolutionary stages of infectious eBSV for two distinct BSV species, GF (BSGFV) and Imové (BSImV), through the study of their distribution, insertion polymorphism, and structure evolution among selected banana genotypes representative of the diversity of 60 wild Musa species and genotypes. To do so, the historical frame of host evolution was analyzed by inferring banana phylogeny from two chloroplast regions, matK and trnL-trnF, as well as from the nuclear genome, using 19 microsatellite loci. We demonstrated that both BSV species integrated recently in banana evolution, circa 640,000 years ago. The two infectious eBSVs were subjected to different selective pressures and showed distinct levels of rearrangement within their final structure. In addition, the molecular phylogenies of integrated and nonintegrated BSVs enabled us to establish the phylogenetic origins of eBSGFV and eBSImV.
Gini, Beatrice; Mischel, Paul S
2014-08-01
Single-cell sequencing approaches are needed to characterize the genomic diversity of complex tumors, shedding light on their evolutionary paths and potentially suggesting more effective therapies. In this issue of Cancer Discovery, Francis and colleagues develop a novel integrative approach to identify distinct tumor subpopulations based on joint detection of clonal and subclonal events from bulk tumor and single-nucleus whole-genome sequencing, allowing them to infer a subclonal architecture. Surprisingly, the authors identify convergent evolution of multiple, mutually exclusive, independent EGFR gain-of-function variants in a single tumor. This study demonstrates the value of integrative single-cell genomics and highlights the biologic primacy of EGFR as an actionable target in glioblastoma. ©2014 American Association for Cancer Research.
Modeling of the Orbital Evolution of 2060 Chiron
NASA Astrophysics Data System (ADS)
Kovalenko, Nataliya S.; Babenko, Yury G.; Churyumov, Klim I.
2002-03-01
The origin of Centaurs is one of the most interesting problems of Solar system science, and it has not yet been solved. To shed light on this problem one can investigate Centaurs' past and future orbital evolution. In this paper we discuss the results of modeling the orbital evolution of Chiron, the first Centaur discovered and the brightest one. The orbit was integrated numerically for 1 Myr forward and backward from the present time, using a program based on the Everhart single sequence method for integrating orbits.
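To make the idea of integrating an orbit forward and backward in time concrete, here is a minimal sketch. It does not implement the Everhart method named in the abstract; it substitutes a much simpler kick-drift-kick leapfrog scheme and a planar two-body problem (a massless body around the Sun, with no planetary perturbations). The units (AU, years), the starting state, and all function names are assumptions made for illustration only.

```python
import math

# Gravitational parameter of the Sun in AU^3 / yr^2, so that a circular
# orbit at 1 AU has a period of exactly 1 yr (illustrative unit choice).
GM_SUN = 4.0 * math.pi ** 2


def acceleration(pos):
    """Acceleration of a massless body at pos = (x, y) due to the Sun at the origin."""
    x, y = pos
    r3 = (x * x + y * y) ** 1.5
    return (-GM_SUN * x / r3, -GM_SUN * y / r3)


def leapfrog(pos, vel, dt, steps):
    """Kick-drift-kick leapfrog; a negative dt integrates backward in time."""
    x, y = pos
    vx, vy = vel
    ax, ay = acceleration((x, y))
    for _ in range(steps):
        vx += 0.5 * dt * ax          # half kick
        vy += 0.5 * dt * ay
        x += dt * vx                 # full drift
        y += dt * vy
        ax, ay = acceleration((x, y))
        vx += 0.5 * dt * ax          # second half kick
        vy += 0.5 * dt * ay
    return (x, y), (vx, vy)


def energy(pos, vel):
    """Specific orbital energy (kinetic plus potential, per unit mass)."""
    return 0.5 * (vel[0] ** 2 + vel[1] ** 2) - GM_SUN / math.hypot(*pos)


# Hypothetical starting state: circular orbit at 1 AU.
start_pos, start_vel = (1.0, 0.0), (0.0, 2.0 * math.pi)
fwd_pos, fwd_vel = leapfrog(start_pos, start_vel, 1e-3, 1000)   # 1 yr forward
back_pos, back_vel = leapfrog(fwd_pos, fwd_vel, -1e-3, 1000)    # 1 yr backward
```

Leapfrog is used here because it is short and time-reversible, which makes the forward and backward runs easy to compare. A study spanning 1 Myr with planetary perturbations would call for a higher-order scheme such as the one the authors cite.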
Evol and ProDy for bridging protein sequence evolution and structural dynamics.
Bakan, Ahmet; Dutta, Anindita; Mao, Wenzhi; Liu, Ying; Chennubhotla, Chakra; Lezon, Timothy R; Bahar, Ivet
2014-09-15
Correlations between sequence evolution and structural dynamics are of utmost importance in understanding the molecular mechanisms of function and their evolution. We have integrated Evol, a new package for fast and efficient comparative analysis of evolutionary patterns and conformational dynamics, into ProDy, a computational toolbox designed for inferring protein dynamics from experimental and theoretical data. Using information-theoretic approaches, Evol coanalyzes conservation and coevolution profiles extracted from multiple sequence alignments of protein families with their inferred dynamics. ProDy and Evol are open-source and freely available under MIT License from http://prody.csb.pitt.edu/. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Unusual RNA plant virus integration in the soybean genome leads to the production of small RNAs.
da Fonseca, Guilherme Cordenonsi; de Oliveira, Luiz Felipe Valter; de Morais, Guilherme Loss; Abdelnor, Ricardo Vilela; Nepomuceno, Alexandre Lima; Waterhouse, Peter M; Farinelli, Laurent; Margis, Rogerio
2016-05-01
Horizontal gene transfer (HGT) is known to be a major force in genome evolution. The acquisition of genes from viruses by eukaryotic genomes is a well-studied example of HGT, including rare cases of non-retroviral RNA virus integration. The present study describes the integration of cucumber mosaic virus RNA-1 into the soybean genome. After an initial metatranscriptomic analysis of small RNAs derived from soybean, the de novo assembly resulted in a 3029-nt contig homologous to RNA-1. The integration of this sequence in the soybean genome was confirmed by DNA deep sequencing. The locus where the integration occurred harbors the full RNA-1 sequence followed by the partial sequence of an endogenous mRNA and another sequence of RNA-1 as an inverted repeat, allowing the formation of a hairpin structure. This region recombined into a retrotransposon located inside an exon of a soybean gene. The nucleotide similarity of the integrated sequence compared to other Cucumber mosaic virus sequences indicates that the integration event occurred recently. We described a rare event of non-retroviral RNA virus integration in soybean that leads to the production of a double-stranded RNA in a similar fashion to virus resistance RNAi plants. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Evolution of Enzyme Superfamilies: Comprehensive Exploration of Sequence-Function Relationships.
Baier, F; Copp, J N; Tokuriki, N
2016-11-22
The sequence and functional diversity of enzyme superfamilies have expanded through billions of years of evolution from a common ancestor. Understanding how protein sequence and functional "space" have expanded, at both the evolutionary and molecular level, is central to biochemistry, molecular biology, and evolutionary biology. Integrative approaches that examine protein sequence, structure, and function have begun to provide comprehensive views of the functional diversity and evolutionary relationships within enzyme superfamilies. In this review, we outline the recent advances in our understanding of enzyme evolution and superfamily functional diversity. We describe the tools that have been used to comprehensively analyze sequence relationships and to characterize sequence and function relationships. We also highlight recent large-scale experimental approaches that systematically determine the activity profiles across enzyme superfamilies. We identify several intriguing insights from this recent body of work. First, promiscuous activities are prevalent among extant enzymes. Second, many divergent proteins retain "function connectivity" via enzyme promiscuity, which can be used to probe the evolutionary potential and history of enzyme superfamilies. Finally, we discuss open questions regarding the intricacies of enzyme divergence, as well as potential research directions that will deepen our understanding of enzyme superfamily evolution.
Bueno, Danilo; Palacios-Gimenez, Octavio Manuel; Martí, Dardo Andrea; Mariguela, Tatiane Casagrande; Cabral-de-Mello, Diogo Cavalcanti
2016-08-01
The 5S ribosomal DNA (rDNA) sequences are subject to dynamic evolution at chromosomal and molecular levels, evolving in a concerted and/or birth-and-death fashion. Among grasshoppers, the chromosomal location of this sequence has been established for some species, but little molecular information has been obtained to infer evolutionary patterns. Here, we integrated data from chromosomal and nucleotide sequence analysis for 5S rDNA in two Abracris species, aiming to identify evolutionary dynamics. For both species, two arrays were identified: a larger sequence (named type-I) that consisted of the entire 5S rDNA gene plus NTS (non-transcribed spacer) and a smaller one (named type-II) with a truncated 5S rDNA gene plus a short NTS that was considered a pseudogene. For type-I sequences, the gene-corresponding region contained the internal control region and poly-T motif, and the NTS presented partial transposable elements. Between the species, nucleotide differences for type-I were noticed, while type-II was identical, suggesting pseudogenization in a common ancestor. From a chromosomal point of view, type-II was placed in one bivalent, while type-I occurred in multiple copies in distinct chromosomes. In Abracris, the evolution of 5S rDNA was apparently influenced by the chromosomal distribution of clusters (single or multiple location), resulting in a mixed mechanism integrating concerted and birth-and-death evolution depending on the unit.
Probabilistic models of eukaryotic evolution: time for integration
Lartillot, Nicolas
2015-01-01
In spite of substantial work and recent progress, a global and fully resolved picture of the macroevolutionary history of eukaryotes is still under construction. This concerns not only the phylogenetic relations among major groups, but also the general characteristics of the underlying macroevolutionary processes, including the patterns of gene family evolution associated with endosymbioses, as well as their impact on the sequence evolutionary process. All these questions raise formidable methodological challenges, calling for a more powerful statistical paradigm. In this direction, model-based probabilistic approaches have played an increasingly important role. In particular, improved models of sequence evolution accounting for heterogeneities across sites and across lineages have led to significant, although insufficient, improvement in phylogenetic accuracy. More recently, one main trend has been to move away from simple parametric models and stepwise approaches, towards integrative models explicitly considering the intricate interplay between multiple levels of macroevolutionary processes. Such integrative models are in their infancy, and their application to the phylogeny of eukaryotes still requires substantial improvement of the underlying models, as well as additional computational developments. PMID:26323768
Fujimi, T J; Nakajyo, T; Nishimura, E; Ogura, E; Tsuchiya, T; Tamiya, T
2003-08-14
The genes encoding erabutoxin (short chain neurotoxin) isoforms (Ea, Eb, and Ec), LsIII (long chain neurotoxin) and a novel long chain neurotoxin pseudogene were cloned from a Laticauda semifasciata genomic library. Short and long chain neurotoxin genes were also cloned from the genome of Laticauda laticaudata, a closely related species of L. semifasciata, by PCR. A putative matrix attached region (MAR) sequence was found in the intron I of the LsIII gene. Comparative analysis of 11 structurally relevant snake toxin genes (three-finger-structure toxins) revealed the molecular evolution of these toxins. Three-finger-structure toxin genes diverged from a common ancestor through two types of evolutionary pathways (long and short types), early in the course of evolution. At a later stage of evolution in each gene, the accumulation of mutations in the exons, especially exon II, by accelerated evolution may have caused the increased diversification in their functions. It was also revealed that the putative MAR sequence found in the LsIII gene was integrated into the gene after the species-level divergence.
Nielsen, Tue Kjærgaard; Rasmussen, Morten; Demanèche, Sandrine; Cecillon, Sébastien; Vogel, Timothy M; Hansen, Lars Hestbjerg
2017-09-01
Bacterial degraders of chlorophenoxy herbicides have been isolated from various ecosystems, including pristine environments. Among these degraders, the sphingomonads constitute a prominent group that displays versatile xenobiotic-degradation capabilities. Four separate sequencing strategies were required to provide the complete sequence of the complex and plastic genome of the canonical chlorophenoxy herbicide-degrading Sphingobium herbicidovorans MH. The genome has an intricate organization of the chlorophenoxy-herbicide catabolic genes sdpA, rdpA, and cadABCD that encode the (R)- and (S)-enantiomer-specific 2,4-dichlorophenoxypropionate dioxygenases and four subunits of a Rieske non-heme iron oxygenase involved in 2-methyl-chlorophenoxyacetic acid degradation, respectively. Several major genomic rearrangements are proposed to help understand the evolution and mobility of these important genes and their genetic context. Single-strain mobilomic sequence analysis uncovered plasmids and insertion sequence-associated circular intermediates in this environmentally important bacterium and enabled the description of evolutionary models for pesticide degradation in strain MH and related organisms. The mobilome presented a complex mosaic of mobile genetic elements including four plasmids and several circular intermediate DNA molecules of insertion-sequence elements and transposons that are central to the evolution of xenobiotics degradation. Furthermore, two individual chromosomally integrated prophages were shown to excise and form free circular DNA molecules. This approach holds great potential for improving the understanding of genome plasticity, evolution, and microbial ecology. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mushegian, Arcady R., E-mail: mushegian2@gmail.com; Elena, Santiago F., E-mail: sfelena@ibmcp.upv.es; The Santa Fe Institute, Santa Fe, NM 87501
Homologs of Tobacco mosaic virus 30K cell-to-cell movement protein are encoded by diverse plant viruses. Mechanisms of action and evolutionary origins of these proteins remain obscure. We expand the picture of conservation and evolution of the 30K proteins, producing sequence alignment of the 30K superfamily with the broadest phylogenetic coverage thus far and illuminating structural features of the core all-beta fold of these proteins. Integrated copies of pararetrovirus 30K movement genes are prevalent in euphyllophytes, with at least one copy intact in nearly every examined species, and mRNAs detected for most of them. Sequence analysis suggests repeated integrations, pseudogenizations, and positive selection in those provirus genes. An unannotated 30K-superfamily gene in Arabidopsis thaliana genome is likely expressed as a fusion with the At1g37113 transcript. This molecular background of endopararetrovirus gene products in plants may change our view of virus infection and pathogenesis, and perhaps of cellular homeostasis in the hosts. - Highlights: • Sequence region shared by plant virus “30K” movement proteins has an all-beta fold. • Most euphyllophyte genomes contain integrated copies of pararetroviruses. • These integrated virus genomes often include intact movement protein genes. • Molecular evidence suggests that these “30K” genes may be selected for function.
2011-01-01
Background: We present the genome sequence of the tammar wallaby, Macropus eugenii, which is a member of the kangaroo family and the first representative of the iconic hopping mammals that symbolize Australia to be sequenced. The tammar has many unusual biological characteristics, including the longest period of embryonic diapause of any mammal, extremely synchronized seasonal breeding and prolonged and sophisticated lactation within a well-defined pouch. Like other marsupials, it gives birth to highly altricial young, and has a small number of very large chromosomes, making it a valuable model for genomics, reproduction and development. Results: The genome has been sequenced to 2 × coverage using Sanger sequencing, enhanced with additional next generation sequencing and the integration of extensive physical and linkage maps to build the genome assembly. We also sequenced the tammar transcriptome across many tissues and developmental time points. Our analyses of these data shed light on mammalian reproduction, development and genome evolution: there is innovation in reproductive and lactational genes, rapid evolution of germ cell genes, and incomplete, locus-specific X inactivation. We also observe novel retrotransposons and a highly rearranged major histocompatibility complex, with many class I genes located outside the complex. Novel microRNAs in the tammar HOX clusters uncover new potential mammalian HOX regulatory elements. Conclusions: Analyses of these resources enhance our understanding of marsupial gene evolution, identify marsupial-specific conserved non-coding elements and critical genes across a range of biological systems, including reproduction, development and immunity, and provide new insight into marsupial and mammalian biology and genome evolution. PMID:21854559
Renfree, Marilyn B; Papenfuss, Anthony T; Deakin, Janine E; Lindsay, James; Heider, Thomas; Belov, Katherine; Rens, Willem; Waters, Paul D; Pharo, Elizabeth A; Shaw, Geoff; Wong, Emily S W; Lefèvre, Christophe M; Nicholas, Kevin R; Kuroki, Yoko; Wakefield, Matthew J; Zenger, Kyall R; Wang, Chenwei; Ferguson-Smith, Malcolm; Nicholas, Frank W; Hickford, Danielle; Yu, Hongshi; Short, Kirsty R; Siddle, Hannah V; Frankenberg, Stephen R; Chew, Keng Yih; Menzies, Brandon R; Stringer, Jessica M; Suzuki, Shunsuke; Hore, Timothy A; Delbridge, Margaret L; Patel, Hardip R; Mohammadi, Amir; Schneider, Nanette Y; Hu, Yanqiu; O'Hara, William; Al Nadaf, Shafagh; Wu, Chen; Feng, Zhi-Ping; Cocks, Benjamin G; Wang, Jianghui; Flicek, Paul; Searle, Stephen M J; Fairley, Susan; Beal, Kathryn; Herrero, Javier; Carone, Dawn M; Suzuki, Yutaka; Sugano, Sumio; Toyoda, Atsushi; Sakaki, Yoshiyuki; Kondo, Shinji; Nishida, Yuichiro; Tatsumoto, Shoji; Mandiou, Ion; Hsu, Arthur; McColl, Kaighin A; Lansdell, Benjamin; Weinstock, George; Kuczek, Elizabeth; McGrath, Annette; Wilson, Peter; Men, Artem; Hazar-Rethinam, Mehlika; Hall, Allison; Davis, John; Wood, David; Williams, Sarah; Sundaravadanam, Yogi; Muzny, Donna M; Jhangiani, Shalini N; Lewis, Lora R; Morgan, Margaret B; Okwuonu, Geoffrey O; Ruiz, San Juana; Santibanez, Jireh; Nazareth, Lynne; Cree, Andrew; Fowler, Gerald; Kovar, Christie L; Dinh, Huyen H; Joshi, Vandita; Jing, Chyn; Lara, Fremiet; Thornton, Rebecca; Chen, Lei; Deng, Jixin; Liu, Yue; Shen, Joshua Y; Song, Xing-Zhi; Edson, Janette; Troon, Carmen; Thomas, Daniel; Stephens, Amber; Yapa, Lankesha; Levchenko, Tanya; Gibbs, Richard A; Cooper, Desmond W; Speed, Terence P; Fujiyama, Asao; Graves, Jennifer A M; O'Neill, Rachel J; Pask, Andrew J; Forrest, Susan M; Worley, Kim C
2011-08-29
We present the genome sequence of the tammar wallaby, Macropus eugenii, which is a member of the kangaroo family and the first representative of the iconic hopping mammals that symbolize Australia to be sequenced. The tammar has many unusual biological characteristics, including the longest period of embryonic diapause of any mammal, extremely synchronized seasonal breeding and prolonged and sophisticated lactation within a well-defined pouch. Like other marsupials, it gives birth to highly altricial young, and has a small number of very large chromosomes, making it a valuable model for genomics, reproduction and development. The genome has been sequenced to 2 × coverage using Sanger sequencing, enhanced with additional next generation sequencing and the integration of extensive physical and linkage maps to build the genome assembly. We also sequenced the tammar transcriptome across many tissues and developmental time points. Our analyses of these data shed light on mammalian reproduction, development and genome evolution: there is innovation in reproductive and lactational genes, rapid evolution of germ cell genes, and incomplete, locus-specific X inactivation. We also observe novel retrotransposons and a highly rearranged major histocompatibility complex, with many class I genes located outside the complex. Novel microRNAs in the tammar HOX clusters uncover new potential mammalian HOX regulatory elements. Analyses of these resources enhance our understanding of marsupial gene evolution, identify marsupial-specific conserved non-coding elements and critical genes across a range of biological systems, including reproduction, development and immunity, and provide new insight into marsupial and mammalian biology and genome evolution.
cyclostratigraphy, sequence stratigraphy and organic matter accumulation mechanism
NASA Astrophysics Data System (ADS)
Cong, F.; Li, J.
2016-12-01
The first member of the Maokou Formation in the Sichuan Basin is composed of well-preserved carbonate ramp couplets of limestone and marlstone/shale. It is one of the potential shale gas source rocks and is suitable for time-series analysis. We conducted time-series analysis to identify high-frequency sequences, reconstruct a high-resolution sedimentation rate, estimate detailed primary productivity for the first time in the study intervals, and discuss the organic matter accumulation mechanism of the source rock within a sequence stratigraphic framework. Using the theory of cyclostratigraphy and sequence stratigraphy, the high-frequency sequences of one outcrop profile and one drilling well are identified. Two third-order sequences and eight fourth-order sequences are distinguished on the outcrop profile based on the cycle stacking patterns. For the drilling well, the sequence boundary and four system tracts are distinguished by "integrated prediction error filter analysis" (INPEFA) of gamma-ray logging data, and eight fourth-order sequences are identified from the 405 ka long eccentricity curve in the depth domain, which is quantified and filtered by integrated MTM spectral analysis, evolutive harmonic analysis (EHA), evolutive average spectral misfit (eASM) and band-pass filtering. This suggests that high-frequency sequences correlate well with Milankovitch orbital signals recorded in sediments, and that cyclostratigraphy theory is applicable to dividing high-frequency (fourth- to sixth-order) sequence stratigraphy. A high-resolution sedimentation rate is reconstructed through the study interval by tracking the highly statistically significant short eccentricity component (123 ka) revealed by EHA. Based on the sedimentation rate, measured TOC and density data, the burial flux, delivery flux and primary productivity of organic carbon were estimated.
By integrating redox proxies, we can discuss the controls on organic matter accumulation by primary production and preservation under the high-resolution sequence stratigraphic framework. Results show that high average organic carbon contents in the study interval are mainly attributed to high primary production. The results also show a good correlation between high organic carbon accumulation and intervals of transgression.
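As a rough illustration of the flux estimate described in this abstract, the sketch below converts a cycle thickness into a sedimentation rate by assuming that each cycle spans one 123 ka short-eccentricity period, then multiplies by TOC and bulk density to obtain an organic carbon burial flux. The function names, the example values and the simple mass-accumulation formula are assumptions for illustration only; the abstract does not give the authors' equations, and the delivery flux and primary productivity steps are not covered.

```python
def sedimentation_rate_cm_per_kyr(cycle_thickness_cm, cycle_period_kyr=123.0):
    """Mean sedimentation rate if one cycle of the given thickness
    was deposited over one orbital period (default: 123 kyr)."""
    if cycle_period_kyr <= 0:
        raise ValueError("cycle period must be positive")
    return cycle_thickness_cm / cycle_period_kyr


def organic_carbon_burial_flux(toc_wt_percent, bulk_density_g_cm3, sed_rate_cm_per_kyr):
    """Organic carbon burial flux in g C per cm^2 per kyr.

    flux = TOC (mass fraction) * bulk density (g/cm^3) * sedimentation rate (cm/kyr)
    """
    return (toc_wt_percent / 100.0) * bulk_density_g_cm3 * sed_rate_cm_per_kyr


# Hypothetical values, not taken from the study.
rate = sedimentation_rate_cm_per_kyr(cycle_thickness_cm=246.0)
flux = organic_carbon_burial_flux(toc_wt_percent=1.5,
                                  bulk_density_g_cm3=2.5,
                                  sed_rate_cm_per_kyr=rate)
print(f"sedimentation rate: {rate:.2f} cm/kyr, burial flux: {flux:.4f} g C/cm^2/kyr")
```

Because the rate is tied to a fixed orbital period, a thicker cycle implies faster deposition and, for the same TOC, a proportionally larger burial flux.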
Integrating protein structural dynamics and evolutionary analysis with Bio3D.
Skjærven, Lars; Yao, Xin-Qiu; Scarabelli, Guido; Grant, Barry J
2014-12-10
Popular bioinformatics approaches for studying protein functional dynamics include comparisons of crystallographic structures, molecular dynamics simulations and normal mode analysis. However, determining how observed displacements and predicted motions from these traditionally separate analyses relate to each other, as well as to the evolution of sequence, structure and function within large protein families, remains a considerable challenge. This is in part due to the general lack of tools that integrate information of molecular structure, dynamics and evolution. Here, we describe the integration of new methodologies for evolutionary sequence, structure and simulation analysis into the Bio3D package. This major update includes unique high-throughput normal mode analysis for examining and contrasting the dynamics of related proteins with non-identical sequences and structures, as well as new methods for quantifying dynamical couplings and their residue-wise dissection from correlation network analysis. These new methodologies are integrated with major biomolecular databases as well as established methods for evolutionary sequence and comparative structural analysis. New functionality for directly comparing results derived from normal modes, molecular dynamics and principal component analysis of heterogeneous experimental structure distributions is also included. We demonstrate these integrated capabilities with example applications to dihydrofolate reductase and heterotrimeric G-protein families along with a discussion of the mechanistic insight provided in each case. The integration of structural dynamics and evolutionary analysis in Bio3D enables researchers to go beyond a prediction of single protein dynamics to investigate dynamical features across large protein families. The Bio3D package is distributed with full source code and extensive documentation as a platform independent R package under a GPL2 license from http://thegrantlab.org/bio3d/ .
Nielsen, Tue Kjærgaard; Rasmussen, Morten; Demanèche, Sandrine; Cecillon, Sébastien; Vogel, Timothy M.
2017-01-01
Bacterial degraders of chlorophenoxy herbicides have been isolated from various ecosystems, including pristine environments. Among these degraders, the sphingomonads constitute a prominent group that displays versatile xenobiotic-degradation capabilities. Four separate sequencing strategies were required to provide the complete sequence of the complex and plastic genome of the canonical chlorophenoxy herbicide-degrading Sphingobium herbicidovorans MH. The genome has an intricate organization of the chlorophenoxy-herbicide catabolic genes sdpA, rdpA, and cadABCD that encode the (R)- and (S)-enantiomer-specific 2,4-dichlorophenoxypropionate dioxygenases and four subunits of a Rieske non-heme iron oxygenase involved in 2-methyl-chlorophenoxyacetic acid degradation, respectively. Several major genomic rearrangements are proposed to help understand the evolution and mobility of these important genes and their genetic context. Single-strain mobilomic sequence analysis uncovered plasmids and insertion sequence-associated circular intermediates in this environmentally important bacterium and enabled the description of evolutionary models for pesticide degradation in strain MH and related organisms. The mobilome presented a complex mosaic of mobile genetic elements including four plasmids and several circular intermediate DNA molecules of insertion-sequence elements and transposons that are central to the evolution of xenobiotics degradation. Furthermore, two individual chromosomally integrated prophages were shown to excise and form free circular DNA molecules. This approach holds great potential for improving the understanding of genome plasticity, evolution, and microbial ecology. PMID:28961970
A Polyglot Approach to Bioinformatics Data Integration: A Phylogenetic Analysis of HIV-1
Reisman, Steven; Hatzopoulos, Thomas; Läufer, Konstantin; Thiruvathukal, George K.; Putonti, Catherine
2016-01-01
As sequencing technologies continue to drop in price and increase in throughput, new challenges emerge for the management and accessibility of genomic sequence data. We have developed a pipeline for facilitating the storage, retrieval, and subsequent analysis of molecular data, integrating both sequence and metadata. Taking a polyglot approach involving multiple languages, libraries, and persistence mechanisms, sequence data can be aggregated from publicly available and local repositories. Data are exposed in the form of a RESTful web service, formatted for easy querying, and retrieved for downstream analyses. As a proof of concept, we have developed a resource for annotated HIV-1 sequences. Phylogenetic analyses were conducted for >6,000 HIV-1 sequences revealing spatial and temporal factors influence the evolution of the individual genes uniquely. Nevertheless, signatures of origin can be extrapolated even despite increased globalization. The approach developed here can easily be customized for any species of interest. PMID:26819543
DPTEdb, an integrative database of transposable elements in dioecious plants.
Li, Shu-Fen; Zhang, Guo-Jun; Zhang, Xue-Jin; Yuan, Jin-Hong; Deng, Chuan-Liang; Gu, Lian-Feng; Gao, Wu-Jun
2016-01-01
Dioecious plants usually harbor 'young' sex chromosomes, providing an opportunity to study the early stages of sex chromosome evolution. Transposable elements (TEs) are mobile DNA elements frequently found in plants and are suggested to play important roles in plant sex chromosome evolution. The genomes of several dioecious plants have been sequenced, offering an opportunity to annotate and mine the TE data. However, comprehensive and unified annotation of TEs in these dioecious plants is still lacking. In this study, we constructed a dioecious plant transposable element database (DPTEdb). DPTEdb is a specific, comprehensive and unified relational database and web interface. We used a combination of de novo, structure-based and homology-based approaches to identify TEs from the genome assemblies of previously published data, as well as our own. The database currently integrates eight dioecious plant species and a total of 31 340 TEs along with classification information. DPTEdb provides user-friendly web interfaces to browse, search and download the TE sequences in the database. Users can also use tools, including BLAST, GetORF, HMMER, Cut sequence and JBrowse, to analyze TE data. Given the role of TEs in plant sex chromosome evolution, the database will contribute to the investigation of TEs in structural, functional and evolutionary dynamics of the genome of dioecious plants. In addition, the database will supplement the research of sex diversification and sex chromosome evolution of dioecious plants. Database URL: http://genedenovoweb.ticp.net:81/DPTEdb/index.php. © The Author(s) 2016. Published by Oxford University Press.
Moroz, Leonid L
2015-12-01
The origins of neural systems and centralized brains are one of the major transitions in evolution. These events might have occurred more than once over 570-600 million years. The convergent evolution of neural circuits is evident from a diversity of unique adaptive strategies implemented by ctenophores, cnidarians, acoels, molluscs, and basal deuterostomes. However, further integration of biodiversity research and neuroscience is required to decipher critical events leading to the development of complex integrative and cognitive functions. Here, we outline reference species and interdisciplinary approaches in reconstructing the evolution of nervous systems. In the "omic" era, it is now possible to establish fully functional genomics laboratories aboard oceanic ships and perform sequencing and real-time analyses of data at any oceanic location (named here as Ship-Seq). In doing so, fragile, rare, cryptic, and planktonic organisms, or even entire marine ecosystems, are becoming accessible directly to experimental and physiological analyses by modern analytical tools. Thus, we are now in a position to take full advantage of the countless "experiments" Nature performed for us in the course of 3.5 billion years of biological evolution. Together with progress in computational and comparative genomics, evolutionary neuroscience, proteomic and developmental biology, a new surprising picture is emerging that reveals the many ways in which nervous systems evolved. As a result, this symposium provides a unique opportunity to revisit old questions about the origins of biological complexity. © The Author 2015. Published by Oxford University Press on behalf of the Society for Integrative and Comparative Biology. All rights reserved. For permissions please email: journals.permissions@oup.com.
Focal plane AIT sequence: evolution from HRG-Spot 5 to Pleiades HR
NASA Astrophysics Data System (ADS)
Le Goff, Roland; Pranyies, Pascal; Toubhans, Isabelle
2017-11-01
Optical and geometrical image qualities of Focal Planes, for "push-broom" high resolution remote sensing satellites, require the implementation of specific means and methods for the AIT sequence. Indeed, the geometric performances of the focal plane, mainly axial focusing and transverse registration, are obtained through adjustment, setting and measurement of optical and CCD components with an accuracy of a few microns. Since the end of the 1970s, EADS-SODERN has developed a series of detection units for earth observation instruments like SPOT and Helios, and it is now responsible for the development of the Pleiades High Resolution Focal Plane assembly. This paper presents the AIT sequences. We introduce the efforts, innovative solutions and improvements made on the assembly facilities to match the technical evolutions and breakthroughs of the Pleiades HR FP concept in comparison with the previous High Resolution Geometric SPOT 5 Focal Plane. The main evolution drivers are the implementation of strip filters and the realization of 400 mm continuous retinas. For the Pleiades HR AIT sequence, three specific integration and measuring benches, corresponding to the different assembly stages, are used: a 3-D non-contact measurement machine for the assembly of the detection module, a 3-D measurement machine for mirror integration on the main Focal Plane SiC structure, and a 3-D geometric coordinates control bench to focus the detection module lines and to ensure they are well registered together.
Herrmann, Alexander; Haake, Andrea; Ammerpohl, Ole; Martin-Guerrero, Idoia; Szafranski, Karol; Stemshorn, Kathryn; Nothnagel, Michael; Kotsopoulos, Steve K; Richter, Julia; Warner, Jason; Olson, Jeff; Link, Darren R; Schreiber, Stefan; Krawczak, Michael; Platzer, Matthias; Nürnberg, Peter; Siebert, Reiner; Hampe, Jochen
2011-01-01
Cytosine methylation provides an epigenetic level of cellular plasticity that is important for development, differentiation and cancerogenesis. We adopted microdroplet PCR to bisulfite treated target DNA in combination with second generation sequencing to simultaneously assess DNA sequence and methylation. We show measurement of methylation status in a wide range of target sequences (total 34 kb) with an average coverage of 95% (median 100%) and good correlation to the opposite strand (rho = 0.96) and to pyrosequencing (rho = 0.87). Data from lymphoma and colorectal cancer samples for SNRPN (imprinted gene), FGF6 (demethylated in the cancer samples) and HS3ST2 (methylated in the cancer samples) serve as a proof of principle showing the integration of SNP data and phased DNA-methylation information into "hepitypes" and thus the analysis of DNA methylation phylogeny in the somatic evolution of cancer.
Probabilistic modeling of the evolution of gene synteny within reconciled phylogenies
2015-01-01
Background Most models of genome evolution concern either genetic sequences, gene content or gene order. They sometimes integrate two of the three levels, but rarely all three. Probabilistic models of gene order evolution usually have to assume constant gene content or adopt a presence/absence coding of gene neighborhoods which is blind to complex events modifying gene content. Results We propose a probabilistic evolutionary model for gene neighborhoods, allowing genes to be inserted, duplicated or lost. It uses reconciled phylogenies, which integrate sequence and gene content evolution. We are then able to optimize parameters such as phylogeny branch lengths, or probabilistic laws depicting the diversity of susceptibility of syntenic regions to rearrangements. We reconstruct a structure for ancestral genomes by optimizing a likelihood, keeping track of all evolutionary events at the level of gene content and gene synteny. Ancestral syntenies are associated with a probability of presence. We implemented the model with the restriction that at most one gene duplication separates two gene speciations in reconciled gene trees. We reconstruct ancestral syntenies on a set of 12 Drosophila genomes, and compare the evolutionary rates along the branches and along the sites. We compare with a parsimony method and find a significant number of results not supported by the posterior probability. The model is implemented in the Bio++ library. It thus benefits from and enriches the classical models and methods for molecular evolution. PMID:26452018
Cyberinfrastructure for Fusarium (CiF)
USDA-ARS's Scientific Manuscript database
The rapidly increasing number of genome sequences from diverse fungal species and expanding phylogenetic data necessitate highly integrated informatics platforms to adequately support the use of these resources for studying fungal biology and evolution. The long-term goal of Cyberinfrastructure for...
Song, Jia; Zheng, Sisi; Nguyen, Nhung; Wang, Youjun; Zhou, Yubin; Lin, Kui
2017-10-03
Because phylogenetic inference is an important basis for answering many evolutionary problems, a large number of algorithms have been developed. Some of these algorithms have been improved by integrating gene evolution models with the expectation of accommodating the hierarchy of evolutionary processes. To the best of our knowledge, however, there still is no single unifying model or algorithm that can take all evolutionary processes into account through a stepwise or simultaneous method. On the basis of three existing phylogenetic inference algorithms, we built an integrated pipeline for inferring the evolutionary history of a given gene family; this pipeline can model gene sequence evolution, gene duplication-loss, gene transfer and multispecies coalescent processes. As a case study, we applied this pipeline to the STIMATE (TMEM110) gene family, which has recently been reported to play an important role in store-operated Ca2+ entry (SOCE) mediated by ORAI and STIM proteins. We inferred their phylogenetic trees in 69 sequenced chordate genomes. By integrating three tree reconstruction algorithms with diverse evolutionary models, a pipeline for inferring the evolutionary history of a gene family was developed, and its application was demonstrated.
Integration and macroevolutionary patterns in the pollination biology of conifers.
Leslie, Andrew B; Beaulieu, Jeremy M; Crane, Peter R; Knopf, Patrick; Donoghue, Michael J
2015-06-01
Integration influences patterns of trait evolution, but the relationship between these patterns and the degree of trait integration is not well understood. To explore this further, we study a specialized pollination mechanism in conifers whose traits are linked through function but not development. This mechanism depends on interactions among three characters: pollen that is buoyant, ovules that face downward at pollination, and the production of a liquid droplet that buoyant grains float through to enter the ovule. We use a well-sampled phylogeny of conifers to test correlated evolution among these characters and specific sequences of character change. Using likelihood models of character evolution, we find that pollen morphology and ovule characters evolve in a concerted manner, where the flotation mechanism breaks down irreversibly following changes in orientation or drop production. The breakdown of this functional constraint, which may be facilitated by the lack of developmental integration among the constituent traits, is associated with increased trait variation and more diverse pollination strategies. Although this functional "release" increases diversity in some ways, the irreversible way in which the flotation mechanism is lost may eventually result in its complete disappearance from seed plant reproductive biology. © 2015 The Author(s). Evolution © 2015 The Society for the Study of Evolution.
Orbital evolution of some Centaurs
NASA Astrophysics Data System (ADS)
Kovalenko, Nataliya; Babenko, Yuri; Churyumov, Klim
2002-11-01
In this work we investigated the dynamical evolution of the Centaur objects 2060 (Chiron), 5145 (Pholus), 7066 (Nessus), 8405 (Asbolus), 10199 (Chariklo), 10370 (Hylonome), and the Scattered-Disk object 15874. We carried out orbital integrations of test particles with initial orbits similar to those of these objects. Calculations were performed for +/-600 kyr-10 Myr starting at epoch, using the implicit single-sequence Everhart method. Twelve variational orbits for each of the selected Centaurs were also numerically integrated for +/-200 kyr into the past and the future. The most probable paths were traced up to +/-1 Myr. The character of the changes in orbital elements and the peculiarities of close approaches to the giant planets are discussed.
Kawano, Yasuhiro; Neeley, Shane; Adachi, Kei; Nakai, Hiroyuki
2013-01-01
Overlapping open reading frames (ORFs) in viral genomes undergo co-evolution; however, how individual amino acids coded by overlapping ORFs are structurally, functionally, and co-evolutionarily constrained remains difficult to address by conventional homologous sequence alignment approaches. We report here a new experimental and computational evolution-based methodology to address this question and report its preliminary application to elucidating a mode of co-evolution of the frame-shifted overlapping ORFs in the adeno-associated virus (AAV) serotype 2 viral genome. These ORFs encode both capsid VP protein and non-structural assembly-activating protein (AAP). To show proof of principle of the new method, we focused on the evolutionarily conserved QVKEVTQ and KSKRSRR motifs, a pair of overlapping heptapeptides in VP and AAP, respectively. In the new method, we first identified a large number of capsid-forming VP3 mutants and functionally competent AAP mutants of these motifs from mutant libraries by experimental directed evolution under no co-evolutionary constraints. We used Illumina sequencing to obtain a large dataset and then statistically assessed the viability of VP and AAP heptapeptide mutants. The obtained heptapeptide information was then integrated into an evolutionary algorithm, with which VP and AAP were co-evolved from random or native nucleotide sequences in silico. As a result, we demonstrate that these two heptapeptide motifs could exhibit high degeneracy if coded by separate nucleotide sequences, and elucidate how overlap-evoked co-evolutionary constraints play a role in making the VP and AAP heptapeptide sequences into the present shape. 
Specifically, we demonstrate that two valine (V) residues and β-strand propensity in QVKEVTQ are structurally important, the strongly basic and hydrophilic nature of KSKRSRR is functionally important, and overlap-evoked co-evolution imposes strong constraints on serine (S) residues in KSKRSRR, despite high degeneracy of the motifs in the absence of co-evolutionary constraints.
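To make the idea of frame-shifted overlapping ORFs concrete, the following minimal sketch translates one nucleotide string in two reading frames using the standard genetic code. The 22-nt sequence is an illustrative construct chosen so that frame 0 yields QVKEVTQ and frame +1 yields KSKRSRR; it is an assumption for demonstration and is not asserted to be the AAV2 genome sequence, and the helper names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna, frame=0):
    """Translate complete codons of `dna`, starting at offset `frame`."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


def substitute(dna, position, base):
    """Return `dna` with the nucleotide at 0-based `position` replaced."""
    return dna[:position] + base + dna[position + 1:]


# Illustrative overlapping segment (assumption, see lead-in).
segment = "CAAGTCAAAGAGGTCACGCAGA"
print(translate(segment, 0), translate(segment, 1))

# One substitution: silent in frame 0 (GTC -> GTT, both valine),
# but it changes serine to leucine in frame +1 (TCA -> TTA).
mutant = substitute(segment, 5, "T")
print(translate(mutant, 0), translate(mutant, 1))
```

The second print shows why the two motifs cannot evolve independently: a change that is neutral for one protein can still alter the other, which is the kind of constraint the abstract calls overlap-evoked co-evolution.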
ERIC Educational Resources Information Center
Maloney, Peter C.; Wilson, T. Hastings
1985-01-01
Constructs an evolutionary sequence to account for the diversity of ion pumps found today. Explanations include primary ion pumps in bacteria, features and distribution of ATP-driven pumps, preference for cation transport, and proton pump reversal. The integrated evolutionary hypothesis should encourage new experimental approaches. (DH)
EGenBio: A Data Management System for Evolutionary Genomics and Biodiversity
Nahum, Laila A; Reynolds, Matthew T; Wang, Zhengyuan O; Faith, Jeremiah J; Jonna, Rahul; Jiang, Zhi J; Meyer, Thomas J; Pollock, David D
2006-01-01
Background Evolutionary genomics requires management and filtering of large numbers of diverse genomic sequences for accurate analysis and inference on evolutionary processes of genomic and functional change. We developed Evolutionary Genomics and Biodiversity (EGenBio; ) to begin to address this. Description EGenBio is a system for manipulation and filtering of large numbers of sequences, integrating curated sequence alignments and phylogenetic trees, managing evolutionary analyses, and visualizing their output. EGenBio is organized into three conceptual divisions, Evolution, Genomics, and Biodiversity. The Genomics division includes tools for selecting pre-aligned sequences from different genes and species, and for modifying and filtering these alignments for further analysis. Species searches are handled through queries that can be modified based on a tree-based navigation system and saved. The Biodiversity division contains tools for analyzing individual sequences or sequence alignments, whereas the Evolution division contains tools involving phylogenetic trees. Alignments are annotated with analytical results and modification history using our PRAED format. A miscellaneous Tools section and Help framework are also available. EGenBio was developed around our comparative genomic research and a prototype database of mtDNA genomes. It utilizes MySQL-relational databases and dynamic page generation, and calls numerous custom programs. Conclusion EGenBio was designed to serve as a platform for tools and resources to ease combined analysis in evolution, genomics, and biodiversity. PMID:17118150
Integrative studies of cultural evolution: crossing disciplinary boundaries to produce new insights.
Kolodny, Oren; Feldman, Marcus W; Creanza, Nicole
2018-04-05
Culture evolves according to dynamics on multiple temporal scales, from individuals' minute-by-minute behaviour to millennia of cultural accumulation that give rise to population-level differences. These dynamics act on a range of entities (including behavioural sequences, ideas and artefacts as well as individuals, populations and whole species) and involve mechanisms at multiple levels, from neurons in brains to inter-population interactions. Studying such complex phenomena requires an integration of perspectives from a diverse array of fields, as well as bridging gaps between traditionally disparate areas of study. In this article, which also serves as an introduction to the current special issue, we highlight some specific respects in which the study of cultural evolution has benefited and should continue to benefit from an integrative approach. We showcase a number of pioneering studies of cultural evolution that bring together numerous disciplines. These studies illustrate the value of perspectives from different fields for understanding cultural evolution, such as cognitive science and neuroanatomy, behavioural ecology, population dynamics, and evolutionary genetics. They also underscore the importance of understanding cultural processes when interpreting research about human genetics, neuroscience, behaviour and evolution. This article is part of the theme issue 'Bridging cultural gaps: interdisciplinary studies in human cultural evolution'. © 2018 The Author(s).
Moroz, Leonid L.
2015-01-01
The origins of neural systems and centralized brains are one of the major transitions in evolution. These events might have occurred more than once over 570–600 million years. The convergent evolution of neural circuits is evident from a diversity of unique adaptive strategies implemented by ctenophores, cnidarians, acoels, molluscs, and basal deuterostomes. However, further integration of biodiversity research and neuroscience is required to decipher critical events leading to the development of complex integrative and cognitive functions. Here, we outline reference species and interdisciplinary approaches in reconstructing the evolution of nervous systems. In the “omic” era, it is now possible to establish fully functional genomics laboratories aboard oceanic ships and perform sequencing and real-time analyses of data at any oceanic location (named here as Ship-Seq). In doing so, fragile, rare, cryptic, and planktonic organisms, or even entire marine ecosystems, are becoming accessible directly to experimental and physiological analyses by modern analytical tools. Thus, we are now in a position to take full advantage of the countless “experiments” Nature performed for us in the course of 3.5 billion years of biological evolution. Together with progress in computational and comparative genomics, evolutionary neuroscience, proteomic and developmental biology, a new surprising picture is emerging that reveals the many ways in which nervous systems evolved. As a result, this symposium provides a unique opportunity to revisit old questions about the origins of biological complexity. PMID:26163680
Chang, Suhua; Zhang, Jiajie; Liao, Xiaoyun; Zhu, Xinxing; Wang, Dahai; Zhu, Jiang; Feng, Tao; Zhu, Baoli; Gao, George F; Wang, Jian; Yang, Huanming; Yu, Jun; Wang, Jing
2007-01-01
Frequent outbreaks of highly pathogenic avian influenza and the increasing data available for comparative analysis require a central database specialized in influenza viruses (IVs). We have established the Influenza Virus Database (IVDB) to integrate information and create an analysis platform for genetic, genomic, and phylogenetic studies of the virus. IVDB hosts complete genome sequences of influenza A virus generated by Beijing Institute of Genomics (BIG) and curates all other published IV sequences after expert annotation. Our Q-Filter system classifies and ranks all nucleotide sequences into seven categories according to sequence content and integrity. IVDB provides a series of tools and viewers for comparative analysis of the viral genomes, genes, genetic polymorphisms and phylogenetic relationships. A search system has been developed for users to retrieve a combination of different data types by setting search options. To facilitate analysis of global viral transmission and evolution, the IV Sequence Distribution Tool (IVDT) has been developed to display the worldwide geographic distribution of chosen viral genotypes and to couple genomic data with epidemiological data. The BLAST, multiple sequence alignment and phylogenetic analysis tools were integrated for online data analysis. Furthermore, IVDB offers instant access to pre-computed alignments and polymorphisms of IV genes and proteins, and presents the results as SNP distribution plots and minor allele distributions. IVDB is publicly available at http://influenza.genomics.org.cn.
Sequence space and the ongoing expansion of the protein universe.
Povolotskaya, Inna S; Kondrashov, Fyodor A
2010-06-17
The need to maintain the structural and functional integrity of an evolving protein severely restricts the repertoire of acceptable amino-acid substitutions. However, it is not known whether these restrictions impose a global limit on how far homologous protein sequences can diverge from each other. Here we explore the limits of protein evolution using sequence divergence data. We formulate a computational approach to study the rate of divergence of distant protein sequences and measure this rate for ancient proteins, those that were present in the last universal common ancestor. We show that ancient proteins are still diverging from each other, indicating an ongoing expansion of the protein sequence universe. The slow rate of this divergence is imposed by the sparseness of functional protein sequences in sequence space and the ruggedness of the protein fitness landscape: approximately 98 per cent of sites cannot accept an amino-acid substitution at any given moment but a vast majority of all sites may eventually be permitted to evolve when other, compensatory, changes occur. Thus, approximately 3.5 × 10^9 yr has not been enough to reach the limit of divergent evolution of proteins, and for most proteins the limit of sequence similarity imposed by common function may not exceed that of random sequences.
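The comparison with "random sequences" in this abstract can be illustrated with a small sketch: one function measures the fraction of identical sites between two aligned sequences, and another gives the identity expected by chance for two unrelated sequences drawn from the same residue frequencies (the sum of squared frequencies). This is a generic illustration under simple assumptions (pre-aligned input, gaps written as "-", independent sites) and is not the computational approach used by the authors; the function names and example sequences are hypothetical.

```python
def fraction_identical(seq_a, seq_b, gap="-"):
    """Fraction of aligned, ungapped positions with the same residue."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not compared:
        raise ValueError("no ungapped positions to compare")
    return sum(a == b for a, b in compared) / len(compared)


def expected_random_identity(frequencies):
    """Chance identity at one site for two independent random sequences
    sharing the given residue frequencies: sum of p_i squared."""
    return sum(p * p for p in frequencies.values())


# Toy alignment: 4 ungapped columns, 3 of them identical.
print(fraction_identical("MKT-LV", "MRTAL-"))

# Uniform usage of the 20 amino acids gives a chance identity of 1/20.
uniform = {aa: 1 / 20 for aa in "ACDEFGHIKLMNPQRSTVWY"}
print(round(expected_random_identity(uniform), 4))
```

With skewed residue frequencies the chance identity rises above 1/20, so any observed identity should be judged against the baseline for the composition at hand.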
Single-cell sequencing and tumorigenesis: improved understanding of tumor evolution and metastasis.
Ellsworth, Darrell L; Blackburn, Heather L; Shriver, Craig D; Rabizadeh, Shahrooz; Soon-Shiong, Patrick; Ellsworth, Rachel E
2017-12-01
Extensive genomic and transcriptomic heterogeneity in human cancer often negatively impacts treatment efficacy and survival, thus posing a significant ongoing challenge for modern treatment regimens. State-of-the-art DNA- and RNA-sequencing methods now provide high-resolution genomic and gene expression portraits of individual cells, facilitating the study of complex molecular heterogeneity in cancer. Important developments in single-cell sequencing (SCS) technologies over the past 5 years provide numerous advantages over traditional sequencing methods for understanding the complexity of carcinogenesis, but significant hurdles must be overcome before SCS can be clinically useful. In this review, we: (1) highlight current methodologies and recent technological advances for isolating single cells, single-cell whole-genome and whole-transcriptome amplification using minute amounts of nucleic acids, and SCS, (2) summarize research investigating molecular heterogeneity at the genomic and transcriptomic levels and how this heterogeneity affects clonal evolution and metastasis, and (3) discuss the promise for integrating SCS in the clinical care arena for improved patient care.
Bioinformatics and genomic analysis of transposable elements in eukaryotic genomes.
Janicki, Mateusz; Rooke, Rebecca; Yang, Guojun
2011-08-01
A major portion of most eukaryotic genomes are transposable elements (TEs). During evolution, TEs have introduced profound changes to genome size, structure, and function. As integral parts of genomes, the dynamic presence of TEs will continue to be a major force in reshaping genomes. Early computational analyses of TEs in genome sequences focused on filtering out "junk" sequences to facilitate gene annotation. When the high abundance and diversity of TEs in eukaryotic genomes were recognized, these early efforts transformed into the systematic genome-wide categorization and classification of TEs. The availability of genomic sequence data reversed the classical genetic approaches to discovering new TE families and superfamilies. Curated TE databases and their accurate annotation of genome sequences in turn facilitated the studies on TEs in a number of frontiers including: (1) TE-mediated changes of genome size and structure, (2) the influence of TEs on genome and gene functions, (3) TE regulation by host, (4) the evolution of TEs and their population dynamics, and (5) genomic scale studies of TE activity. Bioinformatics and genomic approaches have become an integral part of large-scale studies on TEs to extract information with pure in silico analyses or to assist wet lab experimental studies. The current revolution in genome sequencing technology facilitates further progress in the existing frontiers of research and emergence of new initiatives. The rapid generation of large-sequence datasets at record low costs on a routine basis is challenging the computing industry on storage capacity and manipulation speed and the bioinformatics community for improvement in algorithms and their implementations.
Galián, José A; Rosato, Marcela; Rosselló, Josep A
2014-03-01
Multigene families have provided opportunities for evolutionary biologists to assess molecular evolution processes and phylogenetic reconstructions at deep and shallow systematic levels. However, the use of these markers is not free of technical and analytical challenges. Many evolutionary studies that used the nuclear 5S rDNA gene family rarely used contiguous 5S coding sequences due to the routine use of head-to-tail polymerase chain reaction primers that are anchored to the coding region. Moreover, the 5S coding sequences have been concatenated with independent, adjacent gene units in many studies, creating simulated chimeric genes as the raw data for evolutionary analysis. This practice is based on the tacitly assumed, but rarely tested, hypothesis that strict intra-locus concerted evolution processes are operating in 5S rDNA genes, without any empirical evidence as to whether it holds for the recovered data. The potential pitfalls of analysing the patterns of molecular evolution and reconstructing phylogenies based on these chimeric genes have not been assessed to date. Here, we compared the sequence integrity and phylogenetic behavior of entire versus concatenated 5S coding regions from a real data set obtained from closely related plant species (Medicago, Fabaceae). Our results suggest that within arrays sequence homogenization is partially operating in the 5S coding region, which is traditionally assumed to be highly conserved. Consequently, concatenating 5S genes increases haplotype diversity, generating novel chimeric genotypes that most likely do not exist within the genome. In addition, the patterns of gene evolution are distorted, leading to incorrect haplotype relationships in some evolutionary reconstructions.
2011-01-01
Background A robust bacterial artificial chromosome (BAC)-based physical map is essential for many aspects of genomics research, including an understanding of chromosome evolution, high-resolution genome mapping, marker-assisted breeding, positional cloning of genes, and quantitative trait analysis. To facilitate turkey genetics research and better understand avian genome evolution, a BAC-based integrated physical, genetic, and comparative map was developed for this important agricultural species. Results The turkey genome physical map was constructed based on 74,013 BAC fingerprints (11.9 × coverage) from two independent libraries, and it was integrated with the turkey genetic map and chicken genome sequence using over 41,400 BAC assignments identified by 3,499 overgo hybridization probes along with > 43,000 BAC end sequences. The physical-comparative map consists of 74 BAC contigs, with an average contig size of 13.6 Mb. All but four of the turkey chromosomes were spanned on this map by three or fewer contigs, with 14 chromosomes spanned by a single contig and nine chromosomes spanned by two contigs. This map predicts 20 to 27 major rearrangements distinguishing turkey and chicken chromosomes, despite up to 40 million years of separate evolution between the two species. These data elucidate the chromosomal evolutionary pattern within the Phasianidae that led to the modern turkey and chicken karyotypes. The predominant rearrangement mode involves intra-chromosomal inversions, and there is a clear bias for these to result in centromere locations at or near telomeres in turkey chromosomes, in comparison to interstitial centromeres in the orthologous chicken chromosomes. Conclusion The BAC-based turkey-chicken comparative map provides novel insights into the evolution of avian genomes, a framework for assembly of turkey whole genome shotgun sequencing data, and tools for enhanced genetic improvement of these important agricultural and model species. PMID:21906286
Evolution dynamics of a model for gene duplication under adaptive conflict
NASA Astrophysics Data System (ADS)
Ancliff, Mark; Park, Jeong-Man
2014-06-01
We present and solve the dynamics of a model for gene duplication showing escape from adaptive conflict. We use a Crow-Kimura quasispecies model of evolution where the fitness landscape is a function of Hamming distances from two reference sequences, which are assumed to optimize two different gene functions, to describe the dynamics of a mixed population of individuals with single and double copies of a pleiotropic gene. The evolution equations are solved through a spin coherent state path integral, and we find two phases: one is an escape from an adaptive conflict phase, where each copy of a duplicated gene evolves toward subfunctionalization, and the other is a duplication loss of function phase, where one copy maintains its pleiotropic form and the other copy undergoes neutral mutation. The phase is determined by a competition between the fitness benefits of subfunctionalization and the greater mutational load associated with maintaining two gene copies. In the escape phase, we find a dynamics of an initial population of single gene sequences only which escape adaptive conflict through gene duplication and find that there are two time regimes: until a time t* single gene sequences dominate, and after t* double gene sequences outgrow single gene sequences. The time t* is identified as the time necessary for subfunctionalization to evolve and spread throughout the double gene sequences, and we show that there is an optimum mutation rate which minimizes this time scale.
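The idea of a fitness landscape defined by Hamming distances from two reference sequences can be illustrated with a small sketch. This is not the Crow-Kimura model or the path-integral solution from the abstract; the scoring functions `match_score`, `fitness_single` and `fitness_double`, and the `copy_cost` parameter (a crude stand-in for the extra mutational load of carrying two copies), are hypothetical choices made only for illustration.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def match_score(seq: str, ref: str) -> float:
    """Hypothetical per-function score: fraction of sites matching the reference."""
    return 1.0 - hamming(seq, ref) / len(ref)


def fitness_single(seq: str, ref_a: str, ref_b: str) -> float:
    """A single pleiotropic gene has to serve both functions with one sequence."""
    return match_score(seq, ref_a) + match_score(seq, ref_b)


def fitness_double(copy1: str, copy2: str, ref_a: str, ref_b: str,
                   copy_cost: float = 0.0) -> float:
    """With two copies, each function is served by whichever copy suits it best.

    copy_cost is an assumed flat penalty for maintaining a second copy.
    """
    best_a = max(match_score(copy1, ref_a), match_score(copy2, ref_a))
    best_b = max(match_score(copy1, ref_b), match_score(copy2, ref_b))
    return best_a + best_b - copy_cost
```

With references `"AAAA"` and `"TTTT"`, a single compromise gene `"AATT"` matches each reference at half of its sites, whereas a duplicated pair `("AAAA", "TTTT")` can match both fully. In this toy setting, duplication pays off whenever the gain from specialization exceeds `copy_cost`, which mirrors the competition between subfunctionalization benefits and mutational load described above.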
Bentolila, Stéphane; Stefanov, Stefan
2012-01-01
Plant mitochondrial genomes have features that distinguish them radically from their animal counterparts: a high rate of rearrangement, of uptake and loss of DNA sequences, and an extremely low point mutation rate. Perhaps the most unique structural feature of plant mitochondrial DNAs is the presence of large repeated sequences involved in intramolecular and intermolecular recombination. In addition, rare recombination events can occur across shorter repeats, creating rearrangements that result in aberrant phenotypes, including pollen abortion, which is known as cytoplasmic male sterility (CMS). Using next-generation sequencing, we pyrosequenced two rice (Oryza sativa) mitochondrial genomes that belong to the indica subspecies. One genome is normal, while the other carries the wild abortive-CMS. We find that numerous rearrangements in the rice mitochondrial genome occur even between close cytotypes during rice evolution. Unlike maize (Zea mays), a closely related species also belonging to the grass family, integration of plastid sequences did not play a role in the sequence divergence between rice cytotypes. This study also uncovered an excellent candidate for the wild abortive-CMS-encoding gene; like most of the CMS-associated open reading frames that are known in other species, this candidate was created via a rearrangement, is chimeric in structure, possesses predicted transmembrane domains, and coopted the promoter of a genuine mitochondrial gene. Our data give new insights into rice mitochondrial evolution, correcting previous reports. PMID:22128137
NASA Astrophysics Data System (ADS)
Kovalenko, N.; Churyumov, K.; Babenko, Yu.
2002-01-01
Chiron, comet 39P/Oterma and comet 29P/Schwassmann-Wachmann 1 are discussed. The orbital evolution of the chosen objects was traced 1 million years backward and forward from the present time. For the numerical integration, a program based on the Everhart implicit single-sequence methods for integrating orbits was used. perturbations from the giant planets and very chaotic. It is now believed that Centaurs could be captured from the Kuiper Belt and in the future transform into short-period comets. Currently more than 20 Centaurs are known. Cometary activity has so far been detected in one of them (2060 Chiron). simulated the past and future orbital evolution of active Centaur 2060 (95P) Chiron and two distant Jupiter-family comets with perihelia and aphelia similar to those of Centaurs: comets 39P/Oterma and 29P/Schwassmann-Wachmann 1. Only our knowledge gathered from the Earth-based observations, orbital evolution investigations and future spacecraft missions will solve this problem.
Integrative studies of cultural evolution: crossing disciplinary boundaries to produce new insights
2018-01-01
Culture evolves according to dynamics on multiple temporal scales, from individuals' minute-by-minute behaviour to millennia of cultural accumulation that give rise to population-level differences. These dynamics act on a range of entities—including behavioural sequences, ideas and artefacts as well as individuals, populations and whole species—and involve mechanisms at multiple levels, from neurons in brains to inter-population interactions. Studying such complex phenomena requires an integration of perspectives from a diverse array of fields, as well as bridging gaps between traditionally disparate areas of study. In this article, which also serves as an introduction to the current special issue, we highlight some specific respects in which the study of cultural evolution has benefited and should continue to benefit from an integrative approach. We showcase a number of pioneering studies of cultural evolution that bring together numerous disciplines. These studies illustrate the value of perspectives from different fields for understanding cultural evolution, such as cognitive science and neuroanatomy, behavioural ecology, population dynamics, and evolutionary genetics. They also underscore the importance of understanding cultural processes when interpreting research about human genetics, neuroscience, behaviour and evolution. This article is part of the theme issue ‘Bridging cultural gaps: interdisciplinary studies in human cultural evolution’. PMID:29440515
Yohn, Chris T; Jiang, Zhaoshi; McGrath, Sean D; Hayden, Karen E; Khaitovich, Philipp; Johnson, Matthew E; Eichler, Marla Y; McPherson, John D; Zhao, Shaying; Pääbo, Svante; Eichler, Evan E
2005-04-01
Retroviral infections of the germline have the potential to episodically alter gene function and genome structure during the course of evolution. Horizontal transmissions between species have been proposed, but little evidence exists for such events in the human/great ape lineage of evolution. Based on analysis of finished BAC chimpanzee genome sequence, we characterize a retroviral element (Pan troglodytes endogenous retrovirus 1 [PTERV1]) that has become integrated in the germline of African great ape and Old World monkey species but is absent from humans and Asian ape genomes. We unambiguously map 287 retroviral integration sites and determine that approximately 95.8% of the insertions occur at non-orthologous regions between closely related species. Phylogenetic analysis of the endogenous retrovirus reveals that the gorilla and chimpanzee elements share a monophyletic origin with a subset of the Old World monkey retroviral elements, but that the average sequence divergence exceeds neutral expectation for a strictly nuclear inherited DNA molecule. Within the chimpanzee, there is a significant integration bias against genes, with only 14 of these insertions mapping within intronic regions. Six out of ten of these genes, for which there are expression data, show significant differences in transcript expression between human and chimpanzee. Our data are consistent with a retroviral infection that bombarded the genomes of chimpanzees and gorillas independently and concurrently, 3-4 million years ago. We speculate on the potential impact of such recent events on the evolution of humans and great apes.
Biophysical and structural considerations for protein sequence evolution
2011-01-01
Background Protein sequence evolution is constrained by the biophysics of folding and function, causing interdependence between interacting sites in the sequence. However, current site-independent models of sequence evolution do not take this into account. Recent attempts to integrate the influence of structure and biophysics into phylogenetic models via statistical/informational approaches have not resulted in expected improvements in model performance. This suggests that further innovations are needed for progress in this field. Results Here we develop a coarse-grained physics-based model of protein folding and binding function, and compare it to a popular informational model. We find that both models violate the assumption of the native sequence being close to a thermodynamic optimum, causing directional selection away from the native state. Sampling and simulation show that the physics-based model is more specific for fold-defining interactions that vary less among residue type. The informational model diffuses further in sequence space with fewer barriers and tends to provide less support for an invariant sites model, although amino acid substitutions are generally conservative. Both approaches produce sequences with natural features like dN/dS < 1 and gamma-distributed rates across sites. Conclusions Simple coarse-grained models of protein folding can describe some natural features of evolving proteins but are currently not accurate enough to use in evolutionary inference. This is partly due to improper packing of the hydrophobic core. We suggest possible improvements on the representation of structure, folding energy, and binding function, as regards both native and non-native conformations, and describe a large number of possible applications for such a model. PMID:22171550
Williams, Kelly P.
2003-01-01
A partial screen for genetic elements integrated into completely sequenced bacterial genomes shows more significant bias in specificity for the tmRNA gene (ssrA) than for any type of tRNA gene. Horizontal gene transfer, a major avenue of bacterial evolution, was assessed by focusing on elements using this single attachment locus. Diverse elements use ssrA; among enterobacteria alone, at least four different integrase subfamilies have independently evolved specificity for ssrA, and almost every strain analyzed presents a unique set of integrated elements. Even elements using essentially the same integrase can be very diverse, as is a group with an ssrA-specific integrase of the P4 subfamily. This same integrase appears to promote damage routinely at attachment sites, which may be adaptive. Elements in arrays can recombine; one such event mediated by invertible DNA segments within neighboring elements likely explains the monophasic nature of Salmonella enterica serovar Typhi. One of a limited set of conserved sequences occurs at the attachment site of each enterobacterial element, apparently serving as a transcriptional terminator for ssrA. Elements were usually found integrated into tRNA-like sequence at the 3′ end of ssrA, at subsites corresponding to those used in tRNA genes; an exception was found at the non-tRNA-like 3′ end produced by ssrA gene permutation in cyanobacteria, suggesting that, during the evolution of new site specificity by integrases, tropism toward a conserved 3′ end of an RNA gene may be as strong as toward a tRNA-like sequence. The proximity of ssrA and smpB, which act in concert, was also surveyed. PMID:12533482
Serohijos, Adrian W.R.; Shakhnovich, Eugene I.
2014-01-01
The variation among sequences and structures in nature is determined both by physical laws and by evolutionary history. However, these two factors are traditionally investigated by disciplines with different emphases and philosophies: molecular biophysics on the one hand and evolutionary population genetics on the other. Here, we review recent theoretical and computational approaches that address the critical need to integrate these two disciplines. We first articulate the elements of these integrated approaches. Then, we survey their contribution to our mechanistic understanding of molecular evolution, the polymorphisms in coding regions, the distribution of fitness effects (DFE) of mutations, the observed folding stability of proteins in nature, and the distribution of protein folds in genomes. PMID:24952216
The Public Goods Hypothesis for the evolution of life on Earth.
McInerney, James O; Pisani, Davide; Bapteste, Eric; O'Connell, Mary J
2011-08-23
It is becoming increasingly difficult to reconcile the observed extent of horizontal gene transfers with the central metaphor of a great tree uniting all evolving entities on the planet. In this manuscript we describe the Public Goods Hypothesis and show that it is appropriate in order to describe biological evolution on the planet. According to this hypothesis, nucleotide sequences (genes, promoters, exons, etc.) are simply seen as goods, passed from organism to organism through both vertical and horizontal transfer. Public goods sequences are defined by having the properties of being largely non-excludable (no organism can be effectively prevented from accessing these sequences) and non-rival (while such a sequence is being used by one organism it is also available for use by another organism). The universal nature of genetic systems ensures that such non-excludable sequences exist and non-excludability explains why we see a myriad of genes in different combinations in sequenced genomes. There are three features of the public goods hypothesis. Firstly, segments of DNA are seen as public goods, available for all organisms to integrate into their genomes. Secondly, we expect the evolution of mechanisms for DNA sharing and of defense mechanisms against DNA intrusion in genomes. Thirdly, we expect that we do not see a global tree-like pattern. Instead, we expect local tree-like patterns to emerge from the combination of a commonage of genes and vertical inheritance of genomes by cell division. Indeed, while genes are theoretically public goods, in reality, some genes are excludable, particularly, though not only, when they have variant genetic codes or behave as coalition or club goods, available for all organisms of a coalition to integrate into their genomes, and non-rival within the club.
We view the Tree of Life hypothesis as a regionalized instance of the Public Goods hypothesis, just like classical mechanics and euclidean geometry are seen as regionalized instances of quantum mechanics and Riemannian geometry respectively. We argue for this change using an axiomatic approach that shows that the Public Goods hypothesis is a better accommodation of the observed data than the Tree of Life hypothesis. PMID:21861918
Islander: A database of precisely mapped genomic islands in tRNA and tmRNA genes
Hudson, Corey M.; Lau, Britney Y.; Williams, Kelly P.
2014-11-05
Genomic islands are mobile DNAs that are major agents of bacterial and archaeal evolution. Integration into prokaryotic chromosomes usually occurs site-specifically at tRNA or tmRNA gene (together, tDNA) targets, catalyzed by tyrosine integrases. This splits the target gene, yet sequences within the island restore the disrupted gene; the regenerated target and its displaced fragment precisely mark the endpoints of the island. We applied this principle to search for islands in genomic DNA sequences. Our algorithm identifies tDNAs, finds fragments of those tDNAs in the same replicon and removes unlikely candidate islands through a series of filters. A search for islands in 2168 whole prokaryotic genomes produced 3919 candidates. The website Islander (recently moved to http://bioinformatics.sandia.gov/islander/) presents these precisely mapped candidate islands, the gene content and the island sequence. The algorithm further insists that each island encode an integrase, and attachment site sequence identity is carefully noted; therefore, the database also serves in the study of integrase site-specificity and its evolution.
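The principle described above (an intact tDNA at one end of an island and a displaced fragment of the same tDNA at the other) can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the Islander algorithm: the function `find_candidate_islands`, its parameters `frag_len` and `max_island`, the forward-strand-only exact-match search, and the single length filter are all hypothetical.

```python
def find_candidate_islands(replicon: str, tdna_start: int, tdna_end: int,
                           frag_len: int = 8, max_island: int = 200_000):
    """Return candidate (start, end) intervals for one intact tDNA.

    replicon   : DNA sequence of a single replicon (forward strand only).
    tdna_start : 0-based start of the intact tDNA.
    tdna_end   : 0-based, exclusive end of the intact tDNA.
    frag_len   : assumed length of the 3' fragment displaced by integration.
    max_island : assumed upper bound on island length, used as a crude filter.

    Each interval runs from the start of the regenerated 3' fragment inside
    the tDNA to the start of the displaced copy further downstream; this
    coordinate convention is chosen here only for illustration.
    """
    if tdna_end - tdna_start < frag_len:
        raise ValueError("tDNA is shorter than the fragment length")
    fragment = replicon[tdna_end - frag_len:tdna_end]
    candidates = []
    pos = replicon.find(fragment, tdna_end)      # look downstream of the tDNA
    while pos != -1:
        if pos - tdna_end <= max_island:         # drop implausibly long spans
            candidates.append((tdna_end - frag_len, pos))
        pos = replicon.find(fragment, pos + 1)   # continue with later matches
    return candidates
```

For a toy replicon built as `"GGGG" + "ACGTACGTTTCCAGGA" + "C" * 30 + "TTCCAGGA" + "GGGG"`, with the tDNA at coordinates 4 to 20, the function reports one candidate interval, `(12, 50)`. A fuller treatment would also search the reverse strand, tolerate mismatches in the fragment, and apply further filters such as requiring an integrase gene within the candidate, as the abstract notes.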
Natural product-inspired cascade synthesis yields modulators of centrosome integrity.
Dückert, Heiko; Pries, Verena; Khedkar, Vivek; Menninger, Sascha; Bruss, Hanna; Bird, Alexander W; Maliga, Zoltan; Brockmeyer, Andreas; Janning, Petra; Hyman, Anthony; Grimme, Stefan; Schürmann, Markus; Preut, Hans; Hübel, Katja; Ziegler, Slava; Kumar, Kamal; Waldmann, Herbert
2011-12-25
In biology-oriented synthesis, the scaffolds of biologically relevant compound classes inspire the synthesis of focused compound collections enriched in bioactivity. This criterion is, in particular, met by the scaffolds of natural products selected in evolution. The synthesis of natural product-inspired compound collections calls for efficient reaction sequences that preferably combine multiple individual transformations in one operation. Here we report the development of a one-pot, twelve-step cascade reaction sequence that includes nine different reactions and two opposing kinds of organocatalysis. The cascade sequence proceeds within 10-30 min and transforms readily available substrates into complex indoloquinolizines that resemble the core tetracyclic scaffold of numerous polycyclic indole alkaloids. Biological investigation of a corresponding focused compound collection revealed modulators of centrosome integrity, termed centrocountins, which caused fragmented and supernumerary centrosomes, chromosome congression defects, multipolar mitotic spindles, acentrosomal spindle poles and multipolar cell division by targeting the centrosome-associated proteins nucleophosmin and Crm1.
Vincent, Antony T; Trudel, Mélanie V; Freschi, Luca; Nagar, Vandan; Gagné-Thivierge, Cynthia; Levesque, Roger C; Charette, Steve J
2016-01-12
Aeromonads make up a group of Gram-negative bacteria that includes human and fish pathogens. The Aeromonas salmonicida species has the peculiarity of including five known subspecies. However, few studies of the genomes of A. salmonicida subspecies have been reported to date. We sequenced the genomes of additional A. salmonicida isolates, including three from India, using next-generation sequencing in order to gain a better understanding of the genomic and phylogenetic links between A. salmonicida subspecies. Their relative phylogenetic positions were confirmed by a core genome phylogeny based on 1645 gene sequences. The Indian isolates, which formed a sub-group together with A. salmonicida subsp. pectinolytica, were able to grow at both 18 °C and 37 °C, unlike the A. salmonicida psychrophilic isolates that did not grow at 37 °C. Amino acid frequencies, GC content, tRNA composition, loss and gain of genes during evolution, pseudogenes as well as genes under positive selection and the mobilome were studied to explain this intraspecies dichotomy. Insertion sequences appeared to be an important driving force that locked the psychrophilic strains into their particular lifestyle in order to conserve their genomic integrity. This observation, based on comparative genomics, is in agreement with previous results showing that insertion sequence mobility induced by heat in A. salmonicida subspecies causes genomic plasticity, resulting in a deleterious effect on the virulence of the bacterium. We provide a proof-of-concept that selfish DNAs play a major role in the evolution of bacterial species by modeling genomes.
Hogeweg, Paulien
2012-01-01
Most of evolutionary theory has abstracted away from how information is coded in the genome and how this information is transformed into traits on which selection takes place. While in the earliest stages of biological evolution, in the RNA world, the mapping from the genotype into function was largely predefined by the physical-chemical properties of the evolving entities (RNA replicators, e.g. from sequence to folded structure and catalytic sites), in present-day organisms, the mapping itself is the result of evolution. I will review results of several in silico evolutionary studies which examine the consequences of evolving the genetic coding, and the ways this information is transformed, while adapting to prevailing environments. Such multilevel evolution leads to long-term information integration. Through genome, network, and dynamical structuring, the occurrence and/or effect of random mutations becomes nonrandom, and facilitates rapid adaptation. This is what does happen in the in silico experiments. Is it also what did happen in biological evolution? I will discuss some data that suggest that it did. In any case, these results provide us with novel search images to tackle the wealth of biological data.
A Web-Based Monitoring System for Multidisciplinary Design Projects
NASA Technical Reports Server (NTRS)
Rogers, James L.; Salas, Andrea O.; Weston, Robert P.
1998-01-01
In today's competitive environment, both industry and government agencies are under pressure to reduce the time and cost of multidisciplinary design projects. New tools have been introduced to assist in this process by facilitating the integration of and communication among diverse disciplinary codes. One such tool, a framework for multidisciplinary computational environments, is defined as a hardware and software architecture that enables integration, execution, and communication among diverse disciplinary processes. An examination of current frameworks reveals weaknesses in various areas, such as sequencing, displaying, monitoring, and controlling the design process. The objective of this research is to explore how Web technology, integrated with an existing framework, can improve these areas of weakness. This paper describes a Web-based system that optimizes and controls the execution sequence of design processes and monitors the project status and results. The three-stage evolution of the system with increasingly complex problems demonstrates the feasibility of this approach.
Tal, Asaf; Arbel-Goren, Rinat; Costantino, Nina; Court, Donald L; Stavans, Joel
2014-05-20
The search for specific sequences on long genomes is a key process in many biological contexts. How can specific target sequences be located with high efficiency, within physiologically relevant times? We addressed this question for viral integration, a fundamental mechanism of horizontal gene transfer driving prokaryotic evolution, using the infection of Escherichia coli bacteria with bacteriophage λ and following the establishment of a lysogenic state. Following the targeting process in individual live E. coli cells in real time revealed that λ DNA remains confined near the entry point of a cell following infection. The encounter between the 15-bp-long target sequence on the chromosome and the recombination site on the viral genome is facilitated by the directed motion of bacterial DNA generated during chromosome replication, in conjunction with constrained diffusion of phage DNA. Moving the native bacterial integration site to different locations on the genome and measuring the integration frequency in these strains reveals that the frequencies of the native site and a site symmetric to it relative to the origin are similar, whereas both are significantly higher than when the integration site is moved near the terminus, consistent with the replication-driven mechanism we propose. This novel search mechanism is yet another example of the exquisite coevolution of λ with its host.
Dong, Zheng; Zhou, Hongyu; Tao, Peng
2018-02-01
PAS domains are widespread in archaea, bacteria, and eukaryota, and play important roles in various functions. In this study, we aim to explore the functional evolutionary relationships among proteins in the PAS domain superfamily in view of the sequence-structure-dynamics-function relationship. We collected protein sequences and crystal structure data from the RCSB Protein Data Bank for PAS domain superfamily proteins belonging to three biological functions (nucleotide binding, photoreceptor activity, and transferase activity). Protein sequences were aligned and then used to select sequence-conserved residues and build a phylogenetic tree. Three-dimensional structure alignment was also applied to obtain structure-conserved residues. The protein dynamics were analyzed using an elastic network model (ENM) and validated by molecular dynamics (MD) simulation. The results showed that proteins with the same function could be grouped by sequence similarity, and proteins in different functional groups displayed statistically significant differences in their vibrational patterns. Interestingly, in all three functional groups, conserved amino acid residues identified by sequence and structure conservation analysis generally have a lower fluctuation than other residues. In addition, the fluctuation of conserved residues in each biological function group was strongly correlated with the corresponding biological function. This research suggested a direct connection in which the protein sequences were related to various functions through structural dynamics. This is a new attempt to delineate functional evolution of proteins using the integrated information of sequence, structure, and dynamics. © 2017 The Protein Society.
Moya, Claudio E; Raiber, Matthias; Taulis, Mauricio; Cox, Malcolm E
2015-03-01
The Galilee and Eromanga basins are sub-basins of the Great Artesian Basin (GAB). In this study, a multivariate statistical approach (hierarchical cluster analysis, principal component analysis and factor analysis) is carried out to identify hydrochemical patterns and assess the processes that control hydrochemical evolution within key aquifers of the GAB in these basins. The results of the hydrochemical assessment are integrated into a 3D geological model (previously developed) to support the analysis of spatial patterns of hydrochemistry, and to identify the hydrochemical and hydrological processes that control hydrochemical variability. In this area of the GAB, the hydrochemical evolution of groundwater is dominated by evapotranspiration near the recharge area resulting in a dominance of the Na-Cl water types. This is shown conceptually using two selected cross-sections which represent discrete groundwater flow paths from the recharge areas to the deeper parts of the basins. With increasing distance from the recharge area, a shift towards a dominance of carbonate (e.g. Na-HCO3 water type) has been observed. The assessment of hydrochemical changes along groundwater flow paths highlights how aquifers are separated in some areas, and how mixing between groundwater from different aquifers occurs elsewhere controlled by geological structures, including between GAB aquifers and coal bearing strata of the Galilee Basin. The results of this study suggest that distinct hydrochemical differences can be observed within the previously defined Early Cretaceous-Jurassic aquifer sequence of the GAB. A revision of the two previously recognised hydrochemical sequences is being proposed, resulting in three hydrochemical sequences based on systematic differences in hydrochemistry, salinity and dominant hydrochemical processes. 
The integrated approach presented in this study, which combines different complementary multivariate statistical techniques with a detailed assessment of the geological framework of these sedimentary basins, can be adopted in other complex multi-aquifer systems to assess hydrochemical evolution and its geological controls. Copyright © 2014 Elsevier B.V. All rights reserved.
Shapiro, James A
2016-06-08
The 21st century genomics-based analysis of evolutionary variation reveals a number of novel features impossible to predict when Dobzhansky and other evolutionary biologists formulated the neo-Darwinian Modern Synthesis in the middle of the last century. These include three distinct realms of cell evolution; symbiogenetic fusions forming eukaryotic cells with multiple genome compartments; horizontal organelle, virus and DNA transfers; functional organization of proteins as systems of interacting domains subject to rapid evolution by exon shuffling and exonization; distributed genome networks integrated by mobile repetitive regulatory signals; and regulation of multicellular development by non-coding lncRNAs containing repetitive sequence components. Rather than single gene traits, all phenotypes involve coordinated activity by multiple interacting cell molecules. Genomes contain abundant and functional repetitive components in addition to the unique coding sequences envisaged in the early days of molecular biology. Combinatorial coding, plus the biochemical abilities cells possess to rearrange DNA molecules, constitute a powerful toolbox for adaptive genome rewriting. That is, cells possess "Read-Write Genomes" they alter by numerous biochemical processes capable of rapidly restructuring cellular DNA molecules. Rather than viewing genome evolution as a series of accidental modifications, we can now study it as a complex biological process of active self-modification.
Shapiro, James A.
2016-01-01
The 21st century genomics-based analysis of evolutionary variation reveals a number of novel features impossible to predict when Dobzhansky and other evolutionary biologists formulated the neo-Darwinian Modern Synthesis in the middle of the last century. These include three distinct realms of cell evolution; symbiogenetic fusions forming eukaryotic cells with multiple genome compartments; horizontal organelle, virus and DNA transfers; functional organization of proteins as systems of interacting domains subject to rapid evolution by exon shuffling and exonization; distributed genome networks integrated by mobile repetitive regulatory signals; and regulation of multicellular development by non-coding lncRNAs containing repetitive sequence components. Rather than single gene traits, all phenotypes involve coordinated activity by multiple interacting cell molecules. Genomes contain abundant and functional repetitive components in addition to the unique coding sequences envisaged in the early days of molecular biology. Combinatorial coding, plus the biochemical abilities cells possess to rearrange DNA molecules, constitute a powerful toolbox for adaptive genome rewriting. That is, cells possess “Read–Write Genomes” they alter by numerous biochemical processes capable of rapidly restructuring cellular DNA molecules. Rather than viewing genome evolution as a series of accidental modifications, we can now study it as a complex biological process of active self-modification. PMID:27338490
Tanouchi, Yu; Covert, Markus W
2017-09-19
During its lysogenic life cycle, the phage genome is integrated into the host chromosome by site-specific recombination. In this report, we analyze lambda phage integration into noncanonical sites using next-generation sequencing and show that it generates significant genetic diversity by targeting over 300 unique sites in the host Escherichia coli genome. Moreover, these integration events can have important phenotypic consequences for the host, including changes in cell motility and increased antibiotic resistance. Importantly, the new technologies that we developed to enable this study, namely sequencing secondary sites using next-generation sequencing and then selecting relevant lysogens using clustered regularly interspaced short palindromic repeat (CRISPR)/Cas9-based selection, are broadly applicable to other phage-bacterium systems. IMPORTANCE: Bacteriophages play an important role in bacterial evolution through lysogeny, where the phage genome is integrated into the host chromosome. While phage integration generally occurs at a specific site in the host chromosome, it is also known to occur at other, so-called secondary sites. In this study, we developed a new experimental technology to comprehensively study secondary integration sites and discovered that phage can integrate into over 300 unique sites in the host genome, resulting in significant genetic diversity in bacteria. We further developed an assay to examine the phenotypic consequence of such diverse integration events and found that phage integration can cause changes in evolutionarily relevant traits such as bacterial motility and increases in antibiotic resistance. Importantly, our method is readily applicable to other phage-bacterium systems. Copyright © 2017 Tanouchi and Covert.
MEvoLib v1.0: the first molecular evolution library for Python.
Álvarez-Jarreta, Jorge; Ruiz-Pesini, Eduardo
2016-10-28
Molecular evolution studies involve many different hard computational problems solved, in most cases, with heuristic algorithms that provide a nearly optimal solution. Hence, diverse software tools exist for the different stages involved in a molecular evolution workflow. We present MEvoLib, the first molecular evolution library for Python, providing a framework to work with different tools and methods involved in the common tasks of molecular evolution workflows. In contrast with already existing bioinformatics libraries, MEvoLib is focused on the stages involved in molecular evolution studies, enclosing the set of tools with a common purpose in a single high-level interface with fast access to their frequent parameterizations. The gene clustering from partial or complete sequences has been improved with a new method that integrates accessible external information (e.g. GenBank's features data). Moreover, MEvoLib adjusts the fetching process from NCBI databases to optimize the download bandwidth usage. In addition, it has been implemented using parallelization techniques to cope with even large-scale scenarios. MEvoLib is the first library for Python designed to facilitate molecular evolution research for both expert and novice users. Its unique interface for each common task comprises several tools with their most used parameterizations. It also includes a method to take advantage of biological knowledge to improve the gene partition of sequence datasets. Additionally, its implementation incorporates parallelization techniques to reduce computational costs when handling very large input datasets.
NASA Technical Reports Server (NTRS)
Reder, Leonard J.; Booth, Andrew; Hsieh, Jonathan; Summers, Kellee
2004-01-01
This paper presents a discussion of the evolution of a sequencer from a simple EPICS (Experimental Physics and Industrial Control System) based sequencer into a complex implementation designed utilizing UML (Unified Modeling Language) methodologies and a CASE (Computer Aided Software Engineering) tool approach. The main purpose of the sequencer (called the IF Sequencer) is to provide overall control of the Keck Interferometer to enable science operations to be carried out by a single operator (and/or observer). The interferometer links the two 10m telescopes of the W. M. Keck Observatory at Mauna Kea, Hawaii. The IF Sequencer is a high-level, multi-threaded, Harel finite state machine software program designed to orchestrate several lower-level hardware and software hard real-time subsystems that must perform their work in a specific and sequential order. The sequencing need not be done in hard real-time. Each state machine thread commands either a high-speed real-time multiple mode embedded controller via CORBA, or slower controllers via EPICS Channel Access interfaces. The overall operation of the system is simplified by the automation. The UML is discussed and our use of it to implement the sequencer is presented. The decision to use the Rhapsody product as our CASE tool is explained and reflected upon. Most importantly, a section on lessons learned is presented and the difficulty of integrating CASE tool automatically generated C++ code into a large control system consisting of multiple infrastructures is presented.
NASA Astrophysics Data System (ADS)
Reder, Leonard J.; Booth, Andrew; Hsieh, Jonathan; Summers, Kellee R.
2004-09-01
This paper presents a discussion of the evolution of a sequencer from a simple Experimental Physics and Industrial Control System (EPICS) based sequencer into a complex implementation designed utilizing UML (Unified Modeling Language) methodologies and a Computer Aided Software Engineering (CASE) tool approach. The main purpose of the Interferometer Sequencer (called the IF Sequencer) is to provide overall control of the Keck Interferometer to enable science operations to be carried out by a single operator (and/or observer). The interferometer links the two 10m telescopes of the W. M. Keck Observatory at Mauna Kea, Hawaii. The IF Sequencer is a high-level, multi-threaded, Harel finite state machine software program designed to orchestrate several lower-level hardware and software hard real-time subsystems that must perform their work in a specific and sequential order. The sequencing need not be done in hard real-time. Each state machine thread commands either a high-speed real-time multiple mode embedded controller via CORBA, or slower controllers via EPICS Channel Access interfaces. The overall operation of the system is simplified by the automation. The UML is discussed and our use of it to implement the sequencer is presented. The decision to use the Rhapsody product as our CASE tool is explained and reflected upon. Most importantly, a section on lessons learned is presented and the difficulty of integrating CASE tool automatically generated C++ code into a large control system consisting of multiple infrastructures is presented.
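The idea of a sequencer that forces subsystems to act "in a specific and sequential order" can be sketched as a small table-driven state machine. This is only an illustration under stated assumptions: the state names, event names, and the `Sequencer` class are hypothetical, the sketch is a flat single-threaded machine rather than a hierarchical, multi-threaded Harel statechart, and it is written in Python rather than the generated C++ the abstract describes.

```python
from enum import Enum, auto


class State(Enum):
    # Hypothetical states; the real IF Sequencer states are not given in the abstract.
    IDLE = auto()
    ACQUIRING = auto()
    TRACKING = auto()
    DONE = auto()


# Allowed transitions: (current state, event) -> next state.
TRANSITIONS = {
    (State.IDLE, "start"): State.ACQUIRING,
    (State.ACQUIRING, "locked"): State.TRACKING,
    (State.TRACKING, "finished"): State.DONE,
    (State.ACQUIRING, "abort"): State.IDLE,
    (State.TRACKING, "abort"): State.IDLE,
}


class Sequencer:
    """Flat finite state machine that rejects out-of-order events."""

    def __init__(self):
        self.state = State.IDLE
        self.history = [State.IDLE]

    def handle(self, event):
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise ValueError(
                f"event {event!r} not allowed in state {self.state.name}"
            )
        self.state = TRANSITIONS[key]
        self.history.append(self.state)
        return self.state


def run(events):
    """Feed events to a fresh sequencer and return the visited state names."""
    seq = Sequencer()
    for event in events:
        seq.handle(event)
    return [s.name for s in seq.history]
```

Keeping the transitions in one table makes the permitted ordering explicit, so an event that arrives out of sequence fails loudly instead of silently driving a subsystem.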
Increased complexity of circRNA expression during species evolution.
Dong, Rui; Ma, Xu-Kai; Chen, Ling-Ling; Yang, Li
2017-08-03
Circular RNAs (circRNAs) are broadly identified from precursor mRNA (pre-mRNA) back-splicing across various species. Recent studies have suggested a cell-/tissue- specific manner of circRNA expression. However, the distinct expression pattern of circRNAs among species and its underlying mechanism still remain to be explored. Here, we systematically compared circRNA expression from human and mouse, and found that only a small portion of human circRNAs could be determined in parallel mouse samples. The conserved circRNA expression between human and mouse is correlated with the existence of orientation-opposite complementary sequences in introns that flank back-spliced exons in both species, but not the circRNA sequences themselves. Quantification of RNA pairing capacity of orientation-opposite complementary sequences across circRNA-flanking introns by Complementary Sequence Index (CSI) identifies that among all types of complementary sequences, SINEs, especially Alu elements in human, contribute the most for circRNA formation and that their diverse distribution across species leads to the increased complexity of circRNA expression during species evolution. Together, our integrated and comparative reference catalog of circRNAs in different species reveals a species-specific pattern of circRNA expression and suggests a previously under-appreciated impact of fast-evolved SINEs on the regulation of (circRNA) gene expression.
Hamon, Perla; Grover, Corrinne E; Davis, Aaron P; Rakotomalala, Jean-Jacques; Raharimalala, Nathalie E; Albert, Victor A; Sreenath, Hosahalli L; Stoffelen, Piet; Mitchell, Sharon E; Couturon, Emmanuel; Hamon, Serge; de Kochko, Alexandre; Crouzillat, Dominique; Rigoreau, Michel; Sumirat, Ucu; Akaffou, Sélastique; Guyot, Romain
2017-04-01
A comprehensive and meaningful phylogenetic hypothesis for the commercially important coffee genus (Coffea) has long been a key objective for coffee researchers. For molecular studies, progress has been limited by low levels of sequence divergence, leading to insufficient topological resolution and statistical support in phylogenetic trees, particularly for the major lineages and for the numerous species occurring in Madagascar. We report here the first almost fully resolved, broadly sampled phylogenetic hypothesis for coffee, the result of combining genotyping-by-sequencing (GBS) technology with a newly developed, lab-based workflow to integrate short read next-generation sequencing for low numbers of additional samples. Biogeographic patterns indicate either Africa or Asia (or possibly the Arabian Peninsula) as the most likely ancestral locality for the origin of the coffee genus, with independent radiations across Africa, Asia, and the Western Indian Ocean Islands (including Madagascar and Mauritius). The evolution of caffeine, an important trait for commerce and society, was evaluated in light of our phylogeny. High and consistent caffeine content is found only in species from the equatorial, fully humid environments of West and Central Africa, possibly as an adaptive response to increased levels of pest predation. Moderate caffeine production, however, evolved at least one additional time recently (between 2 and 4 Mya) in a Madagascan lineage, which suggests that either the biosynthetic pathway was already in place during the early evolutionary history of coffee, or that caffeine synthesis within the genus is subject to convergent evolution, as is also the case for caffeine synthesis in coffee versus tea and chocolate. Copyright © 2017 Elsevier Inc. All rights reserved.
A short introduction to cytogenetic studies in mammals with reference to the present volume.
Graphodatsky, A; Ferguson-Smith, M A; Stanyon, R
2012-01-01
Genome diversity has long been studied from the comparative cytogenetic perspective. Early workers documented differences between species in diploid chromosome number and fundamental number. Banding methods allowed more detailed descriptions of between-species rearrangements and classes of differentially staining chromosome material. The infusion of molecular methods into cytogenetics provided a third revolution, which is still not exhausted. Chromosome painting has provided a global view of the translocation history of mammalian genome evolution, well summarized in the contributions to this special volume. More recently, FISH of cloned DNA has provided details on defining breakpoint and intrachromosomal marker order, which have helped to document inversions and centromere repositioning. The most recent trend in comparative molecular cytogenetics is to integrate sequencing information in order to formulate and test reconstructions of ancestral genomes and phylogenomic hypotheses derived from comparative cytogenetics. The integration of comparative cytogenetics and sequencing promises to provide an understanding of what drives chromosome rearrangements and genome evolution in general. We believe that the contributions in this volume, in no small way, point the way to the next phase in cytogenetic studies. Copyright © 2012 S. Karger AG, Basel.
Functional interrogation of non-coding DNA through CRISPR genome editing
Canver, Matthew C.; Bauer, Daniel E.; Orkin, Stuart H.
2017-01-01
Methodologies to interrogate non-coding regions have lagged behind those for coding regions, even though non-coding regions comprise the vast majority of the genome. However, the rapid evolution of clustered regularly interspaced short palindromic repeats (CRISPR)-based genome editing has provided a multitude of novel techniques for laboratory investigation including significant contributions to the toolbox for studying non-coding DNA. CRISPR-mediated loss-of-function strategies rely on direct disruption of the underlying sequence or repression of transcription without modifying the targeted DNA sequence. CRISPR-mediated gain-of-function approaches similarly benefit from methods to alter the targeted sequence through integration of customized sequence into the genome as well as methods to activate transcription. Here we review CRISPR-based loss- and gain-of-function techniques for the interrogation of non-coding DNA. PMID:28288828
ITEP: an integrated toolkit for exploration of microbial pan-genomes.
Benedict, Matthew N; Henriksen, James R; Metcalf, William W; Whitaker, Rachel J; Price, Nathan D
2014-01-03
Comparative genomics is a powerful approach for studying variation in physiological traits as well as the evolution and ecology of microorganisms. Recent technological advances have enabled sequencing large numbers of related genomes in a single project, requiring computational tools for their integrated analysis. In particular, accurate annotations and identification of gene presence and absence are critical for understanding and modeling the cellular physiology of newly sequenced genomes. Although many tools are available to compare the gene contents of related genomes, new tools are necessary to enable close examination and curation of protein families from large numbers of closely related organisms, to integrate curation with the analysis of gain and loss, and to generate metabolic networks linking the annotations to observed phenotypes. We have developed ITEP, an Integrated Toolkit for Exploration of microbial Pan-genomes, to curate protein families, compute similarities to externally-defined domains, analyze gene gain and loss, and generate draft metabolic networks from one or more curated reference network reconstructions in groups of related microbial species among which the combination of core and variable genes constitutes their "pan-genomes". The ITEP toolkit consists of: (1) a series of modular command-line scripts for identification, comparison, curation, and analysis of protein families and their distribution across many genomes; (2) a set of Python libraries for programmatic access to the same data; and (3) pre-packaged scripts to perform common analysis workflows on a collection of genomes. ITEP's capabilities include de novo protein family prediction, ortholog detection, analysis of functional domains, identification of core and variable genes and gene regions, sequence alignments and tree generation, annotation curation, and the integration of cross-genome analysis and metabolic networks for study of metabolic network evolution. 
ITEP is a powerful, flexible toolkit for generation and curation of protein families. ITEP's modular design allows for straightforward extension as analysis methods and tools evolve. By integrating comparative genomics with the development of draft metabolic networks, ITEP harnesses the power of comparative genomics to build confidence in links between genotype and phenotype and helps disambiguate gene annotations when they are evaluated in both evolutionary and metabolic network contexts.
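The distinction between core and variable genes in a pan-genome can be illustrated with plain set operations. This is a minimal sketch, not ITEP's own interface: the function name `split_core_variable` and the input layout (a mapping from genome name to the protein family identifiers found in it) are assumptions made for illustration, and real analyses first need the protein family clustering that the abstract describes.

```python
def split_core_variable(presence):
    """Split protein families into core and variable sets.

    `presence` is a hypothetical mapping of genome name -> iterable of
    protein family identifiers detected in that genome.
    Core families occur in every genome; variable families occur in at
    least one genome but not in all of them.
    """
    genomes = [set(families) for families in presence.values()]
    if not genomes:
        return set(), set()
    pan = set().union(*genomes)          # every family seen anywhere
    core = set.intersection(*genomes)    # families shared by all genomes
    variable = pan - core
    return core, variable
```

For example, with three genomes holding families {f1, f2, f3}, {f1, f2} and {f1, f4}, only f1 is core, while f2, f3 and f4 are variable; together they make up the pan-genome.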
Habitability of super-Earth planets around other suns: models including Red Giant Branch evolution.
von Bloh, W; Cuntz, M; Schröder, K-P; Bounama, C; Franck, S
2009-01-01
The unexpected diversity of exoplanets includes a growing number of super-Earth planets, i.e., exoplanets with masses of up to several Earth masses and a similar chemical and mineralogical composition as Earth. We present a thermal evolution model for a 10 Earth-mass planet orbiting a star like the Sun. Our model is based on the integrated system approach, which describes the photosynthetic biomass production and takes into account a variety of climatological, biogeochemical, and geodynamical processes. This allows us to identify a so-called photosynthesis-sustaining habitable zone (pHZ), as determined by the limits of biological productivity on the planetary surface. Our model considers solar evolution during the main-sequence stage and along the Red Giant Branch as described by the most recent solar model. We obtain a large set of solutions consistent with the principal possibility of life. The highest likelihood of habitability is found for "water worlds." Only mass-rich water worlds are able to realize pHZ-type habitability beyond the stellar main sequence on the Red Giant Branch.
Origin-Dependent Inverted-Repeat Amplification: Tests of a Model for Inverted DNA Amplification.
Brewer, Bonita J; Payen, Celia; Di Rienzi, Sara C; Higgins, Megan M; Ong, Giang; Dunham, Maitreya J; Raghuraman, M K
2015-12-01
DNA replication errors are a major driver of evolution, from single nucleotide polymorphisms to large-scale copy number variations (CNVs). Here we test a specific replication-based model to explain the generation of interstitial, inverted triplications. While no genetic information is lost, the novel inversion junctions and increased copy number of the included sequences create the potential for adaptive phenotypes. The model, Origin-Dependent Inverted-Repeat Amplification (ODIRA), proposes that a replication error at pre-existing short, interrupted, inverted repeats in genomic sequences generates an extrachromosomal, inverted dimeric, autonomously replicating intermediate; subsequent genomic integration of the dimer yields this class of CNV without loss of distal chromosomal sequences. We used a combination of in vitro and in vivo approaches to test the feasibility of the proposed replication error and its downstream consequences on chromosome structure in the yeast Saccharomyces cerevisiae. We show that the proposed replication error (the ligation of leading and lagging nascent strands to create "closed" forks) can occur in vitro at short, interrupted inverted repeats. The removal of molecules with two closed forks results in a hairpin-capped linear duplex that we show replicates in vivo to create an inverted, dimeric plasmid that subsequently integrates into the genome by homologous recombination, creating an inverted triplication. While other models have been proposed to explain inverted triplications and their derivatives, our model can also explain the generation of human, de novo, inverted amplicons that have a 2:1 mixture of sequences from both homologues of a single parent, a feature readily explained by a plasmid intermediate that arises from one homologue and integrates into the other homologue prior to meiosis. 
Our tests of key features of ODIRA lend support to this mechanism and suggest further avenues of enquiry to unravel the origins of interstitial, inverted CNVs pivotal in human health and evolution.
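The sequence feature that the ODIRA model starts from, a short inverted repeat interrupted by a spacer, can be located in a DNA string with a simple scan. The sketch below is only a sequence-search illustration and not the authors' experimental method: the function names, the arm length of 4, and the spacer bounds of 1 to 6 bases are arbitrary assumptions chosen for the example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_interrupted_inverted_repeats(seq, arm=4, min_spacer=1, max_spacer=6):
    """Find arms followed, after a spacer, by their reverse complement.

    Returns a list of (start index of left arm, spacer length, left arm).
    Arm length and spacer bounds are illustrative assumptions.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * arm - min_spacer + 1):
        left = seq[i:i + arm]
        target = reverse_complement(left)
        for spacer in range(min_spacer, max_spacer + 1):
            j = i + arm + spacer          # start of the candidate right arm
            if j + arm > len(seq):
                break                     # right arm would run off the end
            if seq[j:j + arm] == target:
                hits.append((i, spacer, left))
    return hits
```

In the string AAGACTTTTAGTCAA, the arm GACT at index 2 is followed after a three-base spacer by AGTC, its reverse complement, so the scan reports that single site.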
2014-01-01
Background: Recent advancements in next-generation sequencing technology have enabled cost-effective sequencing of whole or partial genomes, permitting the discovery and characterization of molecular polymorphisms. Double-digest restriction-site associated DNA sequencing (ddRAD-seq) is a powerful and inexpensive approach to developing numerous single nucleotide polymorphism (SNP) markers and constructing a high-density genetic map. To enrich genomic resources for Japanese eel (Anguilla japonica), we constructed a ddRAD-based genetic map using an Ion Torrent Personal Genome Machine and anchored scaffolds of the current genome assembly to 19 linkage groups of the Japanese eel. Furthermore, we compared the Japanese eel genome with genomes of model fishes to infer the history of genome evolution after the teleost-specific genome duplication. Results: We generated the ddRAD-based linkage map of the Japanese eel, where the maps for female and male spanned 1748.8 cM and 1294.5 cM, respectively, and were arranged into 19 linkage groups. A total of 2,672 SNP markers and 115 Simple Sequence Repeat markers provide anchor points to 1,252 scaffolds covering 151 Mb (13%) of the current genome assembly of the Japanese eel. Comparisons among the Japanese eel, medaka, zebrafish and spotted gar genomes showed highly conserved synteny among teleosts and revealed part of the eight major chromosomal rearrangement events that occurred soon after the teleost-specific genome duplication. Conclusions: The ddRAD-seq approach combined with the Ion Torrent Personal Genome Machine sequencing allowed us to conduct efficient and flexible SNP genotyping. The integration of the genetic map and the assembled sequence provides a valuable resource for fine mapping and positional cloning of quantitative trait loci associated with economically important traits and for investigating comparative genomics of the Japanese eel. PMID:24669946
Kai, Wataru; Nomura, Kazuharu; Fujiwara, Atushi; Nakamura, Yoji; Yasuike, Motoshige; Ojima, Nobuhiko; Masaoka, Tetsuji; Ozaki, Akiyuki; Kazeto, Yukinori; Gen, Koichiro; Nagao, Jiro; Tanaka, Hideki; Kobayashi, Takanori; Ototake, Mitsuru
2014-03-26
Recent advancements in next-generation sequencing technology have enabled cost-effective sequencing of whole or partial genomes, permitting the discovery and characterization of molecular polymorphisms. Double-digest restriction-site associated DNA sequencing (ddRAD-seq) is a powerful and inexpensive approach to developing numerous single nucleotide polymorphism (SNP) markers and constructing a high-density genetic map. To enrich genomic resources for Japanese eel (Anguilla japonica), we constructed a ddRAD-based genetic map using an Ion Torrent Personal Genome Machine and anchored scaffolds of the current genome assembly to 19 linkage groups of the Japanese eel. Furthermore, we compared the Japanese eel genome with genomes of model fishes to infer the history of genome evolution after the teleost-specific genome duplication. We generated the ddRAD-based linkage map of the Japanese eel, where the maps for female and male spanned 1748.8 cM and 1294.5 cM, respectively, and were arranged into 19 linkage groups. A total of 2,672 SNP markers and 115 Simple Sequence Repeat markers provide anchor points to 1,252 scaffolds covering 151 Mb (13%) of the current genome assembly of the Japanese eel. Comparisons among the Japanese eel, medaka, zebrafish and spotted gar genomes showed highly conserved synteny among teleosts and revealed part of the eight major chromosomal rearrangement events that occurred soon after the teleost-specific genome duplication. The ddRAD-seq approach combined with the Ion Torrent Personal Genome Machine sequencing allowed us to conduct efficient and flexible SNP genotyping. The integration of the genetic map and the assembled sequence provides a valuable resource for fine mapping and positional cloning of quantitative trait loci associated with economically important traits and for investigating comparative genomics of the Japanese eel.
Belyi, Vladimir A.; Levine, Arnold J.; Skalka, Anna Marie
2010-01-01
Vertebrate genomes contain numerous copies of retroviral sequences, acquired over the course of evolution. Until recently they were thought to be the only type of RNA viruses to be so represented, because integration of a DNA copy of their genome is required for their replication. In this study, an extensive sequence comparison was conducted in which 5,666 viral genes from all known non-retroviral families with single-stranded RNA genomes were matched against the germline genomes of 48 vertebrate species, to determine if such viruses could also contribute to the vertebrate genetic heritage. In 19 of the tested vertebrate species, we discovered as many as 80 high-confidence examples of genomic DNA sequences that appear to be derived, as long ago as 40 million years, from ancestral members of 4 currently circulating virus families with single strand RNA genomes. Surprisingly, almost all of the sequences are related to only two families in the Order Mononegavirales: the Bornaviruses and the Filoviruses, which cause lethal neurological disease and hemorrhagic fevers, respectively. Based on signature landmarks some, and perhaps all, of the endogenous virus-like DNA sequences appear to be LINE element-facilitated integrations derived from viral mRNAs. The integrations represent genes that encode viral nucleocapsid, RNA-dependent-RNA-polymerase, matrix and, possibly, glycoproteins. Integrations are generally limited to one or very few copies of a related viral gene per species, suggesting that once the initial germline integration was obtained (or selected), later integrations failed or provided little advantage to the host. 
The conservation of relatively long open reading frames for several of the endogenous sequences, the virus-like protein regions represented, and a potential correlation between their presence and a species' resistance to the diseases caused by these pathogens, are consistent with the notion that their products provide some important biological advantage to the species. In addition, the viruses could also benefit, as some resistant species (e.g. bats) may serve as natural reservoirs for their persistence and transmission. Given the stringent limitations imposed in this informatics search, the examples described here should be considered a low estimate of the number of such integration events that have persisted over evolutionary time scales. Clearly, the sources of genetic information in vertebrate genomes are much more diverse than previously suspected. PMID:20686665
Evolution of microbes and viruses: a paradigm shift in evolutionary biology?
Koonin, Eugene V.; Wolf, Yuri I.
2012-01-01
When Charles Darwin formulated the central principles of evolutionary biology in the Origin of Species in 1859 and the architects of the Modern Synthesis integrated these principles with population genetics almost a century later, the principal if not the sole objects of evolutionary biology were multicellular eukaryotes, primarily animals and plants. Before the advent of efficient gene sequencing, all attempts to extend evolutionary studies to bacteria were futile. Sequencing of the rRNA genes in thousands of microbes allowed the construction of the three-domain “ribosomal Tree of Life” that was widely thought to have resolved the evolutionary relationships between the cellular life forms. However, subsequent massive sequencing of numerous, complete microbial genomes revealed novel evolutionary phenomena, the most fundamental of these being: (1) pervasive horizontal gene transfer (HGT), in large part mediated by viruses and plasmids, that shapes the genomes of archaea and bacteria and calls for a radical revision (if not abandonment) of the Tree of Life concept, (2) Lamarckian-type inheritance that appears to be critical for antivirus defense and other forms of adaptation in prokaryotes, and (3) evolution of evolvability, i.e., dedicated mechanisms for evolution such as vehicles for HGT and stress-induced mutagenesis systems. In the non-cellular part of the microbial world, phylogenomics and metagenomics of viruses and related selfish genetic elements revealed enormous genetic and molecular diversity and extremely high abundance of viruses that come across as the dominant biological entities on earth. Furthermore, the perennial arms race between viruses and their hosts is one of the defining factors of evolution. Thus, microbial phylogenomics adds new dimensions to the fundamental picture of evolution even as the principle of descent with modification discovered by Darwin and the laws of population genetics remain at the core of evolutionary biology. PMID:22993722
Paillot, Romain; Steward, Karen F.; Webb, Katy; Ainslie, Fern; Jourdan, Thibaud; Bason, Nathalie C.; Holroyd, Nancy E.; Mungall, Karen; Quail, Michael A.; Sanders, Mandy; Simmonds, Mark; Willey, David; Brooks, Karen; Aanensen, David M.; Spratt, Brian G.; Jolley, Keith A.; Maiden, Martin C. J.; Kehoe, Michael; Chanter, Neil; Bentley, Stephen D.; Robinson, Carl; Maskell, Duncan J.; Parkhill, Julian; Waller, Andrew S.
2009-01-01
The continued evolution of bacterial pathogens has major implications for both human and animal disease, but the exchange of genetic material between host-restricted pathogens is rarely considered. Streptococcus equi subspecies equi (S. equi) is a host-restricted pathogen of horses that has evolved from the zoonotic pathogen Streptococcus equi subspecies zooepidemicus (S. zooepidemicus). These pathogens share approximately 80% genome sequence identity with the important human pathogen Streptococcus pyogenes. We sequenced and compared the genomes of S. equi 4047 and S. zooepidemicus H70 and screened S. equi and S. zooepidemicus strains from around the world to uncover evidence of the genetic events that have shaped the evolution of the S. equi genome and led to its emergence as a host-restricted pathogen. Our analysis provides evidence of functional loss due to mutation and deletion, coupled with pathogenic specialization through the acquisition of bacteriophage encoding a phospholipase A2 toxin, and four superantigens, and an integrative conjugative element carrying a novel iron acquisition system with similarity to the high pathogenicity island of Yersinia pestis. We also highlight that S. equi, S. zooepidemicus, and S. pyogenes share a common phage pool that enhances cross-species pathogen evolution. We conclude that the complex interplay of functional loss, pathogenic specialization, and genetic exchange between S. equi, S. zooepidemicus, and S. pyogenes continues to influence the evolution of these important streptococci. PMID:19325880
Chromosome Evolution in Connection with Repetitive Sequences and Epigenetics in Plants.
Li, Shu-Fen; Su, Ting; Cheng, Guang-Qian; Wang, Bing-Xiao; Li, Xu; Deng, Chuan-Liang; Gao, Wu-Jun
2017-10-24
Chromosome evolution is a fundamental aspect of evolutionary biology. The evolution of chromosome size, structure and shape, number, and the change in DNA composition suggest the high plasticity of nuclear genomes at the chromosomal level. Repetitive DNA sequences, which represent a conspicuous fraction of every eukaryotic genome, particularly in plants, are found to be tightly linked with plant chromosome evolution. Different classes of repetitive sequences have distinct distribution patterns on the chromosomes. Mounting evidence shows that repetitive sequences may play multiple generative roles in shaping the chromosome karyotypes in plants. Furthermore, recent development in our understanding of the repetitive sequences and plant chromosome evolution has elucidated the involvement of a spectrum of epigenetic modification. In this review, we focused on the recent evidence relating to the distribution pattern of repetitive sequences in plant chromosomes and highlighted their potential relevance to chromosome evolution in plants. We also discussed the possible connections between evolution and epigenetic alterations in chromosome structure and repatterning, such as heterochromatin formation, centromere function, and epigenetic-associated transposable element inactivation.
Petruzziello, Filomena; Fouillen, Laetitia; Wadensten, Henrik; Kretz, Robert; Andren, Per E; Rainer, Gregor; Zhang, Xiaozhe
2012-02-03
Neuropeptidomics is used to characterize endogenous peptides in the brain of tree shrews (Tupaia belangeri). Tree shrews are small animals similar to rodents in size but close relatives of primates, and are excellent models for brain research. Currently, no complete proteome is available for tree shrews to support a direct database search for neuropeptide identification. To improve the identification of neuropeptides in tree shrews, we developed an integrated mass spectrometry (MS)-based approach that combines methods including data-dependent, directed, and targeted liquid chromatography (LC)-Fourier transform (FT)-tandem MS (MS/MS) analysis, database construction, de novo sequencing, precursor protein search, and homology analysis. Using this integrated approach, we identified 107 endogenous peptides that have sequences identical or similar to those from other mammalian species. High-accuracy MS and tandem MS information, together with BLAST analysis and chromatographic characteristics, was used to confirm the sequences of all the identified peptides. Interestingly, further sequence homology analysis demonstrated that tree shrew peptides have a significantly higher degree of homology to equivalent sequences in humans than to those in mice or rats, consistent with the close phylogenetic relationship between tree shrews and primates. Our results provide the first extensive characterization of the peptidome in tree shrews, which now permits characterization of its function in the nervous and endocrine systems. Because the approach exploits the evolutionary conservation of neuropeptides and the advantages of high-accuracy MS, it should be portable to the identification of neuropeptides in other species for which fully sequenced genomes or proteomes are not available.
Functional interrogation of non-coding DNA through CRISPR genome editing.
Canver, Matthew C; Bauer, Daniel E; Orkin, Stuart H
2017-05-15
Methodologies to interrogate non-coding regions have lagged behind coding regions despite comprising the vast majority of the genome. However, the rapid evolution of clustered regularly interspaced short palindromic repeats (CRISPR)-based genome editing has provided a multitude of novel techniques for laboratory investigation including significant contributions to the toolbox for studying non-coding DNA. CRISPR-mediated loss-of-function strategies rely on direct disruption of the underlying sequence or repression of transcription without modifying the targeted DNA sequence. CRISPR-mediated gain-of-function approaches similarly benefit from methods to alter the targeted sequence through integration of customized sequence into the genome as well as methods to activate transcription. Here we review CRISPR-based loss- and gain-of-function techniques for the interrogation of non-coding DNA. Copyright © 2017 Elsevier Inc. All rights reserved.
Chaitankar, Vijender; Karakülah, Gökhan; Ratnapriya, Rinki; Giuste, Felipe O.; Brooks, Matthew J.; Swaroop, Anand
2016-01-01
The advent of high throughput next generation sequencing (NGS) has accelerated the pace of discovery of disease-associated genetic variants and genomewide profiling of expressed sequences and epigenetic marks, thereby permitting systems-based analyses of ocular development and disease. Rapid evolution of NGS and associated methodologies presents significant challenges in acquisition, management, and analysis of large data sets and for extracting biologically or clinically relevant information. Here we illustrate the basic design of commonly used NGS-based methods, specifically whole exome sequencing, transcriptome, and epigenome profiling, and provide recommendations for data analyses. We briefly discuss systems biology approaches for integrating multiple data sets to elucidate gene regulatory or disease networks. While we provide examples from the retina, the NGS guidelines reviewed here are applicable to other tissues/cell types as well. PMID:27297499
Phylogenetics of modern birds in the era of genomics
Edwards, Scott V; Bryan Jennings, W; Shedlock, Andrew M
2005-01-01
In the 14 years since the first higher-level bird phylogenies based on DNA sequence data, avian phylogenetics has witnessed the advent and maturation of the genomics era, the completion of the chicken genome and a suite of technologies that promise to add considerably to the agenda of avian phylogenetics. In this review, we summarize current approaches and data characteristics of recent higher-level bird studies and suggest a number of as yet untested molecular and analytical approaches for the unfolding tree of life for birds. A variety of comparative genomics strategies, including adoption of objective quality scores for sequence data, analysis of contiguous DNA sequences provided by large-insert genomic libraries, and the systematic use of retroposon insertions and other rare genomic changes all promise an integrated phylogenetics that is solidly grounded in genome evolution. The avian genome is an excellent testing ground for such approaches because of the more balanced representation of single-copy and repetitive DNA regions than in mammals. Although comparative genomics has a number of obvious uses in avian phylogenetics, its application to large numbers of taxa poses a number of methodological and infrastructural challenges, and can be greatly facilitated by a ‘community genomics’ approach in which the modest sequencing throughputs of single PI laboratories are pooled to produce larger, complementary datasets. Although the polymerase chain reaction era of avian phylogenetics is far from complete, the comparative genomics era—with its ability to vastly increase the number and type of molecular characters and to provide a genomic context for these characters—will usher in a host of new perspectives and opportunities for integrating genome evolution and avian phylogenetics. PMID:16024355
Morard, Raphaël; Darling, Kate F; Mahé, Frédéric; Audic, Stéphane; Ujiié, Yurika; Weiner, Agnes K M; André, Aurore; Seears, Heidi A; Wade, Christopher M; Quillévéré, Frédéric; Douady, Christophe J; Escarguel, Gilles; de Garidel-Thoron, Thibault; Siccha, Michael; Kucera, Michal; de Vargas, Colomban
2015-11-01
Planktonic foraminifera (Rhizaria) are ubiquitous marine pelagic protists producing calcareous shells with conspicuous morphology. They play an important role in the marine carbon cycle, and their exceptional fossil record serves as the basis for biochronostratigraphy and past climate reconstructions. A major worldwide sampling effort over the last two decades has resulted in the establishment of multiple large collections of cryopreserved individual planktonic foraminifera samples. Thousands of 18S rDNA partial sequences have been generated, representing all major known morphological taxa across their worldwide oceanic range. This comprehensive data coverage provides an opportunity to assess patterns of molecular ecology and evolution in a holistic way for an entire group of planktonic protists. We combined all available published and unpublished genetic data to build PFR(2), the Planktonic foraminifera Ribosomal Reference database. The first version of the database includes 3322 reference 18S rDNA sequences belonging to 32 of the 47 known morphospecies of extant planktonic foraminifera, collected from 460 oceanic stations. All sequences have been rigorously taxonomically curated using a six-rank annotation system fully resolved to the morphological species level and linked to a series of metadata. The PFR(2) website, available at http://pfr2.sb-roscoff.fr, allows downloading the entire database or specific sections, as well as the identification of new planktonic foraminiferal sequences. Its novel, fully documented curation process integrates advances in morphological and molecular taxonomy. It allows for an increase in its taxonomic resolution and assures that integrity is maintained by including a complete contingency tracking of annotations and assuring that the annotations remain internally consistent. © 2015 John Wiley & Sons Ltd.
AGAPE (Automated Genome Analysis PipelinE) for Pan-Genome Analysis of Saccharomyces cerevisiae
Song, Giltae; Dickins, Benjamin J. A.; Demeter, Janos; Engel, Stacia; Dunn, Barbara; Cherry, J. Michael
2015-01-01
The characterization and public release of genome sequences from thousands of organisms is expanding the scope for genetic variation studies. However, understanding the phenotypic consequences of genetic variation remains a challenge in eukaryotes due to the complexity of the genotype-phenotype map. One approach to this is the intensive study of model systems for which diverse sources of information can be accumulated and integrated. Saccharomyces cerevisiae is an extensively studied model organism, with well-known protein functions and thoroughly curated phenotype data. To develop and expand the available resources linking genomic variation with function in yeast, we aim to model the pan-genome of S. cerevisiae. To initiate the yeast pan-genome, we newly sequenced or re-sequenced the genomes of 25 strains that are commonly used in the yeast research community using advanced sequencing technology at high quality. We also developed a pipeline for automated pan-genome analysis, which integrates the steps of assembly, annotation, and variation calling. To assign strain-specific functional annotations, we identified genes that were not present in the reference genome. We classified these according to their presence or absence across strains and characterized each group of genes with known functional and phenotypic features. The functional roles of novel genes not found in the reference genome and associated with strains or groups of strains appear to be consistent with anticipated adaptations in specific lineages. As more S. cerevisiae strain genomes are released, our analysis can be used to collate genome data and relate it to lineage-specific patterns of genome evolution. Our new tool set will enhance our understanding of genomic and functional evolution in S. cerevisiae, and will be available to the yeast genetics and molecular biology community. PMID:25781462
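The presence/absence classification described in the abstract above can be illustrated with a minimal Python sketch. This is not the AGAPE pipeline itself: the function name `classify_genes`, the category labels, and the toy strain and gene identifiers are all assumptions made for illustration.

```python
from collections import defaultdict


def classify_genes(presence):
    """Classify genes by how many strains carry them.

    presence: dict mapping a strain name to the set of gene IDs found in it.
    Returns a dict mapping each gene ID to 'core' (present in every strain),
    'strain-specific' (present in exactly one strain) or 'accessory'.
    """
    counts = defaultdict(int)
    for genes in presence.values():
        for gene in genes:
            counts[gene] += 1

    n_strains = len(presence)
    categories = {}
    for gene, count in counts.items():
        if count == n_strains:
            categories[gene] = "core"
        elif count == 1:
            categories[gene] = "strain-specific"
        else:
            categories[gene] = "accessory"
    return categories


# Hypothetical toy input: three strains and four genes.
toy = {
    "strainA": {"g1", "g2", "g3"},
    "strainB": {"g1", "g2"},
    "strainC": {"g1", "g4"},
}
result = classify_genes(toy)
for gene in sorted(result):
    print(gene, result[gene])
```

A real analysis would first need gene calls and orthology assignments across strains; the sketch assumes those are already available as sets of shared gene identifiers.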
Dong, Xinran; Wang, Xiao; Zhang, Feng; Tian, Weidong
2016-01-01
Accelerated evolution of regulatory sequence can alter the expression pattern of target genes, and cause phenotypic changes. In this study, we used DNase I hypersensitive sites (DHSs) to annotate putative regulatory sequences in the human genome, and conducted a genome-wide analysis of the effects of accelerated evolution on regulatory sequences. Working under the assumption that local ancient repeat elements of DHSs are under neutral evolution, we discovered that ∼0.44% of DHSs are under accelerated evolution (ace-DHSs). We found that ace-DHSs tend to be more active than background DHSs, and are strongly associated with epigenetic marks of active transcription. The target genes of ace-DHSs are significantly enriched in neuron-related functions, and their expression levels are positively selected in the human brain. Thus, these lines of evidence strongly suggest that accelerated evolution of regulatory sequences plays an important role in the evolution of human-specific phenotypes. PMID:27401230
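The abstract above does not state which statistical test was used, so the following Python sketch shows only one generic way such a comparison could be framed: a one-sided binomial test of whether a DHS carries more substitutions than expected under a neutral rate estimated from nearby ancient repeats. The function names, the significance threshold, and the example numbers are assumptions, not values from the study.

```python
from math import comb


def binom_sf(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def is_accelerated(substitutions, length, neutral_rate, alpha=0.05):
    """Flag a site whose substitution count exceeds the neutral expectation.

    substitutions: observed lineage-specific substitutions in the DHS
    length: number of aligned bases in the DHS
    neutral_rate: per-base substitution probability estimated from
                  local ancient repeats (assumed to evolve neutrally)
    """
    return binom_sf(substitutions, length, neutral_rate) < alpha


# Hypothetical example: a 200 bp DHS with a local neutral rate of 1%.
print(is_accelerated(12, 200, 0.01))  # many more substitutions than expected
print(is_accelerated(2, 200, 0.01))   # close to the neutral expectation
```

A genome-wide scan would also need a multiple-testing correction, which this sketch leaves out.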
Wang, Qingguo; Jia, Peilin; Zhao, Zhongming
2015-01-01
Fueled by widespread applications of high-throughput next generation sequencing (NGS) technologies and urgent need to counter threats of pathogenic viruses, large-scale studies were conducted recently to investigate virus integration in host genomes (for example, human tumor genomes) that may cause carcinogenesis or other diseases. A limiting factor in these studies, however, is rapid virus evolution and resulting polymorphisms, which prevent reads from aligning readily to commonly used virus reference genomes, and, accordingly, make virus integration sites difficult to detect. Another confounding factor is host genomic instability as a result of virus insertions. To tackle these challenges and improve our capability to identify cryptic virus-host fusions, we present a new approach that detects Virus intEgration sites through iterative Reference SEquence customization (VERSE). To the best of our knowledge, VERSE is the first approach to improve detection through customizing reference genomes. Using 19 human tumors and cancer cell lines as test data, we demonstrated that VERSE substantially enhanced the sensitivity of virus integration site detection. VERSE is implemented in the open source package VirusFinder 2 that is available at http://bioinfo.mc.vanderbilt.edu/VirusFinder/.
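The general idea of iterative reference customization mentioned in the abstract above can be sketched with a toy Python example. This is not the VERSE algorithm or any part of VirusFinder 2. It assumes ungapped reads with known start positions that lie entirely within the reference, a simple mismatch threshold for recruiting reads, and a majority-vote consensus; all names are hypothetical.

```python
from collections import Counter


def mismatches(a, b):
    """Count positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def customize_reference(reference, reads, max_mismatch=1, max_rounds=10):
    """Iteratively rebuild a reference from the reads it can recruit.

    reads: list of (start, sequence) pairs placed without gaps and
           lying entirely within the reference (toy assumption).
    Returns the customized reference and the number of reads recruited
    in the last round.
    """
    ref = list(reference)
    recruited = []
    for _ in range(max_rounds):
        # Recruit reads compatible with the current reference.
        recruited = [
            (start, seq)
            for start, seq in reads
            if mismatches(seq, ref[start:start + len(seq)]) <= max_mismatch
        ]
        # Majority vote at each covered position; keep uncovered bases.
        votes = [Counter() for _ in ref]
        for start, seq in recruited:
            for offset, base in enumerate(seq):
                votes[start + offset][base] += 1
        new_ref = [v.most_common(1)[0][0] if v else b for v, b in zip(votes, ref)]
        if new_ref == ref:
            break
        ref = new_ref
    return "".join(ref), len(recruited)


# Toy data: the sampled strain differs from the reference at two positions,
# so the middle read is only recruited after the reference has been updated.
reference = "ACGTACGTAC"
reads = [(0, "ACCTA"), (4, "ACCTAC"), (1, "CCTACCT")]
customized, n_recruited = customize_reference(reference, reads)
print(customized, n_recruited)
```

The toy case shows why customization can help: a read that diverges too far from the stock reference becomes alignable once the reference has been adjusted toward the sampled strain.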
Cao, Hieu Xuan; Vu, Giang Thi Ha; Wang, Wenqin; Appenroth, Klaus J; Messing, Joachim; Schubert, Ingo
2016-01-01
Duckweeds are aquatic monocotyledonous plants of potential economic interest with fast vegetative propagation, comprising 37 species with variable genome sizes (0.158-1.88 Gbp). The genomic sequence of Spirodela polyrhiza, the smallest and the most ancient duckweed genome, needs to be aligned to its chromosomes as a reference and prerequisite to study the genome and karyotype evolution of other duckweed species. We selected physically mapped bacterial artificial chromosomes (BACs) containing Spirodela DNA inserts with little or no repetitive elements as probes for multicolor fluorescence in situ hybridization (mcFISH), using an optimized BAC pooling strategy, to validate its physical map and correlate it with its chromosome complement. By consecutive mcFISH analyses, we assigned the originally assembled 32 pseudomolecules (supercontigs) of the genomic sequences to the 20 chromosomes of S. polyrhiza. A Spirodela cytogenetic map containing 96 BAC markers with an average distance of 0.89 Mbp was constructed. Using a cocktail of 41 BACs in three colors, all chromosome pairs could be individualized simultaneously. Seven ancestral blocks emerged from duplicated chromosome segments of 19 Spirodela chromosomes. The chromosomally integrated genome of S. polyrhiza and the established prerequisites for comparative chromosome painting enable future studies on the chromosome homoeology and karyotype evolution of duckweed species. © 2015 IPK Gatersleben. New Phytologist © 2015 New Phytologist Trust.
Jeong, Young-Min; Kim, Namshin; Ahn, Byung Ohg; Oh, Mijin; Chung, Won-Hyong; Chung, Hee; Jeong, Seongmun; Lim, Ki-Byung; Hwang, Yoon-Jung; Kim, Goon-Bo; Baek, Seunghoon; Choi, Sang-Bong; Hyung, Dae-Jin; Lee, Seung-Won; Sohn, Seong-Han; Kwon, Soo-Jin; Jin, Mina; Seol, Young-Joo; Chae, Won Byoung; Choi, Keun Jin; Park, Beom-Seok; Yu, Hee-Ju; Mun, Jeong-Hwan
2016-07-01
This study presents a chromosome-scale draft genome sequence of radish that is assembled into nine chromosomal pseudomolecules. A comprehensive comparative genome analysis with the Brassica genomes provides genomic evidence on the evolution of the mesohexaploid radish genome. Radish (Raphanus sativus L.) is an agronomically important root vegetable crop and its origin and phylogenetic position in the tribe Brassiceae are controversial. Here we present a comprehensive analysis of the radish genome based on the chromosome sequences of R. sativus cv. WK10039. The radish genome was sequenced and assembled into 426.2 Mb spanning >98 % of the gene space, of which 344.0 Mb were integrated into nine chromosome pseudomolecules. Approximately 36 % of the genome was repetitive sequences and 46,514 protein-coding genes were predicted and annotated. Comparative mapping of the tPCK-like ancestral genome revealed that the radish genome has intermediate characteristics between the Brassica A/C and B genomes in the triplicated segments, suggesting an internal origin from the genus Brassica. The evolutionary characteristics shared between radish and other Brassica species provided genomic evidence that the current form of nine chromosomes in radish was rearranged from the chromosomes of a hexaploid progenitor. Overall, this study provides a chromosome-scale draft genome sequence of radish as well as novel insight into the evolution of the mesohexaploid genomes in the tribe Brassiceae.
Microbial Ecology and Evolution in the Acid Mine Drainage Model System.
Huang, Li-Nan; Kuang, Jia-Liang; Shu, Wen-Sheng
2016-07-01
Acid mine drainage (AMD) is a unique ecological niche for acid- and toxic-metals-adapted microorganisms. These low-complexity systems offer a special opportunity for the ecological and evolutionary analyses of natural microbial assemblages. The last decade has witnessed an unprecedented interest in the study of AMD communities using 16S rRNA high-throughput sequencing and community genomic and postgenomic methodologies, significantly advancing our understanding of microbial diversity, community function, and evolution in acidic environments. This review describes new data on AMD microbial ecology and evolution, especially dynamics of microbial diversity, community functions, and population genomes, and further identifies gaps in our current knowledge that future research, with integrated applications of meta-omics technologies, will fill. Copyright © 2016 Elsevier Ltd. All rights reserved.
Genetics and evolution of triatomines: from phylogeny to vector control
Gourbière, S; Dorn, P; Tripet, F; Dumonteil, E
2012-01-01
Triatomines are hemipteran bugs acting as vectors of the protozoan parasite Trypanosoma cruzi. This parasite causes Chagas disease, one of the major parasitic diseases in the Americas. Studies of triatomine genetics and evolution have been particularly useful in the design of rational vector control strategies, and are reviewed here. The phylogeography of several triatomine species is now slowly emerging, and the struggle to reconcile the phenotypic, phylogenetic, ecological and epidemiological species concepts makes for a very dynamic field. Population genetic studies using different markers indicate a wide range of population structures, depending on the triatomine species, ranging from highly fragmented to mobile, interbreeding populations. Triatomines transmit T. cruzi in the context of complex interactions between the insect vectors, their bacterial symbionts and the parasites; however, an integrated view of the significance of these interactions in triatomine biology, evolution and disease transmission is still lacking. The development of novel genetic markers, together with the ongoing sequencing of the Rhodnius prolixus genome and more integrative studies, will provide key tools for expanding our understanding of these important insect vectors and allow the design of improved vector control strategies. PMID:21897436
Limsakul, Praopim; Peng, Qin; Wu, Yiqian; Allen, Molly E; Liang, Jing; Remacle, Albert G; Lopez, Tyler; Ge, Xin; Kay, Brian K; Zhao, Huimin; Strongin, Alex Y; Yang, Xiang-Lei; Lu, Shaoying; Wang, Yingxiao
2018-04-19
Monitoring enzymatic activities at the cell surface is challenging due to the poor efficiency of transport and membrane integration of fluorescence resonance energy transfer (FRET)-based biosensors. Therefore, we developed a hybrid biosensor with separate donor and acceptor that assemble in situ. The directed evolution and sequence-function analysis technologies were integrated to engineer a monobody variant (PEbody) that binds to R-phycoerythrin (R-PE) dye. PEbody was used for visualizing the dynamic formation/separation of intercellular junctions. We further fused PEbody with the enhanced CFP and an enzyme-specific peptide at the extracellular surface to create a hybrid FRET biosensor upon R-PE capture for monitoring membrane-type-1 matrix metalloproteinase (MT1-MMP) activities. This biosensor revealed asymmetric distribution of MT1-MMP activities, which were high and low at loose and stable cell-cell contacts, respectively. Therefore, directed evolution and rational design are promising tools to engineer molecular binders and hybrid FRET biosensors for monitoring molecular regulations at the surface of living cells. Copyright © 2018 Elsevier Ltd. All rights reserved.
Repetitive sequences in plant nuclear DNA: types, distribution, evolution and function.
Mehrotra, Shweta; Goyal, Vinod
2014-08-01
Repetitive DNA sequences are a major component of eukaryotic genomes and may account for up to 90% of the genome size. They can be divided into minisatellite, microsatellite and satellite sequences. Satellite DNA sequences are considered to be a fast-evolving component of eukaryotic genomes, comprising tandemly-arrayed, highly-repetitive and highly-conserved monomer sequences. The monomer unit of satellite DNA is 150-400 base pairs (bp) in length. Repetitive sequences may be species- or genus-specific, and may be centromeric or subtelomeric in nature. They exhibit cohesive and concerted evolution caused by molecular drive, leading to high sequence homogeneity. Repetitive sequences accumulate variations in sequence and copy number during evolution, hence they are important tools for taxonomic and phylogenetic studies, and are known as "tuning knobs" in evolution. Therefore, knowledge of repetitive sequences assists our understanding of the organization, evolution and behavior of eukaryotic genomes. Repetitive sequences have cytoplasmic, cellular and developmental effects and play a role in chromosomal recombination. In the post-genomics era, with the introduction of next-generation sequencing technology, it is possible to evaluate complex genomes for analyzing repetitive sequences and deciphering the yet unknown functional potential of repetitive sequences. Copyright © 2014 The Authors. Production and hosting by Elsevier Ltd. All rights reserved.
Droc, Gaëtan; Larivière, Delphine; Guignon, Valentin; Yahiaoui, Nabila; This, Dominique; Garsmeur, Olivier; Dereeper, Alexis; Hamelin, Chantal; Argout, Xavier; Dufayard, Jean-François; Lengelle, Juliette; Baurens, Franc-Christophe; Cenci, Alberto; Pitollat, Bertrand; D’Hont, Angélique; Ruiz, Manuel; Rouard, Mathieu; Bocs, Stéphanie
2013-01-01
Banana is one of the world’s favorite fruits and one of the most important crops for developing countries. The banana reference genome sequence (Musa acuminata) was recently released. Given the taxonomic position of Musa, the completed genomic sequence has particular comparative value to provide fresh insights about the evolution of the monocotyledons. The study of the banana genome has been enhanced by a number of tools and resources that allow harnessing its sequence. First, we set up essential tools such as a Community Annotation System, phylogenomics resources and metabolic pathways. Then, to support post-genomic efforts, we improved existing banana systems (e.g. web front end, query builder), we integrated available Musa data into generic systems (e.g. markers and genetic maps, synteny blocks), we made other existing systems containing Musa data (e.g. transcriptomics, rice reference genome, workflow manager) interoperable with the banana hub and, finally, we generated new results from sequence analyses (e.g. SNP and polymorphism analysis). Several use cases illustrate how the Banana Genome Hub can be used to study gene families. Overall, with this collaborative effort, we discuss the importance of the interoperability toward data integration between existing information systems. Database URL: http://banana-genome.cirad.fr/ PMID:23707967
Scott, Martin; Worden, Paul; Huntington, Peter; Hudson, Bernard; Karagiannis, Thomas; Charles, Ian G.; Djordjevic, Steven P.
2016-01-01
Pseudomonas aeruginosa are nosocomially acquired, opportunistic pathogens that pose a major threat to the health of burns patients and the immunocompromised. We sequenced the genomes of P. aeruginosa isolates RNS_PA1, RNS_PA46 and RNS_PAE05, which displayed resistance to almost all frontline antibiotics, including gentamicin, piperacillin, timentin, meropenem, ceftazidime and colistin. We provide evidence that the isolates are representatives of P. aeruginosa sequence type (ST) 235 and carry Tn6162 and Tn6163 in genomic islands 1 (GI1) and 2 (GI2), respectively. GI1 disrupts the endA gene at precisely the same chromosomal location as in P. aeruginosa strain VR-143/97, of unknown ST, creating an identical CA direct repeat. The class 1 integron associated with Tn6163 in GI2 carries a blaGES-5–aacA4–gcuE15–aphA15 cassette array conferring resistance to carbapenems and aminoglycosides. GI2 is flanked by a 12 nt direct repeat motif, abuts a tRNA-gly gene, and encodes proteins with putative roles in integration, conjugative transfer as well as integrative conjugative element-specific proteins. This suggests that GI2 may have evolved from a novel integrative conjugative element. Our data provide further support to the hypothesis that genomic islands play an important role in de novo evolution of multiple antibiotic resistance phenotypes in P. aeruginosa. PMID:26962050
Chromosome Evolution in Connection with Repetitive Sequences and Epigenetics in Plants
Li, Shu-Fen; Su, Ting; Cheng, Guang-Qian; Wang, Bing-Xiao; Li, Xu; Deng, Chuan-Liang; Gao, Wu-Jun
2017-01-01
Chromosome evolution is a fundamental aspect of evolutionary biology. The evolution of chromosome size, structure and shape, number, and the change in DNA composition suggest the high plasticity of nuclear genomes at the chromosomal level. Repetitive DNA sequences, which represent a conspicuous fraction of every eukaryotic genome, particularly in plants, are found to be tightly linked with plant chromosome evolution. Different classes of repetitive sequences have distinct distribution patterns on the chromosomes. Mounting evidence shows that repetitive sequences may play multiple generative roles in shaping the chromosome karyotypes in plants. Furthermore, recent development in our understanding of the repetitive sequences and plant chromosome evolution has elucidated the involvement of a spectrum of epigenetic modification. In this review, we focused on the recent evidence relating to the distribution pattern of repetitive sequences in plant chromosomes and highlighted their potential relevance to chromosome evolution in plants. We also discussed the possible connections between evolution and epigenetic alterations in chromosome structure and repatterning, such as heterochromatin formation, centromere function, and epigenetic-associated transposable element inactivation. PMID:29064432
Integrated Desert Terrain Forecasting for Military Operations
2013-02-15
N. Porat. The role of the Nile in initiating a massive dust influx to the Negev late in the middle Pleistocene, Geological Society of America... Negev, Israel, resulting from regional tectonics blocking Mediterranean frontal systems, Geology, (06 2006): 0. doi: 10.1130/G22354.1 12/16/2012...Finkel, N. Porat, Y. Enzel, Y. Eyal. Quaternary-scale evolution of sequences of talus flatirons in the hyperarid Negev, Geomorphology, (12 2010): 0
DOE Office of Scientific and Technical Information (OSTI.GOV)
Korenberg, J.R.
The ultimate goal of this research is to generate and apply novel technologies to speed completion and integration of the human genome map and sequence with biomedical problems. To do this, techniques were developed and genome-wide resources generated. This includes a genome-wide Mapped and Integrated BAC/PAC Resource that has been used for gene finding, map completion and anchoring, breakpoint definition and sequencing. In the last period of the grant, the Human Mapped BAC/PAC Resource was also applied to determine regions of human variation and to develop a novel paradigm of primate evolution through to humans. Further, in order to more rapidly evaluate animal models of human disease, a BAC Map of the mouse was generated in collaboration with the MTI Genome Center, Dr. Bruce Birren.
NASA Astrophysics Data System (ADS)
Lazzez, Marzouk; Zouaghi, Taher; Ben Youssef, Mohamed
2008-08-01
A multidisciplinary study concerning Aptian and Albian deposits is reported from petroleum wells and the exposed section. The biostratigraphic and sedimentological analysis defined four sedimentary units. Analysis of well-logging signals allows us to refine the record resolution on Aptian series and reveals, in the Djeffara field, a transgressive system tract (TST) and a highstand system tract (HST). Exceptionally, the first sequence (S1) in the Mareth 1 well and the fifth sequence in the two wells Mareth 1 and Gourine 1 reveal the lowstand system tract (LST). The unconformities, characterized by the absence of Upper Aptian (Clansayesian) and Lower to Middle Albian deposits, are marked by a significant gamma-ray reduction. The Middle and Upper Albian is represented by only one deposit sequence (S6) in Mareth 1. Towards the south, in the Gourine well, two deposit sequences were identified (S6 and S7); to specify the Aptian and Albian evolution of the deposit sequences, a tentative correlation has been established between the Chotts and Djeffara areas. This correlation allows us to characterize the sedimentary unconformities related to the tectonics and eustatic events. The Chotts and the Djeffara deposition areas were developed, characterized by an irregular subsidence and separated by the Tebaga Medenine high area. The Aptian-Albian subsidence platform of southern Tunisia may be considered as a block diagram of environmental deposit with regressive and transgressive trends, showing the impact of tectonic deformations on the palaeogeographic evolution of southeastern Tunisia during the Austrian phase. This study must also be placed within regional structural patterns that may explain both the sequential and sedimentological evolution of the area. Deformations regionally identified are integrated in the more general context of both Tethyan and Atlantic areas related to the drift of the African platform.
Skinner, Michael K
2015-04-26
Environment has a critical role in the natural selection process for Darwinian evolution. The primary molecular component currently considered for neo-Darwinian evolution involves genetic alterations and random mutations that generate the phenotypic variation required for natural selection to act. The vast majority of environmental factors cannot directly alter DNA sequence. Epigenetic mechanisms directly regulate genetic processes and can be dramatically altered by environmental factors. Therefore, environmental epigenetics provides a molecular mechanism to directly alter phenotypic variation generationally. Lamarck proposed in 1802 the concept that environment can directly alter phenotype in a heritable manner. Environmental epigenetics and epigenetic transgenerational inheritance provide molecular mechanisms for this process. Therefore, environment can on a molecular level influence the phenotypic variation directly. The ability of environmental epigenetics to alter phenotypic and genotypic variation directly can significantly impact natural selection. Neo-Lamarckian concept can facilitate neo-Darwinian evolution. A unified theory of evolution is presented to describe the integration of environmental epigenetic and genetic aspects of evolution. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Update on Genomic Databases and Resources at the National Center for Biotechnology Information.
Tatusova, Tatiana
2016-01-01
The National Center for Biotechnology Information (NCBI), as a primary public repository of genomic sequence data, collects and maintains enormous amounts of heterogeneous data. Data for genomes, genes, gene expressions, gene variation, gene families, proteins, and protein domains are integrated with the analytical, search, and retrieval resources through the NCBI website. The text-based search and retrieval system provides a fast and easy way to navigate across diverse biological databases. Comparative genome analysis tools lead to further understanding of evolution processes, quickening the pace of discovery. Recent technological innovations have ignited an explosion in genome sequencing that has fundamentally changed our understanding of the biology of living organisms. This huge increase in DNA sequence data presents new challenges for the information management system and the visualization tools. New strategies have been designed to bring order to this genome sequence shockwave and improve the usability of associated data.
Hybridization capture reveals evolution and conservation across the entire Koala retrovirus genome.
Tsangaras, Kyriakos; Siracusa, Matthew C; Nikolaidis, Nikolas; Ishida, Yasuko; Cui, Pin; Vielgrader, Hanna; Helgen, Kristofer M; Roca, Alfred L; Greenwood, Alex D
2014-01-01
The koala retrovirus (KoRV) is the only retrovirus known to be in the midst of invading the germ line of its host species. Hybridization capture and next generation sequencing were used on modern and museum DNA samples of koala (Phascolarctos cinereus) to examine ca. 130 years of evolution across the full KoRV genome. Overall, the entire proviral genome appeared to be conserved across time in sequence, protein structure and transcriptional binding sites. A total of 138 polymorphisms were detected, of which 72 were found in more than one individual. At every polymorphic site in the museum koalas, one of the character states matched that of modern KoRV. Among non-synonymous polymorphisms, radical substitutions involving large physiochemical differences between amino acids were elevated in env, potentially reflecting anti-viral immune pressure or avoidance of receptor interference. Polymorphisms were not detected within two functional regions believed to affect infectivity. Host sequences flanking proviral integration sites were also captured; with few proviral loci shared among koalas. Recently described variants of KoRV, designated KoRV-B and KoRV-J, were not detected in museum samples, suggesting that these variants may be of recent origin. PMID:24752422
The evolution of transcriptional regulation in eukaryotes
NASA Technical Reports Server (NTRS)
Wray, Gregory A.; Hahn, Matthew W.; Abouheif, Ehab; Balhoff, James P.; Pizer, Margaret; Rockman, Matthew V.; Romano, Laura A.
2003-01-01
Gene expression is central to the genotype-phenotype relationship in all organisms, and it is an important component of the genetic basis for evolutionary change in diverse aspects of phenotype. However, the evolution of transcriptional regulation remains understudied and poorly understood. Here we review the evolutionary dynamics of promoter, or cis-regulatory, sequences and the evolutionary mechanisms that shape them. Existing evidence indicates that populations harbor extensive genetic variation in promoter sequences, that a substantial fraction of this variation has consequences for both biochemical and organismal phenotype, and that some of this functional variation is sorted by selection. As with protein-coding sequences, rates and patterns of promoter sequence evolution differ considerably among loci and among clades for reasons that are not well understood. Studying the evolution of transcriptional regulation poses empirical and conceptual challenges beyond those typically encountered in analyses of coding sequence evolution: promoter organization is much less regular than that of coding sequences, and sequences required for the transcription of each locus reside at multiple other loci in the genome. Because of the strong context-dependence of transcriptional regulation, sequence inspection alone provides limited information about promoter function. Understanding the functional consequences of sequence differences among promoters generally requires biochemical and in vivo functional assays. Despite these challenges, important insights have already been gained into the evolution of transcriptional regulation, and the pace of discovery is accelerating.
Taylor, William R.; Gibbs, Melanie; Breuker, Casper J.; Holland, Peter W. H.
2014-01-01
Gene duplications within the conserved Hox cluster are rare in animal evolution, but in Lepidoptera an array of divergent Hox-related genes (Shx genes) has been reported between pb and zen. Here, we use genome sequencing of five lepidopteran species (Polygonia c-album, Pararge aegeria, Callimorpha dominula, Cameraria ohridella, Hepialus sylvina) plus a caddisfly outgroup (Glyphotaelius pellucidus) to trace the evolution of the lepidopteran Shx genes. We demonstrate that Shx genes originated by tandem duplication of zen early in the evolution of large clade Ditrysia; Shx are not found in a caddisfly and a member of the basally diverging Hepialidae (swift moths). Four distinct Shx genes were generated early in ditrysian evolution, and were stably retained in all descendent Lepidoptera except the silkmoth which has additional duplications. Despite extensive sequence divergence, molecular modelling indicates that all four Shx genes have the potential to encode stable homeodomains. The four Shx genes have distinct spatiotemporal expression patterns in early development of the Speckled Wood butterfly (Pararge aegeria), with ShxC demarcating the future sites of extraembryonic tissue formation via strikingly localised maternal RNA in the oocyte. All four genes are also expressed in presumptive serosal cells, prior to the onset of zen expression. Lepidopteran Shx genes represent an unusual example of Hox cluster expansion and integration of novel genes into ancient developmental regulatory networks. PMID:25340822
Prof. Hayashi's work on the pre-main sequence evolution and brown dwarfs
NASA Astrophysics Data System (ADS)
Nakano, Takenori
2012-09-01
Prof. Hayashi's work on the evolution of stars in the pre-main sequence stage is reviewed. The historical background and the process of finding the Hayashi phase are mentioned. The work on the evolution of low-mass stars is also reviewed including the determination of the bottom of the main sequence and evolution of brown dwarfs, and comparison is made with the other works in the same period.
Mizas, Ch; Sirakoulis, G Ch; Mardiris, V; Karafyllidis, I; Glykos, N; Sandaltzopoulos, R
2008-04-01
Change of DNA sequence that fuels evolution is, to a certain extent, a deterministic process because mutagenesis does not occur in an absolutely random manner. So far, it has not been possible to decipher the rules that govern DNA sequence evolution due to the extreme complexity of the entire process. In our attempt to approach this issue we focus solely on the mechanisms of mutagenesis and deliberately disregard the role of natural selection. Hence, in this analysis, evolution refers to the accumulation of genetic alterations that originate from mutations and are transmitted through generations without being subjected to natural selection. We have developed a software tool that allows modelling of a DNA sequence as a one-dimensional cellular automaton (CA) with four states per cell which correspond to the four DNA bases, i.e. A, C, T and G. The four states are represented by numbers of the quaternary number system. Moreover, we have developed genetic algorithms (GAs) in order to determine the rules of CA evolution that simulate the DNA evolution process. Linear evolution rules were considered and square matrices were used to represent them. If DNA sequences of different evolution steps are available, our approach allows the determination of the underlying evolution rule(s). Conversely, once the evolution rules are deciphered, our tool may reconstruct the DNA sequence in any previous evolution step for which the exact sequence information was unknown. The developed tool may be used to test various parameters that could influence evolution. We describe a paradigm relying on the assumption that mutagenesis is governed by a near-neighbour-dependent mechanism. Based on the satisfactory performance of our system in the deliberately simplified example, we propose that our approach could offer a starting point for future attempts to understand the mechanisms that govern evolution. The developed software is open-source and has a user-friendly graphical input interface.
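The modelling idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' software: the base-to-digit mapping (A, C, T, G as 0 to 3), the periodic boundary, and the specific rule form (a weighted sum of the left neighbour, the cell itself, and the right neighbour, taken modulo 4) are assumptions made here for illustration. The abstract describes genetic algorithms for finding the rules; because this toy rule space has only 64 members, the sketch swaps in an exhaustive search instead.

```python
from itertools import product

# Assumed encoding: the four bases as quaternary digits, in the order A, C, T, G.
BASE_TO_DIGIT = {"A": 0, "C": 1, "T": 2, "G": 3}
DIGIT_TO_BASE = {digit: base for base, digit in BASE_TO_DIGIT.items()}


def encode(seq):
    """Turn a DNA string into a list of cell states (0..3)."""
    return [BASE_TO_DIGIT[base] for base in seq.upper()]


def decode(states):
    """Turn a list of cell states back into a DNA string."""
    return "".join(DIGIT_TO_BASE[s] for s in states)


def step(states, rule):
    """Apply one step of a linear near-neighbour rule (hypothetical form).

    rule = (a, b, c); the new state of cell i is
    (a * left + b * self + c * right) mod 4, with periodic boundaries.
    """
    a, b, c = rule
    n = len(states)
    return [
        (a * states[(i - 1) % n] + b * states[i] + c * states[(i + 1) % n]) % 4
        for i in range(n)
    ]


def evolve(seq, rule, steps):
    """Evolve a DNA string for a given number of automaton steps."""
    states = encode(seq)
    for _ in range(steps):
        states = step(states, rule)
    return decode(states)


def find_rules(before, after):
    """Exhaustively list every (a, b, c) rule mapping `before` to `after` in one step.

    This brute-force search stands in for the genetic algorithm mentioned in
    the abstract; it is only feasible because the toy rule space is tiny.
    """
    start, target = encode(before), encode(after)
    return [
        rule for rule in product(range(4), repeat=3) if step(start, rule) == target
    ]


if __name__ == "__main__":
    ancestor = "ACTGGTCA"
    descendant = evolve(ancestor, (1, 0, 0), 1)  # each cell copies its left neighbour
    print(descendant)
    print(find_rules(ancestor, descendant))
```

Given two sequences from consecutive steps, `find_rules` returns every rule in this small family that is consistent with the observed change. Short sequences may be compatible with several rules, which is one reason longer sequences or several time points would be needed in practice.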
Nearly complete 28S rRNA gene sequences confirm new hypotheses of sponge evolution.
Thacker, Robert W; Hill, April L; Hill, Malcolm S; Redmond, Niamh E; Collins, Allen G; Morrow, Christine C; Spicer, Lori; Carmack, Cheryl A; Zappe, Megan E; Pohlmann, Deborah; Hall, Chelsea; Diaz, Maria C; Bangalore, Purushotham V
2013-09-01
The highly collaborative research sponsored by the NSF-funded Assembling the Porifera Tree of Life (PorToL) project is providing insights into some of the most difficult questions in metazoan systematics. Our understanding of phylogenetic relationships within the phylum Porifera has changed considerably with increased taxon sampling and data from additional molecular markers. PorToL researchers have falsified earlier phylogenetic hypotheses, discovered novel phylogenetic alliances, found phylogenetic homes for enigmatic taxa, and provided a more precise understanding of the evolution of skeletal features, secondary metabolites, body organization, and symbioses. Some of these exciting new discoveries are shared in the papers that form this issue of Integrative and Comparative Biology. Our analyses of over 300 nearly complete 28S ribosomal subunit gene sequences provide specific case studies that illustrate how our dataset confirms new hypotheses of sponge evolution. We recovered monophyletic clades for all 4 classes of sponges, as well as the 4 major clades of Demospongiae (Keratosa, Myxospongiae, Haploscleromorpha, and Heteroscleromorpha), but our phylogeny differs in several aspects from traditional classifications. In most major clades of sponges, families within orders appear to be paraphyletic. Although additional sampling of genes and taxa are needed to establish whether this pattern results from a lack of phylogenetic resolution or from a paraphyletic classification system, many of our results are congruent with those obtained from 18S ribosomal subunit gene sequences and complete mitochondrial genomes. These data provide further support for a revision of the traditional classification of sponges. PMID:23748742
Gupta, Radhey S.
1998-12-01
The presence of shared conserved insertion or deletions (indels) in protein sequences is a special type of signature sequence that shows considerable promise for phylogenetic inference. An alternative model of microbial evolution based on the use of indels of conserved proteins and the morphological features of prokaryotic organisms is proposed. In this model, extant archaebacteria and gram-positive bacteria, which have a simple, single-layered cell wall structure, are termed monoderm prokaryotes. They are believed to be descended from the most primitive organisms. Evidence from indels supports the view that the archaebacteria probably evolved from gram-positive bacteria, and I suggest that this evolution occurred in response to antibiotic selection pressures. Evidence is presented that diderm prokaryotes (i.e., gram-negative bacteria), which have a bilayered cell wall, are derived from monoderm prokaryotes. Signature sequences in different proteins provide a means to define a number of different taxa within prokaryotes (namely, low G+C and high G+C gram-positive, Deinococcus-Thermus, cyanobacteria, chlamydia-cytophaga related, and two different groups of Proteobacteria) and to indicate how they evolved from a common ancestor. Based on phylogenetic information from indels in different protein sequences, it is hypothesized that all eukaryotes, including amitochondriate and aplastidic organisms, received major gene contributions from both an archaebacterium and a gram-negative eubacterium. In this model, the ancestral eukaryotic cell is a chimera that resulted from a unique fusion event between the two separate groups of prokaryotes followed by integration of their genomes. PMID:9841678
Barrera-Redondo, Josué; Ramírez-Barahona, Santiago; Eguiarte, Luis E
2018-05-01
Variation in rates of molecular evolution (heterotachy) is a common phenomenon among plants. Although multiple theoretical models have been proposed, fundamental questions remain regarding the combined effects of ecological and morphological traits on rate heterogeneity. Here, we used tree ferns to explore the correlation between rates of molecular evolution in chloroplast DNA sequences and several morphological and environmental factors within a Bayesian framework. We revealed direct and indirect effects of body size, biological productivity, and temperature on substitution rates, where smaller tree ferns living in warmer and less productive environments tend to have faster rates of molecular evolution. In addition, we found that variation in the ratio of nonsynonymous to synonymous substitution rates (dN/dS) in the chloroplast rbcL gene was significantly correlated with ecological and morphological variables. Heterotachy in tree ferns may be influenced by effective population size associated with variation in body size and productivity. Macroevolutionary hypotheses should go beyond explaining heterotachy in terms of mutation rates and instead, should integrate population-level factors to better understand the processes affecting the tempo of evolution at the molecular level. © 2018 The Author(s). Evolution © 2018 The Society for the Study of Evolution.
Voorhees, Ian E H; Dalziel, Benjamin D; Glaser, Amy; Dubovi, Edward J; Murcia, Pablo R; Newbury, Sandra; Toohey-Kurth, Kathy; Su, Shuo; Kriti, Divya; Van Bakel, Harm; Goodman, Laura B; Leutenegger, Christian; Holmes, Edward C; Parrish, Colin R
2018-06-06
Avian-origin H3N2 canine influenza virus (CIV) transferred to dogs in Asia around 2005, becoming enzootic throughout China and Korea before reaching the USA in early 2015. To understand the post-transfer evolution and epidemiology of this virus, particularly the cause of recent and ongoing increases in incidence in the USA, we performed an integrated analysis of whole-genome sequence data from 64 newly sequenced viruses and comprehensive surveillance data. This reveals that the circulation of H3N2 CIV within the USA is typified by recurrent epidemic burst-fadeout dynamics driven by multiple introductions of virus from Asia. Although all major viral lineages displayed similar rates of genomic sequence evolution, H3N2 CIV consistently exhibited proportionally more non-synonymous substitutions per site compared to avian reservoir viruses, indicative of a large-scale change in selection pressures. Despite these genotypic differences, we found no evidence of adaptive evolution or increased viral transmission, with epidemiological models indicating a basic reproductive number, R0, of between 1 and 1.5 across nearly all USA outbreaks, consistent with maintained, but heterogeneous circulation. We propose that CIV's mode of viral circulation may have resulted in evolutionary cul-de-sacs, in which there is little opportunity for the selection of the more transmissible H3N2 CIV phenotypes necessary to enable circulation through a general dog population characterized by widespread contact heterogeneity. CIV must therefore rely on metapopulations of high host density (notably animal shelters) within the greater dog population and reintroduction from other populations or face complete epidemic extinction. IMPORTANCE The relatively recent appearance of influenza A virus (IAV) epidemics in dogs expands our understanding of IAV host-range and ecology, providing useful and relevant models for understanding critical factors involved in viral emergence.
Here, we integrate viral whole-genome sequence analysis and comprehensive surveillance data to examine the evolution of the emerging avian-origin H3N2 canine influenza virus (CIV), particularly the factors driving ongoing circulation and recent increase in incidence of the virus within the USA. Our results provide a detailed understanding of how H3N2 CIV achieves sustained circulation within the USA, despite widespread host contact heterogeneity and recurrent epidemic fade-out. Moreover, our findings suggest that the types and intensity of selection pressures an emerging virus experiences are highly dependent on host population structure and ecology, and may inhibit an emerging virus from acquiring sustained epidemic or pandemic circulation. Copyright © 2018 American Society for Microbiology.
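A textbook calculation shows why a basic reproductive number only slightly above 1 is compatible with the burst-fadeout dynamics described in this abstract. The sketch below is not the epidemiological model used in the study. It assumes a simple branching process in which each infected host causes a Poisson-distributed number of new infections with mean R0, so that the probability q that a single introduction dies out is the smallest solution of q = exp(R0 (q - 1)). The function name and tolerance settings are illustrative choices.

```python
import math


def extinction_probability(r0, tol=1e-12, max_iter=100_000):
    """Probability that an outbreak started by one case eventually dies out.

    Assumes Poisson-distributed secondary cases with mean r0 (an assumption
    of this sketch). Iterating q <- exp(r0 * (q - 1)) from q = 0 converges
    to the smallest fixed point, which is the extinction probability.
    """
    q = 0.0
    for _ in range(max_iter):
        q_next = math.exp(r0 * (q - 1.0))
        if abs(q_next - q) < tol:
            return q_next
        q = q_next
    return q  # slow convergence near r0 = 1; return the best estimate so far


if __name__ == "__main__":
    for r0 in (0.8, 1.1, 1.25, 1.5):
        print(f"R0 = {r0:<4}  P(single introduction fades out) = "
              f"{extinction_probability(r0):.3f}")
```

Under these assumptions, a substantial share of single introductions fade out by chance even at the upper end of the reported R0 range, which fits qualitatively with circulation that depends on repeated reintroduction.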
Nero, Thomas M; Dalia, Triana N; Wang, Joseph Che-Yen; Kysela, David T; Bochman, Matthew L; Dalia, Ankur B
2018-05-02
Acquisition of foreign DNA by natural transformation is an important mechanism of adaptation and evolution in diverse microbial species. Here, we characterize the mechanism of ComM, a broadly conserved AAA+ protein previously implicated in homologous recombination of transforming DNA (tDNA) in naturally competent Gram-negative bacterial species. In vivo, we found that ComM was required for efficient comigration of linked genetic markers in Vibrio cholerae and Acinetobacter baylyi, which is consistent with a role in branch migration. Also, ComM was particularly important for integration of tDNA with increased sequence heterology, suggesting that its activity promotes the acquisition of novel DNA sequences. In vitro, we showed that purified ComM binds ssDNA, oligomerizes into a hexameric ring, and has bidirectional helicase and branch migration activity. Based on these data, we propose a model for tDNA integration during natural transformation. This study provides mechanistic insight into the enigmatic steps involved in tDNA integration and uncovers the function of a protein required for this conserved mechanism of horizontal gene transfer.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bohacs, K.M.
1990-05-01
Deep basinal rocks of the Monterey Formation can be allocated to different depositional environments based on an integration of bedding, facies stacking patterns, lithology, biofacies, and inorganic and organic chemistry. These rocks show evidence of systematic changes in depositional environments that can be related to eustatic sea level change and basin evolution. Even deep-basinal environments are affected by changing sea level through changes in circulation patterns and intensities, nutrient budgets and dispersal patterns, and location and intensity of the oceanic oxygen minimum. The sequence-stratigraphic framework was constructed based on the physical expression of the outcrop strata and confirmed by tying the outcrop sections to an integrated well-log/seismic grid through outcrop gamma-ray-spectral profiles. Interpretation of a sequence boundary was based on increased proportions of hemipelagic facies, evidence of increased bottom-energy levels above the boundary, and local erosion and relief on the surface. The proportion of shallower water and reworked dinoflagellates increased to a local maximum above the boundary. Downlap surfaces exhibited increased proportions of pelagic facies around the surface, evidence of decreased bottom-energy levels and terrigenous sedimentation rates, and little or no significant erosion on the surface. The proportion of deeper water dinoflagellates increased to a local maximum at or near the downlap surface; there was no evidence of reworked individuals. The detailed sequence-stratigraphic framework makes it possible to relate the rock properties to genetic processes for construction of predictive models.
Dissecting the relationship between protein structure and sequence variation
NASA Astrophysics Data System (ADS)
Shahmoradi, Amir; Wilke, Claus; Wilke Lab Team
2015-03-01
Over the past decade several independent works have shown that some structural properties of proteins are capable of predicting protein evolution. The strength and significance of these structure-sequence relations, however, appear to vary widely among different proteins, with absolute correlation strengths ranging from 0.1 to 0.8. Here we present the results from a comprehensive search for the potential biophysical and structural determinants of protein evolution by studying more than 200 structural and evolutionary properties in a dataset of 209 monomeric enzymes. We discuss the main protein characteristics responsible for the general patterns of protein evolution, and identify sequence divergence as the main determinant of the strengths of virtually all structure-evolution relationships, explaining ~10-30% of observed variation in sequence-structure relations. In addition to sequence divergence, we identify several protein structural properties that are moderately but significantly coupled with the strength of sequence-structure relations. In particular, proteins with more homogeneous backbone hydrogen bond energies, large fractions of helical secondary structures and low fractions of beta sheets tend to have the strongest sequence-structure relation. BEACON-NSF center for the study of evolution in action.
Maddin, Hillary C; Reisz, Robert R; Anderson, Jason S
2010-01-01
Ontogenetic data can play a prominent role in addressing questions in tetrapod evolution, but such evidence from the fossil record is often incompletely considered because it is limited to initiation of ossification, or allometric changes with increasing size. In the present study, specimens of a new species of an archaic amphibian (280 Myr old), Acheloma n. sp., a member of the temnospondyl superfamily Dissorophoidea and the sister group to Amphibamidae, which is thought to include at least two of our modern amphibian clades, anurans and caudatans (Batrachia), provide us with new developmental data. We identify five ontogenetic events, enabling us to construct a partial ontogenetic trajectory (integration of developmental and transformation sequence data) related to the relative timing of completion of neurocranial structures. Comparison of the adult amphibamid morphology with this partial ontogeny identifies a heterochronic event that occurred within the neurocranium at some point in time between the two taxa, which is consistent with the predictions of miniaturization in amphibamids, providing the first insights into the influence of miniaturization on the neurocranium in a fossil tetrapod group. This study refines hypotheses of large-scale evolutionary trends within Dissorophoidea that may have facilitated the radiation of amphibamids and, projected forward, the origin of the generalized batrachian skull. Most importantly, this study highlights the importance of integrating developmental and transformation sequence data, instead of onset of ossification alone, into investigations of major events in tetrapod evolution using evidence provided by the fossil record, and highlights the value of even highly incomplete growth series composed of relatively late-stage individuals.
CRISPR interference: RNA-directed adaptive immunity in bacteria and archaea
Marraffini, Luciano A.; Sontheimer, Erik J.
2010-01-01
Sequence-directed genetic interference pathways control gene expression and preserve genome integrity in all kingdoms of life. The importance of such pathways is highlighted by the extensive study of RNA interference (RNAi) and related processes in eukaryotes. In many bacteria and most archaea, clustered, regularly interspaced short palindromic repeats (CRISPRs) are involved in a more recently discovered interference pathway that protects cells from bacteriophages and conjugative plasmids. CRISPR sequences provide an adaptive, heritable record of past infections and express CRISPR RNAs — small RNAs that target invasive nucleic acids. Here, we review the mechanisms of CRISPR interference and its roles in microbial physiology and evolution. We also discuss potential applications of this novel interference pathway. PMID:20125085
Evolution Analysis of Simple Sequence Repeats in Plant Genome.
Qin, Zhen; Wang, Yanping; Wang, Qingmei; Li, Aixian; Hou, Fuyun; Zhang, Liming
2015-01-01
Simple sequence repeats (SSRs) are widespread units on genome sequences, and play many important roles in plants. In order to reveal the evolution of plant genomes, we investigated the evolutionary regularities of SSRs during the evolution of plant species and the plant kingdom by analysis of twelve sequenced plant genomes. First, in the twelve studied plant genomes, the main SSRs were those with repeat units of 1-3 nucleotides. Second, in mononucleotide SSRs, the A/T percentage gradually increased along with the evolution of plants (except for P. patens). With the increase of SSR repeat number, the percentage of A/T in C. reinhardtii had no significant change, while the percentage of A/T in terrestrial plant species gradually declined. Third, in dinucleotide SSRs, the percentage of AT/TA increased along with the evolution of the plant kingdom and with increasing repeat number in terrestrial plant species. This trend was more obvious in dicotyledons than in monocotyledons. The percentage of CG/GC showed the opposite pattern to AT/TA. Fourth, in trinucleotide SSRs, the percentages of combinations including two or three A/T showed a rising trend along with the evolution of the plant kingdom; meanwhile, with the increase of SSR repeat number in plant species, different species chose different combinations as dominant SSRs. SSRs in C. reinhardtii, P. patens, Z. mays and A. thaliana showed specific patterns related to evolutionary position or specific changes of genome sequences. The results showed that SSRs not only followed a general pattern in the evolution of the plant kingdom, but also were associated with the evolution of the specific genome sequence. The study of the evolutionary regularities of SSRs provided new insights for the analysis of plant genome evolution.
Microbial evolution of sulphate reduction when lateral gene transfer is geographically restricted.
Chi Fru, E
2011-07-01
Lateral gene transfer (LGT) is an important mechanism by which micro-organisms acquire new functions. This process has been suggested to be central to prokaryotic evolution in various environments. However, the influence of geographical constraints on the evolution of laterally acquired genes in microbial metabolic evolution is not yet well understood. In this study, the influence of geographical isolation on the evolution of laterally acquired dissimilatory sulphite reductase (dsr) gene sequences in the sulphate-reducing micro-organisms (SRM) was investigated. Sequences on four continental blocks related to SRM known to have received dsr by LGT were analysed using standard phylogenetic and multidimensional statistical methods. Sequences related to lineages with large genetic diversity correlated positively with habitat divergence. Those affiliated to Thermodesulfobacterium indicated strong biogeographical delineation; hydrothermal-vent sequences clustered independently from hot-spring sequences. Some of the hydrothermal-vent and hot-spring sequences suggested to have been acquired from a common ancestral source may have diverged upon isolation within distinct habitats. In contrast, analysis of some Desulfotomaculum sequences indicated they could have been transferred from different ancestral sources but converged upon isolation within the same niche. These results hint that, after lateral acquisition of dsr genes, barriers to gene flow probably play a strong role in their subsequent evolution.
A Generative Angular Model of Protein Structure Evolution
Golden, Michael; García-Portugués, Eduardo; Sørensen, Michael; Mardia, Kanti V.; Hamelryck, Thomas; Hein, Jotun
2017-01-01
Recently described stochastic models of protein evolution have demonstrated that the inclusion of structural information in addition to amino acid sequences leads to a more reliable estimation of evolutionary parameters. We present a generative, evolutionary model of protein structure and sequence that is valid on a local length scale. The model concerns the local dependencies between sequence and structure evolution in a pair of homologous proteins. The evolutionary trajectory between the two structures in the protein pair is treated as a random walk in dihedral angle space, which is modeled using a novel angular diffusion process on the two-dimensional torus. Coupling sequence and structure evolution in our model allows for modeling both “smooth” conformational changes and “catastrophic” conformational jumps, conditioned on the amino acid changes. The model has interpretable parameters and is comparatively more realistic than previous stochastic models, providing new insights into the relationship between sequence and structure evolution. For example, using the trained model we were able to identify an apparent sequence–structure evolutionary motif present in a large number of homologous protein pairs. The generative nature of our model enables us to evaluate its validity and its ability to simulate aspects of protein evolution conditioned on an amino acid sequence, a related amino acid sequence, a related structure or any combination thereof. PMID:28453724
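To make the idea of a random walk in dihedral angle space concrete, the sketch below simulates a plain wrapped Gaussian random walk of a (phi, psi) pair on the two-dimensional torus. It is a deliberately simplified stand-in, not the angular diffusion process introduced in the paper; the names `wrap` and `torus_walk` and the step size `sigma` are assumptions made for the example.

```python
import math
import random


def wrap(angle):
    """Map an angle in radians to [-pi, pi), up to floating-point rounding."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def torus_walk(phi0, psi0, steps, sigma=0.1, seed=0):
    """Simulate independent Gaussian steps in phi and psi, wrapped onto the torus."""
    rng = random.Random(seed)
    phi, psi = wrap(phi0), wrap(psi0)
    path = [(phi, psi)]
    for _ in range(steps):
        # Wrapping after every step keeps both angles periodic, so a walk that
        # crosses +pi re-enters at -pi instead of drifting off to infinity.
        phi = wrap(phi + rng.gauss(0.0, sigma))
        psi = wrap(psi + rng.gauss(0.0, sigma))
        path.append((phi, psi))
    return path


if __name__ == "__main__":
    # Start near a typical alpha-helical (phi, psi) region, in radians.
    trajectory = torus_walk(-1.0, -0.8, steps=1000)
    print(trajectory[-1])
```

Small `sigma` values give the smooth drift of a conformation, whereas modeling abrupt jumps would need an additional mechanism that this sketch does not include.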
The proximal-to-distal sequence in upper-limb motions on multiple levels and time scales.
Serrien, Ben; Baeyens, Jean-Pierre
2017-10-01
The proximal-to-distal sequence is a phenomenon that can be observed in a large variety of motions of the upper limbs in both humans and other mammals. The mechanisms behind this sequence are not completely understood and motor control theories able to explain this phenomenon are currently incomplete. The aim of this narrative review is to take a theoretical constraints-led approach to the proximal-to-distal sequence and provide a broad multidisciplinary overview of relevant literature. This sequence exists at multiple levels (brain, spine, muscles, kinetics and kinematics) and on multiple time scales (motion, motor learning and development, growth and possibly even evolution). We hypothesize that the proximodistal spatiotemporal direction on each time scale and level provides part of the organismic constraints that guide the dynamics at the other levels and time scales. The constraint-led approach in this review may serve as a first onset towards integration of evidence and a framework for further experimentation to reveal the dynamics of the proximal-to-distal sequence. Copyright © 2017 Elsevier B.V. All rights reserved.
PlantRGDB: A Database of Plant Retrocopied Genes.
Wang, Yi
2017-01-01
RNA-based gene duplication, known as retrocopy, plays important roles in gene origination and genome evolution. The genomes of many plants have been sequenced, offering an opportunity to annotate and mine the retrocopies in plant genomes. However, comprehensive and unified annotation of retrocopies in these plants is still lacking. In this study I constructed the PlantRGDB (Plant Retrocopied Gene DataBase), the first database of plant retrocopies, to provide a putatively complete centralized list of retrocopies in plant genomes. The database is freely accessible at http://probes.pw.usda.gov/plantrgdb or http://aegilops.wheat.ucdavis.edu/plantrgdb. It currently integrates 49 plant species and 38,997 retrocopies along with characterization information. PlantRGDB provides a user-friendly web interface for searching, browsing and downloading the retrocopies in the database. PlantRGDB also offers graphical viewer-integrated sequence information for displaying the structure of each retrocopy. The attributes of the retrocopies of each species are reported using a browse function. In addition, useful tools, such as an advanced search and BLAST, are available to search the database more conveniently. In conclusion, the database will provide a web platform for obtaining valuable insight into the generation of retrocopies and will supplement research on gene duplication and genome evolution in plants. © The Author 2017. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists. All rights reserved. For permissions, please email: journals.permissions@oup.com.
Marsic, Damien; Govindasamy, Lakshmanan; Currlin, Seth; Markusic, David M; Tseng, Yu-Shan; Herzog, Roland W; Agbandje-McKenna, Mavis; Zolotukhin, Sergei
2014-01-01
Methodologies to improve existing adeno-associated virus (AAV) vectors for gene therapy include either rational approaches or directed evolution to derive capsid variants characterized by superior transduction efficiencies in targeted tissues. Here, we integrated both approaches in one unified design strategy of “virtual family shuffling” to derive a combinatorial capsid library whereby only variable regions on the surface of the capsid are modified. Individual sublibraries were first assembled in order to preselect compatible amino acid residues within restricted surface-exposed regions to minimize the generation of dead-end variants. Subsequently, the successful families were interbred to derive a combined library of ~8 × 10^5 complexity. Next-generation sequencing of the packaged viral DNA revealed capsid surface areas susceptible to directed evolution, thus providing guidance for future designs. We demonstrated the utility of the library by deriving an AAV2-based vector characterized by a 20-fold higher transduction efficiency in murine liver, now equivalent to that of AAV8. PMID:25048217
Orbital evolution of 95/P Chiron, 39P/Oterma, 29P/Schwassmann-Wachmann 1, and of 33 Centaurs
NASA Astrophysics Data System (ADS)
Kovalenko, N. S.; Churyumov, K. I.; Babenko, Yu. G.
2011-12-01
The paper is devoted to numerical modeling of the orbital evolution of 34 Centaurs and 2 distant Jupiter-family comets, 39P/Oterma and 29P/Schwassmann-Wachmann 1. As a result, the evolutionary tracks of the orbital elements of 33 Centaurs and 3 comets (95/P Chiron (2060), 39P/Oterma and 29P/Schwassmann-Wachmann 1) are obtained. The integrations were carried out for 1 Myr backward and forward in time, starting at epoch, using the implicit single sequence Everhart methods. A statistical analysis of the numerical integration results was performed, and trends in the changes of Centaurs' orbital elements in the past and in the future are revealed. The fraction of Centaurs that are potential comets is defined by the distribution of perihelion values for the modeled orbits. It is shown that Centaurs may transit into orbits typical of Jupiter-family comets, and vice versa. Centaurs represent one possible source for replenishment of the JFC population, but other sources are also necessary.
The genome diversity and karyotype evolution of mammals
2011-01-01
The past decade has witnessed an explosion of genome sequencing and mapping in evolutionarily diverse species. While full genome sequencing of mammals is rapidly progressing, assembling and aligning orthologous whole chromosome regions from more than a few species is still not possible. The intense focus on building of comparative maps for companion (dog and cat), laboratory (mice and rat) and agricultural (cattle, pig, and horse) animals has traditionally been used as a means to understand the underlying basis of disease-related or economically important phenotypes. However, these maps also provide an unprecedented opportunity to use multispecies analysis as a tool for inferring karyotype evolution. Comparative chromosome painting and related techniques are now considered to be the most powerful approaches in comparative genome studies. Homologies can be identified with high accuracy using molecularly defined DNA probes for fluorescence in situ hybridization (FISH) on chromosomes of different species. Chromosome painting data are now available for members of nearly all mammalian orders. In most orders, there are species with rates of chromosome evolution that can be considered as 'default' rates. The number of rearrangements that have become fixed in evolutionary history seems comparatively low, bearing in mind the 180 million years of the mammalian radiation. Comparative chromosome maps record the history of karyotype changes that have occurred during evolution. The aim of this review is to provide an overview of these recent advances in our endeavor to decipher the karyotype evolution of mammals by integrating the published results together with some of our latest unpublished results. PMID:21992653
Evolution and classification of the CRISPR-Cas systems
Makarova, Kira S.; Haft, Daniel H.; Barrangou, Rodolphe; Brouns, Stan J. J.; Charpentier, Emmanuelle; Horvath, Philippe; Moineau, Sylvain; Mojica, Francisco J. M.; Wolf, Yuri I.; Yakunin, Alexander F.; van der Oost, John; Koonin, Eugene V.
2012-01-01
The CRISPR–Cas (clustered regularly interspaced short palindromic repeats–CRISPR-associated proteins) modules are adaptive immunity systems that are present in many archaea and bacteria. These defence systems are encoded by operons that have an extraordinarily diverse architecture and a high rate of evolution for both the cas genes and the unique spacer content. Here, we provide an updated analysis of the evolutionary relationships between CRISPR–Cas systems and Cas proteins. Three major types of CRISPR–Cas system are delineated, with a further division into several subtypes and a few chimeric variants. Given the complexity of the genomic architectures and the extremely dynamic evolution of the CRISPR–Cas systems, a unified classification of these systems should be based on multiple criteria. Accordingly, we propose a `polythetic' classification that integrates the phylogenies of the most common cas genes, the sequence and organization of the CRISPR repeats and the architecture of the CRISPR–cas loci. PMID:21552286
NASA Astrophysics Data System (ADS)
Abidi, Oussama; Inoubli, Mohamed Hédi; Sebei, Kawthar; Amiri, Adnen; Boussiga, Haifa; Nasr, Imen Hamdi; Salem, Abdelhamid Ben; Elabed, Mahmoud
2017-05-01
The Maastrichtian-Paleocene El Haria formation was studied and defined in Tunisia on the basis of outcrops and borehole data; few studies have addressed its three-dimensional extent. In this paper, the El Haria formation is reviewed in the context of a tectono-stratigraphic interval using an integrated seismic stratigraphic analysis based on borehole lithology logs, electrical well logging, well shots, vertical seismic profiles and post-stack surface data. Seismic analysis benefits from appropriate calibration with borehole data, conventional interpretation, velocity mapping, seismic attributes and post-stack model-based inversion. The applied methodology proved to be powerful for characterizing the marly Maastrichtian-Paleocene interval of the El Haria formation. Migrated seismic sections together with borehole measurements are used to detail the three-dimensional changes in thickness, facies and depositional environment in the Cap Bon and Gulf of Hammamet regions during the Maastrichtian-Paleocene time. Furthermore, dating based on microfossil content reveals local and multiple internal hiatuses within the El Haria formation which are related to the geodynamic evolution of the depositional floor since the Campanian stage. Interpreted seismic sections display concordance, unconformities, pinchouts, sedimentary gaps, incised valleys and syn-sedimentary normal faulting. Based on the seismic reflection geometry and terminations, seven sequences are delineated. These sequences are related to base-level changes as the combination of depositional floor paleo-topography, tectonic forces, subsidence and the developed accommodation space. These factors controlled the occurrence of the various parts of the Maastrichtian-Paleocene interval. 
Detailed examinations of these deposits together with the analysis of the structural deformation at different time periods allowed us to obtain a better understanding of the sediment architecture in depth and the delineation of the geodynamic evolution of the region.
Interplay between Chaperones and Protein Disorder Promotes the Evolution of Protein Networks
Pechmann, Sebastian; Frydman, Judith
2014-01-01
Evolution is driven by mutations, which lead to new protein functions but come at a cost to protein stability. Non-conservative substitutions are of interest in this regard because they may most profoundly affect both function and stability. Accordingly, organisms must balance the benefit of accepting advantageous substitutions with the possible cost of deleterious effects on protein folding and stability. We here examine factors that systematically promote non-conservative mutations at the proteome level. Intrinsically disordered regions in proteins play pivotal roles in protein interactions, but many questions regarding their evolution remain unanswered. Similarly, whether and how molecular chaperones, which have been shown to buffer destabilizing mutations in individual proteins, generally provide robustness during proteome evolution remains unclear. To this end, we introduce an evolutionary parameter λ that directly estimates the rate of non-conservative substitutions. Our analysis of λ in Escherichia coli, Saccharomyces cerevisiae, and Homo sapiens sequences reveals how co- and post-translationally acting chaperones differentially promote non-conservative substitutions in their substrates, likely through buffering of their destabilizing effects. We further find that λ serves well to quantify the evolution of intrinsically disordered proteins even though the unstructured, thus generally variable regions in proteins are often flanked by very conserved sequences. Crucially, we show that both intrinsically disordered proteins and highly re-wired proteins in protein interaction networks, which have evolved new interactions and functions, exhibit a higher λ at the expense of enhanced chaperone assistance. Our findings thus highlight an intricate interplay of molecular chaperones and protein disorder in the evolvability of protein networks. 
Our results illuminate the role of chaperones in enabling protein evolution, and underline the importance of the cellular context and integrated approaches for understanding proteome evolution. We feel that the development of λ may be a valuable addition to the toolbox applied to understand the molecular basis of evolution. PMID:24968255
Evolution of bird genomes-a transposon's-eye view.
Kapusta, Aurélie; Suh, Alexander
2017-02-01
Birds, the most species-rich monophyletic group of land vertebrates, have been subject to some of the most intense sequencing efforts to date, making them an ideal case study for recent developments in genomics research. Here, we review how our understanding of bird genomes has changed with the recent sequencing of more than 75 species from all major avian taxa. We illuminate avian genome evolution from a previously neglected perspective: their repetitive genomic parasites, transposable elements (TEs) and endogenous viral elements (EVEs). We show that (1) birds are unique among vertebrates in terms of their genome organization; (2) information about the diversity of avian TEs and EVEs is changing rapidly; (3) flying birds have smaller genomes yet more TEs than flightless birds; (4) current second-generation genome assemblies fail to capture the variation in avian chromosome number and genome size determined with cytogenetics; (5) the genomic microcosm of bird-TE "arms races" has yet to be explored; and (6) upcoming third-generation genome assemblies suggest that birds exhibit stability in gene-rich regions and instability in TE-rich regions. We emphasize that integration of cytogenetics and single-molecule technologies with repeat-resolved genome assemblies is essential for understanding the evolution of (bird) genomes. © 2016 New York Academy of Sciences.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Adekunle, S.S.A.; Wyandt, H.; Mark, H.F.L.
1994-09-01
Recently we mapped the telomeric repeat sequences to 111 interstitial sites in the human genome and to sites of gaps and breaks induced by aphidicolin and sister chromatid exchange sites detected by BrdU. Many of these sites correspond to conserved fragile sites in man, gorilla and chimpanzee, to sites of conserved sister chromatid exchange in the mammalian X chromosome, to mutagenic sensitive sites, mapped locations of proto-oncogenes, breakpoints implicated in primate evolution and to breakpoints indicated as the sole anomaly in neoplasia. This observation prompted us to investigate if the interstitial telomeric sites cluster with these sites. An extensive literature search was carried out to find all the available published sites mentioned above. For comparison, we also carried out a statistical analysis of the clustering of the sites of the telomeric repeats with the gene locations where only nucleotide mutations have been observed as the only chromosomal abnormality. Our results indicate that the telomeric repeats cluster most with fragile sites, mutagenic sensitive sites and breakpoints implicated in primate evolution and least with cancer breakpoints, mapped locations of proto-oncogenes and other genes with nucleotide mutations.
The evolution of resistance genes in multi-protein plant resistance systems.
Friedman, Aaron R; Baker, Barbara J
2007-12-01
The genomic perspective aids in integrating the analysis of single resistance (R-) genes into a higher order model of complex plant resistance systems. The majority of R-genes encode a class of proteins with nucleotide binding (NB) and leucine-rich repeat (LRR) domains. Several R-proteins act in multi-protein R-complexes that mediate interaction with pathogen effectors to induce resistance signaling. The complexity of these systems seems to have resulted from multiple rounds of plant-pathogen co-evolution. R-gene evolution is thought to be facilitated by the formation of R-gene clusters, which permit sequence exchanges via recombinatorial mispairing and generate high haplotypic diversity. This pattern of evolution may also generate diversity at other loci that contribute to the R-complex. The rate of recombination at R-clusters is not necessarily homogeneous or consistent over evolutionary time: recent evidence suggests that recombination at R-clusters is increased following pathogen infection, suggesting a mechanism that induces temporary genome instability in response to extreme stress. DNA methylation and chromatin modifications may allow this instability to be conditionally regulated and targeted to specific genome regions. Knowledge of natural R-gene evolution may contribute to strategies for artificial evolution of novel resistance specificities.
Beyond DNA: integrating inclusive inheritance into an extended theory of evolution.
Danchin, Étienne; Charmantier, Anne; Champagne, Frances A; Mesoudi, Alex; Pujol, Benoit; Blanchet, Simon
2011-06-17
Many biologists are calling for an 'extended evolutionary synthesis' that would 'modernize the modern synthesis' of evolution. Biological information is typically considered as being transmitted across generations by the DNA sequence alone, but accumulating evidence indicates that both genetic and non-genetic inheritance, and the interactions between them, have important effects on evolutionary outcomes. We review the evidence for such effects of epigenetic, ecological and cultural inheritance and parental effects, and outline methods that quantify the relative contributions of genetic and non-genetic heritability to the transmission of phenotypic variation across generations. These issues have implications for diverse areas, from the question of missing heritability in human complex-trait genetics to the basis of major evolutionary transitions.
Serohijos, Adrian W R; Shakhnovich, Eugene I
2014-06-01
The variation among sequences and structures in nature is determined both by physical laws and by evolutionary history. However, these two factors are traditionally investigated by disciplines with different emphasis and philosophy: molecular biophysics on the one hand and evolutionary population genetics on the other. Here, we review recent theoretical and computational approaches that address the crucial need to integrate these two disciplines. We first articulate the elements of these approaches. Then, we survey their contribution to our mechanistic understanding of molecular evolution, the polymorphisms in coding regions, the distribution of fitness effects (DFE) of mutations, the observed folding stability of proteins in nature, and the distribution of protein folds in genomes. Copyright © 2014 Elsevier Ltd. All rights reserved.
The Evolution of Bony Vertebrate Enhancers at Odds with Their Coding Sequence Landscape.
Yousaf, Aisha; Sohail Raza, Muhammad; Ali Abbasi, Amir
2015-08-06
Enhancers lie at the heart of transcriptional and developmental gene regulation. Therefore, changes in enhancer sequences usually disrupt the target gene expression and result in disease phenotypes. Despite the well-established role of enhancers in development and disease, evolutionary sequence studies are lacking. The current study attempts to unravel the puzzle of bony vertebrates' conserved noncoding elements (CNE) enhancer evolution. Bayesian phylogenetics of enhancer sequences spotlights promising interordinal relationships among placental mammals, proposing a closer relationship between humans and laurasiatherians while placing rodents at the basal position. Clock-based estimates of enhancer evolution provided a dynamic picture of interspecific rate changes across the bony vertebrate lineage. Moreover, coelacanth in the study augmented our appreciation of the vertebrate cis-regulatory evolution during water-land transition. Intriguingly, we observed a pronounced upsurge in enhancer evolution in land-dwelling vertebrates. These novel findings triggered us to further investigate the evolutionary trend of coding as well as CNE nonenhancer repertoires, to highlight the relative evolutionary dynamics of diverse genomic landscapes. Surprisingly, the evolutionary rates of enhancer sequences were clearly at odds with those of the coding and the CNE nonenhancer sequences during vertebrate adaptation to land, with land vertebrates exhibiting significantly reduced rates of coding sequence evolution in comparison to their fast evolving regulatory landscape. The observed variation in tetrapod cis-regulatory elements caused the fine-tuning of associated gene regulatory networks. Therefore, the increased evolutionary rate of tetrapods' enhancer sequences might be responsible for the variation in developmental regulatory circuits during the process of vertebrate adaptation to land. © The Author(s) 2015. 
Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Sansevere, Emily A; Luo, Xiao; Park, Joo Youn; Yoon, Sunghyun; Seo, Keun Seok; Robinson, D Ashley
2017-04-15
ICE6013 represents one of two families of integrative conjugative elements (ICEs) identified in the pan-genome of the human and animal pathogen Staphylococcus aureus. Here we investigated the excision and conjugation functions of ICE6013 and further characterized the diversity of this element. ICE6013 excision was not significantly affected by growth, temperature, pH, or UV exposure and did not depend on recA. The IS30-like DDE transposase (Tpase; encoded by orf1 and orf2) of ICE6013 must be uninterrupted for excision to occur, whereas disrupting three of the other open reading frames (ORFs) on the element significantly affects the level of excision. We demonstrate that ICE6013 conjugatively transfers to different S. aureus backgrounds at frequencies approaching that of the conjugative plasmid pGO1. We found that excision is required for conjugation, that not all S. aureus backgrounds are successful recipients, and that transconjugants acquire the ability to transfer ICE6013. Sequencing of chromosomal integration sites in serially passaged transconjugants revealed a significant integration site preference for a 15-bp AT-rich palindromic consensus sequence, which surrounds the 3-bp target site that is duplicated upon integration. A sequence analysis of ICE6013 from different host strains of S. aureus and from eight other species of staphylococci identified seven divergent subfamilies of ICE6013 that include sequences previously classified as a transposon, a plasmid, and various ICEs. In summary, these results indicate that the IS30-like Tpase functions as the ICE6013 recombinase and that ICE6013 represents a diverse family of mobile genetic elements that mediate conjugation in staphylococci.
IMPORTANCE Integrative conjugative elements (ICEs) encode the abilities to integrate into and excise from bacterial chromosomes and plasmids and mediate conjugation between bacteria. As agents of horizontal gene transfer, ICEs may affect bacterial evolution. ICE6013 represents one of two known families of ICEs in the pathogen Staphylococcus aureus, but its core functions of excision and conjugation are not well studied. Here, we show that ICE6013 depends on its IS30-like DDE transposase for excision, which is unique among ICEs, and we demonstrate the conjugative transfer and integration site preference of ICE6013. A sequence analysis revealed that ICE6013 has diverged into seven subfamilies that are dispersed among staphylococci. Copyright © 2017 American Society for Microbiology.
Stepanauskas, Ramunas; Fergusson, Elizabeth A; Brown, Joseph; Poulton, Nicole J; Tupper, Ben; Labonté, Jessica M; Becraft, Eric D; Brown, Julia M; Pachiadaki, Maria G; Povilaitis, Tadas; Thompson, Brian P; Mascena, Corianna J; Bellows, Wendy K; Lubys, Arvydas
2017-07-20
Microbial single-cell genomics can be used to provide insights into the metabolic potential, interactions, and evolution of uncultured microorganisms. Here we present WGA-X, a method based on multiple displacement amplification of DNA that utilizes a thermostable mutant of the phi29 polymerase. WGA-X enhances genome recovery from individual microbial cells and viral particles while maintaining ease of use and scalability. The greatest improvements are observed when amplifying high G+C content templates, such as those belonging to the predominant bacteria in agricultural soils. By integrating WGA-X with calibrated index-cell sorting and high-throughput genomic sequencing, we are able to analyze genomic sequences and cell sizes of hundreds of individual, uncultured bacteria, archaea, protists, and viral particles, obtained directly from marine and soil samples, in a single experiment. This approach may find diverse applications in microbiology and in biomedical and forensic studies of humans and other multicellular organisms. Single-cell genomics can be used to study uncultured microorganisms. Here, Stepanauskas et al. present a method combining improved multiple displacement amplification and FACS, to obtain genomic sequences and cell size information from uncultivated microbial cells and viral particles in environmental samples.
Holden, Matthew T. G.; Hauser, Heidi; Sanders, Mandy; Ngo, Thi Hoa; Cherevach, Inna; Cronin, Ann; Goodhead, Ian; Mungall, Karen; Quail, Michael A.; Price, Claire; Rabbinowitsch, Ester; Sharp, Sarah; Croucher, Nicholas J.; Chieu, Tran Bich; Thi Hoang Mai, Nguyen; Diep, To Song; Chinh, Nguyen Tran; Kehoe, Michael; Leigh, James A.; Ward, Philip N.; Dowson, Christopher G.; Whatmore, Adrian M.; Chanter, Neil; Iversen, Pernille; Gottschalk, Marcelo; Slater, Josh D.; Smith, Hilde E.; Spratt, Brian G.; Xu, Jianguo; Ye, Changyun; Bentley, Stephen; Barrell, Barclay G.; Schultsz, Constance; Maskell, Duncan J.; Parkhill, Julian
2009-01-01
Background: Streptococcus suis is a zoonotic pathogen that infects pigs and can occasionally cause serious infections in humans. S. suis infections occur sporadically in humans in Europe and North America, but a recent major outbreak has been described in China with high levels of mortality. The mechanisms of S. suis pathogenesis in humans and pigs are poorly understood. Methodology/Principal Findings: The sequencing of whole genomes of S. suis isolates provides opportunities to investigate the genetic basis of infection. Here we describe whole genome sequences of three S. suis strains from the same lineage: one from European pigs, and two from human cases from China and Vietnam. Comparative genomic analysis was used to investigate the variability of these strains. S. suis is phylogenetically distinct from other Streptococcus species for which genome sequences are currently available. Accordingly, ∼40% of the ∼2 Mb genome is unique in comparison to other Streptococcus species. Finer genomic comparisons within the species showed a high level of sequence conservation; virtually all of the genome is common to the S. suis strains. The only exceptions are three ∼90 kb regions, present in the two isolates from humans, composed of integrative conjugative elements and transposons. Carried in these regions are coding sequences associated with drug resistance. In addition, small-scale sequence variation has generated pseudogenes in putative virulence and colonization factors. Conclusions/Significance: The genomic inventories of genetically related S. suis strains, isolated from distinct hosts and diseases, exhibit high levels of conservation. However, the genomes provide evidence that horizontal gene transfer has contributed to the evolution of drug resistance. PMID:19603075
Chen, Sunlu; Zheng, Huizhen; Kishima, Yuji
2017-06-01
The interplay of different virus species in a host cell after infection can affect the adaptation of each virus. Endogenous viral elements, such as endogenous pararetroviruses (PRVs), have arisen from vertical inheritance of viral sequences integrated into host germline genomes. As viral genomic fossils, these sequences can thus serve as valuable paleogenomic data to study the long-term evolutionary dynamics of virus-virus interactions, but they have rarely been applied for this purpose. All extant PRVs have been considered autonomous species in their parasitic life cycle in host cells. Here, we provide evidence for multiple non-autonomous PRV species with structural defects in viral activity that have frequently infected ancient grass hosts and adapted through interplay between viruses. Our paleogenomic analyses using endogenous PRVs in grass genomes revealed that these non-autonomous PRV species have participated in interplay with autonomous PRVs in a possible commensal partnership, or, alternatively, with one another in a possible mutualistic partnership. These partnerships, which have been established by the sharing of noncoding regulatory sequences (NRSs) in intergenic regions between two partner viruses, have been further maintained and altered by the sequence homogenization of NRSs between partners. Strikingly, we found that frequent region-specific recombination, rather than mutation selection, is the main causative mechanism of NRS homogenization. Our results, obtained from ancient DNA records of viruses, suggest that adaptation of PRVs has occurred by concerted evolution of NRSs between different virus species in the same host. Our findings further imply that evaluation of within-host NRS interactions within and between populations of viral pathogens may be important.
Vasconcelos, Ana Tereza R.; Ferreira, Henrique B.; Bizarro, Cristiano V.; Bonatto, Sandro L.; Carvalho, Marcos O.; Pinto, Paulo M.; Almeida, Darcy F.; Almeida, Luiz G. P.; Almeida, Rosana; Alves-Filho, Leonardo; Assunção, Enedina N.; Azevedo, Vasco A. C.; Bogo, Maurício R.; Brigido, Marcelo M.; Brocchi, Marcelo; Burity, Helio A.; Camargo, Anamaria A.; Camargo, Sandro S.; Carepo, Marta S.; Carraro, Dirce M.; de Mattos Cascardo, Júlio C.; Castro, Luiza A.; Cavalcanti, Gisele; Chemale, Gustavo; Collevatti, Rosane G.; Cunha, Cristina W.; Dallagiovanna, Bruno; Dambrós, Bibiana P.; Dellagostin, Odir A.; Falcão, Clarissa; Fantinatti-Garboggini, Fabiana; Felipe, Maria S. S.; Fiorentin, Laurimar; Franco, Gloria R.; Freitas, Nara S. A.; Frías, Diego; Grangeiro, Thalles B.; Grisard, Edmundo C.; Guimarães, Claudia T.; Hungria, Mariangela; Jardim, Sílvia N.; Krieger, Marco A.; Laurino, Jomar P.; Lima, Lucymara F. A.; Lopes, Maryellen I.; Loreto, Élgion L. S.; Madeira, Humberto M. F.; Manfio, Gilson P.; Maranhão, Andrea Q.; Martinkovics, Christyanne T.; Medeiros, Sílvia R. B.; Moreira, Miguel A. M.; Neiva, Márcia; Ramalho-Neto, Cicero E.; Nicolás, Marisa F.; Oliveira, Sergio C.; Paixão, Roger F. C.; Pedrosa, Fábio O.; Pena, Sérgio D. J.; Pereira, Maristela; Pereira-Ferrari, Lilian; Piffer, Itamar; Pinto, Luciano S.; Potrich, Deise P.; Salim, Anna C. M.; Santos, Fabrício R.; Schmitt, Renata; Schneider, Maria P. C.; Schrank, Augusto; Schrank, Irene S.; Schuck, Adriana F.; Seuanez, Hector N.; Silva, Denise W.; Silva, Rosane; Silva, Sérgio C.; Soares, Célia M. A.; Souza, Kelly R. L.; Souza, Rangel C.; Staats, Charley C.; Steffens, Maria B. R.; Teixeira, Santuza M. R.; Urmenyi, Turan P.; Vainstein, Marilene H.; Zuccherato, Luciana W.; Simpson, Andrew J. G.; Zaha, Arnaldo
2005-01-01
This work reports the results of analyses of three complete mycoplasma genomes, a pathogenic (7448) and a nonpathogenic (J) strain of the swine pathogen Mycoplasma hyopneumoniae and a strain of the avian pathogen Mycoplasma synoviae; the genome sizes of the three strains were 920,079 bp, 897,405 bp, and 799,476 bp, respectively. These genomes were compared with other sequenced mycoplasma genomes reported in the literature to examine several aspects of mycoplasma evolution. Strain-specific regions, including integrative and conjugal elements, and genome rearrangements and alterations in adhesin sequences were observed in the M. hyopneumoniae strains, and all of these were potentially related to pathogenicity. Genomic comparisons revealed that reduction in genome size implied loss of redundant metabolic pathways, with maintenance of alternative routes in different species. Horizontal gene transfer was consistently observed between M. synoviae and Mycoplasma gallisepticum. Our analyses indicated a likely transfer event of hemagglutinin-coding DNA sequences from M. gallisepticum to M. synoviae. PMID:16077101
Centromere Binding and Evolution of Chromosomal Partition Systems in the Burkholderiales
Passot, Fanny M.; Calderon, Virginie; Fichant, Gwennaele; Lane, David; Pasta, Franck
2012-07-01
How split genomes arise and evolve in bacteria is poorly understood. Since each replicon of such genomes encodes a specific partition (Par) system, the evolution of Par systems could shed light on their evolution. The cystic fibrosis pathogen Burkholderia cenocepacia has three chromosomes (c1, c2, and c3) and one plasmid (pBC), whose compatibility depends on strictly specific interactions of the centromere sequences (parS) with their cognate binding proteins (ParB). However, the Par systems of B. cenocepacia c2, c3, and pBC share many features, suggesting that they arose within an extended family. Database searching revealed seven subfamilies of Par systems like those of B. cenocepacia. All are from plasmids and secondary chromosomes of the Burkholderiales, which reinforces the proposal of an extended family. The subfamily of the Par system of B. cenocepacia c3 includes plasmid variants with parS sequences divergent from that of c3. Using electrophoretic mobility shift assay (EMSA), we found that ParB-c3 binds specifically to centromeres of these variants, despite high DNA sequence divergence. We suggest that the Par system of B. cenocepacia c3 has preserved the features of an ancestral system. In contrast, these features have diverged variably in the plasmid descendants. One such descendant is found both in Ralstonia pickettii 12D, on a free plasmid, and in Ralstonia pickettii 12J, on a plasmid integrated into the main chromosome. These observations suggest that we are witnessing a plasmid-chromosome interaction from which a third chromosome will emerge in a two-chromosome species. PMID:22522899
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bohacs, K.M.
1991-02-01
Deep basinal rocks of the Monterey Formation can be allocated to different depositional environments based on an integration of bedding, stacking patterns of facies, lithology, biofacies, and inorganic and organic chemistry. These rocks show evidence of systematic changes in depositional environments that can be related to eustatic sea level changes and basin evolution. Even deep-basinal environments are affected by changing sea level through changes in circulation patterns and intensities, nutrient budgets and dispersal patterns, and location and intensity of the oceanic oxygen minimum. The sequence-stratigraphic framework was constructed based on the physical expression of the outcrop strata and confirmed by tying the outcrop sections to an integrated well-log/seismic grid through outcrop gamma-ray spectral profiles. Interpretation of a sequence boundary was based on increased proportions of hemipelagic facies and evidence of increased bottom-energy levels above the boundary, and local erosion and relief on the surface. The proportion of shallower water and reworked dinoflagellates increased to a local maximum above the boundary. Downlap surfaces exhibited increased proportions of pelagic facies around the surface, a secular change in the dominant lithology across the surface, evidence of decreased bottom-energy levels and terrigenous sedimentation rates, and little or no significant erosion on the surface. The proportion of deeper water dinoflagellates increased to a local maximum at or near the downlap surface; there was no evidence of reworked individuals. The detailed sequence-stratigraphic framework makes it possible to tie rock properties to genetic processes for construction of predictive models.
BGD: a database of bat genomes.
Fang, Jianfei; Wang, Xuan; Mu, Shuo; Zhang, Shuyi; Dong, Dong
2015-01-01
Bats account for ~20% of mammalian species and are the only mammals with true powered flight. Because of their specialized phenotypic traits, much research has been devoted to examining the evolution of bats. Although some whole genome sequences of bats have now been assembled and annotated, a uniform resource for the annotated bat genomes is still unavailable. To make the extensive data associated with the bat genomes accessible to the general biological community, we established a Bat Genome Database (BGD). BGD is an open-access, web-available portal that integrates available data of bat genomes and genes. It hosts data from six bat species, including two megabats and four microbats. Users can query the gene annotations using an efficient search engine, and BGD offers browsable tracks of bat genomes. Furthermore, an easy-to-use phylogenetic analysis tool is provided to facilitate online phylogenetic study of genes. To the best of our knowledge, BGD is the first database of bat genomes. It will extend our understanding of bat evolution and benefit the analysis of bat sequences. BGD is freely available at: http://donglab.ecnu.edu.cn/databases/BatGenome/.
Kim, Seungill; Park, Minkyu; Yeom, Seon-In; Kim, Yong-Min; Lee, Je Min; Lee, Hyun-Ah; Seo, Eunyoung; Choi, Jaeyoung; Cheong, Kyeongchae; Kim, Ki-Tae; Jung, Kyongyong; Lee, Gir-Won; Oh, Sang-Keun; Bae, Chungyun; Kim, Saet-Byul; Lee, Hye-Young; Kim, Shin-Young; Kim, Myung-Shin; Kang, Byoung-Cheorl; Jo, Yeong Deuk; Yang, Hee-Bum; Jeong, Hee-Jin; Kang, Won-Hee; Kwon, Jin-Kyung; Shin, Chanseok; Lim, Jae Yun; Park, June Hyun; Huh, Jin Hoe; Kim, June-Sik; Kim, Byung-Dong; Cohen, Oded; Paran, Ilan; Suh, Mi Chung; Lee, Saet Buyl; Kim, Yeon-Ki; Shin, Younhee; Noh, Seung-Jae; Park, Junhyung; Seo, Young Sam; Kwon, Suk-Yoon; Kim, Hyun A; Park, Jeong Mee; Kim, Hyun-Jin; Choi, Sang-Bong; Bosland, Paul W; Reeves, Gregory; Jo, Sung-Hwan; Lee, Bong-Woo; Cho, Hyung-Taeg; Choi, Hee-Seung; Lee, Min-Soo; Yu, Yeisoo; Do Choi, Yang; Park, Beom-Seok; van Deynze, Allen; Ashrafi, Hamid; Hill, Theresa; Kim, Woo Taek; Pai, Hyun-Sook; Ahn, Hee Kyung; Yeam, Inhwa; Giovannoni, James J; Rose, Jocelyn K C; Sørensen, Iben; Lee, Sang-Jik; Kim, Ryan W; Choi, Ik-Young; Choi, Beom-Soon; Lim, Jong-Sung; Lee, Yong-Hwan; Choi, Doil
2014-03-01
Hot pepper (Capsicum annuum), one of the oldest domesticated crops in the Americas, is the most widely grown spice crop in the world. We report whole-genome sequencing and assembly of the hot pepper (Mexican landrace of Capsicum annuum cv. CM334) at 186.6× coverage. We also report resequencing of two cultivated peppers and de novo sequencing of the wild species Capsicum chinense. The genome size of the hot pepper was approximately fourfold larger than that of its close relative tomato, and the genome showed an accumulation of Gypsy and Caulimoviridae family elements. Integrative genomic and transcriptomic analyses suggested that change in gene expression and neofunctionalization of capsaicin synthase have shaped capsaicinoid biosynthesis. We found differential molecular patterns of ripening regulators and ethylene synthesis in hot pepper and tomato. The reference genome will serve as a platform for improving the nutritional and medicinal values of Capsicum species.
Artificial Intelligence, DNA Mimicry, and Human Health.
Stefano, George B; Kream, Richard M
2017-08-14
The molecular evolution of genomic DNA across diverse plant and animal phyla involved dynamic registrations of sequence modifications to maintain existential homeostasis to increasingly complex patterns of environmental stressors. As an essential corollary, driver effects of positive evolutionary pressure are hypothesized to effect concerted modifications of genomic DNA sequences to meet expanded platforms of regulatory controls for successful implementation of advanced physiological requirements. It is also clearly apparent that preservation of updated registries of advantageous modifications of genomic DNA sequences requires coordinate expansion of convergent cellular proofreading/error correction mechanisms that are encoded by reciprocally modified genomic DNA. Computational expansion of operationally defined DNA memory extends to coordinate modification of coding and previously under-emphasized noncoding regions that now appear to represent essential reservoirs of untapped genetic information amenable to evolutionary driven recruitment into the realm of biologically active domains. Additionally, expansion of DNA memory potential via chemical modification and activation of noncoding sequences is targeted to vertical augmentation and integration of an expanded cadre of transcriptional and epigenetic regulatory factors affecting linear coding of protein amino acid sequences within open reading frames.
Vallée, Geneviève C; Muñoz, Daniella Santos; Sankoff, David
2016-11-11
Of the approximately two hundred sequenced plant genomes, how many and which ones were sequenced motivated by strictly or largely scientific considerations, and how many by chiefly economic, in a wide sense, incentives? And how large a role does publication opportunity play? In an integration of multiple disparate databases and other sources of information, we collect and analyze data on the size (number of species) in the plant orders and families containing sequenced genomes, on the trade value of these species, and of all the same-family or same-order species, and on the publication priority within the family and order. These data are subjected to multiple regression and other statistical analyses. We find that despite the initial importance of model organisms, it is clearly economic considerations that outweigh others in the choice of genome to be sequenced. This has important implications for generalizations about plant genomes, since human choices of plants to harvest (and cultivate) will have incurred many biases with respect to phenotypic characteristics and hence of genomic properties, and recent genomic evolution will also have been affected by human agricultural practices.
Walker, Sara Imari; Grover, Martha A.; Hud, Nicholas V.
2012-01-01
Many models for the origin of life have focused on understanding how evolution can drive the refinement of a preexisting enzyme, such as the evolution of efficient replicase activity. Here we present a model for what was, arguably, an even earlier stage of chemical evolution, when polymer sequence diversity was generated and sustained before, and during, the onset of functional selection. The model includes regular environmental cycles (e.g. hydration-dehydration cycles) that drive polymers between times of replication and functional activity, which coincide with times of different monomer and polymer diffusivity. Template-directed replication of informational polymers, which takes place during the dehydration stage of each cycle, is considered to be sequence-independent. New sequences are generated by spontaneous polymer formation, and all sequences compete for a finite monomer resource that is recycled via reversible polymerization. Kinetic Monte Carlo simulations demonstrate that this proposed prebiotic scenario provides a robust mechanism for the exploration of sequence space. Introduction of a polymer sequence with monomer synthetase activity illustrates that functional sequences can become established in a preexisting pool of otherwise non-functional sequences. Functional selection does not dominate system dynamics and sequence diversity remains high, permitting the emergence and spread of more than one functional sequence. It is also observed that polymers spontaneously form clusters in simulations where polymers diffuse more slowly than monomers, a feature that is reminiscent of a previous proposal that the earliest stages of life could have been defined by the collective evolution of a system-wide cooperation of polymer aggregates. 
Overall, the results presented demonstrate the merits of considering plausible prebiotic polymer chemistries and environments that would have allowed for the rapid turnover of monomer resources and for regularly varying monomer/polymer diffusivities. PMID:22493682
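The competition for a finite, recycled monomer pool described in this abstract can be illustrated with a toy event-selection kinetic Monte Carlo loop. This is only a minimal sketch under stated assumptions, not the authors' model: the function name `run_kmc`, the two-letter monomer alphabet, the fixed polymer length, and all rate constants are hypothetical, and the environmental cycles and diffusion effects central to the paper are left out.

```python
import random

def run_kmc(n_monomers=100, length=4, steps=500,
            k_form=0.01, k_rep=0.5, k_deg=0.2, seed=1):
    """Toy kinetic Monte Carlo: polymers compete for a finite monomer pool.

    All parameters are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    free = {"A": n_monomers // 2, "B": n_monomers - n_monomers // 2}
    polymers = []  # each polymer is a string over the alphabet {A, B}

    def take_monomer():
        # Draw one free monomer with probability proportional to its count.
        n_free = free["A"] + free["B"]
        kind = "A" if rng.random() * n_free < free["A"] else "B"
        free[kind] -= 1
        return kind

    for _ in range(steps):
        n_free = free["A"] + free["B"]
        # Templates that the current free pool can afford to copy.
        can_copy = [p for p in polymers
                    if free["A"] >= p.count("A") and free["B"] >= p.count("B")]
        a_form = k_form * n_free if n_free >= length else 0.0
        a_rep = k_rep * len(can_copy)   # sequence-independent replication
        a_deg = k_deg * len(polymers)   # degradation recycles monomers
        total = a_form + a_rep + a_deg
        if total == 0.0:
            break
        r = rng.random() * total        # choose an event by its propensity
        if r < a_form:
            # Spontaneous formation of a new random sequence.
            polymers.append("".join(take_monomer() for _ in range(length)))
        elif r < a_form + a_rep:
            # Template-directed copy: consume the matching monomers.
            template = rng.choice(can_copy)
            for kind in template:
                free[kind] -= 1
            polymers.append(template)
        else:
            # Reversible polymerization: return monomers to the pool.
            victim = polymers.pop(rng.randrange(len(polymers)))
            for kind in victim:
                free[kind] += 1
    return free, polymers
```

Calling `run_kmc()` returns the final free-monomer counts and the list of polymer sequences. Because every event only moves monomers between the pool and the polymers, the total monomer count is conserved, which gives a quick sanity check on the bookkeeping.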
Integrative workflows for metagenomic analysis
Ladoukakis, Efthymios; Kolisis, Fragiskos N.; Chatziioannou, Aristotelis A.
2014-01-01
The rapid evolution of sequencing technologies, collectively described by the term Next Generation Sequencing (NGS), has revolutionized metagenomic analysis. These technologies combine high-throughput analytical protocols with delicate measuring techniques in order to discover, properly assemble, and map allelic sequences to the correct genomes, achieving particularly high yields for only a fraction of the cost of traditional processes (i.e., Sanger). From a bioinformatic perspective, this boils down to many GB of data being generated from each single sequencing experiment, rendering the management, or even the storage, of the data a critical bottleneck for the overall analytical endeavor. The complexity is further aggravated by the versatility of the processing steps available, represented by the numerous bioinformatic tools that are essential for each analytical task in order to fully unveil the genetic content of a metagenomic dataset. These disparate tasks range from simple, yet non-trivial, quality control of raw data to exceptionally complex protein annotation procedures, which require a high level of expertise for their proper application or for the neat implementation of the whole workflow. Furthermore, a bioinformatic analysis of such scale requires substantial computational resources, making the use of cloud computing infrastructures the sole realistic solution. In this review article we discuss the different integrative bioinformatic solutions available that address the aforementioned issues, by critically assessing the available automated pipelines for data management, quality control, and annotation of metagenomic data, covering various major sequencing technologies and applications. PMID:25478562
NGSPanPipe: A Pipeline for Pan-genome Identification in Microbial Strains from Experimental Reads.
Kulsum, Umay; Kapil, Arti; Singh, Harpreet; Kaur, Punit
2018-01-01
Recent advancements in sequencing technologies have decreased both the time and the cost of sequencing a whole bacterial genome. High-throughput Next-Generation Sequencing (NGS) technology has led to the generation of enormous amounts of data concerning microbial populations, publicly available across various repositories. As a consequence, it has become possible to study and compare the genomes of different bacterial strains within a species or genus in terms of evolution, ecology and diversity. Studying the pan-genome provides insights into microevolution, global composition and diversity in virulence and pathogenesis of a species. It can also assist in identifying drug targets and proposing vaccine candidates. The effective analysis of these large genome datasets necessitates the development of robust tools. Current methods for constructing a pan-genome do not support direct input of raw reads from the sequencer but require the reads to be preprocessed into an assembled protein/gene sequence file or a binary matrix of orthologous genes/proteins. We have designed an easy-to-use integrated pipeline, NGSPanPipe, which can directly identify the pan-genome from short reads. The output from the pipeline is compatible with other pan-genome analysis tools. We evaluated our pipeline against other methods for constructing a pan-genome, i.e. reference-based assembly and de novo assembly, using simulated reads of Mycobacterium tuberculosis. The single-script pipeline (pipeline.pl) is applicable to all bacterial strains. It integrates multiple in-house Perl scripts and is freely accessible from https://github.com/Biomedinformatics/NGSPanPipe .
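The binary presence/absence matrix of orthologous genes mentioned in this abstract is the usual starting point for splitting a pan-genome into core, accessory, and strain-specific genes. The sketch below illustrates that partition only; it is not NGSPanPipe's code or output format, and the function name, strain labels, and gene identifiers are hypothetical.

```python
from typing import Dict, Set, Tuple

def partition_pan_genome(
    presence: Dict[str, Set[str]]
) -> Tuple[Set[str], Set[str], Set[str], Set[str]]:
    """Split gene families into pan, core, accessory, and unique sets.

    `presence` maps each strain name to the set of gene families found in it,
    which is equivalent to one row of a binary presence/absence matrix.
    """
    if not presence:
        return set(), set(), set(), set()
    gene_sets = list(presence.values())
    pan = set().union(*gene_sets)            # present in at least one strain
    core = set.intersection(*gene_sets)      # present in every strain
    accessory = pan - core                   # missing from at least one strain
    unique = {g for g in accessory           # present in exactly one strain
              if sum(g in genes for genes in gene_sets) == 1}
    return pan, core, accessory, unique

# Hypothetical example with three strains and five gene families.
example = {
    "strainA": {"g1", "g2", "g3"},
    "strainB": {"g1", "g2", "g4"},
    "strainC": {"g1", "g3", "g5"},
}
pan, core, accessory, unique = partition_pan_genome(example)
```

In this example only `g1` is shared by all three strains, so it forms the core, while `g4` and `g5` each occur in a single strain.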
Genomic investigations of evolutionary dynamics and epistasis in microbial evolution experiments.
Jerison, Elizabeth R; Desai, Michael M
2015-12-01
Microbial evolution experiments enable us to watch adaptation in real time, and to quantify the repeatability and predictability of evolution by comparing identical replicate populations. Further, we can resurrect ancestral types to examine changes over evolutionary time. Until recently, experimental evolution has been limited to measuring phenotypic changes, or to tracking a few genetic markers over time. However, recent advances in sequencing technology now make it possible to extensively sequence clones or whole-population samples from microbial evolution experiments. Here, we review recent work exploiting these techniques to understand the genomic basis of evolutionary change in experimental systems. We first focus on studies that analyze the dynamics of genome evolution in microbial systems. We then survey work that uses observations of sequence evolution to infer aspects of the underlying fitness landscape, concentrating on the epistatic interactions between mutations and the constraints these interactions impose on adaptation. Copyright © 2015 Elsevier Ltd. All rights reserved.
Position specific variation in the rate of evolution in transcription factor binding sites
Moses, Alan M; Chiang, Derek Y; Kellis, Manolis; Lander, Eric S; Eisen, Michael B
2003-01-01
Background The binding sites of sequence specific transcription factors are an important and relatively well-understood class of functional non-coding DNAs. Although a wide variety of experimental and computational methods have been developed to characterize transcription factor binding sites, they remain difficult to identify. Comparison of non-coding DNA from related species has shown considerable promise in identifying these functional non-coding sequences, even though relatively little is known about their evolution. Results Here we analyse the genome sequences of the budding yeasts Saccharomyces cerevisiae, S. bayanus, S. paradoxus and S. mikatae to study the evolution of transcription factor binding sites. As expected, we find that both experimentally characterized and computationally predicted binding sites evolve slower than surrounding sequence, consistent with the hypothesis that they are under purifying selection. We also observe position-specific variation in the rate of evolution within binding sites. We find that the position-specific rate of evolution is positively correlated with degeneracy among binding sites within S. cerevisiae. We test theoretical predictions for the rate of evolution at positions where the base frequencies deviate from background due to purifying selection and find reasonable agreement with the observed rates of evolution. Finally, we show how the evolutionary characteristics of real binding motifs can be used to distinguish them from artefacts of computational motif finding algorithms. Conclusion As has been observed for protein sequences, the rate of evolution in transcription factor binding sites varies with position, suggesting that some regions are under stronger functional constraint than others. This variation likely reflects the varying importance of different positions in the formation of the protein-DNA complex. 
The characterization of the pattern of evolution in known binding sites will likely contribute to the effective use of comparative sequence data in the identification of transcription factor binding sites and is an important step toward understanding the evolution of functional non-coding DNA. PMID:12946282
Genome-wide signatures of convergent evolution in echolocating mammals
Parker, Joe; Tsagkogeorga, Georgia; Cotton, James A.; Liu, Yuan; Provero, Paolo; Stupka, Elia; Rossiter, Stephen J.
2013-01-01
Evolution is typically thought to proceed through divergence of genes, proteins, and ultimately phenotypes. However, similar traits might also evolve convergently in unrelated taxa due to similar selection pressures. Adaptive phenotypic convergence is widespread in nature, and recent results from a handful of genes have suggested that this phenomenon is powerful enough to also drive recurrent evolution at the sequence level. Where homoplasious substitutions do occur these have long been considered the result of neutral processes. However, recent studies have demonstrated that adaptive convergent sequence evolution can be detected in vertebrates using statistical methods that model parallel evolution, although the extent to which sequence convergence between genera occurs across genomes is unknown. Here we analyse genomic sequence data in mammals that have independently evolved echolocation and show for the first time that convergence is not a rare process restricted to a handful of loci but is instead widespread, continuously distributed and commonly driven by natural selection acting on a small number of sites per locus. Systematic analyses of convergent sequence evolution in 805,053 amino acids within 2,326 orthologous coding gene sequences compared across 22 mammals (including four new bat genomes) revealed signatures consistent with convergence in nearly 200 loci. Strong and significant support for convergence among bats and the dolphin was seen in numerous genes linked to hearing or deafness, consistent with an involvement in echolocation. Surprisingly we also found convergence in many genes linked to vision: the convergent signal of many sensory genes was robustly correlated with the strength of natural selection. This first attempt to detect genome-wide convergent sequence evolution across divergent taxa reveals the phenomenon to be much more pervasive than previously recognised. PMID:24005325
Rotational evolution of slow-rotator sequence stars
NASA Astrophysics Data System (ADS)
Lanzafame, A. C.; Spada, F.
2015-12-01
Context. The observed relationship between mass, age and rotation in open clusters shows the progressive development of a slow-rotator sequence among stars possessing a radiative interior and a convective envelope during their pre-main sequence and main-sequence evolution. After 0.6 Gyr, most cluster members of this type have settled on this sequence. Aims: The observed clustering on this sequence suggests that it corresponds to some equilibrium or asymptotic condition that still lacks a complete theoretical interpretation, and which is crucial to our understanding of the stellar angular momentum evolution. Methods: We couple a rotational evolution model, which takes internal differential rotation into account, with classical and new proposals for the wind braking law, and fit models to the data using a Monte Carlo Markov chain (MCMC) method tailored to the problem at hand. We explore to what extent these models are able to reproduce the mass and time dependence of the stellar rotational evolution on the slow-rotator sequence. Results: The description of the evolution of the slow-rotator sequence requires taking the transfer of angular momentum from the radiative core to the convective envelope into account. We find that, in the mass range 0.85-1.10 M⊙, the core-envelope coupling timescale for stars in the slow-rotator sequence scales as M^(-7.28). Quasi-solid body rotation is achieved only after 1-2 Gyr, depending on stellar mass, which implies that observing small deviations from the Skumanich law (P ∝ √t) would require period data of older open clusters than is available to date. The observed evolution in the 0.1-2.5 Gyr age range and in the 0.85-1.10 M⊙ mass range is best reproduced by assuming an empirical mass dependence of the wind angular momentum loss proportional to the convective turnover timescale and to the stellar moment of inertia. 
Period isochrones based on our MCMC fit provide a tool for inferring stellar ages of solar-like main-sequence stars from their mass and rotation period that is largely independent of the wind braking model adopted. These effectively represent gyro-chronology relationships that take the physics of the two-zone model for the stellar angular momentum evolution into account.
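The two scaling relations quoted in this abstract, the Skumanich law (P ∝ √t) and the core-envelope coupling timescale scaling as M^(-7.28), can be sketched numerically. The following is a minimal illustration only, not the authors' MCMC model: the function names, the reference period and the reference age are hypothetical, and the mass scaling is returned only as a ratio because the abstract gives no normalisation.

```python
import math


def skumanich_period(age_gyr, ref_period_days, ref_age_gyr):
    """Rotation period under the Skumanich law P ∝ sqrt(t).

    The period is scaled from a reference point (ref_period_days at
    ref_age_gyr); both reference values are illustrative assumptions.
    """
    if age_gyr <= 0 or ref_age_gyr <= 0:
        raise ValueError("ages must be positive")
    return ref_period_days * math.sqrt(age_gyr / ref_age_gyr)


def coupling_timescale_ratio(mass_msun, ref_mass_msun=1.0, exponent=-7.28):
    """Ratio tau(M) / tau(M_ref) for the power law tau ∝ M^(-7.28).

    The abstract reports this scaling only for 0.85-1.10 solar masses,
    so values outside that range are an extrapolation.
    """
    if mass_msun <= 0 or ref_mass_msun <= 0:
        raise ValueError("masses must be positive")
    return (mass_msun / ref_mass_msun) ** exponent


if __name__ == "__main__":
    # A star spinning at 10 d at 1 Gyr would spin at 20 d at 4 Gyr.
    print(skumanich_period(4.0, ref_period_days=10.0, ref_age_gyr=1.0))
    # Lower-mass stars have longer coupling timescales than a 1 M_sun star.
    print(coupling_timescale_ratio(0.85))
```

The steep exponent means that, within the quoted mass range, a modest decrease in mass lengthens the coupling timescale severalfold, which is consistent with the statement that quasi-solid body rotation is reached at a mass-dependent age.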
Aslam, Luqman; Beal, Kathryn; Ann Blomberg, Le; Bouffard, Pascal; Burt, David W.; Crasta, Oswald; Crooijmans, Richard P. M. A.; Cooper, Kristal; Coulombe, Roger A.; De, Supriyo; Delany, Mary E.; Dodgson, Jerry B.; Dong, Jennifer J.; Evans, Clive; Frederickson, Karin M.; Flicek, Paul; Florea, Liliana; Folkerts, Otto; Groenen, Martien A. M.; Harkins, Tim T.; Herrero, Javier; Hoffmann, Steve; Megens, Hendrik-Jan; Jiang, Andrew; de Jong, Pieter; Kaiser, Pete; Kim, Heebal; Kim, Kyu-Won; Kim, Sungwon; Langenberger, David; Lee, Mi-Kyung; Lee, Taeheon; Mane, Shrinivasrao; Marcais, Guillaume; Marz, Manja; McElroy, Audrey P.; Modise, Thero; Nefedov, Mikhail; Notredame, Cédric; Paton, Ian R.; Payne, William S.; Pertea, Geo; Prickett, Dennis; Puiu, Daniela; Qioa, Dan; Raineri, Emanuele; Ruffier, Magali; Salzberg, Steven L.; Schatz, Michael C.; Scheuring, Chantel; Schmidt, Carl J.; Schroeder, Steven; Searle, Stephen M. J.; Smith, Edward J.; Smith, Jacqueline; Sonstegard, Tad S.; Stadler, Peter F.; Tafer, Hakim; Tu, Zhijian (Jake); Van Tassell, Curtis P.; Vilella, Albert J.; Williams, Kelly P.; Yorke, James A.; Zhang, Liqing; Zhang, Hong-Bin; Zhang, Xiaojun; Zhang, Yang; Reed, Kent M.
2010-01-01
A synergistic combination of two next-generation sequencing platforms with a detailed comparative BAC physical contig map provided a cost-effective assembly of the genome sequence of the domestic turkey (Meleagris gallopavo). Heterozygosity of the sequenced source genome allowed discovery of more than 600,000 high quality single nucleotide variants. Despite this heterozygosity, the current genome assembly (∼1.1 Gb) includes 917 Mb of sequence assigned to specific turkey chromosomes. Annotation identified nearly 16,000 genes, with 15,093 recognized as protein coding and 611 as non-coding RNA genes. Comparative analysis of the turkey, chicken, and zebra finch genomes, and comparing avian to mammalian species, supports the characteristic stability of avian genomes and identifies genes unique to the avian lineage. Clear differences are seen in number and variety of genes of the avian immune system where expansions and novel genes are less frequent than examples of gene loss. The turkey genome sequence provides resources to further understand the evolution of vertebrate genomes and genetic variation underlying economically important quantitative traits in poultry. This integrated approach may be a model for providing both gene and chromosome level assemblies of other species with agricultural, ecological, and evolutionary interest. PMID:20838655
DNA and RNA editing of retrotransposons accelerate mammalian genome evolution.
Knisbacher, Binyamin A; Levanon, Erez Y
2015-04-01
Genome evolution is commonly viewed as a gradual process that is driven by random mutations that accumulate over time. However, DNA- and RNA-editing enzymes have been identified that can accelerate evolution by actively modifying the genomically encoded information. The apolipoprotein B mRNA editing enzymes, catalytic polypeptide-like (APOBECs) are potent restriction factors that can inhibit retroelements by cytosine-to-uridine editing of retroelement DNA after reverse transcription. In some cases, a retroelement may successfully integrate into the genome despite being hypermutated. Such events introduce unique sequences into the genome and are thus a source of genomic innovation. Adenosine deaminases that act on RNA (ADARs) catalyze adenosine-to-inosine editing in double-stranded RNA, commonly formed by oppositely oriented retroelements. The RNA editing confers plasticity to the transcriptome by generating many transcript variants from a single genomic locus. If the editing produces a beneficial variant, the genome may maintain the locus that produces the RNA-edited transcript for its novel function. Here, we discuss how these two powerful editing mechanisms, which both target inserted retroelements, facilitate expedited genome evolution. © 2015 New York Academy of Sciences.
NASA Astrophysics Data System (ADS)
Gallet, F.; Bolmont, E.; Mathis, S.; Charbonnel, C.; Amard, L.
2017-08-01
Context. Star-planet interactions must be taken into account in stellar models to understand the dynamical evolution of close-in planets. The dependence of the tidal interactions on the structural and rotational evolution of the star is of particular importance and should be correctly treated. Aims: We quantify how tidal dissipation in the convective envelope of rotating low-mass stars evolves from the pre-main sequence up to the red-giant branch depending on the initial stellar mass. We investigate the consequences of this evolution on planetary orbital evolution. Methods: We couple the tidal dissipation formalism previously described to the stellar evolution code STAREVOL and apply this coupling to rotating stars with masses between 0.3 and 1.4 M⊙. As a first step, this formalism assumes a simplified bi-layer stellar structure with corresponding averaged densities for the radiative core and the convective envelope. We use a frequency-averaged treatment of the dissipation of tidal inertial waves in the convection zone (but neglect the dissipation of tidal gravity waves in the radiation zone). In addition, we generalize a recent work by following the orbital evolution of close-in planets using the new tidal dissipation predictions for advanced phases of stellar evolution. Results: On the pre-main sequence the evolution of tidal dissipation is controlled by the evolution of the internal structure of the contracting star. On the main sequence it is strongly driven by the variation of surface rotation that is impacted by magnetized stellar winds braking. The main effect of taking into account the rotational evolution of the stars is to lower the tidal dissipation strength by about four orders of magnitude on the main sequence, compared to a normalized dissipation rate that only takes into account structural changes. Conclusions: The evolution of the dissipation strongly depends on the evolution of the internal structure and rotation of the star. 
From the pre-main sequence up to the tip of the red-giant branch, it varies by several orders of magnitude, with strong consequences for the orbital evolution of close-in massive planets. These effects are the strongest during the pre-main sequence, implying that the planets are mainly sensitive to the star's early history.
Extrachromosomal oncogene amplification drives tumor evolution and genetic heterogeneity
Turner, Kristen M.; Deshpande, Viraj; Beyter, Doruk; Koga, Tomoyuki; Rusert, Jessica; Lee, Catherine; Li, Bin; Arden, Karen; Ren, Bing; Nathanson, David A.; Kornblum, Harley I.; Taylor, Michael D.; Kaushal, Sharmeela; Cavenee, Webster K.; Wechsler-Reya, Robert; Furnari, Frank B.; Vandenberg, Scott R.; Rao, P. Nagesh; Wahl, Geoffrey M.; Bafna, Vineet; Mischel, Paul S.
2017-01-01
Human cells have twenty-three pairs of chromosomes but in cancer, genes can be amplified in chromosomes or in circular extrachromosomal DNA (ECDNA), whose frequency and functional significance are not understood. We performed whole genome sequencing, structural modeling and cytogenetic analyses of 17 different cancer types, including 2572 metaphases, and developed ECdetect to conduct unbiased integrated ECDNA detection and analysis. ECDNA was found in nearly half of human cancers varying by tumor type, but almost never in normal cells. Driver oncogenes were amplified most commonly on ECDNA, elevating transcript level. Mathematical modeling predicted that ECDNA amplification elevates oncogene copy number and increases intratumoral heterogeneity more effectively than chromosomal amplification, which we validated by quantitative analyses of cancer samples. These results suggest that ECDNA contributes to accelerated evolution in cancer. PMID:28178237
Chiba, Satoshi
1999-04-01
Mandarina, a land snail genus endemic to the oceanic Bonin (Ogasawara) Islands, shows exceptionally rapid evolution not only of morphological and ecological traits, but also of DNA sequence. A phylogenetic relationship based on mitochondrial DNA (mtDNA) sequences suggests that morphological differences equivalent to the differences between families were produced between Mandarina and its ancestor during the Pleistocene. The inferred phylogeny shows that species with similar morphologies and life habitats appeared repeatedly and independently in different lineages and islands at different times. Sequential adaptive radiations occurred in different islands of the Bonin Islands, and species occupying arboreal, semiarboreal, and terrestrial habitats arose independently in each island. Because of a close relationship between shell morphology and life habitat, independent evolution of the same life habitat in different islands created species possessing the same shell morphology in different islands and lineages. This rapid evolution produced some incongruences between phylogenetic relationship and species taxonomy. Levels of sequence divergence of mtDNA among the species of Mandarina are extremely high. The maximum levels of sequence divergence in the 16S and 12S ribosomal RNA sequences within Mandarina are 18.7% and 17.7%, respectively, and this suggests that evolution of mtDNA of Mandarina is extremely rapid, more than 20 times faster than the standard rate in other animals. The present examination reveals that evolution of morphological and ecological traits occurs at extremely high rates in the time of adaptive radiation, especially in fragmented environments. © 1999 The Society for the Study of Evolution.
Rapid rate of control-region evolution in Pacific butterflyfishes (Chaetodontidae).
McMillan, W O; Palumbi, S R
1997-11-01
Sequence differences in the tRNA-proline (tRNApro) end of the mitochondrial control-region of three species of Pacific butterflyfishes accumulated 33-43 times more rapidly than did changes within the mitochondrial cytochrome b gene (cytb). Rapid evolution in this region was accompanied by strong transition/transversion bias and large variation in the probability of a DNA substitution among sites. These substitution constraints placed an absolute ceiling on the magnitude of sequence divergence that could be detected between individuals. This divergence "ceiling" was reached rapidly and led to a decay in the relative rate of control-region/cytb evolution. A high rate of evolution in this section of the control-region of butterflyfishes stands in marked contrast to the patterns reported in some other fish lineages. Although the mechanism underlying rate variation remains unclear, all taxa with rapid evolution in the 5'-end of the control-region showed extreme transition biases. By contrast, in taxa with slower control-region evolution, transitions accumulated at nearly the same rate as transversions. More information is needed to understand the relationship between nucleotide bias and the rate of evolution in the 5'-end of the control-region. Despite strong constraints on sequence change, phylogenetic information was preserved in the group of recently differentiated species and supported the clustering of sequences into three major mtDNA groupings. Within these groups, very similar control-region sequences were widely distributed across the Pacific Ocean and were shared between recognized species, indicating a lack of mitochondrial sequence monophyly among species.
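The transition/transversion bias discussed in this abstract can be illustrated with a simple count over a pair of aligned sequences. This is a minimal sketch under stated assumptions, not the authors' analysis: the function name and the example sequences are hypothetical, sites containing gaps or ambiguity codes are simply skipped, and no correction for multiple substitutions at a site is applied, which is exactly the saturation effect the abstract describes as a divergence "ceiling".

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_ts_tv(seq_a, seq_b):
    """Count transitions and transversions between two aligned DNA sequences.

    A transition swaps a purine for a purine (A<->G) or a pyrimidine for a
    pyrimidine (C<->T); every other mismatch is a transversion. Sites with
    gaps or non-ACGT characters are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = 0
    transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions


if __name__ == "__main__":
    # Illustrative sequences only; not data from the study.
    ts, tv = count_ts_tv("AACCGT", "GATCTT")
    print(ts, tv)
```

A high ratio of the first count to the second across many pairwise comparisons would correspond to the "extreme transition bias" reported for the rapidly evolving lineages.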
Llopart, Ana
2018-05-01
The hemizygosity of the X (Z) chromosome fully exposes the fitness effects of mutations on that chromosome and has evolutionary consequences on the relative rates of evolution of X and autosomes. Specifically, several population genetics models predict increased rates of evolution in X-linked loci relative to autosomal loci. This prediction of faster-X evolution has been evaluated and confirmed for both protein coding sequences and gene expression. In the case of faster-X evolution for gene expression divergence, it is often assumed that variation in 5' noncoding sequences is associated with variation in transcript abundance between species but a formal, genomewide test of this hypothesis is still missing. Here, I use whole genome sequence data in Drosophila yakuba and D. santomea to evaluate this hypothesis and report positive correlations between sequence divergence at 5' noncoding sequences and gene expression divergence. I also examine polymorphism and divergence in 9,279 noncoding sequences located at the 5' end of annotated genes and detected multiple signals of positive selection. Notably, I used the traditional synonymous sites as neutral reference to test for adaptive evolution, but I also used bases 8-30 of introns <65 bp, which have been proposed to be a better neutral choice. X-linked genes with high degree of male-biased expression show the most extreme adaptive pattern at 5' noncoding regions, in agreement with faster-X evolution for gene expression divergence and a higher incidence of positively selected recessive mutations. © 2018 The Authors. Molecular Ecology Published by John Wiley & Sons Ltd.
Zhu, Yuan O; Aw, Pauline P K; de Sessions, Paola Florez; Hong, Shuzhen; See, Lee Xian; Hong, Lewis Z; Wilm, Andreas; Li, Chen Hao; Hue, Stephane; Lim, Seng Gee; Nagarajan, Niranjan; Burkholder, William F; Hibberd, Martin
2017-10-27
Viral populations are complex, dynamic, and fast evolving. The evolution of groups of closely related viruses in a competitive environment is termed quasispecies. To fully understand the role that quasispecies play in viral evolution, characterizing the trajectories of viral genotypes in an evolving population is the key. In particular, long-range haplotype information for thousands of individual viruses is critical; yet generating this information is non-trivial. Popular deep sequencing methods generate relatively short reads that do not preserve linkage information, while third generation sequencing methods have higher error rates that make detection of low frequency mutations a bioinformatics challenge. Here we applied BAsE-Seq, an Illumina-based single-virion sequencing technology, to eight samples from four chronic hepatitis B (CHB) patients - once before antiviral treatment and once after viral rebound due to resistance. With single-virion sequencing, we obtained 248-8796 single-virion sequences per sample, which allowed us to find evidence for both hard and soft selective sweeps. We were able to reconstruct population demographic history that was independently verified by clinically collected data. We further verified four of the samples independently through PacBio SMRT and Illumina Pooled deep sequencing. Overall, we showed that single-virion sequencing yields insight into viral evolution and population dynamics in an efficient and high throughput manner. We believe that single-virion sequencing is widely applicable to the study of viral evolution in the context of drug resistance and host adaptation, allows differentiation between soft or hard selective sweeps, and may be useful in the reconstruction of intra-host viral population demographic history.
Valenzuela, Nicole
2009-07-01
Painted turtles (Chrysemys picta) are representatives of a vertebrate clade whose biology and phylogenetic position hold a key to our understanding of fundamental aspects of vertebrate evolution. These features make them an ideal emerging model system. Extensive ecological and physiological research provide the context in which to place new research advances in evolutionary genetics, genomics, evolutionary developmental biology, and ecological developmental biology which are enabled by current resources, such as a bacterial artificial chromosome (BAC) library of C. picta, and the imminent development of additional ones such as genome sequences and cDNA and expressed sequence tag (EST) libraries. This integrative approach will allow the research community to continue making advances to provide functional and evolutionary explanations for the lability of biological traits found not only among reptiles but vertebrates in general. Moreover, because humans and reptiles share a common ancestor, and given the ease of using nonplacental vertebrates in experimental biology compared with mammalian embryos, painted turtles are also an emerging model system for biomedical research. For example, painted turtles have been studied to understand many biological responses to overwintering and anoxia, as potential sentinels for environmental xenobiotics, and as a model to decipher the ecology and evolution of sexual development and reproduction. Thus, painted turtles are an excellent reptilian model system for studies with human health, environmental, ecological, and evolutionary significance.
Wallis, Michael
2008-01-15
Mammalian growth hormone (GH) sequences have been shown previously to display episodic evolution: the sequence is generally strongly conserved but on at least two occasions during mammalian evolution (on lineages leading to higher primates and ruminants) bursts of rapid evolution occurred. However, the number of mammalian orders studied previously has been relatively limited, and the availability of sequence data via mammalian genome projects provides the potential for extending the range of GH gene sequences examined. Complete or nearly complete GH gene sequences for six mammalian species for which no data were previously available have been extracted from the genome databases: Dasypus novemcinctus (nine-banded armadillo), Erinaceus europaeus (western European hedgehog), Myotis lucifugus (little brown bat), Procavia capensis (cape rock hyrax), Sorex araneus (European shrew) and Spermophilus tridecemlineatus (13-lined ground squirrel). In addition, incomplete data for several other species have been extended. Examination of the data in detail and comparison with previously available sequences has allowed assessment of the reliability of deduced sequences. Several of the new sequences differ substantially from the consensus sequence previously determined for eutherian GHs, indicating greater variability than previously recognised, and confirming the episodic pattern of evolution. The episodic pattern is not seen for signal sequences, 5' upstream sequence or synonymous substitutions; it is specific to the mature protein sequence, suggesting that it relates to the hormonal function. The substitutions accumulated during the course of GH evolution have occurred mainly on the side of the hormone facing away from the receptor, in a non-random fashion, and it is suggested that this may reflect interaction of the receptor-bound hormone with other proteins or small ligands.
A new molecular evolution model for limited insertion independent of substitution.
Lèbre, Sophie; Michel, Christian J
2013-10-01
We recently introduced a new molecular evolution model called the IDIS model for Insertion Deletion Independent of Substitution [13,14]. In the IDIS model, the three independent processes of substitution, insertion and deletion of residues have constant rates. In order to control the genome expansion during evolution, we generalize here the IDIS model by introducing an insertion rate which decreases when the sequence grows and tends to 0 for a maximum sequence length n_max. This new model, called LIIS for Limited Insertion Independent of Substitution, defines a matrix differential equation satisfied by a vector P(t) describing the sequence content in each residue at evolution time t. An analytical solution is obtained for any diagonalizable substitution matrix M. Thus, the LIIS model gives an expression of the sequence content vector P(t) in each residue under evolution time t as a function of the eigenvalues and the eigenvectors of matrix M, the residue insertion rate vector R, the total insertion rate r, the initial and maximum sequence lengths n_0 and n_max, respectively, and the sequence content vector P(t_0) at initial time t_0. The derivation of the analytical solution is much more technical, compared to the IDIS model, as it involves Gauss hypergeometric functions. Several propositions of the LIIS model are derived: proof that the IDIS model is a particular case of the LIIS model when the maximum sequence length n_max tends to infinity, fixed point, time scale, time step and time inversion. Using a relation between the sequence length l and the evolution time t, an expression of the LIIS model as a function of the sequence length l=n(t) is obtained. Formulas for 'insertion only', i.e. when the substitution rates are all equal to 0, are derived at evolution time t and sequence length l. 
Analytical solutions of the LIIS model are explicitly derived, as a function of either evolution time t or sequence length l, for two classical substitution matrices: the 3-parameter symmetric substitution matrix [12] (LIIS-SYM3) and the HKY asymmetric substitution matrix [9] (LIIS-HKY). An evaluation of the LIIS model (precisely, LIIS-HKY) based on four statistical analyses of the GC content in complete genomes of four prokaryotic taxonomic groups, namely Chlamydiae, Crenarchaeota, Spirochaetes and Thermotogae, shows the expected improvement from the theory of the LIIS model compared to the IDIS model. Copyright © 2013 Elsevier Inc. All rights reserved.
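The idea of an insertion rate that decreases as the sequence grows and vanishes at a maximum length can be sketched for the length alone. The abstract does not state the functional form of the rate, so the linear form r(1 - n/n_max) used below is purely an assumption for illustration, as are the function name and parameter values; it is not the published LIIS derivation, which also tracks the residue content vector P(t). Under this assumed form, dn/dt = r(1 - n/n_max) with n(t_0) = n_0 has the closed-form solution n(t) = n_max - (n_max - n_0) exp(-r (t - t_0) / n_max).

```python
import math


def sequence_length(t, n0, n_max, r, t0=0.0):
    """Sequence length n(t) under an ASSUMED insertion rate r * (1 - n / n_max).

    Solves dn/dt = r * (1 - n / n_max) with n(t0) = n0. The linear decrease
    of the rate with length is an illustrative assumption, not taken from
    the abstract.
    """
    if n_max <= 0 or not (0 <= n0 <= n_max):
        raise ValueError("require 0 <= n0 <= n_max and n_max > 0")
    return n_max - (n_max - n0) * math.exp(-r * (t - t0) / n_max)


if __name__ == "__main__":
    # Growth slows as the length approaches the ceiling n_max.
    for t in (0.0, 100.0, 500.0, 5000.0):
        print(t, sequence_length(t, n0=100, n_max=1000, r=5.0))
```

In this sketch the length saturates at n_max, and letting n_max grow very large recovers approximately linear growth at the constant rate r, which mirrors the abstract's statement that the constant-rate IDIS model is the limiting case of LIIS as n_max tends to infinity.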
Integrative View of the Diversity and Evolution of SWEET and SemiSWEET Sugar Transporters
Jia, Baolei; Zhu, Xiao Feng; Pu, Zhong Ji; Duan, Yu Xi; Hao, Lu Jiang; Zhang, Jie; Chen, Li-Qing; Jeon, Che Ok; Xuan, Yuan Hu
2017-01-01
Sugars Will Eventually be Exported Transporter (SWEET) and SemiSWEET are recently characterized families of sugar transporters in eukaryotes and prokaryotes, respectively. SemiSWEETs contain 3 transmembrane helices (TMHs), while SWEETs contain 7. Here, we performed sequence-based comprehensive analyses for SWEETs and SemiSWEETs across the biosphere. In total, 3,249 proteins were identified and ≈60% proteins were found in green plants and Oomycota, which include a number of important plant pathogens. Protein sequence similarity networks indicate that proteins from different organisms are significantly clustered. Of note, SemiSWEETs with 3 or 4 TMHs that may fuse to SWEET were identified in plant genomes. 7-TMH SWEETs were found in bacteria, implying that SemiSWEET can be fused directly in prokaryote. 15-TMH extraSWEET and 25-TMH superSWEET were also observed in wild rice and oomycetes, respectively. The transporters can be classified into 4, 2, 2, and 2 clades in plants, Metazoa, unicellular eukaryotes, and prokaryotes, respectively. The consensus and coevolution of amino acids in SWEETs were identified by multiple sequence alignments. The functions of the highly conserved residues were analyzed by molecular dynamics analysis. The 19 most highly conserved residues in the SWEETs were further confirmed by point mutagenesis using SWEET1 from Arabidopsis thaliana. The results proved that the conserved residues located in the extrafacial gate (Y57, G58, G131, and P191), the substrate binding pocket (N73, N192, and W176), and the intrafacial gate (P43, Y83, F87, P145, M161, P162, and Q202) play important roles for substrate recognition and transport processes. Taken together, our analyses provide a foundation for understanding the diversity, classification, and evolution of SWEETs and SemiSWEETs using large-scale sequence analysis and further show that gene duplication and gene fusion are important factors driving the evolution of SWEETs. PMID:29326750
NASA Astrophysics Data System (ADS)
Batezelli, Alessandro; Ladeira, Francisco Sergio Bernardes
2016-01-01
With the breakup of the supercontinent Gondwana, the South American Plate underwent an intense process of tectonic restructuring that led to the genesis of the interior basins that encompassed continental sedimentary sequences. During the Late Cretaceous, the evolution of the Brazilian Bauru, Sanfranciscana and Parecis basins was linked to this process of structuring, and the basins therefore have very similar sedimentary characteristics. The purpose of this study is to establish a detailed understanding of alluvial sedimentary processes and architecture within a stratigraphic sequence framework using the concept of the stratigraphic base level or the ratio between the accommodation space and sediment supply. The integration of the stratigraphic and facies data contributed to defining the stratigraphic architecture of the Bauru, Sanfranciscana and Parecis Basins, supporting a model for continental sequences that depicts qualitative changes in the sedimentation rate (S) and accommodation space (A) that occurred during the Cretaceous. This study discusses the origin of the unconformity surfaces (K-0, K-1 and K-1A) that separate Sequences 1, 2A and 2B and the sedimentary characteristics of the Bauru, Sanfranciscana and Parecis Basins from the Aptian to the Maastrichtian, comparing the results with other Cretaceous Brazilian basins. The Lower Cretaceous Sequence 1 (Caiuá and Areado groups) is interpreted as a low-accommodation systems tract composed of fluvial and aeolian systems. The Upper Cretaceous lacustrine, braided river-dominated alluvial fan and aeolian systems display characteristics of the evolution from high- to low-accommodation systems tracts (Sequences 2A and 2B). Unconformity K-0 is related to the origin of the Bauru Basin itself in the Early Cretaceous. In the Sanfranciscana and Parecis basins, Unconformity K-0 marks the contact between Lower Cretaceous aeolian deposits and Upper Cretaceous alluvial systems (Sequences 1 and 2). Unconformity K-1, which was generated in the Late Cretaceous, is related to an increase in the A/S ratio, whereas Unconformity K-1A is the result of a decrease in the A/S ratio. Unconformity K-1A bounds Sequence 2A (lacustrine and fluvial systems) and Sequence 2B (alluvial deposits) in the Bauru Basin, whereas in the Sanfranciscana and Parecis basins this unconformity marks the transition from an alluvial system to an aeolian system (Sequences 2A and 2B). Changes in depositional style in both basins correspond to two distinct tectonic moments occurring within the South American plate. The first is associated with post-volcanic thermal subsidence of the Early Cretaceous (Serra Geral and Tapirapuã volcanism), and the second with the uplift that occurred in the Late Cretaceous (Alto Paranaíba, Vilhena and Serra Formosa Arcs).
OncoNEM: inferring tumor evolution from single-cell sequencing data.
Ross, Edith M; Markowetz, Florian
2016-04-15
Single-cell sequencing promises a high-resolution view of genetic heterogeneity and clonal evolution in cancer. However, methods to infer tumor evolution from single-cell sequencing data lag behind methods developed for bulk-sequencing data. Here, we present OncoNEM, a probabilistic method for inferring intra-tumor evolutionary lineage trees from somatic single nucleotide variants of single cells. OncoNEM identifies homogeneous cellular subpopulations and infers their genotypes as well as a tree describing their evolutionary relationships. In simulation studies, we assess OncoNEM's robustness and benchmark its performance against competing methods. Finally, we show its applicability in case studies of muscle-invasive bladder cancer and essential thrombocythemia.
Li, Shu-Fen; Zhang, Guo-Jun; Yuan, Jin-Hong; Deng, Chuan-Liang; Gao, Wu-Jun
2016-05-01
The present review discusses the roles that repetitive sequences play in plant sex chromosome evolution, and highlights epigenetic modification as a potential mechanism by which repetitive sequences are involved in sex chromosome evolution. Sex determination in plants is mostly based on sex chromosomes. Classic theory proposes that sex chromosomes evolve from a specific pair of autosomes with the emergence of one or more sex-determining genes. Subsequently, the newly formed sex chromosomes stop recombining in a small region around the sex-determining locus, and over time, the non-recombining region expands to almost all parts of the sex chromosomes. Accumulation of repetitive sequences, mostly transposable elements and tandem repeats, is a conspicuous feature of the non-recombining region of the Y chromosome, even in primitive ones. Repetitive sequences may play multiple roles in sex chromosome evolution, such as triggering heterochromatization and causing recombination suppression, leading to structural and morphological differentiation of sex chromosomes, and promoting Y chromosome degeneration and X chromosome dosage compensation. In this article, we review the current status of this field, and based on preliminary evidence, we posit that repetitive sequences are involved in sex chromosome evolution probably via epigenetic modification, such as DNA and histone methylation, with small interfering RNAs as the mediator.
Understanding protein evolution: from protein physics to Darwinian selection.
Zeldovich, Konstantin B; Shakhnovich, Eugene I
2008-01-01
Efforts in whole-genome sequencing and structural proteomics start to provide a global view of the protein universe, the set of existing protein structures and sequences. However, approaches based on the selection of individual sequences have not been entirely successful at the quantitative description of the distribution of structures and sequences in the protein universe because evolutionary pressure acts on the entire organism, rather than on a particular molecule. In parallel to this line of study, studies in population genetics and phenomenological molecular evolution established a mathematical framework to describe the changes in genome sequences in populations of organisms over time. Here, we review both microscopic (physics-based) and macroscopic (organism-level) models of protein-sequence evolution and demonstrate that bridging the two scales provides the most complete description of the protein universe starting from clearly defined, testable, and physiologically relevant assumptions.
Differential evolution-simulated annealing for multiple sequence alignment
NASA Astrophysics Data System (ADS)
Addawe, R. C.; Addawe, J. M.; Sueño, M. R. K.; Magadia, J. C.
2017-10-01
Multiple sequence alignments (MSAs) are used in the analysis of molecular evolution and sequence-structure relationships. In this paper, a hybrid algorithm, Differential Evolution-Simulated Annealing (DESA), is applied to optimizing MSAs based on structural information, non-gaps percentage and totally conserved columns. DESA is a robust algorithm characterized by self-organization, mutation, crossover, and an SA-like selection scheme for the strategy parameters. Here, the MSA problem is treated as a multi-objective optimization problem for the hybrid evolutionary algorithm DESA. We therefore name the algorithm DESA-MSA. Simulated sequences and alignments were generated to evaluate the accuracy and efficiency of DESA-MSA using different indel sizes, sequence lengths, deletion rates and insertion rates. The proposed hybrid algorithm obtained acceptable solutions, particularly for the MSA problem evaluated on the three objectives.
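Two of the three objectives named in the abstract above, non-gaps percentage and totally conserved columns, can be illustrated with a minimal Python sketch. This is not the DESA-MSA implementation: the function names, the "-" gap symbol, and the toy alignment are assumptions made for illustration, and the structure-based objective is omitted because the abstract does not define it.

```python
# Illustrative scoring of an alignment on two of the objectives named in the
# abstract. Function names, the "-" gap symbol and the toy alignment are
# hypothetical; this is not the authors' code.

def non_gap_percentage(alignment, gap="-"):
    """Percentage of alignment cells that are residues rather than gaps."""
    total_cells = sum(len(row) for row in alignment)
    non_gap_cells = sum(ch != gap for row in alignment for ch in row)
    return 100.0 * non_gap_cells / total_cells


def totally_conserved_columns(alignment, gap="-"):
    """Number of columns in which every sequence carries the same residue."""
    count = 0
    for column in zip(*alignment):  # iterate over columns, not rows
        first = column[0]
        if first != gap and all(ch == first for ch in column):
            count += 1
    return count


# Toy alignment of three equal-length rows (an assumption for this example).
alignment = ["ACG-T", "ACGAT", "A-GAT"]
conserved = totally_conserved_columns(alignment)
non_gaps = non_gap_percentage(alignment)
```

A multi-objective optimizer would try to increase both quantities together; how DESA-MSA weights or combines them is not stated in the abstract.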
Building blocks of a fish head: Developmental and variational modularity in a complex system.
Lehoux, Caroline; Cloutier, Richard
2015-11-01
Evolution of the vertebrate skull is developmentally constrained by the interactions among its anatomical systems, such as the dermatocranium and the sensory system. The interaction between the dermal bones and lateral line canals has been debated for decades but their morphological integration has never been tested. An ontogenetic series of 97 juvenile and adult Amia calva (Actinopterygii) was used to describe the patterning and modularity of sensory lateral line canals and their integration with supporting cranial bones. Developmental modules were tested for the otic canal and supratemporal commissure by computing correlations in the branching sequence of groups of pores. Landmarks were digitized on 25 specimens to test a priori hypotheses of variational and developmental modularity at the level of canals and dermal bones. Branching sequence suggests a specific patterning supported by significant positive correlations in the sequence of appearance of branches between bilateral sides. Differences in patterning between the otic canal and the supratemporal commissure and tests of modularity with geometric morphometrics suggest that both canals form distinct modules. The integration between bones and canals was insufficient to detect a module. However, both components were not independent. Groups of pores tended to disappear without affecting other groups of pores suggesting that they are quasi-independent units acting as modules. This study provides evidence of a hierarchical organization for the modular sensory system that could explain variation of pattern of canals among species and their association with dermal bones. © 2015 Wiley Periodicals, Inc.
Tsioris, Konstantinos; Gupta, Namita T.; Ogunniyi, Adebola O.; Zimnisky, Ross M.; Qian, Feng; Yao, Yi; Wang, Xiaomei; Stern, Joel N. H.; Chari, Raj; Briggs, Adrian W.; Clouser, Christopher R.; Vigneault, Francois; Church, George M.; Garcia, Melissa N.; Murray, Kristy O.; Montgomery, Ruth R.; Kleinstein, Steven H.; Love, J. Christopher
2015-01-01
West Nile virus (WNV) infection is an emerging mosquito-borne disease that can lead to severe neurological illness and currently has no available treatment or vaccine. Using microengraving, an integrated single-cell analysis method, we analyzed a cohort of subjects infected with WNV - recently infected and post-convalescent subjects - and efficiently identified four novel WNV neutralizing antibodies. We also assessed the humoral response to WNV on a single-cell and repertoire level by integrating next generation sequencing (NGS) into our analysis. The results from single-cell analysis indicate persistence of WNV-specific memory B cells and antibody-secreting cells in post-convalescent subjects. These cells exhibited class-switched antibody isotypes. Furthermore, the results suggest that the antibody response itself does not predict the clinical severity of the disease (asymptomatic or symptomatic). Using the nucleotide coding sequences for WNV-specific antibodies derived from single cells, we revealed the ontogeny of expanded WNV-specific clones in the repertoires of recently infected subjects through NGS and bioinformatic analysis. This analysis also indicated that the humoral response to WNV did not depend on an anamnestic response, due to an unlikely previous exposure to the virus. The innovative and integrative approach presented here to analyze the evolution of neutralizing antibodies from natural infection on a single-cell and repertoire level can also be applied to vaccine studies, and could potentially aid the development of therapeutic antibodies and our basic understanding of other infectious diseases. PMID:26481611
Martin, Guillaume; Baurens, Franc-Christophe; Cardi, Céline; Aury, Jean-Marc; D’Hont, Angélique
2013-01-01
Background: Banana (genus Musa) is a crop of major economic importance worldwide. It is a monocotyledonous member of the Zingiberales, a sister group of the widely studied Poales. Most cultivated bananas are natural Musa inter-(sub-)specific triploid hybrids. A Musa acuminata reference nuclear genome sequence was recently produced based on sequencing of genomic DNA enriched in nucleus. Methodology/Principal Findings: The Musa acuminata chloroplast genome was assembled with chloroplast reads extracted from whole-genome-shotgun sequence data. The Musa chloroplast genome is a circular molecule of 169,972 bp with a quadripartite structure containing two single copy regions, a Large Single Copy region (LSC, 88,338 bp) and a Small Single Copy region (SSC, 10,768 bp) separated by Inverted Repeat regions (IRs, 35,433 bp). Two forms of the chloroplast genome relative to the orientation of SSC versus LSC were found. The Musa chloroplast genome shows an extreme IR expansion at the IR/SSC boundary relative to the most common structures found in angiosperms. This expansion consists of the integration of three additional complete genes (rps15, ndhH and ycf1) and part of the ndhA gene. No such expansion has been observed in monocots so far. Simple Sequence Repeats were identified in the Musa chloroplast genome and a new set of Musa chloroplastic markers was designed. Conclusion: The complete sequence of M. acuminata ssp malaccensis chloroplast we reported here is the first one for the Zingiberales order. As such it provides new insight in the evolution of the chloroplast of monocotyledons. In particular, it reinforces that IR/SSC expansion has occurred independently several times within monocotyledons. The discovery of new polymorphic markers within Musa chloroplast opens new perspectives to better understand the origin of cultivated triploid bananas. PMID:23840670
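The region sizes quoted in the abstract above can be checked against the reported genome length with a few lines of Python. The sketch assumes that the 35,433 bp figure is the length of each of the two inverted repeat copies, which is the reading under which the quadripartite structure adds up; the variable names are illustrative.

```python
# Consistency check of the quadripartite structure reported in the abstract.
# Assumption: 35,433 bp is the length of EACH inverted repeat copy (IRa, IRb).
lsc_bp = 88_338  # Large Single Copy region
ssc_bp = 10_768  # Small Single Copy region
ir_bp = 35_433   # one Inverted Repeat copy

genome_bp = lsc_bp + ssc_bp + 2 * ir_bp
ir_fraction = 2 * ir_bp / genome_bp  # share of the genome held in the two IRs
```

Under that assumption the four regions sum exactly to the reported 169,972 bp.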
SNP-VISTA: An interactive SNP visualization tool
Shah, Nameeta; Teplitsky, Michael V; Minovitsky, Simon; Pennacchio, Len A; Hugenholtz, Philip; Hamann, Bernd; Dubchak, Inna L
2005-01-01
Background: Recent advances in sequencing technologies promise to provide a better understanding of the genetics of human disease as well as the evolution of microbial populations. Single Nucleotide Polymorphisms (SNPs) are established genetic markers that aid in the identification of loci affecting quantitative traits and/or disease in a wide variety of eukaryotic species. With today's technological capabilities, it has become possible to re-sequence a large set of appropriate candidate genes in individuals with a given disease in an attempt to identify causative mutations. In addition, SNPs have been used extensively in efforts to study the evolution of microbial populations, and the recent application of random shotgun sequencing to environmental samples enables more extensive SNP analysis of co-occurring and co-evolving microbial populations. The program is available at [1]. Results: We have developed and present two modifications of an interactive visualization tool, SNP-VISTA, to aid in the analyses of the following types of data: A. Large-scale re-sequence data of disease-related genes for discovery of associated and/or causative alleles (GeneSNP-VISTA). B. Massive amounts of ecogenomics data for studying homologous recombination in microbial populations (EcoSNP-VISTA). The main features and capabilities of SNP-VISTA are: 1) mapping of SNPs to gene structure; 2) classification of SNPs, based on their location in the gene, frequency of occurrence in samples and allele composition; 3) clustering, based on user-defined subsets of SNPs, highlighting haplotypes as well as recombinant sequences; 4) integration of protein evolutionary conservation visualization; and 5) display of automatically calculated recombination points that are user-editable. Conclusion: The main strength of SNP-VISTA is its graphical interface and use of visual representations, which support interactive exploration and hence better understanding of large-scale SNP data by the user. PMID:16336665
USDA-ARS's Scientific Manuscript database
Interrogation of modern and ancient bovine genome sequences provides a valuable model to study the evolution of cattle. Here, we analyse the first complete wild aurochs (Bos primigenius) genome sequence using DNA extracted from a ~ 6,750 year-old humerus bone retrieved from a cave site in Derbyshire...
Did A Planet Survive A Post-Main Sequence Evolutionary Event?
NASA Astrophysics Data System (ADS)
Sorber, Rebecca; Jang-Condell, Hannah; Zimmerman, Mara
2018-06-01
GL 86 is a star system approximately 10 pc away, consisting of a main-sequence K-type star of ~0.77 M⊙ (GL 86A) and a white dwarf companion of ~0.49 M⊙ (GL 86B). The binary has a semi-major axis of ~18.4 AU, an orbital period of ~353 yr, and an eccentricity of ~0.39. A 4.5 MJ planet orbits the main-sequence star with a semi-major axis of 0.113 AU and an orbital period of 15.76 days, in a near-circular orbit with an eccentricity of 0.046. If we assume that this planet formed while the white dwarf was still a main-sequence star, it would have been difficult for the planet to remain in a stable orbit during the post-main-sequence evolution of GL 86B. Planet survival through the post-main-sequence evolution will be examined by modeling with the program Mercury (Chambers 1999). Using the model, we examine the origins of the planet: whether it formed before or after the post-main-sequence evolution of GL 86B. The modeling will give us insight into the dynamical evolution not only of the binary star system but also of the planet’s life cycle.
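As a rough plausibility check on the planetary parameters quoted above, Kepler's third law in solar units (period in years, semi-major axis in AU, mass in solar masses) gives the orbital period directly. The Python sketch below is an illustration only: the function name is hypothetical, the planet's own mass is neglected, and it is unrelated to the Mercury integrator used in the study.

```python
import math

DAYS_PER_YEAR = 365.25  # Julian year, used only for unit conversion


def orbital_period_years(semi_major_axis_au, central_mass_msun):
    """Kepler's third law in solar units: P^2 = a^3 / M (orbiting mass neglected)."""
    return math.sqrt(semi_major_axis_au ** 3 / central_mass_msun)


# Planet around the ~0.77 solar-mass K star at 0.113 AU (values from the abstract).
planet_period_days = orbital_period_years(0.113, 0.77) * DAYS_PER_YEAR
```

With the rounded values from the abstract this comes out at roughly 15.8 days, close to the quoted 15.76 days; the small difference is within what the rounding of the stellar mass and semi-major axis allows.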
Petit, Daniel; Teppa, Elin; Mir, Anne-Marie; Vicogne, Dorothée; Thisse, Christine; Thisse, Bernard; Filloux, Cyril; Harduin-Lepers, Anne
2015-01-01
Sialyltransferases are responsible for the synthesis of a diverse range of sialoglycoconjugates predicted to be pivotal to deuterostomes’ evolution. In this work, we reconstructed the evolutionary history of the metazoan α2,3-sialyltransferases family (ST3Gal), a subset of sialyltransferases encompassing six subfamilies (ST3Gal I–ST3Gal VI) functionally characterized in mammals. Exploration of genomic and expressed sequence tag databases and search of conserved sialylmotifs led to the identification of a large data set of st3gal-related gene sequences. Molecular phylogeny and large scale sequence similarity network analysis identified four new vertebrate subfamilies called ST3Gal III-r, ST3Gal VII, ST3Gal VIII, and ST3Gal IX. To address the issue of the origin and evolutionary relationships of the st3gal-related genes, we performed comparative syntenic mapping of st3gal gene loci combined to ancestral genome reconstruction. The ten vertebrate ST3Gal subfamilies originated from genome duplication events at the base of vertebrates and are organized in three distinct and ancient groups of genes predating the early deuterostomes. Inferring st3gal gene family history identified also several lineage-specific gene losses, the significance of which was explored in a functional context. Toward this aim, spatiotemporal distribution of st3gal genes was analyzed in zebrafish and bovine tissues. In addition, molecular evolutionary analyses using specificity determining position and coevolved amino acid predictions led to the identification of amino acid residues with potential implication in functional divergence of vertebrate ST3Gal. We propose a detailed scenario of the evolutionary relationships of st3gal genes coupled to a conceptual framework of the evolution of ST3Gal functions. PMID:25534026
Retrotransposons as regulators of gene expression
Elbarbary, Reyad A.; Lucas, Bronwyn A.; Maquat, Lynne E.
2016-01-01
Transposable elements (TEs) are both a boon and a bane to eukaryotic organisms, depending on where they integrate into the genome and how their sequences function once integrated. We focus on two types of TEs: long interspersed elements (LINEs) and short interspersed elements (SINEs). LINEs and SINEs are retrotransposons; that is, they transpose via an RNA intermediate. We discuss how LINEs and SINEs have expanded in eukaryotic genomes and contribute to genome evolution. An emerging body of evidence indicates that LINEs and SINEs function to regulate gene expression by affecting chromatin structure, gene transcription, pre-mRNA processing, or aspects of mRNA metabolism. We also describe how adenosine-to-inosine editing influences SINE function and how ongoing retrotransposition is countered by the body’s defense mechanisms. PMID:26912865
Ramharack, Pritika; Soliman, Mahmoud E S
2018-06-01
Originally developed for the analysis of biological sequences, bioinformatics has advanced into one of the most widely recognized domains in the scientific community. Despite this technological evolution, there is still an urgent need for nontoxic and efficient drugs. The onus now falls on the 'omics domain to meet this need by implementing bioinformatics techniques that will allow for the introduction of pioneering approaches in the rational drug design process. Here, we categorize an updated list of informatics tools and explore the capabilities of integrative bioinformatics in disease control. We believe that our review will serve as a comprehensive guide toward bioinformatics-oriented disease and drug discovery research. Copyright © 2018 Elsevier Ltd. All rights reserved.
Valiadi, Martha; Iglesias-Rodriguez, Maria Debora
2014-01-01
Dinoflagellate bioluminescence systems operate with or without a luciferin binding protein, representing two distinct modes of light production. However, the distribution, diversity, and evolution of the luciferin binding protein gene within bioluminescent dinoflagellates are not well known. We used PCR to detect and partially sequence this gene from the heterotrophic dinoflagellate Noctiluca scintillans and a group of ecologically important gonyaulacoid species. We report an additional luciferin binding protein gene in N. scintillans which is not attached to luciferase, further to its typical combined bioluminescence gene. This supports the hypothesis that a profound re-organization of the bioluminescence system has taken place in this organism. We also show that the luciferin binding protein gene is present in the genera Ceratocorys, Gonyaulax, and Protoceratium, and is prevalent in bioluminescent species of Alexandrium. Therefore, this gene is an integral component of the standard molecular bioluminescence machinery in dinoflagellates. Nucleotide sequences showed high within-strain variation among gene copies, revealing a highly diverse gene family comprising multiple gene types in some organisms. Phylogenetic analyses showed that, in some species, the evolution of the luciferin binding protein gene was different from the organism's general phylogenies, highlighting the complex evolutionary history of dinoflagellate bioluminescence systems. © 2013 The Author(s) Journal of Eukaryotic Microbiology © 2013 International Society of Protistologists.
Classification and Lineage Tracing of SH2 Domains Throughout Eukaryotes.
Liu, Bernard A
2017-01-01
Today there exists a rapidly expanding number of sequenced genomes. Cataloging protein interaction domains such as the Src Homology 2 (SH2) domain across these various genomes can be accomplished with ease thanks to existing algorithms and prediction models. An evolutionary analysis of SH2 domains provides a step towards understanding how SH2 proteins integrated with existing signaling networks to position phosphotyrosine signaling as a crucial driver of robust cellular communication networks in metazoans. However, organizing and tracing SH2 domains across organisms and understanding their evolutionary trajectory remains a challenge. This chapter describes several methodologies for analyzing the evolutionary trajectory of SH2 domains, including a global SH2 domain classification system, which facilitates the annotation of new SH2 sequences that is essential for tracing the lineage of SH2 domains throughout eukaryote evolution. This classification utilizes a combination of sequence homology, protein domain architecture and the boundary positions between introns and exons within the SH2 domain or the genes encoding these domains. Discrete SH2 families can then be traced across various genomes to provide insight into their origins. Furthermore, additional methods for examining potential mechanisms for the divergence of SH2 domains, from structural changes to alterations in protein domain content and genome duplication, will be discussed. A better understanding of SH2 domain evolution may therefore enhance our insight into the emergence of phosphotyrosine signaling and the expansion of protein interaction domains.
Extensive Mobilome-Driven Genome Diversification in Mouse Gut-Associated Bacteroides vulgatus mpk.
Lange, Anna; Beier, Sina; Steimle, Alex; Autenrieth, Ingo B; Huson, Daniel H; Frick, Julia-Stefanie
2016-04-25
Like many other Bacteroides species, Bacteroides vulgatus strain mpk, a mouse fecal isolate which was shown to promote intestinal homeostasis, utilizes a variety of mobile elements for genome evolution. Based on sequences collected by Pacific Biosciences SMRT sequencing technology, we discuss the challenges of assembling and studying a bacterial genome of high plasticity. Additionally, we conducted comparative genomics comparing this commensal strain with the B. vulgatus type strain ATCC 8482 as well as multiple other Bacteroides and Parabacteroides strains to reveal the most important differences and identify the unique features of B. vulgatus mpk. The genome of B. vulgatus mpk harbors a large and diverse set of mobile element proteins compared with other sequenced Bacteroides strains. We found evidence of a number of different horizontal gene transfer events and a genome landscape that has been extensively altered by different mobilization events. A CRISPR/Cas system could be identified that provides a possible mechanism for preventing the integration of invading external DNA. We propose that the high genome plasticity and the introduced genome instabilities of B. vulgatus mpk arising from the various mobilization events might play an important role not only in its adaptation to the challenging intestinal environment in general, but also in its ability to interact with the gut microbiota. © The Author(s) 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Evolution beyond neo-Darwinism: a new conceptual framework.
Noble, Denis
2015-01-01
Experimental results in epigenetics and related fields of biological research show that the Modern Synthesis (neo-Darwinist) theory of evolution requires either extension or replacement. This article examines the conceptual framework of neo-Darwinism, including the concepts of 'gene', 'selfish', 'code', 'program', 'blueprint', 'book of life', 'replicator' and 'vehicle'. This form of representation is a barrier to extending or replacing existing theory as it confuses conceptual and empirical matters. These need to be clearly distinguished. In the case of the central concept of 'gene', the definition has moved all the way from describing a necessary cause (defined in terms of the inheritable phenotype itself) to an empirically testable hypothesis (in terms of causation by DNA sequences). Neo-Darwinism also privileges 'genes' in causation, whereas in multi-way networks of interactions there can be no privileged cause. An alternative conceptual framework is proposed that avoids these problems, and which is more favourable to an integrated systems view of evolution. © 2015. Published by The Company of Biologists Ltd.
Lithium in halo stars from standard stellar evolution
NASA Technical Reports Server (NTRS)
Deliyannis, Constantine P.; Demarque, Pierre; Kawaler, Steven D.
1990-01-01
A grid has been constructed of theoretical evolution sequences of models for low-metallicity stars from the premain-sequence to the giant branch phases. The grid is used to study the history of surface Li abundance during standard stellar evolution. The Li-7 observations of halo stars by Spite and Spite (1982) and subsequent observations are synthesized to separate the halo stars by age. The theory of surface Li abundance is illustrated by following the evolution of a reference halo star model from the contracting fully convective premain sequence to the giant branch phase. The theoretical models are compared with observed Li abundances. The results show that the halo star lithium abundances can be explained in the context of standard stellar evolution theory using completely standard assumptions and physics.
Nakano, Shogo; Asano, Yasuhisa
2015-02-03
Development of software and methods for design of complete sequences of functional proteins could contribute to studies of protein engineering and protein evolution. To this end, we developed the INTMSAlign software, and used it to design functional proteins and evaluate their usefulness. The software could assign both consensus and correlation residues of target proteins. We generated three protein sequences with S-selective hydroxynitrile lyase (S-HNL) activity, which we call designed S-HNLs; these proteins folded as efficiently as the native S-HNL. Sequence and biochemical analysis of the designed S-HNLs suggested that accumulation of neutral mutations occurs during the process of S-HNLs evolution from a low-activity form to a high-activity (native) form. Taken together, our results demonstrate that our software and the associated methods could be applied not only to design of complete sequences, but also to predictions of protein evolution, especially within families such as esterases and S-HNLs.
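As a rough illustration of what "consensus residues" means in the abstract above, the sketch below takes the most frequent non-gap residue in each column of a multiple sequence alignment. It is not the INTMSAlign algorithm, which the abstract does not describe; the function name `consensus` and the toy alignment are hypothetical, chosen only for illustration.

```python
from collections import Counter


def consensus(aligned, gap="-"):
    """Return the most frequent non-gap residue in each alignment column.

    Hypothetical helper for illustration; not taken from INTMSAlign.
    """
    if not aligned:
        return ""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")
    residues = []
    for column in zip(*aligned):
        counts = Counter(r for r in column if r != gap)
        # A column made up only of gaps stays a gap in the consensus.
        residues.append(counts.most_common(1)[0][0] if counts else gap)
    return "".join(residues)


# Invented toy alignment (three sequences, six columns).
toy_alignment = ["MKT-LV", "MRTALV", "MKTAIV"]
print(consensus(toy_alignment))
```

A simple majority rule like this ignores correlated positions, which the abstract says the software also assigns; handling those would need pairwise column statistics that are beyond this sketch.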
Guérillot, Romain; Siguier, Patricia; Gourbeyre, Edith; Chandler, Michael; Glaser, Philippe
2014-01-01
Transposable elements (TEs) are major components of both prokaryotic and eukaryotic genomes and play a significant role in their evolution. In this study, we have identified new prokaryotic DDE transposase families related to the eukaryotic Mutator-like transposases. These genes were retrieved by cascade PSI-Blast using as initial query the transposase of the streptococcal integrative and conjugative element (ICE) TnGBS2. By combining secondary structure predictions and protein sequence alignments, we predicted the DDE catalytic triad and the DNA-binding domain recognizing the terminal inverted repeats. Furthermore, we systematically characterized the organization and the insertion specificity of the TEs relying on these prokaryotic Mutator-like transposases (p-MULT) for their mobility. Strikingly, two distant TE families target their integration upstream σA dependent promoters. This allowed us to identify a transposase sequence signature associated with this unique insertion specificity and to show that the dissymmetry between the two inverted repeats is responsible for the orientation of the insertion. Surprisingly, while DDE transposases are generally associated with small and simple transposons such as insertion sequences (ISs), p-MULT encoding TEs show an unprecedented diversity with several families of IS, transposons, and ICEs ranging in size from 1.1 to 52 kb. PMID:24418649
Richards, Stephen; Liu, Yue; Bettencourt, Brian R.; Hradecky, Pavel; Letovsky, Stan; Nielsen, Rasmus; Thornton, Kevin; Hubisz, Melissa J.; Chen, Rui; Meisel, Richard P.; Couronne, Olivier; Hua, Sujun; Smith, Mark A.; Zhang, Peili; Liu, Jing; Bussemaker, Harmen J.; van Batenburg, Marinus F.; Howells, Sally L.; Scherer, Steven E.; Sodergren, Erica; Matthews, Beverly B.; Crosby, Madeline A.; Schroeder, Andrew J.; Ortiz-Barrientos, Daniel; Rives, Catharine M.; Metzker, Michael L.; Muzny, Donna M.; Scott, Graham; Steffen, David; Wheeler, David A.; Worley, Kim C.; Havlak, Paul; Durbin, K. James; Egan, Amy; Gill, Rachel; Hume, Jennifer; Morgan, Margaret B.; Miner, George; Hamilton, Cerissa; Huang, Yanmei; Waldron, Lenée; Verduzco, Daniel; Clerc-Blankenburg, Kerstin P.; Dubchak, Inna; Noor, Mohamed A.F.; Anderson, Wyatt; White, Kevin P.; Clark, Andrew G.; Schaeffer, Stephen W.; Gelbart, William; Weinstock, George M.; Gibbs, Richard A.
2005-01-01
We have sequenced the genome of a second Drosophila species, Drosophila pseudoobscura, and compared this to the genome sequence of Drosophila melanogaster, a primary model organism. Throughout evolution the vast majority of Drosophila genes have remained on the same chromosome arm, but within each arm gene order has been extensively reshuffled, leading to a minimum of 921 syntenic blocks shared between the species. A repetitive sequence is found in the D. pseudoobscura genome at many junctions between adjacent syntenic blocks. Analysis of this novel repetitive element family suggests that recombination between offset elements may have given rise to many paracentric inversions, thereby contributing to the shuffling of gene order in the D. pseudoobscura lineage. Based on sequence similarity and synteny, 10,516 putative orthologs have been identified as a core gene set conserved over 25–55 million years (Myr) since the pseudoobscura/melanogaster divergence. Genes expressed in the testes had higher amino acid sequence divergence than the genome-wide average, consistent with the rapid evolution of sex-specific proteins. Cis-regulatory sequences are more conserved than random and nearby sequences between the species—but the difference is slight, suggesting that the evolution of cis-regulatory elements is flexible. Overall, a pattern of repeat-mediated chromosomal rearrangement, and high coadaptation of both male genes and cis-regulatory sequences emerges as important themes of genome divergence between these species of Drosophila. PMID:15632085
Integrated Solar System Exploration Education and Public Outreach: Theme, Products and Activities
NASA Technical Reports Server (NTRS)
Lowes, Leslie; Lindstrom, Marilyn; Stockman, Stephanie; Scalice, Daniela; Allen, Jaclyn; Tobola, Kay; Klug, Sheri; Harmon, Art
2004-01-01
NASA's Solar System Exploration Program is entering an unprecedented period of exploration and discovery. Its goal is to understand the origin and evolution of the solar system and life within it. SSE missions are operating or in development to study the far reaches of our solar system and beyond. These missions proceed in sequence for each body from reconnaissance flybys through orbiters and landers or rovers to sample returns. SSE research programs develop new instruments, analyze mission data or returned samples, and provide experimental or theoretical models to aid in interpretation.
Sequence-Level Mechanisms of Human Epigenome Evolution
Prendergast, James G.D.; Chambers, Emily V.; Semple, Colin A.M.
2014-01-01
DNA methylation and chromatin states play key roles in development and disease. However, the extent of recent evolutionary divergence in the human epigenome and the influential factors that have shaped it are poorly understood. To determine the links between genome sequence and human epigenome evolution, we examined the divergence of DNA methylation and chromatin states following segmental duplication events in the human lineage. Chromatin and DNA methylation states were found to have been generally well conserved following a duplication event, with the evolution of the epigenome largely uncoupled from the total number of genetic changes in the surrounding DNA sequence. However, the epigenome at tissue-specific, distal regulatory regions was observed to be unusually prone to diverge following duplication, with particular sequence differences, altering known sequence motifs, found to be associated with divergence in patterns of DNA methylation and chromatin. Alu elements were found to have played a particularly prominent role in shaping human epigenome evolution, and we show that human-specific AluY insertion events are strongly linked to the evolution of the DNA methylation landscape and gene expression levels, including at key neurological genes in the human brain. Studying paralogous regions within the same sample enables the study of the links between genome and epigenome evolution while controlling for biological and technical variation. We show DNA methylation and chromatin divergence between duplicated regions are linked to the divergence of particular genetic motifs, with Alu elements having played a disproportionate role in the evolution of the epigenome in the human lineage. PMID:24966180
Liu, Lin; Nardo, David; Li, Eric; Wang, Gary P
2016-03-13
CD4 T-cell depletion from HIV infection leads to a global decline in anti-hepatitis C virus (HCV) envelope neutralizing antibody (nAb) response, which may play a role in accelerating liver fibrosis. An increase in anti-HCV nAb titers has been reported during antiretroviral therapy (ART) but its impact on HCV remains poorly understood. The objective of this study is to determine the effects of ART on long-term HCV evolution. We examined HCV quasispecies structure and long-term evolution in HIV/HCV coinfected patients with ART-induced CD4 T-cell recovery, and compared them with patients with CD4 T-cell depletion from delayed ART. We applied a single-variant sequencing (SVS) method to construct authentic viral quasispecies and compared sequence evolution in HCV envelope, the primary target for humoral immune responses, and NS3, a target for cellular immunity, between the two cohorts. The SVS method corrected biases known to skew the proportions of viral variants, revealing authentic HCV quasispecies structures. We observed higher rates of HCV envelope sequence evolution in patients with ART-induced CD4 T-cell recovery, compared with patients with CD4 T-cell depletion from delayed ART (P = 0.03). Evolutionary rates for NS3 were considerably lower than the rates for envelope (P < 0.01), with no significant difference observed between the two groups. ART-induced CD4 T-cell recovery results in rapid sequence evolution in HCV envelope, but not in NS3. These results suggest that suppressive ART disproportionately enhances HCV-specific humoral responses more than cellular responses, resulting in rapid sequence evolution in HCV envelope but not NS3.
MitoRes: a resource of nuclear-encoded mitochondrial genes and their products in Metazoa.
Catalano, Domenico; Licciulli, Flavio; Turi, Antonio; Grillo, Giorgio; Saccone, Cecilia; D'Elia, Domenica
2006-01-24
Mitochondria are sub-cellular organelles that have a central role in energy production and in other metabolic pathways of all eukaryotic respiring cells. In the last few years, with more and more genomes being sequenced, a huge amount of data has been generated providing an unprecedented opportunity to use the comparative analysis approach in studies of evolution and functional genomics with the aim of shedding light on molecular mechanisms regulating mitochondrial biogenesis and metabolism. In this context, the problem of the optimal extraction of representative datasets of genomic and proteomic data assumes a crucial importance. Specialised resources for nuclear-encoded mitochondria-related proteins already exist; however, no mitochondrial database is currently available with the same features of MitoRes, which is an update of the MitoNuc database extensively modified in its structure, data sources and graphical interface. It contains data on nuclear-encoded mitochondria-related products for any metazoan species for which this type of data is available and also provides comprehensive sequence datasets (gene, transcript and protein) as well as useful tools for their extraction and export. MitoRes http://www2.ba.itb.cnr.it/MitoRes/ consolidates information from publicly external sources and automatically annotates them into a relational database. Additionally, it also clusters proteins on the basis of their sequence similarity and interconnects them with genomic data. The search engine and sequence management tools allow the query/retrieval of the database content and the extraction and export of sequences (gene, transcript, protein) and related sub-sequences (intron, exon, UTR, CDS, signal peptide and gene flanking regions) ready to be used for in silico analysis. The tool we describe here has been developed to support lab scientists and bioinformaticians alike in the characterization of molecular features and evolution of mitochondrial targeting sequences. 
The way it provides for the retrieval and extraction of sequences allows the user to overcome the obstacles encountered in the integrative use of different bioinformatic resources and the completeness of the sequence collection allows intra- and interspecies comparison at different biological levels (gene, transcript and protein).
Determinants of the rate of protein sequence evolution
Zhang, Jianzhi; Yang, Jian-Rong
2015-01-01
The rate and mechanism of protein sequence evolution have been central questions in evolutionary biology since the 1960s. Although the rate of protein sequence evolution depends primarily on the level of functional constraint, exactly what constitutes functional constraint has remained unclear. The increasing availability of genomic data has allowed for much needed empirical examinations on the nature of functional constraint. These studies found that the evolutionary rate of a protein is predominantly influenced by its expression level rather than functional importance. A combination of theoretical and empirical analyses have identified multiple mechanisms behind these observations and demonstrated a prominent role that selection against errors in molecular and cellular processes plays in protein evolution. PMID:26055156
2013-01-01
Background As for other major crops, achieving a complete wheat genome sequence is essential for the application of genomics to breeding new and improved varieties. To overcome the complexities of the large, highly repetitive and hexaploid wheat genome, the International Wheat Genome Sequencing Consortium established a chromosome-based strategy that was validated by the construction of the physical map of chromosome 3B. Here, we present improved strategies for the construction of highly integrated and ordered wheat physical maps, using chromosome 1BL as a template, and illustrate their potential for evolutionary studies and map-based cloning. Results Using a combination of novel high throughput marker assays and an assembly program, we developed a high quality physical map representing 93% of wheat chromosome 1BL, anchored and ordered with 5,489 markers including 1,161 genes. Analysis of the gene space organization and evolution revealed that gene distribution and conservation along the chromosome results from the superimposition of the ancestral grass and recent wheat evolutionary patterns, leading to a peak of synteny in the central part of the chromosome arm and an increased density of non-collinear genes towards the telomere. With a density of about 11 markers per Mb, the 1BL physical map provides 916 markers, including 193 genes, for fine mapping the 40 QTLs mapped on this chromosome. Conclusions Here, we demonstrate that high marker density physical maps can be developed in complex genomes such as wheat to accelerate map-based cloning, gain new insights into genome evolution, and provide a foundation for reference sequencing. PMID:23800011
Neo-Darwinism, the Modern Synthesis and selfish genes: are they of use in physiology?
Noble, Denis
2011-01-01
This article argues that the gene-centric interpretations of evolution, and more particularly the selfish gene expression of those interpretations, form barriers to the integration of physiological science with evolutionary theory. A gene-centred approach analyses the relationships between genotypes and phenotypes in terms of differences (change the genotype and observe changes in phenotype). We now know that, most frequently, this does not correctly reveal the relationships because of extensive buffering by robust networks of interactions. By contrast, understanding biological function through physiological analysis requires an integrative approach in which the activity of the proteins and RNAs formed from each DNA template is analysed in networks of interactions. These networks also include components that are not specified by nuclear DNA. Inheritance is not through DNA sequences alone. The selfish gene idea is not useful in the physiological sciences, since selfishness cannot be defined as an intrinsic property of nucleotide sequences independently of gene frequency, i.e. the ‘success’ in the gene pool that is supposed to be attributable to the ‘selfish’ property. It is not a physiologically testable hypothesis. PMID:21135048
Derkarabetian, Shahan; Steinmann, David B.; Hedin, Marshal
2010-01-01
Background Many cave-dwelling animal species display similar morphologies (troglomorphism) that have evolved convergently within and among lineages under the similar selective pressures imposed by cave habitats. Here we study such ecomorphological evolution in cave-dwelling Sclerobuninae harvestmen (Opiliones) from the western United States, providing general insights into morphological homoplasy, rates of morphological change, and the temporal context of cave evolution. Methodology/Principal Findings We gathered DNA sequence data from three independent gene regions, and combined these data with Bayesian hypothesis testing, morphometrics analysis, study of penis morphology, and relaxed molecular clock analyses. Using multivariate morphometric analysis, we find that phylogenetically unrelated taxa have convergently evolved troglomorphism; alternative phylogenetic hypotheses involving less morphological convergence are not supported by Bayesian hypothesis testing. In one instance, this morphology is found in specimens from a high-elevation stony debris habitat, suggesting that troglomorphism can evolve in non-cave habitats. We discovered a strong positive relationship between troglomorphy index and relative divergence time, making it possible to predict taxon age from morphology. Most of our time estimates for the origin of highly-troglomorphic cave forms predate the Pleistocene. Conclusions/Significance While several regions in the eastern and central United States are well-known hotspots for cave evolution, few modern phylogenetic studies have addressed the evolution of cave-obligate species in the western United States. Our integrative studies reveal the recurrent evolution of troglomorphism in a perhaps unexpected geographic region, at surprisingly deep time depths, and in sometimes surprising habitats.
Because some newly discovered troglomorphic populations represent undescribed species, our findings stress the need for further biological exploration, integrative systematic research, and conservation efforts in western US cave habitats. PMID:20479884
Improvisation in evolution of genes and genomes: whose structure is it anyway?
Shakhnovich, Boris E; Shakhnovich, Eugene I
2008-06-01
Significant progress has been made in recent years in a variety of seemingly unrelated fields such as sequencing, protein structure prediction, and high-throughput transcriptomics and metabolomics. At the same time, new microscopic models have been developed that made it possible to analyze the evolution of genes and genomes from first principles. The results from these efforts enable, for the first time, a comprehensive insight into the evolution of complex systems and organisms on all scales--from sequences to organisms and populations. Every newly sequenced genome uncovers new genes, families, and folds. Where do these new genes come from? How do gene duplication and subsequent divergence of sequence and structure affect the fitness of the organism? What role does regulation play in the evolution of proteins and folds? Emerging synergism between data and modeling provides first robust answers to these questions.
The Genomic Evolution of Prostate Cancer
2017-06-01
management and grant writing skills. 15. SUBJECT TERMS Cancer genetics, tumor evolution, tumor heterogeneity, prostate cancer, exome sequencing 16...aggressive disease, it is unclear if the genetic alterations more common in late disease are present early on, but at low frequency, or if they only...from localized to metastatic prostate cancer. 2. KEYWORDS: Cancer genetics, tumor evolution, tumor heterogeneity, prostate cancer, exome sequencing
Pang, Erli; Wu, Xiaomei; Lin, Kui
2016-06-01
Protein evolution plays an important role in the evolution of each genome. Because of their functional nature, in general, most of their parts or sites are differently constrained selectively, particularly by purifying selection. Most previous studies on protein evolution considered individual proteins in their entirety or compared protein-coding sequences with non-coding sequences. Less attention has been paid to the evolution of different parts within each protein of a given genome. To this end, based on PfamA annotation of all human proteins, each protein sequence can be split into two parts: domains or unassigned regions. Using this rationale, single nucleotide polymorphisms (SNPs) in protein-coding sequences from the 1000 Genomes Project were mapped according to two classifications: SNPs occurring within protein domains and those within unassigned regions. With these classifications, we found: the density of synonymous SNPs within domains is significantly greater than that of synonymous SNPs within unassigned regions; however, the density of non-synonymous SNPs shows the opposite pattern. We also found there are signatures of purifying selection on both the domain and unassigned regions. Furthermore, the selective strength on domains is significantly greater than that on unassigned regions. In addition, among all of the human protein sequences, there are 117 PfamA domains in which no SNPs are found. Our results highlight an important aspect of protein domains and may contribute to our understanding of protein evolution.
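The comparison described above comes down to counting SNPs per unit of sequence length in each region class. The sketch below shows that arithmetic in the simplest form, assuming density is defined as SNPs per kilobase; the abstract does not give the exact normalisation. The function name `snps_per_kb` and all counts and lengths are invented placeholders, not values from the study.

```python
def snps_per_kb(snp_count, region_length_bp):
    """SNP density as SNPs per 1000 bp (assumed definition, for illustration)."""
    if region_length_bp <= 0:
        raise ValueError("region length must be positive")
    return 1000 * snp_count / region_length_bp


# Invented placeholder counts; they are NOT results from the paper.
regions = {
    "domain": {"synonymous": 30, "non_synonymous": 20, "length_bp": 10_000},
    "unassigned": {"synonymous": 25, "non_synonymous": 35, "length_bp": 10_000},
}

for name, r in regions.items():
    syn = snps_per_kb(r["synonymous"], r["length_bp"])
    nonsyn = snps_per_kb(r["non_synonymous"], r["length_bp"])
    print(f"{name}: synonymous {syn:.2f}/kb, non-synonymous {nonsyn:.2f}/kb")
```

Whether a difference between two such densities is significant would need a statistical test on the real counts, which this sketch does not attempt.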
Currin, Andrew; Swainston, Neil; Day, Philip J.
2015-01-01
The amino acid sequence of a protein affects both its structure and its function. Thus, the ability to modify the sequence, and hence the structure and activity, of individual proteins in a systematic way, opens up many opportunities, both scientifically and (as we focus on here) for exploitation in biocatalysis. Modern methods of synthetic biology, whereby increasingly large sequences of DNA can be synthesised de novo, allow an unprecedented ability to engineer proteins with novel functions. However, the number of possible proteins is far too large to test individually, so we need means for navigating the ‘search space’ of possible protein sequences efficiently and reliably in order to find desirable activities and other properties. Enzymologists distinguish binding (Kd) and catalytic (kcat) steps. In a similar way, judicious strategies have blended design (for binding, specificity and active site modelling) with the more empirical methods of classical directed evolution (DE) for improving kcat (where natural evolution rarely seeks the highest values), especially with regard to residues distant from the active site and where the functional linkages underpinning enzyme dynamics are both unknown and hard to predict. Epistasis (where the ‘best’ amino acid at one site depends on that or those at others) is a notable feature of directed evolution. The aim of this review is to highlight some of the approaches that are being developed to allow us to use directed evolution to improve enzyme properties, often dramatically. We note that directed evolution differs in a number of ways from natural evolution, including in particular the available mechanisms and the likely selection pressures. Thus, we stress the opportunities afforded by techniques that enable one to map sequence to (structure and) activity in silico, as an effective means of modelling and exploring protein landscapes.
Because known landscapes may be assessed and reasoned about as a whole, simultaneously, this offers opportunities for protein improvement not readily available to natural evolution on rapid timescales. Intelligent landscape navigation, informed by sequence-activity relationships and coupled to the emerging methods of synthetic biology, offers scope for the development of novel biocatalysts that are both highly active and robust. PMID:25503938
NASA Technical Reports Server (NTRS)
Horch, E.; Demarque, P.; Pinsonneault, M.
1992-01-01
Evolutionary calculations of high-metallicity horizontal-branch stars show that for the relevant masses and helium abundances, post-HB evolution in the HR diagram does not proceed toward and along the AGB, but rather toward a 'slow blue phase' in the vicinity of the helium-burning main sequence, following the extinction of the hydrogen shell energy source. For solar and twice solar metallicity, the blue phase begins during the helium shell-burning phase (in agreement with the work of Brocato and Castellani and Tornambe); for 3 times solar metallicity, it begins earlier, during the helium core-burning phase. This behavior differs from what takes place at lower metallicities. The implications for high-metallicity old stellar populations in the Galactic bulge and for the integrated colors of elliptical galaxies are discussed.
Chertkova, Aleksandra A; Schiffman, Joshua S; Nuzhdin, Sergey V; Kozlov, Konstantin N; Samsonova, Maria G; Gursky, Vitaly V
2017-02-07
Cis-regulatory sequences are often composed of many low-affinity transcription factor binding sites (TFBSs). Determining the evolutionary and functional importance of regulatory sequence composition is impeded without a detailed knowledge of the genotype-phenotype map. We simulate the evolution of regulatory sequences involved in Drosophila melanogaster embryo segmentation during early development. Natural selection evaluates gene expression dynamics produced by a computational model of the developmental network. We observe a dramatic decrease in the total number of transcription factor binding sites through the course of evolution. Despite a decrease in average sequence binding energies through time, the regulatory sequences tend towards organisations containing increased high affinity transcription factor binding sites. Additionally, the binding energies of separate sequence segments demonstrate ubiquitous mutual correlations through time. Fewer than 10% of initial TFBSs are maintained throughout the entire simulation, deemed 'core' sites. These sites have increased functional importance as assessed under wild-type conditions and their binding energy distributions are highly conserved. Furthermore, TFBSs within close proximity of core sites exhibit increased longevity, reflecting functional regulatory interactions with core sites. In response to elevated mutational pressure, evolution tends to sample regulatory sequence organisations with fewer, albeit on average, stronger functional transcription factor binding sites. These organisations are also shaped by the regulatory interactions among core binding sites with sites in their local vicinity.
Sedimentary sequence evolution in a Foredeep basin: Eastern Venezuela
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bejarano, C.; Funes, D.; Sarzalho, S.
1996-08-01
Well log-seismic sequence stratigraphy analysis in the Eastern Venezuela Foreland Basin leads to study of the evolution of sedimentary sequences onto the Cretaceous-Paleocene passive margin. This basin comprises two different foredeep sub-basins: the older Guarico sub-basin to the west and the younger Maturin sub-basin to the east. A foredeep switching between these two sub-basins is observed at 12.5 m.y. Seismic interpretation and well log sections across the study area show sedimentary sequences with transgressive sands and coastal onlaps to the east-southeast for the Guarico sub-basin, as well as truncations below the switching sequence (12.5 m.y.), and the Maturin sub-basin shows apparent coastal onlaps to the west-northwest, as well as a marine onlap (deeper water) in the west, where it starts to establish. Sequence stratigraphy analysis of these sequences with well logs allowed the study of the evolution of the stratigraphic section from Paleocene to middle Miocene (68.0-12.0 m.y.). On the basis of well log patterns, the sequences were divided into regressive-transgressive-regressive sedimentary cycles caused by changes in relative sea level. Facies distributions were analyzed and the sequences were divided into simple sequences or sub-sequences of a greater frequency than third-order depositional sequences.
NASA Astrophysics Data System (ADS)
Du, Qiuding; Wang, Zhengjiang; Wang, Jian; Deng, Qi; Yang, Fei
2016-03-01
Meso- to Neoproterozoic magmatic events are widespread in the Yangtze Block. The geochronology and tectonic significance of the Shennongjia Group in the Yangtze Block are still highly controversial. An integrated geochronology and geochemistry approach provides new insights into the geochronological framework, tectonic setting, magmatic events, and basin evolution of the northern Yangtze Block. Our new precise sensitive high-resolution ion microprobe U-Pb data indicate a deposition age of 1180 ± 15 Ma for the Shicaohe Formation subalkaline basaltic tuff that is geochemically similar to modern intracontinental rift volcanic rocks. The integration of available geochemical data together with our new U-Pb ages indicates the Shicaohe Formation subalkaline basaltic tuff formed ca. 1180 Ma in a continental rift-related setting on a passive continental margin. The Shennongjia Group is topped by the Zhengjiaya Formation volcanic sequence, indicating arc-related igneous events at 1103 Ma. The transition of the late Mesoproterozoic tectonic regime from intracontinental extension to convergence occurred between ca. 1180 and 1103 Ma in the northern Yangtze Block. Tectonic evolution in the Neoproterozoic led to accretion along the northern margin of the Yangtze Block. These results provide geochronological evidence, which is of utmost importance for reconfiguration of the chronostratigraphic framework and for promoting research on Mesoproterozoic strata in China, thereby increasing understanding of magmatic events and basin evolutionary history in the northern Yangtze Block.
Johnson, Cari L.; Graham, Stephan A.
2007-01-01
An integrated database of outcrop studies, borehole logs, and seismic-reflection profiles is used to divide Eocene through Miocene strata of the central and southern San Joaquin Basin, California, into a framework of nine stratigraphic sequences. These third- and higher-order sequences (<3 m.y. duration) comprise the principal intervals for petroleum assessment for the basin, including key reservoir and source rock intervals. Important characteristics of each sequence are discussed, including distribution and stratigraphic relationships, sedimentary facies, regional correlation, and age relations. This higher-order stratigraphic packaging represents relatively short-term fluctuations in various forcing factors including climatic effects, changes in sediment supply, local and regional tectonism, and fluctuations in global eustatic sea level. These stratigraphic packages occur within the context of second-order stratigraphic megasequences, which mainly reflect long-term tectonic basin evolution. Despite more than a century of petroleum exploration in the San Joaquin Basin, many uncertainties remain regarding the age, correlation, and origin of the third- and higher-order sequences. Nevertheless, a sequence stratigraphic approach allows definition of key intervals based on genetic affinity rather than purely lithostratigraphic relationships, and thus is useful for reconstructing the multiphase history of this basin, as well as understanding its petroleum systems.
NASA Astrophysics Data System (ADS)
Castillo Vincentelli, Maria Gabriela; Favoreto, Julia; Roemers-Oliveira, Eduardo
2018-02-01
An integrated geophysical and geological analysis of a carbonate reservoir can offer an effective method to better understand the paleogeographical evolution and distribution of a geological reservoir and non-reservoir facies. Therefore, we propose a better method for obtaining geological facies from geophysical facies, helping to characterize the permo-porous system of this kind of play. The goal is to determine the main geological phases from a specific hydrocarbon producer (Albian Campos Basin, Brazil). The applied method includes the use of a petrographic and qualitative description from the integrated reservoir with seismic interpretation of an attribute map (energy, root mean square, mean amplitude, maximum negative amplitude, etc), all calculated at the Albian level for each of the five identified phases. The studied carbonate reservoir is approximately 6 km long with a main direction of NE-SW, and it was sub-divided as follows (from bottom to top): (1) the first depositional sequence of the bank was composed mainly of packstone, indicating that the local structure adjacent to the main bank is protected from environmental conditions; (2) characterized by the presence of grainstone developed at the higher structure; (3) the main sequence of the peloidal packstone with mudstones oncoids; (4) corresponds to the oil production of carbonate reservoirs formed by oolitic grainstone deposited at the top of the carbonate bank; at this phase, rising sea levels formed channels that connected the open sea shelf with the restricted circulation shelf; and (5) mudstone and wackestone represent the system’s flooding phase.
MLV integration site selection is driven by strong enhancers and active promoters
LaFave, Matthew C.; Varshney, Gaurav K.; Gildea, Derek E.; Wolfsberg, Tyra G.; Baxevanis, Andreas D.; Burgess, Shawn M.
2014-01-01
Retroviruses integrate into the host genome in patterns specific to each virus. Understanding the causes of these patterns can provide insight into viral integration mechanisms, pathology and genome evolution, and is critical to the development of safe gene therapy vectors. We generated murine leukemia virus integrations in human HepG2 and K562 cells and subjected them to second-generation sequencing, using a DNA barcoding technique that allowed us to quantify independent integration events. We characterized >3 700 000 unique integration events in two ENCODE-characterized cell lines. We find that integrations were most highly enriched in a subset of strong enhancers and active promoters. In both cell types, approximately half the integrations were found in <2% of the genome, demonstrating genomic influences even narrower than previously believed. The integration pattern of murine leukemia virus appears to be largely driven by regions that have high enrichment for multiple marks of active chromatin; the combination of histone marks present was sufficient to explain why some strong enhancers were more prone to integration than others. The approach we used is applicable to analyzing the integration pattern of any exogenous element and could be a valuable preclinical screen to evaluate the safety of gene therapy vectors. PMID:24464997
Campos, Pollyanna Fernandes; Andrade-Silva, Débora; Zelanis, André; Paes Leme, Adriana Franco; Rocha, Marisa Maria Teixeira; Menezes, Milene Cristina; Serrano, Solange M.T.; Junqueira-de-Azevedo, Inácio de Loiola Meirelles
2016-01-01
Only a few studies on snake venoms have been dedicated to deeply characterizing the toxin secretion of animals from the Colubridae family, despite the fact that they represent the majority of snake diversity. As a consequence, some evolutionary trends observed in venom proteins that underpinned the evolutionary histories of snake toxins were based on data from a minor portion of the clade. Here, we investigated the proteins of the totally unknown venom from Phalotris mertensi (Dipsadinae subfamily), in order to obtain a detailed profile of its toxins and to appreciate evolutionary tendencies occurring in colubrid venoms. By means of integrated omics and functional approaches, including RNAseq, Sanger sequencing, high-resolution proteomics, recombinant protein production, and enzymatic tests, we verified an active toxic secretion containing up to 21 types of proteins. A high content of Kunitz-type proteins and C-type lectins was observed, although several enzymatic components such as metalloproteinases and an L-amino acid oxidase were also present in the venom. Interestingly, a protein whose status as a venom component in other species is debatable was demonstrated to be a true venom protein and named svLIPA (snake venom acid lipase). This finding indicates the importance of checking the actual protein occurrence across species before rejecting genes suggested to code for toxins, which are relevant for the discussion about the early evolution of reptile venoms. Moreover, trends in the evolution of some toxin classes, such as simplification of metalloproteinases and rearrangements of Kunitz and Wap domains, parallel similar phenomena observed in other venomous snake families and provide a broader picture of toxin evolution. PMID:27412610
Biological intuition in alignment-free methods: response to Posada.
Ragan, Mark A; Chan, Cheong Xin
2013-08-01
A recent editorial in Journal of Molecular Evolution highlights opportunities and challenges facing molecular evolution in the era of next-generation sequencing. Abundant sequence data should allow more-complex models to be fit at higher confidence, making phylogenetic inference more reliable and improving our understanding of evolution at the molecular level. However, concern that approaches based on multiple sequence alignment may be computationally infeasible for large datasets is driving the development of so-called alignment-free methods for sequence comparison and phylogenetic inference. The recent editorial characterized these approaches as model-free, not based on the concept of homology, and lacking in biological intuition. We argue here that alignment-free methods have not abandoned models or homology, and can be biologically intuitive.
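As a rough illustration of what an alignment-free comparison can look like, the following minimal sketch scores two sequences by the cosine distance between their k-mer count profiles. This is only one of many alignment-free measures and is not taken from the response itself; the function names (`kmer_counts`, `cosine_distance`, `kmer_distance`) and the default `k = 3` are assumptions chosen for illustration.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq: str, k: int = 3) -> Counter:
    """Count every overlapping k-mer in seq (hypothetical helper)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine_distance(a: Counter, b: Counter) -> float:
    """Return 1 - cosine similarity of two k-mer count profiles."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        # A sequence shorter than k has no k-mers; treat it as maximally distant.
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def kmer_distance(s1: str, s2: str, k: int = 3) -> float:
    """Alignment-free distance between two sequences via k-mer profiles."""
    return cosine_distance(kmer_counts(s1, k), kmer_counts(s2, k))
```

No alignment is computed at any point: sequences that share many k-mers receive a small distance, and sequences with no k-mers in common receive a distance of 1. A pairwise matrix of such distances could then be passed to a distance-based tree-building method.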
Transgenerational effects of insecticides-implications for rapid pest evolution in agroecosystems.
Brevik, Kristian; Lindström, Leena; McKay, Stephanie D; Chen, Yolanda H
2018-04-01
Although pesticides are a major selective force in driving the evolution of insect pests, the evolutionary processes that give rise to insecticide resistance remain poorly understood. Insecticide resistance has been widely observed to increase with frequent and intense insecticide exposure, but can be lost following the relaxation of insecticide use. One possible but rarely explored explanation is that insecticide resistance may be associated with epigenetic modifications, which influence the patterning of gene expression without changing the underlying DNA sequence. Epigenetic modifications such as DNA methylation, histone modifications, and small RNAs have been observed to be heritable in arthropods, but their role in the context of rapid evolution of insecticide resistance remains poorly understood. Here, we discuss evidence supporting how: firstly, insecticide-induced effects can be transgenerationally inherited; secondly, epigenetic modifications are heritable; and thirdly, epigenetic modifications are responsive to pesticide and xenobiotic stress. Therefore, pesticides may drive the evolution of resistance via epigenetic processes. Moreover, insect pests primed by pesticides may be more tolerant of other stresses, further enhancing their success in adapting to agroecosystems. Resolving the role of epigenetic modifications in the rapid evolution of insect pests has the potential to lead to new approaches for integrated pest management as well as improve our understanding of how anthropogenic stress may drive the evolution of insect pests. Copyright © 2018 Elsevier Inc. All rights reserved.
Wood, Natasha; Bhattacharya, Tanmoy; Keele, Brandon F; Giorgi, Elena; Liu, Michael; Gaschen, Brian; Daniels, Marcus; Ferrari, Guido; Haynes, Barton F; McMichael, Andrew; Shaw, George M; Hahn, Beatrice H; Korber, Bette; Seoighe, Cathal
2009-05-01
The pattern of viral diversification in newly infected individuals provides information about the host environment and immune responses typically experienced by the newly transmitted virus. For example, sites that tend to evolve rapidly across multiple early-infection patients could be involved in enabling escape from common early immune responses, could represent adaptation for rapid growth in a newly infected host, or could represent reversion from less fit forms of the virus that were selected for immune escape in previous hosts. Here we investigated the diversification of HIV-1 env coding sequences in 81 very early B subtype infections previously shown to have resulted from transmission or expansion of single viruses (n = 78) or two closely related viruses (n = 3). In these cases, the sequence of the infecting virus can be estimated accurately, enabling inference of both the direction of substitutions as well as distinction between insertion and deletion events. By integrating information across multiple acutely infected hosts, we find evidence of adaptive evolution of HIV-1 env and identify a subset of codon sites that diversified more rapidly than can be explained by a model of neutral evolution. Of 24 such rapidly diversifying sites, 14 were either i) clustered and embedded in CTL epitopes that were verified experimentally or predicted based on the individual's HLA or ii) in a nucleotide context indicative of APOBEC-mediated G-to-A substitutions, despite having excluded heavily hypermutated sequences prior to the analysis. In several cases, a rapidly evolving site was embedded both in an APOBEC motif and in a CTL epitope, suggesting that APOBEC may facilitate early immune escape. Ten rapidly diversifying sites could not be explained by CTL escape or APOBEC hypermutation, including the most frequently mutated site, in the fusion peptide of gp41. 
We also examined the distribution, extent, and sequence context of insertions and deletions, and we provide evidence that the length variation seen in hypervariable loop regions of the envelope glycoprotein is a consequence of selection and not of mutational hotspots. Our results provide a detailed view of the process of diversification of HIV-1 following transmission, highlighting the role of CTL escape and hypermutation in shaping viral evolution during the establishment of new infections. PMID:19424423
Echave, Julian; Wilke, Claus O.
2018-01-01
For decades, rates of protein evolution have been interpreted in terms of the vague concept of “functional importance”. Slowly evolving proteins or sites within proteins were assumed to be more functionally important and thus subject to stronger selection pressure. More recently, biophysical models of protein evolution, which combine evolutionary theory with protein biophysics, have completely revolutionized our view of the forces that shape sequence divergence. Slowly evolving proteins have been found to evolve slowly because of selection against toxic misfolding and misinteractions, linking their rate of evolution primarily to their abundance. Similarly, most slowly evolving sites in proteins are not directly involved in function, but mutating them has large impacts on protein structure and stability. Here, we review the studies of the emergent field of biophysical protein evolution that have shaped our current understanding of sequence divergence patterns. We also propose future research directions to develop this nascent field. PMID:28301766
Translational Implications of Tumor Heterogeneity
Jamal-Hanjani, Mariam; Quezada, Sergio A.; Larkin, James; Swanton, Charles
2015-01-01
Advances in next-generation sequencing and bioinformatics have led to an unprecedented view of the cancer genome and its evolution. Genomic studies have demonstrated the complex and heterogeneous clonal landscape of tumors of different origins, and the potential impact of intratumor heterogeneity on treatment response and resistance, cancer progression and the risk of disease relapse. However, the significance of subclonal mutations, in particular mutations in driver genes, and their evolution through time and their dynamics in response to cancer therapies, is yet to be determined. The necessary tools are now available to prospectively determine whether clonal heterogeneity can be used as a biomarker of clinical outcome, and to what extent subclonal somatic alterations might influence clinical outcome. Studies that employ longitudinal tissue sampling, integrating both genomic and clinical data, have the potential to reveal the subclonal composition and track the evolution of tumors in order to address these questions, and to begin to define the breadth of genetic diversity in different tumor types, and its relevance to patient outcome. Such studies may provide further evidence for novel drug resistance mechanisms informing novel combinatorial, adaptive and tumour immune-therapies placed within the context of tumor evolution. PMID:25770293
Mobile DNA and evolution in the 21st century
2010-01-01
Scientific history has had a profound effect on the theories of evolution. At the beginning of the 21st century, molecular cell biology has revealed a dense structure of information-processing networks that use the genome as an interactive read-write (RW) memory system rather than an organism blueprint. Genome sequencing has documented the importance of mobile DNA activities and major genome restructuring events at key junctures in evolution: exon shuffling, changes in cis-regulatory sites, horizontal transfer, cell fusions and whole genome doublings (WGDs). The natural genetic engineering functions that mediate genome restructuring are activated by multiple stimuli, in particular by events similar to those found in the DNA record: microbial infection and interspecific hybridization leading to the formation of allotetraploids. These molecular genetic discoveries, plus a consideration of how mobile DNA rearrangements increase the efficiency of generating functional genomic novelties, make it possible to formulate a 21st century view of interactive evolutionary processes. This view integrates contemporary knowledge of the molecular basis of genetic change, major genome events in evolution, and stimuli that activate DNA restructuring with classical cytogenetic understanding about the role of hybridization in species diversification. PMID:20226073
Retrotransposons as regulators of gene expression.
Elbarbary, Reyad A; Lucas, Bronwyn A; Maquat, Lynne E
2016-02-12
Transposable elements (TEs) are both a boon and a bane to eukaryotic organisms, depending on where they integrate into the genome and how their sequences function once integrated. We focus on two types of TEs: long interspersed elements (LINEs) and short interspersed elements (SINEs). LINEs and SINEs are retrotransposons; that is, they transpose via an RNA intermediate. We discuss how LINEs and SINEs have expanded in eukaryotic genomes and contribute to genome evolution. An emerging body of evidence indicates that LINEs and SINEs function to regulate gene expression by affecting chromatin structure, gene transcription, pre-mRNA processing, or aspects of mRNA metabolism. We also describe how adenosine-to-inosine editing influences SINE function and how ongoing retrotransposition is countered by the body's defense mechanisms. Copyright © 2016, American Association for the Advancement of Science.
Standage, Daniel S; Berens, Ali J; Glastad, Karl M; Severin, Andrew J; Brendel, Volker P; Toth, Amy L
2016-04-01
Comparative genomics of social insects has been intensely pursued in recent years with the goal of providing insights into the evolution of social behaviour and its underlying genomic and epigenomic basis. However, the comparative approach has been hampered by a paucity of data on some of the most informative social forms (e.g. incipiently and primitively social) and taxa (especially members of the wasp family Vespidae) for studying social evolution. Here, we provide a draft genome of the primitively eusocial model insect Polistes dominula, accompanied by analysis of caste-related transcriptome and methylome sequence data for adult queens and workers. Polistes dominula possesses a fairly typical hymenopteran genome, but shows very low genomewide GC content and some evidence of reduced genome size. We found numerous caste-related differences in gene expression, with evidence that both conserved and novel genes are related to caste differences. Most strikingly, these -omics data reveal a major reduction in one of the major epigenetic mechanisms that has been previously suggested to be important for caste differences in social insects: DNA methylation. Along with a conspicuous loss of a key gene associated with environmentally responsive DNA methylation (the de novo DNA methyltransferase Dnmt3), these wasps have greatly reduced genomewide methylation to almost zero. In addition to providing a valuable resource for comparative analysis of social insect evolution, our integrative -omics data for this important behavioural and evolutionary model system call into question the general importance of DNA methylation in caste differences and evolution in social insects. © 2016 The Authors. Molecular Ecology Published by John Wiley & Sons Ltd.
Evidence of birth-and-death evolution of 5S rRNA gene in Channa species (Teleostei, Perciformes).
Barman, Anindya Sundar; Singh, Mamta; Singh, Rajeev Kumar; Lal, Kuldeep Kumar
2016-12-01
In higher eukaryotes, the minor rDNA family codes for 5S rRNA, is arranged in tandem arrays, and comprises a highly conserved 120 bp long coding sequence with a variable non-transcribed spacer (NTS). Initially, 5S rDNA repeats were considered to have evolved by the process of concerted evolution. However, some recent reports, including ones on teleost fishes, suggested that the evolution of 5S rDNA repeats does not fit the concerted evolution model and may instead be explained by a birth-and-death evolution model. In order to study the mode of evolution of 5S rDNA repeats in Perciformes fish species, the nucleotide sequence and molecular organization of 5S rDNA in five species of the genus Channa were analyzed in the present study. Molecular analyses revealed several variants of 5S rDNA repeats (four types of NTS), and networks created by a neighbor-net algorithm for each type of sequence (I, II, III and IV) did not show clear clustering in a species-specific manner. The stable secondary structure was predicted, and upstream and downstream conserved regulatory elements were characterized. Sequence analyses also showed the presence of two putative pseudogenes in Channa marulius. The present study supports the conclusion that 5S rDNA repeats in the genus Channa evolved under the process of birth-and-death.
Advances for studying clonal evolution in cancer.
Ding, Li; Raphael, Benjamin J; Chen, Feng; Wendl, Michael C
2013-11-01
The "clonal evolution" model of cancer emerged and "evolved" amid ongoing advances in technology, especially in recent years during which next generation sequencing instruments have provided ever higher resolution pictures of the genetic changes in cancer cells and heterogeneity in tumors. It has become increasingly clear that clonal evolution is not a single sequential process, but instead frequently involves simultaneous evolution of multiple subclones that co-exist because they are of similar fitness or are spatially separated. Co-evolution of subclones also occurs when they complement each other's survival advantages. Recent studies have also shown that clonal evolution is highly heterogeneous: different individual tumors of the same type may undergo very different paths of clonal evolution. New methodological advancements, including deep digital sequencing of a mixed tumor population, single cell sequencing, and the development of more sophisticated computational tools, will continue to shape and reshape the models of clonal evolution. In turn, these will provide both an improved framework for the understanding of cancer progression and a guide for treatment strategies aimed at the elimination of all, rather than just some, of the cancer cells within a patient. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Wang, Penghao; Wilson, Susan R
2013-01-01
Mass spectrometry-based protein identification is a very challenging task. The main identification approaches include de novo sequencing and database searching. Both approaches have shortcomings, so an integrative approach has been developed. The integrative approach first infers partial peptide sequences, known as tags, directly from tandem spectra through de novo sequencing, and then puts these sequences into a database search to see if a close peptide match can be found. However, the current implementation of this integrative approach has several limitations. Firstly, simplistic de novo sequencing is applied and only very short sequence tags are used. Secondly, most integrative methods apply an algorithm similar to BLAST to search for exact sequence matches and do not accommodate sequence errors well. Thirdly, in these methods the integrated de novo sequencing makes only a limited contribution to the scoring model, which is still largely based on database searching. We have developed a new integrative protein identification method which can integrate de novo sequencing more efficiently into database searching. Evaluated on large real datasets, our method outperforms popular identification methods.
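The tag-then-search idea described in this abstract can be sketched in a few lines. The example below is a hypothetical toy, not the authors' method: it slides a de novo sequence tag along each database protein and keeps windows within a small number of mismatches, which is one simple way to tolerate sequencing errors that an exact-match search would miss. The names `hamming` and `find_tag_matches`, the `max_mismatches` parameter, and the dictionary-based database are all assumptions made for illustration.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_tag_matches(tag: str, database: dict, max_mismatches: int = 1) -> list:
    """Return (protein_name, start, mismatches) for every window of a
    database protein that matches the tag within max_mismatches.

    Hypothetical sketch: substitutions only, no gaps, no mass-based scoring.
    """
    hits = []
    for name, protein in database.items():
        # Slide a window of the tag's length along the protein sequence.
        for start in range(len(protein) - len(tag) + 1):
            window = protein[start:start + len(tag)]
            mismatches = hamming(tag, window)
            if mismatches <= max_mismatches:
                hits.append((name, start, mismatches))
    # Best (fewest mismatches) hits first; the sort is stable.
    return sorted(hits, key=lambda hit: hit[2])
```

For instance, with a toy database `{"P1": "MKTAYIAKQR", "P2": "GGSLEVLFQG"}`, the tag `"AYLAK"` matches `P1` at offset 3 with one mismatch (L versus I), a hit that an exact-match search would discard. A realistic implementation would also need to handle gaps, longer tags, and spectrum-based scoring.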
NASA Astrophysics Data System (ADS)
Humpula, James F.; Ostrom, Peggy H.; Gandhi, Hasand; Strahler, John R.; Walker, Angela K.; Stafford, Thomas W.; Smith, James J.; Voorhies, Michael R.; George Corner, R.; Andrews, Phillip C.
2007-12-01
Ancient DNA sequences offer an extraordinary opportunity to unravel the evolutionary history of ancient organisms. Protein sequences offer another reservoir of genetic information that has recently become tractable through the application of mass spectrometric techniques. The extent to which ancient protein sequences resolve phylogenetic relationships, however, has not been explored. We determined the osteocalcin amino acid sequence from the bone of an extinct Camelid (21 ka, Camelops hesternus) excavated from Isleta Cave, New Mexico and three bones of extant camelids: bactrian camel (Camelus bactrianus); dromedary camel (Camelus dromedarius) and guanaco (Llama guanacoe) for a diagenetic and phylogenetic assessment. There was no difference in sequence among the four taxa. Structural attributes observed in both modern and ancient osteocalcin include a post-translational modification, Hyp 9, deamidation of Gln 35 and Gln 39, and oxidation of Met 36. Carbamylation of the N-terminus in ancient osteocalcin may result in blockage and explain previous difficulties in sequencing ancient proteins via Edman degradation. A phylogenetic analysis using osteocalcin sequences of 25 vertebrate taxa was conducted to explore osteocalcin protein evolution and the utility of osteocalcin sequences for delineating phylogenetic relationships. The maximum likelihood tree closely reflected generally recognized taxonomic relationships. For example, maximum likelihood analysis recovered rodents, birds and, within hominins, the Homo-Pan-Gorilla trichotomy. Within Artiodactyla, character state analysis showed that a substitution of Pro 4 for His 4 defines the Capra-Ovis clade. Homoplasy in our analysis indicated that osteocalcin evolution is not a perfect indicator of species evolution. Limited sequence availability prevented assigning functional significance to sequence changes. 
Our preliminary analysis of osteocalcin evolution represents an initial step towards a complete character analysis aimed at determining the evolutionary history of this functionally significant protein. We emphasize that ancient protein sequencing and phylogenetic analyses using amino acid sequences must pay close attention to post-translational modifications, amino acid substitutions due to diagenetic alteration and the impacts of isobaric amino acids on mass shifts and sequence alignments.
The 13Carbon footprint of B[e] supergiants
NASA Astrophysics Data System (ADS)
Liermann, A.; Kraus, M.; Schnurr, O.; Fernandes, M. Borges
2010-10-01
We report on the first detection of 13C enhancement in two B[e] supergiants (B[e]SGs) in the Large Magellanic Cloud. Stellar evolution models predict the surface abundance in 13C to strongly increase during main-sequence and post-main-sequence evolution of massive stars. However, direct identification of chemically processed material on the surface of B[e]SGs is hampered by their dense, disc-forming winds, hiding the stars. Recent theoretical computations predict the detectability of enhanced 13C via the molecular emission in 13CO arising in the circumstellar discs of B[e]SGs. To test this potential method and to unambiguously identify a post-main-sequence B[e] SG by its 13CO emission, we have obtained high-quality K-band spectra of two known B[e] SGs in the Large Magellanic Cloud, using the Very Large Telescope's Spectrograph for INtegral Field Observation in the Near-Infrared (VLT/SINFONI). Both stars clearly show the 13CO band emission, whose strength implies a strong enhancement of 13C, in agreement with theoretical predictions. This first ever direct confirmation of the evolved nature of B[e]SGs thus paves the way to the first identification of a Galactic B[e]SG. Based on observations collected with the ESO VLT Paranal Observatory under programme 384.D-1078(A). E-mail: liermann@mpifr-bonn.mpg.de (AL); kraus@sunstel.asu.cas.cz (MK); oschnurr@aip.de (OS); borges@on.br (MBF)
Walker, Joseph F; Zanis, Michael J; Emery, Nancy C
2014-04-01
Complete chloroplast genome studies can help resolve relationships among large, complex plant lineages such as Asteraceae. We present the first whole plastome from the Madieae tribe and compare its sequence variation to other chloroplast genomes in Asteraceae. We used high throughput sequencing to obtain the Lasthenia burkei chloroplast genome. We compared sequence structure and rates of molecular evolution in the small single copy (SSC), large single copy (LSC), and inverted repeat (IR) regions to those for eight Asteraceae accessions and one Solanaceae accession. The chloroplast sequence of L. burkei is 150 746 bp and contains 81 unique protein coding genes and 4 coding ribosomal RNA sequences. We identified three major inversions in the L. burkei chloroplast, all of which have been found in other Asteraceae lineages, and a previously unreported inversion in Lactuca sativa. Regions flanking inversions contained tRNA sequences, but did not have particularly high G + C content. Substitution rates varied among the SSC, LSC, and IR regions, and rates of evolution within each region varied among species. Some observed differences in rates of molecular evolution may be explained by the relative proportion of coding to noncoding sequence within regions. Rates of molecular evolution vary substantially within and among chloroplast genomes, and major inversion events may be promoted by the presence of tRNAs. Collectively, these results provide insight into different mechanisms that may promote intramolecular recombination and the inversion of large genomic regions in the plastome.
A Cross-Course Investigation of Integrative Cases for Evolution Education.
White, Peter John Thomas; Heidemann, Merle K; Smith, James J
2015-12-01
Evolution is a cornerstone theory in biology, yet many undergraduate students have difficulty understanding it. One reason for this is that evolution is often taught in a macro-scale context without explicit links to micro-scale processes. To address this, we developed a series of integrative evolution cases that present the evolution of various traits from their origin in genetic mutation, to the synthesis of modified proteins, to how these proteins produce novel phenotypes, to the related macro-scale impacts that the novel phenotypes have on populations in ecological communities. We postulated that students would develop a fuller understanding of evolution when learning biology in a context where these integrative evolution cases are used. We used a previously developed assessment tool, the ATEEK (Assessment Tool for Evaluating Evolution Knowledge), within a pre-course/post-course assessment framework. Students who learned biology in courses using the integrative cases performed significantly better on the evolution assessment than did students in courses that did not use the cases. We also found that student understanding of evolution increased with increased exposure to the integrative evolution cases. These findings support the general hypothesis that students acquire a more complete understanding of evolution when they learn about its genetic and molecular mechanisms along with macro-scale explanations.
A Cross-Course Investigation of Integrative Cases for Evolution Education †
White, Peter John Thomas; Heidemann, Merle K.; Smith, James J.
2015-01-01
Evolution is a cornerstone theory in biology, yet many undergraduate students have difficulty understanding it. One reason for this is that evolution is often taught in a macro-scale context without explicit links to micro-scale processes. To address this, we developed a series of integrative evolution cases that present the evolution of various traits from their origin in genetic mutation, to the synthesis of modified proteins, to how these proteins produce novel phenotypes, to the related macro-scale impacts that the novel phenotypes have on populations in ecological communities. We postulated that students would develop a fuller understanding of evolution when learning biology in a context where these integrative evolution cases are used. We used a previously developed assessment tool, the ATEEK (Assessment Tool for Evaluating Evolution Knowledge), within a pre-course/post-course assessment framework. Students who learned biology in courses using the integrative cases performed significantly better on the evolution assessment than did students in courses that did not use the cases. We also found that student understanding of evolution increased with increased exposure to the integrative evolution cases. These findings support the general hypothesis that students acquire a more complete understanding of evolution when they learn about its genetic and molecular mechanisms along with macro-scale explanations. PMID:26753023
Kim, Kyunghee; Lee, Sang-Choon; Lee, Junki; Yu, Yeisoo; Yang, Kiwoung; Choi, Beom-Soon; Koh, Hee-Jong; Waminal, Nomar Espinosa; Choi, Hong-Il; Kim, Nam-Hoon; Jang, Woojong; Park, Hyun-Seung; Lee, Jonghoon; Lee, Hyun Oh; Joh, Ho Jun; Lee, Hyeon Ju; Park, Jee Young; Perumal, Sampath; Jayakodi, Murukarthick; Lee, Yun Sun; Kim, Backki; Copetti, Dario; Kim, Soonok; Kim, Sunggil; Lim, Ki-Byung; Kim, Young-Dong; Lee, Jungho; Cho, Kwang-Su; Park, Beom-Seok; Wing, Rod A.; Yang, Tae-Jin
2015-01-01
Cytoplasmic chloroplast (cp) genomes and nuclear ribosomal DNA (nR) are the primary sequences used to understand plant diversity and evolution. We introduce a high-throughput method to simultaneously obtain complete cp and nR sequences using Illumina platform whole-genome sequence. We applied the method to 30 rice specimens belonging to nine Oryza species. Concurrent phylogenomic analysis using cp and nR of several specimens of the same Oryza AA genome species provides insight into the evolution and domestication of cultivated rice, clarifying three ambiguous but important issues in the evolution of wild Oryza species. First, cp-based trees clearly classify each lineage but can be biased by inter-subspecies cross-hybridization events during speciation. Second, O. glumaepatula, a South American wild rice, includes two cytoplasm types, one of which is derived from a recent interspecies hybridization with O. longistaminata. Third, the Australian O. rufipogon-type rice is a perennial form of O. meridionalis. PMID:26506948
Rapid evolution of cis-regulatory sequences via local point mutations
NASA Technical Reports Server (NTRS)
Stone, J. R.; Wray, G. A.
2001-01-01
Although the evolution of protein-coding sequences within genomes is well understood, the same cannot be said of the cis-regulatory regions that control transcription. Yet, changes in gene expression are likely to constitute an important component of phenotypic evolution. We simulated the evolution of new transcription factor binding sites via local point mutations. The results indicate that new binding sites appear and become fixed within populations on microevolutionary timescales under an assumption of neutral evolution. Even combinations of two new binding sites evolve very quickly. We predict that local point mutations continually generate considerable genetic variation that is capable of altering gene expression.
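The kind of simulation described in this abstract can be illustrated with a minimal sketch. It is not the authors' model: it follows a single lineage rather than a population, ignores fixation, and the sequence length, motif, and mutation rate are illustrative assumptions. It only shows how a short binding-site motif can arise in a random sequence through local point mutations.

```python
import random

BASES = "ACGT"

def mutate(seq, mu, rng):
    """Apply independent point mutations: each base changes with probability mu."""
    out = []
    for base in seq:
        if rng.random() < mu:
            # Replace the base with one of the three other bases.
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)

def generations_until_site(seq, motif, mu, rng, max_generations=100_000):
    """Count generations until `motif` first appears in one evolving lineage.

    Returns None if the motif has not appeared after max_generations.
    """
    for generation in range(max_generations + 1):
        if motif in seq:
            return generation
        seq = mutate(seq, mu, rng)
    return None

# Hypothetical parameters: 200 bp random sequence, 4 bp motif, mu = 0.01 per base.
rng = random.Random(0)
start = "".join(rng.choice(BASES) for _ in range(200))
waiting_time = generations_until_site(start, "TATA", mu=0.01, rng=rng)
```

Repeating the run over many random starting sequences would give a distribution of waiting times; a population-level model with drift, as in the abstract, would additionally need to track fixation.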
Wang, Xumin; Deng, Xin; Zhang, Xiaowei; Hu, Songnian; Yu, Jun
2012-01-01
The complete nucleotide sequences of the chloroplast (cp) and mitochondrial (mt) genomes of resurrection plant Boea hygrometrica (Bh, Gesneriaceae) have been determined with the lengths of 153,493 bp and 510,519 bp, respectively. The smaller chloroplast genome contains more genes (147) with a 72% coding sequence, and the larger mitochondrial genome has fewer genes (65) with a coding fraction of 12%. Similar to other seed plants, the Bh cp genome has a typical quadripartite organization with a conserved gene in each region. The Bh mt genome has three recombinant sequence repeats of 222 bp, 843 bp, and 1474 bp in length, which divide the genome into a single master circle (MC) and four isomeric molecules. Compared to other angiosperms, one remarkable feature of the Bh mt genome is the frequent transfer of genetic material from the cp genome during recent Bh evolution. We also analyzed organellar genome evolution in general regarding genome features as well as compositional dynamics of sequence and gene structure/organization, providing clues for the understanding of the evolution of organellar genomes in plants. The cp-derived sequences including tRNAs found in angiosperm mt genomes support the conclusion that frequent gene transfer events may have begun early in the land plant lineage. PMID:22291979
NASA Astrophysics Data System (ADS)
Noirel, Josselin; Simonson, Thomas
2008-11-01
Following Kimura's neutral theory of molecular evolution [M. Kimura, The Neutral Theory of Molecular Evolution (Cambridge University Press, Cambridge, 1983) (reprinted in 1986)], it has become common to assume that the vast majority of viable mutations of a gene confer little or no functional advantage. Yet, in silico models of protein evolution have shown that mutational robustness of sequences could be selected for, even in the context of neutral evolution. The evolution of a biological population can be seen as a diffusion on the network of viable sequences. This network is called a "neutral network." Depending on the mutation rate μ and the population size N, the biological population can evolve purely randomly (μN ≪1) or it can evolve in such a way as to select for sequences of higher mutational robustness (μN ≫1). The stringency of the selection depends not only on the product μN but also on the exact topology of the neutral network, the special arrangement of which was named "superfunnel." Even though the relation between mutation rate, population size, and selection was thoroughly investigated, a study of the salient topological features of the superfunnel that could affect the strength of the selection was wanting. This question is addressed in this study. We use two different models of proteins: on lattice and off lattice. We compare neutral networks computed using these models to random networks. From this, we identify two important factors of the topology that determine the stringency of the selection for mutationally robust sequences. First, the presence of highly connected nodes ("hubs") in the network increases the selection for mutationally robust sequences. Second, the stringency of the selection increases when the correlation between a sequence's mutational robustness and its neighbors' increases. The latter finding relates a global characteristic of the neutral network to a local one, which is attainable through experiments or molecular modeling.
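The role of hubs can be illustrated with a small sketch. It rests on a standard result of neutral-network theory that the abstract does not state and is assumed here: for μN ≪ 1 the population's average robustness tracks the mean degree of the network, whereas for μN ≫ 1 it tracks the largest eigenvalue of the adjacency matrix. The two five-node toy graphs below are hypothetical. They share the same mean degree, but the one containing a hub has the larger leading eigenvalue.

```python
def mean_degree(adj):
    """Average number of viable neighbors per node (the small-muN expectation)."""
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)

def spectral_radius(adj, iterations=500):
    """Largest adjacency eigenvalue via power iteration on A + I.

    The identity shift avoids oscillation on bipartite graphs; it is
    subtracted again before returning.
    """
    x = {node: 1.0 for node in adj}
    estimate = 1.0
    for _ in range(iterations):
        y = {node: x[node] + sum(x[n] for n in adj[node]) for node in adj}
        estimate = max(y.values())  # converges to (lambda_max + 1)
        x = {node: value / estimate for node, value in y.items()}
    return estimate - 1.0

# Toy neutral networks (hypothetical): a star with one hub, and a simple path.
star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}

for name, graph in (("star", star), ("path", path)):
    print(name, mean_degree(graph), round(spectral_radius(graph), 4))
```

Both graphs have mean degree 1.6, but the star's leading eigenvalue is 2 and the path's is √3, consistent with the abstract's statement that hubs strengthen selection for robust sequences.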
Noirel, Josselin; Simonson, Thomas
2008-11-14
Following Kimura's neutral theory of molecular evolution [M. Kimura, The Neutral Theory of Molecular Evolution (Cambridge University Press, Cambridge, 1983) (reprinted in 1986)], it has become common to assume that the vast majority of viable mutations of a gene confer little or no functional advantage. Yet, in silico models of protein evolution have shown that mutational robustness of sequences could be selected for, even in the context of neutral evolution. The evolution of a biological population can be seen as a diffusion on the network of viable sequences. This network is called a "neutral network." Depending on the mutation rate μ and the population size N, the biological population can evolve purely randomly (μN ≪ 1) or it can evolve in such a way as to select for sequences of higher mutational robustness (μN ≫ 1). The stringency of the selection depends not only on the product μN but also on the exact topology of the neutral network, the special arrangement of which was named "superfunnel." Even though the relation between mutation rate, population size, and selection was thoroughly investigated, a study of the salient topological features of the superfunnel that could affect the strength of the selection was wanting. This question is addressed in this study. We use two different models of proteins: on lattice and off lattice. We compare neutral networks computed using these models to random networks. From this, we identify two important factors of the topology that determine the stringency of the selection for mutationally robust sequences. First, the presence of highly connected nodes ("hubs") in the network increases the selection for mutationally robust sequences. Second, the stringency of the selection increases when the correlation between a sequence's mutational robustness and its neighbors' increases. The latter finding relates a global characteristic of the neutral network to a local one, which is attainable through experiments or molecular modeling.
The set of triple-resonance sequences with a multiple quantum coherence evolution period
NASA Astrophysics Data System (ADS)
Koźmiński, Wiktor; Zhukov, Igor
2004-12-01
A new pulse sequence building block that relies on the evolution of heteronuclear multiple quantum coherences is proposed. The particular chemical shifts are obtained in multiple quadrature, using linear combinations of frequencies taken from spectra measured at different quantum levels. The pulse sequences designed in this way consist of a small number of RF pulses, are as short as possible, and could be applied to the determination of coupling constants. The examples presented involve 2D correlations HNCO, HNCA, HN(CO)CA, and H(N)COCA via heteronuclear zero and double coherences, as well as a 2D HNCOCA technique with simultaneous evolution of triple and three distinct single quantum coherences. Applications of the new sequences are presented for 13C,15N-labeled ubiquitin.
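The idea of recovering individual shifts from linear combinations of multiple-quantum frequencies can be sketched for the simplest two-spin case. This is an illustration only, assuming the double-quantum frequency is the sum of the two offsets and the zero-quantum frequency is their difference (a sign convention chosen here, not taken from the abstract); the function name and the numbers are hypothetical.

```python
def shifts_from_mq(double_quantum_hz, zero_quantum_hz):
    """Recover two single-spin offsets from double- and zero-quantum frequencies.

    Assumes DQ = f_a + f_b and ZQ = f_a - f_b (sign convention is an assumption).
    """
    f_a = (double_quantum_hz + zero_quantum_hz) / 2.0
    f_b = (double_quantum_hz - zero_quantum_hz) / 2.0
    return f_a, f_b

# Illustrative values: DQ peak at 1500 Hz, ZQ peak at 500 Hz.
f_a, f_b = shifts_from_mq(double_quantum_hz=1500.0, zero_quantum_hz=500.0)
print(f_a, f_b)
```

With these inputs the half-sum and half-difference give offsets of 1000 Hz and 500 Hz for the two spins.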
Swain, Martin T.; Larkin, Denis M.; Caffrey, Conor R.; Davies, Stephen J.; Loukas, Alex; Skelly, Patrick J.; Hoffmann, Karl F.
2011-01-01
Schistosoma genomes provide a comprehensive resource for identifying the molecular processes that shape parasite evolution and for discovering novel chemotherapeutic or immunoprophylactic targets. Here, we demonstrate how intra- and intergenus comparative genomics can be used to drive these investigations forward, illustrate the advantages and limitations of these approaches and review how post genomic technologies offer complementary strategies for genome characterisation. While sequencing and functional characterisation of other schistosome/platyhelminth genomes continues to expedite anthelmintic discovery, we contend that future priorities should equally focus on improving assembly quality, and chromosomal assignment, of existing schistosome/platyhelminth genomes. PMID:22024648
PTGBase: an integrated database to study tandem duplicated genes in plants.
Yu, Jingyin; Ke, Tao; Tehrim, Sadia; Sun, Fengming; Liao, Boshou; Hua, Wei
2015-01-01
Tandem duplication is a wide-spread phenomenon in plant genomes and plays significant roles in evolution and adaptation to changing environments. Tandem duplicated genes related to certain functions will lead to the expansion of gene families and bring increase of gene dosage in the form of gene cluster arrays. Many tandem duplication events have been studied in plant genomes; yet, there is a surprising shortage of efforts to systematically present the integration of large amounts of information about publicly deposited tandem duplicated gene data across the plant kingdom. To address this shortcoming, we developed the first plant tandem duplicated genes database, PTGBase. It delivers the most comprehensive resource available to date, spanning 39 plant genomes, including model species and newly sequenced species alike. Across these genomes, 54 130 tandem duplicated gene clusters (129 652 genes) are presented in the database. Each tandem array, as well as its member genes, is characterized in complete detail. Tandem duplicated genes in PTGBase can be explored through browsing or searching by identifiers or keywords of functional annotation and sequence similarity. Users can download tandem duplicated gene arrays easily to any scale, up to the complete annotation data set for an entire plant genome. PTGBase will be updated regularly with newly sequenced plant species as they become available. © The Author(s) 2015. Published by Oxford University Press.
2010-01-01
Background: Classical and quantitative linkage analyses of genetic crosses have traditionally been used to map genes of interest, such as those conferring chloroquine or quinine resistance in malaria parasites. Next-generation sequencing technologies now present the possibility of determining genome-wide genetic variation at single base-pair resolution. Here, we combine in vivo experimental evolution, a rapid genetic strategy and whole genome re-sequencing to identify the precise genetic basis of artemisinin resistance in a lineage of the rodent malaria parasite, Plasmodium chabaudi. Such genetic markers will further the investigation of resistance and its control in natural infections of the human malaria, P. falciparum. Results: A lineage of isogenic in vivo drug-selected mutant P. chabaudi parasites was investigated. By measuring the artemisinin responses of these clones, the appearance of an in vivo artemisinin resistance phenotype within the lineage was defined. The underlying genetic locus was mapped to a region of chromosome 2 by Linkage Group Selection in two different genetic crosses. Whole-genome deep coverage short-read re-sequencing (Illumina® Solexa) defined the point mutations, insertions, deletions and copy-number variations arising in the lineage. Eight point mutations arise within the mutant lineage, only one of which appears on chromosome 2. This missense mutation arises contemporaneously with artemisinin resistance and maps to a gene encoding a de-ubiquitinating enzyme. Conclusions: This integrated approach facilitates the rapid identification of mutations conferring selectable phenotypes, without prior knowledge of biological and molecular mechanisms. For malaria, this model can identify candidate genes before resistant parasites are commonly observed in natural human malaria populations. PMID:20846421
The evolutionary and integrative roles of transthyretin in thyroid hormone homeostasis.
Schreiber, G
2002-10-01
In larger mammals, thyroid hormone-binding plasma proteins are albumin, transthyretin (TTR) and thyroxine (T4)-binding globulin. They differ characteristically in affinities and release rates for T4 and triiodothyronine (T3). Together, they form a 'buffering' system counteracting thyroid hormone permeation from aqueous to lipid phases. Evolution led to important differences in the expression pattern of these three proteins in tissues. In adult liver, TTR is only made in eutherians and herbivorous marsupials. During development, it is also made in tadpole and fish liver. More intense TTR synthesis than in liver is found in the choroid plexus of reptilians, birds and mammals, but none in the choroid plexus of amphibians and fish, i.e. species without a neocortex. All brain-made TTR is secreted into the cerebrospinal fluid, where it becomes the major thyroid hormone-binding protein. During ontogeny, the maximum TTR synthesis in the choroid plexus precedes that of the growth rate of the brain and occurs during the period of maximum neuroblast replication. TTR is only one component in a network of factors determining thyroid hormone distribution. This explains why, under laboratory conditions, TTR-knockout mice show no major abnormalities. The ratio of TTR affinity for T4 over affinity for T3 is higher in eutherians than in reptiles and birds. This favors T4 transport from blood to brain providing more substrate for conversion of the biologically less active T4 into the biologically more active T3 by the tissue-specific brain deiodinases. The change in affinity of TTR during evolution involves a shortening and an increase in the hydrophilicity of the N-terminal regions of the TTR subunits. The molecular mechanism for this change is a stepwise shift of the splice site at the intron 1/exon 2 border of the TTR gene. The shift probably results from a sequence of single base mutations. Thus, TTR evolution provides an example for a molecular mechanism of positive Darwinian evolution. The amino acid sequences of fish and amphibian TTRs are very similar to those in mammals, suggesting that substantial TTR evolution occurred before the vertebrate stage. Open reading frames for TTR-like sequences already exist in Caenorhabditis elegans, yeast and Escherichia coli genomes.
Test Particle Stability in Exoplanet Systems
NASA Astrophysics Data System (ADS)
Frewen, Shane; Hansen, B. M.
2011-01-01
Astronomy is currently going through a golden age of exoplanet discovery. Yet despite that, there is limited research on the evolution of exoplanet systems driven by stellar evolution. In this work we look at the stability of test particles in known exoplanet systems during the host star's main sequence and white dwarf stages. In particular, we compare the instability regions that develop before and after the star loses mass to form a white dwarf, a process which causes the semi-major axes of the outer planets to expand adiabatically. We investigate the possibility of secular and resonant perturbations resulting in these regions as well as the method of removal of test particles for the instability regions, such as ejection and collision with the central star. To run our simulations we used the MERCURY software package (Chambers, 1999) and evolved our systems for over 10^8 years using a hybrid symplectic/Bulirsch-Stoer integrator.
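The adiabatic expansion mentioned in this abstract follows a simple scaling that can be sketched in a few lines. For slow, isotropic stellar mass loss the product of semi-major axis and stellar mass is conserved, so a_final = a_initial × M_initial / M_final. The function name and the numerical values below are illustrative assumptions, not figures from the study.

```python
def adiabatic_semi_major_axis(a_initial_au, m_initial_msun, m_final_msun):
    """Semi-major axis after slow, isotropic stellar mass loss.

    Uses conservation of a * M, valid when mass loss is slow compared
    with the orbital period and the planet's mass is negligible.
    """
    if m_final_msun <= 0 or m_initial_msun <= 0:
        raise ValueError("stellar masses must be positive")
    return a_initial_au * m_initial_msun / m_final_msun

# Hypothetical example: a 1.0 solar-mass star ending as a 0.55 solar-mass
# white dwarf, with a planet initially at 5.2 AU.
a_final = adiabatic_semi_major_axis(5.2, 1.0, 0.55)
print(f"{a_final:.2f} AU")
```

Under these assumed masses the orbit widens by a factor of about 1.8, which is why regions of stability and instability shift between the main-sequence and white dwarf stages.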
Mans, Robert; Daran, Jean-Marc G; Pronk, Jack T
2018-04-01
Evolutionary engineering, which uses laboratory evolution to select for industrially relevant traits, is a popular strategy in the development of high-performing yeast strains for industrial production of fuels and chemicals. By integrating whole-genome sequencing, bioinformatics, classical genetics and genome-editing techniques, evolutionary engineering has also become a powerful approach for identification and reverse engineering of molecular mechanisms that underlie industrially relevant traits. New techniques enable acceleration of in vivo mutation rates, both across yeast genomes and at specific loci. Recent studies indicate that phenotypic trade-offs, which are often observed after evolution under constant conditions, can be mitigated by using dynamic cultivation regimes. Advances in research on synthetic regulatory circuits offer exciting possibilities to extend the applicability of evolutionary engineering to products of yeasts whose synthesis requires a net input of cellular energy. Copyright © 2017 The Authors. Published by Elsevier Ltd. All rights reserved.
Zattara, Eduardo E; Busey, Hannah A; Linz, David M; Tomoyasu, Yoshinori; Moczek, Armin P
2016-07-13
The origin and integration of novel traits are fundamental processes during the developmental evolution of complex organisms. Yet how novel traits integrate into pre-existing contexts remains poorly understood. Beetle horns represent a spectacular evolutionary novelty integrated within the context of the adult dorsal head, a highly conserved trait complex present since the origin of insects. We investigated whether otd1/2 and six3, members of a highly conserved gene network that instructs the formation of the anterior end of most bilaterians, also play roles in patterning more recently evolved traits. Using ablation-based fate-mapping, comparative larval RNA interference (RNAi) and transcript sequencing, we found that otd1/2, but not six3, play a fundamental role in the post-embryonic formation of the adult dorsal head and head horns of Onthophagus beetles. By contrast, neither gene appears to pattern the adult head of Tribolium flour beetles even though all are expressed in the dorsal head epidermis of both Onthophagus and Tribolium. We propose that, at least in beetles, the roles of otd genes during post-embryonic development are decoupled from their embryonic functions, and that potentially non-functional post-embryonic expression in the dorsal head facilitated their co-option into a novel horn-patterning network during Onthophagus evolution. © 2016 The Author(s).
Busey, Hannah A.; Linz, David M.; Tomoyasu, Yoshinori; Moczek, Armin P.
2016-01-01
The origin and integration of novel traits are fundamental processes during the developmental evolution of complex organisms. Yet how novel traits integrate into pre-existing contexts remains poorly understood. Beetle horns represent a spectacular evolutionary novelty integrated within the context of the adult dorsal head, a highly conserved trait complex present since the origin of insects. We investigated whether otd1/2 and six3, members of a highly conserved gene network that instructs the formation of the anterior end of most bilaterians, also play roles in patterning more recently evolved traits. Using ablation-based fate-mapping, comparative larval RNA interference (RNAi) and transcript sequencing, we found that otd1/2, but not six3, play a fundamental role in the post-embryonic formation of the adult dorsal head and head horns of Onthophagus beetles. By contrast, neither gene appears to pattern the adult head of Tribolium flour beetles even though all are expressed in the dorsal head epidermis of both Onthophagus and Tribolium. We propose that, at least in beetles, the roles of otd genes during post-embryonic development are decoupled from their embryonic functions, and that potentially non-functional post-embryonic expression in the dorsal head facilitated their co-option into a novel horn-patterning network during Onthophagus evolution. PMID:27412276
Interspecific Plastome Recombination Reflects Ancient Reticulate Evolution in Picea (Pinaceae).
Sullivan, Alexis R; Schiffthaler, Bastian; Thompson, Stacey Lee; Street, Nathaniel R; Wang, Xiao-Ru
2017-07-01
Plastid sequences are a cornerstone in plant systematic studies and key aspects of their evolution, such as uniparental inheritance and absent recombination, are often treated as axioms. While exceptions to these assumptions can profoundly influence evolutionary inference, detecting them can require extensive sampling, abundant sequence data, and detailed testing. Using advancements in high-throughput sequencing, we analyzed the whole plastomes of 65 accessions of Picea, a genus of ∼35 coniferous forest tree species, to test for deviations from canonical plastome evolution. Using complementary hypothesis and data-driven tests, we found evidence for chimeric plastomes generated by interspecific hybridization and recombination in the clade comprising Norway spruce (P. abies) and 10 other species. Support for interspecific recombination remained after controlling for sequence saturation, positive selection, and potential alignment artifacts. These results reconcile previous conflicting plastid-based phylogenies and strengthen the mounting evidence of reticulate evolution in Picea. Given the relatively high frequency of hybridization and biparental plastid inheritance in plants, we suggest interspecific plastome recombination may be more widespread than currently appreciated and could underlie reported cases of discordant plastid phylogenies. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Limits of neutral drift: lessons from the in vitro evolution of two ribozymes.
Petrie, Katherine L; Joyce, Gerald F
2014-10-01
The relative contributions of adaptive selection and neutral drift to genetic change are unknown but likely depend on the inherent abundance of functional genotypes in sequence space and how accessible those genotypes are to one another. To better understand the relative roles of selection and drift in evolution, local fitness landscapes for two different RNA ligase ribozymes were examined using a continuous in vitro evolution system under conditions that foster the capacity for neutral drift to mediate genetic change. The exploration of sequence space was accelerated by increasing the mutation rate using mutagenic nucleotide analogs. Drift was encouraged by carrying out evolution within millions of separate compartments to exploit the founder effect. Deep sequencing of individuals from the evolved populations revealed that the distribution of genotypes did not escape the starting local fitness peak, remaining clustered around the sequence used to initiate evolution. This is consistent with a fitness landscape where high-fitness genotypes are sparse and well isolated, and suggests, at least in this context, that neutral drift alone is not a primary driver of genetic change. Neutral drift does, however, provide a repository of genetic variation upon which adaptive selection can act.
Computational analysis of sequence selection mechanisms.
Meyerguz, Leonid; Grasso, Catherine; Kleinberg, Jon; Elber, Ron
2004-04-01
Mechanisms leading to gene variations are responsible for the diversity of species and are important components of the theory of evolution. One constraint on gene evolution is that of protein foldability; the three-dimensional shapes of proteins must be thermodynamically stable. We explore the impact of this constraint and calculate properties of foldable sequences using 3660 structures from the Protein Data Bank. We seek a selection function that receives sequences as input, and outputs survival probability based on sequence fitness to structure. We compute the number of sequences that match a particular protein structure with energy lower than the native sequence, the density of the number of sequences, the entropy, and the "selection" temperature. The mechanism of structure selection for sequences longer than 200 amino acids is approximately universal. For shorter sequences, it is not. We speculate on concrete evolutionary mechanisms that show this behavior.
Evolution of epigenetic regulation in vertebrate genomes
Lowdon, Rebecca F.; Jang, Hyo Sik; Wang, Ting
2016-01-01
Empirical models of sequence evolution have spurred progress in the field of evolutionary genetics for decades. We are now realizing the importance and complexity of the eukaryotic epigenome. While epigenome analysis has been applied to genomes from single cell eukaryotes to human, comparative analyses are still relatively few, and computational algorithms to quantify epigenome evolution remain scarce. Accordingly, a quantitative model of epigenome evolution remains to be established. Here we review the comparative epigenomics literature and synthesize its overarching themes. We also suggest one mechanism, transcription factor binding site turnover, which relates sequence evolution to epigenetic conservation or divergence. Lastly, we propose a framework for how the field can move forward to build a coherent quantitative model of epigenome evolution. PMID:27080453
Astrovirology: Viruses at Large in the Universe.
Berliner, Aaron J; Mochizuki, Tomohiro; Stedman, Kenneth M
2018-02-01
Viruses are the most abundant biological entities on modern Earth. They are highly diverse both in structure and genomic sequence, play critical roles in evolution, strongly influence terran biogeochemistry, and are believed to have played important roles in the origin and evolution of life. However, there is yet very little focus on viruses in astrobiology. Viruses arguably have coexisted with cellular life-forms since the earliest stages of life, may have been directly involved therein, and have profoundly influenced cellular evolution. Viruses are the only entities on modern Earth to use either RNA or DNA in both single- and double-stranded forms for their genetic material and thus may provide a model for the putative RNA-protein world. With this review, we hope to inspire integration of virus research into astrobiology and also point out pressing unanswered questions in astrovirology, particularly regarding the detection of virus biosignatures and whether viruses could be spread extraterrestrially. We present basic virology principles, an inclusive definition of viruses, review current virology research pertinent to astrobiology, and propose ideas for future astrovirology research foci. Key Words: Astrobiology-Virology-Biosignatures-Origin of life-Roadmap. Astrobiology 18, 207-223.
The tangled bank of amino acids
Pollock, David D.
2016-01-01
The use of amino acid substitution matrices to model protein evolution has yielded important insights into both the evolutionary process and the properties of specific protein families. In order to make these models tractable, standard substitution matrices represent the average results of the evolutionary process rather than the underlying molecular biophysics and population genetics, treating proteins as a set of independently evolving sites rather than as an integrated biomolecular entity. With advances in computing and the increasing availability of sequence data, we now have an opportunity to move beyond current substitution matrices to more interpretable mechanistic models with greater fidelity to the evolutionary process of mutation and selection and the holistic nature of the selective constraints. As part of this endeavour, we consider how epistatic interactions induce spatial and temporal rate heterogeneity, and demonstrate how these generally ignored factors can reconcile standard substitution rate matrices and the underlying biology, allowing us to better understand the meaning of these substitution rates. Using computational simulations of protein evolution, we can demonstrate the importance of both spatial and temporal heterogeneity in modelling protein evolution. PMID:27028523
Extensive concerted evolution of rice paralogs and the road to regaining independence.
Wang, Xiyin; Tang, Haibao; Bowers, John E; Feltus, Frank A; Paterson, Andrew H
2007-11-01
Many genes duplicated by whole-genome duplications (WGDs) are more similar to one another than expected. We investigated whether concerted evolution through conversion and crossing over, well-known to affect tandem gene clusters, also affects dispersed paralogs. Genome sequences for two Oryza subspecies reveal appreciable gene conversion in the approximately 0.4 MY since their divergence, with a gradual progression toward independent evolution of older paralogs. Since divergence from subspecies indica, approximately 8% of japonica paralogs produced 5-7 MYA on chromosomes 11 and 12 have been affected by gene conversion and several reciprocal exchanges of chromosomal segments, while approximately 70-MY-old "paleologs" resulting from a genome duplication (GD) show much less conversion. Sequence similarity analysis in proximal gene clusters also suggests more conversion between younger paralogs. About 8% of paleologs may have been converted since rice-sorghum divergence approximately 41 MYA. Domain-encoding sequences are more frequently converted than nondomain sequences, suggesting a sort of circularity--that sequences conserved by selection may be further conserved by relatively frequent conversion. The higher level of concerted evolution in the 5-7 MY-old segmental duplication may reflect the behavior of many genomes within the first few million years after duplication or polyploidization.
Campo, Daniel; García-Vázquez, Eva
2012-01-01
The 5S rDNA is organized in the genome as tandemly repeated copies of a structural unit composed of a coding sequence plus a nontranscribed spacer (NTS). The coding region is highly conserved in evolution, whereas the NTS varies in both length and sequence. It has been proposed that 5S rRNA genes are members of a gene family that have arisen through concerted evolution. In this study, we describe the molecular organization and evolution of the 5S rDNA in the genera Lepidorhombus and Scophthalmus (Scophthalmidae) and compare it with the already known 5S rDNA of the very different genera Merluccius (Merluccidae) and Salmo (Salmoninae), to identify common structural elements or patterns for understanding 5S rDNA evolution in fish. High intra- and interspecific diversity within the 5S rDNA family in all the genera can be explained by a combination of duplications, deletions, and transposition events. Sequence blocks with high similarity in all the 5S rDNA members across species were identified for the four studied genera, with evidence of intense gene conversion within noncoding regions. We propose a model to explain the evolution of the 5S rDNA, in which the evolutionary units are blocks of nucleotides rather than the entire sequences or single nucleotides. This model implies a "two-speed" evolution: slow within blocks (homogenized by recombination) and fast within the gene family (diversified by duplications and deletions).
Advances for Studying Clonal Evolution in Cancer
Raphael, Benjamin J.; Chen, Feng; Wendl, Michael C.
2013-01-01
The “clonal evolution” model of cancer emerged and “evolved” amid ongoing advances in technology, especially in recent years during which next generation sequencing instruments have provided ever higher resolution pictures of the genetic changes in cancer cells and heterogeneity in tumors. It has become increasingly clear that clonal evolution is not a single sequential process, but instead frequently involves simultaneous evolution of multiple subclones that co-exist because they are of similar fitness or are spatially separated. Co-evolution of subclones also occurs when they complement each other’s survival advantages. Recent studies have also shown that clonal evolution is highly heterogeneous: different individual tumors of the same type may undergo very different paths of clonal evolution. New methodological advancements, including deep digital sequencing of a mixed tumor population, single cell sequencing, and the development of more sophisticated computational tools, will continue to shape and reshape the models of clonal evolution. In turn, these will provide both an improved framework for the understanding of cancer progression and a guide for treatment strategies aimed at the elimination of all, rather than just some, of the cancer cells within a patient. PMID:23353056
Evolution of meiotic recombination genes in maize and teosinte.
Sidhu, Gaganpreet K; Warzecha, Tomasz; Pawlowski, Wojciech P
2017-01-25
Meiotic recombination is a major source of genetic variation in eukaryotes. The role of recombination in evolution is recognized but little is known about how evolutionary forces affect the recombination pathway itself. Although the recombination pathway is fundamentally conserved across different species, genetic variation in recombination components and outcomes has been observed. Theoretical predictions and empirical studies suggest that changes in the recombination pathway are likely to provide adaptive abilities to populations experiencing directional or strong selection pressures, such as those occurring during species domestication. We hypothesized that adaptive changes in recombination may be associated with adaptive evolution patterns of genes involved in meiotic recombination. To examine how maize evolution and domestication affected meiotic recombination genes, we studied patterns of sequence polymorphism and divergence in eleven genes controlling key steps in the meiotic recombination pathway in a diverse set of maize inbred lines and several accessions of teosinte, the wild ancestor of maize. We discovered that, even though the recombination genes generally exhibited high sequence conservation expected in a pathway controlling a key cellular process, they showed substantial levels and diverse patterns of sequence polymorphism. Among others, we found differences in sequence polymorphism patterns between tropical and temperate maize germplasms. Several recombination genes displayed patterns of polymorphism indicative of adaptive evolution. Despite their ancient origin and overall sequence conservation, meiotic recombination genes can exhibit extensive and complex patterns of molecular evolution. Changes in these genes could affect the functioning of the recombination pathway, and may have contributed to the successful domestication of maize and its expansion to new cultivation areas.
Statistical Features of the 2010 Beni-Ilmane, Algeria, Aftershock Sequence
NASA Astrophysics Data System (ADS)
Hamdache, M.; Peláez, J. A.; Gospodinov, D.; Henares, J.
2018-03-01
The aftershock sequence of the 2010 Beni-Ilmane (M_W 5.5) earthquake is studied in depth to analyze the spatial and temporal variability of seismicity parameters of the relationships modeling the sequence. The b value of the frequency-magnitude distribution is examined rigorously. A threshold magnitude of completeness equal to 2.1, using the maximum curvature procedure or the changing point algorithm, and a b value equal to 0.96 ± 0.03 have been obtained for the entire sequence. Two clusters have been identified and characterized by their faulting type, exhibiting b values equal to 0.99 ± 0.05 and 1.04 ± 0.05. Additionally, the temporal decay of the aftershock sequence was examined using a stochastic point process. The analysis was done through the restricted epidemic-type aftershock sequence (RETAS) stochastic model, which allows the possibility to recognize the prevailing clustering pattern of the relaxation process in the examined area. The analysis selected the epidemic-type aftershock sequence (ETAS) model to offer the most appropriate description of the temporal distribution, which presumes that all events in the sequence can cause secondary aftershocks. Finally, the fractal dimensions are estimated using the integral correlation. The obtained D_2 values are 2.15 ± 0.01, 2.23 ± 0.01 and 2.17 ± 0.02 for the entire sequence, and for the first and second cluster, respectively. An analysis of the temporal evolution of the fractal dimensions D_-2, D_0, D_2 and the spectral slope has been also performed to derive and characterize the different clusters included in the sequence.
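As a rough illustration of how a b value of the kind quoted above can be estimated, the sketch below applies the Aki-Utsu maximum-likelihood estimator to magnitudes at or above the completeness threshold. The abstract does not state which estimator the authors used, so the choice of estimator, the function name `b_value_mle`, the bin width `dm`, and the sample magnitudes are all assumptions made for illustration.

```python
import math

def b_value_mle(magnitudes, mc, dm=0.0):
    """Aki-Utsu maximum-likelihood b value (illustrative sketch).

    magnitudes: iterable of event magnitudes
    mc: magnitude of completeness; events below it are ignored
    dm: magnitude bin width (0.0 for continuous magnitudes)
    Returns (b, sigma), where sigma = b / sqrt(N) is Aki's
    first-order standard error.
    """
    complete = [m for m in magnitudes if m >= mc]
    if not complete:
        raise ValueError("no events at or above mc")
    mean_mag = sum(complete) / len(complete)
    # Utsu's correction shifts the threshold by half a bin width
    excess = mean_mag - (mc - dm / 2.0)
    if excess <= 0:
        raise ValueError("mean magnitude must exceed mc - dm/2")
    b = math.log10(math.e) / excess
    return b, b / math.sqrt(len(complete))

# Hypothetical magnitudes, not data from the study
sample = [1.8, 2.1, 2.3, 2.6, 2.2, 3.1, 2.4, 2.9, 2.1, 3.6]
b, sigma = b_value_mle(sample, mc=2.1, dm=0.1)
print(f"b = {b:.2f} +/- {sigma:.2f}")
```

The estimate depends strongly on the completeness magnitude, which is why a threshold is fixed before b is computed.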
NASA Astrophysics Data System (ADS)
Greco, Gerson A.; González, Pablo D.; González, Santiago N.; Sato, Ana M.; Basei, Miguel A. S.; Tassinari, Colombo C. G.; Sato, Kei; Varela, Ricardo; Llambías, Eduardo J.
2015-10-01
The low-grade Nahuel Niyeu Formation in the Aguada Cecilio area (40°50′S-65°53′W) shows ultramafic to felsic metaigneous rocks forming a sill swarm intercalated in the metasedimentary sequence and a polyphase deformation which permit an integrated study of the magmatic and tectonometamorphic evolution of this geological unit. In this paper we present a geological characterization of the Nahuel Niyeu Formation in the Aguada Cecilio area combining mapping, structural and metamorphic analysis with a SHRIMP U-Pb age and geochemical data from the metaigneous rocks. The metasedimentary sequence consists of alternating metagreywackes and phyllites, and minor metasandstones and granule metaconglomerates. The sills are pre-kinematic intrusions and yielded one SHRIMP U-Pb zircon crystallization age of 513.6 ± 3.3 Ma. Their injection occurred after consolidation of the sedimentary sequence. A syn-sedimentary volcanic activity is interpreted by a metaandesite lava flow interlayered in the metasedimentary sequence. Sedimentary and igneous protoliths of the Nahuel Niyeu Formation would have been formed in a continental margin basin associated with active magmatic arc during the Cambrian Epoch 2. Two main low-grade tectonometamorphic events affected the Nahuel Niyeu Formation, one during the Cambrian Epoch 2-Early Ordovician and the other probably in the late Permian at ~260 Ma. Local late folds could belong to the final stages of the late Permian deformation or be even younger. In a regional context, the Nahuel Niyeu and El Jagüelito formations and Mina Gonzalito Complex show a comparable Cambrian-Ordovician evolution related to the Terra Australis Orogen in the south Gondwana margin. This evolution is also coeval with the late and early stages of the Pampean and Famatinian orogenies of Central Argentina, respectively.
The late Permian event recorded in the Nahuel Niyeu Formation in Aguada Cecilio area is identified by comparable structures affecting the Mina Gonzalito Complex and El Jagüelito Formation and resetting ages from granitoids. This event represents the Gondwanide Orogeny within the same Terra Australis Orogen.
Zhao, Liang; Jiang, Xi-Wang; Zuo, Yun-Juan; Liu, Xiao-Lin; Chin, Siew-Wai; Haberle, Rosemarie; Potter, Daniel; Chang, Zhao-Yang; Wen, Jun
2016-01-01
Prunus is an economically important genus well-known for cherries, plums, almonds, and peaches. The genus can be divided into three major groups based on inflorescence structure and ploidy levels: (1) the diploid solitary-flower group (subg. Prunus, Amygdalus and Emplectocladus); (2) the diploid corymbose group (subg. Cerasus); and (3) the polyploid racemose group (subg. Padus, subg. Laurocerasus, and the Maddenia group). The plastid phylogeny suggests three major clades within Prunus: Prunus-Amygdalus-Emplectocladus, Cerasus, and Laurocerasus-Padus-Maddenia, while nuclear ITS trees resolve Laurocerasus-Padus-Maddenia as a paraphyletic group. In this study, we employed sequences of the nuclear loci At103, ITS and s6pdh to explore the origins and evolution of the racemose group. Two copies of the At103 gene were identified in Prunus. One copy is found in Prunus species with solitary and corymbose inflorescences as well as those with racemose inflorescences, while the second copy (II) is present only in taxa with racemose inflorescences. The copy I sequences suggest that all racemose species form a paraphyletic group composed of four clades, each of which is definable by morphology and geography. The tree from the combined At103 and ITS sequences and the tree based on the single gene s6pdh had similar general topologies to the tree based on the copy I sequences of At103, with the combined At103-ITS tree showing stronger support in most clades. The nuclear At103, ITS and s6pdh data in conjunction with the plastid data are consistent with the hypothesis that multiple independent allopolyploidy events contributed to the origins of the racemose group. A widespread species or lineage may have served as the maternal parent for multiple hybridizations involving several paternal lineages. 
This hypothesis of the complex evolutionary history of the racemose group in Prunus reflects a major step forward in our understanding of diversification of the genus and has important implications for the interpretation of its phylogeny, evolution, and classification. PMID:27294529
A disruptive sequencer meets disruptive publishing.
Loman, Nick; Goodwin, Sarah; Jansen, Hans; Loose, Matt
2015-01-01
Nanopore sequencing was recently made available to users in the form of the Oxford Nanopore MinION. Released to users through an early access programme, the MinION is made unique by its tiny form factor and ability to generate very long sequences from single DNA molecules. The platform is undergoing rapid evolution with three distinct nanopore types and five updates to library preparation chemistry in the last 18 months. To keep pace with the rapid evolution of this sequencing platform, and to provide a space where new analysis methods can be openly discussed, we present a new F1000Research channel devoted to updates to and analysis of nanopore sequence data.
NASA Astrophysics Data System (ADS)
Eigen, Manfred
1988-12-01
The Darwinian concept of evolution through natural selection has been revised and put on a solid physical basis, in a form which applies to self-replicable macromolecules. Two new concepts are introduced: sequence space and quasi-species. Evolutionary change in the DNA- or RNA-sequence of a gene can be mapped as a trajectory in a sequence space of dimension ν, where ν corresponds to the number of changeable positions in the genomic sequence. Emphasis, however, is shifted from the single surviving wildtype, a single point in the sequence space, to the complex structure of the mutant distribution that constitutes the quasi-species. Selection is equivalent to an establishment of the quasi-species in a localized region of sequence space, subject to threshold conditions for the error rate and sequence length. Arrival of a new mutant may violate the local threshold condition and thereby lead to a displacement of the quasi-species into a different region of sequence space. This transformation is similar to a phase transition; the dynamical equations that describe the quasi-species have been shown to be analogous to those of the two-dimensional Ising model of ferromagnetism. The occurrence of a selectively advantageous mutant is biased by the particulars of the quasi-species distribution, whose mutants are populated according to their fitness relative to that of the wild-type. Inasmuch as fitness regions are connected (like mountain ridges) the evolutionary trajectory is guided to regions of optimal fitness. Evolution experiments in test tubes confirm this modification of the simple chance and law nature of the Darwinian concept. The results of the theory can also be applied to the construction of a machine that provides optimal conditions for a rapid evolution of functionally active macromolecules.
An introduction to the physics of molecular evolution by the author has appeared recently.1 Detailed studies of the kinetics and mechanisms of replication of RNA, the most likely candidate for early evolution2,3, and of the implications on natural selection have been given in Refs. 4 and 5. The quasi-species model has been constructed in Refs. 6 and 7 using the concept of sequence space. Subsequently various methods have been invented to elucidate this concept and to relate it to the theory of critical phenomena 8-19. The instability of the quasi-species at the error threshold is discussed in Ref. 10. Evolution experiments with RNA strands in test tubes are described in Refs. 21 and 22.
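The threshold condition linking error rate and sequence length can be made concrete with the standard single-peak form of the quasi-species error threshold. In that form, a master sequence with selective superiority sigma over its mutant cloud is maintained only while sigma * q**nu > 1, where q is the per-position copying fidelity and nu the sequence length, which gives a maximum length of ln(sigma) / (-ln q), roughly ln(sigma) / (1 - q) for q close to 1. The abstract does not write this formula out; the function name and the example values below are assumptions chosen for illustration.

```python
import math

def max_sequence_length(sigma, q):
    """Largest sequence length nu for which sigma * q**nu > 1 can hold.

    sigma: selective superiority of the master sequence (must be > 1)
    q: per-position copying fidelity (0 < q < 1)
    Illustrative sketch of the single-peak error threshold.
    """
    if sigma <= 1:
        raise ValueError("sigma must exceed 1 for selection to act")
    if not 0 < q < 1:
        raise ValueError("q must lie strictly between 0 and 1")
    return math.log(sigma) / -math.log(q)

# Hypothetical values: tenfold superiority, one error per 1000 positions
nu_max = max_sequence_length(sigma=10.0, q=0.999)
print(f"maximum maintainable length: about {nu_max:.0f} positions")
```

Because the limit scales with the logarithm of sigma but inversely with the error rate, copying fidelity rather than selective advantage dominates the maximum length in this simplified picture.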
Construction of Red Fox Chromosomal Fragments from the Short-Read Genome Assembly.
Rando, Halie M; Farré, Marta; Robson, Michael P; Won, Naomi B; Johnson, Jennifer L; Buch, Ronak; Bastounes, Estelle R; Xiang, Xueyan; Feng, Shaohong; Liu, Shiping; Xiong, Zijun; Kim, Jaebum; Zhang, Guojie; Trut, Lyudmila N; Larkin, Denis M; Kukekova, Anna V
2018-06-20
The genome of a red fox (Vulpes vulpes) was recently sequenced and assembled using next-generation sequencing (NGS). The assembly is of high quality, with 94X coverage and a scaffold N50 of 11.8 Mbp, but is split into 676,878 scaffolds, some of which are likely to contain assembly errors. Fragmentation and misassembly hinder accurate gene prediction and downstream analysis such as the identification of loci under selection. Therefore, assembly of the genome into chromosome-scale fragments was an important step towards developing this genomic model. Scaffolds from the assembly were aligned to the dog reference genome and compared to the alignment of an outgroup genome (cat) against the dog to identify syntenic sequences among species. The program Reference-Assisted Chromosome Assembly (RACA) then integrated the comparative alignment with the mapping of the raw sequencing reads generated during assembly against the fox scaffolds. The 128 sequence fragments RACA assembled were compared to the fox meiotic linkage map to guide the construction of 40 chromosomal fragments. This computational approach to assembly was facilitated by prior research in comparative mammalian genomics, and the continued improvement of the red fox genome can in turn offer insight into canid and carnivore chromosome evolution. This assembly is also necessary for advancing genetic research in foxes and other canids.
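The scaffold N50 quoted above is a length-weighted summary statistic: it is the scaffold length at which the scaffolds of that length or longer together cover at least half of the assembly, which is why an assembly can have a large N50 and still contain hundreds of thousands of short scaffolds. A minimal sketch of the usual definition follows; the function name `n50` and the toy lengths are assumptions, not values from the fox assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths.

    Scaffolds are visited from longest to shortest; the N50 is the
    length of the scaffold at which the running total first reaches
    half of the total assembly length.
    """
    if not lengths:
        raise ValueError("at least one scaffold length is required")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length

# Toy example: total length 20, so the halfway point is 10
print(n50([2, 3, 4, 5, 6]))  # 5
```

Adding many short scaffolds to the toy list changes the total length only slightly, so the N50 barely moves even though the scaffold count grows.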
Guisinger, Mary M; Chumley, Timothy W; Kuehl, Jennifer V; Boore, Jeffrey L; Jansen, Robert K
2010-02-01
Plastid genomes of the grasses (Poaceae) are unusual in their organization and rates of sequence evolution. There has been a recent surge in the availability of grass plastid genome sequences, but a comprehensive comparative analysis of genome evolution has not been performed that includes any related families in the Poales. We report on the plastid genome of Typha latifolia, the first non-grass Poales sequenced to date, and we present comparisons of genome organization and sequence evolution within Poales. Our results confirm that grass plastid genomes exhibit acceleration in both genomic rearrangements and nucleotide substitutions. Poaceae have multiple structural rearrangements, including three inversions, three gene losses (accD, ycf1, ycf2), intron losses in two genes (clpP, rpoC1), and expansion of the inverted repeat (IR) into both large and small single-copy regions. These rearrangements are restricted to the Poaceae, and IR expansion into the small single-copy region correlates with the phylogeny of the family. Comparisons of 73 protein-coding genes for 47 angiosperms including nine Poaceae genera confirm that the branch leading to Poaceae has significantly accelerated rates of change relative to other monocots and angiosperms. Furthermore, rates of sequence evolution within grasses are lower, indicating a deceleration during diversification of the family. Overall there is a strong correlation between accelerated rates of genomic rearrangements and nucleotide substitutions in Poaceae, a phenomenon that has been noted recently throughout angiosperms. The cause of the correlation is unknown, but faulty DNA repair has been suggested in other systems including bacterial and animal mitochondrial genomes.
Evolution of multiple quantum coherences with scaled dipolar Hamiltonian
NASA Astrophysics Data System (ADS)
Sánchez, Claudia M.; Buljubasich, Lisandro; Pastawski, Horacio M.; Chattah, Ana K.
2017-08-01
In this article, we introduce a pulse sequence which allows the monitoring of the multiple quantum coherence distribution of correlated spin states developed with a scaled dipolar Hamiltonian. The pulse sequence is a modification of our previous Proportionally Refocused Loschmidt echo (PRL echo) with phase increment, in order to verify the accuracy of the weighted coherent quantum dynamics. The experiments were carried out with different scaling factors to analyze the evolution of the total magnetization, the time dependence of the multiple quantum coherence orders, and the development of correlated spin clusters. In all cases, a strong dependence between the evolution rate and the weighting factor is observed. Remarkably, all the curves overlapped in a single trend when plotted against the self-time, a new time scale that includes the scaling factor into the evolution time. In other words, the spin system always displayed the same quantum evolution, slowed down as the scaling factor decreases, confirming the high performance of the new pulse sequence.
Evolutionary genetics of insect innate immunity.
Viljakainen, Lumi
2015-11-01
Patterns of evolution in immune defense genes help to understand the evolutionary dynamics between hosts and pathogens. Multiple insect genomes have been sequenced, with many of them having annotated immune genes, which paves the way for a comparative genomic analysis of insect immunity. In this review, I summarize the current state of comparative and evolutionary genomics of insect innate immune defense. The focus is on the conserved and divergent components of immunity with an emphasis on gene family evolution and evolution at the sequence level; both population genetics and molecular evolution frameworks are considered. © The Author 2015. Published by Oxford University Press.
Pandey, Ravi S; Azad, Rajeev K
2016-03-01
Sex chromosomes have evolved from a pair of homologous autosomes which differentiated into sex determination systems, such as XY or ZW system, as a consequence of successive recombination suppression between the gametologous chromosomes. Identifying the regions of recombination suppression, namely, the "evolutionary strata", is central to understanding the history and dynamics of sex chromosome evolution. Evolution of sex chromosomes as a consequence of serial recombination suppressions is well-studied for mammals and birds, but not for plants, although 48 dioecious plants have already been reported. Only two plants Silene latifolia and papaya have been studied until now for the presence of evolutionary strata on their X chromosomes, made possible by the sequencing of sex-linked genes on both the X and Y chromosomes, which is a requirement of all current methods that determine stratum structure based on the comparison of gametologous sex chromosomes. To circumvent this limitation and detect strata even if only the sequence of sex chromosome in the homogametic sex (i.e. X or Z chromosome) is available, we have developed an integrated segmentation and clustering method. In application to gene sequences on the papaya X chromosome and protein-coding sequences on the S. latifolia X chromosome, our method could decipher all known evolutionary strata, as reported by previous studies. Our method, after validating on known strata on the papaya and S. latifolia X chromosome, was applied to the chromosome 19 of Populus trichocarpa, an incipient sex chromosome, deciphering two, yet unknown, evolutionary strata. In addition, we applied this approach to the recently sequenced sex chromosome V of the brown alga Ectocarpus sp. 
that has a haploid sex determination system (UV system) recovering the sex determining and pseudoautosomal regions, and then to the mating-type chromosomes of an anther-smut fungus Microbotryum lychnidis-dioicae predicting five strata in the non-recombining region of both the chromosomes.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhou, Xiaofan; Peris, David; Kominek, Jacek; ...
2016-09-16
The availability of genomes across the tree of life is highly biased toward vertebrates, pathogens, human disease models, and organisms with relatively small and simple genomes. Recent progress in genomics has enabled the de novo decoding of the genome of virtually any organism, greatly expanding its potential for understanding the biology and evolution of the full spectrum of biodiversity. The increasing diversity of sequencing technologies, assays, and de novo assembly algorithms have augmented the complexity of de novo genome sequencing projects in nonmodel organisms. To reduce the costs and challenges in de novo genome sequencing projects and streamline their experimental design and analysis, we developed iWGS (in silico Whole Genome Sequencer and Analyzer), an automated pipeline for guiding the choice of appropriate sequencing strategy and assembly protocols. iWGS seamlessly integrates the four key steps of a de novo genome sequencing project: data generation (through simulation), data quality control, de novo assembly, and assembly evaluation and validation. The last three steps can also be applied to the analysis of real data. iWGS is designed to enable the user to have great flexibility in testing the range of experimental designs available for genome sequencing projects, and supports all major sequencing technologies and popular assembly tools. Three case studies illustrate how iWGS can guide the design of de novo genome sequencing projects, and evaluate the performance of a wide variety of user-specified sequencing strategies and assembly protocols on genomes of differing architectures. iWGS, along with a detailed documentation, is freely available at https://github.com/zhouxiaofan1983/iWGS.
Davis, G L; McMullen, M D; Baysdorfer, C; Musket, T; Grant, D; Staebell, M; Xu, G; Polacco, M; Koster, L; Melia-Hancock, S; Houchins, K; Chao, S; Coe, E H
1999-01-01
We have constructed a 1736-locus maize genome map containing 1156 loci probed by cDNAs, 545 probed by random genomic clones, 16 by simple sequence repeats (SSRs), 14 by isozymes, and 5 by anonymous clones. Sequence information is available for 56% of the loci with 66% of the sequenced loci assigned functions. A total of 596 new ESTs were mapped from a B73 library of 5-wk-old shoots. The map contains 237 loci probed by barley, oat, wheat, rice, or tripsacum clones, which serve as grass genome reference points in comparisons between maize and other grass maps. Ninety core markers selected for low copy number, high polymorphism, and even spacing along the chromosome delineate the 100 bins on the map. The average bin size is 17 cM. Use of bin assignments enables comparison among different maize mapping populations and experiments including those involving cytogenetic stocks, mutants, or quantitative trait loci. Integration of nonmaize markers in the map extends the resources available for gene discovery beyond the boundaries of maize mapping information into the expanse of map, sequence, and phenotype information from other grass species. This map provides a foundation for numerous basic and applied investigations including studies of gene organization, gene and genome evolution, targeted cloning, and dissection of complex traits. PMID:10388831
Bolzán, Alejandro D
2017-07-01
By definition, telomeric sequences are located at the very ends or terminal regions of chromosomes. However, several vertebrate species show blocks of (TTAGGG)n repeats present in non-terminal regions of chromosomes, the so-called interstitial telomeric sequences (ITSs), interstitial telomeric repeats or interstitial telomeric bands, which include those intrachromosomal telomeric-like repeats located near (pericentromeric ITSs) or within the centromere (centromeric ITSs) and those telomeric repeats located between the centromere and the telomere (i.e., truly interstitial telomeric sequences) of eukaryotic chromosomes. According to their sequence organization, localization and flanking sequences, ITSs can be classified into four types: 1) short ITSs, 2) subtelomeric ITSs, 3) fusion ITSs, and 4) heterochromatic ITSs. The first three types have been described mainly in the human genome, whereas heterochromatic ITSs have been found in several vertebrate species but not in humans. Several lines of evidence suggest that ITSs play a significant role in genome instability and evolution. This review aims to summarize our current knowledge about the origin, function, instability and evolution of these telomeric-like repeats in vertebrate chromosomes. Copyright © 2017 Elsevier B.V. All rights reserved.
Advances in Cryptococcus genomics: insights into the evolution of pathogenesis.
Cuomo, Christina A; Rhodes, Johanna; Desjardins, Christopher A
2018-01-01
Cryptococcus species are the causative agents of cryptococcal meningitis, a significant source of mortality in immunocompromised individuals. Initial work on the molecular epidemiology of this fungal pathogen utilized genotyping approaches to describe the genetic diversity and biogeography of two species, Cryptococcus neoformans and Cryptococcus gattii. Whole genome sequencing of representatives of both species resulted in reference assemblies enabling a wide array of downstream studies and genomic resources. With the increasing availability of whole genome sequencing, both species have now had hundreds of individual isolates sequenced, providing fine-scale insight into the evolution and diversification of Cryptococcus and allowing for the first genome-wide association studies to identify genetic variants associated with human virulence. Sequencing has also begun to examine the microevolution of isolates during prolonged infection and to identify variants specific to outbreak lineages, highlighting the potential role of hyper-mutation in evolving within short time scales. We can anticipate that further advances in sequencing technology and sequencing microbial genomes at scale, including metagenomics approaches, will continue to refine our view of how the evolution of Cryptococcus drives its success as a pathogen.
The Population History of Endogenous Retroviruses in Mule Deer (Odocoileus hemionus)
2014-01-01
Mobile elements are powerful agents of genomic evolution and can be exceptionally informative markers for investigating species and population-level evolutionary history. While several studies have utilized retrotransposon-based insertional polymorphisms to resolve phylogenies, few population studies exist outside of humans. Endogenous retroviruses are LTR-retrotransposons derived from retroviruses that have become stably integrated in the host genome during past infections and transmitted vertically to subsequent generations. They offer valuable insight into host-virus co-evolution and a unique perspective on host evolutionary history because they integrate into the genome at a discrete point in time. We examined the evolutionary history of a cervid endogenous gammaretrovirus (CrERVγ) in mule deer (Odocoileus hemionus). We sequenced 14 CrERV proviruses (CrERV-in1 to -in14), and examined the prevalence and distribution of 13 proviruses in 262 deer among 15 populations from Montana, Wyoming, and Utah. CrERV absence in white-tailed deer (O. virginianus), identical 5′ and 3′ long terminal repeat (LTR) sequences, insertional polymorphism, and CrERV divergence time estimates indicated that most endogenization events occurred within the last 200,000 years. Population structure inferred from CrERVs (F_ST = 0.008) and microsatellites (θ = 0.01) was low, but significant, with Utah, northwestern Montana, and a Helena herd being particularly differentiated. Clustering analyses indicated regional structuring, and non-contiguous clustering could often be explained by known translocations. Cluster ensemble results indicated spatial localization of viruses, specifically in deer from northeastern and western Montana. This study demonstrates the utility of endogenous retroviruses to elucidate and provide novel insight into both ERV evolutionary history and the history of contemporary host populations. PMID:24336966
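To make the reported population-structure statistic more concrete, here is a minimal sketch of a textbook heterozygosity-based F_ST for a single biallelic marker, such as presence or absence of one proviral insertion. It assumes the insertion frequency in each population is already known and weights populations equally. The abstract does not state which estimator the authors used, so this is an illustration of the concept, not their method; the function name and example frequencies are hypothetical.

```python
def fst_biallelic(freqs):
    """Heterozygosity-based F_ST for one biallelic marker.

    freqs holds the frequency of one allele (for example, insertion present)
    in each population. Populations are weighted equally, which is an
    assumption of this sketch.
    """
    if len(freqs) < 2:
        raise ValueError("need at least two populations")
    p_bar = sum(freqs) / len(freqs)
    # Expected heterozygosity of the pooled population.
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    if h_total == 0.0:
        # Marker is fixed everywhere, so there is no variation to partition.
        return 0.0
    # Mean expected heterozygosity within populations.
    h_within = sum(2.0 * p * (1.0 - p) for p in freqs) / len(freqs)
    return (h_total - h_within) / h_total


# Hypothetical insertion frequencies in two populations.
example_fst = fst_biallelic([0.4, 0.6])
```

Values near zero, like the 0.008 reported for CrERVs, indicate that almost all of the variation lies within populations rather than between them.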
The population history of endogenous retroviruses in mule deer (Odocoileus hemionus)
Kamath, Pauline L.; Elleder, Daniel; Bao, Le; Cross, Paul C.; Powell, John H.; Poss, Mary
2013-01-01
Mobile elements are powerful agents of genomic evolution and can be exceptionally informative markers for investigating species and population-level evolutionary history. While several studies have utilized retrotransposon-based insertional polymorphisms to resolve phylogenies, few population studies exist outside of humans. Endogenous retroviruses are LTR-retrotransposons derived from retroviruses that have become stably integrated in the host genome during past infections and transmitted vertically to subsequent generations. They offer valuable insight into host-virus co-evolution and a unique perspective on host evolutionary history because they integrate into the genome at a discrete point in time. We examined the evolutionary history of a cervid endogenous gammaretrovirus (CrERVγ) in mule deer (Odocoileus hemionus). We sequenced 14 CrERV proviruses (CrERV-in1 to -in14), and examined the prevalence and distribution of 13 proviruses in 262 deer among 15 populations from Montana, Wyoming, and Utah. CrERV absence in white-tailed deer (O. virginianus), identical 5′ and 3′ long terminal repeat (LTR) sequences, insertional polymorphism, and CrERV divergence time estimates indicated that most endogenization events occurred within the last 200,000 years. Population structure inferred from CrERVs (F_ST = 0.008) and microsatellites (θ = 0.01) was low, but significant, with Utah, northwestern Montana, and a Helena herd being particularly differentiated. Clustering analyses indicated regional structuring, and non-contiguous clustering could often be explained by known translocations. Cluster ensemble results indicated spatial localization of viruses, specifically in deer from northeastern and western Montana. This study demonstrates the utility of endogenous retroviruses to elucidate and provide novel insight into both ERV evolutionary history and the history of contemporary host populations.
Sramkó, Gábor; Paun, Ovidiu
2018-01-01
Background and Aims: Bee orchids (Ophrys) have become the most popular model system for studying reproduction via insect-mediated pseudo-copulation and for exploring the consequent, putatively adaptive, evolutionary radiations. However, despite intensive past research, both the phylogenetic structure and species diversity within the genus remain highly contentious. Here, we integrate next-generation sequencing and morphological cladistic techniques to clarify the phylogeny of the genus. Methods: At least two accessions of each of the ten species groups previously circumscribed from large-scale cloned nuclear ribosomal internal transcribed spacer (nrITS) sequencing were subjected to restriction site-associated sequencing (RAD-seq). The resulting matrix of 4159 single nucleotide polymorphisms (SNPs) for 34 accessions was used to construct an unrooted network and a rooted maximum likelihood phylogeny. A parallel morphological cladistic matrix of 43 characters generated both polymorphic and non-polymorphic sets of parsimony trees before being mapped across the RAD-seq topology. Key Results: RAD-seq data strongly support the monophyly of nine out of ten groups previously circumscribed using nrITS and resolve three major clades; in contrast, supposed microspecies are barely distinguishable. Strong incongruence separated the RAD-seq trees from both the morphological trees and traditional classifications; mapping of the morphological characters across the RAD-seq topology rendered them far more homoplastic. Conclusions: The comparatively high level of morphological homoplasy reflects extensive convergence, whereas the derived placement of the fusca group is attributed to paedomorphic simplification. The phenotype of the most recent common ancestor of the extant lineages is inferred, but it post-dates the majority of the character-state changes that typify the genus. RAD-seq may represent the high-water mark of the contribution of molecular phylogenetics to understanding evolution within Ophrys; further progress will require large-scale population-level studies that integrate phenotypic and genotypic data in a cogent conceptual framework. PMID:29325077
Diene, Seydina M; Merhej, Vicky; Henry, Mireille; El Filali, Adil; Roux, Véronique; Robert, Catherine; Azza, Saïd; Gavory, Frederick; Barbe, Valérie; La Scola, Bernard; Raoult, Didier; Rolain, Jean-Marc
2013-02-01
Here, we sequenced the 5,419,609 bp circular genome of an Enterobacter aerogenes clinical isolate that killed a patient and was resistant to almost all current antibiotics (except gentamicin) commonly used to treat Enterobacterial infections, including colistin. Genomic and phylogenetic analyses explain the discrepancies of this bacterium and show that its core genome originates from another genus, Klebsiella. Atypical characteristics of this bacterium (i.e., motility, presence of ornithine decarboxylase, and lack of urease activity) are attributed to genomic mosaicism, by acquisition of additional genes, such as the complete 60,582 bp flagellar assembly operon acquired "en bloc" from the genus Serratia. The genealogic tree of the 162,202 bp multidrug-resistant conjugative plasmid shows that it is a chimera of transposons and integrative conjugative elements from various bacterial origins, resembling a rhizome. Moreover, we demonstrate biologically that a G53S mutation in the pmrA gene results in colistin resistance. E. aerogenes has a large RNA population comprising 8 rRNA operons and 87 cognate tRNAs that have the ability to translate transferred genes that use different codons, as exemplified by the significantly different codon usage between genes from the core genome and the "mobilome." On the basis of our findings, we propose that this bacterium evolved into a "killer bug" with new genomic repertoires by meeting three criteria, "opportunity, power, and usage," that indicate a sympatric lifestyle: "opportunity" to meet other bacteria and exchange foreign sequences, since this bacterium was similar to sympatric bacteria; "power" to integrate these foreign sequences, as shown by the acquisition of several mobile genetic elements (plasmids, integrative conjugative elements, prophages, transposons, flagellar assembly system, etc.) found in its genome; and "usage," the ability to translate these sequences, including those with rare codons, so that it serves as a translator of foreign languages.
Origin-Dependent Inverted-Repeat Amplification: Tests of a Model for Inverted DNA Amplification
Brewer, Bonita J.; Payen, Celia; Di Rienzi, Sara C.; Higgins, Megan M.; Ong, Giang; Dunham, Maitreya J.; Raghuraman, M. K.
2015-01-01
DNA replication errors are a major driver of evolution—from single nucleotide polymorphisms to large-scale copy number variations (CNVs). Here we test a specific replication-based model to explain the generation of interstitial, inverted triplications. While no genetic information is lost, the novel inversion junctions and increased copy number of the included sequences create the potential for adaptive phenotypes. The model—Origin-Dependent Inverted-Repeat Amplification (ODIRA)—proposes that a replication error at pre-existing short, interrupted, inverted repeats in genomic sequences generates an extrachromosomal, inverted dimeric, autonomously replicating intermediate; subsequent genomic integration of the dimer yields this class of CNV without loss of distal chromosomal sequences. We used a combination of in vitro and in vivo approaches to test the feasibility of the proposed replication error and its downstream consequences on chromosome structure in the yeast Saccharomyces cerevisiae. We show that the proposed replication error—the ligation of leading and lagging nascent strands to create “closed” forks—can occur in vitro at short, interrupted inverted repeats. The removal of molecules with two closed forks results in a hairpin-capped linear duplex that we show replicates in vivo to create an inverted, dimeric plasmid that subsequently integrates into the genome by homologous recombination, creating an inverted triplication. While other models have been proposed to explain inverted triplications and their derivatives, our model can also explain the generation of human, de novo, inverted amplicons that have a 2:1 mixture of sequences from both homologues of a single parent—a feature readily explained by a plasmid intermediate that arises from one homologue and integrates into the other homologue prior to meiosis. 
Our tests of key features of ODIRA lend support to this mechanism and suggest further avenues of enquiry to unravel the origins of interstitial, inverted CNVs pivotal in human health and evolution. PMID:26700858
Bateman, Richard M; Sramkó, Gábor; Paun, Ovidiu
2018-01-25
Bee orchids (Ophrys) have become the most popular model system for studying reproduction via insect-mediated pseudo-copulation and for exploring the consequent, putatively adaptive, evolutionary radiations. However, despite intensive past research, both the phylogenetic structure and species diversity within the genus remain highly contentious. Here, we integrate next-generation sequencing and morphological cladistic techniques to clarify the phylogeny of the genus. At least two accessions of each of the ten species groups previously circumscribed from large-scale cloned nuclear ribosomal internal transcribed spacer (nrITS) sequencing were subjected to restriction site-associated sequencing (RAD-seq). The resulting matrix of 4159 single nucleotide polymorphisms (SNPs) for 34 accessions was used to construct an unrooted network and a rooted maximum likelihood phylogeny. A parallel morphological cladistic matrix of 43 characters generated both polymorphic and non-polymorphic sets of parsimony trees before being mapped across the RAD-seq topology. RAD-seq data strongly support the monophyly of nine out of ten groups previously circumscribed using nrITS and resolve three major clades; in contrast, supposed microspecies are barely distinguishable. Strong incongruence separated the RAD-seq trees from both the morphological trees and traditional classifications; mapping of the morphological characters across the RAD-seq topology rendered them far more homoplastic. The comparatively high level of morphological homoplasy reflects extensive convergence, whereas the derived placement of the fusca group is attributed to paedomorphic simplification. The phenotype of the most recent common ancestor of the extant lineages is inferred, but it post-dates the majority of the character-state changes that typify the genus. RAD-seq may represent the high-water mark of the contribution of molecular phylogenetics to understanding evolution within Ophrys; further progress will require large-scale population-level studies that integrate phenotypic and genotypic data in a cogent conceptual framework. © The Author(s) 2018. Published by Oxford University Press on behalf of the Annals of Botany Company.
Development of a Prognostic Marker for Lung Cancer Using Analysis of Tumor Evolution
2017-08-01
The goal of this project is to sequence the exomes of single tumor cells from tumors in order to construct evolutionary trees...dissociation, tumor cell isolation, whole genome amplification, and exome sequencing. We have begun to sequence the exomes of single cells and to...of populations, the evolution of tumor cells within a tumor can be diagrammed on a phylogenetic tree. The more diverse a tumor’s phylogenetic tree
Marean, Curtis W
2016-07-05
Scientists have identified a series of milestones in the evolution of the human food quest that are anticipated to have had far-reaching impacts on biological, behavioural and cultural evolution: the inclusion of substantial portions of meat, the broad spectrum revolution and the transition to food production. The foraging shift to dense and predictable resources is another key milestone that had consequential impacts on the later part of human evolution. The theory of economic defendability predicts that this shift had an important consequence: elevated levels of intergroup territoriality and conflict. In this paper, this theory is integrated with a well-established general theory of hunter-gatherer adaptations and is used to make predictions for the sequence of appearance of several evolved traits of modern humans. The distribution of dense and predictable resources in Africa is reviewed and found to occur only in aquatic contexts (coasts, rivers and lakes). The palaeoanthropological empirical record contains recurrent evidence for a shift to the exploitation of dense and predictable resources by 110 000 years ago, and the first known occurrence is in a marine coastal context in South Africa. Some theory predicts that this elevated conflict would have provided the conditions for selection for the hyperprosocial behaviours unique to modern humans. This article is part of the themed issue 'Major transitions in human evolution'. © 2016 The Author(s).
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schulman, Al
2009-08-09
Three subfamilies of grasses, the Ehrhartoideae (rice), the Panicoideae (maize, sorghum, sugar cane and millet), and the Pooideae (wheat, barley and cool season forage grasses) provide the basis of human nutrition and are poised to become major sources of renewable energy. Here we describe the complete genome sequence of the wild grass Brachypodium distachyon (Brachypodium), the first member of the Pooideae subfamily to be completely sequenced. Comparison of the Brachypodium, rice and sorghum genomes reveals a precise sequence-based history of genome evolution across a broad diversity of the grass family and identifies nested insertions of whole chromosomes into centromeric regions as a predominant mechanism driving chromosome evolution in the grasses. The relatively compact genome of Brachypodium is maintained by a balance of retroelement replication and loss. The complete genome sequence of Brachypodium, coupled to its exceptional promise as a model system for grass research, will support the development of new energy and food crops.
Favorable genomic environments for cis-regulatory evolution: A novel theoretical framework.
Maeso, Ignacio; Tena, Juan J
2016-09-01
Cis-regulatory changes are arguably the primary evolutionary source of animal morphological diversity. With the recent explosion of genome-wide comparisons of the cis-regulatory content in different animal species, it is now possible to infer general principles underlying enhancer evolution. However, these studies have also revealed numerous discrepancies and paradoxes, suggesting that the mechanistic causes and modes of cis-regulatory evolution are still not well understood and are probably much more complex than generally appreciated. Here, we argue that the mutational mechanisms and genomic regions generating new regulatory activities must comply with the constraints imposed by the molecular properties of cis-regulatory elements (CREs) and the organizational features of long-range chromatin interactions. Accordingly, we propose a new integrative evolutionary framework for cis-regulatory evolution based on two major premises for the origin of novel enhancer activity: (i) an accessible chromatin environment and (ii) compatibility with the 3D structure and interactions of pre-existing CREs. Mechanisms and DNA sequences not fulfilling these premises will be less likely to have a measurable impact on gene expression and, as such, will make only a minor contribution to the evolution of gene regulation. Finally, we discuss current comparative cis-regulatory data in the light of this new evolutionary model, and propose that the two most prominent mechanisms for the evolution of cis-regulatory changes are the overprinting of ancestral CREs and the exaptation of transposable elements. Copyright © 2015 Elsevier Ltd. All rights reserved.
Integrating single-cell transcriptomic data across different conditions, technologies, and species.
Butler, Andrew; Hoffman, Paul; Smibert, Peter; Papalexi, Efthymia; Satija, Rahul
2018-06-01
Computational single-cell RNA-seq (scRNA-seq) methods have been successfully applied to experiments representing a single condition, technology, or species to discover and define cellular phenotypes. However, identifying subpopulations of cells that are present across multiple data sets remains challenging. Here, we introduce an analytical strategy for integrating scRNA-seq data sets based on common sources of variation, enabling the identification of shared populations across data sets and downstream comparative analysis. We apply this approach, implemented in our R toolkit Seurat (http://satijalab.org/seurat/), to align scRNA-seq data sets of peripheral blood mononuclear cells under resting and stimulated conditions, hematopoietic progenitors sequenced using two profiling technologies, and pancreatic cell 'atlases' generated from human and mouse islets. In each case, we learn distinct or transitional cell states jointly across data sets, while boosting statistical power through integrated analysis. Our approach facilitates general comparisons of scRNA-seq data sets, potentially deepening our understanding of how distinct cell states respond to perturbation, disease, and evolution.
Protein-Protein Interaction Network and Gene Ontology
NASA Astrophysics Data System (ADS)
Choi, Yunkyu; Kim, Seok; Yi, Gwan-Su; Park, Jinah
Advances in computer technology make it possible to access, via the internet, large amounts and various kinds of biological data, such as DNA sequences, proteomics data, and information discovered about them. It is expected that combining these data could help researchers gain further knowledge about them. The role of a visualization system is to invoke the human ability to integrate information and to recognize patterns in the data. Thus, when various kinds of data are examined and analyzed manually, an effective visualization system is essential. One instance of such integrated visualization is the combination of protein-protein interaction (PPI) data and Gene Ontology (GO), which could help enhance the analysis of PPI networks. We introduce a simple but comprehensive visualization system that integrates GO and PPI data, in which GO and PPI graphs are visualized side by side, and that supports quick reference functions between them. Furthermore, the proposed system provides several interactive visualization methods for efficiently analyzing the PPI network and the GO directed acyclic graph, such as context-based browsing and common-ancestor finding.
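The common-ancestor lookup mentioned at the end of the abstract above can be sketched on a small directed acyclic graph. The term names and the parent map below are invented for illustration and are not real GO identifiers; the abstract does not describe the system's actual data structures, so the functions `ancestors` and `common_ancestors` are hypothetical.

```python
def ancestors(term, parents):
    """Collect every ancestor of a term by walking child-to-parent edges."""
    seen = set()
    stack = [term]
    while stack:
        node = stack.pop()
        for parent in parents.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def common_ancestors(term_a, term_b, parents):
    """Return the terms that lie on or above both inputs in the DAG."""
    reach_a = ancestors(term_a, parents) | {term_a}
    reach_b = ancestors(term_b, parents) | {term_b}
    return reach_a & reach_b


# Hypothetical ontology fragment: each term maps to its direct parents.
toy_parents = {
    "binding": ["root"],
    "catalytic_activity": ["root"],
    "protein_binding": ["binding"],
    "kinase_binding": ["protein_binding"],
    "enzyme_binding": ["protein_binding"],
    "kinase_activity": ["catalytic_activity"],
}

shared_close = common_ancestors("kinase_binding", "enzyme_binding", toy_parents)
shared_far = common_ancestors("kinase_binding", "kinase_activity", toy_parents)
```

Two interacting proteins whose annotations share only the root term are functionally distant in the ontology, while a deep shared ancestor suggests related function. A side-by-side view of the PPI and GO graphs could surface this kind of signal.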
Hocum, Jonah D; Battrell, Logan R; Maynard, Ryan; Adair, Jennifer E; Beard, Brian C; Rawlings, David J; Kiem, Hans-Peter; Miller, Daniel G; Trobridge, Grant D
2015-07-07
Analyzing the integration profile of retroviral vectors is a vital step in determining their potential genotoxic effects and developing safer vectors for therapeutic use. Identifying retroviral vector integration sites is also important for retroviral mutagenesis screens. We developed VISA, a vector integration site analysis server, to analyze next-generation sequencing data for retroviral vector integration sites. Sequence reads that contain a provirus are mapped to the human genome, sequence reads that cannot be localized to a unique location in the genome are filtered out, and then unique retroviral vector integration sites are determined based on the alignment scores of the remaining sequence reads. VISA offers a simple web interface to upload sequence files and results are returned in a concise tabular format to allow rapid analysis of retroviral vector integration sites.
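The filtering steps described above (discard reads that cannot be placed uniquely, then derive unique integration sites from the alignment scores of the remaining reads) can be sketched as follows. This is an illustration under stated assumptions, not VISA's implementation: the tuple layout, the `min_gap` threshold, and the rule that a read is unique when its best score beats the runner-up by at least `min_gap` are all hypothetical.

```python
from collections import Counter


def unique_hit(alignments, min_gap=1):
    """Return (chrom, pos, strand) of the best alignment, or None if ambiguous.

    alignments is a list of (chrom, pos, strand, score) tuples for one read.
    A read counts as uniquely placed when its best score exceeds the
    second-best score by at least min_gap (an assumed rule).
    """
    ranked = sorted(alignments, key=lambda aln: aln[3], reverse=True)
    if not ranked:
        return None
    if len(ranked) > 1 and ranked[0][3] - ranked[1][3] < min_gap:
        return None
    return ranked[0][:3]


def integration_sites(reads):
    """Count supporting reads per unique integration site."""
    sites = Counter()
    for alignments in reads.values():
        hit = unique_hit(alignments)
        if hit is not None:
            sites[hit] += 1
    return sites


# Hypothetical reads: r1 and r2 support the same site, r3 is ambiguous, r4 is unmapped.
example_reads = {
    "r1": [("chr1", 100, "+", 60)],
    "r2": [("chr1", 100, "+", 58), ("chr5", 900, "-", 20)],
    "r3": [("chr2", 50, "-", 40), ("chr7", 75, "+", 40)],
    "r4": [],
}

example_sites = integration_sites(example_reads)
```

The resulting counter maps each site to its read support, which corresponds to the kind of concise table the abstract says the server returns.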
NASA Technical Reports Server (NTRS)
Kretsinger, R. H.; Nakayama, S.
1993-01-01
In the previous three reports in this series we demonstrated that the EF-hand family of proteins evolved by a complex pattern of gene duplication, transposition, and splicing. The dendrograms based on exon sequences are nearly identical to those based on protein sequences for troponin C, the essential light chain myosin, the regulatory light chain, and calpain. This validates both the computational methods and the dendrograms for these subfamilies. The proposal of congruence for calmodulin, troponin C, essential light chain, and regulatory light chain was confirmed. There are, however, significant differences in the calmodulin dendrograms computed from DNA and from protein sequences. In this study we find that introns are distributed throughout the EF-hand domain and the interdomain regions. Further, dendrograms based on intron type and distribution bear little resemblance to those based on protein or on DNA sequences. We conclude that introns are inserted, and probably deleted, with relatively high frequency. Further, in the EF-hand family exons do not correspond to structural domains and exon shuffling played little if any role in the evolution of this widely distributed homolog family. Calmodulin has had a turbulent evolution. Its dendrograms based on protein sequence, exon sequence, 3'-tail sequence, intron sequences, and intron positions all show significant differences.
Gómez, José María; Perfectti, Francisco; Klingenberg, Christian Peter
2014-01-01
Flowers of animal-pollinated plants are integrated structures shaped by the action of pollinator-mediated selection. It is widely assumed that pollination specialization increases the magnitude of floral integration. However, empirical evidence is still inconclusive. In this study, we explored the role of pollinator diversity in shaping the evolution of corolla-shape integration in Erysimum, a plant genus with generalized pollination systems. We quantified floral integration in Erysimum using geometric morphometrics and explored its evolution using phylogenetic comparative methods. Corolla-shape integration was low but significantly different from zero in all study species. Spatial autocorrelation and phylogenetic signal in corolla-shape integration were not detected. In addition, integration in Erysimum seems to have evolved in a way that is consistent with Brownian motion, but with frequent convergent evolution. Corolla-shape integration was negatively associated with the number of pollinators visiting the flowers of each Erysimum species. That is, it was lower in those species having a more generalized pollination system. This negative association may occur because the co-occurrence of many pollinators imposes conflicting selection and cancels out any consistent selection on specific floral traits, preventing the evolution of highly integrated flowers. PMID:25002702
Huang, Yan; Hu, Junhua; Wang, Bin; Song, Zhaobin; Zhou, Caiquan; Jiang, Jianping
2016-03-01
Species of the genus Gynandropaa within the family Dicroglossidae are typical spiny frogs whose taxonomic status has long been in doubt. We used integrative methods, involving morphological and molecular analyses, to elucidate the phylogenetic relationships, and to determine identities and the geographic distribution of each valid species. We obtained DNA sequence data of 5 species of Gynandropaa (complete sequences of the mitochondrial NADH dehydrogenase subunit 2 [ND2] gene, and 890 bp of 12S rRNA and 16S rRNA partial sequences) from 37 localities (including the topotypes of 5 described species) and constructed Bayesian and maximum-likelihood trees to examine the patterns of phylogeography. A total of 28 morphological variables were taken on 624 specimens. Three clades with clear geographic patterns were recognized: clade C (from south-western Sichuan Province and central Yunnan Province), clade E (western Guizhou Province and eastern to central Yunnan Province) and clade W (western to southern Yunnan Province). Integrating morphological characteristics and distribution information, the clades W, E and C represent Gynandropaa yunnanensis, G. phrynoides and G. sichuanensis, respectively. We draw the following conclusions: (i) the taxon G. phrynoides, formerly evaluated as a junior synonym of G. yunnanensis, is revalidated herein at the rank of species; (ii) G. liui is a junior synonym of G. sichuanensis; and (iii) G. yunnanensis is a valid species while G. bourreti is probably a subspecies of G. yunnanensis, with the distribution range from Vietnam to southern Yunnan Province. This study clears up the taxonomic status of Gynandropaa and provides important information for understanding the evolution and conservation of these spiny frogs. © 2015 International Society of Zoological Sciences, Institute of Zoology/Chinese Academy of Sciences and John Wiley & Sons Australia, Ltd.
2010-01-01
Background: The maturing field of genomics is rapidly increasing the number of sequenced genomes and producing more information from those previously sequenced. Much of this additional information is variation data derived from sampling multiple individuals of a given species with the goal of discovering new variants and characterising the population frequencies of the variants that are already known. These data have immense value for many studies, including those designed to understand evolution and connect genotype to phenotype. Maximising the utility of the data requires that it be stored in an accessible manner that facilitates the integration of variation data with other genome resources such as gene annotation and comparative genomics. Description: The Ensembl project provides comprehensive and integrated variation resources for a wide variety of chordate genomes. This paper provides a detailed description of the sources of data and the methods for creating the Ensembl variation databases. It also explores the utility of the information by explaining the range of query options available, from using interactive web displays, to online data mining tools and connecting directly to the data servers programmatically. It gives a good overview of the variation resources and future plans for expanding the variation data within Ensembl. Conclusions: Variation data is an important key to understanding the functional and phenotypic differences between individuals. The development of new sequencing and genotyping technologies is greatly increasing the amount of variation data known for almost all genomes. The Ensembl variation resources are integrated into the Ensembl genome browser and provide a comprehensive way to access this data in the context of a widely used genome bioinformatics system. All Ensembl data is freely available at http://www.ensembl.org and from the public MySQL database server at ensembldb.ensembl.org. PMID:20459805
Chromosomal Speciation in the Genomics Era: Disentangling Phylogenetic Evolution of Rock-wallabies.
Potter, Sally; Bragg, Jason G; Blom, Mozes P K; Deakin, Janine E; Kirkpatrick, Mark; Eldridge, Mark D B; Moritz, Craig
2017-01-01
The association of chromosome rearrangements (CRs) with speciation is well established, and there is a long history of theory and evidence relating to "chromosomal speciation." Genomic sequencing has the potential to provide new insights into how reorganization of genome structure promotes divergence, and in model systems has demonstrated reduced gene flow in rearranged segments. However, there are limits to what we can understand from a small number of model systems, which each only tell us about one episode of chromosomal speciation. Progressing from patterns of association between chromosome (and genic) change to understanding processes of speciation requires both comparative studies across diverse systems and integration of genome-scale sequence comparisons with other lines of evidence. Here, we showcase a promising example of chromosomal speciation in a non-model organism, the endemic Australian marsupial genus Petrogale. We present initial phylogenetic results from exon-capture that resolve a history of divergence associated with extensive and repeated CRs. Yet it remains challenging to disentangle gene tree heterogeneity caused by recent divergence and gene flow in this and other such recent radiations. We outline a way forward for better integration of comparative genomic sequence data with evidence from molecular cytogenetics, and analyses of shifts in the recombination landscape and potential disruption of meiotic segregation and epigenetic programming. In all likelihood, CRs impact multiple cellular processes and these effects need to be considered together, along with effects of genic divergence. Understanding the effects of CRs together with genic divergence will require development of more integrative theory and inference methods. Together, new data and analysis tools will combine to shed light on long-standing questions of how chromosome and genic divergence promote speciation.
Evolutionary interrogation of human biology in well-annotated genomic framework of rhesus macaque.
Zhang, Shi-Jian; Liu, Chu-Jun; Yu, Peng; Zhong, Xiaoming; Chen, Jia-Yu; Yang, Xinzhuang; Peng, Jiguang; Yan, Shouyu; Wang, Chenqu; Zhu, Xiaotong; Xiong, Jingwei; Zhang, Yong E; Tan, Bertrand Chin-Ming; Li, Chuan-Yun
2014-05-01
With genome sequence and composition highly analogous to human, rhesus macaque represents a unique reference for evolutionary studies of human biology. Here, we developed a comprehensive genomic framework of rhesus macaque, the RhesusBase2, for evolutionary interrogation of human genes and the associated regulations. A total of 1,667 next-generation sequencing (NGS) data sets were processed, integrated, and evaluated, generating 51.2 million new functional annotation records. With extensive NGS annotations, RhesusBase2 refined the fine-scale structures in 30% of the macaque Ensembl transcripts, reporting an accurate, up-to-date set of macaque gene models. On the basis of these annotations and accurate macaque gene models, we further developed an NGS-oriented Molecular Evolution Gateway to access and visualize macaque annotations in reference to human orthologous genes and associated regulations (www.rhesusbase.org/molEvo). We highlighted the application of this well-annotated genomic framework in generating hypothetical link of human-biased regulations to human-specific traits, by using mechanistic characterization of the DIEXF gene as an example that provides novel clues to the understanding of digestive system reduction in human evolution. On a global scale, we also identified a catalog of 9,295 human-biased regulatory events, which may represent novel elements that have a substantial impact on shaping human transcriptome and possibly underpin recent human phenotypic evolution. Taken together, we provide an NGS data-driven, information-rich framework that will broadly benefit genomics research in general and serves as an important resource for in-depth evolutionary studies of human biology.
Lu, Wei; Liu, Jun; Xin, Qiang; Wan, Lili; Hong, Dengfeng; Yang, Guangsheng
2013-01-01
Background and Aims Spontaneous male sterility is an advantageous trait for both constructing efficient pollination control systems and for understanding the developmental process of the male reproductive unit in many crops. A triallelic genetic male-sterile locus (BnMs5) has been identified in Brassica napus; however, its complicated genome structure has greatly hampered the isolation of this locus. The aim of this study was to physically map BnMs5 through an integrated map-based cloning strategy and analyse the local chromosomal evolution around BnMs5. Methods A large F2 population was used to integrate the existing genetic maps around BnMs5. A map-based cloning strategy in combination with comparative mapping among B. napus, Arabidopsis, Brassica rapa and Brassica oleracea was employed to facilitate the identification of a target bacterial artificial chromosome (BAC) clone covering the BnMs5 locus. The genomic sequences from the Brassica species were analysed to reveal the regional chromosomal evolution around BnMs5. Key Results BnMs5 was finally delimited to a 0·3-cM genetic fragment from an integrated local genetic map, and was anchored on the B. napus A8 chromosome. Screening of a B. napus BAC clone library and identification of the positive clones validated that JBnB034L06 was the target BAC clone. The closest flanking markers restrict BnMs5 to a 21-kb region on JBnB034L06 containing six predicted functional genes. Good collinearity relationship around BnMs5 between several Brassica species was observed, while violent chromosomal evolutionary events including insertions/deletions, duplications and single nucleotide mutations were also found to have extensively occurred during their divergence. Conclusions This work represents major progress towards the molecular cloning of BnMs5, as well as presenting a powerful, integrative method to mapping loci in plants with complex genomic architecture, such as the amphidiploid B. napus. PMID:23243189
Integrated genome sequence and linkage map of physic nut (Jatropha curcas L.), a biodiesel plant.
Wu, Pingzhi; Zhou, Changpin; Cheng, Shifeng; Wu, Zhenying; Lu, Wenjia; Han, Jinli; Chen, Yanbo; Chen, Yan; Ni, Peixiang; Wang, Ying; Xu, Xun; Huang, Ying; Song, Chi; Wang, Zhiwen; Shi, Nan; Zhang, Xudong; Fang, Xiaohua; Yang, Qing; Jiang, Huawu; Chen, Yaping; Li, Meiru; Wang, Ying; Chen, Fan; Wang, Jun; Wu, Guojiang
2015-03-01
The family Euphorbiaceae includes some of the most efficient biomass accumulators. Whole genome sequencing and the development of genetic maps of these species are important components in molecular breeding and genetic improvement. Here we report the draft genome of physic nut (Jatropha curcas L.), a biodiesel plant. The assembled genome has a total length of 320.5 Mbp and contains 27,172 putative protein-coding genes. We established a linkage map containing 1208 markers and anchored the genome assembly (81.7%) to this map to produce 11 pseudochromosomes. After gene family clustering, 15,268 families were identified, of which 13,887 existed in the castor bean genome. Analysis of the genome highlighted specific expansion and contraction of a number of gene families during the evolution of this species, including the ribosome-inactivating proteins and oil biosynthesis pathway enzymes. The genomic sequence and linkage map provide a valuable resource not only for fundamental and applied research on physic nut but also for evolutionary and comparative genomics analysis, particularly in the Euphorbiaceae. © 2015 The Authors The Plant Journal © 2015 John Wiley & Sons Ltd.
Papaemmanuil, Elli; Rapado, Inmaculada; Li, Yilong; Potter, Nicola E; Wedge, David C; Tubio, Jose; Alexandrov, Ludmil B; Van Loo, Peter; Cooke, Susanna L; Marshall, John; Martincorena, Inigo; Hinton, Jonathan; Gundem, Gunes; van Delft, Frederik W; Nik-Zainal, Serena; Jones, David R; Ramakrishna, Manasa; Titley, Ian; Stebbings, Lucy; Leroy, Catherine; Menzies, Andrew; Gamble, John; Robinson, Ben; Mudie, Laura; Raine, Keiran; O’Meara, Sarah; Teague, Jon W; Butler, Adam P; Cazzaniga, Giovanni; Biondi, Andrea; Zuna, Jan; Kempski, Helena; Muschen, Markus; Ford, Anthony M; Stratton, Michael R; Greaves, Mel; Campbell, Peter J
2014-01-01
The ETV6-RUNX1 fusion gene, found in 25% of childhood acute lymphoblastic leukemia (ALL), is acquired in utero but requires additional somatic mutations for overt leukemia. We used exome and low-coverage whole-genome sequencing to characterize secondary events associated with leukemic transformation. RAG-mediated deletions emerge as the dominant mutational process, characterized by recombination signal sequence motifs near the breakpoints; incorporation of non-templated sequence at the junction; ~30-fold enrichment at promoters and enhancers of genes actively transcribed in B-cell development and an unexpectedly high ratio of recurrent to non-recurrent structural variants. Single cell tracking shows that this mechanism is active throughout leukemic evolution with evidence of localized clustering and re-iterated deletions. Integration of point mutation and rearrangement data identifies ATF7IP and MGA as two new tumor suppressor genes in ALL. Thus, a remarkably parsimonious mutational process transforms ETV6-RUNX1 lymphoblasts, targeting the promoters, enhancers and first exons of genes that normally regulate B-cell differentiation. PMID:24413735
Insights into hominid evolution from the gorilla genome sequence
Scally, Aylwyn; Dutheil, Julien Y.; Hillier, LaDeana W.; Jordan, Greg E.; Goodhead, Ian; Herrero, Javier; Hobolth, Asger; Lappalainen, Tuuli; Mailund, Thomas; Marques-Bonet, Tomas; McCarthy, Shane; Montgomery, Stephen H.; Schwalie, Petra C.; Tang, Y. Amy; Ward, Michelle C.; Xue, Yali; Yngvadottir, Bryndis; Alkan, Can; Andersen, Lars N.; Ayub, Qasim; Ball, Edward V.; Beal, Kathryn; Bradley, Brenda J.; Chen, Yuan; Clee, Chris M.; Fitzgerald, Stephen; Graves, Tina A.; Gu, Yong; Heath, Paul; Heger, Andreas; Karakoc, Emre; Kolb-Kokocinski, Anja; Laird, Gavin K.; Lunter, Gerton; Meader, Stephen; Mort, Matthew; Mullikin, James C.; Munch, Kasper; O’Connor, Timothy D.; Phillips, Andrew D.; Prado-Martinez, Javier; Rogers, Anthony S.; Sajjadian, Saba; Schmidt, Dominic; Shaw, Katy; Simpson, Jared T.; Stenson, Peter D.; Turner, Daniel J.; Vigilant, Linda; Vilella, Albert J.; Whitener, Weldon; Zhu, Baoli; Cooper, David N.; de Jong, Pieter; Dermitzakis, Emmanouil T.; Eichler, Evan E.; Flicek, Paul; Goldman, Nick; Mundy, Nicholas I.; Ning, Zemin; Odom, Duncan T.; Ponting, Chris P.; Quail, Michael A.; Ryder, Oliver A.; Searle, Stephen M.; Warren, Wesley C.; Wilson, Richard K.; Schierup, Mikkel H.; Rogers, Jane; Tyler-Smith, Chris; Durbin, Richard
2012-01-01
Summary Gorillas are humans’ closest living relatives after chimpanzees, and are of comparable importance for the study of human origins and evolution. Here we present the assembly and analysis of a genome sequence for the western lowland gorilla, and compare the whole genomes of all extant great ape genera. We propose a synthesis of genetic and fossil evidence consistent with placing the human-chimpanzee and human-chimpanzee-gorilla speciation events at approximately 6 and 10 million years ago (Mya). In 30% of the genome, gorilla is closer to human or chimpanzee than the latter are to each other; this is rarer around coding genes, indicating pervasive selection throughout great ape evolution, and has functional consequences in gene expression. A comparison of protein coding genes reveals approximately 500 genes showing accelerated evolution on each of the gorilla, human and chimpanzee lineages, and evidence for parallel acceleration, particularly of genes involved in hearing. We also compare the western and eastern gorilla species, estimating an average sequence divergence time 1.75 million years ago, but with evidence for more recent genetic exchange and a population bottleneck in the eastern species. The use of the genome sequence in these and future analyses will promote a deeper understanding of great ape biology and evolution. PMID:22398555
Target Site Recognition by a Diversity-Generating Retroelement
Guo, Huatao; Tse, Longping V.; Nieh, Angela W.; Czornyj, Elizabeth; Williams, Steven; Oukil, Sabrina; Liu, Vincent B.; Miller, Jeff F.
2011-01-01
Diversity-generating retroelements (DGRs) are in vivo sequence diversification machines that are widely distributed in bacterial, phage, and plasmid genomes. They function to introduce vast amounts of targeted diversity into protein-encoding DNA sequences via mutagenic homing. Adenine residues are converted to random nucleotides in a retrotransposition process from a donor template repeat (TR) to a recipient variable repeat (VR). Using the Bordetella bacteriophage BPP-1 element as a prototype, we have characterized requirements for DGR target site function. Although sequences upstream of VR are dispensable, a 24 bp sequence immediately downstream of VR, which contains short inverted repeats, is required for efficient retrohoming. The inverted repeats form a hairpin or cruciform structure and mutational analysis demonstrated that, while the structure of the stem is important, its sequence can vary. In contrast, the loop has a sequence-dependent function. Structure-specific nuclease digestion confirmed the existence of a DNA hairpin/cruciform, and marker coconversion assays demonstrated that it influences the efficiency, but not the site of cDNA integration. Comparisons with other phage DGRs suggested that similar structures are a conserved feature of target sequences. Using a kanamycin resistance determinant as a reporter, we found that transplantation of the IMH and hairpin/cruciform-forming region was sufficient to target the DGR diversification machinery to a heterologous gene. In addition to furthering our understanding of DGR retrohoming, our results suggest that DGRs may provide unique tools for directed protein evolution via in vivo DNA diversification. PMID:22194701
Wang, Chuan; Zhang, Chaowu; Pei, Xiaofang; Liu, Hengchuan
2007-11-01
To support further application and study, a strain of Lactobacillus delbrueckii subsp. bulgaricus (wch9901), isolated from yoghurt and previously identified by phenotypic characterization, was identified by 16S rDNA sequencing and phylogenetic analysis. The 16S rDNA of wch9901 was amplified using wch9901 genomic DNA as the template and conserved 16S rDNA sequences as primers. The amplified 16S rDNA was ligated into the cloning vector pGEM-T with T4 DNA ligase to construct the recombinant plasmid pGEM-wch9901 16S rDNA. The recombinant plasmid was verified by restriction enzyme digestion, and the verified plasmid was submitted to a sequencing company for DNA sequencing. The nucleotide sequence was compared against GenBank using BLAST, and a phylogenetic tree was constructed by the neighbor-joining method (a distance method) in MEGA 3.1. The blastn results showed that the 16S rDNA of wch9901 shared more than 96% homology with the 16S rDNA of Lactobacillus delbrueckii subsp. bulgaricus strains. On the phylogenetic tree, wch9901 formed a separate branch located between the Lactobacillus delbrueckii subsp. bulgaricus LGM2 branch and another branch composed of the Lactobacillus delbrueckii subsp. bulgaricus DL2 and JSQ clusters. The wch9901 branch was closest to the LGM2 branch. These results indicate that wch9901 belongs to Lactobacillus delbrueckii subsp. bulgaricus and is most closely related to strain LGM2.
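The similarity figure reported in the abstract above (more than 96% between 16S rDNA sequences) is the kind of value a pairwise identity calculation produces. The following is a minimal sketch, not the procedure used by the authors or by blastn: it assumes the two sequences are already aligned to equal length, and it skips columns containing a gap, which is only one of several conventions for computing identity. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over ungapped columns of two pre-aligned sequences.

    Assumption: columns where either sequence has a gap are ignored.
    Other tools may count gaps in the denominator instead.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy alignment (hypothetical data): 9 ungapped columns, 8 of them identical.
identity = percent_identity("ACGT-ACGTA", "ACGTTACGAA")
print(f"{identity:.1f}% identity")
```

Because the gap convention changes the denominator, values from this sketch are not directly comparable with the percentages reported by alignment tools.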
Plechakova, Olga; Tranchant-Dubreuil, Christine; Benedet, Fabrice; Couderc, Marie; Tinaut, Alexandra; Viader, Véronique; De Block, Petra; Hamon, Perla; Campa, Claudine; de Kochko, Alexandre; Hamon, Serge; Poncet, Valérie
2009-01-01
Background In the past few years, functional genomics information has been rapidly accumulating on Rubiaceae species and especially on those belonging to the Coffea genus (coffee trees). An increasing number of expressed sequence tag (EST) data and EST- or genomic-derived microsatellite markers have been generated, together with Conserved Ortholog Set (COS) markers. This considerably facilitates comparative genomics or map-based genetic studies through the common use of orthologous loci across different species. Similar genomic information is available for, e.g., tomato or potato, members of the Solanaceae family. Since both Rubiaceae and Solanaceae belong to the Euasterids I (lamiids), integration of information on genetic markers would be possible and lead to more efficient analyses and discovery of key loci involved in important traits such as fruit development, quality, and maturation, or adaptation. Our goal was to develop a comprehensive web data source for integrated information on validated orthologous markers in Rubiaceae. Description MoccaDB is an online MySQL-PHP driven relational database that houses annotated and/or mapped microsatellite markers in Rubiaceae. In its current release, the database stores 638 markers that have been defined on 259 ESTs and 379 genomic sequences. Marker information was retrieved from 11 published works, and completed with original data on 132 microsatellite markers validated in our laboratory. DNA sequences were derived from three Coffea species/hybrids. Microsatellite markers were checked for similarity, in vitro tested for cross-amplification and diversity/polymorphism status in up to 38 Rubiaceae species belonging to the Cinchonoideae and Rubioideae subfamilies. Functional annotation was provided and some markers associated with described metabolic pathways were also integrated. Users can search the database for marker, sequence, map or diversity information through multi-option query forms.
The retrieved data can be browsed and downloaded, along with protocols used, using a standard web browser. MoccaDB also integrates bioinformatics tools (CMap viewer and local BLAST) and hyperlinks to related external data sources (NCBI GenBank and PubMed, SOL Genomic Network database). Conclusion We believe that MoccaDB will be extremely useful for all researchers working in the areas of comparative and functional genomics and molecular evolution, in general, and population analysis and association mapping of Rubiaceae and Solanaceae species, in particular. PMID:19788737
Ginkgo and Welwitschia Mitogenomes Reveal Extreme Contrasts in Gymnosperm Mitochondrial Evolution.
Guo, Wenhu; Grewe, Felix; Fan, Weishu; Young, Gregory J; Knoop, Volker; Palmer, Jeffrey D; Mower, Jeffrey P
2016-06-01
Mitochondrial genomes (mitogenomes) of flowering plants are well known for their extreme diversity in size, structure, gene content, and rates of sequence evolution and recombination. In contrast, little is known about mitogenomic diversity and evolution within gymnosperms. Only a single complete genome sequence is available, from the cycad Cycas taitungensis, while limited information is available for the one draft sequence, from Norway spruce (Picea abies). To examine mitogenomic evolution in gymnosperms, we generated complete genome sequences for the ginkgo tree (Ginkgo biloba) and a gnetophyte (Welwitschia mirabilis). There is great disparity in size, sequence conservation, levels of shared DNA, and functional content among gymnosperm mitogenomes. The Cycas and Ginkgo mitogenomes are relatively small, have low substitution rates, and possess numerous genes, introns, and edit sites; we infer that these properties were present in the ancestral seed plant. By contrast, the Welwitschia mitogenome has an expanded size coupled with accelerated substitution rates and extensive loss of these functional features. The Picea genome has expanded further, to more than 4 Mb. With regard to structural evolution, the Cycas and Ginkgo mitogenomes share a remarkable amount of intergenic DNA, which may be related to the limited recombinational activity detected at repeats in Ginkgo. Conversely, the Welwitschia mitogenome shares almost no intergenic DNA with any other seed plant. By conducting the first measurements of rates of DNA turnover in seed plant mitogenomes, we discovered that turnover rates vary by orders of magnitude among species. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Evidence for Widespread Reticulate Evolution within Human Duplicons
Jackson, Michael S. ; Oliver, Karen ; Loveland, Jane ; Humphray, Sean ; Dunham, Ian ; Rocchi, Mariano ; Viggiano, Luigi ; Park, Jonathan P. ; Hurles, Matthew E. ; Santibanez-Koref, Mauro
2005-01-01
Approximately 5% of the human genome consists of segmental duplications that can cause genomic mutations and may play a role in gene innovation. Reticulate evolutionary processes, such as unequal crossing-over and gene conversion, are known to occur within specific duplicon families, but the broader contribution of these processes to the evolution of human duplications remains poorly characterized. Here, we use phylogenetic profiling to analyze multiple alignments of 24 human duplicon families that span >8 Mb of DNA. Our results indicate that none of them are evolving independently, with all alignments showing sharp discontinuities in phylogenetic signal consistent with reticulation. To analyze these results in more detail, we have developed a quartet method that estimates the relative contribution of nucleotide substitution and reticulate processes to sequence evolution. Our data indicate that most of the duplications show a highly significant excess of sites consistent with reticulate evolution, compared with the number expected by nucleotide substitution alone, with 15 of 30 alignments showing a >20-fold excess over that expected. Using permutation tests, we also show that at least 5% of the total sequence shares 100% sequence identity because of reticulation, a figure that includes 74 independent tracts of perfect identity >2 kb in length. Furthermore, analysis of a subset of alignments indicates that the density of reticulation events is as high as 1 every 4 kb. These results indicate that phylogenetic relationships within recently duplicated human DNA can be rapidly disrupted by reticulate evolution. This finding has important implications for efforts to finish the human genome sequence, complicates comparative sequence analysis of duplicon families, and could profoundly influence the tempo of gene-family evolution. PMID:16252241
NASA Astrophysics Data System (ADS)
Mananga, Eugene S.; Reid, Alicia E.
2013-01-01
This paper presents a study of finite pulse widths for the BABA pulse sequence using the Floquet-Magnus expansion (FME) approach. In the FME scheme, the first order ? is identical to its counterparts in average Hamiltonian theory (AHT) and Floquet theory (FT). However, the timing part in the FME approach is introduced via the ? function not present in other schemes. This function provides an easy way for evaluating the spin evolution during 'the time in between' through the Magnus expansion of the operator connected to the timing part of the evolution. The evaluation of ? is particularly useful for the analysis of the non-stroboscopic evolution. Here, the importance of the boundary conditions, which provide a natural choice of ? , is ignored. This work uses the ? function to compare the efficiency of the BABA pulse sequence with ? and the BABA pulse sequence with finite pulses. Calculations of ? and ? are presented.
The Use of Weighted Graphs for Large-Scale Genome Analysis
Zhou, Fang; Toivonen, Hannu; King, Ross D.
2014-01-01
There is an acute need for better tools to extract knowledge from the growing flood of sequence data. For example, thousands of complete genomes have been sequenced, and their metabolic networks inferred. Such data should enable a better understanding of evolution. However, most existing network analysis methods are based on pair-wise comparisons, and these do not scale to thousands of genomes. Here we propose the use of weighted graphs as a data structure to enable large-scale phylogenetic analysis of networks. We have developed three types of weighted graph for enzymes: taxonomic (these summarize phylogenetic importance), isoenzymatic (these summarize enzymatic variety/redundancy), and sequence-similarity (these summarize sequence conservation); and we applied these types of weighted graph to survey prokaryotic metabolism. To demonstrate the utility of this approach we have compared and contrasted the large-scale evolution of metabolism in Archaea and Eubacteria. Our results provide evidence for limits to the contingency of evolution. PMID:24619061
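The abstract above proposes weighted graphs as a way to summarize many genomes at once instead of comparing them pairwise. The following is a minimal sketch of that general idea only, not the authors' construction: it assumes each genome's network is given as a list of undirected enzyme-enzyme edges, and it sets each edge weight to the number of genomes in which that edge occurs. The function name, the weighting rule, and the toy data are all hypothetical.

```python
from collections import Counter


def build_weighted_graph(genome_networks):
    """Merge per-genome edge lists into one weighted edge table.

    Assumption: weight = number of genomes that contain the edge.
    A single pass over the genomes replaces all-against-all comparison.
    """
    weights = Counter()
    for edges in genome_networks.values():
        # Sort each pair so (a, b) and (b, a) are the same undirected edge,
        # and use a set so an edge counts at most once per genome.
        unique_edges = {tuple(sorted(edge)) for edge in edges}
        weights.update(unique_edges)
    return dict(weights)


# Toy input (hypothetical): three genomes with small enzyme networks.
toy_networks = {
    "genome_A": [("EC 1.1.1.1", "EC 2.7.1.1"), ("EC 2.7.1.1", "EC 4.1.2.13")],
    "genome_B": [("EC 2.7.1.1", "EC 1.1.1.1")],
    "genome_C": [("EC 2.7.1.1", "EC 4.1.2.13")],
}

graph = build_weighted_graph(toy_networks)
for edge, weight in sorted(graph.items()):
    print(edge, weight)
```

The cost of this aggregation grows linearly with the number of genomes, which is the scaling property the abstract contrasts with pairwise methods.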
Adiabatic Mass Loss Model in Binary Stars
NASA Astrophysics Data System (ADS)
Ge, H. W.
2012-07-01
Rapid mass transfer in interacting binary systems is very complicated. It relates to two basic problems in binary star evolution, i.e., dynamically unstable Roche-lobe overflow and common envelope evolution. Both problems are very important and difficult to model. In this PhD thesis, we focus on the rapid mass loss process of the donor in interacting binary systems. Applications to the criterion for dynamically unstable mass transfer and to common envelope evolution are also included. Our results based on the adiabatic mass loss model could be used to improve binary evolution theory, the binary population synthesis method, and other related aspects. We build the adiabatic mass loss model on two approximations. The first is that energy generation and heat flow through the stellar interior can be neglected, hence the restructuring is adiabatic. The second is that the stellar interior remains in hydrostatic equilibrium. We model this response by constructing model sequences, beginning with a donor star filling its Roche lobe at an arbitrary point in its evolution, holding its specific entropy and composition profiles fixed. These approximations are validated by comparison with time-dependent binary mass transfer calculations and the polytropic model for low-mass zero-age main-sequence stars. In dynamical time scale mass transfer, the adiabatic response of the donor star drives it to expand beyond its Roche lobe, leading to runaway mass transfer and the formation of a common envelope with its companion star. For donor stars with surface convection zones of any significant depth, this runaway condition is encountered early in mass transfer, if at all; but for main sequence stars with radiative envelopes, it may be encountered after a prolonged phase of thermal time scale mass transfer, the so-called delayed dynamical instability.
We identify the critical binary mass ratio for the onset of dynamical time scale mass transfer; if the ratio of donor to accretor masses exceeds this critical value, dynamical time scale mass transfer ensues. The resulting grid of criteria for all stars can serve as basic input for the binary population synthesis method, which it should improve. In common envelope evolution, the dissipation of orbital energy of the binary provides the energy to eject the common envelope; the energy budget for this process essentially consists of the initial orbital energy of the binary and the initial binding energies of the binary components. We emphasize that, because stellar core and envelope contribute mutually to each other's gravitational potential energy, proper evaluation of the total energy of a star requires integration over the entire stellar interior, not the ejected envelope alone as commonly assumed. We show that the change in total energy of the donor star, as a function of its remaining mass along an adiabatic mass-loss sequence, can be calculated. This change in total energy of the donor star, combined with the requirement that both remnant donor and its companion star fit within their respective Roche lobes, then circumscribes energetically possible survivors of common envelope evolution. This is the first time that the total energy of the donor star in common envelope evolution can be calculated accurately, whereas the results obtained with the old method are inconsistent with observations.
Extensive Concerted Evolution of Rice Paralogs and the Road to Regaining Independence
Wang, Xiyin; Tang, Haibao; Bowers, John E.; Feltus, Frank A.; Paterson, Andrew H.
2007-01-01
Many genes duplicated by whole-genome duplications (WGDs) are more similar to one another than expected. We investigated whether concerted evolution through conversion and crossing over, well-known to affect tandem gene clusters, also affects dispersed paralogs. Genome sequences for two Oryza subspecies reveal appreciable gene conversion in the ∼0.4 MY since their divergence, with a gradual progression toward independent evolution of older paralogs. Since divergence from subspecies indica, ∼8% of japonica paralogs produced 5–7 MYA on chromosomes 11 and 12 have been affected by gene conversion and several reciprocal exchanges of chromosomal segments, while ∼70-MY-old “paleologs” resulting from a genome duplication (GD) show much less conversion. Sequence similarity analysis in proximal gene clusters also suggests more conversion between younger paralogs. About 8% of paleologs may have been converted since rice–sorghum divergence ∼41 MYA. Domain-encoding sequences are more frequently converted than nondomain sequences, suggesting a sort of circularity—that sequences conserved by selection may be further conserved by relatively frequent conversion. The higher level of concerted evolution in the 5–7 MY-old segmental duplication may reflect the behavior of many genomes within the first few million years after duplication or polyploidization. PMID:18039882
Conserved noncoding sequences conserve biological networks and influence genome evolution.
Xie, Jianbo; Qian, Kecheng; Si, Jingna; Xiao, Liang; Ci, Dong; Zhang, Deqiang
2018-05-01
Comparative genomics approaches have identified numerous conserved cis-regulatory sequences near genes in plant genomes. Despite the identification of these conserved noncoding sequences (CNSs), our knowledge of their functional importance and selection remains limited. Here, we used a combination of DNA methylome analysis, microarray expression analyses, and functional annotation to study these sequences in the model tree Populus trichocarpa. Methylation in CG contexts and non-CG contexts was lower in CNSs, particularly CNSs in the 5'-upstream regions of genes, compared with other sites in the genome. We observed that CNSs are enriched in genes with transcription and binding functions, and that this enrichment is also associated with syntenic genes and those from whole-genome duplications, suggesting that cis-regulatory sequences play a key role in genome evolution. We detected a significant positive correlation between CNS number and protein interactions, suggesting that CNSs may have roles in the evolution and maintenance of biological networks. The divergence of CNSs indicates that duplication-degeneration-complementation drives the subfunctionalization of a proportion of duplicated genes from whole-genome duplication. Furthermore, population genomics confirmed that most CNSs are under strong purifying selection and only a small subset of CNSs shows evidence of adaptive evolution. These findings provide a foundation for future studies exploring these key genomic features in the maintenance of biological networks, local adaptation, and transcription.
Hidden long evolutionary memory in a model biochemical network
NASA Astrophysics Data System (ADS)
Ali, Md. Zulfikar; Wingreen, Ned S.; Mukhopadhyay, Ranjan
2018-04-01
We introduce a minimal model for the evolution of functional protein-interaction networks using a sequence-based mutational algorithm, and apply the model to study neutral drift in networks that yield oscillatory dynamics. Starting with a functional core module, random evolutionary drift increases network complexity even in the absence of specific selective pressures. Surprisingly, we uncover a hidden order in sequence space that gives rise to long-term evolutionary memory, implying strong constraints on network evolution due to the topology of accessible sequence space.
Genes involved in convergent evolution of eusociality in bees
Woodard, S. Hollis; Fischman, Brielle J.; Venkat, Aarti; Hudson, Matt E.; Varala, Kranthi; Cameron, Sydney A.; Clark, Andrew G.; Robinson, Gene E.
2011-01-01
Eusociality has arisen independently at least 11 times in insects. Despite this convergence, there are striking differences among eusocial lifestyles, ranging from species living in small colonies with overt conflict over reproduction to species in which colonies contain hundreds of thousands of highly specialized sterile workers produced by one or a few queens. Although the evolution of eusociality has been intensively studied, the genetic changes involved in the evolution of eusociality are relatively unknown. We examined patterns of molecular evolution across three independent origins of eusociality by sequencing transcriptomes of nine socially diverse bee species and combining these data with genome sequence from the honey bee Apis mellifera to generate orthologous sequence alignments for 3,647 genes. We found a shared set of 212 genes with a molecular signature of accelerated evolution across all eusocial lineages studied, as well as unique sets of 173 and 218 genes with a signature of accelerated evolution specific to either highly or primitively eusocial lineages, respectively. These results demonstrate that convergent evolution can involve a mosaic pattern of molecular changes in both shared and lineage-specific sets of genes. Genes involved in signal transduction, gland development, and carbohydrate metabolism are among the most prominent rapidly evolving genes in eusocial lineages. These findings provide a starting point for linking specific genetic changes to the evolution of eusociality. PMID:21482769
Papanicolaou, Alexie; Schetelig, Marc F; Arensburger, Peter; Atkinson, Peter W; Benoit, Joshua B; Bourtzis, Kostas; Castañera, Pedro; Cavanaugh, John P; Chao, Hsu; Childers, Christopher; Curril, Ingrid; Dinh, Huyen; Doddapaneni, HarshaVardhan; Dolan, Amanda; Dugan, Shannon; Friedrich, Markus; Gasperi, Giuliano; Geib, Scott; Georgakilas, Georgios; Gibbs, Richard A; Giers, Sarah D; Gomulski, Ludvik M; González-Guzmán, Miguel; Guillem-Amat, Ana; Han, Yi; Hatzigeorgiou, Artemis G; Hernández-Crespo, Pedro; Hughes, Daniel S T; Jones, Jeffery W; Karagkouni, Dimitra; Koskinioti, Panagiota; Lee, Sandra L; Malacrida, Anna R; Manni, Mosè; Mathiopoulos, Kostas; Meccariello, Angela; Munoz-Torres, Monica; Murali, Shwetha C; Murphy, Terence D; Muzny, Donna M; Oberhofer, Georg; Ortego, Félix; Paraskevopoulou, Maria D; Poelchau, Monica; Qu, Jiaxin; Reczko, Martin; Robertson, Hugh M; Rosendale, Andrew J; Rosselot, Andrew E; Saccone, Giuseppe; Salvemini, Marco; Savini, Grazia; Schreiner, Patrick; Scolari, Francesca; Siciliano, Paolo; Sim, Sheina B; Tsiamis, George; Ureña, Enric; Vlachos, Ioannis S; Werren, John H; Wimmer, Ernst A; Worley, Kim C; Zacharopoulou, Antigone; Richards, Stephen; Handler, Alfred M
2016-09-22
The Mediterranean fruit fly (medfly), Ceratitis capitata, is a major destructive insect pest due to its broad host range, which includes hundreds of fruits and vegetables. It exhibits a unique ability to invade and adapt to ecological niches throughout tropical and subtropical regions of the world, though medfly infestations have been prevented and controlled by the sterile insect technique (SIT) as part of integrated pest management programs (IPMs). The genetic analysis and manipulation of medfly has been subject to intensive study in an effort to improve SIT efficacy and other aspects of IPM control. The 479 Mb medfly genome is sequenced from adult flies from lines inbred for 20 generations. A high-quality assembly is achieved having a contig N50 of 45.7 kb and scaffold N50 of 4.06 Mb. In-depth curation of more than 1800 messenger RNAs shows specific gene expansions that can be related to invasiveness and host adaptation, including gene families for chemoreception, toxin and insecticide metabolism, cuticle proteins, opsins, and aquaporins. We identify genes relevant to IPM control, including those required to improve SIT. The medfly genome sequence provides critical insights into the biology of one of the most serious and widespread agricultural pests. This knowledge should significantly advance the means of controlling the size and invasive potential of medfly populations. Its close relationship to Drosophila, and other insect species important to agriculture and human health, will further comparative functional and structural studies of insect genomes that should broaden our understanding of gene family evolution.
Yu, Guoqin; Olsen, Kenneth M; Schaal, Barbara A
2011-01-01
The evolution of metabolic pathways is a fundamental but poorly understood aspect of evolutionary change. One approach for understanding the complexity of pathway evolution is to examine the molecular evolution of genes that together comprise an integrated metabolic pathway. The rice endosperm starch biosynthetic pathway is one of the most thoroughly characterized metabolic pathways in plants, and starch is a trait that has evolved in response to strong selection during rice domestication. In this study, we have examined six key genes (AGPL2, AGPS2b, SSIIa, SBEIIb, GBSSI, ISA1) in the rice endosperm starch biosynthesis pathway to investigate the evolution of these genes before and after rice domestication. Genome-wide sequence tagged sites data were used as a neutral reference to overcome the problems of detecting selection in species with complex demographic histories such as rice. Five variety groups of Oryza sativa (aus, indica, tropical japonica, temperate japonica, aromatic) and its wild ancestor (O. rufipogon) were sampled. Our results showed evidence of purifying selection at AGPL2 in O. rufipogon and strong evidence of positive selection at GBSSI in temperate japonica and tropical japonica varieties and at GBSSI and SBEIIb in aromatic varieties. All the other genes showed a pattern consistent with neutral evolution in both cultivated rice and its wild ancestor. These results indicate the important role of positive selection in the evolution of starch genes during rice domestication. We discuss the role of SBEIIb and GBSSI in the evolution of starch quality during rice domestication and the power and limitation of detecting selection using genome-wide data as a neutral reference.
Lappin, Fiona M; Shaw, Rebecca L; Macqueen, Daniel J
2016-12-01
High-throughput sequencing has revolutionised comparative and evolutionary genome biology. It has now become relatively commonplace to generate multiple genomes and/or transcriptomes to characterize the evolution of large taxonomic groups of interest. Nevertheless, such efforts may be unsuited to some research questions or remain beyond the scope of some research groups. Here we show that targeted high-throughput sequencing offers a viable alternative to study genome evolution across a vertebrate family of great scientific interest. Specifically, we exploited sequence capture and Illumina sequencing to characterize the evolution of key components from the insulin-like growth factor (IGF) signalling axis of salmonid fish at unprecedented phylogenetic resolution. The IGF axis represents a central governor of vertebrate growth and its core components were expanded by whole genome duplication in the salmonid ancestor ~95 Ma. Using RNA baits synthesised to genes encoding the complete family of IGF binding proteins (IGFBP) and an IGF hormone (IGF2), we captured, sequenced and assembled orthologous and paralogous exons from species representing all ten salmonid genera. This approach generated 299 novel sequences, most as complete or near-complete protein-coding sequences. Phylogenetic analyses confirmed congruent evolutionary histories for all nineteen recognized salmonid IGFBP family members and identified novel salmonid-specific IGF2 paralogues. Moreover, we reconstructed the evolution of duplicated IGF axis paralogues across a replete salmonid phylogeny, revealing complex historic selection regimes - both ancestral to salmonids and lineage-restricted - that frequently involved asymmetric paralogue divergence under positive and/or relaxed purifying selection. Our findings add to an emerging literature highlighting diverse applications for targeted sequencing in comparative-evolutionary genomics.
We also set out a viable approach to obtain large sets of nuclear genes for any member of the salmonid family, which should enable insights into the evolutionary role of whole genome duplication before additional nuclear genome sequences become available. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.
The Genome Sequence of Taurine Cattle: A Window to Ruminant Biology and Evolution
USDA-ARS's Scientific Manuscript database
As a major step toward understanding the biology and evolution of ruminants, the cattle genome was sequenced to ~7x coverage using a combined whole genome shotgun and BAC skim approach. The cattle genome contains a minimum of 22,000 genes, with a core set of 14,345 orthologs found in seven mammalian...
NASA Astrophysics Data System (ADS)
D'Elia, Leandro; Bilmes, Andrés; Franzese, Juan R.; Veiga, Gonzalo D.; Hernández, Mariano; Muravchik, Martín
2015-12-01
Long-lived rift basins are characterized by a complex structural and tectonic evolution. They present significant lateral and vertical stratigraphic variations that determine diverse basin patterns at different times, scales and locations. These variations make it difficult to establish facies models, correlations and stratal stacking patterns of the fault-related stratigraphy, especially when exploration of hydrocarbon plays proceeds in the subsurface of a basin. The present case study addresses the rift successions of the Neuquén Basin. This basin formed in response to continental extension that took place at the western margin of Gondwana during the Late Triassic-Early Jurassic. A tectono-stratigraphic analysis of the initial successions of the southern part of the Neuquén Basin was carried out. Three syn-rift sequences were determined. These syn-rift sequences were located in different extensional depocentres during the rifting phases. The specific periods of rifting show distinctly different structural and stratigraphic styles: from non-volcanic to volcanic successions and/or from continental to marine sedimentation. The results were compared with surface and subsurface interpretations performed for other depocentres of the basin, devising an integrated rifting scheme for the whole basin. The most widely accepted tectono-stratigraphic scheme, which takes the deposits of the first marine transgression (Cuyo Cycle) as indicating the onset of a post-rift phase, is reconsidered. In the southern part of the basin, the marine deposits (lower Cuyo Cycle) were integrated into the syn-rift phase, implying the existence of different tectonic signatures for the Cuyo Cycle along the basin. The rift climax becomes younger from north to south along the basin. The post-rift initiation followed the diachronic ending of the main syn-rift phase throughout the Neuquén Basin.
Thus, the post-rift stage started in the north and proceeded towards the south, constituting a diachronous post-rift event. This arrangement implies that the lower part of the Cuyo Cycle, traditionally related to regional thermal subsidence, may have been deposited during either mechanical or thermal subsidence, depending on its position within the basin.
Simone, Domenico; Bay, Denice C.; Leach, Thorin; Turner, Raymond J.
2013-01-01
Background: The twin-arginine translocation (Tat) protein export system enables the transport of fully folded proteins across a membrane. This system is composed of two integral membrane proteins belonging to the TatA and TatC protein families and, in some systems, a third component, TatB, a homolog of TatA. TatC participates in substrate protein recognition through its interaction with a twin-arginine leader peptide sequence. Methodology/Principal Findings: The aim of this study was to explore TatC diversity, evolution and sequence conservation in bacteria to identify how TatC is evolving and diversifying in various bacterial phyla. Surveying bacterial genomes revealed that 77% of all species possess one or more tatC loci and half of these classes possessed only tatC and tatA genes. Phylogenetic analysis of diverse TatC homologues showed that they were primarily inherited but identified a small subset of taxonomically unrelated bacteria that exhibited evidence supporting lateral gene transfer within an ecological niche. Examination of Bacilli tatCd/tatCy isoform operons identified a number of known and potentially new Tat substrate genes based on their frequent association with tatC loci. Evolutionary analysis of these Bacilli isoforms determined that TatCy was the progenitor of TatCd. A bacterial TatC consensus sequence was determined and highlighted conserved and variable regions within a three-dimensional model of the Escherichia coli TatC protein. Comparative analysis between the TatC consensus sequence and Bacilli TatCd/y isoform consensus sequences revealed unique sites that may contribute to isoform substrate specificity or make TatA-specific contacts. Synonymous to non-synonymous nucleotide substitution analyses of bacterial tatC homologues determined that tatC sequence variation differs dramatically between various classes and suggests TatC specialization in these species.
Conclusions/Significance: TatC proteins appear to be diversifying within particular bacterial classes, and their specialization may be driven by the substrates they transport and the environment of their host. PMID:24236045
Gómez, José María; Perfectti, Francisco; Klingenberg, Christian Peter
2014-08-19
Flowers of animal-pollinated plants are integrated structures shaped by the action of pollinator-mediated selection. It is widely assumed that pollination specialization increases the magnitude of floral integration. However, empirical evidence is still inconclusive. In this study, we explored the role of pollinator diversity in shaping the evolution of corolla-shape integration in Erysimum, a plant genus with generalized pollination systems. We quantified floral integration in Erysimum using geometric morphometrics and explored its evolution using phylogenetic comparative methods. Corolla-shape integration was low but significantly different from zero in all study species. Spatial autocorrelation and phylogenetic signal in corolla-shape integration were not detected. In addition, integration in Erysimum seems to have evolved in a way that is consistent with Brownian motion, but with frequent convergent evolution. Corolla-shape integration was negatively associated with the number of pollinators visiting the flowers of each Erysimum species. That is, it was lower in those species having a more generalized pollination system. This negative association may occur because the co-occurrence of many pollinators imposes conflicting selection and cancels out any consistent selection on specific floral traits, preventing the evolution of highly integrated flowers. © 2014 The Author(s) Published by the Royal Society. All rights reserved.
Primate-Specific Evolution of an LDLR Enhancer
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, Qian-fei; Prabhakar, Shyam; Wang, Qianben
2006-06-28
Sequence changes in regulatory regions have often been invoked to explain phenotypic divergence among species, but molecular examples of this have been difficult to obtain. In this study, we identified an anthropoid primate-specific sequence element that contributed to the regulatory evolution of the LDL receptor. Using a combination of close and distant species genomic sequence comparisons coupled with in vivo and in vitro studies, we show that a functional cholesterol-sensing sequence motif arose and was fixed within a pre-existing enhancer in the common ancestor of anthropoid primates. Our study demonstrates one molecular mechanism by which ancestral mammalian regulatory elements can evolve to perform new functions in the primate lineage leading to human.
Senerchia, Natacha; Wicker, Thomas; Felber, François; Parisod, Christian
2013-01-01
Transposable elements (TEs) represent a major fraction of plant genomes and drive their evolution. An improved understanding of genome evolution requires the dynamics of a large number of TE families to be considered. We put forward an approach that bypasses the need for a complete reference genome and assesses the evolutionary trajectories of high copy number TE families from genome snapshots obtained by high-throughput sequencing. Low coverage sequencing of the complex genomes of Aegilops cylindrica and Ae. geniculata using 454 identified more than 70% of the sequences as known TEs, mainly long terminal repeat (LTR) retrotransposons. Comparing the abundance of reads as well as patterns of sequence diversity and divergence within and among genomes allowed us to assess the dynamics of 44 major LTR retrotransposon families of the 165 identified. In particular, molecular population genetics on individual TE copies distinguished recently active from quiescent families and highlighted different evolutionary trajectories of retrotransposons among related species. This work presents a suite of tools suitable for current sequencing data, making it possible to address the genome-wide evolutionary dynamics of TEs at the family level and advancing our understanding of the evolution of nonmodel genomes.
Ultra-deep mutant spectrum profiling: improving sequencing accuracy using overlapping read pairs.
Chen-Harris, Haiyin; Borucki, Monica K; Torres, Clinton; Slezak, Tom R; Allen, Jonathan E
2013-02-12
High throughput sequencing is beginning to make a transformative impact in the area of viral evolution. Deep sequencing has the potential to reveal the mutant spectrum within a viral sample at high resolution, thus enabling the close examination of viral mutational dynamics both within and between hosts. The challenge, however, is to accurately model the errors in the sequencing data and differentiate real viral mutations, particularly those that exist at low frequencies, from sequencing errors. We demonstrate that overlapping read pairs (ORP) -- generated by combining short fragment sequencing libraries and longer sequencing reads -- significantly reduce sequencing error rates and improve rare variant detection accuracy. Using this sequencing protocol and an error model optimized for variant detection, we are able to capture a large number of genetic mutations present within a viral population at ultra-low frequency levels (<0.05%). Our rare variant detection strategies have important implications beyond viral evolution and can be applied to any basic and clinical research area that requires the identification of rare mutations.
Concerted evolution at the population level: pupfish HindIII satellite DNA sequences.
Elder, J F; Turner, B J
1994-01-01
The canonical monomers (approximately 170 bp) of an abundant (1.9 x 10(6) copies per diploid genome) satellite DNA sequence family in the genome of Cyprinodon variegatus, a "pupfish" that ranges along the Atlantic coast from Cape Cod to central Mexico, are divergent in base sequence in 10 of 12 samples collected from natural populations. The divergence involves substitutions, deletions, and insertions, is marked in scope (mean pairwise sequence similarity = 61.6%; range = 35-95.9%), is largely confined to the 3' half of the monomer, and is not correlated with the distance among collecting sites. Repetitive cloning and direct genomic sequencing experiments failed to detect intrapopulation and intraindividual variation, suggesting high levels of sequence homogeneity within populations. The satellite sequence has therefore undergone "concerted evolution," at the level of the local population. Concerted evolution has previously almost always been discussed in terms of the divergence of species or higher taxa; its intraspecific occurrence apparently has not been reported previously. The generality of the observation is difficult to evaluate, for although satellite DNAs from a large number of organisms have been studied in detail, there appear to be little or no other data on their sequence variation in natural populations. The relationship (if any) between concerted, population level, satellite DNA divergence and the extent of gene flow/genetic isolation among conspecific natural populations remains to be established. PMID:8302879
Developing a heart institute: the execution of a strategic plan.
Krawczeski, Catherine D; McDonald, Mark B
2013-01-01
The Heart Institute at Cincinnati Children's Hospital Medical Center was chartered in July 2008 with the purpose of integrating clinical cardiovascular medicine with basic science research to foster innovations in care of patients with congenital heart problems. The initial administrative steering committee included representation from a basic scientist, a cardiologist, and a cardiothoracic surgeon and was charged with the development of a strategic plan for the evolution of the Institute over a five-year horizon. Using structured focus groups and staff interviews, the vision, mission, and goals were identified and refined. An integrated implementation plan addressing recruitment, capitalization, infrastructure, and market opportunities was created and executed. The preliminary results demonstrated clinical outcome improvements, increased scientific and academic productivity, and financial sustainability. All of the goals identified in the initial planning sequence were achieved within the five-year time frame, prompting an early evaluation and revision of the strategic plan.
Construction, database integration, and application of an Oenothera EST library.
Mrácek, Jaroslav; Greiner, Stephan; Cho, Won Kyong; Rauwolf, Uwe; Braun, Martha; Umate, Pavan; Altstätter, Johannes; Stoppel, Rhea; Mlcochová, Lada; Silber, Martina V; Volz, Stefanie M; White, Sarah; Selmeier, Renate; Rudd, Stephen; Herrmann, Reinhold G; Meurer, Jörg
2006-09-01
Coevolution of cellular genetic compartments is a fundamental aspect in eukaryotic genome evolution that becomes apparent in serious developmental disturbances after interspecific organelle exchanges. The genus Oenothera represents a unique, at present the only available, resource to study the role of the compartmentalized plant genome in diversification of populations and speciation processes. An integrated approach involving cDNA cloning, EST sequencing, and bioinformatic data mining was chosen using Oenothera elata with the genetic constitution nuclear genome AA with plastome type I. The Gene Ontology system grouped 1621 unique gene products into 17 different functional categories. Application of arrays generated from a selected fraction of ESTs revealed significantly differing expression profiles among closely related Oenothera species possessing the potential to generate fertile and incompatible plastid/nuclear hybrids (hybrid bleaching). Furthermore, the EST library provides a valuable source of PCR-based polymorphic molecular markers that are instrumental for genotyping and molecular mapping approaches.
CRISPR-Cas systems: prokaryotes upgrade to adaptive immunity
Barrangou, Rodolphe; Marraffini, Luciano A.
2014-01-01
Summary: Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) and associated proteins (Cas) comprise the CRISPR-Cas system, which confers adaptive immunity against exogenic elements in many bacteria and most archaea. CRISPR-mediated immunization occurs through the uptake of DNA from invasive genetic elements such as plasmids and viruses, followed by its integration into CRISPR loci. These loci are subsequently transcribed and processed into small interfering RNAs that guide nucleases for specific cleavage of complementary sequences. Conceptually, CRISPR-Cas shares functional features with the mammalian adaptive immune system, while also exhibiting characteristics of Lamarckian evolution. Because immune markers spliced from exogenous agents are integrated iteratively in CRISPR loci, they constitute a genetic record of vaccination events and reflect environmental conditions and changes over time. Cas endonucleases, which can be reprogrammed by small guide RNAs, have shown unprecedented potential and flexibility for genome editing and can be repurposed for numerous DNA targeting applications, including transcriptional control. PMID:24766887
Endogenous hepadnaviruses, bornaviruses and circoviruses in snakes
Gilbert, C.; Meik, J. M.; Dashevsky, D.; Card, D. C.; Castoe, T. A.; Schaack, S.
2014-01-01
We report the discovery of endogenous viral elements (EVEs) from Hepadnaviridae, Bornaviridae and Circoviridae in the speckled rattlesnake, Crotalus mitchellii, the first viperid snake for which a draft whole genome sequence assembly is available. Analysis of the draft assembly reveals genome fragments from the three virus families were inserted into the genome of this snake over the past 50 Myr. Cross-species PCR screening of orthologous loci and computational scanning of the python and king cobra genomes reveals that circoviruses integrated most recently (within the last approx. 10 Myr), whereas bornaviruses and hepadnaviruses integrated at least approximately 13 and approximately 50 Ma, respectively. This is, to our knowledge, the first report of circo-, borna- and hepadnaviruses in snakes and the first characterization of non-retroviral EVEs in non-avian reptiles. Our study provides a window into the historical dynamics of viruses in these host lineages and shows that their evolution involved multiple host-switches between mammals and reptiles. PMID:25080342
Sankar, Sathish; Upadhyay, Mohita; Ramamurthy, Mageshbabu; Vadivel, Kumaran; Sagadevan, Kalaiselvan; Nandagopal, Balaji; Vivekanandan, Perumal; Sridharan, Gopalan
2015-01-01
Hantaviruses are important emerging zoonotic pathogens. The current understanding of hantavirus evolution is complicated by the lack of consensus on co-divergence of hantaviruses with their animal hosts. In addition, hantaviruses have long-term associations with their reservoir hosts. Analyzing the relative abundance of dinucleotides may shed new light on hantavirus evolution. We studied the relative abundance of dinucleotides and the evolutionary pressures shaping different hantavirus segments. A total of 118 sequences were analyzed; this includes 51 sequences of the S segment, 43 sequences of the M segment and 23 sequences of the L segment. The relative abundance of dinucleotides, effective codon number (ENC) and codon usage biases were analyzed. Standard methods were used to investigate the relative roles of mutational pressure and translational selection on the three hantavirus segments. All three segments of hantaviruses are CpG depleted. Mutational pressure is the predominant evolutionary force leading to CpG depletion among hantaviruses. Interestingly, the S segment of hantaviruses is GpU depleted and, in contrast to CpG depletion, the depletion of GpU dinucleotides from the S segment is driven by translational selection. Our findings also suggest that mutational pressure is the primary evolutionary pressure acting on the S and the M segments of hantaviruses, while translational selection plays a key role in shaping the evolution of the L segment. Our findings highlight how different evolutionary pressures may contribute disproportionately to the evolution of the three hantavirus segments. These findings provide new insights on the current understanding of hantavirus evolution. There is a dichotomy among the evolutionary pressures shaping (a) the relative abundance of different dinucleotides in hantavirus genomes and (b) the evolution of the three hantavirus segments.
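The relative abundance of a dinucleotide, as used in the abstract above, is commonly expressed as an odds ratio: the observed dinucleotide frequency divided by the product of the frequencies of its two nucleotides, so that values well below 1 indicate depletion. The Python sketch below is a minimal illustration under that assumption. It is not the authors' pipeline; the function name is hypothetical, it scores a single strand only, and it works on whatever alphabet the input uses (T or U).

```python
from collections import Counter


def dinucleotide_relative_abundance(seq):
    """Return {dinucleotide: f_xy / (f_x * f_y)} for each dinucleotide observed in seq.

    Hypothetical helper for illustration; single-strand odds ratio only.
    """
    seq = seq.upper()
    n = len(seq)
    if n < 2:
        raise ValueError("sequence must contain at least two nucleotides")

    base_counts = Counter(seq)
    # Overlapping dinucleotides: positions (0,1), (1,2), ..., (n-2, n-1).
    pair_counts = Counter(seq[i:i + 2] for i in range(n - 1))

    rho = {}
    for pair, count in pair_counts.items():
        f_xy = count / (n - 1)              # observed dinucleotide frequency
        f_x = base_counts[pair[0]] / n      # frequency of the first base
        f_y = base_counts[pair[1]] / n      # frequency of the second base
        rho[pair] = f_xy / (f_x * f_y)
    return rho


if __name__ == "__main__":
    toy = "AACC"  # toy input, not real hantavirus data
    rho = dinucleotide_relative_abundance(toy)
    # Dinucleotides that never occur are absent from the result; treat them as 0.
    print(rho.get("AA"), rho.get("CG", 0.0))
```

Dinucleotides that never occur are simply missing from the returned dictionary, so a caller scanning for depletion can read them with a default of 0.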
Khatri, Bhavin S.; Goldstein, Richard A.
2015-01-01
Speciation is fundamental to understanding the huge diversity of life on Earth. Although still controversial, empirical evidence suggests that the rate of speciation is larger for smaller populations. Here, we explore a biophysical model of speciation by developing a simple coarse-grained theory of transcription factor-DNA binding and how their co-evolution in two geographically isolated lineages leads to incompatibilities. To develop a tractable analytical theory, we derive a Smoluchowski equation for the dynamics of binding energy evolution that accounts for the fact that natural selection acts on phenotypes, but variation arises from mutations in sequences; the Smoluchowski equation includes selection due to both gradients in fitness and gradients in sequence entropy, which is the logarithm of the number of sequences that correspond to a particular binding energy. This simple consideration predicts that smaller populations develop incompatibilities more quickly in the weak mutation regime; this trend arises as sequence entropy poises smaller populations closer to incompatible regions of phenotype space. These results suggest a generic coarse-grained approach to evolutionary stochastic dynamics, allowing realistic modelling at the phenotypic level. PMID:25936759
Clonal evolution in breast cancer revealed by single nucleus genome sequencing.
Wang, Yong; Waters, Jill; Leung, Marco L; Unruh, Anna; Roh, Whijae; Shi, Xiuqing; Chen, Ken; Scheet, Paul; Vattathil, Selina; Liang, Han; Multani, Asha; Zhang, Hong; Zhao, Rui; Michor, Franziska; Meric-Bernstam, Funda; Navin, Nicholas E
2014-08-14
Sequencing studies of breast tumour cohorts have identified many prevalent mutations, but provide limited insight into the genomic diversity within tumours. Here we developed a whole-genome and exome single cell sequencing approach called nuc-seq that uses G2/M nuclei to achieve 91% mean coverage breadth. We applied this method to sequence single normal and tumour nuclei from an oestrogen-receptor-positive (ER(+)) breast cancer and a triple-negative ductal carcinoma. In parallel, we performed single nuclei copy number profiling. Our data show that aneuploid rearrangements occurred early in tumour evolution and remained highly stable as the tumour masses clonally expanded. In contrast, point mutations evolved gradually, generating extensive clonal diversity. Using targeted single-molecule sequencing, many of the diverse mutations were shown to occur at low frequencies (<10%) in the tumour mass. Using mathematical modelling we found that the triple-negative tumour cells had an increased mutation rate (13.3×), whereas the ER(+) tumour cells did not. These findings have important implications for the diagnosis, therapeutic treatment and evolution of chemoresistance in breast cancer.
Kijima, T E; Innan, Hideki
2013-11-01
A population genetic simulation framework is developed to understand the behavior and molecular evolution of DNA sequences of transposable elements. Our model incorporates random transposition and excision of transposable element (TE) copies, two modes of selection against TEs, and degeneration of transpositional activity by point mutations. We first investigated the relationships between the behavior of the copy number of TEs and these parameters. Our results show that when selection is weak, the genome can maintain a relatively large number of TEs, but most of them are less active. In contrast, with strong selection, the genome can maintain only a limited number of TEs but the proportion of active copies is large. In such a case, there could be substantial fluctuations of the copy number over generations. We also explored how DNA sequences of TEs evolve through the simulations. In general, active copies form clusters around the original sequence, while less active copies have long branches specific to themselves, exhibiting a star-shaped phylogeny. It is demonstrated that the phylogeny of TE sequences could be informative to understand the dynamics of TE evolution.
Genomics of bacteria and archaea: the emerging dynamic view of the prokaryotic world
Koonin, Eugene V.; Wolf, Yuri I.
2008-01-01
The first bacterial genome was sequenced in 1995, and the first archaeal genome in 1996. Soon after these breakthroughs, an exponential rate of genome sequencing was established, with a doubling time of approximately 20 months for bacteria and approximately 34 months for archaea. Comparative analysis of the hundreds of sequenced bacterial and dozens of archaeal genomes leads to several generalizations on the principles of genome organization and evolution. A crucial finding that enables functional characterization of the sequenced genomes and evolutionary reconstruction is that the majority of archaeal and bacterial genes have conserved orthologs in other, often, distant organisms. However, comparative genomics also shows that horizontal gene transfer (HGT) is a dominant force of prokaryotic evolution, along with the loss of genetic material resulting in genome contraction. A crucial component of the prokaryotic world is the mobilome, the enormous collection of viruses, plasmids and other selfish elements, which are in constant exchange with more stable chromosomes and serve as HGT vehicles. Thus, the prokaryotic genome space is a tightly connected, although compartmentalized, network, a novel notion that undermines the ‘Tree of Life’ model of evolution and requires a new conceptual framework and tools for the study of prokaryotic evolution. PMID:18948295
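To make the quoted doubling times concrete, exponential growth with doubling time T multiplies the number of sequenced genomes by 2^(t/T) after an elapsed time t. The short Python sketch below only illustrates that arithmetic; the function name is hypothetical, and the 20-month and 34-month values are the approximate figures given in the abstract.

```python
def fold_increase(elapsed_months, doubling_time_months):
    """Fold increase after elapsed_months, assuming steady exponential growth."""
    if doubling_time_months <= 0:
        raise ValueError("doubling time must be positive")
    return 2.0 ** (elapsed_months / doubling_time_months)


if __name__ == "__main__":
    decade = 120  # months
    # Approximate doubling times quoted in the abstract above.
    print("bacteria:", fold_increase(decade, 20))
    print("archaea:", fold_increase(decade, 34))
```

Under these assumptions a decade corresponds to six doublings for bacteria (a 64-fold increase) and about three and a half for archaea.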
Chaw, R. Crystal; Collin, Matthew; Wimmer, Marjorie; Helmrick, Kara-Leigh; Hayashi, Cheryl Y.
2017-01-01
Spiders swath their eggs with silk to protect developing embryos and hatchlings. Egg case silks, like other fibrous spider silks, are primarily composed of proteins called spidroins (spidroin = spider-fibroin). Silks, and thus spidroins, are important throughout the lives of spiders, yet the evolution of spidroin genes has been relatively understudied. Spidroin genes are notoriously difficult to sequence because they are typically very long (≥ 10 kb of coding sequence) and highly repetitive. Here, we investigate the evolution of spider silk genes through long-read sequencing of Bacterial Artificial Chromosome (BAC) clones. We demonstrate that the silver garden spider Argiope argentata has multiple egg case spidroin loci with a loss of function at one locus. We also use degenerate PCR primers to search the genomic DNA of congeneric species and find evidence for multiple egg case spidroin loci in other Argiope spiders. Comparative analyses show that these multiple loci are more similar at the nucleotide level within a species than between species. This pattern is consistent with concerted evolution homogenizing gene copies within a genome. More complicated explanations include convergent evolution or recent independent gene duplications within each species. PMID:29127108
A prokaryotic viral sequence is expressed and conserved in mammalian brain.
Yeh, Yang-Hui; Gunasekharan, Vignesh; Manuelidis, Laura
2017-07-03
A natural and permanent transfer of prokaryotic viral sequences to mammals has not been reported by others. Circular "SPHINX" DNAs <5 kb were previously isolated from nuclease-protected cytoplasmic particles in rodent neuronal cell lines and brain. Two of these DNAs were sequenced after Φ29 polymerase amplification, and they revealed significant but imperfect homology to segments of commensal Acinetobacter phage viruses. These findings were surprising because the brain is isolated from environmental microorganisms. The 1.76-kb DNA sequence (SPHINX 1.8), with an iteron before its ORF, was evaluated here for its expression in neural cells and brain. A rabbit affinity purified antibody generated against a peptide without homology to mammalian sequences labeled a nonglycosylated ∼41-kDa protein (spx1) on Western blots, and the signal was efficiently blocked by the competing peptide. Spx1 was resistant to limited proteinase K digestion, but was unrelated to the expression of host prion protein or its pathologic amyloid form. Remarkably, spx1 concentrated in selected brain synapses, such as those on anterior motor horn neurons that integrate many complex neural inputs. SPHINX 1.8 appears to be involved in tissue-specific differentiation, including essential functions that preserve its propagation during mammalian evolution, possibly via maternal inheritance. The data here indicate that mammals can share and exchange a larger world of prokaryotic viruses than previously envisioned.
LinkFinder: An expert system that constructs phylogenic trees
NASA Technical Reports Server (NTRS)
Inglehart, James; Nelson, Peter C.
1991-01-01
An expert system has been developed using the C Language Integrated Production System (CLIPS) that automates the process of constructing DNA sequence based phylogenies (trees or lineages) that indicate evolutionary relationships. LinkFinder takes as input homologous DNA sequences from distinct individual organisms. It measures variations between the sequences, selects appropriate proportionality constants, and estimates the time that has passed since each pair of organisms diverged from a common ancestor. It then designs and outputs a phylogenic map summarizing these results. LinkFinder can find genetic relationships between different species, and between individuals of the same species, including humans. It was designed to take advantage of the vast amount of sequence data being produced by the Genome Project, and should be of value to evolution theorists who wish to utilize this data, but who have no formal training in molecular genetics. Evolutionary theory holds that distinct organisms carrying a common gene inherited that gene from a common ancestor. Homologous genes vary from individual to individual and species to species, and the amount of variation is now believed to be directly proportional to the time that has passed since divergence from a common ancestor. The proportionality constant must be determined experimentally; it varies considerably with the types of organisms and DNA molecules under study. Given an appropriate constant, and the variation between two DNA sequences, a simple linear equation gives the divergence time.
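The linear relation described in this abstract can be sketched in a few lines. This is only an illustration, not LinkFinder's CLIPS implementation: the function names, the use of a simple proportion-of-differing-sites measure as the "variation", and the example constant are all assumptions made here.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Fraction of aligned sites that differ (assumed measure of 'variation')."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned, equal-length and non-empty")
    differences = sum(x != y for x, y in zip(seq_a, seq_b))
    return differences / len(seq_a)


def divergence_time(variation: float, constant: float) -> float:
    """Solve variation = constant * time for time.

    `constant` is the hypothetical, experimentally determined proportionality
    constant: variation accumulated between the pair per unit time.
    """
    if constant <= 0:
        raise ValueError("proportionality constant must be positive")
    return variation / constant


# Two made-up aligned sequences that differ at 2 of 10 sites.
d = p_distance("ACGTACGTAC", "ACGTTCGTAA")
t = divergence_time(d, constant=0.02)  # time in whatever unit the constant uses
```

The constant has to match the organisms and DNA molecules under study, as the abstract notes; the value 0.02 above is a placeholder, not a measured rate.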
Wasabi: An Integrated Platform for Evolutionary Sequence Analysis and Data Visualization.
Veidenberg, Andres; Medlar, Alan; Löytynoja, Ari
2016-04-01
Wasabi is an open source, web-based environment for evolutionary sequence analysis. Wasabi visualizes sequence data together with a phylogenetic tree within a modern, user-friendly interface: The interface hides extraneous options, supports context sensitive menus, drag-and-drop editing, and displays additional information, such as ancestral sequences, associated with specific tree nodes. The Wasabi environment supports reproducibility by automatically storing intermediate analysis steps and includes built-in functions to share data between users and publish analysis results. For computational analysis, Wasabi supports PRANK and PAGAN for phylogeny-aware alignment and alignment extension, and it can be easily extended with other tools. Along with drag-and-drop import of local files, Wasabi can access remote data through URL and import sequence data, GeneTrees and EPO alignments directly from Ensembl. To demonstrate a typical workflow using Wasabi, we reproduce key findings from recent comparative genomics studies, including a reanalysis of the EGLN1 gene from the tiger genome study: These case studies can be browsed within Wasabi at http://wasabiapp.org:8000?id=usecases. Wasabi runs inside a web browser and does not require any installation. One can start using it at http://wasabiapp.org. All source code is licensed under the AGPLv3. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Zhang, Weiping; Li, Yudong; Chen, Yiwang; Xu, Sha; Du, Guocheng; Shi, Huidong; Zhou, Jingwen; Chen, Jian
2018-02-05
Chinese rice wine is a popular traditional alcoholic beverage in China, yet its brewing processes have rarely been explored. We herein report the first gapless, near-finished genome sequence of the yeast strain Saccharomyces cerevisiae N85 for Chinese rice wine production. Several assembly methods were used to integrate Pacific Biosciences (PacBio) and Illumina sequencing data to achieve high-quality genome sequencing of the strain. The genome encodes more than 6,000 predicted proteins and 238 long non-coding RNAs, which are validated by RNA-sequencing data. Moreover, our annotation predicts 171 novel genes that are not present in the reference S288c genome. We also identified 65,902 single nucleotide polymorphisms and small indels, many of which are located within genic regions. Dozens of larger copy-number variations and translocations were detected, mainly enriched in the subtelomeres, suggesting these regions may be related to genomic evolution. This study will serve as a milestone in the study of Chinese rice wine and related beverages in China and in other countries. It will help to develop more scientific and modern fermentation processes for Chinese rice wine, and to explore metabolic pathways of desired and harmful components in Chinese rice wine to improve its taste and nutritional value. © The Author(s) 2018. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Gibbs motif sampling: detection of bacterial outer membrane protein repeats.
Neuwald, A. F.; Liu, J. S.; Lawrence, C. E.
1995-01-01
The detection and alignment of locally conserved regions (motifs) in multiple sequences can provide insight into protein structure, function, and evolution. A new Gibbs sampling algorithm is described that detects motif-encoding regions in sequences and optimally partitions them into distinct motif models; this is illustrated using a set of immunoglobulin fold proteins. When applied to sequences sharing a single motif, the sampler can be used to classify motif regions into related submodels, as is illustrated using helix-turn-helix DNA-binding proteins. Other statistically based procedures are described for searching a database for sequences matching motifs found by the sampler. When applied to a set of 32 very distantly related bacterial integral outer membrane proteins, the sampler revealed that they share a subtle, repetitive motif. Although BLAST (Altschul SF et al., 1990, J Mol Biol 215:403-410) fails to detect significant pairwise similarity between any of the sequences, the repeats present in these outer membrane proteins, taken as a whole, are highly significant (based on a generally applicable statistical test for motifs described here). Analysis of bacterial porins with known trimeric beta-barrel structure and related proteins reveals a similar repetitive motif corresponding to alternating membrane-spanning beta-strands. These beta-strands occur on the membrane interface (as opposed to the trimeric interface) of the beta-barrel. The broad conservation and structural location of these repeats suggests that they play important functional roles. PMID:8520488
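To make the sampling idea in this abstract concrete, here is a deliberately simplified single-motif "site sampler" sketch. It is not the algorithm of the paper: it assumes exactly one motif occurrence per sequence, a uniform background, a DNA alphabet for brevity, and no partitioning into multiple motif models. The function name and all parameters are hypothetical.

```python
import random
from collections import Counter

ALPHABET = "ACGT"  # assumption: DNA for brevity; proteins would use 20 letters


def gibbs_site_sampler(seqs, width, iterations=200, pseudocount=1.0, seed=0):
    """Return one sampled motif start per sequence (simplified Gibbs sampler)."""
    rng = random.Random(seed)
    # Start from a random motif position in every sequence.
    starts = [rng.randrange(len(s) - width + 1) for s in seqs]
    for _ in range(iterations):
        for i, seq in enumerate(seqs):
            # Build per-column residue counts from all sequences except seq i.
            counts = [Counter() for _ in range(width)]
            for j, other in enumerate(seqs):
                if j != i:
                    for k in range(width):
                        counts[k][other[starts[j] + k]] += 1
            total = (len(seqs) - 1) + pseudocount * len(ALPHABET)
            # Weight each candidate start in the held-out sequence by the
            # motif-model probability of the window beginning there.
            weights = []
            for p in range(len(seq) - width + 1):
                w = 1.0
                for k in range(width):
                    w *= (counts[k][seq[p + k]] + pseudocount) / total
                weights.append(w)
            # Sample (rather than maximise) the new start position.
            starts[i] = rng.choices(range(len(weights)), weights=weights)[0]
    return starts


example = ["ACGTTGACCA", "TTGACGGTAC", "GGTTGACATC"]
positions = gibbs_site_sampler(example, width=5, iterations=50, seed=1)
```

Because positions are sampled rather than chosen greedily, results depend on the seed and on the number of iterations; a practical run would repeat from several seeds and compare the resulting alignments.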
The Causality of Evolution on Different Fitness Landscapes
NASA Astrophysics Data System (ADS)
Vyawahare, Saurabh; Austin, Robert; Zhang, Qiucen; Kim, Hyunsung; Bestoso, John
2013-03-01
Evolution of antibiotic resistance is a growing problem. One major reason why most antibiotics fail is mutations in drug targets (e.g. essential enzymes). Sequencing of clinically resistant isolates has shown that multiple mutational hotspots exist in coding regions, which could potentially prevent the binding of drugs. However, it is not clear whether the appearance of each mutation is random or influenced by other factors. In this paper, we compare the evolution of resistance to ciprofloxacin in two distinct but well-characterized genetic backgrounds. By combining our recently developed evolution reactor and deep whole-genome sequencing, we show that different alleles of the σs factor lead to fixation of different mutations in the gyrA gene that confer ciprofloxacin resistance on the bacterium Escherichia coli. Such causality of evolution in different genes provides an opportunity to control the evolution of antibiotic resistance. Sponsored by the NCI/NIH Physical Sciences Oncology Centers
Undheim, Eivind A B; Mobli, Mehdi; King, Glenn F
2016-06-01
Three-dimensional (3D) structures have been used to explore the evolution of proteins for decades, yet they have rarely been utilized to study the molecular evolution of peptides. Here, we highlight areas in which 3D structures can be particularly useful for studying the molecular evolution of peptide toxins. Although we focus our discussion on animal toxins, including one of the most widespread disulfide-rich peptide folds known, the inhibitor cystine knot, our conclusions should be widely applicable to studies of the evolution of disulfide-constrained peptides. We show that conserved 3D folds can be used to identify evolutionary links and test hypotheses regarding the evolutionary origin of peptides with extremely low sequence identity; construct accurate multiple sequence alignments; and better understand the evolutionary forces that drive the molecular evolution of peptides. © 2016 WILEY Periodicals, Inc.
Visualizing and Clustering Protein Similarity Networks: Sequences, Structures, and Functions.
Mai, Te-Lun; Hu, Geng-Ming; Chen, Chi-Ming
2016-07-01
Research in the past decade has demonstrated the usefulness of protein network knowledge in furthering the study of molecular evolution of proteins, understanding the robustness of cells to perturbation, and annotating new protein functions. In this study, we aimed to provide a general clustering approach to visualize the sequence-structure-function relationship of protein networks, and to investigate possible causes of inconsistency among protein classifications based on sequences, structures, and functions. Such visualization of protein networks could facilitate our understanding of the overall relationship among proteins and help researchers comprehend various protein databases. As a demonstration, we clustered 1437 enzymes by their sequences and structures using the minimum span clustering (MSC) method. The general structure of this protein network was delineated at two clustering resolutions, and the second-level MSC clustering was found to be highly similar to existing enzyme classifications. The clusterings of these enzymes based on sequence, structure, and function information are consistent with one another. For proteases, the Jaccard similarity coefficient is 0.86 between sequence and function classifications, 0.82 between sequence and structure classifications, and 0.78 between structure and function classifications. From our clustering results, we discuss possible examples of divergent evolution and convergent evolution of enzymes. Our clustering approach provides a panoramic view of the sequence-structure-function network of proteins, helps visualize the relation between related proteins intuitively, and is useful in predicting the structure and function of newly determined protein sequences.
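The abstract does not say how the Jaccard coefficient between two classifications was computed. One common convention, assumed here purely for illustration, compares the sets of item pairs that each classification places in the same cluster. The function names and the toy labels below are hypothetical.

```python
from itertools import combinations


def co_clustered_pairs(labels: dict) -> set:
    """All unordered item pairs that share a cluster label."""
    return {
        (a, b)
        for a, b in combinations(sorted(labels), 2)
        if labels[a] == labels[b]
    }


def jaccard_between_clusterings(labels_a: dict, labels_b: dict) -> float:
    """|pairs together in both| / |pairs together in at least one|."""
    pairs_a = co_clustered_pairs(labels_a)
    pairs_b = co_clustered_pairs(labels_b)
    union = pairs_a | pairs_b
    # Two clusterings made only of singletons are treated as identical.
    return len(pairs_a & pairs_b) / len(union) if union else 1.0


# Toy example: four enzymes classified two different ways.
by_sequence = {"e1": 0, "e2": 0, "e3": 1, "e4": 1}
by_function = {"e1": 0, "e2": 0, "e3": 0, "e4": 1}
score = jaccard_between_clusterings(by_sequence, by_function)  # 1 shared pair of 4
```

A value of 1.0 means the two classifications group every pair identically; values such as the 0.78 to 0.86 reported for proteases indicate substantial but incomplete agreement under whichever definition the authors used.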
Danley, Patrick D; Mullen, Sean P; Liu, Fenglong; Nene, Vishvanath; Quackenbush, John; Shaw, Kerry L
2007-01-01
Background: As the developmental costs of genomic tools decline, genomic approaches to non-model systems are becoming more feasible. Many of these systems may lack advanced genetic tools but are extremely valuable models in other biological fields. Here we report the development of expressed sequence tags (ESTs) in an orthopteroid insect, a model for the study of neurobiology, speciation, and evolution. Results: We report the sequencing of 14,502 ESTs from clones derived from a nerve cord cDNA library, and the subsequent construction of a Gene Index from these sequences, from the Hawaiian trigonidiine cricket Laupala kohalensis. The Gene Index contains 8607 unique sequences comprising 2575 tentative consensus (TC) sequences and 6032 singletons. For each of the unique sequences, an attempt was made to assign a provisional annotation and to categorize its function using a Gene Ontology-based classification through a sequence-based comparison to known proteins. In addition, a set of unique 70 base pair oligomers that can be used for DNA microarrays was developed. All Gene Index information is posted at the DFCI Gene Indices web page. Conclusion: Orthopterans are models used to understand the neurophysiological basis of complex motor patterns such as flight and stridulation. The sequences presented in the cricket Gene Index will provide neurophysiologists with many genetic tools that have been largely absent in this field. The cricket Gene Index is one of only two gene indices to be developed in an evolutionary model system. Species within the genus Laupala have speciated recently, rapidly, and extensively. Therefore, the genes identified in the cricket Gene Index can be used to study the genomics of speciation. Furthermore, this gene index represents a significant EST resource for basal insects. As such, this resource is a valuable comparative tool for the understanding of invertebrate molecular evolution.
The sequences presented here will provide much needed genomic resources for three distinct but overlapping fields of inquiry: neurobiology, speciation, and molecular evolution. PMID:17459168
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rudnick, Gregory H.; Tran, Kim-Vy; Papovich, Casey
2012-08-10
We study the red sequence in a cluster of galaxies at z = 1.62 and follow its evolution over the intervening 9.5 Gyr to the present day. Using deep YJK_s imaging with the HAWK-I instrument on the Very Large Telescope, we identify a tight red sequence and construct its rest-frame i-band luminosity function (LF). There is a marked deficit of faint red galaxies in the cluster that causes a turnover in the LF. We compare the red-sequence LF to that for clusters at z < 0.8, correcting the luminosities for passive evolution. The shape of the cluster red-sequence LF does not evolve between z = 1.62 and z = 0.6 but at z < 0.6 the faint population builds up significantly. Meanwhile, between z = 1.62 and 0.6 the inferred total light on the red sequence grows by a factor of ~2 and the bright end of the LF becomes more populated. We construct a simple model for red-sequence evolution that grows the red sequence in total luminosity and matches the constant LF shape at z > 0.6. In this model the cluster accretes blue galaxies from the field whose star formation is quenched and who are subsequently allowed to merge. We find that three to four mergers among cluster galaxies during the 4 Gyr between z = 1.62 and z = 0.6 match the observed LF evolution between the two redshifts. The inferred merger rate is consistent with other studies of this cluster. Our result supports the picture that galaxy merging during the major growth phase of massive clusters is an important process in shaping the red-sequence population at all luminosities.
Adiabatic leakage elimination operator in an experimental framework
NASA Astrophysics Data System (ADS)
Wang, Zhao-Ming; Byrd, Mark S.; Jing, Jun; Wu, Lian-Ao
2018-06-01
Adiabatic evolution is used in a variety of quantum information processing tasks. However, the elimination of errors is not as well developed as it is for circuit model processing. Here, we present a strategy to improve the performance of a quantum adiabatic process by adding leakage elimination operators (LEOs) to the evolution. These are a sequence of pulse controls acting in an adiabatic subspace to eliminate errors by suppressing unwanted transitions. Using the Feshbach P Q partitioning technique, we obtain an analytical solution for a set of pulse controls. The effectiveness of the LEO is independent of the specific form of the pulse but depends on the average frequency of the control function. By observing that the evolution of the target eigenstate is governed by a periodic function appearing in the integral of the control function, we show that control parameters can be chosen in such a way that the instantaneous eigenstates of the system are unchanged, yet a speedup can be achieved by suppressing transitions. Furthermore, we give the exact expression of the control function for a counter unitary transformation to be used in experiments which provides a clear physical meaning for the LEO, aiding in the implementation.
NASA Technical Reports Server (NTRS)
Wisdom, Jack
2002-01-01
In these 18 years, the research has touched every major dynamical problem in the solar system, including: the effect of chaotic zones on the distribution of asteroids, the delivery of meteorites along chaotic pathways, the chaotic motion of Pluto, the chaotic motion of the outer planets and that of the whole solar system, the delivery of short period comets from the Kuiper belt, the tidal evolution of the Uranian and Galilean satellites, the chaotic tumbling of Hyperion and other irregular satellites, the large chaotic variations of the obliquity of Mars, the evolution of the Earth-Moon system, and the resonant core-mantle dynamics of Earth and Venus. It has introduced new analytical and numerical tools that are in widespread use. Today, nearly every long-term integration of our solar system, its subsystems, and other solar systems uses algorithms that were invented in this research. This research has been primarily supported by this sequence of PGG NASA grants. During this period, major investigations of the tidal evolution of the Earth-Moon system and of the passage of the Earth and Venus through non-linear core-mantle resonances were completed and published. A major innovation in symplectic algorithms, the symplectic corrector, was also published. A paper was completed on non-perturbative hydrostatic equilibrium.
The tangled bank of amino acids.
Goldstein, Richard A; Pollock, David D
2016-07-01
The use of amino acid substitution matrices to model protein evolution has yielded important insights into both the evolutionary process and the properties of specific protein families. In order to make these models tractable, standard substitution matrices represent the average results of the evolutionary process rather than the underlying molecular biophysics and population genetics, treating proteins as a set of independently evolving sites rather than as an integrated biomolecular entity. With advances in computing and the increasing availability of sequence data, we now have an opportunity to move beyond current substitution matrices to more interpretable mechanistic models with greater fidelity to the evolutionary process of mutation and selection and the holistic nature of the selective constraints. As part of this endeavour, we consider how epistatic interactions induce spatial and temporal rate heterogeneity, and demonstrate how these generally ignored factors can reconcile standard substitution rate matrices and the underlying biology, allowing us to better understand the meaning of these substitution rates. Using computational simulations of protein evolution, we can demonstrate the importance of both spatial and temporal heterogeneity in modelling protein evolution. © 2016 The Authors Protein Science published by Wiley Periodicals, Inc. on behalf of The Protein Society.
The genome sequence of taurine cattle: a window to ruminant biology and evolution.
Elsik, Christine G; Tellam, Ross L; Worley, Kim C; Gibbs, Richard A; Muzny, Donna M; Weinstock, George M; Adelson, David L; Eichler, Evan E; Elnitski, Laura; Guigó, Roderic; Hamernik, Debora L; Kappes, Steve M; Lewin, Harris A; Lynn, David J; Nicholas, Frank W; Reymond, Alexandre; Rijnkels, Monique; Skow, Loren C; Zdobnov, Evgeny M; Schook, Lawrence; Womack, James; Alioto, Tyler; Antonarakis, Stylianos E; Astashyn, Alex; Chapple, Charles E; Chen, Hsiu-Chuan; Chrast, Jacqueline; Câmara, Francisco; Ermolaeva, Olga; Henrichsen, Charlotte N; Hlavina, Wratko; Kapustin, Yuri; Kiryutin, Boris; Kitts, Paul; Kokocinski, Felix; Landrum, Melissa; Maglott, Donna; Pruitt, Kim; Sapojnikov, Victor; Searle, Stephen M; Solovyev, Victor; Souvorov, Alexandre; Ucla, Catherine; Wyss, Carine; Anzola, Juan M; Gerlach, Daniel; Elhaik, Eran; Graur, Dan; Reese, Justin T; Edgar, Robert C; McEwan, John C; Payne, Gemma M; Raison, Joy M; Junier, Thomas; Kriventseva, Evgenia V; Eyras, Eduardo; Plass, Mireya; Donthu, Ravikiran; Larkin, Denis M; Reecy, James; Yang, Mary Q; Chen, Lin; Cheng, Ze; Chitko-McKown, Carol G; Liu, George E; Matukumalli, Lakshmi K; Song, Jiuzhou; Zhu, Bin; Bradley, Daniel G; Brinkman, Fiona S L; Lau, Lilian P L; Whiteside, Matthew D; Walker, Angela; Wheeler, Thomas T; Casey, Theresa; German, J Bruce; Lemay, Danielle G; Maqbool, Nauman J; Molenaar, Adrian J; Seo, Seongwon; Stothard, Paul; Baldwin, Cynthia L; Baxter, Rebecca; Brinkmeyer-Langford, Candice L; Brown, Wendy C; Childers, Christopher P; Connelley, Timothy; Ellis, Shirley A; Fritz, Krista; Glass, Elizabeth J; Herzig, Carolyn T A; Iivanainen, Antti; Lahmers, Kevin K; Bennett, Anna K; Dickens, C Michael; Gilbert, James G R; Hagen, Darren E; Salih, Hanni; Aerts, Jan; Caetano, Alexandre R; Dalrymple, Brian; Garcia, Jose Fernando; Gill, Clare A; Hiendleder, Stefan G; Memili, Erdogan; Spurlock, Diane; Williams, John L; Alexander, Lee; Brownstein, Michael J; Guan, Leluo; Holt, Robert A; Jones, Steven J M; Marra, Marco A; 
Moore, Richard; Moore, Stephen S; Roberts, Andy; Taniguchi, Masaaki; Waterman, Richard C; Chacko, Joseph; Chandrabose, Mimi M; Cree, Andy; Dao, Marvin Diep; Dinh, Huyen H; Gabisi, Ramatu Ayiesha; Hines, Sandra; Hume, Jennifer; Jhangiani, Shalini N; Joshi, Vandita; Kovar, Christie L; Lewis, Lora R; Liu, Yih-Shin; Lopez, John; Morgan, Margaret B; Nguyen, Ngoc Bich; Okwuonu, Geoffrey O; Ruiz, San Juana; Santibanez, Jireh; Wright, Rita A; Buhay, Christian; Ding, Yan; Dugan-Rocha, Shannon; Herdandez, Judith; Holder, Michael; Sabo, Aniko; Egan, Amy; Goodell, Jason; Wilczek-Boney, Katarzyna; Fowler, Gerald R; Hitchens, Matthew Edward; Lozado, Ryan J; Moen, Charles; Steffen, David; Warren, James T; Zhang, Jingkun; Chiu, Readman; Schein, Jacqueline E; Durbin, K James; Havlak, Paul; Jiang, Huaiyang; Liu, Yue; Qin, Xiang; Ren, Yanru; Shen, Yufeng; Song, Henry; Bell, Stephanie Nicole; Davis, Clay; Johnson, Angela Jolivet; Lee, Sandra; Nazareth, Lynne V; Patel, Bella Mayurkumar; Pu, Ling-Ling; Vattathil, Selina; Williams, Rex Lee; Curry, Stacey; Hamilton, Cerissa; Sodergren, Erica; Wheeler, David A; Barris, Wes; Bennett, Gary L; Eggen, André; Green, Ronnie D; Harhay, Gregory P; Hobbs, Matthew; Jann, Oliver; Keele, John W; Kent, Matthew P; Lien, Sigbjørn; McKay, Stephanie D; McWilliam, Sean; Ratnakumar, Abhirami; Schnabel, Robert D; Smith, Timothy; Snelling, Warren M; Sonstegard, Tad S; Stone, Roger T; Sugimoto, Yoshikazu; Takasuga, Akiko; Taylor, Jeremy F; Van Tassell, Curtis P; Macneil, Michael D; Abatepaulo, Antonio R R; Abbey, Colette A; Ahola, Virpi; Almeida, Iassudara G; Amadio, Ariel F; Anatriello, Elen; Bahadue, Suria M; Biase, Fernando H; Boldt, Clayton R; Carroll, Jeffery A; Carvalho, Wanessa A; Cervelatti, Eliane P; Chacko, Elsa; Chapin, Jennifer E; Cheng, Ye; Choi, Jungwoo; Colley, Adam J; de Campos, Tatiana A; De Donato, Marcos; Santos, Isabel K F de Miranda; de Oliveira, Carlo J F; Deobald, Heather; Devinoy, Eve; Donohue, Kaitlin E; Dovc, Peter; Eberlein, Annett; 
Fitzsimmons, Carolyn J; Franzin, Alessandra M; Garcia, Gustavo R; Genini, Sem; Gladney, Cody J; Grant, Jason R; Greaser, Marion L; Green, Jonathan A; Hadsell, Darryl L; Hakimov, Hatam A; Halgren, Rob; Harrow, Jennifer L; Hart, Elizabeth A; Hastings, Nicola; Hernandez, Marta; Hu, Zhi-Liang; Ingham, Aaron; Iso-Touru, Terhi; Jamis, Catherine; Jensen, Kirsty; Kapetis, Dimos; Kerr, Tovah; Khalil, Sari S; Khatib, Hasan; Kolbehdari, Davood; Kumar, Charu G; Kumar, Dinesh; Leach, Richard; Lee, Justin C-M; Li, Changxi; Logan, Krystin M; Malinverni, Roberto; Marques, Elisa; Martin, William F; Martins, Natalia F; Maruyama, Sandra R; Mazza, Raffaele; McLean, Kim L; Medrano, Juan F; Moreno, Barbara T; Moré, Daniela D; Muntean, Carl T; Nandakumar, Hari P; Nogueira, Marcelo F G; Olsaker, Ingrid; Pant, Sameer D; Panzitta, Francesca; Pastor, Rosemeire C P; Poli, Mario A; Poslusny, Nathan; Rachagani, Satyanarayana; Ranganathan, Shoba; Razpet, Andrej; Riggs, Penny K; Rincon, Gonzalo; Rodriguez-Osorio, Nelida; Rodriguez-Zas, Sandra L; Romero, Natasha E; Rosenwald, Anne; Sando, Lillian; Schmutz, Sheila M; Shen, Libing; Sherman, Laura; Southey, Bruce R; Lutzow, Ylva Strandberg; Sweedler, Jonathan V; Tammen, Imke; Telugu, Bhanu Prakash V L; Urbanski, Jennifer M; Utsunomiya, Yuri T; Verschoor, Chris P; Waardenberg, Ashley J; Wang, Zhiquan; Ward, Robert; Weikard, Rosemarie; Welsh, Thomas H; White, Stephen N; Wilming, Laurens G; Wunderlich, Kris R; Yang, Jianqi; Zhao, Feng-Qi
2009-04-24
To understand the biology and evolution of ruminants, the cattle genome was sequenced to about sevenfold coverage. The cattle genome contains a minimum of 22,000 genes, with a core set of 14,345 orthologs shared among seven mammalian species of which 1217 are absent or undetected in noneutherian (marsupial or monotreme) genomes. Cattle-specific evolutionary breakpoint regions in chromosomes have a higher density of segmental duplications, enrichment of repetitive elements, and species-specific variations in genes associated with lactation and immune responsiveness. Genes involved in metabolism are generally highly conserved, although five metabolic genes are deleted or extensively diverged from their human orthologs. The cattle genome sequence thus provides a resource for understanding mammalian evolution and accelerating livestock genetic improvement for milk and meat production.
NASA Astrophysics Data System (ADS)
Jiang, Dengkai; Chen, Xuefei; Li, Lifang; Han, Zhanwen
2017-11-01
Two blue-straggler sequences discovered in globular cluster M30 provide a strong constraint on the formation mechanisms of blue stragglers. We study the formation of blue-straggler binaries through binary evolution, and find that binary evolution can contribute to the blue stragglers in both of the sequences. Whether a blue straggler is located in the blue sequence or red sequence depends on the contribution of the mass donor to the total luminosity of the binary, which is generally observed as a single star in globular clusters. The blue stragglers in the blue sequence have a cool white dwarf companion, while the majority (~60%) of the objects in the red sequence are binaries that are still experiencing mass transfer. However, there are also some objects for which the donors have just finished the mass transfer (the stripped-core stars, ~10%) or the blue stragglers (the accretors) have evolved away from the blue sequence (~30%). Meanwhile, W UMa contact binaries found in both sequences may be explained by various mass ratios, that is, W UMa contact binaries in the red sequence have two components with comparable masses (e.g., mass ratio q ~ 0.3-1.0), while those in the blue sequence have low mass ratios (e.g., q < 0.3). However, the fraction of the blue sequence in M30 cannot be reproduced by binary population synthesis if we assume the initial parameters of a binary sample to be the same as those of the field. This possibly indicates that dynamical effects on binary systems are very important in globular clusters.
Molecular Evolution in Historical Perspective.
Suárez-Díaz, Edna
2016-12-01
In the 1960s, advances in protein chemistry and molecular genetics provided new means for the study of biological evolution. Amino acid sequencing, nucleic acid hybridization, zone gel electrophoresis, and immunochemistry were some of the experimental techniques that brought new perspectives to the study of the patterns and mechanisms of evolution. New concepts, such as the molecular evolutionary clock, and the discovery of unexpected molecular phenomena, like the presence of repetitive sequences in eukaryotic genomes, eventually led to the realization that evolution might occur at a different pace at the organismic and the molecular levels, and according to different mechanisms. These developments sparked important debates between proponents of the molecular and organismic approaches. The most vocal confrontations focused on the relation between primates and humans, and the neutral theory of molecular evolution. By the 1980s and 1990s, the construction of large protein and DNA sequence databases, and the development of computer-based statistical tools, facilitated the coming together of molecular and evolutionary biology. Although in its contemporary form the field of molecular evolution can be traced back to the last five decades, the field has deep roots in twentieth century experimental life sciences. For historians of science, the origins and consolidation of molecular evolution provide a privileged field for the study of scientific debates, the relation between technological advances and scientific knowledge, and the connection between science and broader social concerns.
2011-01-01
Background: Ribosomal 5S genes are well known for the critical role they play in ribosome folding and functionality. These genes are thought to evolve in a concerted fashion, with high rates of homogenization of gene copies. However, the majority of previous analyses regarding the evolutionary process of rDNA repeats were conducted in invertebrates and plants. Studies have also been conducted on vertebrates, but these analyses were usually restricted to the 18S, 5.8S and 28S rRNA genes. The recent identification of divergent 5S rRNA gene paralogs in the genomes of elasmobranches and teleost fishes indicates that the eukaryotic 5S rRNA gene family has a more complex genomic organization than previously thought. The availability of new sequence data from lower vertebrates such as teleosts and elasmobranches enables an enhanced evolutionary characterization of 5S rDNA among vertebrates. Results: We identified two variant classes of 5S rDNA sequences in the genomes of Potamotrygonidae stingrays, similar to the genomes of other vertebrates. One class of 5S rRNA genes was shared only by elasmobranches. A broad comparative survey among 100 vertebrate species suggests that the 5S rRNA gene variants in fishes originated from rounds of genome duplication. These variants were then maintained or eliminated by birth-and-death mechanisms, under intense purifying selection. Clustered multiple copies of 5S rDNA variants could have arisen due to unequal crossing over mechanisms. Simultaneously, the distinct genome clusters were independently homogenized, resulting in the maintenance of clusters of highly similar repeats through concerted evolution. Conclusions: We believe that 5S rDNA molecular evolution in fish genomes is driven by a mixed mechanism that integrates birth-and-death and concerted evolution. PMID:21627815
Bohlen, Jörg; Šlechtová, Vendula; Altmanová, Marie; Pelikánová, Šárka; Ráb, Petr
2018-01-01
Polyploidization has played an important role in the evolution of vertebrates, particularly at the base of Teleostei, an enormously successful ray-finned fish group with additional genome doublings on lower taxonomic levels. The investigation of post-polyploid genome dynamics might provide important clues about the evolution and ecology of respective species and can help to decipher the role of polyploidy per se on speciation. Few studies have attempted to investigate the dynamics of repetitive DNA sequences in the post-polyploid genome using molecular cytogenetic tools in fishes, though recent efforts demonstrated their usefulness. The demonstrably monophyletic freshwater loach family Botiidae, branching to evolutionary diploid and tetraploid lineages separated >25 Mya, offers a suitable model group for comparing the long-term repetitive DNA evolution. For this, we integrated phylogenetic analyses with a cytogenetic survey involving Giemsa- and Chromomycin A3 (CMA3)/DAPI stainings and fluorescence in situ hybridization with 5S/45S rDNA, U2 snDNA and telomeric probes in a representative sample of 12 botiid species. The karyotypes of all diploids were composed of 2n = 50 chromosomes, while the majority of tetraploids had 2n = 4x = 100, with only subtle interspecific karyotype differences. The exceptional karyotype of Botia dario (2n = 4x = 96) suggested centric fusions behind the 2n reduction. Variable patterns of FISH signals revealed cases of intraspecific polymorphisms, rDNA amplification, variable degree of correspondence with CMA3+ sites and almost no phylogenetic signal. In tetraploids, either additivity or loci gain/loss was recorded. Despite the absence of classical interstitial telomeric sites, large blocks of interspersed rDNA/telomeric regions were found in diploids only. We uncovered different molecular drives of studied repetitive DNA classes within botiid genomes as well as the advanced stage of the re-diploidization process in tetraploids. Our results may help to link genomic approaches with molecular cytogenetic analyses in addressing the origin and mechanism of this polyploidization event. PMID:29590207
The proteome: structure, function and evolution
Fleming, Keiran; Kelley, Lawrence A; Islam, Suhail A; MacCallum, Robert M; Muller, Arne; Pazos, Florencio; Sternberg, Michael J.E
2006-01-01
This paper reports two studies to model the inter-relationships between protein sequence, structure and function. First, an automated pipeline to provide a structural annotation of proteomes in the major genomes is described. The results are stored in a database at Imperial College, London (3D-GENOMICS) that can be accessed at www.sbg.bio.ic.ac.uk. Analysis of the assignments to structural superfamilies provides evolutionary insights. 3D-GENOMICS is being integrated with related proteome annotation data at University College London and the European Bioinformatics Institute in a project known as e-protein (http://www.e-protein.org/). The second topic is motivated by the developments in structural genomics projects in which the structure of a protein is determined prior to knowledge of its function. We have developed a new approach PHUNCTIONER that uses the gene ontology (GO) classification to supervise the extraction of the sequence signal responsible for protein function from a structure-based sequence alignment. Using GO we can obtain profiles for a range of specificities described in the ontology. In the region of low sequence similarity (around 15%), our method is more accurate than assignment from the closest structural homologue. The method is also able to identify the specific residues associated with the function of the protein family. PMID:16524832
IMGT, the International ImMunoGeneTics database.
Lefranc, M P; Giudicelli, V; Busin, C; Bodmer, J; Müller, W; Bontrop, R; Lemaitre, M; Malik, A; Chaume, D
1998-01-01
IMGT, the international ImMunoGeneTics database, is an integrated database specialising in Immunoglobulins (Ig), T cell Receptors (TcR) and Major Histocompatibility Complex (MHC) of all vertebrate species, created by Marie-Paule Lefranc, CNRS, Montpellier II University, Montpellier, France (lefranc@ligm.crbm.cnrs-mop.fr). IMGT includes three databases: LIGM-DB (for Ig and TcR), MHC/HLA-DB and PRIMER-DB (the last two in development). IMGT comprises expertly annotated sequences and alignment tables. LIGM-DB contains more than 23 000 Immunoglobulin and T cell Receptor sequences from 78 species. MHC/HLA-DB contains Class I and Class II Human Leucocyte Antigen alignment tables. An IMGT tool, DNAPLOT, developed for Ig, TcR and MHC sequence alignments, is also available. IMGT works in close collaboration with the EMBL database. IMGT goals are to establish a common data access to all immunogenetics data, including nucleotide and protein sequences, oligonucleotide primers, gene maps and other genetic data of Ig, TcR and MHC molecules, and to provide a graphical user friendly data access. IMGT has important implications in medical research (repertoire in autoimmune diseases, AIDS, leukemias, lymphomas), therapeutical approaches (antibody engineering), genome diversity and genome evolution studies. IMGT is freely available at http://imgt.cnusc.fr:8104 PMID:9399859
Kayal, Ehsan; Bentlage, Bastian; Cartwright, Paulyn; Yanagihara, Angel A; Lindsay, Dhugal J; Hopcroft, Russell R; Collins, Allen G
2015-01-01
Hydrozoans display the most morphological diversity within the phylum Cnidaria. While recent molecular studies have provided some insights into their evolutionary history, sister group relationships remain mostly unresolved, particularly at mid-taxonomic levels. Specifically, within Hydroidolina, the most speciose hydrozoan subclass, the relationships and sometimes integrity of orders are highly unsettled. Here we obtained the near complete mitochondrial sequence of twenty-six hydroidolinan hydrozoan species from a range of sources (DNA and RNA-seq data, long-range PCR). Our analyses confirm previous inference of the evolution of mtDNA in Hydrozoa while introducing a novel genome organization. Using RNA-seq data, we propose a mechanism for the expression of mitochondrial mRNA in Hydroidolina that can be extrapolated to the other medusozoan taxa. Phylogenetic analyses using the full set of mitochondrial gene sequences provide some insights into the order-level relationships within Hydroidolina, including siphonophores as the first diverging clade, a well-supported clade comprised of Leptothecata-Filifera III–IV, and a second clade comprised of Aplanulata-Capitata s.s.-Filifera I–II. Finally, we describe our relatively inexpensive and accessible multiplexing strategy to sequence long-range PCR amplicons that can be adapted to most high-throughput sequencing platforms. PMID:26618080
Sharma, Alok; Pohlentz, Gottfried; Bobbili, Kishore Babu; Jeyaprakash, A Arockia; Chandran, Thyageshwar; Mormann, Michael; Swamy, Musti J; Vijayan, M
2013-08-01
The sequence and structure of snake gourd seed lectin (SGSL), a nontoxic homologue of type II ribosome-inactivating proteins (RIPs), have been determined by mass spectrometry and X-ray crystallography, respectively. As in type II RIPs, the molecule consists of a lectin chain made up of two β-trefoil domains. The catalytic chain, which is connected through a disulfide bridge to the lectin chain in type II RIPs, is cleaved into two in SGSL. However, the integrity of the three-dimensional structure of the catalytic component of the molecule is preserved. This is the first time that a three-chain RIP or RIP homologue has been observed. A thorough examination of the sequence and structure of the protein and of its interactions with the bound methyl-α-galactose indicates that the nontoxicity of SGSL results from a combination of changes in the catalytic and the carbohydrate-binding sites. Detailed analyses of the sequences of type II RIPs of known structure and their homologues with unknown structure provide valuable insights into the evolution of this class of proteins. They also indicate some variability in carbohydrate-binding sites, which appears to contribute to the different levels of toxicity exhibited by lectins from various sources.
NASA Astrophysics Data System (ADS)
Nallaseth, Ferez Soli
The Y-chromosome presents a unique cytogenetic framework for the evolution of nucleotide sequences. Alignment of nine Y-chromosomal fragments in their increasing Y-specific/non Y-specific (male/female) sequence divergence ratios was directly and inversely related to their interspersion on these two respective genomic fractions. Sequence analysis confirmed a direct relationship between divergence ratios and the Alu, LINE-1, Satellite and their derivative oligonucleotide contents. Thus their relocation on the Y-chromosome is followed by sequence divergence rather than the well documented concerted evolution of these non-coding progenitor repeated sequences. Five of the nine Y-chromosomal fragments are non-pseudoautosomal and transcribed into heterogeneous poly(A)+ RNA and thus can be retrotransposed. Evolutionary and computer analysis identified homologous oligonucleotide tracts in several human loci suggesting common and random mechanistic origins. Dysgenic genomes represent the accelerated evolution driving sequence divergence (McClintock, 1984). Sex reversal and sterility characterizing dysgenesis occurs in C57BL/6JY^Pos but not in 129/SvY^Pos derivative strains. High frequency, random, multi-locus deletion products of the feral Y^Pos-chromosome are generated in the germlines of F1(C57BL/6J X 129/SvY^Pos)(male) and C57BL/6JY^Pos(male) but not in 129/SvY^Pos(male). Equal, 10^-1, 10^-2, and 0 copies (relative to males) of Y^Pos-specific deletion products respectively characterize C57BL/6JY^Pos (HC), (LC), (T) and (F) females. The testes determining loci of inactive Y^Pos-chromosomes in C57BL/6JY^Pos HC females are the preferentially deleted/rearranged Y^Pos-sequences. Disruption of regulation of plasma testosterone and hepatic MUP-A mRNA levels, TRD of a 4.7 Kbp EcoR1 fragment suggest disruption of autosomal/X-chromosomal sequences. These data and the highly repeated progenitor (Alu, GATA, LINE-1) sequence content of deletion products confirmed the previously unidentified loss of genetic control of mammalian chromosome biology and hybrid dysgenesis.
Yoshida, M A; Ogura, A; Ikeo, K; Shigeno, S; Moritaki, T; Winters, G C; Kohn, A B; Moroz, L L
2015-12-01
Coleoid cephalopods show remarkable evolutionary convergence with vertebrates in their neural organization, including (1) eyes and visual system with optic lobes, (2) specialized parts of the brain controlling learning and memory, such as vertical lobes, and (3) unique vasculature supporting such complexity of the central nervous system. We performed deep sequencing of eye transcriptomes of pygmy squids (Idiosepius paradoxus) and chambered nautiluses (Nautilus pompilius) to decipher the molecular basis of convergent evolution in cephalopods. RNA-seq was complemented by in situ hybridization to localize the expression of selected genes. We found three types of genomic innovations in the evolution of complex brains: (1) recruitment of novel genes into morphogenetic pathways, (2) recombination of various coding and regulatory regions of different genes, often called "evolutionary tinkering" or "co-option", and (3) duplication and divergence of genes. Massive recruitment of novel genes occurred in the evolution of the "camera" eye from nautilus' "pinhole" eye. We also showed that the type-2 co-option of transcription factors played important roles in the evolution of the lens and visual neurons. In summary, the cephalopod convergent morphological evolution of the camera eyes was driven by a mosaic of all types of gene recruitments. In addition, our analysis revealed unexpected variations of squids' opsins, retinochromes, and arrestins, providing more detailed information, valuable for further research on intra-ocular and extra-ocular photoreception of the cephalopods. © The Author 2015. Published by Oxford University Press on behalf of the Society for Integrative and Comparative Biology. All rights reserved. For permissions please email: journals.permissions@oup.com.
Ye, Lidan; Yang, Chengcheng; Yu, Hongwei
2018-01-01
With increasing concerns in sustainable development, biocatalysis has been recognized as a competitive alternative to traditional chemical routes in the past decades. As nature's biocatalysts, enzymes are able to catalyze a broad range of chemical transformations, not only with mild reaction conditions but also with high activity and selectivity. However, the insufficient activity or enantioselectivity of natural enzymes toward non-natural substrates limits their industrial application, while directed evolution provides a potent solution to this problem, thanks to its independence on detailed knowledge about the relationship between sequence, structure, and mechanism/function of the enzymes. A proper high-throughput screening (HTS) method is the key to successful and efficient directed evolution. In recent years, huge varieties of HTS methods have been developed for rapid evaluation of mutant libraries, ranging from in vitro screening to in vivo selection, from indicator addition to multi-enzyme system construction, and from plate screening to computation- or machine-assisted screening. Recently, there is a tendency to integrate directed evolution with metabolic engineering in biosynthesis, using metabolites as HTS indicators, which implies that directed evolution has transformed from molecular engineering to process engineering. This paper aims to provide an overview of HTS methods categorized based on the reaction principles or types by summarizing related studies published in recent years including the work from our group, to discuss assay design strategies and typical examples of HTS methods, and to share our understanding on HTS method development for directed evolution of enzymes involved in specific catalytic reactions or metabolic pathways.
Trombetta, Beniamino; Sellitto, Daniele; Scozzari, Rosaria; Cruciani, Fulvio
2014-01-01
It has long been believed that the male-specific region of the human Y chromosome (MSY) is genetically independent from the X chromosome. This idea has been recently dismissed due to the discovery that X–Y gametologous gene conversion may occur. However, the pervasiveness of this molecular process in the evolution of sex chromosomes has yet to be exhaustively analyzed. In this study, we explored how pervasive X–Y gene conversion has been during the evolution of the youngest stratum of the human sex chromosomes. By comparing about 0.5 Mb of human–chimpanzee gametologous sequences, we identified 19 regions in which extensive gene conversion has occurred. From our analysis, two major features of these emerged: 1) Several of them are evolutionarily conserved between the two species and 2) almost all of the 19 hotspots overlap with regions where X–Y crossing-over has been previously reported to be involved in sex reversal. Furthermore, in order to explore the dynamics of X–Y gametologous conversion in recent human evolution, we resequenced these 19 hotspots in 68 widely divergent Y haplogroups and used publicly available single nucleotide polymorphism data for the X chromosome. We found that at least ten hotspots are still active in humans. Hence, the results of the interspecific analysis are consistent with the hypothesis of widespread reticulate evolution within gametologous sequences in the differentiation of hominini sex chromosomes. In turn, intraspecific analysis demonstrates that X–Y gene conversion may modulate human sex-chromosome-sequence evolution to a greater extent than previously thought. PMID:24817545
Razban, Rostam M; Gilson, Amy I; Durfee, Niamh; Strobelt, Hendrik; Dinkla, Kasper; Choi, Jeong-Mo; Pfister, Hanspeter; Shakhnovich, Eugene I
2018-05-08
Protein evolution spans time scales and its effects span the length of an organism. A web app named ProteomeVis is developed to provide a comprehensive view of protein evolution in the S. cerevisiae and E. coli proteomes. ProteomeVis interactively creates protein chain graphs, where edges between nodes represent structure and sequence similarities within user-defined ranges, to study the long time scale effects of protein structure evolution. The short time scale effects of protein sequence evolution are studied by sequence evolutionary rate (ER) correlation analyses with protein properties that span from the molecular to the organismal level. We demonstrate the utility and versatility of ProteomeVis by investigating the distribution of edges per node in organismal protein chain universe graphs (oPCUGs) and putative ER determinants. S. cerevisiae and E. coli oPCUGs are scale-free with scaling constants of 1.79 and 1.56, respectively. Both scaling constants can be explained by a previously reported theoretical model describing protein structure evolution (Dokholyan et al., 2002). Protein abundance most strongly correlates with ER among properties in ProteomeVis, with Spearman correlations of -0.49 (p-value < 10^-10) and -0.46 (p-value < 10^-10) for S. cerevisiae and E. coli, respectively. This result is consistent with previous reports that found protein expression to be the most important ER determinant (Zhang and Yang, 2015). ProteomeVis is freely accessible at http://proteomevis.chem.harvard.edu. Supplementary data are available at Bioinformatics. Contact: shakhnovich@chemistry.harvard.edu.
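The abstract above reports Spearman rank correlations between protein abundance and evolutionary rate. As a rough illustration of that statistic only, the following minimal sketch computes a Spearman correlation in plain Python. It is not ProteomeVis code: the function names and the toy `abundance` and `evol_rate` values are hypothetical and chosen purely for demonstration.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical toy data (NOT taken from ProteomeVis): abundant proteins
# tend to evolve more slowly, so the correlation should come out negative.
abundance = [10, 200, 3500, 40, 900, 15000]
evol_rate = [0.90, 0.60, 0.20, 0.45, 0.70, 0.10]
rho = spearman(abundance, evol_rate)
print(f"Spearman rho = {rho:.2f}")
```

Because the statistic depends only on ranks, it is insensitive to the several orders of magnitude that abundance values can span, which is one plausible reason to prefer it over a Pearson correlation for this kind of comparison.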
Hao, Jia-Jie; Lin, De-Chen; Dinh, Huy Q; Mayakonda, Anand; Jiang, Yan-Yi; Chang, Chen; Jiang, Ye; Lu, Chen-Chen; Shi, Zhi-Zhou; Xu, Xin; Zhang, Yu; Cai, Yan; Wang, Jin-Wu; Zhan, Qi-Min; Wei, Wen-Qiang; Berman, Benjamin P; Wang, Ming-Rong; Koeffler, H Phillip
2016-12-01
Esophageal squamous cell carcinoma (ESCC) is among the most common malignancies, but little is known about its spatial intratumoral heterogeneity (ITH) and temporal clonal evolutionary processes. To address this, we performed multiregion whole-exome sequencing on 51 tumor regions from 13 ESCC cases and multiregion global methylation profiling for 3 of these 13 cases. We found an average of 35.8% heterogeneous somatic mutations with strong evidence of ITH. Half of the driver mutations located on the branches of tumor phylogenetic trees targeted oncogenes, including PIK3CA, NFE2L2 and MTOR, among others. By contrast, the majority of truncal and clonal driver mutations occurred in tumor-suppressor genes, including TP53, KMT2D and ZNF750, among others. Interestingly, phyloepigenetic trees robustly recapitulated the topological structures of the phylogenetic trees, indicating a possible relationship between genetic and epigenetic alterations. Our integrated investigations of spatial ITH and clonal evolution provide an important molecular foundation for enhanced understanding of tumorigenesis and progression in ESCC.
Cnidarian Cell Type Diversity and Regulation Revealed by Whole-Organism Single-Cell RNA-Seq.
Sebé-Pedrós, Arnau; Saudemont, Baptiste; Chomsky, Elad; Plessier, Flora; Mailhé, Marie-Pierre; Renno, Justine; Loe-Mie, Yann; Lifshitz, Aviezer; Mukamel, Zohar; Schmutz, Sandrine; Novault, Sophie; Steinmetz, Patrick R H; Spitz, François; Tanay, Amos; Marlow, Heather
2018-05-31
The emergence and diversification of cell types is a leading factor in animal evolution. So far, systematic characterization of the gene regulatory programs associated with cell type specificity was limited to few cell types and few species. Here, we perform whole-organism single-cell transcriptomics to map adult and larval cell types in the cnidarian Nematostella vectensis, a non-bilaterian animal with complex tissue-level body-plan organization. We uncover eight broad cell classes in Nematostella, including neurons, cnidocytes, and digestive cells. Each class comprises different subtypes defined by the expression of multiple specific markers. In particular, we characterize a surprisingly diverse repertoire of neurons, which comparative analysis suggests are the result of lineage-specific diversification. By integrating transcription factor expression, chromatin profiling, and sequence motif analysis, we identify the regulatory codes that underlie Nematostella cell-specific expression. Our study reveals cnidarian cell type complexity and provides insights into the evolution of animal cell-specific genomic regulation. Copyright © 2018 Elsevier Inc. All rights reserved.
Circular DNA Intermediate in the Duplication of Nile Tilapia vasa Genes
Fujimura, Koji; Conte, Matthew A.; Kocher, Thomas D.
2011-01-01
vasa is a highly conserved RNA helicase involved in animal germ cell development. Among vertebrate species, it is typically present as a single copy per genome. Here we report the isolation and sequencing of BAC clones for Nile tilapia vasa genes. Contrary to a previous report that Nile tilapia have a single copy of the vasa gene, we find evidence for at least three vasa gene loci. The vasa gene locus was duplicated from the original site and integrated into two distant novel sites. For one of these insertions we find evidence that the duplication was mediated by a circular DNA intermediate. This mechanism of gene duplication may explain the origin of isolated gene duplicates during the evolution of fish genomes. These data provide a foundation for studying the role of multiple vasa genes in the development of tilapia gonads, and will contribute to investigations of the molecular mechanisms of sex determination and evolution in cichlid fishes. PMID:22216289
FunTree: advances in a resource for exploring and contextualising protein function evolution.
Sillitoe, Ian; Furnham, Nicholas
2016-01-04
FunTree is a resource that brings together protein sequence, structure and functional information, including overall chemical reaction and mechanistic data, for structurally defined domain superfamilies. Developed in tandem with the CATH database, the original FunTree contained just 276 superfamilies focused on enzymes. Here, we present an update of FunTree that has expanded to include 2340 superfamilies including both enzymes and proteins with non-enzymatic functions annotated by Gene Ontology (GO) terms. This allows the investigation of how novel functions have evolved within a structurally defined superfamily and provides a means to analyse trends across many superfamilies. This is done not only within the context of a protein's sequence and structure but also the relationships of their functions. New measures of functional similarity have been integrated, including for enzymes comparisons of overall reactions based on overall bond changes, reaction centres (the local environment atoms involved in the reaction) and the sub-structure similarities of the metabolites involved in the reaction and for non-enzymes semantic similarities based on the GO. To identify and highlight changes in function through evolution, ancestral character estimations are made and presented. All this is accessible through a new re-designed web interface that can be found at http://www.funtree.info. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Wu, Dong-Dong; Ye, Ling-Qun; Li, Yan; Sun, Yan-Bo; Shao, Yi; Chen, Chunyan; Zhu, Zhu; Zhong, Li; Wang, Lu; Irwin, David M; Zhang, Yong E; Zhang, Ya-Ping
2015-08-01
Next-generation RNA sequencing has been successfully used for identification of transcript assembly, evaluation of gene expression levels, and detection of post-transcriptional modifications. Despite these large-scale studies, additional comprehensive RNA-seq data from different subregions of the human brain are required to fully evaluate the evolutionary patterns experienced by the human brain transcriptome. Here, we provide a total of 6.5 billion RNA-seq reads from different subregions of the human brain. A significant correlation was observed between the levels of alternative splicing and RNA editing, which might be explained by a competition between the molecular machineries responsible for the splicing and editing of RNA. Young human protein-coding genes demonstrate biased expression to the neocortical and non-neocortical regions during evolution on the lineage leading to humans. We also found that a significantly greater number of young human protein-coding genes are expressed in the putamen, a tissue that was also observed to have the highest level of RNA-editing activity. The putamen, which previously received little attention, plays an important role in cognitive ability, and our data suggest a potential contribution of the putamen to human evolution. © The Author (2015). Published by Oxford University Press on behalf of Journal of Molecular Cell Biology, IBCB, SIBS, CAS. All rights reserved.
Differential pleiotropy and HOX functional organization.
Sivanantharajah, Lovesha; Percival-Smith, Anthony
2015-02-01
Key studies led to the idea that transcription factors are composed of defined modular protein motifs or domains, each with separable, unique function. During evolution, the recombination of these modular domains could give rise to transcription factors with new properties, as has been shown using recombinant molecules. This archetypic, modular view of transcription factor organization is based on the analyses of a few transcription factors such as GAL4, which may represent extreme exemplars rather than an archetype or the norm. Recent work with a set of Homeotic selector (HOX) proteins has revealed differential pleiotropy: the observation that highly-conserved HOX protein motifs and domains make small, additive, tissue-specific contributions to HOX activity. Many of these differentially pleiotropic HOX motifs may represent plastic sequence elements called short linear motifs (SLiMs). The coupling of differential pleiotropy with SLiMs suggests that protein sequence changes in HOX transcription factors may have had a greater impact on morphological diversity during evolution than previously believed. Furthermore, differential pleiotropy may be the genetic consequence of an ensemble nature of HOX transcription factor allostery, where HOX proteins exist as an ensemble of states with the capacity to integrate an extensive array of developmental information. Given a new structural model for HOX functional domain organization, the properties of the archetypic TF may require reassessment. Copyright © 2014 Elsevier Inc. All rights reserved.
Goller, Katja V; Gabriel, Claudia; Dimna, Mireille Le; Le Potier, Marie-Frédérique; Rossi, Sophie; Staubach, Christoph; Merboth, Matthias; Beer, Martin; Blome, Sandra
2016-03-01
Classical swine fever is a viral disease of pigs that carries tremendous socio-economic impact. In outbreak situations, genetic typing is carried out for the purpose of molecular epidemiology in both domestic pigs and wild boar. These analyses are usually based on harmonized partial sequences. However, for high-resolution analyses towards the understanding of genetic variability and virus evolution, full-genome sequences are more appropriate. In this study, a unique set of representative virus strains was investigated that was collected during an outbreak in French free-ranging wild boar in the Vosges-du-Nord mountains between 2003 and 2007. Comparative sequence and evolutionary analyses of the nearly full-length sequences showed only slow evolution of classical swine fever virus strains over the years and no impact of vaccination on mutation rates. However, substitution rates varied amongst protein genes; furthermore, a spatial and temporal pattern could be observed whereby two separate clusters were formed that coincided with physical barriers.
The evolution of vertebrate Toll-like receptors
Roach, J.C.; Glusman, G.; Rowen, L.; Kaur, A.; Purcell, M.K.; Smith, K.D.; Hood, L.E.; Aderem, A.
2005-01-01
The complete sequences of Takifugu Toll-like receptor (TLR) loci and gene predictions from many draft genomes enable comprehensive molecular phylogenetic analysis. Strong selective pressure for recognition of and response to pathogen-associated molecular patterns has maintained a largely unchanging TLR recognition in all vertebrates. There are six major families of vertebrate TLRs. This repertoire is distinct from that of invertebrates. TLRs within a family recognize a general class of pathogen-associated molecular patterns. Most vertebrates have exactly one gene ortholog for each TLR family. The family including TLR1 has more species-specific adaptations than other families. A major family including TLR11 is represented in humans only by a pseudogene. Coincidental evolution plays a minor role in TLR evolution. The sequencing phase of this study produced finished genomic sequences for the 12 Takifugu rubripes TLRs. In addition, we have produced >70 gene models, including sequences from the opossum, chicken, frog, dog, sea urchin, and sea squirt. © 2005 by The National Academy of Sciences of the USA.
Pyvolve: A Flexible Python Module for Simulating Sequences along Phylogenies.
Spielman, Stephanie J; Wilke, Claus O
2015-01-01
We introduce Pyvolve, a flexible Python module for simulating genetic data along a phylogeny using continuous-time Markov models of sequence evolution. Easily incorporated into Python bioinformatics pipelines, Pyvolve can simulate sequences according to most standard models of nucleotide, amino-acid, and codon sequence evolution. All model parameters are fully customizable. Users can additionally specify custom evolutionary models, with custom rate matrices and/or states to evolve. This flexibility makes Pyvolve a convenient framework not only for simulating sequences under a wide variety of conditions, but also for developing and testing new evolutionary models. Pyvolve is an open-source project under a FreeBSD license, and it is available for download, along with a detailed user-manual and example scripts, from http://github.com/sjspielman/pyvolve.
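To make the idea of simulating sequences along a phylogeny with a continuous-time Markov model more concrete, here is a minimal standard-library sketch using the Jukes-Cantor (JC69) model. It deliberately does not use the Pyvolve API. The tree representation (nested tuples), the function names `evolve_jc69` and `simulate`, and the toy tree are assumptions made for illustration only.

```python
import math
import random

NUCLEOTIDES = "ACGT"


def evolve_jc69(seq, branch_length, rng):
    """Evolve `seq` along one branch under JC69.

    `branch_length` is in expected substitutions per site. Under JC69 a site
    ends the branch in a different base with probability
    3/4 * (1 - exp(-4t/3)), split evenly among the three other bases.
    """
    p_change = 0.75 * (1.0 - math.exp(-4.0 * branch_length / 3.0))
    out = []
    for base in seq:
        if rng.random() < p_change:
            out.append(rng.choice([b for b in NUCLEOTIDES if b != base]))
        else:
            out.append(base)
    return "".join(out)


def simulate(tree, parent_seq, rng):
    """Recursively evolve sequences down a tree and return the leaf sequences.

    `tree` is a hypothetical nested tuple: (name, branch_length, children).
    """
    name, length, children = tree
    seq = evolve_jc69(parent_seq, length, rng)
    if not children:
        return {name: seq}
    leaves = {}
    for child in children:
        leaves.update(simulate(child, seq, rng))
    return leaves


rng = random.Random(42)  # fixed seed so the run is reproducible
root_seq = "".join(rng.choice(NUCLEOTIDES) for _ in range(60))
toy_tree = ("root", 0.0, [
    ("t1", 0.1, []),
    ("anc", 0.05, [("t2", 0.2, []), ("t3", 0.2, [])]),
])
alignment = simulate(toy_tree, root_seq, rng)
for taxon, seq in sorted(alignment.items()):
    print(taxon, seq)
```

Sampling the end state of each branch directly from the transition probabilities, rather than simulating individual substitution events, keeps the sketch short. Richer models (unequal base frequencies, codon models, rate heterogeneity) would need a full rate matrix, which is the kind of generality the abstract attributes to Pyvolve.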
Isolation and characterization of major histocompatibility complex class II B genes in cranes.
Kohyama, Tetsuo I; Akiyama, Takuya; Nishida, Chizuko; Takami, Kazutoshi; Onuma, Manabu; Momose, Kunikazu; Masuda, Ryuichi
2015-11-01
In this study, we isolated and characterized the major histocompatibility complex (MHC) class II B genes in cranes. Genomic sequences spanning exons 1 to 4 were amplified and determined in 13 crane species and three other species closely related to cranes. In all, 55 unique sequences were identified, and at least two polymorphic MHC class II B loci were found in most species. An analysis of sequence polymorphisms showed the signature of positive selection and recombination. A phylogenetic reconstruction based on exon 2 sequences indicated that trans-species polymorphism has persisted for at least 10 million years, whereas phylogenetic analyses of the sequences flanking exon 2 revealed a pattern of concerted evolution. These results suggest that both balancing selection and recombination play important roles in the crane MHC evolution.
Birney, Ewan; Stamatoyannopoulos, John A; Dutta, Anindya; Guigó, Roderic; Gingeras, Thomas R; Margulies, Elliott H; Weng, Zhiping; Snyder, Michael; Dermitzakis, Emmanouil T; Thurman, Robert E; Kuehn, Michael S; Taylor, Christopher M; Neph, Shane; Koch, Christoph M; Asthana, Saurabh; Malhotra, Ankit; Adzhubei, Ivan; Greenbaum, Jason A; Andrews, Robert M; Flicek, Paul; Boyle, Patrick J; Cao, Hua; Carter, Nigel P; Clelland, Gayle K; Davis, Sean; Day, Nathan; Dhami, Pawandeep; Dillon, Shane C; Dorschner, Michael O; Fiegler, Heike; Giresi, Paul G; Goldy, Jeff; Hawrylycz, Michael; Haydock, Andrew; Humbert, Richard; James, Keith D; Johnson, Brett E; Johnson, Ericka M; Frum, Tristan T; Rosenzweig, Elizabeth R; Karnani, Neerja; Lee, Kirsten; Lefebvre, Gregory C; Navas, Patrick A; Neri, Fidencio; Parker, Stephen C J; Sabo, Peter J; Sandstrom, Richard; Shafer, Anthony; Vetrie, David; Weaver, Molly; Wilcox, Sarah; Yu, Man; Collins, Francis S; Dekker, Job; Lieb, Jason D; Tullius, Thomas D; Crawford, Gregory E; Sunyaev, Shamil; Noble, William S; Dunham, Ian; Denoeud, France; Reymond, Alexandre; Kapranov, Philipp; Rozowsky, Joel; Zheng, Deyou; Castelo, Robert; Frankish, Adam; Harrow, Jennifer; Ghosh, Srinka; Sandelin, Albin; Hofacker, Ivo L; Baertsch, Robert; Keefe, Damian; Dike, Sujit; Cheng, Jill; Hirsch, Heather A; Sekinger, Edward A; Lagarde, Julien; Abril, Josep F; Shahab, Atif; Flamm, Christoph; Fried, Claudia; Hackermüller, Jörg; Hertel, Jana; Lindemeyer, Manja; Missal, Kristin; Tanzer, Andrea; Washietl, Stefan; Korbel, Jan; Emanuelsson, Olof; Pedersen, Jakob S; Holroyd, Nancy; Taylor, Ruth; Swarbreck, David; Matthews, Nicholas; Dickson, Mark C; Thomas, Daryl J; Weirauch, Matthew T; Gilbert, James; Drenkow, Jorg; Bell, Ian; Zhao, XiaoDong; Srinivasan, K G; Sung, Wing-Kin; Ooi, Hong Sain; Chiu, Kuo Ping; Foissac, Sylvain; Alioto, Tyler; Brent, Michael; Pachter, Lior; Tress, Michael L; Valencia, Alfonso; Choo, Siew Woh; Choo, Chiou Yu; Ucla, Catherine; Manzano, Caroline; 
Wyss, Carine; Cheung, Evelyn; Clark, Taane G; Brown, James B; Ganesh, Madhavan; Patel, Sandeep; Tammana, Hari; Chrast, Jacqueline; Henrichsen, Charlotte N; Kai, Chikatoshi; Kawai, Jun; Nagalakshmi, Ugrappa; Wu, Jiaqian; Lian, Zheng; Lian, Jin; Newburger, Peter; Zhang, Xueqing; Bickel, Peter; Mattick, John S; Carninci, Piero; Hayashizaki, Yoshihide; Weissman, Sherman; Hubbard, Tim; Myers, Richard M; Rogers, Jane; Stadler, Peter F; Lowe, Todd M; Wei, Chia-Lin; Ruan, Yijun; Struhl, Kevin; Gerstein, Mark; Antonarakis, Stylianos E; Fu, Yutao; Green, Eric D; Karaöz, Ulaş; Siepel, Adam; Taylor, James; Liefer, Laura A; Wetterstrand, Kris A; Good, Peter J; Feingold, Elise A; Guyer, Mark S; Cooper, Gregory M; Asimenos, George; Dewey, Colin N; Hou, Minmei; Nikolaev, Sergey; Montoya-Burgos, Juan I; Löytynoja, Ari; Whelan, Simon; Pardi, Fabio; Massingham, Tim; Huang, Haiyan; Zhang, Nancy R; Holmes, Ian; Mullikin, James C; Ureta-Vidal, Abel; Paten, Benedict; Seringhaus, Michael; Church, Deanna; Rosenbloom, Kate; Kent, W James; Stone, Eric A; Batzoglou, Serafim; Goldman, Nick; Hardison, Ross C; Haussler, David; Miller, Webb; Sidow, Arend; Trinklein, Nathan D; Zhang, Zhengdong D; Barrera, Leah; Stuart, Rhona; King, David C; Ameur, Adam; Enroth, Stefan; Bieda, Mark C; Kim, Jonghwan; Bhinge, Akshay A; Jiang, Nan; Liu, Jun; Yao, Fei; Vega, Vinsensius B; Lee, Charlie W H; Ng, Patrick; Shahab, Atif; Yang, Annie; Moqtaderi, Zarmik; Zhu, Zhou; Xu, Xiaoqin; Squazzo, Sharon; Oberley, Matthew J; Inman, David; Singer, Michael A; Richmond, Todd A; Munn, Kyle J; Rada-Iglesias, Alvaro; Wallerman, Ola; Komorowski, Jan; Fowler, Joanna C; Couttet, Phillippe; Bruce, Alexander W; Dovey, Oliver M; Ellis, Peter D; Langford, Cordelia F; Nix, David A; Euskirchen, Ghia; Hartman, Stephen; Urban, Alexander E; Kraus, Peter; Van Calcar, Sara; Heintzman, Nate; Kim, Tae Hoon; Wang, Kun; Qu, Chunxu; Hon, Gary; Luna, Rosa; Glass, Christopher K; Rosenfeld, M Geoff; Aldred, Shelley Force; Cooper, Sara J; Halees, 
Anason; Lin, Jane M; Shulha, Hennady P; Zhang, Xiaoling; Xu, Mousheng; Haidar, Jaafar N S; Yu, Yong; Ruan, Yijun; Iyer, Vishwanath R; Green, Roland D; Wadelius, Claes; Farnham, Peggy J; Ren, Bing; Harte, Rachel A; Hinrichs, Angie S; Trumbower, Heather; Clawson, Hiram; Hillman-Jackson, Jennifer; Zweig, Ann S; Smith, Kayla; Thakkapallayil, Archana; Barber, Galt; Kuhn, Robert M; Karolchik, Donna; Armengol, Lluis; Bird, Christine P; de Bakker, Paul I W; Kern, Andrew D; Lopez-Bigas, Nuria; Martin, Joel D; Stranger, Barbara E; Woodroffe, Abigail; Davydov, Eugene; Dimas, Antigone; Eyras, Eduardo; Hallgrímsdóttir, Ingileif B; Huppert, Julian; Zody, Michael C; Abecasis, Gonçalo R; Estivill, Xavier; Bouffard, Gerard G; Guan, Xiaobin; Hansen, Nancy F; Idol, Jacquelyn R; Maduro, Valerie V B; Maskeri, Baishali; McDowell, Jennifer C; Park, Morgan; Thomas, Pamela J; Young, Alice C; Blakesley, Robert W; Muzny, Donna M; Sodergren, Erica; Wheeler, David A; Worley, Kim C; Jiang, Huaiyang; Weinstock, George M; Gibbs, Richard A; Graves, Tina; Fulton, Robert; Mardis, Elaine R; Wilson, Richard K; Clamp, Michele; Cuff, James; Gnerre, Sante; Jaffe, David B; Chang, Jean L; Lindblad-Toh, Kerstin; Lander, Eric S; Koriabine, Maxim; Nefedov, Mikhail; Osoegawa, Kazutoyo; Yoshinaga, Yuko; Zhu, Baoli; de Jong, Pieter J
2007-06-14
We report the generation and analysis of functional data from multiple, diverse experiments performed on a targeted 1% of the human genome as part of the pilot phase of the ENCODE Project. These data have been further integrated and augmented by a number of evolutionary and computational analyses. Together, our results advance the collective knowledge about human genome function in several major areas. First, our studies provide convincing evidence that the genome is pervasively transcribed, such that the majority of its bases can be found in primary transcripts, including non-protein-coding transcripts, and those that extensively overlap one another. Second, systematic examination of transcriptional regulation has yielded new understanding about transcription start sites, including their relationship to specific regulatory sequences and features of chromatin accessibility and histone modification. Third, a more sophisticated view of chromatin structure has emerged, including its inter-relationship with DNA replication and transcriptional regulation. Finally, integration of these new sources of information, in particular with respect to mammalian evolution based on inter- and intra-species sequence comparisons, has yielded new mechanistic and evolutionary insights concerning the functional landscape of the human genome. Together, these studies are defining a path for pursuit of a more comprehensive characterization of human genome function.
Bhasi, Ashwini; Philip, Philge; Manikandan, Vinu; Senapathy, Periannan
2009-01-01
We have developed ExDom, a unique database for the comparative analysis of the exon–intron structures of 96 680 protein domains from seven eukaryotic organisms (Homo sapiens, Mus musculus, Bos taurus, Rattus norvegicus, Danio rerio, Gallus gallus and Arabidopsis thaliana). ExDom provides integrated access to exon-domain data through a sophisticated web interface which has the following analytical capabilities: (i) intergenomic and intragenomic comparative analysis of exon–intron structure of domains; (ii) color-coded graphical display of the domain architecture of proteins correlated with their corresponding exon-intron structures; (iii) graphical analysis of multiple sequence alignments of amino acid and coding nucleotide sequences of homologous protein domains from seven organisms; (iv) comparative graphical display of exon distributions within the tertiary structures of protein domains; and (v) visualization of exon–intron structures of alternative transcripts of a gene correlated to variations in the domain architecture of corresponding protein isoforms. These novel analytical features are highly suited for detailed investigations on the exon–intron structure of domains and make ExDom a powerful tool for exploring several key questions concerning the function, origin and evolution of genes and proteins. ExDom database is freely accessible at: http://66.170.16.154/ExDom/. PMID:18984624
Grid Integration Studies: Advancing Clean Energy Planning and Deployment
DOE Office of Scientific and Technical Information (OSTI.GOV)
Katz, Jessica; Chernyakhovskiy, Ilya
2016-07-01
Integrating significant variable renewable energy (VRE) into the grid requires an evolution in power system planning and operation. To plan for this evolution, power system stakeholders can undertake grid integration studies. This Greening the Grid document reviews grid integration studies, common elements, questions, and guidance for system planners.
Fuller, Zachary L; Niño, Elina L; Patch, Harland M; Bedoya-Reina, Oscar C; Baumgarten, Tracey; Muli, Elliud; Mumoki, Fiona; Ratan, Aakrosh; McGraw, John; Frazier, Maryann; Masiga, Daniel; Schuster, Stephen; Grozinger, Christina M; Miller, Webb
2015-07-10
With the development of inexpensive, high-throughput sequencing technologies, it has become feasible to examine questions related to population genetics and molecular evolution of non-model species in their ecological contexts on a genome-wide scale. Here, we employed a newly developed suite of integrated, web-based programs to examine population dynamics and signatures of selection across the genome using several well-established tests, including FST, pN/pS, and McDonald-Kreitman. We applied these techniques to study populations of honey bees (Apis mellifera) in East Africa. In Kenya, there are several described A. mellifera subspecies, which are thought to be localized to distinct ecological regions. We performed whole genome sequencing of 11 worker honey bees from apiaries distributed throughout Kenya and identified 3.6 million putative single-nucleotide polymorphisms. The dense coverage allowed us to apply several computational procedures to study population structure and the evolutionary relationships among the populations, and to detect signs of adaptive evolution across the genome. While there is considerable gene flow among the sampled populations, there are clear distinctions between populations from the northern desert region and those from the temperate, savannah region. We identified several genes showing population genetic patterns consistent with positive selection within African bee populations, and between these populations and European A. mellifera or Asian Apis florea. These results lay the groundwork for future studies of adaptive ecological evolution in honey bees, and demonstrate the use of new, freely available web-based tools and workflows (http://usegalaxy.org/r/kenyanbee) that can be applied to any model system with genomic information.
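Of the tests named above, the McDonald-Kreitman test is the simplest to write out: it compares nonsynonymous and synonymous counts among fixed differences (Dn, Ds) and among polymorphisms (Pn, Ps) in a 2x2 table. The sketch below is an illustration only and is not the web-based workflow used in the study; the counts are invented, and the function names are hypothetical. It reports the neutrality index NI = (Pn/Ps)/(Dn/Ds), the estimate alpha = 1 - NI, and a two-sided Fisher exact p-value computed from the hypergeometric distribution.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the probabilities of all tables with the same margins that are
    no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


def mcdonald_kreitman(dn, ds, pn, ps):
    """Return (neutrality index, alpha, p-value).

    Assumes dn, ds and ps are all non-zero.
    """
    ni = (pn / ps) / (dn / ds)
    alpha = 1.0 - ni  # estimated share of adaptive nonsynonymous fixations
    p_value = fisher_exact_two_sided(dn, ds, pn, ps)
    return ni, alpha, p_value


# Invented counts for a single hypothetical gene.
ni, alpha, p_value = mcdonald_kreitman(dn=12, ds=20, pn=5, ps=40)
print(f"NI = {ni:.3f}, alpha = {alpha:.3f}, p = {p_value:.4f}")
```

An NI below 1 (alpha above 0) indicates an excess of nonsynonymous fixed differences, the pattern expected under positive selection; an NI above 1 indicates an excess of nonsynonymous polymorphism.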
Rapid and Parallel Adaptive Evolution of the Visual System of Neotropical Midas Cichlid Fishes.
Torres-Dowdall, Julián; Pierotti, Michele E R; Härer, Andreas; Karagic, Nidal; Woltering, Joost M; Henning, Frederico; Elmer, Kathryn R; Meyer, Axel
2017-10-01
Midas cichlid fish are a Central American species flock containing 13 described species that has been dated to only a few thousand years old, a historical timescale infrequently associated with speciation. Their radiation involved the colonization of several clear water crater lakes from two turbid great lakes. Therefore, Midas cichlids have been subjected to widely varying photic conditions during their radiation. Being a primary signal relay for information from the environment to the organism, the visual system is under continuing selective pressure and a prime organ system for accumulating adaptive changes during speciation, particularly in the case of dramatic shifts in photic conditions. Here, we characterize the full visual system of Midas cichlids at organismal and genetic levels, to determine what types of adaptive changes evolved within the short time span of their radiation. We show that Midas cichlids have a diverse visual system with unexpectedly high intra- and interspecific variation in color vision sensitivity and lens transmittance. Midas cichlid populations in the clear crater lakes have convergently evolved visual sensitivities shifted toward shorter wavelengths compared with the ancestral populations from the turbid great lakes. This divergence in sensitivity is driven by changes in chromophore usage, differential opsin expression, opsin coexpression, and to a lesser degree by opsin coding sequence variation. The visual system of Midas cichlids has the evolutionary capacity to rapidly integrate multiple adaptations to changing light environments. Our data may indicate that, in early stages of divergence, changes in opsin regulation could precede changes in opsin coding sequence evolution. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
NASA Astrophysics Data System (ADS)
Cipollari, Paola; Cosentino, Domenico
1995-12-01
This paper presents the results of an integrated study (geology, biostratigraphy and geochemistry) of the Miocene sedimentary deposits in Central Italy, carried out to define the timing of sedimentary basin evolution. It also addresses the causes of the unconformities recorded in these basins. In the Miocene deposits of the Latina Valley and the Ernici-Simbruini Mts., several unconformities that separate different stratigraphic sequences have been recognized (D0, D1, D2, D3 and D4). For each unconformity, a general description and its geodynamic significance are provided. In particular, the D0 unconformity appears to be related to a regional tectonic event (the Adria-Europe collision), as a consequence of which the Adria lithosphere folded and the area underwent regional erosion. The D1, D2 and D3 unconformities had a more local tectonic control, since they represent the stratigraphic record of the migration of the Apennines thrust belt/foredeep system. D1 and D2 are related to the late Tortonian foredeep stage, whereas D3 is linked to the early Messinian piggy-back stage. The D4 unconformity, which formed during the Messinian piggy-back stage, is strictly linked to the sea-level drop of the Messinian salinity crisis. The genesis and evolution of a late Tortonian foreland basin (the Latina Valley foredeep basin) are also emphasized. Finally, taking into account sequence boundaries, nannofossil biostratigraphy and isotope geochemistry data, a comparison with the third-order relative coastal onlap curve (Haq et al., 1988) has been attempted in order to distinguish the unconformities controlled by tectonic processes from those controlled by eustatic processes.
Molecular Evolution of Slow and Quick Anion Channels (SLACs and QUACs/ALMTs)
Dreyer, Ingo; Gomez-Porras, Judith Lucia; Riaño-Pachón, Diego Mauricio; Hedrich, Rainer; Geiger, Dietmar
2012-01-01
Electrophysiological analyses conducted about 25 years ago detected two types of anion channels in the plasma membrane of guard cells. One type of channel responds slowly to changes in membrane voltage while the other responds quickly. Consequently, they were named SLAC, for SLow Anion Channel, and QUAC, for QUick Anion Channel. Recently, genes SLAC1 and QUAC1/ALMT12, underlying the two different anion current components, could be identified in the model plant Arabidopsis thaliana. Expression of the gene products in Xenopus oocytes confirmed the quick and slow current kinetics. In this study we provide an overview on our current knowledge on slow and quick anion channels in plants and analyze the molecular evolution of ALMT/QUAC-like and SLAC-like channels. We discovered fingerprints that allow screening databases for these channel types and were able to identify 192 (177 non-redundant) SLAC-like and 422 (402 non-redundant) ALMT/QUAC-like proteins in the fully sequenced genomes of 32 plant species. Phylogenetic analyses provided new insights into the molecular evolution of these channel types. We also combined sequence alignment and clustering with predictions of protein features, leading to the identification of known conserved phosphorylation sites in SLAC1-like channels along with potential sites that have not been yet experimentally confirmed. Using a similar strategy to analyze the hydropathicity of ALMT/QUAC-like channels, we propose a modified topology with additional transmembrane regions that integrates structure and function of these membrane proteins. Our results suggest that cross-referencing phylogenetic analyses with position-specific protein properties and functional data could be a very powerful tool for genome research approaches in general. PMID:23226151
Integration of Temporal and Ordinal Information During Serial Interception Sequence Learning
Gobel, Eric W.; Sanchez, Daniel J.; Reber, Paul J.
2011-01-01
The expression of expert motor skills typically involves learning to perform a precisely timed sequence of movements (e.g., language production, music performance, athletic skills). Research examining incidental sequence learning has previously relied on a perceptually-cued task that gives participants exposure to repeating motor sequences but does not require timing of responses for accuracy. Using a novel perceptual-motor sequence learning task, learning a precisely timed cued sequence of motor actions is shown to occur without explicit instruction. Participants learned a repeating sequence through practice and showed sequence-specific knowledge via a performance decrement when switched to an unfamiliar sequence. In a second experiment, the integration of representation of action order and timing sequence knowledge was examined. When either action order or timing sequence information was selectively disrupted, performance was reduced to levels similar to completely novel sequences. Unlike prior sequence-learning research that has found timing information to be secondary to learning action sequences, when the task demands require accurate action and timing information, an integrated representation of these types of information is acquired. These results provide the first evidence for incidental learning of fully integrated action and timing sequence information in the absence of an independent representation of action order, and suggest that this integrative mechanism may play a material role in the acquisition of complex motor skills. PMID:21417511
Ribosomal RNA Genes Contribute to the Formation of Pseudogenes and Junk DNA in the Human Genome.
Robicheau, Brent M; Susko, Edward; Harrigan, Amye M; Snyder, Marlene
2017-02-01
Approximately 35% of the human genome can be identified as sequence devoid of a selected-effect function, and not derived from transposable elements or repeated sequences. We provide evidence supporting a known origin for a fraction of this sequence. We show that: 1) highly degraded, but near full length, ribosomal DNA (rDNA) units, including both 45S and Intergenic Spacer (IGS), can be found at multiple sites in the human genome on chromosomes without rDNA arrays, 2) these rDNA sequences have a propensity for being centromere proximal, and 3) sequence at all human functional rDNA array ends is divergent from canonical rDNA to the point that it is pseudogenic. We also show that small sequence strings of rDNA (from 45S + IGS) can be found distributed throughout the genome and are identifiable as an "rDNA-like signal", representing 0.26% of the q-arm of HSA21 and ∼2% of the total sequence of other regions tested. The sizes of sequence strings found in the rDNA-like signal intergrade into the sizes of sequence strings that make up the full-length degrading rDNA units found scattered throughout the genome. We conclude that the displaced and degrading rDNA sequences are likely of a similar origin but represent different stages in their evolution towards random sequence. Collectively, our data suggest that over vast evolutionary time, rDNA arrays contribute to the production of junk DNA. The concept that the production of rDNA pseudogenes is a by-product of concerted evolution represents a previously under-appreciated process; we demonstrate here its importance. © The Author(s) 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Barry, Elizabeth G; Witherspoon, David J; Lampe, David J
2004-02-01
Transposons of the mariner family are widespread in animal genomes and have apparently infected them by horizontal transfer. Most species carry only old defective copies of particular mariner transposons that have diverged greatly from their active horizontally transferred ancestor, while a few contain young, very similar, and active copies. We report here the use of a whole-genome screen in bacteria to isolate somewhat diverged Famar1 copies from the European earwig, Forficula auricularia, that encode functional transposases. Functional and nonfunctional coding sequences of Famar1 and nonfunctional copies of Ammar1 from the European honey bee, Apis mellifera, were sequenced to examine their molecular evolution. No selection for sequence conservation was detected in any clade of a tree derived from these sequences, not even on branches leading to functional copies. This agrees with the current model for mariner transposon evolution that expects neutral evolution within particular hosts, with selection for function occurring only upon horizontal transfer to a new host. Our results further suggest that mariners are not finely tuned genetic entities and that a greater amount of sequence diversification than had previously been appreciated can occur in functional copies in a single host lineage. Finally, this method of isolating active copies can be used to isolate other novel active transposons without resorting to reconstruction of ancestral sequences.
Biophysics of protein evolution and evolutionary protein biophysics
Sikosek, Tobias; Chan, Hue Sun
2014-01-01
The study of molecular evolution at the level of protein-coding genes often entails comparing large datasets of sequences to infer their evolutionary relationships. Despite the importance of a protein's structure and conformational dynamics to its function and thus its fitness, common phylogenetic methods embody minimal biophysical knowledge of proteins. To underscore the biophysical constraints on natural selection, we survey effects of protein mutations, highlighting the physical basis for marginal stability of natural globular proteins and how requirement for kinetic stability and avoidance of misfolding and misinteractions might have affected protein evolution. The biophysical underpinnings of these effects have been addressed by models with an explicit coarse-grained spatial representation of the polypeptide chain. Sequence–structure mappings based on such models are powerful conceptual tools that rationalize mutational robustness, evolvability, epistasis, promiscuous function performed by ‘hidden’ conformational states, resolution of adaptive conflicts and conformational switches in the evolution from one protein fold to another. Recently, protein biophysics has been applied to derive more accurate evolutionary accounts of sequence data. Methods have also been developed to exploit sequence-based evolutionary information to predict biophysical behaviours of proteins. The success of these approaches demonstrates a deep synergy between the fields of protein biophysics and protein evolution. PMID:25165599
Sequence Memory Constraints Give Rise to Language-Like Structure through Iterated Learning
Cornish, Hannah; Dale, Rick; Kirby, Simon; Christiansen, Morten H.
2017-01-01
Human language is composed of sequences of reusable elements. The origin of the sequential structure of language is a hotly debated topic in evolutionary linguistics. In this paper, we show that sets of sequences with language-like statistical properties can emerge from a process of cultural evolution under pressure from chunk-based memory constraints. We employ a novel experimental task that is non-linguistic and non-communicative in nature, in which participants are trained on and later asked to recall a set of sequences one-by-one. Recalled sequences from one participant become training data for the next participant. In this way, we simulate cultural evolution in the laboratory. Our results show a cumulative increase in structure, and by comparing this structure to data from existing linguistic corpora, we demonstrate a close parallel between the sets of sequences that emerge in our experiment and those seen in natural language. PMID:28118370
NASA Technical Reports Server (NTRS)
Nakayama, S.; Kretsinger, R. H.
1993-01-01
In the first report in this series we presented dendrograms based on 152 individual proteins of the EF-hand family. In the second we used sequences from 228 proteins, containing 835 domains, and showed that eight of the 29 subfamilies are congruent and that the EF-hand domains of the remaining 21 subfamilies have diverse evolutionary histories. In this study we have computed dendrograms within and among the EF-hand subfamilies using the encoding DNA sequences. In most instances the dendrograms based on protein and on DNA sequences are very similar. Significant differences between protein and DNA trees for calmodulin remain unexplained. In our fourth report we evaluate the sequences and the distribution of introns within the EF-hand family and conclude that exon shuffling did not play a significant role in its evolution.
Chan, Yvonne H.; Venev, Sergey V.; Zeldovich, Konstantin B.; Matthews, C. Robert
2017-01-01
Sequence divergence of orthologous proteins enables adaptation to environmental stresses and promotes evolution of novel functions. Limits on evolution imposed by constraints on sequence and structure were explored using a model TIM barrel protein, indole-3-glycerol phosphate synthase (IGPS). Fitness effects of point mutations in three phylogenetically divergent IGPS proteins during adaptation to temperature stress were probed by auxotrophic complementation of yeast with prokaryotic, thermophilic IGPS. Analysis of beneficial mutations pointed to an unexpected, long-range allosteric pathway towards the active site of the protein. Significant correlations between the fitness landscapes of distant orthologues implicate both sequence and structure as primary forces in defining the TIM barrel fitness landscape and suggest that fitness landscapes can be translocated in sequence space. Exploration of fitness landscapes in the context of a protein fold provides a strategy for elucidating the sequence-structure-fitness relationships in other common motifs. PMID:28262665
NASA Technical Reports Server (NTRS)
Dayhoff, M. O.
1971-01-01
The amino acid sequences of proteins from living organisms are dealt with. The structure of proteins is first discussed; the variation in this structure from one biological group to another is illustrated by the first halves of the sequences of cytochrome c, and a phylogenetic tree is derived from the cytochrome c data. The relative geological times associated with the events of this tree are discussed. Errors which occur in the duplication of cells during the evolutionary process are examined. Particular attention is given to evolution of mutant proteins, globins, ferredoxin, and transfer ribonucleic acids (tRNA's). Finally, a general outline of biological evolution is presented.
New perspectives on bacterial ferredoxin evolution
NASA Technical Reports Server (NTRS)
George, D. G.; Hunt, L. T.; Yeh, L.-S. L.; Barker, W. C.
1985-01-01
Ferredoxins are low-molecular-weight, nonheme, iron proteins which function as electron carriers in a wide variety of electron transport chains. Howard et al. (1983) have suggested that the amino end of Azotobacter vinelandii ferredoxin shows a greater similarity to the carboxyl end of ferredoxin from Chromatium vinosum and that their half-chain sequences are homologous when the half-chains of either species are considered in inverse order. Examination of this proposition has made it necessary to reevaluate previous conclusions concerning the evolution of bacterial ferredoxin. Attention is given to the properties of the bacterial ferredoxin sequences, and the evolution of the bacterial ferredoxins.
Yasukochi, Yoshiki; Satta, Yoko
2015-03-25
The human cytochrome P450 (CYP) 2D6 gene is a member of the CYP2D gene subfamily, along with the CYP2D7P and CYP2D8P pseudogenes. Although the CYP2D6 enzyme has been studied extensively because of its clinical importance, the evolution of the CYP2D subfamily has not yet been fully understood. Therefore, the goal of this study was to reveal the evolutionary process of the human drug metabolic system. Here, we investigate molecular evolution of the CYP2D subfamily in primates by comparing 14 CYP2D sequences from humans to New World monkey genomes. Window analysis and statistical tests revealed that entire genomic sequences of paralogous genes were extensively homogenized by gene conversion during molecular evolution of CYP2D genes in primates. A neighbor-joining tree based on genomic sequences at the nonsubstrate recognition sites showed that CYP2D6 and CYP2D8 genes were clustered together due to gene conversion. In contrast, a phylogenetic tree using amino acid sequences at substrate recognition sites did not cluster the CYP2D6 and CYP2D8 genes, suggesting that the functional constraint on substrate specificity is one of the causes for purifying selection at the substrate recognition sites. Our results suggest that the CYP2D gene subfamily in primates has evolved to maintain the regioselectivity for a substrate hydroxylation activity between individual enzymes, even though extensive gene conversion has occurred across CYP2D coding sequences. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Evolution of the vertebrate insulin receptor substrate (Irs) gene family.
Al-Salam, Ahmad; Irwin, David M
2017-06-23
Insulin receptor substrate (Irs) proteins are essential for insulin signaling as they allow downstream effectors to dock with, and be activated by, the insulin receptor. A family of four Irs proteins has been identified in mice; however, the gene for one of these, IRS3, has been pseudogenized in humans. While it is known that the Irs gene family originated in vertebrates, it is not known when it originated and which members are most closely related to each other. A better understanding of the evolution of Irs genes and proteins should provide insight into the regulation of metabolism by insulin. Multiple genes for Irs proteins were identified in a wide variety of vertebrate species. Phylogenetic and genomic neighborhood analyses indicate that this gene family originated very early in vertebrate evolution. Most Irs genes were duplicated and retained in fish after the fish-specific genome duplication. Irs genes have been lost in various lineages, including Irs3 in primates and birds and Irs1 in most fish. Irs3 and Irs4 experienced an episode of more rapid protein sequence evolution on the ancestral mammalian lineage. Comparisons of the conservation of the protein sequences among Irs paralogs show that domains involved in binding to the plasma membrane and insulin receptors are most strongly conserved, while divergence has occurred in sequences involved in interacting with downstream effector proteins. The Irs gene family originated very early in vertebrate evolution, likely through genome duplications, and in parallel with duplications of other components of the insulin signaling pathway, including insulin and the insulin receptor. While the N-terminal sequences of these proteins are conserved among the paralogs, changes in the C-terminal sequences likely allowed changes in biological function.
Melters, Daniël P; Bradnam, Keith R; Young, Hugh A; Telis, Natalie; May, Michael R; Ruby, J Graham; Sebra, Robert; Peluso, Paul; Eid, John; Rank, David; Garcia, José Fernando; DeRisi, Joseph L; Smith, Timothy; Tobias, Christian; Ross-Ibarra, Jeffrey; Korf, Ian; Chan, Simon W L
2013-01-30
Centromeres are essential for chromosome segregation, yet their DNA sequences evolve rapidly. In most animals and plants that have been studied, centromeres contain megabase-scale arrays of tandem repeats. Despite their importance, very little is known about the degree to which centromere tandem repeats share common properties between different species across different phyla. We used bioinformatic methods to identify high-copy tandem repeats from 282 species using publicly available genomic sequence and our own data. Our methods are compatible with all current sequencing technologies. Long Pacific Biosciences sequence reads allowed us to find tandem repeat monomers up to 1,419 bp. We assumed that the most abundant tandem repeat is the centromere DNA, which was true for most species whose centromeres have been previously characterized, suggesting this is a general property of genomes. High-copy centromere tandem repeats were found in almost all animal and plant genomes, but repeat monomers were highly variable in sequence composition and length. Furthermore, phylogenetic analysis of sequence homology showed little evidence of sequence conservation beyond approximately 50 million years of divergence. We find that despite an overall lack of sequence conservation, centromere tandem repeats from diverse species showed similar modes of evolution. While centromere position in most eukaryotes is epigenetically determined, our results indicate that tandem repeats are highly prevalent at centromeres of both animal and plant genomes. This suggests a functional role for such repeats, perhaps in promoting concerted evolution of centromere DNA across chromosomes. PMID:23363705
Positive Selection Underlies Faster-Z Evolution of Gene Expression in Birds.
Dean, Rebecca; Harrison, Peter W; Wright, Alison E; Zimmer, Fabian; Mank, Judith E
2015-10-01
The elevated rate of evolution for genes on sex chromosomes compared with autosomes (Fast-X or Fast-Z evolution) can result either from positive selection in the heterogametic sex or from nonadaptive consequences of reduced relative effective population size. Recent work in birds suggests that Fast-Z of coding sequence is primarily due to relaxed purifying selection resulting from reduced relative effective population size. However, gene sequence and gene expression are often subject to distinct evolutionary pressures; therefore, we tested for Fast-Z in gene expression using next-generation RNA-sequencing data from multiple avian species. Similar to studies of Fast-Z in coding sequence, we recover clear signatures of Fast-Z in gene expression; however, in contrast to coding sequence, our data indicate that Fast-Z in expression is due to positive selection acting primarily in females. In the soma, where gene expression is highly correlated between the sexes, we detected Fast-Z in both sexes, although at a higher rate in females, suggesting that many positively selected expression changes in females are also expressed in males. In the gonad, where intersexual correlations in expression are much lower, we detected Fast-Z for female gene expression, but crucially, not males. This suggests that a large amount of expression variation is sex-specific in its effects within the gonad. Taken together, our results indicate that Fast-Z evolution of gene expression is the product of positive selection acting on recessive beneficial alleles in the heterogametic sex. More broadly, our analysis suggests that the adaptive potential of Z chromosome gene expression may be much greater than that of gene sequence, results which have important implications for the role of sex chromosomes in speciation and sexual selection. © The Author 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Marean, Curtis W.
2016-01-01
Scientists have identified a series of milestones in the evolution of the human food quest that are anticipated to have had far-reaching impacts on biological, behavioural and cultural evolution: the inclusion of substantial portions of meat, the broad spectrum revolution and the transition to food production. The foraging shift to dense and predictable resources is another key milestone that had consequential impacts on the later part of human evolution. The theory of economic defendability predicts that this shift had an important consequence—elevated levels of intergroup territoriality and conflict. In this paper, this theory is integrated with a well-established general theory of hunter–gatherer adaptations and is used to make predictions for the sequence of appearance of several evolved traits of modern humans. The distribution of dense and predictable resources in Africa is reviewed and found to occur only in aquatic contexts (coasts, rivers and lakes). The palaeoanthropological empirical record contains recurrent evidence for a shift to the exploitation of dense and predictable resources by 110 000 years ago, and the first known occurrence is in a marine coastal context in South Africa. Some theory predicts that this elevated conflict would have provided the conditions for selection for the hyperprosocial behaviours unique to modern humans. This article is part of the themed issue ‘Major transitions in human evolution’. PMID:27298470
Bastolla, Ugo
2014-01-01
The properties of biomolecules depend both on physics and on the evolutionary process that formed them. These two points of view produce a powerful synergism. Physics sets the stage and the constraints that molecular evolution has to obey, and evolutionary theory helps in rationalizing the physical properties of biomolecules, including protein folding thermodynamics. To complete the parallelism, protein thermodynamics is founded on statistical mechanics in the space of protein structures, and molecular evolution can be viewed as statistical mechanics in the space of protein sequences. In this review, we will integrate both points of view, applying them to detecting selection on the stability of the folded state of proteins. We will start by discussing positive design, which strengthens the stability of the folded state against the unfolded state of proteins. Positive design justifies why statistical potentials for protein folding can be obtained from the frequencies of structural motifs. Stability against unfolding is easier to achieve for longer proteins. In contrast, negative design, which consists of destabilizing frequently formed misfolded conformations, is more difficult to achieve for longer proteins. The folding rate can be enhanced by strengthening short-range native interactions, but this requirement conflicts with negative design, and evolution has to trade off between them. Finally, selection can accelerate functional movements by favoring low-frequency normal modes of the dynamics of the native state that strongly correlate with the functional conformation change. PMID:24970217
Sites of Retroviral DNA Integration: From Basic Research to Clinical Applications
Serrao, Erik; Engelman, Alan N.
2016-01-01
One of the most crucial steps in the life cycle of a retrovirus is the integration of the viral DNA (vDNA) copy of the RNA genome into the genome of an infected host cell. Integration provides for efficient viral gene expression as well as for the segregation of the viral genomes to daughter cells upon cell division. Some integrated viruses are not well expressed, and cells latently infected with HIV-1 can resist the action of potent antiretroviral drugs and remain dormant for decades. Intensive research has been dedicated to understanding the catalytic mechanism of integration, as well as the viral and cellular determinants that influence integration site distribution throughout the host genome. In this review we summarize the evolution of techniques that have been used to recover and map retroviral integration sites, from the early days that first indicated that integration could occur in multiple cellular DNA locations, to current technologies that map upwards of millions of unique integration sites from single in vitro integration reactions or cell culture infections. We further review important insights gained from the use of such mapping techniques, including the monitoring of cell clonal expansion in patients treated with retrovirus-based gene therapy vectors, or AIDS patients on suppressive antiretroviral therapy (ART). These insights span from integrase (IN) enzyme sequence preferences within target DNA (tDNA) at the sites of integration, to the roles of host cellular proteins in mediating global integration distribution, to the potential relationship between genomic location of vDNA integration site and retroviral latency. PMID:26508664
The Genome Sequence of Taurine Cattle: A window to ruminant biology and evolution
Elsik, Christine G.; Tellam, Ross L.; Worley, Kim C.
2010-01-01
To understand the biology and evolution of ruminants, the cattle genome was sequenced to ∼7× coverage. The cattle genome contains a minimum of 22,000 genes, with a core set of 14,345 orthologs shared among seven mammalian species of which 1,217 are absent or undetected in non-eutherian (marsupial or monotreme) genomes. Cattle-specific evolutionary breakpoint regions in chromosomes have a higher density of segmental duplications, enrichment of repetitive elements, and species-specific variations in genes associated with lactation and immune responsiveness. Genes involved in metabolism are generally highly conserved, although five metabolic genes are deleted or extensively diverged from their human orthologs. The cattle genome sequence thus provides an enabling resource for understanding mammalian evolution and accelerating livestock genetic improvement for milk and meat production. PMID:19390049
The macroevolutionary consequences of phenotypic integration: from development to deep time.
Goswami, A; Smaers, J B; Soligo, C; Polly, P D
2014-08-19
Phenotypic integration is a pervasive characteristic of organisms. Numerous analyses have demonstrated that patterns of phenotypic integration are conserved across large clades, but that significant variation also exists. For example, heterochronic shifts related to different mammalian reproductive strategies are reflected in postcranial skeletal integration and in coordination of bone ossification. Phenotypic integration and modularity have been hypothesized to shape morphological evolution, and we extended simulations to confirm that trait integration can influence both the trajectory and magnitude of response to selection. We further demonstrate that phenotypic integration can produce both more and less disparate organisms than would be expected under random walk models by repartitioning variance in preferred directions. This effect can also be expected to favour homoplasy and convergent evolution. New empirical analyses of the carnivoran cranium show that rates of evolution, in contrast, are not strongly influenced by phenotypic integration and show little relationship to morphological disparity, suggesting that phenotypic integration may shape the direction of evolutionary change, but not necessarily the speed of it. Nonetheless, phenotypic integration is problematic for morphological clocks and should be incorporated more widely into models that seek to accurately reconstruct both trait and organismal evolution. PMID:25002699
Gupta, R S; Aitken, K; Falah, M; Singh, B
1994-01-01
The genes for two different 70-kDa heat shock protein (HSP70) homologs have been cloned and sequenced from the protozoan Giardia lamblia. On the basis of their sequence features, one of these genes corresponds to the cytoplasmic form of HSP70. The second gene, on the basis of its characteristic N-terminal hydrophobic signal sequence and C-terminal endoplasmic reticulum (ER) retention sequence (Lys-Asp-Glu-Leu), is the equivalent of ER-resident GRP78 or the Bip family of proteins. Phylogenetic trees based on HSP70 sequences show that G. lamblia homologs show the deepest divergence among eukaryotic species. The identification of a GRP78 or Bip homolog in G. lamblia strongly suggests the existence of ER in this ancient eukaryote. Detailed phylogenetic analyses of HSP70 sequences by boot-strap neighbor-joining and maximum-parsimony methods show that the cytoplasmic and ER homologs form distinct subfamilies that evolved from a common eukaryotic ancestor by gene duplication that occurred very early in the evolution of eukaryotic cells. It is postulated that because of the essential "molecular chaperone" function of these proteins in translocation of other proteins across membranes, duplication of their genes accompanied the evolution of ER or nucleus in the eukaryotic cell ancestor. The presence in all eukaryotic cytoplasmic HSP70 homologs (including the cognate, heat-induced, and ER forms) of a number of autapomorphic sequence signatures that are not present in any prokaryotic or organellar homologs provides strong evidence regarding the monophyletic nature of eukaryotic lineage. Further, all eukaryotic HSP70 homologs share in common with the Gram-negative group of eubacteria a number of sequence features that are not present in any archaebacterium or Gram-positive bacterium, indicating their evolution from this group of organisms. Some implications of these findings regarding the evolution of eukaryotic cells and ER are discussed. PMID:8159675
2011-01-01
Background Evolution of the Brassica species has been recursively affected by polyploidy events, and comparison to their relative, Arabidopsis thaliana, provides means to explore their genomic complexity. Results A genome-wide physical map of a rapid-cycling strain of B. oleracea was constructed by integrating high-information-content fingerprinting (HICF) of Bacterial Artificial Chromosome (BAC) clones with hybridization to sequence-tagged probes. Using 2907 contigs of two or more BACs, we performed several lines of comparative genomic analysis. Interspecific DNA synteny is much better preserved in euchromatin than heterochromatin, showing the qualitative difference in evolution of these respective genomic domains. About 67% of contigs can be aligned to the Arabidopsis genome, with 96.5% corresponding to euchromatic regions, and 3.5% (shown to contain repetitive sequences) to pericentromeric regions. Overgo probe hybridization data showed that contigs aligned to Arabidopsis euchromatin contain ~80% of low-copy-number genes, while genes with high copy number are much more frequently associated with pericentromeric regions. We identified 39 interchromosomal breakpoints during the diversification of B. oleracea and Arabidopsis thaliana, a relatively high level of genomic change since their divergence. Comparison of the B. oleracea physical map with Arabidopsis and other available eudicot genomes showed appreciable 'shadowing' produced by more ancient polyploidies, resulting in a web of relatedness among contigs which increased genomic complexity. Conclusions A high-resolution genetically-anchored physical map sheds light on Brassica genome organization and advances positional cloning of specific genes, and may help to validate genome sequence assembly and alignment to chromosomes. All the physical mapping data is freely shared at a WebFPC site (http://lulu.pgml.uga.edu/fpc/WebAGCoL/brassica/WebFPC/; temporarily password-protected: account: pgml; password: 123qwe123). PMID:21955929
Deshmukh, Rupesh K; Sonah, Humira; Bélanger, Richard R
2016-01-01
Aquaporins (AQPs) are channel-forming integral membrane proteins that facilitate the movement of water and many other small molecules. Compared to animals, plants contain a much higher number of AQPs in their genome. Homology-based identification of AQPs in sequenced species is feasible because of the high level of conservation of protein sequences across plant species. Genome-wide characterization of AQPs has highlighted several important aspects such as distribution, genetic organization, evolution and conserved features governing solute specificity. From a functional point of view, the understanding of AQP transport system has expanded rapidly with the help of transcriptomics and proteomics data. The efficient analysis of enormous amounts of data generated through omic scale studies has been facilitated through computational advancements. Prediction of protein tertiary structures, pore architecture, cavities, phosphorylation sites, heterodimerization, and co-expression networks has become more sophisticated and accurate with increasing computational tools and pipelines. However, the effectiveness of computational approaches is based on the understanding of physiological and biochemical properties, transport kinetics, solute specificity, molecular interactions, sequence variations, phylogeny and evolution of aquaporins. For this purpose, tools like Xenopus oocyte assays, yeast expression systems, artificial proteoliposomes, and lipid membranes have been efficiently exploited to study the many facets that influence solute transport by AQPs. In the present review, we discuss genome-wide identification of AQPs in plants in relation with recent advancements in analytical tools, and their availability and technological challenges as they apply to AQPs. An exhaustive review of omics resources available for AQP research is also provided in order to optimize their efficient utilization. 
Finally, a detailed catalog of computational tools and analytical pipelines is offered as a resource for AQP research. PMID:28066459
Archaebacterial rhodopsin sequences: Implications for evolution
NASA Technical Reports Server (NTRS)
Lanyi, J. K.
1991-01-01
It was proposed over 10 years ago that the archaebacteria represent a separate kingdom which diverged very early from the eubacteria and eukaryotes. It follows that investigations of archaebacterial characteristics might reveal features of early evolution. So far, two genes, one for bacteriorhodopsin and another for halorhodopsin, both from Halobacterium halobium, have been sequenced. We cloned and sequenced the gene coding for the polypeptide of another one of these rhodopsins, a halorhodopsin in Natronobacterium pharaonis. Peptide sequencing of cyanogen bromide fragments, and immuno-reactions of the protein and synthetic peptides derived from the C-terminal gene sequence, confirmed that the open reading frame was the structural gene for the pharaonis halorhodopsin polypeptide. The flanking DNA sequences of this gene, as well as those of other bacterial rhodopsins, were compared to previously proposed archaebacterial consensus sequences. In pairwise comparisons of the open reading frame with DNA sequences for bacterio-opsin and halo-opsin from Halobacterium halobium, silent divergences were calculated. These indicate very considerable evolutionary distance between each pair of genes, even in the same organism. In spite of this, the three protein sequences show extensive similarities, indicating strong selective pressures.
Exploring Connectivity in Sequence Space of Functional RNA
NASA Technical Reports Server (NTRS)
Wei, Chenyu; Pohorille, Andrzej; Popovic, Milena; Ditzler, Mark
2017-01-01
Emergence of replicable genetic molecules was one of the marking points in the origin of life, evolution of which can be conceptualized as a walk through the space of all possible sequences. A theoretical concept of fitness landscape helps to understand evolutionary processes through assigning a value of fitness to each genotype. Then, evolution of a phenotype is viewed as a series of consecutive, single-point mutations. Natural selection biases evolution toward peaks of high fitness and away from valleys of low fitness, whereas neutral drift occurs in the sequence space without direction as mutations are introduced at random. Large networks of neutral or near-neutral mutations on a fitness landscape, especially for sufficiently long genomes, are possible or even inevitable. Their detection in experiments, however, has been elusive. Although a few near-neutral evolutionary pathways have been found, recent experimental evidence indicates landscapes consist of largely isolated islands. The generality of these results, however, is not clear, as the genome length or the fraction of functional molecules in the genotypic space might have been insufficient for the emergence of large, neutral networks. Thorough investigation of the structure of the fitness landscape is essential to understand the mechanisms of evolution of early genomes. RNA molecules are commonly assumed to play the pivotal role in the origin of genetic systems. They are widely believed to be early, if not the earliest, genetic and catalytic molecules, with abundant biochemical activities as aptamers and ribozymes, i.e. RNA molecules capable, respectively, of binding small molecules or catalyzing chemical reactions. Here, we present results of our recent studies on the structure of the sequence space of RNA ligase ribozymes selected through in vitro evolution. Several hundred thousand sequences active to a different degree were obtained by way of deep sequencing. 
Analysis of these sequences revealed several large clusters defined such that every sequence in a cluster can be reached from any other sequence in the same cluster through a series of single point mutations. Sequences in a single cluster appear to adopt more than one secondary structure. The mechanism of refolding within a single cluster was examined. To shed light on possible evolutionary paths in the space of ribozymes, the connectivity between clusters was investigated. The effect of length of RNA molecules on the structure of the fitness landscape and possible evolutionary paths was examined by way of comparing functional sequences of 20 and 80 nucleobases in length. It was found that sequences of different lengths shared secondary structure motifs that were presumed responsible for catalytic activity, with increasing complexity and global structural rearrangements emerging in longer molecules.
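The cluster definition in the abstract above (every sequence reachable from any other through a series of single point mutations) corresponds to connected components of a graph whose edges join sequences at Hamming distance 1. A minimal sketch follows, assuming equal-length sequences and substitutions only (no insertions or deletions). The function name and the toy sequences are hypothetical, and this is not claimed to be the procedure the authors used.

```python
from collections import defaultdict


def single_mutation_clusters(seqs):
    """Group sequences into clusters connected by single point substitutions."""
    seqs = list(dict.fromkeys(seqs))  # drop duplicates, keep input order
    parent = {s: s for s in seqs}

    def find(s):
        # Union-find root lookup with path halving.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Two distinct sequences differ at exactly one position i if and only if
    # they are identical once position i is removed from both.
    masked = defaultdict(list)
    for s in seqs:
        for i in range(len(s)):
            masked[(i, s[:i] + s[i + 1:])].append(s)
    for group in masked.values():
        for other in group[1:]:
            union(group[0], other)

    clusters = defaultdict(list)
    for s in seqs:
        clusters[find(s)].append(s)
    # Largest clusters first; ties broken alphabetically.
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: (-len(c), c))


# Toy data (made up): AAAA-AAAC-AACC form a chain of single substitutions.
toy = ["AAAA", "AAAC", "AACC", "GGGG", "GGGU", "UUUU"]
clusters = single_mutation_clusters(toy)
```

The masking trick avoids comparing every pair of sequences, which matters when the input holds several hundred thousand reads: the work grows with the number of sequences times their length rather than with the square of the number of sequences.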
The orbital evolution of NEA 30825 1990 TG1
NASA Astrophysics Data System (ADS)
Timoshkova, E. I.
2008-02-01
The orbital evolution of the near-Earth asteroid (NEA) 30825 1990 TG1 has been studied by numerical integration of the equations of its motion over the 100 000-year time interval with allowance for perturbations from eight major planets and Pluto, and the variations in its osculating orbit over this time interval were determined. The numerical integrations were performed using two methods: the Bulirsch-Stoer method and the Everhart method. The comparative analysis of the two resulting orbital evolutions of motion is presented for the time interval examined. The evolution of the asteroid motion is qualitatively the same for both variants, but the rate of evolution of the orbital elements is different. Our research confirms the known fact that the application of different integrators to the study of the long-term evolution of the NEA orbit may lead to different evolution tracks.
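The closing point of the abstract above, that different integrators can produce different long-term evolution tracks, can be seen even in a toy two-body problem. The sketch below deliberately swaps in two much simpler schemes, explicit Euler and leapfrog (velocity Verlet), as stand-ins. It does not implement the Bulirsch-Stoer or Everhart methods used in the study, and the units (gravitational parameter 1, circular orbit of radius 1), step size, and function names are assumptions for illustration.

```python
import math


def accel(x, y, mu=1.0):
    """Acceleration of a test particle around a point mass at the origin."""
    r3 = (x * x + y * y) ** 1.5
    return -mu * x / r3, -mu * y / r3


def energy(state, mu=1.0):
    """Specific orbital energy; constant along the exact orbit."""
    x, y, vx, vy = state
    return 0.5 * (vx * vx + vy * vy) - mu / math.hypot(x, y)


def euler_step(state, dt):
    x, y, vx, vy = state
    ax, ay = accel(x, y)
    return (x + dt * vx, y + dt * vy, vx + dt * ax, vy + dt * ay)


def leapfrog_step(state, dt):
    x, y, vx, vy = state
    ax, ay = accel(x, y)
    vx, vy = vx + 0.5 * dt * ax, vy + 0.5 * dt * ay  # half kick
    x, y = x + dt * vx, y + dt * vy                  # drift
    ax, ay = accel(x, y)
    vx, vy = vx + 0.5 * dt * ax, vy + 0.5 * dt * ay  # half kick
    return (x, y, vx, vy)


def energy_errors(dt=0.01, n_steps=10_000):
    """Absolute energy error of each scheme after n_steps from the same start."""
    start = (1.0, 0.0, 0.0, 1.0)  # circular orbit with period 2*pi
    e0 = energy(start)
    errors = {}
    for name, step in (("euler", euler_step), ("leapfrog", leapfrog_step)):
        state = start
        for _ in range(n_steps):
            state = step(state, dt)
        errors[name] = abs(energy(state) - e0)
    return errors


errors = energy_errors()
print(errors)
```

Both runs start from identical initial conditions, yet the Euler track steadily gains energy and spirals outward while the leapfrog error stays bounded. The same kind of method-dependent divergence, in far subtler form, is what the comparison of high-order integrators over 100 000 years probes.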
Jurka, Jerzy W.
1997-01-01
Enhanced homologous recombination is obtained by employing a consensus sequence which has been found to be associated with integration of repeat sequences, such as Alu and ID. The consensus sequence or sequence having a single transition mutation determines one site of a double break which allows for high efficiency of integration at the site. By introducing single or double stranded DNA having the consensus sequence flanking region joined to a sequence of interest, one can reproducibly direct integration of the sequence of interest at one or a limited number of sites. In this way, specific sites can be identified and homologous recombination achieved at the site by employing a second flanking sequence associated with a sequence proximal to the 3'-nick.
Dynamical evolution of motion perception.
Kanai, Ryota; Sheth, Bhavin R; Shimojo, Shinsuke
2007-03-01
Motion is defined as a sequence of positional changes over time. However, in perception, spatial position and motion dynamically interact with each other. This reciprocal interaction suggests that the perception of a moving object itself may dynamically evolve following the onset of motion. Here, we show evidence that the percept of a moving object systematically changes over time. In experiments, we introduced a transient gap in the motion sequence or a brief change in some feature (e.g., color or shape) of an otherwise smoothly moving target stimulus. Observers were highly sensitive to the gap or transient change if it occurred soon after motion onset (≤200 ms), but significantly less so if it occurred later (≥300 ms). Our findings suggest that the moving stimulus is initially perceived as a time series of discrete, potentially isolatable frames; later failures to perceive change suggest that over time, the stimulus begins to be perceived as a single, indivisible gestalt integrated over space as well as time, which could well be the signature of an emergent stable motion percept.
Horizontal gene transfer of chromosomal Type II toxin-antitoxin systems of Escherichia coli.
Ramisetty, Bhaskar Chandra Mohan; Santhosh, Ramachandran Sarojini
2016-02-01
Type II toxin-antitoxin systems (TAs) are small autoregulated bicistronic operons that encode a toxin protein with the potential to inhibit metabolic processes and an antitoxin protein to neutralize the toxin. Most of the bacterial genomes encode multiple TAs. However, the diversity and accumulation of TAs on bacterial genomes and its physiological implications are highly debated. Here we provide evidence that Escherichia coli chromosomal TAs (encoding RNase toxins) are 'acquired' DNA likely originated from heterologous DNA and are the smallest known autoregulated operons with the potential for horizontal propagation. Sequence analyses revealed that integration of TAs into the bacterial genome is unique and contributes to variations in the coding and/or regulatory regions of flanking host genome sequences. Plasmids and genomes encoding identical TAs of natural isolates are mutually exclusive. Chromosomal TAs might play significant roles in the evolution and ecology of bacteria by contributing to host genome variation and by moderation of plasmid maintenance.
Biological data sciences in genome research
Schatz, Michael C.
2015-01-01
The last 20 years have been a remarkable era for biology and medicine. One of the most significant achievements has been the sequencing of the first human genomes, which has laid the foundation for profound insights into human genetics, the intricacies of regulation and development, and the forces of evolution. Incredibly, as we look into the future over the next 20 years, we see the very real potential for sequencing more than 1 billion genomes, bringing even deeper insight into human genetics as well as the genetics of millions of other species on the planet. Realizing this great potential for medicine and biology, though, will only be achieved through the integration and development of highly scalable computational and quantitative approaches that can keep pace with the rapid improvements to biotechnology. In this perspective, I aim to chart out these future technologies, anticipate the major themes of research, and call out the challenges ahead. One of the largest shifts will be in the training used to prepare the class of 2035 for their highly interdisciplinary world. PMID:26430150
Genome sequence diversity and clues to the evolution of variola (smallpox) virus.
Esposito, Joseph J; Sammons, Scott A; Frace, A Michael; Osborne, John D; Olsen-Rasmussen, Melissa; Zhang, Ming; Govil, Dhwani; Damon, Inger K; Kline, Richard; Laker, Miriam; Li, Yu; Smith, Geoffrey L; Meyer, Hermann; Leduc, James W; Wohlhueter, Robert M
2006-08-11
Comparative genomics of 45 epidemiologically varied variola virus isolates from the past 30 years of the smallpox era indicate low sequence diversity, suggesting that there is probably little difference in the isolates' functional gene content. Phylogenetic clustering inferred three clades coincident with their geographical origin and case-fatality rate; the latter implicated putative proteins that mediate viral virulence differences. Analysis of the viral linear DNA genome suggests that its evolution involved direct descent and DNA end-region recombination events. Knowing the sequences will help understand the viral proteome and improve diagnostic test precision, therapeutics, and systems for their assessment.
Genomic signatures of diet-related shifts during human origins
Babbitt, Courtney C.; Warner, Lisa R.; Fedrigo, Olivier; Wall, Christine E.; Wray, Gregory A.
2011-01-01
There are numerous anthropological analyses concerning the importance of diet during human evolution. Diet is thought to have had a profound influence on the human phenotype, and dietary differences have been hypothesized to contribute to the dramatic morphological changes seen in modern humans as compared with non-human primates. Here, we attempt to integrate the results of new genomic studies within this well-developed anthropological context. We then review the current evidence for adaptation related to diet, both at the level of sequence changes and gene expression. Finally, we propose some ways in which new technologies can help identify specific genomic adaptations that have resulted in metabolic and morphological differences between humans and non-human primates. PMID:21177690
Bacterial Actins? An Evolutionary Perspective
NASA Technical Reports Server (NTRS)
Doolittle, Russell F.; York, Amanda L.
2003-01-01
According to the conventional wisdom, the existence of a cytoskeleton in eukaryotes and its absence in prokaryotes constitute a fundamental divide between the two domains of life. An integral part of the dogma is that a cytoskeleton enabled an early eukaryote to feed upon prokaryotes, a consequence of which was the occasional endosymbiosis and the eventual evolution of organelles. Two recent papers present compelling evidence that actin, one of the principal components of a cytoskeleton, has a homolog in Bacteria that behaves in many ways like eukaryotic actin. Sequence comparisons reveal that eukaryotic actin and the bacterial homolog (mreB protein), unlike many other proteins common to eukaryotes and Bacteria, have very different and more highly extended evolutionary histories.
AMPLIFICATION OF RIBOSOMAL RNA SEQUENCES
This book chapter offers an overview of the use of ribosomal RNA sequences. A history of the technology traces the evolution of techniques to measure bacterial phylogenetic relationships and recent advances in obtaining rRNA sequence information. The manual also describes procedu...
A transposase strategy for creating libraries of circularly permuted proteins.
Mehta, Manan M; Liu, Shirley; Silberg, Jonathan J
2012-05-01
A simple approach for creating libraries of circularly permuted proteins is described that is called PERMutation Using Transposase Engineering (PERMUTE). In PERMUTE, the transposase MuA is used to randomly insert a minitransposon that can function as a protein expression vector into a plasmid that contains the open reading frame (ORF) being permuted. A library of vectors that express different permuted variants of the ORF-encoded protein is created by: (i) using bacteria to select for target vectors that acquire an integrated minitransposon; (ii) excising the ensemble of ORFs that contain an integrated minitransposon from the selected vectors; and (iii) circularizing the ensemble of ORFs containing integrated minitransposons using intramolecular ligation. Construction of a Thermotoga neapolitana adenylate kinase (AK) library using PERMUTE revealed that this approach produces vectors that express circularly permuted proteins with distinct sequence diversity from existing methods. In addition, selection of this library for variants that complement the growth of Escherichia coli with a temperature-sensitive AK identified functional proteins with novel architectures, suggesting that PERMUTE will be useful for the directed evolution of proteins with new functions.
A transposase strategy for creating libraries of circularly permuted proteins
Mehta, Manan M.; Liu, Shirley; Silberg, Jonathan J.
2012-01-01
A simple approach for creating libraries of circularly permuted proteins is described that is called PERMutation Using Transposase Engineering (PERMUTE). In PERMUTE, the transposase MuA is used to randomly insert a minitransposon that can function as a protein expression vector into a plasmid that contains the open reading frame (ORF) being permuted. A library of vectors that express different permuted variants of the ORF-encoded protein is created by: (i) using bacteria to select for target vectors that acquire an integrated minitransposon; (ii) excising the ensemble of ORFs that contain an integrated minitransposon from the selected vectors; and (iii) circularizing the ensemble of ORFs containing integrated minitransposons using intramolecular ligation. Construction of a Thermotoga neapolitana adenylate kinase (AK) library using PERMUTE revealed that this approach produces vectors that express circularly permuted proteins with distinct sequence diversity from existing methods. In addition, selection of this library for variants that complement the growth of Escherichia coli with a temperature-sensitive AK identified functional proteins with novel architectures, suggesting that PERMUTE will be useful for the directed evolution of proteins with new functions. PMID:22319214
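The end product of the PERMUTE steps above, a library of circularly permuted variants of one ORF-encoded protein, can be illustrated in sequence space with a minimal Python sketch. The function name `circular_permutants` and the toy sequence are hypothetical, and the sketch assumes an idealized permutation in which the original termini are joined directly: it does not model the minitransposon, any linker or extra residues that the real vectors may introduce, or the ligation chemistry.

```python
from typing import List


def circular_permutants(protein: str) -> List[str]:
    """Return all circular permutants of a protein sequence.

    A circular permutant starts at residue i of the parent sequence,
    runs to the original C-terminus, then continues from the original
    N-terminus up to residue i - 1. The unpermuted parent (i = 0) is
    excluded from the result.
    """
    return [protein[i:] + protein[:i] for i in range(1, len(protein))]


if __name__ == "__main__":
    # Toy four-residue sequence, purely illustrative.
    for variant in circular_permutants("MKVA"):
        print(variant)
```

A protein of length n has n - 1 such permutants besides the parent, which gives a rough upper bound on the sequence diversity a permutation library of this idealized kind could contain.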
SENCA: A Multilayered Codon Model to Study the Origins and Dynamics of Codon Usage
Pouyet, Fanny; Bailly-Bechet, Marc; Mouchiroud, Dominique; Guéguen, Laurent
2016-01-01
Gene sequences are the target of evolution operating at different levels, including the nucleotide, codon, and amino acid levels. Disentangling the impact of those different levels on gene sequences requires developing a probabilistic model with three layers. Here we present SENCA (site evolution of nucleotides, codons, and amino acids), a codon substitution model that separately describes 1) nucleotide processes which apply on all sites of a sequence such as the mutational bias, 2) preferences between synonymous codons, and 3) preferences among amino acids. We argue that most synonymous substitutions are not neutral and that SENCA provides more accurate estimates of selection compared with more classical codon sequence models. We study the forces that drive the genomic content evolution, intraspecifically in the core genome of 21 prokaryotes and interspecifically for five Enterobacteria. We retrieve the existence of a universal mutational bias toward AT, and that taking into account selection on synonymous codon usage has consequences on the measurement of selection on nonsynonymous substitutions. We also confirm that codon usage bias is mostly driven by selection on preferred codons. We propose new summary statistics to measure the relative importance of the different evolutionary processes acting on sequences. PMID:27401173
Germline transformation of the butterfly Bicyclus anynana.
Marcus, Jeffrey M; Ramos, Diane M; Monteiro, Antónia
2004-08-07
Ecological and evolutionary theory has frequently been inspired by the diversity of colour patterns on the wings of butterflies. More recently, these varied patterns have also become model systems for studying the evolution of developmental mechanisms. A technique that will facilitate our understanding of butterfly colour-pattern development is germline transformation. Germline transformation permits functional tests of candidate gene products and of cis-regulatory regions, and provides a means of generating new colour-pattern mutants by insertional mutagenesis. We report the successful transformation of the African satyrid butterfly Bicyclus anynana with two different transposable element vectors, Hermes and piggyBac, each carrying EGFP coding sequences driven by the 3XP3 synthetic enhancer that drives gene expression in the eyes. Candidate lines identified by screening for EGFP in adult eyes were later confirmed by PCR amplification of a fragment of the EGFP coding sequence from genomic DNA. Flanking DNA surrounding the insertions was amplified by inverse PCR and sequenced. Transformation rates were 5% for piggyBac and 10.2% for Hermes. Ultimately, the new data generated by these techniques may permit an integrated understanding of the developmental genetics of colour-pattern formation and of the ecological and evolutionary processes in which these patterns play a role.
Ochoa, David; García-Gutiérrez, Ponciano; Juan, David; Valencia, Alfonso; Pazos, Florencio
2013-01-27
A widespread family of methods for studying and predicting protein interactions using sequence information is based on co-evolution, quantified as similarity of phylogenetic trees. Part of the co-evolution observed between interacting proteins could be due to co-adaptation caused by inter-protein contacts. In this case, the co-evolution is expected to be more evident when evaluated on the surface of the proteins or the internal layers close to it. In this work we study the effect of incorporating information on predicted solvent accessibility to three methods for predicting protein interactions based on similarity of phylogenetic trees. We evaluate the performance of these methods in predicting different types of protein associations when trees based on positions with different characteristics of predicted accessibility are used as input. We found that predicted accessibility improves the results of two recent versions of the mirrortree methodology in predicting direct binary physical interactions, while it neither improves these methods, nor the original mirrortree method, in predicting other types of interactions. That improvement comes at no cost in terms of applicability since accessibility can be predicted for any sequence. We also found that predictions of protein-protein interactions are improved when multiple sequence alignments with a richer representation of sequences (including paralogs) are incorporated in the accessibility prediction.
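The idea of quantifying co-evolution as similarity of phylogenetic trees can be sketched as a correlation between the pairwise distance matrices of two protein families. The Python sketch below is a simplified illustration, not the authors' implementation: the function name `tree_similarity` is hypothetical, both matrices are assumed to be symmetric and to list the same species in the same order, and no accessibility filtering of alignment positions is performed.

```python
import math
from itertools import combinations
from typing import Sequence


def tree_similarity(dist_a: Sequence[Sequence[float]],
                    dist_b: Sequence[Sequence[float]]) -> float:
    """Pearson correlation between the upper triangles of two
    pairwise distance matrices defined over the same ordered species."""
    n = len(dist_a)
    pairs = list(combinations(range(n), 2))
    xs = [dist_a[i][j] for i, j in pairs]
    ys = [dist_b[i][j] for i, j in pairs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Toy distance matrices for three species (illustrative values only).
    family_a = [[0, 1, 2], [1, 0, 3], [2, 3, 0]]
    family_b = [[0, 2, 4], [2, 0, 6], [4, 6, 0]]
    print(tree_similarity(family_a, family_b))
```

In an accessibility-aware variant, the distance matrices would be computed only from alignment columns predicted to be solvent-exposed before being passed to a function like this one.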
Genomic analysis of expressed sequence tags in American black bear Ursus americanus
2010-01-01
Background: Species of the bear family (Ursidae) are important organisms for research in molecular evolution, comparative physiology and conservation biology, but relatively little genetic sequence information is available for this group. Here we report the development and analyses of the first large scale Expressed Sequence Tag (EST) resource for the American black bear (Ursus americanus). Results: Comprehensive analyses of molecular functions, alternative splicing, and tissue-specific expression of 38,757 black bear EST sequences were conducted using the dog genome as a reference. We identified 18 genes, involved in functions such as lipid catabolism, cell cycle, and vesicle-mediated transport, that are showing rapid evolution in the bear lineage. Three genes, Phospholamban (PLN), cysteine glycine-rich protein 3 (CSRP3) and Troponin I type 3 (TNNI3), are related to heart contraction, and defects in these genes in humans lead to heart disease. Two genes, biphenyl hydrolase-like (BPHL) and CSRP3, contain positively selected sites in bear. Global analysis of evolution rates of hibernation-related genes in bear showed that they are largely conserved and slowly evolving genes, rather than novel and fast-evolving genes. Conclusion: We provide a genomic resource for an important mammalian organism and our study sheds new light on the possible functions and evolution of bear genes. PMID:20338065
Genomic analysis of expressed sequence tags in American black bear Ursus americanus.
Zhao, Sen; Shao, Chunxuan; Goropashnaya, Anna V; Stewart, Nathan C; Xu, Yichi; Tøien, Øivind; Barnes, Brian M; Fedorov, Vadim B; Yan, Jun
2010-03-26
Species of the bear family (Ursidae) are important organisms for research in molecular evolution, comparative physiology and conservation biology, but relatively little genetic sequence information is available for this group. Here we report the development and analyses of the first large scale Expressed Sequence Tag (EST) resource for the American black bear (Ursus americanus). Comprehensive analyses of molecular functions, alternative splicing, and tissue-specific expression of 38,757 black bear EST sequences were conducted using the dog genome as a reference. We identified 18 genes, involved in functions such as lipid catabolism, cell cycle, and vesicle-mediated transport, that are showing rapid evolution in the bear lineage. Three genes, Phospholamban (PLN), cysteine glycine-rich protein 3 (CSRP3) and Troponin I type 3 (TNNI3), are related to heart contraction, and defects in these genes in humans lead to heart disease. Two genes, biphenyl hydrolase-like (BPHL) and CSRP3, contain positively selected sites in bear. Global analysis of evolution rates of hibernation-related genes in bear showed that they are largely conserved and slowly evolving genes, rather than novel and fast-evolving genes. We provide a genomic resource for an important mammalian organism and our study sheds new light on the possible functions and evolution of bear genes.
2011-01-01
Background: Vinyl chloride is a widespread groundwater pollutant and Group 1 carcinogen. A previous comparative genomic analysis revealed that the vinyl chloride reductase operon, vcrABC, of Dehalococcoides sp. strain VS is embedded in a horizontally-acquired genomic island that integrated at the single-copy tmRNA gene, ssrA. Results: We targeted conserved positions in available genomic islands to amplify and sequence four additional vcrABC-containing genomic islands from previously-unsequenced vinyl chloride respiring Dehalococcoides enrichments. We identified a total of 31 ssrA-specific genomic islands from Dehalococcoides genomic data, accounting for 47 reductive dehalogenase homologous genes and many other non-core genes. Sixteen of these genomic islands contain a syntenic module of integration-associated genes located adjacent to the predicted site of integration, and among these islands, eight contain vcrABC as genetic 'cargo'. These eight vcrABC-containing genomic islands are syntenic across their ~12 kbp length, but have two phylogenetically discordant segments that unambiguously differentiate the integration module from the vcrABC cargo. Using available Dehalococcoides phylogenomic data we estimate that these ssrA-specific genomic islands are at least as old as the Dehalococcoides group itself, which in turn is much older than human civilization. Conclusions: The vcrABC-containing genomic islands are a recently-acquired subset of a diverse collection of ssrA-specific mobile elements that are a major contributor to strain-level diversity in Dehalococcoides, and may have been throughout its evolution. The high similarity between vcrABC sequences is quantitatively consistent with recent horizontal acquisition driven by ~100 years of industrial pollution with chlorinated ethenes. PMID:21635780
Lin, Chao-Fen; Lo, Ta-Chun; Kuo, Yang-Cheng; Lin, Thy-Hou
2013-04-01
An integration vector capable of stably integrating and maintaining in the chromosomes of several lactobacilli over hundreds of generations has been constructed. The major integration machinery used is based on the ΦAT3 integrase (int) and attP sequences determined previously. A novel core sequence located at the 3' end of the tRNA(leu) gene is identified in Lactobacillus fermentum ATCC 14931 as the integration target by the integration vector though most of such sequences found in other lactobacilli are similar to that determined previously. Due to the lack of an appropriate attB site in Lactococcus lactis MG1363, the integration vector is found to be unable to integrate into the chromosome of the strain. However, such integration can be successfully restored by cotransforming the integration vector with a replicative one harboring both attB and erythromycin resistance sequences into the strain. Furthermore, the integration vector constructed carries a promoter region of placT from the chromosome of Lactobacillus rhamnosus TCELL-1 which is used to express green fluorescence and luminance protein genes in the lactobacilli studied.
Resurgence of Integrated Behavioral Units
Bachá-Méndez, Gustavo; Reid, Alliston K; Mendoza-Soylovna, Adela
2007-01-01
Two experiments with rats examined the dynamics of well-learned response sequences when reinforcement contingencies were changed. Both experiments contained four phases, each of which reinforced a 2-response sequence of lever presses until responding was stable. The contingencies then were shifted to a new reinforced sequence until responding was again stable. Extinction-induced resurgence of previously reinforced, and then extinguished, heterogeneous response sequences was observed in all subjects in both experiments. These sequences were demonstrated to be integrated behavioral units, controlled by processes acting at the level of the entire sequence. Response-level processes were also simultaneously operative. Errors in sequence production were strongly influenced by the terminal, not the initial, response in the currently reinforced sequence, but not by the previously reinforced sequence. These studies demonstrate that sequence-level and response-level processes can operate simultaneously in integrated behavioral units. Resurgence and the development of integrated behavioral units may be dissociated; thus the observation of one does not necessarily imply the other. PMID:17345948
Methods of geometrical integration in accelerator physics
NASA Astrophysics Data System (ADS)
Andrianov, S. N.
2016-12-01
In this paper we consider a method of geometric integration for the long-term evolution of a particle beam in cyclic accelerators, based on the matrix representation of the particle evolution operator. This method allows us to calculate the corresponding beam evolution in terms of two-dimensional matrices, including nonlinear effects. The geometric integration approach introduces into the appropriate computational algorithms the corrections necessary to preserve the qualitative properties of maps presented in the form of the truncated series generated by the evolution operator. This formalism extends to both polarized and intense beams. Examples of practical applications are described.
Pan, Keyao; Deem, Michael W.
2011-01-01
Many viruses evolve rapidly. For example, haemagglutinin (HA) of the H3N2 influenza A virus evolves to escape antibody binding. This evolution of the H3N2 virus means that people who have previously been exposed to an influenza strain may be infected by a newly emerged virus. In this paper, we use Shannon entropy and relative entropy to measure the diversity and selection pressure by an antibody in each amino acid site of H3 HA between the 1992–1993 season and the 2009–2010 season. Shannon entropy and relative entropy are two independent state variables that we use to characterize H3N2 evolution. The entropy method estimates future H3N2 evolution and migration using currently available H3 HA sequences. First, we show that the rate of evolution increases with the virus diversity in the current season. The Shannon entropy of the sequence in the current season predicts relative entropy between sequences in the current season and those in the next season. Second, a global migration pattern of H3N2 is assembled by comparing the relative entropy flows of sequences sampled in China, Japan, the USA and Europe. We verify this entropy method by describing two aspects of historical H3N2 evolution. First, we identify 54 amino acid sites in HA that have evolved in the past to evade the immune system. Second, the entropy method shows that epitopes A and B on the top of HA evolve most vigorously to escape antibody binding. Our work provides a novel entropy-based method to predict and quantify future H3N2 evolution and to describe the evolutionary history of H3N2. PMID:21543352
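The two per-site quantities used in this abstract, Shannon entropy and relative entropy, can be computed from the amino acid frequencies observed at one alignment column in two seasons. The Python sketch below is only an illustration of the standard definitions: the function names, the pseudocount smoothing, the use of natural logarithms, and the direction of the relative entropy (current season relative to next season) are assumptions, not details taken from the paper.

```python
import math
from collections import Counter
from typing import Dict, Iterable

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def site_frequencies(column: Iterable[str],
                     pseudocount: float = 1.0) -> Dict[str, float]:
    """Smoothed amino acid frequencies at one alignment column.

    The pseudocount (an assumption of this sketch) keeps every
    frequency positive so the relative entropy stays finite.
    Symbols outside the 20 standard amino acids are ignored.
    """
    counts = Counter(column)
    total = sum(counts[a] for a in AMINO_ACIDS) + pseudocount * len(AMINO_ACIDS)
    return {a: (counts[a] + pseudocount) / total for a in AMINO_ACIDS}


def shannon_entropy(p: Dict[str, float]) -> float:
    """Shannon entropy H(p) = -sum_a p(a) ln p(a), in nats."""
    return -sum(v * math.log(v) for v in p.values() if v > 0)


def relative_entropy(p: Dict[str, float], q: Dict[str, float]) -> float:
    """Relative entropy D(p || q) = sum_a p(a) ln(p(a) / q(a)), in nats."""
    return sum(p[a] * math.log(p[a] / q[a]) for a in p if p[a] > 0)


if __name__ == "__main__":
    # Hypothetical residues observed at one HA site in two seasons.
    current = site_frequencies("KKKKNNKK")
    following = site_frequencies("NNNNNKNN")
    print("diversity (current season):", shannon_entropy(current))
    print("relative entropy current vs next:", relative_entropy(current, following))
```

In this reading, a high Shannon entropy marks a site that is diverse within a season, while a high relative entropy marks a site whose residue distribution shifted between seasons.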
Modeling the expected lifetime and evolution of a deme's principal genetic sequence.
NASA Astrophysics Data System (ADS)
Clark, Brian
2014-03-01
The principal genetic sequence (PGS) is the most common genetic sequence in a deme. The PGS changes over time because new genetic sequences are created by inversions and compete with the current PGS, and a small fraction of them become PGSs. A set of coupled difference equations provides a description of the evolution of the PGS distribution function in an ensemble of demes. Solving the set of equations produces the survival probability of a new genetic sequence and the expected lifetime of an existing PGS as a function of inversion size and rate, recombination rate, and deme size. Additionally, the PGS distribution function is used to explain the transition pathway from old to new PGSs. We compare these results to a cellular-automaton-based representation of a deme and to the Drosophila species D. melanogaster and D. yakuba.
Detecting and Analyzing Genetic Recombination Using RDP4.
Martin, Darren P; Murrell, Ben; Khoosal, Arjun; Muhire, Brejnev
2017-01-01
Recombination between nucleotide sequences is a major process influencing the evolution of most species on Earth. The evolutionary value of recombination has been widely debated and so too has its influence on evolutionary analysis methods that assume nucleotide sequences replicate without recombining. When nucleic acids recombine, the evolution of the daughter or recombinant molecule cannot be accurately described by a single phylogeny. This simple fact can seriously undermine the accuracy of any phylogenetics-based analytical approach which assumes that the evolutionary history of a set of recombining sequences can be adequately described by a single phylogenetic tree. There are presently a large number of available methods and associated computer programs for analyzing and characterizing recombination in various classes of nucleotide sequence datasets. Here we examine the use of some of these methods to derive and test recombination hypotheses using multiple sequence alignments.
NASA Astrophysics Data System (ADS)
Schmidt, T.; Neuhäuser, R.; Seifahrt, A.
2010-10-01
About 15 substellar companions with large separations (>∼50 AU) to their young primary stars and brown dwarfs are confirmed by both common proper motion and late-M / early-L type spectra. The origin and early evolution of these objects are still under debate. While these substellar companions are often regarded as brown dwarfs, they could also be massive planets; the mass estimates are very uncertain so far. They are companions to primary stars or brown dwarfs in young associations and star forming regions like the TW Hya association, Upper Scorpius, Taurus, Beta Pic moving group, TucHor association, Lupus, Ophiuchus, and Chamaeleon, hence their ages and distances are well known, in contrast to free-floating brown dwarfs. An empirical classification is not possible, because no spectral sequence that takes the lower gravity into account exists. This problem leads to an apparent mismatch between spectra of old field type objects and young low-mass companions at the same effective temperature, hampering a determination of temperature and surface gravity independent from models. Now that about 15 such substellar candidates have been found in associations of different ages, 1 - 35 Myrs, it is possible to study their spectra in comparison to each other using the advantage of light concentration by an adaptive optics system with their primary as guide star. Therefore we have begun the construction of an empirical log g sequence by observing all these substellar companions homogeneously using the AO-assisted integral field spectrograph SINFONI at VLT (ESO).
Visualizing Clonal Evolution in Cancer.
Krzywinski, Martin
2016-06-02
Rapid and inexpensive single-cell sequencing is driving new visualizations of cancer instability and evolution. Krzywinski discusses how to present clone evolution plots in order to visualize temporal, phylogenetic, and spatial aspects of a tumor in a single static image.
Primate-specific evolution of an LDLR enhancer
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, Qian-Fei; Prabhakar, Shyam; Wang, Qianben
2005-12-01
Sequence changes in regulatory regions have often been invoked to explain phenotypic divergence among species, but molecular examples of this have been difficult to obtain. In this study we identified an anthropoid primate-specific sequence element that contributed to the regulatory evolution of the low-density lipoprotein receptor. Using a combination of close and distant species genomic sequence comparisons coupled with in vivo and in vitro studies, we found that a functional cholesterol-sensing sequence motif arose and was fixed within a pre-existing enhancer in the common ancestor of anthropoid primates. Our study demonstrates one molecular mechanism by which ancestral mammalian regulatory elements can evolve to perform new functions in the primate lineage leading to human.
Variation in promiscuity and sexual selection drives avian rate of Faster-Z evolution.
Wright, Alison E; Harrison, Peter W; Zimmer, Fabian; Montgomery, Stephen H; Pointer, Marie A; Mank, Judith E
2015-03-01
Higher rates of coding sequence evolution have been observed on the Z chromosome relative to the autosomes across a wide range of species. However, despite a considerable body of theory, we lack empirical evidence explaining variation in the strength of the Faster-Z Effect. To assess the magnitude and drivers of Faster-Z Evolution, we assembled six de novo transcriptomes, spanning 90 million years of avian evolution. Our analysis combines expression, sequence and polymorphism data with measures of sperm competition and promiscuity. In doing so, we present the first empirical evidence demonstrating the positive relationship between Faster-Z Effect and measures of promiscuity, and therefore variance in male mating success. Our results from multiple lines of evidence indicate that selection is less effective on the Z chromosome, particularly in promiscuous species, and that Faster-Z Evolution in birds is due primarily to genetic drift. Our results reveal the power of mating system and sexual selection in shaping broad patterns in genome evolution.
Afrache, Hassnae; Pontarotti, Pierre; Abi-Rached, Laurent; Olive, Daniel
2017-06-01
The butyrophilin 3 (BTN3) receptors are implicated in T lymphocyte regulation and show wide plasticity in mammals. In order to understand how these genes have diversified, we studied their evolution and show that the three human BTN3 genes are the result of two successive duplications in Primates and that the three genes are present in Hominoids and the Old World Monkey groups. A thorough phylogenetic analysis reveals a concerted evolution of BTN3 characterized by a strong and recurrent homogenization of the region encoding the signal peptide and the immunoglobulin variable (IgV) domain in Hominoids, where the sequences of BTN3A1 or BTN3A3 are replaced by the BTN3A2 sequence. In humans, the analysis of the diversity of these genes in 1683 individuals representing 26 worldwide populations shows that the three genes are polymorphic, with more than 46 alleles for each gene, and marked by extreme homogenization of the IgV sequences. The same analysis performed for the BTN2 genes also shows a concerted evolution; however, it is not as strong and recurrent as for BTN3. This study shows that BTN3 receptors are marked by extreme concerted evolution at the IgV domain and that BTN3A2 plays a central role in this evolution.
Jeon, Junhyun; Choi, Jaeyoung; Lee, Gir-Won; Dean, Ralph A; Lee, Yong-Hwan
2013-01-01
Knowledge on mutation processes is central to interpreting genetic analysis data as well as understanding the underlying nature of almost all evolutionary phenomena. However, studies on genome-wide mutational spectrum and dynamics in fungal pathogens are scarce, hindering our understanding of their evolution and biology. Here, we explored changes in the phenotypes and genome sequences of the rice blast fungus Magnaporthe oryzae during the forced in vitro evolution by weekly transfer of cultures on artificial media. Through combination of experimental evolution with high throughput sequencing technology, we found that mutations accumulate rapidly prior to visible phenotypic changes and that both genetic drift and selection seem to contribute to shaping mutational landscape, suggesting the buffering capacity of fungal genome against mutations. Inference of mutational effects on phenotypes through the use of T-DNA insertion mutants suggested that at least some of the DNA sequence mutations are likely associated with the observed phenotypic changes. Furthermore, our data suggest oxidative damages and UV as major sources of mutation during subcultures. Taken together, our work revealed important properties of original source of variation in the genome of the rice blast fungus. We believe that these results provide not only insights into stability of pathogenicity and genome evolution in plant pathogenic fungi but also a model in which evolution of fungal pathogens in natura can be comparatively investigated.
Positive Selection Underlies Faster-Z Evolution of Gene Expression in Birds
Dean, Rebecca; Harrison, Peter W.; Wright, Alison E.; Zimmer, Fabian; Mank, Judith E.
2015-01-01
The elevated rate of evolution for genes on sex chromosomes compared with autosomes (Fast-X or Fast-Z evolution) can result either from positive selection in the heterogametic sex or from nonadaptive consequences of reduced relative effective population size. Recent work in birds suggests that Fast-Z of coding sequence is primarily due to relaxed purifying selection resulting from reduced relative effective population size. However, gene sequence and gene expression are often subject to distinct evolutionary pressures; therefore, we tested for Fast-Z in gene expression using next-generation RNA-sequencing data from multiple avian species. Similar to studies of Fast-Z in coding sequence, we recover clear signatures of Fast-Z in gene expression; however, in contrast to coding sequence, our data indicate that Fast-Z in expression is due to positive selection acting primarily in females. In the soma, where gene expression is highly correlated between the sexes, we detected Fast-Z in both sexes, although at a higher rate in females, suggesting that many positively selected expression changes in females are also expressed in males. In the gonad, where intersexual correlations in expression are much lower, we detected Fast-Z for female gene expression, but crucially, not males. This suggests that a large amount of expression variation is sex-specific in its effects within the gonad. Taken together, our results indicate that Fast-Z evolution of gene expression is the product of positive selection acting on recessive beneficial alleles in the heterogametic sex. More broadly, our analysis suggests that the adaptive potential of Z chromosome gene expression may be much greater than that of gene sequence, results which have important implications for the role of sex chromosomes in speciation and sexual selection. PMID:26067773
The first genome sequences of human bocaviruses from Vietnam
Thanh, Tran Tan; Van, Hoang Minh Tu; Hong, Nguyen Thi Thu; Nhu, Le Nguyen Truc; Anh, Nguyen To; Tuan, Ha Manh; Hien, Ho Van; Tuong, Nguyen Manh; Kien, Trinh Trung; Khanh, Truong Huu; Nhan, Le Nguyen Thanh; Hung, Nguyen Thanh; Chau, Nguyen Van Vinh; Thwaites, Guy; van Doorn, H. Rogier; Tan, Le Van
2017-01-01
As part of an ongoing effort to generate complete genome sequences of hand, foot and mouth disease-causing enteroviruses directly from clinical specimens, two complete coding sequences and two partial genomic sequences of human bocavirus 1 (n=3) and 2 (n=1) were co-amplified and sequenced, representing the first genome sequences of human bocaviruses from Vietnam. The sequences may aid future study aiming at understanding the evolution of the virus. PMID:28090592
Multiplex sequencing of plant chloroplast genomes using Solexa sequencing-by-synthesis technology
Richard Cronn; Aaron Liston; Matthew Parks; David S. Gernandt; Rongkun Shen; Todd Mockler
2008-01-01
Organellar DNA sequences are widely used in evolutionary and population genetic studies; however, the conservative nature of chloroplast gene and genome evolution often limits phylogenetic resolution and statistical power. To gain maximal access to the historical record contained within chloroplast genomes, we have adapted multiplex sequencing-by-synthesis (MSBS) to...
Clonal evolution of acute myeloid leukemia highlighted by latest genome sequencing studies.
Zhang, Xuehong; Lv, Dekang; Zhang, Yu; Liu, Quentin; Li, Zhiguang
2016-09-06
Decades might be required for an initiated cell to become a full-fledged, metastasized tumor. DNA mutations accumulate during this process, including background mutations that emerge stochastically, as well as driver mutations that selectively occur in a handful of cancer genes and confer on the cell a growth advantage over its neighbors. A clone of tumor cells can be superseded by another clone that acquires new mutations and grows more aggressively. Tumor evolutionary patterns have been studied for years using conventional approaches that focus on the investigation of a single gene or a couple of genes. The latest deep sequencing technology enables a global view of tumor evolution by deciphering almost all genome aberrations in a tumor. Tumor clones and the fate of each clone during tumor evolution can be depicted with the help of the concept of variant allele frequency. Here, we summarize the new insights into cancer evolutionary progression in acute myeloid leukemia. Cancer evolution is currently thought to start from a clone that has accumulated the requisite somatically-acquired genetic aberrations through a series of increasingly disordered clinical and pathological phases, eventually leading to malignant transformation [1-3]. The observations that invasive colorectal cancer usually emerges from an antecedent benign adenomatous polyp and that cervical cancer proceeds through intraepithelial neoplasia support the idea of stepwise or linear cancerous progression [3-5]. Genetically, such progression is achieved by successive waves of clonal expansion during which cells acquire novel genomic alterations including single nucleotide variants (SNVs), small insertions and deletions (indels), and/or copy number variations (CNVs) [6].
The latest improvements in sequencing technology have allowed the deciphering of the whole exome or genome in different types of tumor and normal tissue pairs, providing a detailed catalogue of genome aberrations during tumor initiation and progression, which has been reviewed in several papers [7-10]. Here, we focus on the cancer clonal evolution pattern revealed by recent deep sequencing studies of samples from acute myeloid leukemia (AML) patients.
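The variant allele frequency mentioned in this abstract can be illustrated with a minimal Python sketch. The function names are hypothetical and are not taken from the paper. The conversion from VAF to a fraction of cells assumes a heterozygous mutation in a diploid, copy-number-neutral region of a pure tumor sample, which is a common simplification rather than the authors' stated method.

```python
def variant_allele_frequency(alt_reads, total_reads):
    """Fraction of sequencing reads at a site that carry the variant allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= alt_reads <= total_reads:
        raise ValueError("alt_reads must lie between 0 and total_reads")
    return alt_reads / total_reads


def estimated_cell_fraction(vaf):
    """Rough fraction of cells carrying the mutation.

    Assumption (not from the source): heterozygous mutation, diploid locus,
    no copy number change, no normal-cell contamination. Under these
    assumptions each mutated cell contributes one variant copy out of two.
    """
    return min(1.0, 2.0 * vaf)


if __name__ == "__main__":
    # Hypothetical read counts for one mutation at two time points.
    for label, alt, total in [("diagnosis", 45, 100), ("relapse", 10, 100)]:
        vaf = variant_allele_frequency(alt, total)
        print(label, vaf, estimated_cell_fraction(vaf))
```

Mutations with similar VAFs are often grouped as belonging to the same clone, and tracking those groups across time points is one way the rise and fall of clones can be depicted.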
Larracuente, Amanda M
2014-11-25
Satellite DNA can make up a substantial fraction of eukaryotic genomes and has roles in genome structure and chromosome segregation. The rapid evolution of satellite DNA can contribute to genomic instability and genetic incompatibilities between species. Despite its ubiquity and its contribution to genome evolution, we currently know little about the dynamics of satellite DNA evolution. The Responder (Rsp) satellite DNA family is found in the pericentric heterochromatin of chromosome 2 of Drosophila melanogaster. Rsp is well known for being the target of Segregation Distorter (SD), an autosomal meiotic drive system in D. melanogaster. I present an evolutionary genetic analysis of the Rsp family of repeats in D. melanogaster and its closely related species in the melanogaster group (D. simulans, D. sechellia, D. mauritiana, D. erecta, and D. yakuba) using a combination of available BAC sequences, whole genome shotgun Sanger reads, Illumina short read deep sequencing, and fluorescence in situ hybridization. I show that Rsp repeats have euchromatic locations throughout the D. melanogaster genome, that Rsp arrays show evidence for concerted evolution, and that Rsp repeats exist outside of D. melanogaster, in the melanogaster group. The repeats in these species are considerably diverged at the sequence level compared to D. melanogaster, and have a strikingly different genomic distribution, even between closely related sister taxa. The genomic organization of the Rsp repeat in the D. melanogaster genome is complex: it consists of large blocks of tandem repeats in the heterochromatin and small blocks of tandem repeats in the euchromatin. My discovery of heterochromatic Rsp-like sequences outside of D. melanogaster suggests that SD evolved after its target satellite and that the evolution of the Rsp satellite family is highly dynamic over a short evolutionary time scale (<240,000 years).
Deakin, Janine E; Edwards, Melanie J; Patel, Hardip; O'Meally, Denis; Lian, Jinmin; Stenhouse, Rachael; Ryan, Sam; Livernois, Alexandra M; Azad, Bhumika; Holleley, Clare E; Li, Qiye; Georges, Arthur
2016-06-10
Squamates (lizards and snakes) are a speciose lineage of reptiles displaying considerable karyotypic diversity, particularly among lizards. Understanding the evolution of this diversity requires comparison of genome organisation between species. Although the genomes of several squamate species have now been sequenced, only the green anole lizard has any sequence anchored to chromosomes. There is only limited gene mapping data available for five other squamates. This makes it difficult to reconstruct the events that have led to extant squamate karyotypic diversity. The purpose of this study was to anchor the recently sequenced central bearded dragon (Pogona vitticeps) genome to chromosomes to trace the evolution of squamate chromosomes. Assigning sequence to sex chromosomes was of particular interest for identifying candidate sex determining genes. By using two different approaches to map conserved blocks of genes, we were able to anchor approximately 42 % of the dragon genome sequence to chromosomes. We constructed detailed comparative maps between dragon, anole and chicken genomes, and where possible, made broader comparisons across Squamata using cytogenetic mapping information for five other species. We show that squamate macrochromosomes are relatively well conserved between species, supporting findings from previous molecular cytogenetic studies. Macrochromosome diversity between members of the Toxicofera clade has been generated by intrachromosomal, and a small number of interchromosomal, rearrangements. We reconstructed the ancestral squamate macrochromosomes by drawing upon comparative cytogenetic mapping data from seven squamate species and propose the events leading to the arrangements observed in representative species. In addition, we assigned over 8 Mbp of sequence containing 219 genes to the Z chromosome, providing a list of genes to begin testing as candidate sex determining genes. 
Anchoring of the dragon genome has provided substantial insight into the evolution of squamate genomes, enabling us to reconstruct ancestral macrochromosome arrangements at key positions in the squamate phylogeny, demonstrating that fusions between macrochromosomes or fusions of macrochromosomes and microchromosomes, have played an important role during the evolution of squamate genomes. Assigning sequence to the sex chromosomes has identified NR5A1 as a promising candidate sex determining gene in the dragon.
Kugelman, Jeffrey R; Wiley, Michael R; Mate, Suzanne; Ladner, Jason T; Beitzel, Brett; Fakoli, Lawrence; Taweh, Fahn; Prieto, Karla; Diclaro, Joseph W; Minogue, Timothy; Schoepp, Randal J; Schaecher, Kurt E; Pettitt, James; Bateman, Stacey; Fair, Joseph; Kuhn, Jens H; Hensley, Lisa; Park, Daniel J; Sabeti, Pardis C; Sanchez-Lockhart, Mariano; Bolay, Fatorma K; Palacios, Gustavo
2015-07-01
To support Liberia's response to the ongoing Ebola virus (EBOV) disease epidemic in Western Africa, we established in-country advanced genomic capabilities to monitor EBOV evolution. Twenty-five EBOV genomes were sequenced at the Liberian Institute for Biomedical Research, which provided an in-depth view of EBOV diversity in Liberia during September 2014-February 2015. These sequences were consistent with a single virus introduction to Liberia; however, shared ancestry with isolates from Mali indicated at least 1 additional instance of movement into or out of Liberia. The pace of change is generally consistent with previous estimates of mutation rate. We observed 23 nonsynonymous mutations and 1 nonsense mutation. Six of these changes are within known binding sites for sequence-based EBOV medical countermeasures; however, the diagnostic and therapeutic impact of EBOV evolution within Liberia appears to be low.
Kugelman, Jeffrey R.; Wiley, Michael R.; Mate, Suzanne; Ladner, Jason T.; Beitzel, Brett; Fakoli, Lawrence; Taweh, Fahn; Prieto, Karla; Diclaro, Joseph W.; Minogue, Timothy; Schoepp, Randal J.; Schaecher, Kurt E.; Pettitt, James; Bateman, Stacey; Fair, Joseph; Kuhn, Jens H.; Hensley, Lisa; Park, Daniel J.; Sabeti, Pardis C.; Sanchez-Lockhart, Mariano; Bolay, Fatorma K.
2015-01-01
To support Liberia’s response to the ongoing Ebola virus (EBOV) disease epidemic in Western Africa, we established in-country advanced genomic capabilities to monitor EBOV evolution. Twenty-five EBOV genomes were sequenced at the Liberian Institute for Biomedical Research, which provided an in-depth view of EBOV diversity in Liberia during September 2014–February 2015. These sequences were consistent with a single virus introduction to Liberia; however, shared ancestry with isolates from Mali indicated at least 1 additional instance of movement into or out of Liberia. The pace of change is generally consistent with previous estimates of mutation rate. We observed 23 nonsynonymous mutations and 1 nonsense mutation. Six of these changes are within known binding sites for sequence-based EBOV medical countermeasures; however, the diagnostic and therapeutic impact of EBOV evolution within Liberia appears to be low. PMID:26079255
Life Cycle Evolution and Systematics of Campanulariid Hydrozoans
2004-09-01
kit according to manufacturer’s protocol. Purified PCR product was cycle-sequenced using either Big Dye 2 or 3 sequencing chemistry (ABI), following...ethidium bromide and purified with PCR purification kits (Qiagen). Purified products were cycle-sequenced with either Big Dye 2 or 3 sequencing chemistry...PCR purification kit (Qiagen). The purified product was cycle-sequenced using Big Dye 2 sequencing chemistry (ABI) following the manufacturer’s
Mananga, Eugene S; Reid, Alicia E
This paper presents the study of finite pulse widths for the BABA pulse sequence using the Floquet-Magnus expansion (FME) approach. In the FME scheme, the first-order term F1 is identical to its counterparts in average Hamiltonian theory (AHT) and Floquet theory (FT). However, the timing part in the FME approach is introduced via the Λ1(t) function, which is not present in other schemes. This function provides an easy way of evaluating the spin evolution during "the time in between" through the Magnus expansion of the operator connected to the timing part of the evolution. The evaluation of Λ1(t) is especially useful for the analysis of non-stroboscopic evolution. Here, the importance of the boundary conditions, which provide a natural choice of Λ1(0), is ignored. This work uses the Λ1(t) function to compare the efficiency of the BABA pulse sequence with δ-pulses and the BABA pulse sequence with finite pulses. Calculations of Λ1(t) and F1 are presented.
Mananga, Eugene S.; Reid, Alicia E.
2013-01-01
This paper presents the study of finite pulse widths for the BABA pulse sequence using the Floquet-Magnus expansion (FME) approach. In the FME scheme, the first-order term F1 is identical to its counterparts in average Hamiltonian theory (AHT) and Floquet theory (FT). However, the timing part in the FME approach is introduced via the Λ1(t) function, which is not present in other schemes. This function provides an easy way of evaluating the spin evolution during “the time in between” through the Magnus expansion of the operator connected to the timing part of the evolution. The evaluation of Λ1(t) is especially useful for the analysis of non-stroboscopic evolution. Here, the importance of the boundary conditions, which provide a natural choice of Λ1(0), is ignored. This work uses the Λ1(t) function to compare the efficiency of the BABA pulse sequence with δ-pulses and the BABA pulse sequence with finite pulses. Calculations of Λ1(t) and F1 are presented. PMID:25792763
Zhang, Ran; Yin, Yinliang; Zhang, Yujun; Li, Kexin; Zhu, Hongxia; Gong, Qin; Wang, Jianwu; Hu, Xiaoxiang; Li, Ning
2012-01-01
As the number of transgenic livestock increases, reliable detection and molecular characterization of transgene integration sites and copy number are crucial not only for interpreting the relationship between the integration site and the specific phenotype but also for commercial and economic demands. However, the ability of conventional PCR techniques to detect incomplete and multiple integration events is limited, making it technically challenging to characterize transgenes. Next-generation sequencing has enabled cost-effective, routine and widespread high-throughput genomic analysis. Here, we demonstrate the use of next-generation sequencing to extensively characterize cattle harboring a 150-kb human lactoferrin transgene that was initially analyzed by chromosome walking without success. Using this approach, the sites upstream and downstream of the target gene integration site in the host genome were identified at the single nucleotide level. The sequencing result was verified by event-specific PCR for the integration sites and FISH for the chromosomal location. Sequencing depth analysis revealed that multiple copies of the incomplete target gene and the vector backbone were present in the host genome. Upon integration, complex recombination was also observed between the target gene and the vector backbone. These findings indicate that next-generation sequencing is a reliable and accurate approach for the molecular characterization of the transgene sequence, integration sites and copy number in transgenic species. PMID:23185606
2011-01-01
Background: Natural acquisition of novel genes from other organisms by horizontal or lateral gene transfer is well established for microorganisms. There is now growing evidence that horizontal gene transfer also plays important roles in the evolution of eukaryotes. Genome-sequencing and EST projects of plant and animal associated nematodes such as Brugia, Meloidogyne, Bursaphelenchus and Pristionchus indicate horizontal gene transfer as a key adaptation towards parasitism and pathogenicity. However, little is known about the functional activity and evolutionary longevity of genes acquired by horizontal gene transfer and the mechanisms favoring such processes. Results: We examine the transfer of cellulase genes to the free-living and beetle-associated nematode Pristionchus pacificus, for which detailed phylogenetic knowledge is available, to address predictions by evolutionary theory for successful gene transfer. We used transcriptomics in seven Pristionchus species and three other related diplogastrid nematodes with a well-defined phylogenetic framework to study the evolution of ancestral cellulase genes acquired by horizontal gene transfer. We performed intra-species, inter-species and inter-genic analysis by comparing the transcriptomes of these ten species and tested for cellulase activity in each species. Species with cellulase genes in their transcriptome always exhibited cellulase activity indicating functional integration into the host's genome and biology. The phylogenetic profile of cellulase genes was congruent with the species phylogeny demonstrating gene longevity. Cellulase genes show notable turnover with elevated birth and death rates. Comparison by sequencing of three selected cellulase genes in 24 natural isolates of Pristionchus pacificus suggests these high evolutionary dynamics to be associated with copy number variations and positive selection.
Conclusion: We could demonstrate functional integration of acquired cellulase genes into the nematode's biology as predicted by theory. Thus, functional assimilation, remarkable gene turnover and selection might represent key features of horizontal gene transfer events in nematodes. PMID:21232122
Mayer, Werner E; Schuster, Lisa N; Bartelmes, Gabi; Dieterich, Christoph; Sommer, Ralf J
2011-01-13
Natural acquisition of novel genes from other organisms by horizontal or lateral gene transfer is well established for microorganisms. There is now growing evidence that horizontal gene transfer also plays important roles in the evolution of eukaryotes. Genome-sequencing and EST projects of plant and animal associated nematodes such as Brugia, Meloidogyne, Bursaphelenchus and Pristionchus indicate horizontal gene transfer as a key adaptation towards parasitism and pathogenicity. However, little is known about the functional activity and evolutionary longevity of genes acquired by horizontal gene transfer and the mechanisms favoring such processes. We examine the transfer of cellulase genes to the free-living and beetle-associated nematode Pristionchus pacificus, for which detailed phylogenetic knowledge is available, to address predictions by evolutionary theory for successful gene transfer. We used transcriptomics in seven Pristionchus species and three other related diplogastrid nematodes with a well-defined phylogenetic framework to study the evolution of ancestral cellulase genes acquired by horizontal gene transfer. We performed intra-species, inter-species and inter-genic analysis by comparing the transcriptomes of these ten species and tested for cellulase activity in each species. Species with cellulase genes in their transcriptome always exhibited cellulase activity indicating functional integration into the host's genome and biology. The phylogenetic profile of cellulase genes was congruent with the species phylogeny demonstrating gene longevity. Cellulase genes show notable turnover with elevated birth and death rates. Comparison by sequencing of three selected cellulase genes in 24 natural isolates of Pristionchus pacificus suggests these high evolutionary dynamics to be associated with copy number variations and positive selection. 
We could demonstrate functional integration of acquired cellulase genes into the nematode's biology as predicted by theory. Thus, functional assimilation, remarkable gene turnover and selection might represent key features of horizontal gene transfer events in nematodes.
NASA Astrophysics Data System (ADS)
Song, Insun; Chang, Chandong
2017-05-01
This paper presents a complete set of in situ stress calculations for depths of 200-1400 meters below seafloor at Integrated Ocean Drilling Program (IODP) Site C0002, near the seaward margin of the Kumano fore-arc basin, offshore from southwest Japan. The vertical stress component was obtained by integrating bulk density calculations from moisture and density logging data, and the two horizontal components were stochastically optimized by minimizing misfits between a probabilistic model and measured breakout widths for every 30 m vertical segment of the wellbore. Our stochastic optimization process reveals that the in situ stress regime is decoupled across an unconformity between an accretionary complex and the overlying Kumano fore-arc basin. The stress condition above the unconformity is close to the critical condition for normal faulting, while below the unconformity the geologic system is stable in a normal to strike-slip fault stress regime. The critical state of stress demonstrates that the tectonic evolution of the sedimentary system has been achieved mainly by the regionally continuous action of a major out-of-sequence thrust fault during sedimentation in the fore-arc basin. The stable stress condition in the accretionary prism is interpreted to have resulted from mechanical decoupling by the accommodation of large displacement along the megasplay fault.
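The vertical stress calculation described in this abstract, integrating bulk density over depth, can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' procedure: the function name, the trapezoidal rule, the default seawater density and the example profile are all hypothetical, and the first depth sample is assumed to be the seafloor (0 m below seafloor).

```python
G = 9.81  # gravitational acceleration in m/s^2


def vertical_stress(depths_m, densities_kg_m3,
                    water_depth_m=0.0, water_density_kg_m3=1025.0):
    """Cumulative vertical stress (Pa) at each depth sample.

    Assumptions (hypothetical, not from the source): depths are metres
    below seafloor in increasing order, the first sample is the seafloor,
    and density varies linearly between samples (trapezoidal rule).
    """
    if len(depths_m) != len(densities_kg_m3):
        raise ValueError("depths and densities must have the same length")
    if not depths_m:
        return []
    # Load of the water column above the seafloor.
    stress = water_density_kg_m3 * G * water_depth_m
    profile = [stress]
    for i in range(1, len(depths_m)):
        dz = depths_m[i] - depths_m[i - 1]
        if dz < 0:
            raise ValueError("depths must be in increasing order")
        mean_density = 0.5 * (densities_kg_m3[i] + densities_kg_m3[i - 1])
        stress += mean_density * G * dz
        profile.append(stress)
    return profile


if __name__ == "__main__":
    # Hypothetical density log: values are illustrative only.
    depths = [0.0, 200.0, 800.0, 1400.0]
    densities = [1600.0, 1800.0, 2000.0, 2100.0]
    for z, s in zip(depths, vertical_stress(depths, densities, water_depth_m=1900.0)):
        print(z, round(s / 1e6, 2), "MPa")
```

The two horizontal components cannot be obtained this way; in the abstract they come from a stochastic fit to measured breakout widths, which this sketch does not attempt.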
Spatial Models of Prebiotic Evolution: Soup Before Pizza?
NASA Astrophysics Data System (ADS)
Scheuring, István; Czárán, Tamás; Szabó, Péter; Károlyi, György; Toroczkai, Zoltán
2003-10-01
The problem of information integration and resistance to the invasion of parasitic mutants in prebiotic replicator systems is a notorious issue in research on the origin of life. Almost all theoretical studies published so far have demonstrated that some kind of spatial structure is indispensable for the persistence and/or the parasite resistance of any feasible replicator system. Based on a detailed critical survey of spatial models of prebiotic information integration, we suggest a possible scenario for replicator system evolution leading to the emergence of the first protocells capable of independent life. We show that even the spatial versions of the hypercycle model are vulnerable to selfish parasites in heterogeneous habitats. In contrast, the metabolic system remains persistent and coexists with its parasites both on heterogeneous surfaces and in chaotically mixing flowing media. Persistent metabolic parasites can be converted to metabolic cooperators, or they can gradually obtain replicase activity. Our simulations show that, once replicase activity has emerged, a gradual and simultaneous evolutionary improvement of replicase functionality (speed and fidelity) and template efficiency is possible only on a surface that constrains the mobility of macromolecule replicators. Based on the results of the models reviewed, we suggest that open chaotic flows ('soup') and surface dynamics ('pizza') both played key roles in the sequence of evolutionary events ultimately concluding in the appearance of the first living cell on Earth.
Recapitulating phylogenies using k-mers: from trees to networks.
Bernard, Guillaume; Ragan, Mark A; Chan, Cheong Xin
2016-01-01
Ernst Haeckel based his landmark Tree of Life on the supposed ontogenic recapitulation of phylogeny, i.e. that successive embryonic stages during the development of an organism re-trace the morphological forms of its ancestors over the course of evolution. Much of this idea has since been discredited. Today, phylogenies are often based on families of molecular sequences. The standard approach starts with a multiple sequence alignment, in which the sequences are arranged relative to each other in a way that maximises a measure of similarity position-by-position along their entire length. A tree (or sometimes a network) is then inferred. Rigorous multiple sequence alignment is computationally demanding, and evolutionary processes that shape the genomes of many microbes (bacteria, archaea and some morphologically simple eukaryotes) can add further complications. In particular, recombination, genome rearrangement and lateral genetic transfer undermine the assumptions that underlie multiple sequence alignment, and imply that a tree-like structure may be too simplistic. Here, using genome sequences of 143 bacterial and archaeal genomes, we construct a network of phylogenetic relatedness based on the number of shared k-mers (subsequences at fixed length k). Our findings suggest that the network captures not only key aspects of microbial genome evolution as inferred from a tree, but also features that are not treelike. The method is highly scalable, allowing for investigation of genome evolution across a large number of genomes. Instead of using specific regions or sequences from genome sequences, or indeed Haeckel's idea of ontogeny, we argue that genome phylogenies can be inferred using k-mers from whole-genome sequences. Representing these networks dynamically allows biological questions of interest to be formulated and addressed quickly and in a visually intuitive manner.
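The idea of relating genomes by their shared k-mers can be sketched in a few lines of Python. This is a toy illustration, not the authors' pipeline: the function names are hypothetical, k-mers are counted as distinct substrings on one strand only, and the edge threshold `min_shared` is an assumed parameter.

```python
def kmer_set(sequence, k):
    """Set of all distinct length-k substrings of a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def shared_kmer_count(seq_a, seq_b, k):
    """Number of distinct k-mers present in both sequences."""
    return len(kmer_set(seq_a, k) & kmer_set(seq_b, k))


def shared_kmer_network(genomes, k, min_shared=1):
    """Edges between genome pairs that share at least min_shared k-mers.

    genomes maps a name to its sequence; the result maps each
    (name_a, name_b) pair, in sorted order, to its shared k-mer count.
    """
    names = sorted(genomes)
    edges = {}
    for i, name_a in enumerate(names):
        for name_b in names[i + 1:]:
            shared = shared_kmer_count(genomes[name_a], genomes[name_b], k)
            if shared >= min_shared:
                edges[(name_a, name_b)] = shared
    return edges


if __name__ == "__main__":
    # Hypothetical toy sequences; real genomes are millions of bases long.
    toy = {"g1": "ACGTAC", "g2": "CGTACG", "g3": "TTTTTT"}
    print(shared_kmer_network(toy, k=3))
```

Because no alignment is computed, the cost per genome pair is a set intersection, which is what makes this kind of approach attractive for large collections. Raising `min_shared` prunes weak edges, leaving a sparser network.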
A Glimpse into the Satellite DNA Library in Characidae Fish (Teleostei, Characiformes)
Utsunomia, Ricardo; Ruiz-Ruano, Francisco J.; Silva, Duílio M. Z. A.; Serrano, Érica A.; Rosa, Ivana F.; Scudeler, Patrícia E. S.; Hashimoto, Diogo T.; Oliveira, Claudio; Camacho, Juan Pedro M.; Foresti, Fausto
2017-01-01
Satellite DNA (satDNA) is an abundant fraction of repetitive DNA in eukaryotic genomes and plays an important role in genome organization and evolution. In general, satDNA sequences follow a concerted evolutionary pattern through the intragenomic homogenization of different repeat units. In addition, the satDNA library hypothesis predicts that related species share a series of satDNA variants descended from a common ancestor species, with differential amplification of different satDNA variants. The finding of the same satDNA family in species belonging to different genera within Characidae fish provided the opportunity to test both the concerted evolution and library hypotheses. For this purpose, we analyzed sequence variation and abundance of this satDNA family in ten species, by a combination of next generation sequencing (NGS), PCR and Sanger sequencing, and fluorescence in situ hybridization (FISH). We found extensive between-species variation in the number and size of pericentromeric FISH signals. At the genomic level, the analysis of thousands of DNA sequences obtained by Illumina sequencing and PCR amplification allowed us to define 150 haplotypes, which were linked in a common minimum spanning tree where different patterns of concerted evolution were apparent. This also provided a glimpse into the satDNA library of this group of species. Consistent with the library hypothesis, different variants of this satDNA differed greatly in abundance between species, from highly abundant to merely relictual variants. PMID:28855916
Leung, Preston; Eltahla, Auda A; Lloyd, Andrew R; Bull, Rowena A; Luciani, Fabio
2017-07-15
With the advent of affordable deep sequencing technologies, detection of low frequency variants within genetically diverse viral populations can now be achieved with unprecedented depth and efficiency. The high-resolution data provided by next generation sequencing technologies is currently recognised as the gold standard in estimation of viral diversity. In the analysis of rapidly mutating viruses, longitudinal deep sequencing datasets from viral genomes during individual infection episodes, as well as at the epidemiological level during outbreaks, now allow for more sophisticated analyses such as statistical estimates of the impact of complex mutation patterns on the evolution of the viral populations both within and between hosts. These analyses are revealing more accurate descriptions of the evolutionary dynamics that underpin the rapid adaptation of these viruses to the host response, and to drug therapies. This review assesses recent developments in methods and provides informative research examples using deep sequencing data generated from rapidly mutating viruses infecting humans, particularly hepatitis C virus (HCV), human immunodeficiency virus (HIV), Ebola virus and influenza virus, to understand the evolution of viral genomes and to explore the relationship between viral mutations and the host adaptive immune response. Finally, we discuss limitations in current technologies, and future directions that take advantage of publicly available large deep sequencing datasets.
Yang, Zujun; Zhang, Tao; Li, Guangrong; Nevo, Eviatar
2011-12-01
Dehydrins are one of the major stress-induced gene families, and the expression of dehydrin 6 (Dhn6) is strictly related to drought in barley. In order to investigate how the evolution of the Dhn6 gene is associated with adaptation to environmental changes, we examined 48 genotypes of wild barley, Hordeum spontaneum, from "Evolution Canyon" at Mount Carmel, Israel. The Dhn6 sequences of the 48 genotypes were identified, and a recent insertion of 342 bp in the 5'UTR was found in the sequences of 11 genotypes. Both nucleotide and haplotype diversity of single nucleotide polymorphisms in Dhn6 coding regions were higher on the AS ("African" slope or dry slope) than on the ES ("European" slope or humid slope), and the Tajima D and Fu-Li tests rejected neutrality of SNP diversity. Expression analysis indicated that the 342 bp insertion in the 5'UTR was associated with earlier up-regulation of Dhn6 after dehydration. The genetic divergence of amino acid sequences indicated significant positive selection of Dhn6 among the wild barley populations. The diversity of Dhn6 on microclimatically divergent slopes suggested that Dhn6 has been subjected to natural selection and is adaptively associated with drought resistance of wild barley at "Evolution Canyon".
Blenda, Anna; Fang, David D.; Rami, Jean-François; Garsmeur, Olivier; Luo, Feng; Lacape, Jean-Marc
2012-01-01
A consensus genetic map of tetraploid cotton was constructed using six high-density maps and after the integration of a sequence-based marker redundancy check. Public cotton SSR libraries (17,343 markers) were curated for sequence redundancy using 90% as a similarity cutoff. As a result, 20% of the markers (3,410) could be considered redundant with some other markers. The marker redundancy information was a crucial part of the map integration process, in which the six most informative interspecific Gossypium hirsutum×G. barbadense genetic maps were used for assembling a high density consensus (HDC) map for tetraploid cotton. With redundant markers removed, the HDC map could be constructed thanks to the sufficient number of collinear non-redundant markers in common between the component maps. The HDC map consists of 8,254 loci, originating from 6,669 markers, and spans 4,070 cM, with an average of 2 loci per cM. The HDC map presents a high rate of locus duplications, as 1,292 markers among the 6,669 were mapped at more than one locus. Two thirds of the duplications bridge homoeologous AT and DT chromosomes of the allopolyploid cotton genome, with an average of 64 duplications per AT/DT chromosome pair. Sequences of 4,744 mapped markers were used for a mutual BLAST alignment (BBMH) with the 13 major scaffolds of the recently released Gossypium raimondii genome, indicating a high level of homology between the diploid D genome and the tetraploid cotton genetic map, with only a few minor possible structural rearrangements. Overall, the HDC map will serve as a valuable resource for trait QTL comparative mapping, map-based cloning of important genes, and better understanding of the genome structure and evolution of tetraploid cotton. PMID:23029214
Sullivan, Matthew B; Krastins, Bryan; Hughes, Jennifer L; Kelly, Libusha; Chase, Michael; Sarracino, David; Chisholm, Sallie W
2009-01-01
Prochlorococcus, an abundant phototroph in the oceans, are infected by members of three families of viruses: myo-, podo- and siphoviruses. Genomes of myo- and podoviruses isolated on Prochlorococcus contain DNA replication machinery and virion structural genes homologous to those from coliphages T4 and T7 respectively. They also contain a suite of genes of cyanobacterial origin, most notably photosynthesis genes, which are expressed during infection and appear integral to the evolutionary trajectory of both host and phage. Here we present the first genome of a cyanobacterial siphovirus, P-SS2, which was isolated from Atlantic slope waters using a Prochlorococcus host (MIT9313). The P-SS2 genome is larger than, and considerably divergent from, previously sequenced siphoviruses. It appears most closely related to lambdoid siphoviruses, with which it shares 13 functional homologues. The ∼108 kb P-SS2 genome encodes 131 predicted proteins and notably lacks photosynthesis genes which have consistently been found in other marine cyanophage, but does contain 14 other cyanobacterial homologues. While only six structural proteins were identified from the genome sequence, 35 proteins were detected experimentally; these mapped onto capsid and tail structural modules in the genome. P-SS2 is potentially capable of integration into its host as inferred from bioinformatically identified genetic machinery int, bet, exo and a 53 bp attachment site. The host attachment site appears to be a genomic island that is tied to insertion sequence (IS) activity that could facilitate mobility of a gene involved in the nitrogen-stress response. The homologous region and a secondary IS-element hot-spot in Synechococcus RS9917 are further evidence of IS-mediated genome evolution coincident with a probable relic prophage integration event. This siphovirus genome provides a glimpse into the biology of a deep-photic zone phage as well as the ocean cyanobacterial prophage and IS element ‘mobilome’. 
PMID:19840100
Pre-main Sequence Evolution and the Hydrogen-Burning Minimum Mass
NASA Astrophysics Data System (ADS)
Nakano, Takenori
There is a lower limit to the mass of main-sequence stars (the hydrogen-burning minimum mass), below which stars cannot replenish the energy lost from their surfaces with the energy released by hydrogen burning in their cores. This is caused by electron degeneracy, which suppresses the increase of the central temperature with contraction. Determining this lower limit requires accurate knowledge of the pre-main-sequence evolution of very low-mass stars, in which the effect of electron degeneracy is important. We review how Hayashi and Nakano (1963) carried out the first determination of this limit.
Multiplex Reverse Transcription-PCR for Simultaneous Surveillance of Influenza A and B Viruses
Zhou, Bin; Barnes, John R.; Sessions, October M.; Chou, Tsui-Wen; Wilson, Malania; Stark, Thomas J.; Volk, Michelle; Spirason, Natalie; Halpin, Rebecca A.; Kamaraj, Uma Sangumathi; Ding, Tao; Stockwell, Timothy B.; Ghedin, Elodie; Barr, Ian G.
2017-01-01
Influenza A and B viruses are the causative agents of annual influenza epidemics that can be severe, and influenza A viruses intermittently cause pandemics. Sequence information from influenza virus genomes is instrumental in determining mechanisms underpinning antigenic evolution and antiviral resistance. However, due to sequence diversity and the dynamics of influenza virus evolution, rapid and high-throughput sequencing of influenza viruses remains a challenge. We developed a single-reaction influenza A/B virus (FluA/B) multiplex reverse transcription-PCR (RT-PCR) method that amplifies the most critical genomic segments (hemagglutinin [HA], neuraminidase [NA], and matrix [M]) of seasonal influenza A and B viruses for next-generation sequencing, regardless of viral type, subtype, or lineage. Herein, we demonstrate that the strategy is highly sensitive and robust. The strategy was validated on thousands of seasonal influenza A and B virus-positive specimens using multiple next-generation sequencing platforms. PMID:28978683
Cytogenetic evidence for asexual evolution of bdelloid rotifers.
Mark Welch, Jessica L; Mark Welch, David B; Meselson, Matthew
2004-02-10
DNA sequencing has shown individual bdelloid rotifer genomes to contain two or more diverged copies of every gene examined and has revealed no closely similar copies. These and other findings are consistent with long-term asexual evolution of bdelloids. It is not entirely ruled out, however, that bdelloid genomes consist of previously undetected pairs of sequences so similar as to be identical over the regions sequenced, as might result if bdelloids were highly inbred sexual diploids or polyploids. Here, we employ fluorescent in situ hybridization with cosmid probes to determine the copy number and chromosomal distribution of the heat shock gene hsp82 and adjacent sequences in the bdelloid Philodina roseola. We conclude that the four copies identified by sequencing are the only ones present and that each is on a separate chromosome. Bdelloids therefore are not highly homozygous sexually reproducing diploids or polyploids.
Sampathkumar, Raghavan; Shadabi, Elnaz; Luo, Ma
2012-01-01
As of February 2012, 50 circulating recombinant forms (CRFs) have been reported for HIV-1 and one for HIV-2. According to the HIV Sequence Compendium 2011, the HIV sequence database holds 414,398 sequences. The existence of CRFs that are an amalgamation of sequences derived from six or more subtypes (CRF27_cpx, where cpx refers to complex, is a mosaic with sequences from 6 different subtypes besides an unclassified fragment) testifies to the continual divergent evolution of the virus, with its approximate rate of evolution of 1% per year. This phenomenon poses a tremendous challenge for vaccine development against HIV/AIDS, a devastating disease that killed 1.8 million patients in 2010. Here, we explore the interaction between HIV-1 and host genetic variation in the context of HIV/AIDS and antiretroviral therapy response. PMID:22666249
Evolution of asteroidal orbits with high inclinations
NASA Astrophysics Data System (ADS)
Solovaya, Nina A.; Pittich, Eduard M.
1993-10-01
The 20,000-year orbital evolution of a massless fictitious asteroid located at the border of the Hill gravitational sphere has been investigated. Eleven orbits with eccentricities from 0.0 to 0.4, in five groups of inclinations from 40 deg to 80 deg, were numerically integrated with the perturbations of six major planets, using an n-body numerical integration program with Everhart's RA15 integrator. For each group, the time evolution of the orbital elements of the asteroids is presented.
Ma, Peng-Fei; Guo, Zhen-Hua; Li, De-Zhu
2012-01-01
Compared to their counterparts in animals, the mitochondrial (mt) genomes of angiosperms exhibit a number of unique features. However, unravelling their evolution is hindered by the small number of completed genomes, essentially all of which were Sanger sequenced. While next-generation sequencing technologies have revolutionized chloroplast genome sequencing, they are just beginning to be applied to angiosperm mt genomes. Chloroplast genomes of grasses (Poaceae) have undergone episodic evolution, and the evolutionary rate was suggested to be correlated between chloroplast and mt genomes in Poaceae. It is interesting to investigate whether correlated rate change also occurred in grass mt genomes, as expected under lineage effects. A time-calibrated phylogenetic tree is needed to examine rate change. We determined a largely completed mt genome from a bamboo, Ferrocalamus rimosivaginus (Poaceae), through Illumina sequencing of total DNA. With a combination of de novo and reference-guided assembly, 39.5-fold coverage Illumina reads were finally assembled into scaffolds totalling 432,839 bp. The assembled genome contains nearly the same genes as the completed mt genomes in Poaceae. To examine the evolutionary rate in grass mt genomes, we reconstructed a phylogenetic tree including 22 taxa based on 31 mt genes. The topology of the well-resolved tree was almost identical to that inferred from the chloroplast genome, with only minor differences. The inconsistency possibly derived from long branch attraction in the mtDNA tree. By calculating absolute substitution rates, we found a significant rate change (∼4-fold) in the mt genome before and after the diversification of Poaceae, in both synonymous and nonsynonymous terms. Furthermore, the rate change was correlated with that of chloroplast genomes in grasses. Our result demonstrates that Illumina sequencing technology is a rapid and efficient approach to obtaining angiosperm mt genome sequences.
The parallel episodic evolution of mt and chloroplast genomes in grasses is consistent with lineage effects. PMID:22272330
When are pathogen genome sequences informative of transmission events?
Ferguson, Neil; Jombart, Thibaut
2018-01-01
Recent years have seen the development of numerous methodologies for reconstructing transmission trees in infectious disease outbreaks from densely sampled whole genome sequence data. However, a fundamental and as of yet poorly addressed limitation of such approaches is the requirement for genetic diversity to arise on epidemiological timescales. Specifically, the position of infected individuals in a transmission tree can only be resolved by genetic data if mutations have accumulated between the sampled pathogen genomes. To quantify and compare the useful genetic diversity expected from genetic data in different pathogen outbreaks, we introduce here the concept of ‘transmission divergence’, defined as the number of mutations separating whole genome sequences sampled from transmission pairs. Using parameter values obtained by literature review, we simulate outbreak scenarios alongside sequence evolution using two models described in the literature to describe transmission divergence of ten major outbreak-causing pathogens. We find that while mean values vary significantly between the pathogens considered, their transmission divergence is generally very low, with many outbreaks characterised by large numbers of genetically identical transmission pairs. We describe the impact of transmission divergence on our ability to reconstruct outbreaks using two outbreak reconstruction tools, the R packages outbreaker and phybreak, and demonstrate that, in agreement with previous observations, genetic sequence data of rapidly evolving pathogens such as RNA viruses can provide valuable information on individual transmission events. Conversely, sequence data of pathogens with lower mean transmission divergence, including Streptococcus pneumoniae, Shigella sonnei and Clostridium difficile, provide little to no information about individual transmission events. 
Our results highlight the informational limitations of genetic sequence data in certain outbreak scenarios, and demonstrate the need to expand the toolkit of outbreak reconstruction tools to integrate other types of epidemiological data. PMID:29420641
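The notion of transmission divergence defined in this abstract can be illustrated with a deliberately simplified calculation. The sketch below is not one of the two sequence evolution models used in the study. It assumes that mutations accumulate as a Poisson process at a constant per-site rate, and every parameter value and function name is hypothetical.

```python
import math


def expected_divergence(rate_per_site_per_day, genome_length, days_between_samples):
    """Expected number of mutations separating two genomes of a transmission pair.

    Assumes a constant substitution rate and a Poisson mutation process
    (a simplifying assumption made for this sketch only).
    """
    return rate_per_site_per_day * genome_length * days_between_samples


def prob_identical(expected_mutations):
    """Poisson probability that a transmission pair shows zero mutations."""
    return math.exp(-expected_mutations)


if __name__ == "__main__":
    # Hypothetical parameter sets, chosen only to contrast the two regimes.
    scenarios = {
        "fast-evolving, small genome": (1e-5, 10_000, 10),
        "slow-evolving, large genome": (1e-9, 4_000_000, 10),
    }
    for label, params in scenarios.items():
        lam = expected_divergence(*params)
        print(f"{label}: expected mutations = {lam:.3f}, "
              f"P(identical pair) = {prob_identical(lam):.3f}")
```

Under these assumptions, a low expected divergence implies that most transmission pairs are genetically identical, which is the situation in which sequence data alone cannot resolve who infected whom.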
Andersen, Mikael R.; Salazar, Margarita P.; Schaap, Peter J.; van de Vondervoort, Peter J.I.; Culley, David; Thykaer, Jette; Frisvad, Jens C.; Nielsen, Kristian F.; Albang, Richard; Albermann, Kaj; Berka, Randy M.; Braus, Gerhard H.; Braus-Stromeyer, Susanna A.; Corrochano, Luis M.; Dai, Ziyu; van Dijck, Piet W.M.; Hofmann, Gerald; Lasure, Linda L.; Magnuson, Jon K.; Menke, Hildegard; Meijer, Martin; Meijer, Susan L.; Nielsen, Jakob B.; Nielsen, Michael L.; van Ooyen, Albert J.J.; Pel, Herman J.; Poulsen, Lars; Samson, Rob A.; Stam, Hein; Tsang, Adrian; van den Brink, Johannes M.; Atkins, Alex; Aerts, Andrea; Shapiro, Harris; Pangilinan, Jasmyn; Salamov, Asaf; Lou, Yigong; Lindquist, Erika; Lucas, Susan; Grimwood, Jane; Grigoriev, Igor V.; Kubicek, Christian P.; Martinez, Diego; van Peij, Noël N.M.E.; Roubos, Johannes A.; Nielsen, Jens; Baker, Scott E.
2011-01-01
The filamentous fungus Aspergillus niger exhibits great diversity in its phenotype. It is found globally, both as marine and terrestrial strains, produces both organic acids and hydrolytic enzymes in high amounts, and some isolates exhibit pathogenicity. Although the genome of an industrial enzyme-producing A. niger strain (CBS 513.88) has already been sequenced, the versatility and diversity of this species compel additional exploration. We therefore undertook whole-genome sequencing of the acidogenic A. niger wild-type strain (ATCC 1015) and produced a genome sequence of very high quality. Only 15 gaps are present in the sequence, and half the telomeric regions have been elucidated. Moreover, sequence information from ATCC 1015 was used to improve the genome sequence of CBS 513.88. Chromosome-level comparisons uncovered several genome rearrangements, deletions, a clear case of strain-specific horizontal gene transfer, and identification of 0.8 Mb of novel sequence. Single nucleotide polymorphisms per kilobase (SNPs/kb) between the two strains were found to be exceptionally high (average: 7.8, maximum: 160 SNPs/kb). High variation within the species was confirmed with exo-metabolite profiling and phylogenetics. Detailed lists of alleles were generated, and genotypic differences were observed to accumulate in metabolic pathways essential to acid production and protein synthesis. A transcriptome analysis supported up-regulation of genes associated with biosynthesis of amino acids that are abundant in glucoamylase A, tRNA-synthases, and protein transporters in the protein producing CBS 513.88 strain. Our results and data sets from this integrative systems biology analysis resulted in a snapshot of fungal evolution and will support further optimization of cell factories based on filamentous fungi. PMID:21543515
Parasitic plants have increased rates of molecular evolution across all three genomes
Bromham, Lindell; Cowman, Peter F; Lanfear, Robert
2013-01-01
Background Theoretical models and experimental evidence suggest that rates of molecular evolution could be raised in parasitic organisms compared to non-parasitic taxa. Parasitic plants provide an ideal test for these predictions, as there are at least a dozen independent origins of the parasitic lifestyle in angiosperms. Studies of a number of parasitic plant lineages have suggested faster rates of molecular evolution, but the results of some studies have been mixed. Comparative analysis of all parasitic plant lineages, including sequences from all three genomes, is needed to examine the generality of the relationship between rates of molecular evolution and parasitism in plants. Results We analysed DNA sequence data from the mitochondrial, nuclear and chloroplast genomes for 12 independent evolutionary origins of parasitism in angiosperms. We demonstrated that parasitic lineages have a faster rate of molecular evolution than their non-parasitic relatives in sequences for all three genomes, for both synonymous and nonsynonymous substitutions. Conclusions Our results prove that raised rates of molecular evolution are a general feature of parasitic plants, not confined to a few taxa or specific genes. We discuss possible causes for this relationship, including increased positive selection associated with host-parasite arms races, relaxed selection, reduced population size or repeated bottlenecks, increased mutation rates, and indirect causal links with generation time and body size. We find no evidence that faster rates are due to smaller effective populations sizes or changes in selection pressure. Instead, our results suggest that parasitic plants have a higher mutation rate than their close non-parasitic relatives. This may be due to a direct connection, where some aspect of the parasitic lifestyle drives the evolution of raised mutation rates. 
Alternatively, this pattern may be driven by an indirect connection between rates and parasitism: for example, parasitic plants tend to be smaller than their non-parasitic relatives, which may result in more cell generations per year, thus a higher rate of mutations arising from DNA copy errors per unit time. Demonstration that adoption of a parasitic lifestyle influences the rate of genomic evolution is relevant to attempts to infer molecular phylogenies of parasitic plants and to estimate their evolutionary divergence times using sequence data. PMID:23782527
NASA Astrophysics Data System (ADS)
Bolmont, E.; Gallet, F.; Mathis, S.; Charbonnel, C.; Amard, L.; Alibert, Y.
2017-08-01
Observations of hot-Jupiter exoplanets suggest that their orbital period distribution depends on the metallicity of the host stars. We investigate here whether the impact of the stellar metallicity on the evolution of the tidal dissipation inside the convective envelope of rotating stars and its resulting effect on the planetary migration might be a possible explanation for this observed statistical trend. We use a frequency-averaged tidal dissipation formalism coupled to an orbital evolution code and to rotating stellar evolution models in order to estimate the effect of a change of stellar metallicity on the evolution of close-in planets. We consider here two different stellar masses: 0.4 M⊙ and 1.0 M⊙ evolving from the early pre-main sequence phase up to the red-giant branch. We show that the metallicity of a star has a strong effect on the stellar parameters, which in turn strongly influence the tidal dissipation in the convective region. While on the pre-main sequence, the dissipation of a metal-poor Sun-like star is higher than the dissipation of a metal-rich Sun-like star; on the main sequence it is the opposite. However, for the 0.4 M⊙ star, the dependence of the dissipation with metallicity is much less visible. Using an orbital evolution model, we show that changing the metallicity leads to different orbital evolutions (e.g., planets migrate farther out from an initially fast-rotating metal-rich star). Using this model, we qualitatively reproduced the observational trends of the population of hot Jupiters with the metallicity of their host stars. However, more steps are needed to improve our model to try to quantitatively fit our results to the observations. Specifically, we need to improve the treatment of the rotation evolution in the orbital evolution model, and ultimately we need to consistently couple the orbital model to the stellar evolution model.
L1-mediated retrotransposition of murine B1 and B2 SINEs recapitulated in cultured cells.
Dewannieux, Marie; Heidmann, Thierry
2005-06-03
SINEs are short interspersed nucleotide elements with transpositional activity, present at a high copy number (up to a million) in mammalian genomes. They are 80-400 bp long, non-coding sequences which derive either from the 7SL RNA (e.g. human Alus, murine B1s) or tRNA (e.g. murine B2s) polymerase III-driven genes. We have previously demonstrated that Alus very efficiently divert the enzymatic machinery of the autonomous L1 LINE (long interspersed nucleotide element) retrotransposons to transpose at a high rate. Here we show, using an ex vivo assay for transposition, that both B1 and B2 SINEs can be mobilized by murine LINEs, with the hallmarks of a bona fide retrotransposition process, including target site duplications of varying lengths and integrations into A-rich sequences. Despite different phylogenetic origins, transposition of the tRNA-derived B2 sequences is as efficient as that of the human Alus, whereas that of B1s is 20-100-fold lower despite a similarly high copy number of these elements in the mouse genome. We provide evidence, via an appropriate nucleotide substitution within the B1 sequence in a domain essential for its intracellular targeting, that the current B1 SINEs are not optimal for transposition, a feature most probably selected for the host's sake in the course of evolution.
Merotto, Aldo; Jasieniuk, Marie; Osuna, Maria D; Vidotto, Francesco; Ferrero, Aldo; Fischer, Albert J
2009-02-25
Resistance to ALS-inhibiting herbicides in Cyperus difformis has evolved rapidly in many rice areas worldwide. This study identified the mechanism of resistance, assessed cross-resistance patterns to all five chemical groups of ALS-inhibiting herbicides in four C. difformis biotypes, and attempted to sequence the ALS gene. Whole-plant and ALS enzyme activity dose-response assays indicated that the WA biotype was resistant to all ALS-inhibiting herbicides evaluated. The IR biotype was resistant to bensulfuron-methyl, orthosulfamuron, imazethapyr, and propoxycarbazone-sodium and less resistant to bispyribac-sodium and halosulfuron-methyl, and susceptible to penoxsulam. ALS enzyme activity assays indicated that resistance is due to an altered target site yet mutations previously found to endow target-site resistance in weeds were not detected in the sequences obtained. The inability to detect resistance mutations in C. difformis may result from the presence of additional ALS genes, which were not amplified by the primers used. This study reports the first ALS gene sequence from Cyperus difformis. Certain ALS-inhibiting herbicides can still be used to control some resistant C. difformis biotypes. However, because cross-resistance to all five classes of ALS-inhibitors was detected in other resistant biotypes, these herbicides should only be used within an integrated weed management program designed to delay the evolution of herbicide resistance.
2004-12-09
We present here a draft genome sequence of the red jungle fowl, Gallus gallus. Because the chicken is a modern descendant of the dinosaurs and the first non-mammalian amniote to have its genome sequenced, the draft sequence of its genome--composed of approximately one billion base pairs of sequence and an estimated 20,000-23,000 genes--provides a new perspective on vertebrate genome evolution, while also improving the annotation of mammalian genomes. For example, the evolutionary distance between chicken and human provides high specificity in detecting functional elements, both non-coding and coding. Notably, many conserved non-coding sequences are far from genes and cannot be assigned to defined functional classes. In coding regions the evolutionary dynamics of protein domains and orthologous groups illustrate processes that distinguish the lineages leading to birds and mammals. The distinctive properties of avian microchromosomes, together with the inferred patterns of conserved synteny, provide additional insights into vertebrate chromosome architecture.
Fan, Yu; Xi, Liu; Hughes, Daniel S T; Zhang, Jianjun; Zhang, Jianhua; Futreal, P Andrew; Wheeler, David A; Wang, Wenyi
2016-08-24
Subclonal mutations reveal important features of the genetic architecture of tumors. However, accurate detection of mutations in genetically heterogeneous tumor cell populations using next-generation sequencing remains challenging. We develop MuSE ( http://bioinformatics.mdanderson.org/main/MuSE ), Mutation calling using a Markov Substitution model for Evolution, a novel approach for modeling the evolution of the allelic composition of the tumor and normal tissue at each reference base. MuSE adopts a sample-specific error model that reflects the underlying tumor heterogeneity to greatly improve the overall accuracy. We demonstrate the accuracy of MuSE in calling subclonal mutations in the context of large-scale tumor sequencing projects using whole exome and whole genome sequencing.
Iacobuzio-Donahue, Christine A
2012-01-01
Pancreatic cancer is a disease caused by the accumulation of genetic alterations in specific genes. Elucidation of the human genome sequence, in conjunction with technical advances in the ability to perform whole exome sequencing, has provided new insight into the mutational spectra characteristic of this lethal tumour type. Most recently, exomic sequencing has been used to clarify the clonal evolution of pancreatic cancer as well as provide time estimates of pancreatic carcinogenesis, indicating that a long window of opportunity may exist for early detection of this disease while in the curative stage. Moving forward, these mutational analyses indicate potential targets for personalised diagnostic and therapeutic intervention as well as the optimal timing for intervention based on the natural history of pancreatic carcinogenesis and progression. PMID:21749982
Singh, Prashant; Singh, Satya Shila; Elster, Josef; Mishra, Arun Kumar
2013-06-01
To assess phylogeny, population genetics, and the likely future course of cyanobacterial evolution based on nifH gene sequences, 41 heterocystous cyanobacterial strains collected from all over India were used in the present study. NifH gene sequence analysis data confirm that the heterocystous cyanobacteria are monophyletic, while the Stigonematales show a polyphyletic origin with grave intermixing. Further, analysis of nifH gene sequence data using intricate mathematical extrapolations revealed that the nucleotide diversity and recombination frequency are much greater in the Nostocales than in the Stigonematales. Similarly, DNA divergence studies showed significant values of divergence, with greater gene conversion tracts in the unbranched (Nostocales) than in the branched (Stigonematales) strains. Our data strongly support the origin of true branching cyanobacterial strains from the unbranched strains.
Sequencing Data Discovery and Integration for Earth System Science with MetaSeek
NASA Astrophysics Data System (ADS)
Hoarfrost, A.; Brown, N.; Arnosti, C.
2017-12-01
Microbial communities play a central role in biogeochemical cycles. Sequencing data resources from environmental sources have grown exponentially in recent years, and represent a singular opportunity to investigate microbial interactions with Earth system processes. Carrying out such meta-analyses depends on our ability to discover and curate sequencing data into large-scale integrated datasets. However, such integration efforts are currently challenging and time-consuming, with sequencing data scattered across multiple repositories and metadata that is not easily or comprehensively searchable. MetaSeek is a sequencing data discovery tool that integrates sequencing metadata from all the major data repositories, allowing the user to search and filter on datasets in a lightweight application with an intuitive, easy-to-use web-based interface. Users can save and share curated datasets, while other users can browse these data integrations or use them as a jumping off point for their own curation. Missing and/or erroneous metadata are inferred automatically where possible, and where not possible, users are prompted to contribute to the improvement of the sequencing metadata pool by correcting and amending metadata errors. Once an integrated dataset has been curated, users can follow simple instructions to download their raw data and quickly begin their investigations. In addition to the online interface, the MetaSeek database is easily queryable via an open API, further enabling users and facilitating integrations of MetaSeek with other data curation tools. This tool lowers the barriers to curation and integration of environmental sequencing data, clearing the path forward to illuminating the ecosystem-scale interactions between biological and abiotic processes.
Evolution of language: Lessons from the genome.
Fisher, Simon E
2017-02-01
The post-genomic era is an exciting time for researchers interested in the biology of speech and language. Substantive advances in molecular methodologies have opened up entire vistas of investigation that were not previously possible, or in some cases even imagined. Speculations concerning the origins of human cognitive traits are being transformed into empirically addressable questions, generating specific hypotheses that can be explicitly tested using data collected from both the natural world and experimental settings. In this article, I discuss a number of promising lines of research in this area. For example, the field has begun to identify genes implicated in speech and language skills, including not just disorders but also the normal range of abilities. Such genes provide powerful entry points for gaining insights into neural bases and evolutionary origins, using sophisticated experimental tools from molecular neuroscience and developmental neurobiology. At the same time, sequencing of ancient hominin genomes is giving us an unprecedented view of the molecular genetic changes that have occurred during the evolution of our species. Synthesis of data from these complementary sources offers an opportunity to robustly evaluate alternative accounts of language evolution. Of course, this endeavour remains challenging on many fronts, as I also highlight in the article. Nonetheless, such an integrated approach holds great potential for untangling the complexities of the capacities that make us human.
Internal Dynamics and Crustal Evolution of Mars
NASA Technical Reports Server (NTRS)
Zuber, Maria
2005-01-01
The objective of this work is to improve understanding of the internal structure, crustal evolution, and thermal history of Mars by combining geophysical data analysis of topography, gravity and magnetics with results from analytical and computational modeling. Accomplishments thus far in this investigation include: (1) development of a new crustal thickness model that incorporates constraints from Mars meteorites, corrections for polar cap masses and other surface loads, Pratt isostasy, and core flattening; (2) determination of a refined estimate of crustal thickness of Mars from geoid/topography ratios (GTRs); (3) derivation of a preliminary estimate of the k(sub 2) gravitational Love number and a preliminary estimate of possible dissipation within Mars consistent with this value; and (4) an integrative analysis of the sequence of evolution of early Mars. During the remainder of this investigation we will: (1) extend models of degree-1 mantle convection from 2-D to 3-D; (2) investigate potential causal relationships and effects of major impacts on mantle plume formation, with primary application to Mars; (3) develop exploratory models to assess the convective stability of various Martian core states as relevant to the history of dynamo action; and (4) develop models of long-wavelength relaxation of crustal thickness anomalies to potentially explain the degree-1 structure of the Martian crust.
Mahardika, Gusti N
2018-01-01
To expand our capacity to discover venom sequences from the genomes of venomous organisms, we applied targeted sequencing techniques to selectively recover venom gene superfamilies and nontoxin loci from the genomes of 32 cone snail species (family, Conidae), a diverse group of marine gastropods that capture their prey using a cocktail of neurotoxic peptides (conotoxins). We were able to successfully recover conotoxin gene superfamilies across all species with high confidence (> 100× coverage) and used these data to provide new insights into conotoxin evolution. First, we found that conotoxin gene superfamilies are composed of one to six exons and are typically short in length (mean = ∼85 bp). Second, we expanded our understanding of the following genetic features of conotoxin evolution: 1) positive selection, where exons coding the mature toxin region were often three times more divergent than their adjacent noncoding regions, 2) expression regulation, with comparisons to transcriptome data showing that cone snails only express a fraction of the genes available in their genome (24–63%), and 3) extensive gene turnover, where Conidae species varied from 120 to 859 conotoxin gene copies. Finally, using comparative phylogenetic methods, we found that while diet specificity did not predict patterns of conotoxin evolution, dietary breadth was positively correlated with total conotoxin gene diversity. Overall, the targeted sequencing technique demonstrated here has the potential to radically increase the pace at which venom gene families are sequenced and studied, reshaping our ability to understand the impact of genetic changes on ecologically relevant phenotypes and subsequent diversification. PMID:29514313
Nikolaidis, Nikolas; Nei, Masatoshi
2004-03-01
We have identified the Hsp70 gene superfamily of the nematode Caenorhabditis briggsae and investigated the evolution of these genes in comparison with Hsp70 genes from C. elegans, Drosophila, and yeast. The Hsp70 genes are classified into three monophyletic groups according to their subcellular localization, namely, cytoplasm (CYT), endoplasmic reticulum (ER), and mitochondria (MT). The Hsp110 genes can be classified into the polyphyletic CYT group and the monophyletic ER group. The different Hsp70 and Hsp110 groups appeared to evolve following the model of divergent evolution. This model can also explain the evolution of the ER and MT genes. On the other hand, the CYT genes are divided into heat-inducible and constitutively expressed genes. The constitutively expressed genes have evolved more or less following the birth-and-death process, and the rates of gene birth and gene death are different between the two nematode species. By contrast, some heat-inducible genes show an intraspecies phylogenetic clustering. This suggests that they are subject to sequence homogenization resulting from gene conversion-like events. In addition, the heat-inducible genes show high levels of sequence conservation in both intra-species and inter-species comparisons, and in most cases, amino acid sequence similarity is higher than nucleotide sequence similarity. This indicates that purifying selection also plays an important role in maintaining high sequence similarity among paralogous Hsp70 genes. Therefore, we suggest that the CYT heat-inducible genes have been subjected to a combination of purifying selection, birth-and-death process, and gene conversion-like events.
Evolution of Mhc-DRB introns: implications for the origin of primates.
Kupfermann, H; Satta, Y; Takahata, N; Tichy, H; Klein, J
1999-06-01
Introns are generally believed to evolve too rapidly and too erratically to be of much use in phylogenetic reconstructions. Few phylogenetically informative intron sequences are available, however, to ascertain the validity of this supposition. In the present study the supposition was tested on the example of the mammalian class II major histocompatibility complex (Mhc) genes of the DRB family. Since the Mhc genes evolve under balancing selection and are believed to recombine or rearrange frequently, the evolution of their introns could be expected to be particularly rapid and subject to scrambling. Sequences of intron 4 and 5 DRB genes were obtained from polymerase chain reaction-amplified fragments of genomic DNA from representatives of six eutherian orders: Primates, Scandentia, Chiroptera, Dermoptera, Lagomorpha, and Insectivora. Although short stretches of the introns have indeed proved to be unalignable, the bulk of the intron sequences from all six orders, spanning >85 million years (my) of evolution, could be aligned and used in a study of the tempo and mode of intron evolution. The analysis has revealed the Mhc introns to evolve at a rate similar to that of other genes and of synonymous sites of non-Mhc genes. No evidence of homogenization or large-scale scrambling of the intron sequences could be found. The Mhc introns apparently evolve largely by point mutations and insertions/deletions. The phylogenetic signals contained in the intron sequences could be used to identify Scandentia as the sister group of Primates, to support the existence of the Archonta superorder, and to confirm the monophyly of the Chiroptera.
Nedelcu, Aurora M
2009-03-01
Programmed cell death (PCD) represents a significant component of normal growth and development in multicellular organisms. Recently, PCD-like processes have been reported in single-celled eukaryotes, implying that some components of the PCD machinery existed early in eukaryotic evolution. This study provides a comparative analysis of PCD-related sequences across more than 50 unicellular genera from four eukaryotic supergroups: Unikonts, Excavata, Chromalveolata, and Plantae. A complex set of PCD-related sequences that correspond to domains or proteins associated with all main functional classes (from ligands and receptors to executors of PCD) was found in many unicellular lineages. Several PCD domains and proteins previously thought to be restricted to animals or land plants are also present in unicellular species. Notably, the yeast Saccharomyces cerevisiae, used as an experimental model system for PCD research, has a rather reduced set of PCD-related sequences relative to other unicellular species. The phylogenetic distribution of the PCD-related sequences identified in unicellular lineages suggests that the genetic basis for the evolution of the complex PCD machinery present in extant multicellular lineages was established early in the evolution of eukaryotes. The shaping of the PCD machinery in multicellular lineages involved the duplication, co-option, recruitment, and shuffling of domains already present in their unicellular ancestors.
Gene conversion as a secondary mechanism of short interspersed element (SINE) evolution
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kass, D.H.; Batzer, M.A.; Deininger, P.L.
The Alu repetitive family of short interspersed elements (SINEs) in primates can be subdivided into distinct subfamilies by specific diagnostic nucleotide changes. The older subfamilies are generally very abundant, while the younger subfamilies have fewer copies. Some of the youngest Alu elements are absent in the orthologous loci of nonhuman primates, indicative of recent retroposition events, the primary mode of SINE evolution. PCR analysis of one young Alu subfamily (Sb2) member found in the low-density lipoprotein receptor gene apparently revealed the presence of this element in the green monkey, orangutan, gorilla, and chimpanzee genomes, as well as the human genome. However, sequence analysis of these genomes revealed a highly mutated, older, primate-specific Alu element was present at this position in the nonhuman primates. Comparison of the flanking DNA sequences upstream of this Alu insertion corresponded to evolution expected for standard primate phylogeny, but comparison of the Alu repeat sequences revealed that the human element departed from this phylogeny. The change in the human sequence apparently occurred by a gene conversion event only within the Alu element itself, converting it from one of the oldest to one of the youngest Alu subfamilies. Although gene conversions of Alu elements are clearly very rare, this finding shows that such events can occur and contribute to specific cases of SINE subfamily evolution.
Present Day Biology seen in the Looking Glass of Physics of Complexity
NASA Astrophysics Data System (ADS)
Schuster, P.
Darwin's theory of variation and selection in its simplest form is directly applicable to RNA evolution in vitro as well as to virus evolution, and it allows for quantitative predictions. Understanding evolution at the molecular level is ultimately related to the central paradigm of structural biology: sequence ⇒ structure ⇒ function. We elaborate on the state of the art in modeling and understanding evolution of RNA driven by reproduction and mutation. The focus will be laid on the landscape concept, originally introduced by Sewall Wright, and its application to problems in biology. The relation between genotypes and phenotypes is the result of two consecutive mappings from a space of genotypes called sequence space onto a space of phenotypes or structures, and fitness is the result of a mapping from phenotype space into non-negative real numbers. Realistic landscapes as derived from folding of RNA sequences into structures are characterized by two properties: (i) they are rugged in the sense that sequences lying nearby in sequence space may have very different fitness values and (ii) they are characterized by an appreciable degree of neutrality implying that a certain fraction of genotypes and/or phenotypes cannot be distinguished in the selection process. Evolutionary dynamics on realistic landscapes will be studied as a function of the mutation rate, and the role of neutrality in the selection process will be discussed.
Cytogenetic Diversity of Simple Sequences Repeats in Morphotypes of Brassica rapa ssp. chinensis
Zheng, Jin-shuang; Sun, Cheng-zhen; Zhang, Shu-ning; Hou, Xi-lin; Bonnema, Guusje
2016-01-01
A significant fraction of the nuclear DNA of all eukaryotes is comprised of simple sequence repeats (SSRs). Although these sequences are widely used for studying genetic variation, linkage mapping and evolution, little attention has been paid to the chromosomal distribution and cytogenetic diversity of these sequences. In this paper, we report the distribution characterization of mono-, di-, and tri-nucleotide SSRs in Brassica rapa ssp. chinensis. Fluorescence in situ hybridization was used to characterize the cytogenetic diversity of SSRs among morphotypes of B. rapa ssp. chinensis. The proportion of different SSR motifs varied among morphotypes of B. rapa ssp. chinensis, with tri-nucleotide SSRs being more prevalent in the genome of B. rapa ssp. chinensis. We determined the chromosomal locations of mono-, di-, and tri-nucleotide repeat loci. The results showed that the chromosomal distribution of SSRs in the different morphotypes is non-random and motif-dependent, and allowed us to characterize the relative variability in terms of SSR numbers and similar chromosomal distributions in centromeric/peri-centromeric heterochromatin. The differences between SSR repeats with respect to abundance and distribution indicate that SSRs are a driving force in the genomic evolution of B. rapa species. Our results provide a comprehensive view of the SSR sequence distribution and evolution for comparison among morphotypes of B. rapa ssp. chinensis. PMID:27507974
Sellem, C. H.; d'Aubenton-Carafa, Y.; Rossignol, M.; Belcour, L.
1996-01-01
The mitochondrial genome of 23 wild-type strains belonging to three different species of the filamentous fungus Podospora was examined. Among the 15 optional sequences identified are two intronic reading frames, nad1-i4-orf1 and cox1-i7-orf2. We show that the presence of these sequences was strictly correlated with tightly clustered nucleotide substitutions in the adjacent exon. This correlation applies to the presence or absence of closely related open reading frames (ORFs), found at the same genetic locations, in all the Pyrenomycete genera examined. The recent gain of these optional ORFs in the evolution of the genus Podospora probably accounts for such sequence differences. In the homoplasmic progeny from heteroplasmons constructed between Podospora strains differing by the presence of these optional ORFs, nad1-i4-orf1 and cox1-i7-orf2 appeared highly invasive. Sequence comparisons in the nad1-i4 intron of various strains of the Pyrenomycete family led us to propose a scenario of its evolution that includes several events of loss and gain of intronic ORFs. These results strongly reinforce the idea that group I intronic ORFs are mobile elements and that their transfer, and concomitant modification of the adjacent exon, could participate in the modular evolution of mitochondrial genomes. PMID:8725226
Birth and death of genes linked to chromosomal inversion
Furuta, Yoshikazu; Kawai, Mikihiko; Yahara, Koji; Takahashi, Noriko; Handa, Naofumi; Tsuru, Takeshi; Oshima, Kenshiro; Yoshida, Masaru; Azuma, Takeshi; Hattori, Masahira; Uchiyama, Ikuo; Kobayashi, Ichizo
2011-01-01
The birth and death of genes is central to adaptive evolution, yet the underlying genome dynamics remain elusive. The availability of closely related complete genome sequences helps to follow changes in gene contents and clarify their relationship to overall genome organization. Helicobacter pylori, bacteria in our stomach, are known for their extreme genome plasticity through mutation and recombination and will make a good target for such an analysis. In comparing their complete genome sequences, we found that gain and loss of genes (loci) for outer membrane proteins, which mediate host interaction, occurred at breakpoints of chromosomal inversions. Sequence comparison there revealed a unique mechanism of DNA duplication: DNA duplication associated with inversion. In this process, a DNA segment at one chromosomal locus is copied and inserted, in an inverted orientation, into a distant locus on the same chromosome, while the entire region between these two loci is also inverted. Recognition of this and three more inversion modes, which occur through reciprocal recombination between long or short sequence similarity or adjacent to a mobile element, allowed reconstruction of synteny evolution through inversion events in this species. These results will guide the interpretation of extensive DNA sequencing results for understanding long- and short-term genome evolution in various organisms and in cancer cells. PMID:21212362
Woldring, Daniel R.; Holec, Patrick V.; Zhou, Hong; Hackel, Benjamin J.
2015-01-01
Discovering new binding function via a combinatorial library in small protein scaffolds requires balance between appropriate mutations to introduce favorable intermolecular interactions while maintaining intramolecular integrity. Sitewise constraints exist in a non-spatial gradient from diverse to conserved in evolved antibody repertoires; yet non-antibody scaffolds generally do not implement this strategy in combinatorial libraries. Despite the fact that biased amino acid distributions, typically elevated in tyrosine, serine, and glycine, have gained wider use in synthetic scaffolds, these distributions are still predominantly applied uniformly to diversified sites. While select sites in fibronectin domains and DARPins have shown benefit from sitewise designs, they have not been deeply evaluated. Inspired by this disparity between diversity distributions in natural libraries and synthetic scaffold libraries, we hypothesized that binders resulting from discovery and evolution would exhibit a non-spatial, sitewise gradient of amino acid diversity. To identify sitewise diversities consistent with efficient evolution in the context of a hydrophilic fibronectin domain, >10^5 binders to six targets were evolved and sequenced. Evolutionarily favorable amino acid distributions at 25 sites reveal Shannon entropies (range: 0.3–3.9; median: 2.1; standard deviation: 1.1) supporting the diversity gradient hypothesis. Sitewise constraints in evolved sequences are consistent with complementarity, stability, and consensus biases. Implementation of sitewise constrained diversity enables direct selection of nanomolar affinity binders, validating an efficient strategy to balance inter- and intra-molecular interaction demands at each site. PMID:26383268
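The sitewise Shannon entropy mentioned in the abstract above can be illustrated with a minimal Python sketch. The function name `sitewise_entropy`, the toy sequences, and the choice of base-2 logarithms (bits) are assumptions made for illustration; the abstract does not state how the entropies were computed.

```python
import math
from collections import Counter

def sitewise_entropy(sequences):
    """Return the Shannon entropy (in bits) of each column of equal-length sequences."""
    if not sequences:
        return []
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must have the same length")
    entropies = []
    for site in range(length):
        # Count how often each residue occurs at this site.
        counts = Counter(seq[site] for seq in sequences)
        total = sum(counts.values())
        # H = sum over residues of p * log2(1/p), with p = n / total.
        entropy = sum((n / total) * math.log2(total / n) for n in counts.values())
        entropies.append(entropy)
    return entropies

# Hypothetical toy data: four 4-residue "binders" (not from the study).
toy_binders = ["YSGA", "YSGD", "YTGK", "YSSR"]
entropy_by_site = sitewise_entropy(toy_binders)
```

In this toy set the first site is fully conserved (entropy 0) and the last site holds four different residues (entropy 2 bits). With 20 amino acids the maximum possible value is log2(20), about 4.32 bits, so a gradient from conserved to diverse sites shows up as a spread of entropies between 0 and that ceiling.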
Highly conserved D-loop-like nuclear mitochondrial sequences (Numts) in tiger (Panthera tigris).
Zhang, Wenping; Zhang, Zhihe; Shen, Fujun; Hou, Rong; Lv, Xiaoping; Yue, Bisong
2006-08-01
Using oligonucleotide primers designed to match hypervariable segment I (HVS-1) of Panthera tigris mitochondrial DNA (mtDNA), we amplified two different PCR products (500 bp and 287 bp) in the tiger (Panthera tigris), but got only one PCR product (287 bp) in the leopard (Panthera pardus). Sequence analyses indicated that the sequence of 287 bp was a D-loop-like nuclear mitochondrial sequence (Numts), indicating a nuclear transfer that occurred approximately 4.8-17 million years ago in the tiger and 4.6-16 million years ago in the leopard. Although the mtDNA D-loop sequence has a rapid rate of evolution, the 287-bp Numts are highly conserved; they are nearly identical in tiger subspecies and only 1.742% different between tiger and leopard. Thus, such sequences represent molecular 'fossils' that can shed light on evolution of the mitochondrial genome and may be the most appropriate outgroup for phylogenetic analysis. This is further supported by comparing the phylogenetic trees reconstructed using the D-loop sequence of snow leopard and the 287-bp Numts as outgroup.
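A percentage difference such as the 1.742% quoted above is commonly obtained as the share of mismatching sites between two aligned sequences. The Python sketch below shows that arithmetic under stated assumptions: the function name `percent_difference` and the toy sequences are hypothetical, and the abstract does not say how gaps or ambiguous bases were handled. As a plausibility check only, 5 mismatches over 287 aligned sites would give 1.742%, but the source does not report the mismatch count.

```python
def percent_difference(seq_a, seq_b):
    """Percentage of aligned positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    mismatches = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return 100.0 * mismatches / len(seq_a)

# Hypothetical 287-site toy alignment with 5 substitutions (not real Numt data).
reference = "ACGT" * 71 + "ACG"
variant_sites = list(reference)
for pos in (10, 60, 120, 200, 280):
    # Swap in a different base at each chosen position.
    variant_sites[pos] = "A" if reference[pos] != "A" else "C"
variant = "".join(variant_sites)

toy_percent = percent_difference(reference, variant)
```

Rounded to three decimals, `toy_percent` is 1.742 for this toy pair.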
Self-organization of the protocell was a forward process
NASA Technical Reports Server (NTRS)
Fox, S. W.; Matsuno, K.
1983-01-01
Yockey's (1981) interpretation of information theory relative to concepts of self-organization in the origin of life is criticized on the ground that it assumes that each amino acid residue type in a given sequence is an unaided information carrier throughout evolution. It is argued that more than one amino acid residue can act as a unit information carrier, and that this was the case in prebiotic protein evolution. Forward-extrapolation should be used to study prebiotic evolution, not backward-extrapolation. Transposing the near-random internal order of modern proteins to primitive proteins, as Yockey has done, is an unsupported assumption and disagrees with the results of experimental models of the primordial type. Studies indicate that early primary information carriers in evolution were mixtures of free alpha amino acids which necessarily had the capability of sequencing themselves.
SpreaD3: Interactive Visualization of Spatiotemporal History and Trait Evolutionary Processes.
Bielejec, Filip; Baele, Guy; Vrancken, Bram; Suchard, Marc A; Rambaut, Andrew; Lemey, Philippe
2016-08-01
Model-based phylogenetic reconstructions increasingly consider spatial or phenotypic traits in conjunction with sequence data to study evolutionary processes. Alongside parameter estimation, visualization of ancestral reconstructions represents an integral part of these analyses. Here, we present a complete overhaul of the spatial phylogenetic reconstruction of evolutionary dynamics software, now called SpreaD3 to emphasize the use of data-driven documents, as an analysis and visualization package that primarily complements Bayesian inference in BEAST (http://beast.bio.ed.ac.uk, last accessed 9 May 2016). The integration of JavaScript D3 libraries (www.d3.org, last accessed 9 May 2016) offers novel interactive web-based visualization capacities that are not restricted to spatial traits and extend to any discrete or continuously valued trait for any organism of interest. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
CRISPR-Cas systems: Prokaryotes upgrade to adaptive immunity.
Barrangou, Rodolphe; Marraffini, Luciano A
2014-04-24
Clustered regularly interspaced short palindromic repeats (CRISPR), and associated proteins (Cas) comprise the CRISPR-Cas system, which confers adaptive immunity against exogenic elements in many bacteria and most archaea. CRISPR-mediated immunization occurs through the uptake of DNA from invasive genetic elements such as plasmids and viruses, followed by its integration into CRISPR loci. These loci are subsequently transcribed and processed into small interfering RNAs that guide nucleases for specific cleavage of complementary sequences. Conceptually, CRISPR-Cas shares functional features with the mammalian adaptive immune system, while also exhibiting characteristics of Lamarckian evolution. Because immune markers spliced from exogenous agents are integrated iteratively in CRISPR loci, they constitute a genetic record of vaccination events and reflect environmental conditions and changes over time. Cas endonucleases, which can be reprogrammed by small guide RNAs have shown unprecedented potential and flexibility for genome editing and can be repurposed for numerous DNA targeting applications including transcriptional control. Copyright © 2014 Elsevier Inc. All rights reserved.
Endogenous hepadnaviruses, bornaviruses and circoviruses in snakes.
Gilbert, C; Meik, J M; Dashevsky, D; Card, D C; Castoe, T A; Schaack, S
2014-09-22
We report the discovery of endogenous viral elements (EVEs) from Hepadnaviridae, Bornaviridae and Circoviridae in the speckled rattlesnake, Crotalus mitchellii, the first viperid snake for which a draft whole genome sequence assembly is available. Analysis of the draft assembly reveals genome fragments from the three virus families were inserted into the genome of this snake over the past 50 Myr. Cross-species PCR screening of orthologous loci and computational scanning of the python and king cobra genomes reveals that circoviruses integrated most recently (within the last approx. 10 Myr), whereas bornaviruses and hepadnaviruses integrated at least approximately 13 and approximately 50 Ma, respectively. This is, to our knowledge, the first report of circo-, borna- and hepadnaviruses in snakes and the first characterization of non-retroviral EVEs in non-avian reptiles. Our study provides a window into the historical dynamics of viruses in these host lineages and shows that their evolution involved multiple host-switches between mammals and reptiles. © 2014 The Author(s) Published by the Royal Society. All rights reserved.
Wang, Lili; Fan, Jean; Francis, Joshua M.; Georghiou, George; Hergert, Sarah; Li, Shuqiang; Gambe, Rutendo; Zhou, Chensheng W.; Yang, Chunxiao; Xiao, Sheng; Cin, Paola Dal; Bowden, Michaela; Kotliar, Dylan; Shukla, Sachet A.; Brown, Jennifer R.; Neuberg, Donna; Alessi, Dario R.; Zhang, Cheng-Zhong; Kharchenko, Peter V.; Livak, Kenneth J.; Wu, Catherine J.
2017-01-01
Intra-tumoral genetic heterogeneity has been characterized across cancers by genome sequencing of bulk tumors, including chronic lymphocytic leukemia (CLL). In order to more accurately identify subclones, define phylogenetic relationships, and probe genotype–phenotype relationships, we developed methods for targeted mutation detection in DNA and RNA isolated from thousands of single cells from five CLL samples. By clearly resolving phylogenic relationships, we uncovered mutated LCP1 and WNK1 as novel CLL drivers, supported by functional evidence demonstrating their impact on CLL pathways. Integrative analysis of somatic mutations with transcriptional states prompts the idea that convergent evolution generates phenotypically similar cells in distinct genetic branches, thus creating a cohesive expression profile in each CLL sample despite the presence of genetic heterogeneity. Our study highlights the potential for single-cell RNA-based targeted analysis to sensitively determine transcriptional and mutational profiles of individual cancer cells, leading to increased understanding of driving events in malignancy. PMID:28679620
DNA barcodes for ecology, evolution, and conservation.
Kress, W John; García-Robledo, Carlos; Uriarte, Maria; Erickson, David L
2015-01-01
The use of DNA barcodes, which are short gene sequences taken from a standardized portion of the genome and used to identify species, is entering a new phase of application as more and more investigations employ these genetic markers to address questions relating to the ecology and evolution of natural systems. The suite of DNA barcode markers now applied to specific taxonomic groups of organisms is proving invaluable for understanding species boundaries, community ecology, functional trait evolution, trophic interactions, and the conservation of biodiversity. The application of next-generation sequencing (NGS) technology will greatly expand the versatility of DNA barcodes across the Tree of Life, habitats, and geographies as new methodologies are explored and developed. Published by Elsevier Ltd.
Identification of three duplicated Spin genes in medaka (Oryzias latipes).
Wang, Xiao-Lei; Mei, Jie; Sun, Min; Hong, Yun-Han; Gui, Jian-Fang
2005-05-09
Gene and genomic duplications are very important and frequent events in fish evolution, and the divergence of duplicated genes in sequence and function is a focus of research on gene evolution. Here, we report the identification and characterization of three duplicated Spindlin (Spin) genes from medaka (Oryzias latipes): OlSpinA, OlSpinB, and OlSpinC. Molecular cloning, genomic DNA BLAST analysis and phylogenetic analysis indicated that the three OlSpin genes are products of gene duplication. Furthermore, Western blot analysis revealed significant expression differences among the three OlSpins across tissues and during embryogenesis in medaka, suggesting that sequence and functional divergence may have occurred among them during evolution.
Xu, Haiyan; Sun, Zhihong; Liu, Wenjun; Yu, Jie; Song, Yuqin; Lv, Qiang; Zhang, Jiachao; Shao, Yuyu; Menghe, Bilige; Zhang, Heping
2014-05-01
To determine the genetic diversity and phylogenetic relationships among Lactococcus lactis isolates, 197 strains isolated from naturally homemade yogurt in 9 ethnic minority areas of 6 provinces of China were subjected to multilocus sequence typing (MLST). The MLST analysis was performed using internal fragment sequences of 12 housekeeping genes (carB, clpX, dnaA, groEL, murC, murE, pepN, pepX, pyrG, recA, rpoB, and pheS). Six (dnaA) to 8 (murC) different alleles were detected for these genes, which ranged from 33.62 (clpX) to 41.95% (recA) GC (guanine-cytosine) content. The nucleotide diversity (π) ranged from 0.00362 (murE) to 0.08439 (carB). Despite this limited allelic diversity, the allele combinations of each strain revealed 72 different sequence types, which denoted significant genotypic diversity. The dN/dS ratios (where dS is the number of synonymous substitutions per synonymous site, and dN is the number of nonsynonymous substitutions per nonsynonymous site) were lower than 1, suggesting potential negative selection for these genes. The standardized index of association of the alleles IA(S)=0.3038 supported the clonality of Lc. lactis, but the presence of network structure revealed by the split decomposition analysis of the concatenated sequence was strong evidence for intraspecies recombination. Therefore, this suggests that recombination contributed to the evolution of Lc. lactis. A minimum spanning tree analysis of the 197 isolates identified 14 clonal complexes and 23 singletons. Phylogenetic trees were constructed based on the sequence types, using the minimum evolution algorithm, and on the concatenated sequence (6,192 bp), using the unweighted pair-group method with arithmetic mean, and these trees indicated that the evolution of our Lc. lactis population was correlated with geographic origin. Taken together, our results demonstrated that MLST could provide a better understanding of Lc. lactis genome evolution, as well as useful information for future studies on global Lc. lactis structure and genetic evolution, which will lay the foundation for screening Lc. lactis as starter cultures in fermented dairy products. Copyright © 2014 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
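The summary statistics named in the MLST abstract above (GC content, nucleotide diversity π, and sequence types derived from allele combinations) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the toy inputs, and the choice of mean pairwise differences per site as the π estimator are all assumptions made for illustration.

```python
from itertools import combinations


def gc_content(seq: str) -> float:
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def nucleotide_diversity(seqs: list[str]) -> float:
    """Mean pairwise differences per site (pi).

    Assumes the sequences are aligned and of equal length (an assumption
    of this sketch, not something checked here).
    """
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return diffs / (len(pairs) * length)


def assign_sequence_types(profiles: dict[str, tuple[int, ...]]) -> dict[str, int]:
    """Number each distinct allele profile in order of first appearance.

    `profiles` maps a (hypothetical) strain name to its tuple of allele
    numbers, one per housekeeping gene.
    """
    st_of_profile: dict[tuple[int, ...], int] = {}
    result: dict[str, int] = {}
    for strain, profile in profiles.items():
        if profile not in st_of_profile:
            st_of_profile[profile] = len(st_of_profile) + 1
        result[strain] = st_of_profile[profile]
    return result
```

With 12 loci, each profile would be a 12-tuple of allele numbers. Two strains share a sequence type only when all 12 alleles match, which is how a small number of alleles per gene can still combine into many sequence types.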
Functionally conserved enhancers with divergent sequences in distant vertebrates
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yang, Song; Oksenberg, Nir; Takayama, Sachiko; ...
2015-10-30
To examine the contributions of sequence and function conservation in the evolution of enhancers, we systematically identified enhancers whose sequences are not conserved among distant groups of vertebrate species, but have homologous function and are likely to be derived from a common ancestral sequence. In conclusion, our approach combined comparative genomics and epigenomics to identify potential enhancer sequences in the genomes of three groups of distantly related vertebrate species.