biological sequence evolution: Topics by Science.gov

Sample records for biological sequence evolution

Biological intuition in alignment-free methods: response to Posada.

PubMed

Ragan, Mark A; Chan, Cheong Xin

2013-08-01

A recent editorial in Journal of Molecular Evolution highlights opportunities and challenges facing molecular evolution in the era of next-generation sequencing. Abundant sequence data should allow more-complex models to be fit at higher confidence, making phylogenetic inference more reliable and improving our understanding of evolution at the molecular level. However, concern that approaches based on multiple sequence alignment may be computationally infeasible for large datasets is driving the development of so-called alignment-free methods for sequence comparison and phylogenetic inference. The recent editorial characterized these approaches as model-free, not based on the concept of homology, and lacking in biological intuition. We argue here that alignment-free methods have not abandoned models or homology, and can be biologically intuitive.
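As an illustrative note on the record above: the abstract does not name a specific alignment-free method, so the following is only a minimal sketch of one common family, comparison of k-mer content. The function names, the choice of k = 3, and the use of Jaccard distance are assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping k-mer in a sequence (case-insensitive)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def jaccard_distance(a, b, k=3):
    """Alignment-free distance: 1 - |shared k-mers| / |all k-mers|.

    Hypothetical helper for illustration; no alignment is computed.
    """
    kmers_a = set(kmer_counts(a, k))
    kmers_b = set(kmer_counts(b, k))
    union = kmers_a | kmers_b
    if not union:
        # Both sequences are shorter than k; treat them as identical.
        return 0.0
    return 1.0 - len(kmers_a & kmers_b) / len(union)


if __name__ == "__main__":
    print(jaccard_distance("ACGTACGTAC", "ACGTTCGTAC"))
```

Sequences sharing many k-mers are likely to share homologous regions, which is one sense in which such a distance can still reflect homology without an explicit alignment step.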
The Genome Sequence of Taurine Cattle: A Window to Ruminant Biology and Evolution

USDA-ARS's Scientific Manuscript database

As a major step toward understanding the biology and evolution of ruminants, the cattle genome was sequenced to ~7x coverage using a combined whole genome shotgun and BAC skim approach. The cattle genome contains a minimum of 22,000 genes, with a core set of 14,345 orthologs found in seven mammalian...
Genome sequence of an Australian kangaroo, Macropus eugenii, provides insight into the evolution of mammalian reproduction and development

PubMed Central

2011-01-01

Background: We present the genome sequence of the tammar wallaby, Macropus eugenii, which is a member of the kangaroo family and the first representative of the iconic hopping mammals that symbolize Australia to be sequenced. The tammar has many unusual biological characteristics, including the longest period of embryonic diapause of any mammal, extremely synchronized seasonal breeding and prolonged and sophisticated lactation within a well-defined pouch. Like other marsupials, it gives birth to highly altricial young, and has a small number of very large chromosomes, making it a valuable model for genomics, reproduction and development. Results: The genome has been sequenced to 2× coverage using Sanger sequencing, enhanced with additional next generation sequencing and the integration of extensive physical and linkage maps to build the genome assembly. We also sequenced the tammar transcriptome across many tissues and developmental time points. Our analyses of these data shed light on mammalian reproduction, development and genome evolution: there is innovation in reproductive and lactational genes, rapid evolution of germ cell genes, and incomplete, locus-specific X inactivation. We also observe novel retrotransposons and a highly rearranged major histocompatibility complex, with many class I genes located outside the complex. Novel microRNAs in the tammar HOX clusters uncover new potential mammalian HOX regulatory elements. Conclusions: Analyses of these resources enhance our understanding of marsupial gene evolution, identify marsupial-specific conserved non-coding elements and critical genes across a range of biological systems, including reproduction, development and immunity, and provide new insight into marsupial and mammalian biology and genome evolution. PMID:21854559
Genome sequence of an Australian kangaroo, Macropus eugenii, provides insight into the evolution of mammalian reproduction and development.

PubMed

Renfree, Marilyn B; Papenfuss, Anthony T; Deakin, Janine E; Lindsay, James; Heider, Thomas; Belov, Katherine; Rens, Willem; Waters, Paul D; Pharo, Elizabeth A; Shaw, Geoff; Wong, Emily S W; Lefèvre, Christophe M; Nicholas, Kevin R; Kuroki, Yoko; Wakefield, Matthew J; Zenger, Kyall R; Wang, Chenwei; Ferguson-Smith, Malcolm; Nicholas, Frank W; Hickford, Danielle; Yu, Hongshi; Short, Kirsty R; Siddle, Hannah V; Frankenberg, Stephen R; Chew, Keng Yih; Menzies, Brandon R; Stringer, Jessica M; Suzuki, Shunsuke; Hore, Timothy A; Delbridge, Margaret L; Patel, Hardip R; Mohammadi, Amir; Schneider, Nanette Y; Hu, Yanqiu; O'Hara, William; Al Nadaf, Shafagh; Wu, Chen; Feng, Zhi-Ping; Cocks, Benjamin G; Wang, Jianghui; Flicek, Paul; Searle, Stephen M J; Fairley, Susan; Beal, Kathryn; Herrero, Javier; Carone, Dawn M; Suzuki, Yutaka; Sugano, Sumio; Toyoda, Atsushi; Sakaki, Yoshiyuki; Kondo, Shinji; Nishida, Yuichiro; Tatsumoto, Shoji; Mandiou, Ion; Hsu, Arthur; McColl, Kaighin A; Lansdell, Benjamin; Weinstock, George; Kuczek, Elizabeth; McGrath, Annette; Wilson, Peter; Men, Artem; Hazar-Rethinam, Mehlika; Hall, Allison; Davis, John; Wood, David; Williams, Sarah; Sundaravadanam, Yogi; Muzny, Donna M; Jhangiani, Shalini N; Lewis, Lora R; Morgan, Margaret B; Okwuonu, Geoffrey O; Ruiz, San Juana; Santibanez, Jireh; Nazareth, Lynne; Cree, Andrew; Fowler, Gerald; Kovar, Christie L; Dinh, Huyen H; Joshi, Vandita; Jing, Chyn; Lara, Fremiet; Thornton, Rebecca; Chen, Lei; Deng, Jixin; Liu, Yue; Shen, Joshua Y; Song, Xing-Zhi; Edson, Janette; Troon, Carmen; Thomas, Daniel; Stephens, Amber; Yapa, Lankesha; Levchenko, Tanya; Gibbs, Richard A; Cooper, Desmond W; Speed, Terence P; Fujiyama, Asao; Graves, Jennifer A M; O'Neill, Rachel J; Pask, Andrew J; Forrest, Susan M; Worley, Kim C

2011-08-29

We present the genome sequence of the tammar wallaby, Macropus eugenii, which is a member of the kangaroo family and the first representative of the iconic hopping mammals that symbolize Australia to be sequenced. The tammar has many unusual biological characteristics, including the longest period of embryonic diapause of any mammal, extremely synchronized seasonal breeding and prolonged and sophisticated lactation within a well-defined pouch. Like other marsupials, it gives birth to highly altricial young, and has a small number of very large chromosomes, making it a valuable model for genomics, reproduction and development. The genome has been sequenced to 2 × coverage using Sanger sequencing, enhanced with additional next generation sequencing and the integration of extensive physical and linkage maps to build the genome assembly. We also sequenced the tammar transcriptome across many tissues and developmental time points. Our analyses of these data shed light on mammalian reproduction, development and genome evolution: there is innovation in reproductive and lactational genes, rapid evolution of germ cell genes, and incomplete, locus-specific X inactivation. We also observe novel retrotransposons and a highly rearranged major histocompatibility complex, with many class I genes located outside the complex. Novel microRNAs in the tammar HOX clusters uncover new potential mammalian HOX regulatory elements. Analyses of these resources enhance our understanding of marsupial gene evolution, identify marsupial-specific conserved non-coding elements and critical genes across a range of biological systems, including reproduction, development and immunity, and provide new insight into marsupial and mammalian biology and genome evolution.
Conserved noncoding sequences conserve biological networks and influence genome evolution.

PubMed

Xie, Jianbo; Qian, Kecheng; Si, Jingna; Xiao, Liang; Ci, Dong; Zhang, Deqiang

2018-05-01

Comparative genomics approaches have identified numerous conserved cis-regulatory sequences near genes in plant genomes. Despite the identification of these conserved noncoding sequences (CNSs), our knowledge of their functional importance and selection remains limited. Here, we used a combination of DNA methylome analysis, microarray expression analyses, and functional annotation to study these sequences in the model tree Populus trichocarpa. Methylation in CG contexts and non-CG contexts was lower in CNSs, particularly CNSs in the 5'-upstream regions of genes, compared with other sites in the genome. We observed that CNSs are enriched in genes with transcription and binding functions, and this also associated with syntenic genes and those from whole-genome duplications, suggesting that cis-regulatory sequences play a key role in genome evolution. We detected a significant positive correlation between CNS number and protein interactions, suggesting that CNSs may have roles in the evolution and maintenance of biological networks. The divergence of CNSs indicates that duplication-degeneration-complementation drives the subfunctionalization of a proportion of duplicated genes from whole-genome duplication. Furthermore, population genomics confirmed that most CNSs are under strong purifying selection and only a small subset of CNSs shows evidence of adaptive evolution. These findings provide a foundation for future studies exploring these key genomic features in the maintenance of biological networks, local adaptation, and transcription.
Evolution of proteins.

NASA Technical Reports Server (NTRS)

Dayhoff, M. O.

1971-01-01

The amino acid sequences of proteins from living organisms are dealt with. The structure of proteins is first discussed; the variation in this structure from one biological group to another is illustrated by the first halves of the sequences of cytochrome c, and a phylogenetic tree is derived from the cytochrome c data. The relative geological times associated with the events of this tree are discussed. Errors which occur in the duplication of cells during the evolutionary process are examined. Particular attention is given to evolution of mutant proteins, globins, ferredoxin, and transfer ribonucleic acids (tRNA's). Finally, a general outline of biological evolution is presented.
Evolution of Enzyme Superfamilies: Comprehensive Exploration of Sequence-Function Relationships.

PubMed

Baier, F; Copp, J N; Tokuriki, N

2016-11-22

The sequence and functional diversity of enzyme superfamilies have expanded through billions of years of evolution from a common ancestor. Understanding how protein sequence and functional "space" have expanded, at both the evolutionary and molecular level, is central to biochemistry, molecular biology, and evolutionary biology. Integrative approaches that examine protein sequence, structure, and function have begun to provide comprehensive views of the functional diversity and evolutionary relationships within enzyme superfamilies. In this review, we outline the recent advances in our understanding of enzyme evolution and superfamily functional diversity. We describe the tools that have been used to comprehensively analyze sequence relationships and to characterize sequence and function relationships. We also highlight recent large-scale experimental approaches that systematically determine the activity profiles across enzyme superfamilies. We identify several intriguing insights from this recent body of work. First, promiscuous activities are prevalent among extant enzymes. Second, many divergent proteins retain "function connectivity" via enzyme promiscuity, which can be used to probe the evolutionary potential and history of enzyme superfamilies. Finally, we discuss open questions regarding the intricacies of enzyme divergence, as well as potential research directions that will deepen our understanding of enzyme superfamily evolution.
Synthetic biology for the directed evolution of protein biocatalysts: navigating sequence space intelligently

PubMed Central

Currin, Andrew; Swainston, Neil; Day, Philip J.

2015-01-01

The amino acid sequence of a protein affects both its structure and its function. Thus, the ability to modify the sequence, and hence the structure and activity, of individual proteins in a systematic way, opens up many opportunities, both scientifically and (as we focus on here) for exploitation in biocatalysis. Modern methods of synthetic biology, whereby increasingly large sequences of DNA can be synthesised de novo, allow an unprecedented ability to engineer proteins with novel functions. However, the number of possible proteins is far too large to test individually, so we need means for navigating the ‘search space’ of possible protein sequences efficiently and reliably in order to find desirable activities and other properties. Enzymologists distinguish binding (Kd) and catalytic (kcat) steps. In a similar way, judicious strategies have blended design (for binding, specificity and active site modelling) with the more empirical methods of classical directed evolution (DE) for improving kcat (where natural evolution rarely seeks the highest values), especially with regard to residues distant from the active site and where the functional linkages underpinning enzyme dynamics are both unknown and hard to predict. Epistasis (where the ‘best’ amino acid at one site depends on that or those at others) is a notable feature of directed evolution. The aim of this review is to highlight some of the approaches that are being developed to allow us to use directed evolution to improve enzyme properties, often dramatically. We note that directed evolution differs in a number of ways from natural evolution, including in particular the available mechanisms and the likely selection pressures. Thus, we stress the opportunities afforded by techniques that enable one to map sequence to (structure and) activity in silico, as an effective means of modelling and exploring protein landscapes. Because known landscapes may be assessed and reasoned about as a whole, simultaneously, this offers opportunities for protein improvement not readily available to natural evolution on rapid timescales. Intelligent landscape navigation, informed by sequence-activity relationships and coupled to the emerging methods of synthetic biology, offers scope for the development of novel biocatalysts that are both highly active and robust. PMID:25503938
Chromosome Evolution in Connection with Repetitive Sequences and Epigenetics in Plants.

PubMed

Li, Shu-Fen; Su, Ting; Cheng, Guang-Qian; Wang, Bing-Xiao; Li, Xu; Deng, Chuan-Liang; Gao, Wu-Jun

2017-10-24

Chromosome evolution is a fundamental aspect of evolutionary biology. The evolution of chromosome size, structure and shape, number, and the change in DNA composition suggest the high plasticity of nuclear genomes at the chromosomal level. Repetitive DNA sequences, which represent a conspicuous fraction of every eukaryotic genome, particularly in plants, are found to be tightly linked with plant chromosome evolution. Different classes of repetitive sequences have distinct distribution patterns on the chromosomes. Mounting evidence shows that repetitive sequences may play multiple generative roles in shaping the chromosome karyotypes in plants. Furthermore, recent development in our understanding of the repetitive sequences and plant chromosome evolution has elucidated the involvement of a spectrum of epigenetic modification. In this review, we focused on the recent evidence relating to the distribution pattern of repetitive sequences in plant chromosomes and highlighted their potential relevance to chromosome evolution in plants. We also discussed the possible connections between evolution and epigenetic alterations in chromosome structure and repatterning, such as heterochromatin formation, centromere function, and epigenetic-associated transposable element inactivation.
The genome sequence of taurine cattle: a window to ruminant biology and evolution.

PubMed

Elsik, Christine G; Tellam, Ross L; Worley, Kim C; Gibbs, Richard A; Muzny, Donna M; Weinstock, George M; Adelson, David L; Eichler, Evan E; Elnitski, Laura; Guigó, Roderic; Hamernik, Debora L; Kappes, Steve M; Lewin, Harris A; Lynn, David J; Nicholas, Frank W; Reymond, Alexandre; Rijnkels, Monique; Skow, Loren C; Zdobnov, Evgeny M; Schook, Lawrence; Womack, James; Alioto, Tyler; Antonarakis, Stylianos E; Astashyn, Alex; Chapple, Charles E; Chen, Hsiu-Chuan; Chrast, Jacqueline; Câmara, Francisco; Ermolaeva, Olga; Henrichsen, Charlotte N; Hlavina, Wratko; Kapustin, Yuri; Kiryutin, Boris; Kitts, Paul; Kokocinski, Felix; Landrum, Melissa; Maglott, Donna; Pruitt, Kim; Sapojnikov, Victor; Searle, Stephen M; Solovyev, Victor; Souvorov, Alexandre; Ucla, Catherine; Wyss, Carine; Anzola, Juan M; Gerlach, Daniel; Elhaik, Eran; Graur, Dan; Reese, Justin T; Edgar, Robert C; McEwan, John C; Payne, Gemma M; Raison, Joy M; Junier, Thomas; Kriventseva, Evgenia V; Eyras, Eduardo; Plass, Mireya; Donthu, Ravikiran; Larkin, Denis M; Reecy, James; Yang, Mary Q; Chen, Lin; Cheng, Ze; Chitko-McKown, Carol G; Liu, George E; Matukumalli, Lakshmi K; Song, Jiuzhou; Zhu, Bin; Bradley, Daniel G; Brinkman, Fiona S L; Lau, Lilian P L; Whiteside, Matthew D; Walker, Angela; Wheeler, Thomas T; Casey, Theresa; German, J Bruce; Lemay, Danielle G; Maqbool, Nauman J; Molenaar, Adrian J; Seo, Seongwon; Stothard, Paul; Baldwin, Cynthia L; Baxter, Rebecca; Brinkmeyer-Langford, Candice L; Brown, Wendy C; Childers, Christopher P; Connelley, Timothy; Ellis, Shirley A; Fritz, Krista; Glass, Elizabeth J; Herzig, Carolyn T A; Iivanainen, Antti; Lahmers, Kevin K; Bennett, Anna K; Dickens, C Michael; Gilbert, James G R; Hagen, Darren E; Salih, Hanni; Aerts, Jan; Caetano, Alexandre R; Dalrymple, Brian; Garcia, Jose Fernando; Gill, Clare A; Hiendleder, Stefan G; Memili, Erdogan; Spurlock, Diane; Williams, John L; Alexander, Lee; Brownstein, Michael J; Guan, Leluo; Holt, Robert A; Jones, Steven J M; Marra, Marco A; 
Moore, Richard; Moore, Stephen S; Roberts, Andy; Taniguchi, Masaaki; Waterman, Richard C; Chacko, Joseph; Chandrabose, Mimi M; Cree, Andy; Dao, Marvin Diep; Dinh, Huyen H; Gabisi, Ramatu Ayiesha; Hines, Sandra; Hume, Jennifer; Jhangiani, Shalini N; Joshi, Vandita; Kovar, Christie L; Lewis, Lora R; Liu, Yih-Shin; Lopez, John; Morgan, Margaret B; Nguyen, Ngoc Bich; Okwuonu, Geoffrey O; Ruiz, San Juana; Santibanez, Jireh; Wright, Rita A; Buhay, Christian; Ding, Yan; Dugan-Rocha, Shannon; Herdandez, Judith; Holder, Michael; Sabo, Aniko; Egan, Amy; Goodell, Jason; Wilczek-Boney, Katarzyna; Fowler, Gerald R; Hitchens, Matthew Edward; Lozado, Ryan J; Moen, Charles; Steffen, David; Warren, James T; Zhang, Jingkun; Chiu, Readman; Schein, Jacqueline E; Durbin, K James; Havlak, Paul; Jiang, Huaiyang; Liu, Yue; Qin, Xiang; Ren, Yanru; Shen, Yufeng; Song, Henry; Bell, Stephanie Nicole; Davis, Clay; Johnson, Angela Jolivet; Lee, Sandra; Nazareth, Lynne V; Patel, Bella Mayurkumar; Pu, Ling-Ling; Vattathil, Selina; Williams, Rex Lee; Curry, Stacey; Hamilton, Cerissa; Sodergren, Erica; Wheeler, David A; Barris, Wes; Bennett, Gary L; Eggen, André; Green, Ronnie D; Harhay, Gregory P; Hobbs, Matthew; Jann, Oliver; Keele, John W; Kent, Matthew P; Lien, Sigbjørn; McKay, Stephanie D; McWilliam, Sean; Ratnakumar, Abhirami; Schnabel, Robert D; Smith, Timothy; Snelling, Warren M; Sonstegard, Tad S; Stone, Roger T; Sugimoto, Yoshikazu; Takasuga, Akiko; Taylor, Jeremy F; Van Tassell, Curtis P; Macneil, Michael D; Abatepaulo, Antonio R R; Abbey, Colette A; Ahola, Virpi; Almeida, Iassudara G; Amadio, Ariel F; Anatriello, Elen; Bahadue, Suria M; Biase, Fernando H; Boldt, Clayton R; Carroll, Jeffery A; Carvalho, Wanessa A; Cervelatti, Eliane P; Chacko, Elsa; Chapin, Jennifer E; Cheng, Ye; Choi, Jungwoo; Colley, Adam J; de Campos, Tatiana A; De Donato, Marcos; Santos, Isabel K F de Miranda; de Oliveira, Carlo J F; Deobald, Heather; Devinoy, Eve; Donohue, Kaitlin E; Dovc, Peter; Eberlein, Annett; 
Fitzsimmons, Carolyn J; Franzin, Alessandra M; Garcia, Gustavo R; Genini, Sem; Gladney, Cody J; Grant, Jason R; Greaser, Marion L; Green, Jonathan A; Hadsell, Darryl L; Hakimov, Hatam A; Halgren, Rob; Harrow, Jennifer L; Hart, Elizabeth A; Hastings, Nicola; Hernandez, Marta; Hu, Zhi-Liang; Ingham, Aaron; Iso-Touru, Terhi; Jamis, Catherine; Jensen, Kirsty; Kapetis, Dimos; Kerr, Tovah; Khalil, Sari S; Khatib, Hasan; Kolbehdari, Davood; Kumar, Charu G; Kumar, Dinesh; Leach, Richard; Lee, Justin C-M; Li, Changxi; Logan, Krystin M; Malinverni, Roberto; Marques, Elisa; Martin, William F; Martins, Natalia F; Maruyama, Sandra R; Mazza, Raffaele; McLean, Kim L; Medrano, Juan F; Moreno, Barbara T; Moré, Daniela D; Muntean, Carl T; Nandakumar, Hari P; Nogueira, Marcelo F G; Olsaker, Ingrid; Pant, Sameer D; Panzitta, Francesca; Pastor, Rosemeire C P; Poli, Mario A; Poslusny, Nathan; Rachagani, Satyanarayana; Ranganathan, Shoba; Razpet, Andrej; Riggs, Penny K; Rincon, Gonzalo; Rodriguez-Osorio, Nelida; Rodriguez-Zas, Sandra L; Romero, Natasha E; Rosenwald, Anne; Sando, Lillian; Schmutz, Sheila M; Shen, Libing; Sherman, Laura; Southey, Bruce R; Lutzow, Ylva Strandberg; Sweedler, Jonathan V; Tammen, Imke; Telugu, Bhanu Prakash V L; Urbanski, Jennifer M; Utsunomiya, Yuri T; Verschoor, Chris P; Waardenberg, Ashley J; Wang, Zhiquan; Ward, Robert; Weikard, Rosemarie; Welsh, Thomas H; White, Stephen N; Wilming, Laurens G; Wunderlich, Kris R; Yang, Jianqi; Zhao, Feng-Qi

2009-04-24

To understand the biology and evolution of ruminants, the cattle genome was sequenced to about sevenfold coverage. The cattle genome contains a minimum of 22,000 genes, with a core set of 14,345 orthologs shared among seven mammalian species of which 1217 are absent or undetected in noneutherian (marsupial or monotreme) genomes. Cattle-specific evolutionary breakpoint regions in chromosomes have a higher density of segmental duplications, enrichment of repetitive elements, and species-specific variations in genes associated with lactation and immune responsiveness. Genes involved in metabolism are generally highly conserved, although five metabolic genes are deleted or extensively diverged from their human orthologs. The cattle genome sequence thus provides a resource for understanding mammalian evolution and accelerating livestock genetic improvement for milk and meat production.
Genome-Wide Identification of Regulatory Sequences Undergoing Accelerated Evolution in the Human Genome.

PubMed

Dong, Xinran; Wang, Xiao; Zhang, Feng; Tian, Weidong

2016-10-01

Accelerated evolution of regulatory sequences can alter the expression pattern of target genes and cause phenotypic changes. In this study, we used DNase I hypersensitive sites (DHSs) to annotate putative regulatory sequences in the human genome, and conducted a genome-wide analysis of the effects of accelerated evolution on regulatory sequences. Working under the assumption that local ancient repeat elements of DHSs are under neutral evolution, we discovered that ∼0.44% of DHSs are under accelerated evolution (ace-DHSs). We found that ace-DHSs tend to be more active than background DHSs, and are strongly associated with epigenetic marks of active transcription. The target genes of ace-DHSs are significantly enriched in neuron-related functions, and their expression levels are positively selected in the human brain. Thus, these lines of evidence strongly suggest that accelerated evolution of regulatory sequences plays an important role in the evolution of human-specific phenotypes. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
A Topical Trajectory on Survival: An Analysis of Link-Making in a Sequence of Lessons on Evolution

ERIC Educational Resources Information Center

Rocksén, Miranda; Olander, Clas

2017-01-01

This study explores the concept of link-making in relation to communicative strategies applied in the teaching and studying of biological evolution. The analysis focused on video recordings of 11 lessons on biological evolution conducted in a Swedish 9th grade class of students aged 15 years. It reveals how the teacher and students connected…
Advanced Applications of Next-Generation Sequencing Technologies to Orchid Biology.

PubMed

Yeh, Chuan-Ming; Liu, Zhong-Jian; Tsai, Wen-Chieh

2018-01-01

Next-generation sequencing technologies are revolutionizing biology by permitting transcriptome sequencing, whole-genome sequencing and resequencing, and genome-wide single nucleotide polymorphism profiling. Orchid research has benefited from this breakthrough, and a few orchid genomes are now available; new biological questions can be approached and new breeding strategies can be designed. The first part of this review describes the unique features of orchid biology. The second part provides an overview of the current next-generation sequencing platforms, many of which are already used in plant laboratories. The third part summarizes the state of orchid transcriptome and genome sequencing and illustrates current achievements. The genetic sequences currently obtained will not only provide a broad scope for the study of orchid biology, but also serve as a starting point for uncovering the mystery of orchid evolution.
The Genome Sequence of Taurine Cattle: A window to ruminant biology and evolution

PubMed Central

Elsik, Christine G.; Tellam, Ross L.; Worley, Kim C.

2010-01-01

To understand the biology and evolution of ruminants, the cattle genome was sequenced to ∼7× coverage. The cattle genome contains a minimum of 22,000 genes, with a core set of 14,345 orthologs shared among seven mammalian species of which 1,217 are absent or undetected in non-eutherian (marsupial or monotreme) genomes. Cattle-specific evolutionary breakpoint regions in chromosomes have a higher density of segmental duplications, enrichment of repetitive elements, and species-specific variations in genes associated with lactation and immune responsiveness. Genes involved in metabolism are generally highly conserved, although five metabolic genes are deleted or extensively diverged from their human orthologs. The cattle genome sequence thus provides an enabling resource for understanding mammalian evolution and accelerating livestock genetic improvement for milk and meat production. PMID:19390049
Chromosome Evolution in Connection with Repetitive Sequences and Epigenetics in Plants

PubMed Central

Li, Shu-Fen; Su, Ting; Cheng, Guang-Qian; Wang, Bing-Xiao; Li, Xu; Deng, Chuan-Liang; Gao, Wu-Jun

2017-01-01

Chromosome evolution is a fundamental aspect of evolutionary biology. The evolution of chromosome size, structure and shape, number, and the change in DNA composition suggest the high plasticity of nuclear genomes at the chromosomal level. Repetitive DNA sequences, which represent a conspicuous fraction of every eukaryotic genome, particularly in plants, are found to be tightly linked with plant chromosome evolution. Different classes of repetitive sequences have distinct distribution patterns on the chromosomes. Mounting evidence shows that repetitive sequences may play multiple generative roles in shaping the chromosome karyotypes in plants. Furthermore, recent development in our understanding of the repetitive sequences and plant chromosome evolution has elucidated the involvement of a spectrum of epigenetic modification. In this review, we focused on the recent evidence relating to the distribution pattern of repetitive sequences in plant chromosomes and highlighted their potential relevance to chromosome evolution in plants. We also discussed the possible connections between evolution and epigenetic alterations in chromosome structure and repatterning, such as heterochromatin formation, centromere function, and epigenetic-associated transposable element inactivation. PMID:29064432
Present Day Biology seen in the Looking Glass of Physics of Complexity

NASA Astrophysics Data System (ADS)

Schuster, P.

Darwin's theory of variation and selection in its simplest form is directly applicable to RNA evolution in vitro as well as to virus evolution, and it allows for quantitative predictions. Understanding evolution at the molecular level is ultimately related to the central paradigm of structural biology: sequence ⇒ structure ⇒ function. We elaborate on the state of the art in modeling and understanding evolution of RNA driven by reproduction and mutation. The focus will be laid on the landscape concept, originally introduced by Sewall Wright, and its application to problems in biology. The relation between genotypes and phenotypes is the result of two consecutive mappings from a space of genotypes called sequence space onto a space of phenotypes or structures, and fitness is the result of a mapping from phenotype space into non-negative real numbers. Realistic landscapes as derived from folding of RNA sequences into structures are characterized by two properties: (i) they are rugged in the sense that sequences lying nearby in sequence space may have very different fitness values and (ii) they are characterized by an appreciable degree of neutrality implying that a certain fraction of genotypes and/or phenotypes cannot be distinguished in the selection process. Evolutionary dynamics on realistic landscapes will be studied as a function of the mutation rate, and the role of neutrality in the selection process will be discussed.
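As an illustrative note on the record above: the mapping from sequence space to phenotypes, and the idea of neutrality among neighboring sequences, can be sketched in a few lines. This is a toy sketch only. The function `toy_phenotype` is a hypothetical stand-in for real RNA folding (it merely coarse-grains the G/C count), and all names here are assumptions rather than anything from the abstract.

```python
ALPHABET = "ACGU"


def one_mutant_neighbors(seq):
    """Yield every sequence one point mutation away in sequence space."""
    for i, base in enumerate(seq):
        for alt in ALPHABET:
            if alt != base:
                yield seq[:i] + alt + seq[i + 1:]


def toy_phenotype(seq):
    """Hypothetical genotype-to-phenotype map (NOT RNA folding):
    the number of G/C bases, binned in steps of two."""
    return sum(base in "GC" for base in seq) // 2


def neutral_fraction(seq, phenotype=toy_phenotype):
    """Fraction of one-mutant neighbors that share the phenotype of seq."""
    neighbors = list(one_mutant_neighbors(seq))
    target = phenotype(seq)
    same = sum(phenotype(n) == target for n in neighbors)
    return same / len(neighbors)


if __name__ == "__main__":
    print(neutral_fraction("GAAA"))
```

Swapping `toy_phenotype` for a real structure predictor would give an estimate of neutrality for an actual folding landscape; the toy map is used here only so that the sketch stays self-contained.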
Evolution of microbes and viruses: a paradigm shift in evolutionary biology?

PubMed Central

Koonin, Eugene V.; Wolf, Yuri I.

2012-01-01

When Charles Darwin formulated the central principles of evolutionary biology in the Origin of Species in 1859 and the architects of the Modern Synthesis integrated these principles with population genetics almost a century later, the principal if not the sole objects of evolutionary biology were multicellular eukaryotes, primarily animals and plants. Before the advent of efficient gene sequencing, all attempts to extend evolutionary studies to bacteria have been futile. Sequencing of the rRNA genes in thousands of microbes allowed the construction of the three-domain “ribosomal Tree of Life” that was widely thought to have resolved the evolutionary relationships between the cellular life forms. However, subsequent massive sequencing of numerous, complete microbial genomes revealed novel evolutionary phenomena, the most fundamental of these being: (1) pervasive horizontal gene transfer (HGT), in large part mediated by viruses and plasmids, that shapes the genomes of archaea and bacteria and calls for a radical revision (if not abandonment) of the Tree of Life concept, (2) Lamarckian-type inheritance that appears to be critical for antivirus defense and other forms of adaptation in prokaryotes, and (3) evolution of evolvability, i.e., dedicated mechanisms for evolution such as vehicles for HGT and stress-induced mutagenesis systems. In the non-cellular part of the microbial world, phylogenomics and metagenomics of viruses and related selfish genetic elements revealed enormous genetic and molecular diversity and extremely high abundance of viruses that come across as the dominant biological entities on earth. Furthermore, the perennial arms race between viruses and their hosts is one of the defining factors of evolution. Thus, microbial phylogenomics adds new dimensions to the fundamental picture of evolution even as the principle of descent with modification discovered by Darwin and the laws of population genetics remain at the core of evolutionary biology. PMID:22993722
Determinants of the rate of protein sequence evolution

PubMed Central

Zhang, Jianzhi; Yang, Jian-Rong

2015-01-01

The rate and mechanism of protein sequence evolution have been central questions in evolutionary biology since the 1960s. Although the rate of protein sequence evolution depends primarily on the level of functional constraint, exactly what constitutes functional constraint has remained unclear. The increasing availability of genomic data has allowed for much needed empirical examinations on the nature of functional constraint. These studies found that the evolutionary rate of a protein is predominantly influenced by its expression level rather than functional importance. A combination of theoretical and empirical analyses has identified multiple mechanisms behind these observations and demonstrated a prominent role that selection against errors in molecular and cellular processes plays in protein evolution. PMID:26055156
Evolution of biological complexity

PubMed Central

Adami, Christoph; Ofria, Charles; Collier, Travis C.

2000-01-01

To make a case for or against a trend in the evolution of complexity in biological evolution, complexity needs to be both rigorously defined and measurable. A recent information-theoretic (but intuitively evident) definition identifies genomic complexity with the amount of information a sequence stores about its environment. We investigate the evolution of genomic complexity in populations of digital organisms and monitor in detail the evolutionary transitions that increase complexity. We show that, because natural selection forces genomes to behave as a natural “Maxwell Demon,” within a fixed environment, genomic complexity is forced to increase. PMID:10781045
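The information-theoretic definition mentioned in this abstract can be illustrated with a small sketch. It assumes the common approximation in which complexity is the sequence length minus the summed per-site entropy of an aligned population; the function name, the toy populations and the four-letter alphabet are illustrative assumptions, not details taken from the abstract.

```python
from collections import Counter
from math import log


def genomic_complexity(population, alphabet_size=4):
    """Approximate complexity as length minus summed per-site entropy.

    Entropy is measured in units where a fully random site contributes 1,
    so a site that is free to vary stores no information and a fixed site
    stores one unit. This is an assumed approximation, not the authors' code.
    """
    length = len(population[0])
    if any(len(seq) != length for seq in population):
        raise ValueError("sequences must be aligned to equal length")
    total_entropy = 0.0
    for site in range(length):
        counts = Counter(seq[site] for seq in population)
        for count in counts.values():
            p = count / len(population)
            total_entropy -= p * log(p, alphabet_size)
    return length - total_entropy


# Hypothetical toy populations: one fully conserved, one fully random site.
conserved = ["ACGT", "ACGT", "ACGT", "ACGT"]
random_site = ["A", "C", "G", "T"]
print(genomic_complexity(conserved))
print(genomic_complexity(random_site))
```

Under this approximation, selection that fixes more sites lowers the per-site entropy and so raises the measured complexity, which is the direction of change the abstract describes for a fixed environment.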
Exploitation of peptide motif sequences and their use in nanobiotechnology.

PubMed

Shiba, Kiyotaka

2010-08-01

Short amino acid sequences extracted from natural proteins or created using in vitro evolution systems are sometimes associated with particular biological functions. These peptides, called peptide motifs, can serve as functional units for the creation of various tools for nanobiotechnology. In particular, peptide motifs that have the ability to specifically recognize the surfaces of solid materials and to mineralize certain inorganic materials have been linking biological science to material science. Here, I review how these peptide motifs have been isolated from natural proteins or created using in vitro evolution systems, and how they have been used in the nanobiotechnology field. Copyright © 2010 Elsevier Ltd. All rights reserved.

Molecular Evolution in Historical Perspective.

PubMed

Suárez-Díaz, Edna

2016-12-01

In the 1960s, advances in protein chemistry and molecular genetics provided new means for the study of biological evolution. Amino acid sequencing, nucleic acid hybridization, zone gel electrophoresis, and immunochemistry were some of the experimental techniques that brought about new perspectives to the study of the patterns and mechanisms of evolution. New concepts, such as the molecular evolutionary clock, and the discovery of unexpected molecular phenomena, like the presence of repetitive sequences in eukaryotic genomes, eventually led to the realization that evolution might occur at a different pace at the organismic and the molecular levels, and according to different mechanisms. These developments sparked important debates between defenders of the molecular and organismic approaches. The most vocal confrontations focused on the relation between primates and humans, and the neutral theory of molecular evolution. By the 1980s and 1990s, the construction of large protein and DNA sequence databases, and the development of computer-based statistical tools, facilitated the coming together of molecular and evolutionary biology. Although in its contemporary form the field of molecular evolution can be traced back over the last five decades, the field has deep roots in twentieth-century experimental life sciences. For historians of science, the origins and consolidation of molecular evolution provide a privileged field for the study of scientific debates, the relation between technological advances and scientific knowledge, and the connection between science and broader social concerns.
Neutral evolution of proteins: The superfunnel in sequence space and its relation to mutational robustness

NASA Astrophysics Data System (ADS)

Noirel, Josselin; Simonson, Thomas

2008-11-01

Following Kimura's neutral theory of molecular evolution [M. Kimura, The Neutral Theory of Molecular Evolution (Cambridge University Press, Cambridge, 1983) (reprinted in 1986)], it has become common to assume that the vast majority of viable mutations of a gene confer little or no functional advantage. Yet, in silico models of protein evolution have shown that mutational robustness of sequences could be selected for, even in the context of neutral evolution. The evolution of a biological population can be seen as a diffusion on the network of viable sequences. This network is called a "neutral network." Depending on the mutation rate μ and the population size N, the biological population can evolve purely randomly (μN ≪1) or it can evolve in such a way as to select for sequences of higher mutational robustness (μN ≫1). The stringency of the selection depends not only on the product μN but also on the exact topology of the neutral network, the special arrangement of which was named "superfunnel." Even though the relation between mutation rate, population size, and selection was thoroughly investigated, a study of the salient topological features of the superfunnel that could affect the strength of the selection was wanting. This question is addressed in this study. We use two different models of proteins: on lattice and off lattice. We compare neutral networks computed using these models to random networks. From this, we identify two important factors of the topology that determine the stringency of the selection for mutationally robust sequences. First, the presence of highly connected nodes ("hubs") in the network increases the selection for mutationally robust sequences. Second, the stringency of the selection increases when the correlation between a sequence's mutational robustness and its neighbors' increases. The latter finding relates a global characteristic of the neutral network to a local one, which is attainable through experiments or molecular modeling.
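The two regimes contrasted in this abstract can be made concrete on a toy neutral network. The sketch below is illustrative only: the sequence length, the viability rule and every function name are assumptions, not taken from the paper. It also relies on a standard result from the neutral-network literature that the abstract does not state: when μN ≪ 1 the population samples the network roughly uniformly, so its mean robustness tracks the average node degree, whereas when μN ≫ 1 it tracks the largest eigenvalue of the network's adjacency matrix.

```python
from itertools import product

L = 5  # toy sequence length (assumption)


def viable(seq):
    # Hypothetical viability rule: a binary sequence is viable with at most two 1s.
    return sum(seq) <= 2


# Nodes of the neutral network are the viable sequences.
nodes = [s for s in product((0, 1), repeat=L) if viable(s)]
index = {s: i for i, s in enumerate(nodes)}


def viable_neighbours(seq):
    """Yield indices of viable sequences one point mutation away."""
    for i in range(L):
        mutant = seq[:i] + (1 - seq[i],) + seq[i + 1:]
        if mutant in index:
            yield index[mutant]


adj = [list(viable_neighbours(s)) for s in nodes]
mean_degree = sum(len(a) for a in adj) / len(nodes)


def spectral_radius(adj, iters=2000):
    """Largest adjacency eigenvalue by power iteration on A + I.

    The identity shift avoids the oscillation that plain power iteration
    shows on bipartite graphs such as subgraphs of the hypercube.
    """
    n = len(adj)
    v = [1.0] * n
    for _ in range(iters):
        w = [v[i] + sum(v[j] for j in adj[i]) for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    av = [sum(v[j] for j in adj[i]) for i in range(n)]
    return sum(a * b for a, b in zip(v, av))  # Rayleigh quotient, v has unit norm


rho = spectral_radius(adj)
print("mean robustness, uniform sampling:", mean_degree / L)
print("mean robustness, large-population limit:", rho / L)
```

Because the largest eigenvalue of a symmetric adjacency matrix is never smaller than the average degree, the second figure is at least as large as the first, and highly connected nodes widen the gap, consistent with the role the abstract assigns to hubs.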
Neutral evolution of proteins: The superfunnel in sequence space and its relation to mutational robustness.

PubMed

Noirel, Josselin; Simonson, Thomas

2008-11-14

Following Kimura's neutral theory of molecular evolution [M. Kimura, The Neutral Theory of Molecular Evolution (Cambridge University Press, Cambridge, 1983) (reprinted in 1986)], it has become common to assume that the vast majority of viable mutations of a gene confer little or no functional advantage. Yet, in silico models of protein evolution have shown that mutational robustness of sequences could be selected for, even in the context of neutral evolution. The evolution of a biological population can be seen as a diffusion on the network of viable sequences. This network is called a "neutral network." Depending on the mutation rate μ and the population size N, the biological population can evolve purely randomly (μN ≪1) or it can evolve in such a way as to select for sequences of higher mutational robustness (μN ≫1). The stringency of the selection depends not only on the product μN but also on the exact topology of the neutral network, the special arrangement of which was named "superfunnel." Even though the relation between mutation rate, population size, and selection was thoroughly investigated, a study of the salient topological features of the superfunnel that could affect the strength of the selection was wanting. This question is addressed in this study. We use two different models of proteins: on lattice and off lattice. We compare neutral networks computed using these models to random networks. From this, we identify two important factors of the topology that determine the stringency of the selection for mutationally robust sequences. First, the presence of highly connected nodes ("hubs") in the network increases the selection for mutationally robust sequences. Second, the stringency of the selection increases when the correlation between a sequence's mutational robustness and its neighbors' increases. The latter finding relates a global characteristic of the neutral network to a local one, which is attainable through experiments or molecular modeling.
Bioinformatics: A History of Evolution "In Silico"

ERIC Educational Resources Information Center

Ondrej, Vladan; Dvorak, Petr

2012-01-01

Bioinformatics, biological databases, and the worldwide use of computers have accelerated biological research in many fields, such as evolutionary biology. Here, we describe a primer of nucleotide sequence management and the construction of a phylogenetic tree with two examples; the two selected are from completely different groups of organisms:…
PROFESS: a PROtein Function, Evolution, Structure and Sequence database

PubMed Central

Triplet, Thomas; Shortridge, Matthew D.; Griep, Mark A.; Stark, Jaime L.; Powers, Robert; Revesz, Peter

2010-01-01

The proliferation of biological databases and the easy access enabled by the Internet are having a beneficial impact on biological sciences and transforming the way research is conducted. There are ∼1100 molecular biology databases dispersed throughout the Internet. To assist in the functional, structural and evolutionary analysis of the abundant number of novel proteins continually identified from whole-genome sequencing, we introduce the PROFESS (PROtein Function, Evolution, Structure and Sequence) database. Our database is designed to be versatile and expandable and will not confine analysis to a pre-existing set of data relationships. A fundamental component of this approach is the development of an intuitive query system that incorporates a variety of similarity functions capable of generating data relationships not conceived during the creation of the database. The utility of PROFESS is demonstrated by the analysis of the structural drift of homologous proteins and the identification of potential pancreatic cancer therapeutic targets based on the observation of protein–protein interaction networks. Database URL: http://cse.unl.edu/~profess/ PMID:20624718
Combining protein sequence, structure, and dynamics: A novel approach for functional evolution analysis of PAS domain superfamily.

PubMed

Dong, Zheng; Zhou, Hongyu; Tao, Peng

2018-02-01

PAS domains are widespread in archaea, bacteria, and eukaryota, and play important roles in various functions. In this study, we aim to explore the functional evolutionary relationships among proteins in the PAS domain superfamily in view of the sequence-structure-dynamics-function relationship. We collected protein sequences and crystal structure data from the RCSB Protein Data Bank for members of the PAS domain superfamily belonging to three biological functions (nucleotide binding, photoreceptor activity, and transferase activity). Protein sequences were aligned and then used to select sequence-conserved residues and build a phylogenetic tree. Three-dimensional structure alignment was also applied to obtain structure-conserved residues. The protein dynamics were analyzed using an elastic network model (ENM) and validated by molecular dynamics (MD) simulation. The results showed that proteins with the same function could be grouped by sequence similarity, and proteins in different functional groups displayed statistically significant differences in their vibrational patterns. Interestingly, in all three functional groups, conserved amino acid residues identified by sequence and structure conservation analysis generally have a lower fluctuation than other residues. In addition, the fluctuation of conserved residues in each biological function group was strongly correlated with the corresponding biological function. This research suggested a direct connection in which the protein sequences were related to various functions through structural dynamics. This is a new attempt to delineate functional evolution of proteins using the integrated information of sequence, structure, and dynamics. © 2017 The Protein Society.
Nothing in Evolution Makes Sense Except in the Light of Genomics: Read-Write Genome Evolution as an Active Biological Process.

PubMed

Shapiro, James A

2016-06-08

The 21st century genomics-based analysis of evolutionary variation reveals a number of novel features impossible to predict when Dobzhansky and other evolutionary biologists formulated the neo-Darwinian Modern Synthesis in the middle of the last century. These include three distinct realms of cell evolution; symbiogenetic fusions forming eukaryotic cells with multiple genome compartments; horizontal organelle, virus and DNA transfers; functional organization of proteins as systems of interacting domains subject to rapid evolution by exon shuffling and exonization; distributed genome networks integrated by mobile repetitive regulatory signals; and regulation of multicellular development by non-coding lncRNAs containing repetitive sequence components. Rather than single gene traits, all phenotypes involve coordinated activity by multiple interacting cell molecules. Genomes contain abundant and functional repetitive components in addition to the unique coding sequences envisaged in the early days of molecular biology. Combinatorial coding, plus the biochemical abilities cells possess to rearrange DNA molecules, constitute a powerful toolbox for adaptive genome rewriting. That is, cells possess "Read-Write Genomes" they alter by numerous biochemical processes capable of rapidly restructuring cellular DNA molecules. Rather than viewing genome evolution as a series of accidental modifications, we can now study it as a complex biological process of active self-modification.
Nothing in Evolution Makes Sense Except in the Light of Genomics: Read–Write Genome Evolution as an Active Biological Process

PubMed Central

Shapiro, James A.

2016-01-01

The 21st century genomics-based analysis of evolutionary variation reveals a number of novel features impossible to predict when Dobzhansky and other evolutionary biologists formulated the neo-Darwinian Modern Synthesis in the middle of the last century. These include three distinct realms of cell evolution; symbiogenetic fusions forming eukaryotic cells with multiple genome compartments; horizontal organelle, virus and DNA transfers; functional organization of proteins as systems of interacting domains subject to rapid evolution by exon shuffling and exonization; distributed genome networks integrated by mobile repetitive regulatory signals; and regulation of multicellular development by non-coding lncRNAs containing repetitive sequence components. Rather than single gene traits, all phenotypes involve coordinated activity by multiple interacting cell molecules. Genomes contain abundant and functional repetitive components in addition to the unique coding sequences envisaged in the early days of molecular biology. Combinatorial coding, plus the biochemical abilities cells possess to rearrange DNA molecules, constitute a powerful toolbox for adaptive genome rewriting. That is, cells possess “Read–Write Genomes” they alter by numerous biochemical processes capable of rapidly restructuring cellular DNA molecules. Rather than viewing genome evolution as a series of accidental modifications, we can now study it as a complex biological process of active self-modification. PMID:27338490
Goodbye to 'one by one' genetics

PubMed Central

Theologis, Athanasios

2001-01-01

The completion of the Arabidopsis thaliana (mustard weed) genome sequence constitutes a major breakthrough in plant biology. It will revolutionize how we answer questions about the biology and evolution of plants as well as how we confront and resolve world-wide agricultural problems. PMID:11305933
Between Two Fern Genomes

PubMed Central

2014-01-01

Ferns are the only major lineage of vascular plants not represented by a sequenced nuclear genome. This lack of genome sequence information significantly impedes our ability to understand and reconstruct genome evolution not only in ferns, but across all land plants. Azolla and Ceratopteris are ideal and complementary candidates to be the first ferns to have their nuclear genomes sequenced. They differ dramatically in genome size, life history, and habit, and thus represent the immense diversity of extant ferns. Together, this pair of genomes will facilitate myriad large-scale comparative analyses across ferns and all land plants. Here we review the unique biological characteristics of ferns and describe a number of outstanding questions in plant biology that will benefit from the addition of ferns to the set of taxa with sequenced nuclear genomes. We explain why the fern clade is pivotal for understanding genome evolution across land plants, and we provide a rationale for how knowledge of fern genomes will enable progress in research beyond the ferns themselves. PMID:25324969
The painted turtle, Chrysemys picta: a model system for vertebrate evolution, ecology, and human health.

PubMed

Valenzuela, Nicole

2009-07-01

Painted turtles (Chrysemys picta) are representatives of a vertebrate clade whose biology and phylogenetic position hold a key to our understanding of fundamental aspects of vertebrate evolution. These features make them an ideal emerging model system. Extensive ecological and physiological research provides the context in which to place new research advances in evolutionary genetics, genomics, evolutionary developmental biology, and ecological developmental biology which are enabled by current resources, such as a bacterial artificial chromosome (BAC) library of C. picta, and the imminent development of additional ones such as genome sequences and cDNA and expressed sequence tag (EST) libraries. This integrative approach will allow the research community to continue making advances to provide functional and evolutionary explanations for the lability of biological traits found not only among reptiles but among vertebrates in general. Moreover, because humans and reptiles share a common ancestor, and given the ease of using nonplacental vertebrates in experimental biology compared with mammalian embryos, painted turtles are also an emerging model system for biomedical research. For example, painted turtles have been studied to understand many biological responses to overwintering and anoxia, as potential sentinels for environmental xenobiotics, and as a model to decipher the ecology and evolution of sexual development and reproduction. Thus, painted turtles are an excellent reptilian model system for studies with human health, environmental, ecological, and evolutionary significance.
Chemical Evolution and the Evolutionary Definition of Life.

PubMed

Higgs, Paul G

2017-06-01

Darwinian evolution requires a mechanism for generation of diversity in a population, and selective differences between individuals that influence reproduction. In biology, diversity is generated by mutations and selective differences arise because of the encoded functions of the sequences (e.g., ribozymes or proteins). Here, I draw attention to a process that I will call chemical evolution, in which the diversity is generated by random chemical synthesis instead of (or in addition to) mutation, and selection acts on physicochemical properties, such as hydrolysis, photolysis, solubility, or surface binding. Chemical evolution applies to short oligonucleotides that can be generated by random polymerization, as well as by template-directed replication, and which may be too short to encode a specific function. Chemical evolution is an important stage on the pathway to life, between the stage of "just chemistry" and the stage of full biological evolution. A mathematical model is presented here that illustrates the differences between these three stages. Chemical evolution leads to much larger differences in molecular concentrations than can be achieved by selection without replication. However, chemical evolution is not open-ended, unlike biological evolution. The ability to undergo Darwinian evolution is often considered to be a defining feature of life. Here, I argue that chemical evolution, although Darwinian, does not quite constitute life, and that a good place to put the conceptual boundary between non-life and life is between chemical and biological evolution.
Long-read sequencing improves assembly of Trichinella genomes 10-fold, revealing substantial synteny between lineages diverged over seven million years

USDA-ARS's Scientific Manuscript database

Genome evolution influences a parasite’s pathogenicity, host-pathogen interactions, environmental constraints, and invasion biology, while genome assemblies form the basis of comparative sequence analyses. Given that closely related organisms typically maintain appreciable synteny, the genome asse...
Biodiversity Meets Neuroscience: From the Sequencing Ship (Ship-Seq) to Deciphering Parallel Evolution of Neural Systems in Omic's Era.

PubMed

Moroz, Leonid L

2015-12-01

The origins of neural systems and centralized brains are one of the major transitions in evolution. These events might have occurred more than once over 570-600 million years. The convergent evolution of neural circuits is evident from a diversity of unique adaptive strategies implemented by ctenophores, cnidarians, acoels, molluscs, and basal deuterostomes. But further integration of biodiversity research and neuroscience is required to decipher critical events leading to the development of complex integrative and cognitive functions. Here, we outline reference species and interdisciplinary approaches in reconstructing the evolution of nervous systems. In the "omic" era, it is now possible to establish fully functional genomics laboratories aboard oceanic ships and perform sequencing and real-time analyses of data at any oceanic location (named here as Ship-Seq). In doing so, fragile, rare, cryptic, and planktonic organisms, or even entire marine ecosystems, are becoming accessible directly to experimental and physiological analyses by modern analytical tools. Thus, we are now in a position to take full advantage of the countless "experiments" Nature performed for us in the course of 3.5 billion years of biological evolution. Together with progress in computational and comparative genomics, evolutionary neuroscience, proteomic and developmental biology, a new surprising picture is emerging that reveals the many ways in which nervous systems evolved. As a result, this symposium provides a unique opportunity to revisit old questions about the origins of biological complexity. © The Author 2015. Published by Oxford University Press on behalf of the Society for Integrative and Comparative Biology.
The Evolution of Bony Vertebrate Enhancers at Odds with Their Coding Sequence Landscape.

PubMed

Yousaf, Aisha; Sohail Raza, Muhammad; Ali Abbasi, Amir

2015-08-06

Enhancers lie at the heart of transcriptional and developmental gene regulation. Therefore, changes in enhancer sequences usually disrupt the target gene expression and result in disease phenotypes. Despite the well-established role of enhancers in development and disease, evolutionary sequence studies are lacking. The current study attempts to unravel the puzzle of bony vertebrates' conserved noncoding elements (CNE) enhancer evolution. Bayesian phylogenetics of enhancer sequences spotlights promising interordinal relationships among placental mammals, proposing a closer relationship between humans and laurasiatherians while placing rodents at the basal position. Clock-based estimates of enhancer evolution provided a dynamic picture of interspecific rate changes across the bony vertebrate lineage. Moreover, including the coelacanth in the study augmented our appreciation of vertebrate cis-regulatory evolution during the water-land transition. Intriguingly, we observed a pronounced upsurge in enhancer evolution in land-dwelling vertebrates. These novel findings prompted us to investigate further the evolutionary trend of coding as well as CNE nonenhancer repertoires, to highlight the relative evolutionary dynamics of diverse genomic landscapes. Surprisingly, the evolutionary rates of enhancer sequences were clearly at odds with those of the coding and the CNE nonenhancer sequences during vertebrate adaptation to land, with land vertebrates exhibiting significantly reduced rates of coding sequence evolution in comparison to their fast evolving regulatory landscape. The observed variation in tetrapod cis-regulatory elements caused the fine-tuning of associated gene regulatory networks. Therefore, the increased evolutionary rate of tetrapods' enhancer sequences might be responsible for the variation in developmental regulatory circuits during the process of vertebrate adaptation to land. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Darwinian evolution in the light of genomics

PubMed Central

Koonin, Eugene V.

2009-01-01

Comparative genomics and systems biology offer unprecedented opportunities for testing central tenets of evolutionary biology formulated by Darwin in the Origin of Species in 1859 and expanded in the Modern Synthesis 100 years later. Evolutionary-genomic studies show that natural selection is only one of the forces that shape genome evolution and is not quantitatively dominant, whereas non-adaptive processes are much more prominent than previously suspected. Major contributions of horizontal gene transfer and diverse selfish genetic elements to genome evolution undermine the Tree of Life concept. An adequate depiction of evolution requires the more complex concept of a network or ‘forest’ of life. There is no consistent tendency of evolution towards increased genomic complexity, and when complexity increases, this appears to be a non-adaptive consequence of evolution under weak purifying selection rather than an adaptation. Several universals of genome evolution were discovered including the invariant distributions of evolutionary rates among orthologous genes from diverse genomes and of paralogous gene family sizes, and the negative correlation between gene expression level and sequence evolution rate. Simple, non-adaptive models of evolution explain some of these universals, suggesting that a new synthesis of evolutionary biology might become feasible in a not so remote future. PMID:19213802
Comparable contributions of structural-functional constraints and expression level to the rate of protein sequence evolution

PubMed Central

Wolf, Maxim Y; Wolf, Yuri I; Koonin, Eugene V

2008-01-01

Background Proteins show a broad range of evolutionary rates. Understanding the factors that are responsible for the characteristic rate of evolution of a given protein arguably is one of the major goals of evolutionary biology. A long-standing general assumption used to be that the evolution rate is, primarily, determined by the specific functional constraints that affect the given protein. These constraints were traditionally thought to depend both on the specific features of the protein's structure and its biological role. The advent of systems biology brought about new types of data, such as expression level and protein-protein interactions, and unexpectedly, a variety of correlations between protein evolution rate and these variables have been observed. The strongest connections by far were repeatedly seen between protein sequence evolution rate and the expression level of the respective gene. It has been hypothesized that this link is due to the selection for the robustness of the protein structure to mistranslation-induced misfolding that is particularly important for highly expressed proteins and is the dominant determinant of the sequence evolution rate. Results This work is an attempt to assess the relative contributions of protein domain structure and function, on the one hand, and expression level on the other hand, to the rate of sequence evolution. To this end, we performed a genome-wide analysis of the effect of the fusion of a pair of domains in multidomain proteins on the difference in the domain-specific evolutionary rates. The mistranslation-induced misfolding hypothesis would predict that, within multidomain proteins, fused domains, on average, should evolve at substantially closer rates than the same domains in different proteins because, within a multidomain protein, all domains are translated at the same rate. We performed a comprehensive comparison of the evolutionary rates of mammalian and plant protein domains that are either joined in multidomain proteins or contained in distinct proteins. Substantial homogenization of evolutionary rates in multidomain proteins was, indeed, observed in both animals and plants, although highly significant differences between domain-specific rates remained. The contributions of the translation rate, as determined by the effect of the fusion of a pair of domains within a multidomain protein, and intrinsic, domain-specific structural-functional constraints appear to be comparable in magnitude. Conclusion Fusion of domains in a multidomain protein results in substantial homogenization of the domain-specific evolutionary rates but significant differences between domain-specific evolution rates remain. Thus, the rate of translation and intrinsic structural-functional constraints both exert sizable and comparable effects on sequence evolution. Reviewers This article was reviewed by Sergei Maslov, Dennis Vitkup, Claus Wilke (nominated by Orly Alter), and Allan Drummond (nominated by Joel Bader). For the full reviews, please go to the Reviewers' Reports section. PMID:18840284
Datasets for evolutionary comparative genomics

PubMed Central

Liberles, David A

2005-01-01

Many decisions about genome sequencing projects are directed by perceived gaps in the tree of life, or towards model organisms. With the goal of a better understanding of biology through the lens of evolution, however, there are additional genomes that are worth sequencing. One such rationale for whole-genome sequencing is discussed here, along with other important strategies for understanding the phenotypic divergence of species. PMID:16086856
Pragmatic turn in biology: From biological molecules to genetic content operators.

PubMed

Witzany, Guenther

2014-08-26

For decades, Erwin Schrödinger's question "What is life?" received the answer "physics + chemistry". The concepts of Alan Turing and John von Neumann introduced a third term: "information". This led to the understanding of nucleic acid sequences as a natural code. Manfred Eigen adapted Hamming's concept of "sequence space". Similar to Hilbert space, in which every ontological entity could be defined by an unequivocal point in a mathematical axiomatic system, in the abstract "sequence space" concept each point represents a unique syntactic structure and the value of their separation represents their dissimilarity. In this concept molecular features of the genetic code evolve by means of self-organisation of matter. Biological selection determines the fittest types among varieties of replication errors of quasi-species. The quasi-species concept dominated evolution theory for many decades. In contrast to this, recent empirical data on the evolution of DNA and its forerunners, the RNA-world and viruses indicate cooperative agent-based interactions. Group behaviour of quasi-species consortia constitute de novo and arrange available genetic content for adaptational purposes within real-life contexts that determine epigenetic markings. This review focuses on some fundamental changes in biology, discarding its traditional status as a subdiscipline of physics and chemistry.
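The "sequence space" picture in this abstract, where each point is a sequence and separation measures dissimilarity, can be sketched with the Hamming distance. The helper names and example sequences below are hypothetical and chosen only for illustration.

```python
def hamming_distance(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def one_step_neighbours(seq, alphabet="ACGT"):
    """All sequences exactly one substitution away from seq."""
    return [
        seq[:i] + letter + seq[i + 1:]
        for i in range(len(seq))
        for letter in alphabet
        if letter != seq[i]
    ]


# Hypothetical example: two 7-mers that differ at the third and sixth positions.
print(hamming_distance("GATTACA", "GACTATA"))
print(len(one_step_neighbours("GATTACA")))
```

For a sequence of length n over a four-letter alphabet, each point has 3n nearest neighbours, which is why the space grows so quickly with sequence length.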
Evolutionary trend toward kinetic stability in the folding trajectory of RNases H

PubMed Central

Lim, Shion A.; Hart, Kathryn M.; Marqusee, Susan

2016-01-01

Proper folding of proteins is critical to producing the biological machinery essential for cellular function. The rates and energetics of a protein’s folding process, which is described by its energy landscape, are encoded in the amino acid sequence. Over the course of evolution, this landscape must be maintained such that the protein folds and remains folded over a biologically relevant time scale. How exactly a protein’s energy landscape is maintained or altered throughout evolution is unclear. To study how a protein’s energy landscape changed over time, we characterized the folding trajectories of ancestral proteins of the ribonuclease H (RNase H) family using ancestral sequence reconstruction to access the evolutionary history between RNases H from mesophilic and thermophilic bacteria. We found that despite large sequence divergence, the overall folding pathway is conserved over billions of years of evolution. There are robust trends in the rates of protein folding and unfolding; both modern RNases H evolved to be more kinetically stable than their most recent common ancestor. Finally, our study demonstrates how a partially folded intermediate provides a readily adaptable folding landscape by allowing the independent tuning of kinetics and thermodynamics. PMID:27799545

Interspecific Plastome Recombination Reflects Ancient Reticulate Evolution in Picea (Pinaceae).

PubMed

Sullivan, Alexis R; Schiffthaler, Bastian; Thompson, Stacey Lee; Street, Nathaniel R; Wang, Xiao-Ru

2017-07-01

Plastid sequences are a cornerstone in plant systematic studies and key aspects of their evolution, such as uniparental inheritance and absent recombination, are often treated as axioms. While exceptions to these assumptions can profoundly influence evolutionary inference, detecting them can require extensive sampling, abundant sequence data, and detailed testing. Using advancements in high-throughput sequencing, we analyzed the whole plastomes of 65 accessions of Picea, a genus of ∼35 coniferous forest tree species, to test for deviations from canonical plastome evolution. Using complementary hypothesis and data-driven tests, we found evidence for chimeric plastomes generated by interspecific hybridization and recombination in the clade comprising Norway spruce (P. abies) and 10 other species. Support for interspecific recombination remained after controlling for sequence saturation, positive selection, and potential alignment artifacts. These results reconcile previous conflicting plastid-based phylogenies and strengthen the mounting evidence of reticulate evolution in Picea. Given the relatively high frequency of hybridization and biparental plastid inheritance in plants, we suggest interspecific plastome recombination may be more widespread than currently appreciated and could underlie reported cases of discordant plastid phylogenies. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Genomic science provides new insights into the biology of forest trees

Treesearch

Andrew Groover

2015-01-01

Forest biology is undergoing a fundamental change fostered by the application of genomic science to longstanding questions surrounding the evolution, adaptive traits, development, and environmental interactions of tree species. Genomic science has made major technical leaps in recent years, most notably with the advent of 'next generation sequencing' but...
Sequence-Level Mechanisms of Human Epigenome Evolution

PubMed Central

Prendergast, James G.D.; Chambers, Emily V.; Semple, Colin A.M.

2014-01-01

DNA methylation and chromatin states play key roles in development and disease. However, the extent of recent evolutionary divergence in the human epigenome and the influential factors that have shaped it are poorly understood. To determine the links between genome sequence and human epigenome evolution, we examined the divergence of DNA methylation and chromatin states following segmental duplication events in the human lineage. Chromatin and DNA methylation states were found to have been generally well conserved following a duplication event, with the evolution of the epigenome largely uncoupled from the total number of genetic changes in the surrounding DNA sequence. However, the epigenome at tissue-specific, distal regulatory regions was observed to be unusually prone to diverge following duplication, with particular sequence differences, altering known sequence motifs, found to be associated with divergence in patterns of DNA methylation and chromatin. Alu elements were found to have played a particularly prominent role in shaping human epigenome evolution, and we show that human-specific AluY insertion events are strongly linked to the evolution of the DNA methylation landscape and gene expression levels, including at key neurological genes in the human brain. Studying paralogous regions within the same sample enables the study of the links between genome and epigenome evolution while controlling for biological and technical variation. We show DNA methylation and chromatin divergence between duplicated regions are linked to the divergence of particular genetic motifs, with Alu elements having played a disproportionate role in the evolution of the epigenome in the human lineage. PMID:24966180
Biodiversity Meets Neuroscience: From the Sequencing Ship (Ship-Seq) to Deciphering Parallel Evolution of Neural Systems in Omic’s Era

PubMed Central

Moroz, Leonid L.

2015-01-01

The origins of neural systems and centralized brains are among the major transitions in evolution. These events might have occurred more than once over 570–600 million years. The convergent evolution of neural circuits is evident from a diversity of unique adaptive strategies implemented by ctenophores, cnidarians, acoels, molluscs, and basal deuterostomes. However, further integration of biodiversity research and neuroscience is required to decipher the critical events leading to the development of complex integrative and cognitive functions. Here, we outline reference species and interdisciplinary approaches for reconstructing the evolution of nervous systems. In the “omic” era, it is now possible to establish fully functional genomics laboratories aboard oceanic ships and perform sequencing and real-time analyses of data at any oceanic location (named here Ship-Seq). In doing so, fragile, rare, cryptic, and planktonic organisms, or even entire marine ecosystems, are becoming directly accessible to experimental and physiological analyses by modern analytical tools. Thus, we are now in a position to take full advantage of the countless “experiments” Nature has performed for us in the course of 3.5 billion years of biological evolution. Together with progress in computational and comparative genomics, evolutionary neuroscience, proteomics and developmental biology, a surprising new picture is emerging that reveals the many ways in which nervous systems evolved. As a result, this symposium provides a unique opportunity to revisit old questions about the origins of biological complexity. PMID:26163680
Conservation of hot regions in protein-protein interaction in evolution.

PubMed

Hu, Jing; Li, Jiarui; Chen, Nansheng; Zhang, Xiaolong

2016-11-01

The hot regions of protein-protein interactions are the active areas formed by the residues most important to the protein binding process. As research on protein interactions has developed, many predicted hot regions can be discovered efficiently by intelligent computing methods, whereas verifying every prediction experimentally is rarely feasible because of the time cost and complexity of the experiments. Building on research into hot spot residue conservation, this study proposes a method to verify the authenticity of hot regions predicted by a machine learning algorithm combined with the proteins' biological features and sequence conservation. The method uses multiple sequence alignment, a substitution matrix module and sequence similarity to build a conservation scoring algorithm, and then applies a threshold module to verify the conservation tendency of hot regions in evolution. This work provides an effective method for verifying predicted hot regions in protein-protein interactions, and a useful way to investigate the functional activities of protein hot regions in depth. Copyright © 2016. Published by Elsevier Inc.
Molecular Epidemiology and Genomics of Group A Streptococcus

PubMed Central

Bessen, Debra E.; McShan, W. Michael; Nguyen, Scott V.; Shetty, Amol; Agrawal, Sonia; Tettelin, Hervé

2014-01-01

Streptococcus pyogenes (group A streptococcus; GAS) is a strict human pathogen with a very high prevalence worldwide. This review highlights the genetic organization of the species and the important ecological considerations that impact its evolution. Recent advances are presented on the topics of molecular epidemiology, population biology, molecular basis for genetic change, genome structure and genetic flux, phylogenomics and closely related streptococcal species, and the long- and short-term evolution of GAS. The application of whole genome sequence data to addressing key biological questions is discussed. PMID:25460818
Inverse statistical physics of protein sequences: a key issues review.

PubMed

Cocco, Simona; Feinauer, Christoph; Figliuzzi, Matteo; Monasson, Rémi; Weigt, Martin

2018-03-01

In the course of evolution, proteins undergo important changes in their amino acid sequences, while their three-dimensional folded structure and their biological function remain remarkably conserved. Thanks to modern sequencing techniques, sequence data accumulate at an unprecedented pace. This provides large sets of so-called homologous, i.e. evolutionarily related, protein sequences, to which methods of inverse statistical physics can be applied. Using sequence data as the basis for the inference of Boltzmann distributions from samples of microscopic configurations or observables, it is possible to extract information about evolutionary constraints and thus protein function and structure. Here we give an overview of some biologically important questions, and of how statistical-mechanics-inspired modeling approaches can help to answer them. Finally, we discuss some open questions, which we expect to be addressed over the next years.
Investigation of compounds essential for the origin of life

NASA Technical Reports Server (NTRS)

Dayhoff, M. O.; Hunt, L. T.

1983-01-01

Nucleic acid sequencing as a technique to determine the chemical and biological evolution of certain prokaryotic metabolic pathways is discussed. Protein in data and a microbiological organization of the prokaryotes is included.
Rapid biological speciation driven by tectonic evolution in New Zealand

NASA Astrophysics Data System (ADS)

Craw, Dave; Upton, Phaedra; Burridge, Christopher P.; Wallis, Graham P.; Waters, Jonathan M.

2016-02-01

Collisions between tectonic plates lead to the rise of new mountain ranges that can separate biological populations and ultimately result in new species. However, the identification of links between tectonic mountain-building and biological speciation is confounded by environmental and ecological factors. Thus, there are surprisingly few well-documented examples of direct tectonic controls on terrestrial biological speciation. Here we present examples from New Zealand, where the rapid evolution of 18 species of freshwater fishes has resulted from parallel tectonic landscape evolution. We use numerical models to reconstruct changes in the deep crustal structure and surface drainage catchments of the southern island of New Zealand over the past 25 million years. We show that the island and mountain topography evolved in six principal tectonic zones, which have distinct drainage catchments that separated fish populations. We use new and existing phylogenetic analyses of freshwater fish populations, based on over 1,000 specimens from more than 400 localities, to show that fish genomes can retain evidence of this tectonic landscape development, with a clear correlation between geologic age and extent of DNA sequence divergence. We conclude that landscape evolution has controlled on-going biological diversification over the past 25 million years.
Evolution of biological sequences implies an extreme value distribution of type I for both global and local pairwise alignment scores.

PubMed

Bastien, Olivier; Maréchal, Eric

2008-08-07

Confidence in pairwise alignments of biological sequences, obtained by various methods such as Blast or Smith-Waterman, is critical for automatic analyses of genomic data. Two statistical models have been proposed. In the asymptotic limit of long sequences, the Karlin-Altschul model is based on the computation of a P-value, assuming that the number of high-scoring matching regions above a threshold is Poisson distributed. Alternatively, the Lipman-Pearson model is based on the computation of a Z-value from a random score distribution obtained by a Monte-Carlo simulation. Z-values allow the deduction of an upper bound of the P-value (1/Z-value²) following the TULIP theorem. Simulated Z-value distributions are known to fit a Gumbel law. This remarkable property had not been demonstrated and had no obvious biological support. We built a model of evolution of sequences based on aging, as meant in Reliability Theory, using the fact that the amount of information shared between an initial sequence and the sequences in its lineage (i.e., mutual information in Information Theory) is a decreasing function of time. This quantity is simply measured by a sequence alignment score. In systems aging, the failure rate is related to the system's longevity. The system can be a machine with structured components, or a living entity or population. "Reliability" refers to the ability to operate properly according to a standard. Here, the "reliability" of a sequence refers to the ability to conserve a sufficient functional level at the folded and maturated protein level (positive selection pressure). Homologous sequences were considered as systems 1) having a high redundancy of information reflected by the magnitude of their alignment scores, and 2) whose components are the amino acids that can independently be damaged by random DNA mutations.
From these assumptions, we deduced that information shared at each amino acid position evolved with a constant rate, corresponding to the information hazard rate, and that pairwise sequence alignment scores should follow a Gumbel distribution, whose parameters could find some theoretical rationale. In particular, one parameter corresponds to the information hazard rate. The extreme value distribution of alignment scores, assessed from high-scoring segment pairs following the Karlin-Altschul model, can also be deduced from the Reliability Theory applied to molecular sequences. It reflects the redundancy of information between homologous sequences, under functional conservative pressure. This model also provides a link between concepts of biological sequence analysis and of systems biology.
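The Monte-Carlo Z-value described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the scoring scheme (match/mismatch/linear gap), the number of shuffles, the function names and the toy sequences are all assumptions chosen for illustration. The bound 1/Z² is taken from the abstract's statement of the TULIP theorem.

```python
import random
import statistics


def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best Smith-Waterman local alignment score (linear gap penalty).

    The scoring values are illustrative assumptions, not a published matrix.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def z_value(a, b, shuffles=200, seed=0):
    """Z-value of the observed score against scores of shuffled copies of b."""
    rng = random.Random(seed)
    observed = local_alignment_score(a, b)
    letters = list(b)
    null_scores = []
    for _ in range(shuffles):
        rng.shuffle(letters)  # keeps composition, destroys residue order
        null_scores.append(local_alignment_score(a, "".join(letters)))
    sd = statistics.stdev(null_scores)
    if sd == 0:
        raise ValueError("null score distribution has zero variance")
    return (observed - statistics.mean(null_scores)) / sd


def p_value_upper_bound(z):
    """Upper bound 1/Z^2 quoted in the abstract; capped at 1 for small Z."""
    return 1.0 if z <= 1 else 1.0 / (z * z)


if __name__ == "__main__":
    # Hypothetical toy sequences differing at two positions.
    seq_a = "MKTAYIAKQRQISFVKSHFSRQ"
    seq_b = "MKTAYIAKQRQLSFVKSHFTRQ"
    z = z_value(seq_a, seq_b)
    print(f"Z = {z:.2f}, P <= {p_value_upper_bound(z):.4f}")
```

Shuffling one sequence preserves its amino acid composition while removing any positional signal, so a large Z-value indicates that the observed score is unlikely to arise from composition alone.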
Evolution of Sphingomonad Gene Clusters Related to Pesticide Catabolism Revealed by Genome Sequence and Mobilomics of Sphingobium herbicidovorans MH.

PubMed

Nielsen, Tue Kjærgaard; Rasmussen, Morten; Demanèche, Sandrine; Cecillon, Sébastien; Vogel, Timothy M; Hansen, Lars Hestbjerg

2017-09-01

Bacterial degraders of chlorophenoxy herbicides have been isolated from various ecosystems, including pristine environments. Among these degraders, the sphingomonads constitute a prominent group that displays versatile xenobiotic-degradation capabilities. Four separate sequencing strategies were required to provide the complete sequence of the complex and plastic genome of the canonical chlorophenoxy herbicide-degrading Sphingobium herbicidovorans MH. The genome has an intricate organization of the chlorophenoxy-herbicide catabolic genes sdpA, rdpA, and cadABCD that encode the (R)- and (S)-enantiomer-specific 2,4-dichlorophenoxypropionate dioxygenases and four subunits of a Rieske non-heme iron oxygenase involved in 2-methyl-chlorophenoxyacetic acid degradation, respectively. Several major genomic rearrangements are proposed to help understand the evolution and mobility of these important genes and their genetic context. Single-strain mobilomic sequence analysis uncovered plasmids and insertion sequence-associated circular intermediates in this environmentally important bacterium and enabled the description of evolutionary models for pesticide degradation in strain MH and related organisms. The mobilome presented a complex mosaic of mobile genetic elements including four plasmids and several circular intermediate DNA molecules of insertion-sequence elements and transposons that are central to the evolution of xenobiotics degradation. Furthermore, two individual chromosomally integrated prophages were shown to excise and form free circular DNA molecules. This approach holds great potential for improving the understanding of genome plasticity, evolution, and microbial ecology. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Are there laws of genome evolution?

PubMed

Koonin, Eugene V

2011-08-01

Research in quantitative evolutionary genomics and systems biology led to the discovery of several universal regularities connecting genomic and molecular phenomic variables. These universals include the log-normal distribution of the evolutionary rates of orthologous genes; the power law-like distributions of paralogous family size and node degree in various biological networks; the negative correlation between a gene's sequence evolution rate and expression level; and differential scaling of functional classes of genes with genome size. The universals of genome evolution can be accounted for by simple mathematical models similar to those used in statistical physics, such as the birth-death-innovation model. These models do not explicitly incorporate selection; therefore, the observed universal regularities do not appear to be shaped by selection but rather are emergent properties of gene ensembles. Although a complete physical theory of evolutionary biology is inconceivable, the universals of genome evolution might qualify as "laws of evolutionary genomics" in the same sense "law" is understood in modern physics.
String Mining in Bioinformatics

NASA Astrophysics Data System (ADS)

Abouelhoda, Mohamed; Ghanem, Moustafa

Sequence analysis is a major area in bioinformatics encompassing the methods and techniques for studying the biological sequences, DNA, RNA, and proteins, on the linear structure level. The focus of this area is generally on the identification of intra- and inter-molecular similarities. Identifying intra-molecular similarities boils down to detecting repeated segments within a given sequence, while identifying inter-molecular similarities amounts to spotting common segments among two or multiple sequences. From a data mining point of view, sequence analysis is nothing but string- or pattern mining specific to biological strings. For a long time, however, this point of view has been explicitly embraced neither in the data mining nor in the sequence analysis textbooks, which may be attributed to the co-evolution of the two apparently independent fields. In other words, although the word "data-mining" is almost missing in the sequence analysis literature, its basic concepts have been implicitly applied. Interestingly, recent research in biological sequence analysis introduced efficient solutions to many problems in data mining, such as querying and analyzing time series [49,53], extracting information from web pages [20], fighting spam mails [50], detecting plagiarism [22], and spotting duplications in software systems [14].
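The two tasks named in this abstract, detecting repeated segments within one sequence and spotting common segments among several, can be sketched with a simple fixed-length (k-mer) index. This is an illustrative assumption, not the chapter's method: practical tools typically rely on suffix trees or suffix arrays to find repeats of arbitrary length, and the function names below are hypothetical.

```python
from collections import defaultdict


def repeated_kmers(seq, k):
    """Map each length-k segment occurring more than once to its start positions."""
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return {kmer: pos for kmer, pos in positions.items() if len(pos) > 1}


def shared_kmers(seqs, k):
    """Return the length-k segments present in every sequence of seqs."""
    kmer_sets = [
        {s[i:i + k] for i in range(len(s) - k + 1)} for s in seqs
    ]
    return set.intersection(*kmer_sets) if kmer_sets else set()


if __name__ == "__main__":
    # Intra-molecular similarity: "ACG" occurs at positions 0 and 4.
    print(repeated_kmers("ACGTACGA", 3))
    # Inter-molecular similarity: "ACG" is common to both sequences.
    print(shared_kmers(["ACGTT", "TTACG"], 3))
```

The dictionary index takes time and memory linear in the sequence length for a fixed k, which is why it is a common first approximation before moving to suffix-based structures.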
Unravelling biology and shifting paradigms in cancer with single-cell sequencing.

PubMed

Baslan, Timour; Hicks, James

2017-08-24

The fundamental operative unit of a cancer is the genetically and epigenetically innovative single cell. Whether proliferating or quiescent, in the primary tumour mass or disseminated elsewhere, single cells govern the parameters that dictate all facets of the biology of cancer. Thus, single-cell analyses provide the ultimate level of resolution in our quest for a fundamental understanding of this disease. Historically, this quest has been hampered by technological shortcomings. In this Opinion article, we argue that the rapidly evolving field of single-cell sequencing has unshackled the cancer research community of these shortcomings. From furthering an elemental understanding of intra-tumoural genetic heterogeneity and cancer genome evolution to illuminating the governing principles of disease relapse and metastasis, we posit that single-cell sequencing promises to unravel the biology of all facets of this disease.
Phylogenetic estimates of diversification rate are affected by molecular rate variation.

PubMed

Duchêne, D A; Hua, X; Bromham, L

2017-10-01

Molecular phylogenies are increasingly being used to investigate the patterns and mechanisms of macroevolution. In particular, node heights in a phylogeny can be used to detect changes in rates of diversification over time. Such analyses rest on the assumption that node heights in a phylogeny represent the timing of diversification events, which in turn rests on the assumption that evolutionary time can be accurately predicted from DNA sequence divergence. But there are many influences on the rate of molecular evolution, which might also influence node heights in molecular phylogenies, and thus affect estimates of diversification rate. In particular, a growing number of studies have revealed an association between the net diversification rate estimated from phylogenies and the rate of molecular evolution. Such an association might, by influencing the relative position of node heights, systematically bias estimates of diversification time. We simulated the evolution of DNA sequences under several scenarios where rates of diversification and molecular evolution vary through time, including models where diversification and molecular evolutionary rates are linked. We show that commonly used methods, including metric-based, likelihood and Bayesian approaches, can have a low power to identify changes in diversification rate when molecular substitution rates vary. Furthermore, the association between the rates of speciation and molecular evolution rate can cause the signature of a slowdown or speedup in speciation rates to be lost or misidentified. These results suggest that the multiple sources of variation in molecular evolutionary rates need to be considered when inferring macroevolutionary processes from phylogenies. © 2017 European Society For Evolutionary Biology. Journal of Evolutionary Biology © 2017 European Society For Evolutionary Biology.
Insights into hominid evolution from the gorilla genome sequence

PubMed Central

Scally, Aylwyn; Dutheil, Julien Y.; Hillier, LaDeana W.; Jordan, Greg E.; Goodhead, Ian; Herrero, Javier; Hobolth, Asger; Lappalainen, Tuuli; Mailund, Thomas; Marques-Bonet, Tomas; McCarthy, Shane; Montgomery, Stephen H.; Schwalie, Petra C.; Tang, Y. Amy; Ward, Michelle C.; Xue, Yali; Yngvadottir, Bryndis; Alkan, Can; Andersen, Lars N.; Ayub, Qasim; Ball, Edward V.; Beal, Kathryn; Bradley, Brenda J.; Chen, Yuan; Clee, Chris M.; Fitzgerald, Stephen; Graves, Tina A.; Gu, Yong; Heath, Paul; Heger, Andreas; Karakoc, Emre; Kolb-Kokocinski, Anja; Laird, Gavin K.; Lunter, Gerton; Meader, Stephen; Mort, Matthew; Mullikin, James C.; Munch, Kasper; O’Connor, Timothy D.; Phillips, Andrew D.; Prado-Martinez, Javier; Rogers, Anthony S.; Sajjadian, Saba; Schmidt, Dominic; Shaw, Katy; Simpson, Jared T.; Stenson, Peter D.; Turner, Daniel J.; Vigilant, Linda; Vilella, Albert J.; Whitener, Weldon; Zhu, Baoli; Cooper, David N.; de Jong, Pieter; Dermitzakis, Emmanouil T.; Eichler, Evan E.; Flicek, Paul; Goldman, Nick; Mundy, Nicholas I.; Ning, Zemin; Odom, Duncan T.; Ponting, Chris P.; Quail, Michael A.; Ryder, Oliver A.; Searle, Stephen M.; Warren, Wesley C.; Wilson, Richard K.; Schierup, Mikkel H.; Rogers, Jane; Tyler-Smith, Chris; Durbin, Richard

2012-01-01

Summary Gorillas are humans’ closest living relatives after chimpanzees, and are of comparable importance for the study of human origins and evolution. Here we present the assembly and analysis of a genome sequence for the western lowland gorilla, and compare the whole genomes of all extant great ape genera. We propose a synthesis of genetic and fossil evidence consistent with placing the human-chimpanzee and human-chimpanzee-gorilla speciation events at approximately 6 and 10 million years ago (Mya). In 30% of the genome, gorilla is closer to human or chimpanzee than the latter are to each other; this is rarer around coding genes, indicating pervasive selection throughout great ape evolution, and has functional consequences in gene expression. A comparison of protein coding genes reveals approximately 500 genes showing accelerated evolution on each of the gorilla, human and chimpanzee lineages, and evidence for parallel acceleration, particularly of genes involved in hearing. We also compare the western and eastern gorilla species, estimating an average sequence divergence time 1.75 million years ago, but with evidence for more recent genetic exchange and a population bottleneck in the eastern species. The use of the genome sequence in these and future analyses will promote a deeper understanding of great ape biology and evolution. PMID:22398555
Sequence diversity and evolution of antimicrobial peptides in invertebrates.

PubMed

Tassanakajon, Anchalee; Somboonwiwat, Kunlaya; Amparyup, Piti

2015-02-01

Antimicrobial peptides (AMPs) are evolutionarily ancient molecules that act as key components of invertebrate innate immunity against invading pathogens. Several AMPs have been identified and characterized in invertebrates, and found to display considerable diversity in their amino acid sequence, structure and biological activity. AMP genes appear to have evolved rapidly, which might have arisen from the co-evolutionary arms race between host and pathogens, and enabled organisms to survive in different microbial environments. Here, the sequence diversity of invertebrate AMPs (defensins, cecropins, crustins and anti-lipopolysaccharide factors) is presented to provide a better understanding of the evolution pattern of these peptides, which play a major role in host defense mechanisms. Copyright © 2014 Elsevier Ltd. All rights reserved.
Evolution and the Distribution of Glutaminyl and Asparaginyl Residues in Proteins

PubMed Central

Robinson, Arthur B.

1974-01-01

Recent experiments on the deamidation of glutaminyl and asparaginyl residues in peptides and proteins support the hypothesis that these residues may serve as molecular clocks that control biological processes. A hypothesis is now offered that suggests that these molecular clocks are set by rejection or accumulation of appropriate sequences of residues including a glutaminyl or asparaginyl residue during evolution. PMID:4522799

Solution to a gene divergence problem under arbitrary stable nucleotide transition probabilities

NASA Technical Reports Server (NTRS)

Holmquist, R.

1976-01-01

A nucleic acid chain, L nucleotides in length, with the specific base sequence B(1)B(2) ... B(L) is defined by the L-dimensional vector B = (B(1), B(2), ..., B(L)). For twelve given constant non-negative transition probabilities that, in a specified position, the base B is replaced by the base B' in a single step, an exact analytical expression is derived for the probability that the position goes from base B to B' in X steps. Assuming that each base mutates independently of the others, an exact expression is derived for the probability that the initial gene sequence B goes to a sequence B' = (B'(1), B'(2), ..., B'(L)) after X = (X(1), X(2), ..., X(L)) base replacements. The resulting equations allow a more precise accounting for the effects of Darwinian natural selection in molecular evolution than does the idealized (biologically less accurate) assumption that each of the four nucleotides is equally likely to mutate to and be fixed as one of the other three. Illustrative applications of the theory to some problems of biological evolution are given.
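The quantities described in this abstract can be computed numerically as a rough illustration. The paper derives exact analytical expressions; the sketch below instead raises a 4×4 single-step matrix to the power X and, under the stated independence assumption, multiplies the per-position probabilities. The matrix values, the zero diagonal (every step treated as a replacement) and all function names are assumptions made for the example.

```python
BASES = "ACGT"

# Hypothetical single-step replacement probabilities (rows: from, columns: to).
# The twelve off-diagonal entries are unequal; transitions (A<->G, C<->T) are
# favoured over transversions. The diagonal is zero by assumption.
P = [
    [0.0, 0.2, 0.6, 0.2],  # from A
    [0.2, 0.0, 0.2, 0.6],  # from C
    [0.6, 0.2, 0.0, 0.2],  # from G
    [0.2, 0.6, 0.2, 0.0],  # from T
]


def mat_mul(p, q):
    """Product of two 4x4 matrices given as nested lists."""
    return [
        [sum(p[i][k] * q[k][j] for k in range(4)) for j in range(4)]
        for i in range(4)
    ]


def mat_pow(p, steps):
    """p raised to a non-negative integer power (identity for steps == 0)."""
    result = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    for _ in range(steps):
        result = mat_mul(result, p)
    return result


def site_probability(p, src, dst, steps):
    """Probability that a position goes from base src to base dst in `steps` steps."""
    return mat_pow(p, steps)[BASES.index(src)][BASES.index(dst)]


def sequence_probability(p, start, end, steps_per_site):
    """Probability that sequence `start` becomes `end`, sites evolving independently."""
    prob = 1.0
    for src, dst, steps in zip(start, end, steps_per_site):
        prob *= site_probability(p, src, dst, steps)
    return prob


if __name__ == "__main__":
    print(site_probability(P, "A", "A", 2))
    print(sequence_probability(P, "AC", "AT", [2, 1]))
```

Because each row of the single-step matrix sums to one, every row of its X-th power does too, which gives a quick sanity check on any matrix supplied.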
The minimal kinome of Giardia lamblia illuminates early kinase evolution and unique parasite biology

PubMed Central

2011-01-01

Background The major human intestinal pathogen Giardia lamblia is a very early branching eukaryote with a minimal genome of broad evolutionary and biological interest. Results To explore early kinase evolution and regulation of Giardia biology, we cataloged the kinomes of three sequenced strains. Comparison with published kinomes and those of the excavates Trichomonas vaginalis and Leishmania major shows that Giardia's 80 core kinases constitute the smallest known core kinome of any eukaryote that can be grown in pure culture, reflecting both its early origin and secondary gene loss. Kinase losses in DNA repair, mitochondrial function, transcription, splicing, and stress response reflect this reduced genome, while the presence of other kinases helps define the kinome of the last common eukaryotic ancestor. Immunofluorescence analysis shows abundant phospho-staining in trophozoites, with phosphotyrosine abundant in the nuclei and phosphothreonine and phosphoserine in distinct cytoskeletal organelles. The Nek kinase family has been massively expanded, accounting for 198 of the 278 protein kinases in Giardia. Most Neks are catalytically inactive, have very divergent sequences and undergo extensive duplication and loss between strains. Many Neks are highly induced during development. We localized four catalytically active Neks to distinct parts of the cytoskeleton and one inactive Nek to the cytoplasm. Conclusions The reduced kinome of Giardia sheds new light on early kinase evolution, and its highly divergent sequences add to the definition of individual kinase families as well as offering specific drug targets. Giardia's massive Nek expansion may reflect its distinctive lifestyle, biphasic life cycle and complex cytoskeleton. PMID:21787419
Cyberinfrastructure for Fusarium (CiF)

USDA-ARS's Scientific Manuscript database

The rapidly increasing number of genome sequences from diverse fungal species and expanding phylogenetic data necessitate highly integrated informatics platforms to adequately support the use of these resources for studying fungal biology and evolution. The long-term goal of Cyberinfrastructure for...
Natural product-inspired cascade synthesis yields modulators of centrosome integrity.

PubMed

Dückert, Heiko; Pries, Verena; Khedkar, Vivek; Menninger, Sascha; Bruss, Hanna; Bird, Alexander W; Maliga, Zoltan; Brockmeyer, Andreas; Janning, Petra; Hyman, Anthony; Grimme, Stefan; Schürmann, Markus; Preut, Hans; Hübel, Katja; Ziegler, Slava; Kumar, Kamal; Waldmann, Herbert

2011-12-25

In biology-oriented synthesis, the scaffolds of biologically relevant compound classes inspire the synthesis of focused compound collections enriched in bioactivity. This criterion is, in particular, met by the scaffolds of natural products selected in evolution. The synthesis of natural product-inspired compound collections calls for efficient reaction sequences that preferably combine multiple individual transformations in one operation. Here we report the development of a one-pot, twelve-step cascade reaction sequence that includes nine different reactions and two opposing kinds of organocatalysis. The cascade sequence proceeds within 10-30 min and transforms readily available substrates into complex indoloquinolizines that resemble the core tetracyclic scaffold of numerous polycyclic indole alkaloids. Biological investigation of a corresponding focused compound collection revealed modulators of centrosome integrity, termed centrocountins, which caused fragmented and supernumerary centrosomes, chromosome congression defects, multipolar mitotic spindles, acentrosomal spindle poles and multipolar cell division by targeting the centrosome-associated proteins nucleophosmin and Crm1.
Next Generation Sequencing Technology and Genomewide Data Analysis: Perspectives for Retinal Research

PubMed Central

Chaitankar, Vijender; Karakülah, Gökhan; Ratnapriya, Rinki; Giuste, Felipe O.; Brooks, Matthew J.; Swaroop, Anand

2016-01-01

The advent of high throughput next generation sequencing (NGS) has accelerated the pace of discovery of disease-associated genetic variants and genomewide profiling of expressed sequences and epigenetic marks, thereby permitting systems-based analyses of ocular development and disease. Rapid evolution of NGS and associated methodologies presents significant challenges in acquisition, management, and analysis of large data sets and for extracting biologically or clinically relevant information. Here we illustrate the basic design of commonly used NGS-based methods, specifically whole exome sequencing, transcriptome, and epigenome profiling, and provide recommendations for data analyses. We briefly discuss systems biology approaches for integrating multiple data sets to elucidate gene regulatory or disease networks. While we provide examples from the retina, the NGS guidelines reviewed here are applicable to other tissues/cell types as well. PMID:27297499
Greater than the sum of its parts: single-nucleus sequencing identifies convergent evolution of independent EGFR mutants in GBM.

PubMed

Gini, Beatrice; Mischel, Paul S

2014-08-01

Single-cell sequencing approaches are needed to characterize the genomic diversity of complex tumors, shedding light on their evolutionary paths and potentially suggesting more effective therapies. In this issue of Cancer Discovery, Francis and colleagues develop a novel integrative approach to identify distinct tumor subpopulations based on joint detection of clonal and subclonal events from bulk tumor and single-nucleus whole-genome sequencing, allowing them to infer a subclonal architecture. Surprisingly, the authors identify convergent evolution of multiple, mutually exclusive, independent EGFR gain-of-function variants in a single tumor. This study demonstrates the value of integrative single-cell genomics and highlights the biologic primacy of EGFR as an actionable target in glioblastoma. ©2014 American Association for Cancer Research.
Microsporidian genome analysis reveals evolutionary strategies for obligate intracellular growth

USDA-ARS's Scientific Manuscript database

Microsporidia comprise a large phylum of obligate intracellular eukaryotes that are fungal-related parasites responsible for widespread disease, and here we address questions about microsporidia biology and evolution. We sequenced three microsporidian genomes from two species, Nematocida parisii and...
An overview on genome organization of marine organisms.

PubMed

Costantini, Maria

2015-12-01

In this review we concentrate on some general genome features of marine organisms and their evolution, ranging from vertebrates to invertebrates to unicellular organisms. Before genome sequencing, ultracentrifugation in CsCl resolved mammalian DNA at high resolution (without looking at the sequence). The analytical profile of human DNA showed that the vertebrate genome is a mosaic of isochores, typically megabase-size DNA segments that belong to a small number of families characterized by different GC levels. The recent availability of a number of fully sequenced genomes allowed the isochores to be mapped very precisely on the basis of DNA sequences. Since isochores are tightly linked to biological properties such as gene density, replication timing and recombination, the new level of detail provided by the isochore map helped the understanding of genome structure, function and evolution. This led to the current level of knowledge and to further insights. Copyright © 2015. Published by Elsevier B.V.
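The isochore concept in the record above rests on GC level measured over long stretches of DNA. As a minimal sketch of that measurement, assuming fixed, non-overlapping windows (a simplification; the function name, window size and toy sequence are illustrative and are not taken from the review), GC fractions could be computed like this:

```python
def gc_by_window(seq, window):
    """Return the GC fraction of each full, non-overlapping window of seq."""
    seq = seq.upper()
    fractions = []
    # Step through the sequence one window at a time; a trailing
    # partial window is ignored.
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        fractions.append(gc / window)
    return fractions


# Toy input (hypothetical): a GC-poor stretch followed by a GC-rich stretch.
toy = "ATATATATAT" + "GCGCGCGCGC"
print(gc_by_window(toy, 10))  # [0.0, 1.0]
```

Real isochore maps work on megabase-scale segments and group windows into GC families; the sketch only shows the per-window quantity that such maps are built from.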
Toward a theory of multilevel evolution: long-term information integration shapes the mutational landscape and enhances evolvability.

PubMed

Hogeweg, Paulien

2012-01-01

Most of evolutionary theory has abstracted away from how information is coded in the genome and how this information is transformed into traits on which selection takes place. While in the earliest stages of biological evolution, in the RNA world, the mapping from the genotype into function was largely predefined by the physical-chemical properties of the evolving entities (RNA replicators, e.g. from sequence to folded structure and catalytic sites), in present-day organisms, the mapping itself is the result of evolution. I will review results of several in silico evolutionary studies which examine the consequences of evolving the genetic coding, and the ways this information is transformed, while adapting to prevailing environments. Such multilevel evolution leads to long-term information integration. Through genome, network, and dynamical structuring, the occurrence and/or effect of random mutations becomes nonrandom, and facilitates rapid adaptation. This is what does happen in the in silico experiments. Is it also what did happen in biological evolution? I will discuss some data that suggest that it did. In any case, these results provide us with novel search images to tackle the wealth of biological data.
The spotted gar genome illuminates vertebrate evolution and facilitates human-teleost comparisons.

PubMed

Braasch, Ingo; Gehrke, Andrew R; Smith, Jeramiah J; Kawasaki, Kazuhiko; Manousaki, Tereza; Pasquier, Jeremy; Amores, Angel; Desvignes, Thomas; Batzel, Peter; Catchen, Julian; Berlin, Aaron M; Campbell, Michael S; Barrell, Daniel; Martin, Kyle J; Mulley, John F; Ravi, Vydianathan; Lee, Alison P; Nakamura, Tetsuya; Chalopin, Domitille; Fan, Shaohua; Wcisel, Dustin; Cañestro, Cristian; Sydes, Jason; Beaudry, Felix E G; Sun, Yi; Hertel, Jana; Beam, Michael J; Fasold, Mario; Ishiyama, Mikio; Johnson, Jeremy; Kehr, Steffi; Lara, Marcia; Letaw, John H; Litman, Gary W; Litman, Ronda T; Mikami, Masato; Ota, Tatsuya; Saha, Nil Ratan; Williams, Louise; Stadler, Peter F; Wang, Han; Taylor, John S; Fontenot, Quenton; Ferrara, Allyse; Searle, Stephen M J; Aken, Bronwen; Yandell, Mark; Schneider, Igor; Yoder, Jeffrey A; Volff, Jean-Nicolas; Meyer, Axel; Amemiya, Chris T; Venkatesh, Byrappa; Holland, Peter W H; Guiguen, Yann; Bobe, Julien; Shubin, Neil H; Di Palma, Federica; Alföldi, Jessica; Lindblad-Toh, Kerstin; Postlethwait, John H

2016-04-01

To connect human biology to fish biomedical models, we sequenced the genome of spotted gar (Lepisosteus oculatus), whose lineage diverged from teleosts before teleost genome duplication (TGD). The slowly evolving gar genome has conserved in content and size many entire chromosomes from bony vertebrate ancestors. Gar bridges teleosts to tetrapods by illuminating the evolution of immunity, mineralization and development (mediated, for example, by Hox, ParaHox and microRNA genes). Numerous conserved noncoding elements (CNEs; often cis regulatory) undetectable in direct human-teleost comparisons become apparent using gar: functional studies uncovered conserved roles for such cryptic CNEs, facilitating annotation of sequences identified in human genome-wide association studies. Transcriptomic analyses showed that the sums of expression domains and expression levels for duplicated teleost genes often approximate the patterns and levels of expression for gar genes, consistent with subfunctionalization. The gar genome provides a resource for understanding evolution after genome duplication, the origin of vertebrate genomes and the function of human regulatory sequences.
The spotted gar genome illuminates vertebrate evolution and facilitates human-to-teleost comparisons

PubMed Central

Braasch, Ingo; Gehrke, Andrew R.; Smith, Jeramiah J.; Kawasaki, Kazuhiko; Manousaki, Tereza; Pasquier, Jeremy; Amores, Angel; Desvignes, Thomas; Batzel, Peter; Catchen, Julian; Berlin, Aaron M.; Campbell, Michael S.; Barrell, Daniel; Martin, Kyle J.; Mulley, John F.; Ravi, Vydianathan; Lee, Alison P.; Nakamura, Tetsuya; Chalopin, Domitille; Fan, Shaohua; Wcisel, Dustin; Cañestro, Cristian; Sydes, Jason; Beaudry, Felix E. G.; Sun, Yi; Hertel, Jana; Beam, Michael J.; Fasold, Mario; Ishiyama, Mikio; Johnson, Jeremy; Kehr, Steffi; Lara, Marcia; Letaw, John H.; Litman, Gary W.; Litman, Ronda T.; Mikami, Masato; Ota, Tatsuya; Saha, Nil Ratan; Williams, Louise; Stadler, Peter F.; Wang, Han; Taylor, John S.; Fontenot, Quenton; Ferrara, Allyse; Searle, Stephen M. J.; Aken, Bronwen; Yandell, Mark; Schneider, Igor; Yoder, Jeffrey A.; Volff, Jean-Nicolas; Meyer, Axel; Amemiya, Chris T.; Venkatesh, Byrappa; Holland, Peter W. H.; Guiguen, Yann; Bobe, Julien; Shubin, Neil H.; Di Palma, Federica; Alföldi, Jessica; Lindblad-Toh, Kerstin; Postlethwait, John H.

2016-01-01

To connect human biology to fish biomedical models, we sequenced the genome of spotted gar (Lepisosteus oculatus), whose lineage diverged from teleosts before the teleost genome duplication (TGD). The slowly evolving gar genome conserved in content and size many entire chromosomes from bony vertebrate ancestors. Gar bridges teleosts to tetrapods by illuminating the evolution of immunity, mineralization, and development (e.g., Hox, ParaHox, and miRNA genes). Numerous conserved non-coding elements (CNEs, often cis-regulatory) undetectable in direct human-teleost comparisons become apparent using gar: functional studies uncovered conserved roles of such cryptic CNEs, facilitating annotation of sequences identified in human genome-wide association studies. Transcriptomic analyses revealed that the sum of expression domains and levels from duplicated teleost genes often approximate patterns and levels of gar genes, consistent with subfunctionalization. The gar genome provides a resource for understanding evolution after genome duplication, the origin of vertebrate genomes, and the function of human regulatory sequences. PMID:26950095
Whole-genome sequencing of Staphylococcus haemolyticus uncovers the extreme plasticity of its genome and the evolution of human-colonizing staphylococcal species.

PubMed

Takeuchi, Fumihiko; Watanabe, Shinya; Baba, Tadashi; Yuzawa, Harumi; Ito, Teruyo; Morimoto, Yuh; Kuroda, Makoto; Cui, Longzhu; Takahashi, Mikio; Ankai, Akiho; Baba, Shin-ichi; Fukui, Shigehiro; Lee, Jean C; Hiramatsu, Keiichi

2005-11-01

Staphylococcus haemolyticus is an opportunistic bacterial pathogen that colonizes human skin and is remarkable for its highly antibiotic-resistant phenotype. We determined the complete genome sequence of S. haemolyticus to better understand its pathogenicity and evolutionary relatedness to the other staphylococcal species. A large proportion of the open reading frames in the genomes of S. haemolyticus, Staphylococcus aureus, and Staphylococcus epidermidis were conserved in their sequence and order on the chromosome. We identified a region of the bacterial chromosome just downstream of the origin of replication that showed little homology among the species but was conserved among strains within a species. This novel region, designated the "oriC environ," likely contributes to the evolution and differentiation of the staphylococcal species, since it was enriched for species-specific nonessential genes that contribute to the biological features of each staphylococcal species. A comparative analysis of the genomes of S. haemolyticus, S. aureus, and S. epidermidis elucidated differences in their biological and genetic characteristics and pathogenic potentials. We identified as many as 82 insertion sequences in the S. haemolyticus chromosome that probably mediated frequent genomic rearrangements, resulting in phenotypic diversification of the strain. Such rearrangements could have brought genomic plasticity to this species and contributed to its acquisition of antibiotic resistance.
A new method to cluster genomes based on cumulative Fourier power spectrum.

PubMed

Dong, Rui; Zhu, Ziyue; Yin, Changchuan; He, Rong L; Yau, Stephen S-T

2018-06-20

Analyzing phylogenetic relationships using mathematical methods has always been of importance in bioinformatics. Quantitative research may interpret the raw biological data in a precise way. Multiple Sequence Alignment (MSA) is used frequently to analyze biological evolution, but is very time-consuming. When the scale of data is large, alignment methods cannot finish calculation in reasonable time. Therefore, we present a new method using moments of cumulative Fourier power spectrum in clustering the DNA sequences. Each sequence is translated into a vector in Euclidean space. Distances between the vectors can reflect the relationships between sequences. The mapping between the spectra and moment vector is one-to-one, which means that no information is lost in the power spectra during the calculation. We cluster and classify several datasets including Influenza A, primates, and human rhinovirus (HRV) datasets to build up the phylogenetic trees. Results show that the newly proposed cumulative Fourier power spectrum method is much faster and more accurate than MSA and another alignment-free method known as k-mer. The research provides us new insights into the study of phylogeny, evolution, and efficient DNA comparison algorithms for large genomes. The computer programs of the cumulative Fourier power spectrum are available at GitHub (https://github.com/YaulabTsinghua/cumulative-Fourier-power-spectrum). Copyright © 2018. Published by Elsevier B.V.
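The record above describes translating each DNA sequence into a vector and comparing vectors by Euclidean distance. The sketch below illustrates that general idea only. It is not the authors' program: the per-base 0/1 indicator encoding, the normalization, the moment definition and every function name are assumptions made for illustration, and the authors' actual definitions are in the paper and the linked repository.

```python
import cmath
import math


def power_spectrum(indicator):
    """Power spectrum |U(k)|^2 of a 0/1 indicator sequence (naive O(n^2) DFT)."""
    n = len(indicator)
    spectrum = []
    for k in range(n):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(indicator))
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def feature_vector(seq, n_moments=3):
    """Map a DNA sequence to 4 * n_moments numbers (assumed moment definition)."""
    seq = seq.upper()
    if len(seq) < 2:
        raise ValueError("sequence must contain at least two bases")
    features = []
    for base in "ACGT":
        indicator = [1 if c == base else 0 for c in seq]
        ps = power_spectrum(indicator)[1:]  # drop k = 0, which only counts the base
        cumulative, running = [], 0.0
        for p in ps:
            running += p
            cumulative.append(running)
        # Normalize by the total power; treat a numerically zero total as 1.
        total = cumulative[-1] if cumulative[-1] > 1e-12 else 1.0
        normalized = [c / total for c in cumulative]
        # j-th "moment": mean of the j-th powers of the normalized cumulative spectrum.
        for j in range(1, n_moments + 1):
            features.append(sum(v ** j for v in normalized) / len(normalized))
    return features


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Toy sequences (hypothetical); because the vector length is fixed,
# sequences of different lengths can also be compared.
print(euclidean(feature_vector("AACC"), feature_vector("ACAC")))
```

A pairwise distance matrix built this way could then be passed to any standard clustering or tree-building routine. The naive DFT is used only to keep the sketch dependency-free; a fast Fourier transform would be the natural choice for genome-scale input.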
Distinct biological subtypes and patterns of genome evolution in lymphoma revealed by circulating tumor DNA.

PubMed

Scherer, Florian; Kurtz, David M; Newman, Aaron M; Stehr, Henning; Craig, Alexander F M; Esfahani, Mohammad Shahrokh; Lovejoy, Alexander F; Chabon, Jacob J; Klass, Daniel M; Liu, Chih Long; Zhou, Li; Glover, Cynthia; Visser, Brendan C; Poultsides, George A; Advani, Ranjana H; Maeda, Lauren S; Gupta, Neel K; Levy, Ronald; Ohgami, Robert S; Kunder, Christian A; Diehn, Maximilian; Alizadeh, Ash A

2016-11-09

Patients with diffuse large B cell lymphoma (DLBCL) exhibit marked diversity in tumor behavior and outcomes, yet the identification of poor-risk groups remains challenging. In addition, the biology underlying these differences is incompletely understood. We hypothesized that characterization of mutational heterogeneity and genomic evolution using circulating tumor DNA (ctDNA) profiling could reveal molecular determinants of adverse outcomes. To address this hypothesis, we applied cancer personalized profiling by deep sequencing (CAPP-Seq) analysis to tumor biopsies and cell-free DNA samples from 92 lymphoma patients and 24 healthy subjects. At diagnosis, the amount of ctDNA was found to strongly correlate with clinical indices and was independently predictive of patient outcomes. We demonstrate that ctDNA genotyping can classify transcriptionally defined tumor subtypes, including DLBCL cell of origin, directly from plasma. By simultaneously tracking multiple somatic mutations in ctDNA, our approach outperformed immunoglobulin sequencing and radiographic imaging for the detection of minimal residual disease and facilitated noninvasive identification of emergent resistance mutations to targeted therapies. In addition, we identified distinct patterns of clonal evolution distinguishing indolent follicular lymphomas from those that transformed into DLBCL, allowing for potential noninvasive prediction of histological transformation. Collectively, our results demonstrate that ctDNA analysis reveals biological factors that underlie lymphoma clinical outcomes and could facilitate individualized therapy. Copyright © 2016, American Association for the Advancement of Science.
Delayed Gratification Habitable Zones: When Deep Outer Solar System Regions Become Balmy During Post-Main Sequence Stellar Evolution

NASA Astrophysics Data System (ADS)

Stern, S. Alan

2003-06-01

Like all low- and moderate-mass stars, the Sun will burn as a red giant during its later evolution, generating of solar luminosities for some tens of millions of years. During this post-main sequence phase, the habitable (i.e., liquid water) thermal zone of our Solar System will lie in the region where Triton, Pluto-Charon, and Kuiper Belt objects orbit. Compared with the 1 AU habitable zone where Earth resides, this "delayed gratification habitable zone" (DGHZ) will enjoy a far less biologically hazardous environment - with lower harmful radiation levels from the Sun, and a far less destructive collisional environment. Objects like Triton, Pluto-Charon, and Kuiper Belt objects, which are known to be rich in both water and organics, will then become possible sites for biochemical and perhaps even biological evolution. The Kuiper Belt, with >10^5 objects ≥50 km in radius and more than three times the combined surface area of the four terrestrial planets, provides numerous sites for possible evolution once the Sun's DGHZ reaches it. The Sun's DGHZ might be thought to only be of academic interest owing to its great separation from us in time. However, ~10^9 Milky Way stars burn as luminous red giants today. Thus, if icy-organic objects are common in the 20-50 AU zones of these stars, as they are in our Solar System (and as inferred in numerous main sequence stellar disk systems), then DGHZs may form a niche type of habitable zone that is likely to be numerically common in the Galaxy.
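The link between a star's luminosity and the distance of its liquid-water zone can be illustrated with a simple equal-flux argument: stellar flux falls off as the inverse square of distance, so the distance that receives the same flux as Earth does today scales as the square root of luminosity. The sketch below assumes only that scaling. It ignores albedo, atmospheres and changes in the stellar spectrum, and the function names are illustrative rather than taken from the paper.

```python
import math


def equal_flux_distance_au(luminosity_lsun, reference_au=1.0):
    """Distance (AU) receiving the flux that reference_au receives at 1 L_sun."""
    return reference_au * math.sqrt(luminosity_lsun)


def luminosity_for_distance(distance_au, reference_au=1.0):
    """Luminosity (L_sun) at which distance_au receives present-day Earth-like flux."""
    return (distance_au / reference_au) ** 2


# Luminosities that would place the equal-flux distance in the 20-50 AU zone.
for d in (20, 30, 40, 50):
    print(d, "AU ->", luminosity_for_distance(d), "L_sun")
```

Under this assumption, the 20-50 AU zone mentioned in the abstract corresponds to luminosities of roughly 400 to 2500 times the present solar value.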
Delayed gratification habitable zones: when deep outer solar system regions become balmy during post-main sequence stellar evolution.

PubMed

Stern, S Alan

2003-01-01

Like all low- and moderate-mass stars, the Sun will burn as a red giant during its later evolution, generating of solar luminosities for some tens of millions of years. During this post-main sequence phase, the habitable (i.e., liquid water) thermal zone of our Solar System will lie in the region where Triton, Pluto-Charon, and Kuiper Belt objects orbit. Compared with the 1 AU habitable zone where Earth resides, this "delayed gratification habitable zone" (DGHZ) will enjoy a far less biologically hazardous environment - with lower harmful radiation levels from the Sun, and a far less destructive collisional environment. Objects like Triton, Pluto-Charon, and Kuiper Belt objects, which are known to be rich in both water and organics, will then become possible sites for biochemical and perhaps even biological evolution. The Kuiper Belt, with >10^5 objects ≥50 km in radius and more than three times the combined surface area of the four terrestrial planets, provides numerous sites for possible evolution once the Sun's DGHZ reaches it. The Sun's DGHZ might be thought to only be of academic interest owing to its great separation from us in time. However, approximately 10^9 Milky Way stars burn as luminous red giants today. Thus, if icy-organic objects are common in the 20-50 AU zones of these stars, as they are in our Solar System (and as inferred in numerous main sequence stellar disk systems), then DGHZs may form a niche type of habitable zone that is likely to be numerically common in the Galaxy.
Cofactors in the RNA World

NASA Technical Reports Server (NTRS)

Ditzler, Mark A.

2014-01-01

RNA world theories figure prominently in many scenarios for the origin and early evolution of life. These theories posit that RNA molecules played a much larger role in ancient biology than they do now, acting both as the dominant biocatalysts and as the repository of genetic information. Many features of modern RNA biology are potential examples of molecular fossils from an RNA world, such as the pervasive involvement of nucleotides in coenzymes, the existence of natural aptamers that bind these coenzymes, the existence of natural ribozymes, a biosynthetic pathway in which deoxynucleotides are produced from ribonucleotides, and the central role of ribosomal RNA in protein synthesis in the peptidyl transferase center of the ribosome. Here, we use both a top-down approach that evaluates RNA function in modern biology and a bottom-up approach that examines the capacities of RNA independent of modern biology. These complementary approaches exploit multiple in vitro evolution techniques coupled with high-throughput sequencing and bioinformatics analysis. Together these complementary approaches advance our understanding of the most primitive organisms, their early evolution, and their eventual transition to modern biochemistry.
Genomic analysis of expressed sequence tags in American black bear Ursus americanus

PubMed Central

2010-01-01

Background Species of the bear family (Ursidae) are important organisms for research in molecular evolution, comparative physiology and conservation biology, but relatively little genetic sequence information is available for this group. Here we report the development and analyses of the first large scale Expressed Sequence Tag (EST) resource for the American black bear (Ursus americanus). Results Comprehensive analyses of molecular functions, alternative splicing, and tissue-specific expression of 38,757 black bear EST sequences were conducted using the dog genome as a reference. We identified 18 genes, involved in functions such as lipid catabolism, cell cycle, and vesicle-mediated transport, that are showing rapid evolution in the bear lineage. Three genes, Phospholamban (PLN), cysteine glycine-rich protein 3 (CSRP3) and Troponin I type 3 (TNNI3), are related to heart contraction, and defects in these genes in humans lead to heart disease. Two genes, biphenyl hydrolase-like (BPHL) and CSRP3, contain positively selected sites in bear. Global analysis of evolution rates of hibernation-related genes in bear showed that they are largely conserved and slowly evolving genes, rather than novel and fast-evolving genes. Conclusion We provide a genomic resource for an important mammalian organism and our study sheds new light on the possible functions and evolution of bear genes. PMID:20338065
Genomic analysis of expressed sequence tags in American black bear Ursus americanus.

PubMed

Zhao, Sen; Shao, Chunxuan; Goropashnaya, Anna V; Stewart, Nathan C; Xu, Yichi; Tøien, Øivind; Barnes, Brian M; Fedorov, Vadim B; Yan, Jun

2010-03-26

Species of the bear family (Ursidae) are important organisms for research in molecular evolution, comparative physiology and conservation biology, but relatively little genetic sequence information is available for this group. Here we report the development and analyses of the first large scale Expressed Sequence Tag (EST) resource for the American black bear (Ursus americanus). Comprehensive analyses of molecular functions, alternative splicing, and tissue-specific expression of 38,757 black bear EST sequences were conducted using the dog genome as a reference. We identified 18 genes, involved in functions such as lipid catabolism, cell cycle, and vesicle-mediated transport, that are showing rapid evolution in the bear lineage. Three genes, Phospholamban (PLN), cysteine glycine-rich protein 3 (CSRP3) and Troponin I type 3 (TNNI3), are related to heart contraction, and defects in these genes in humans lead to heart disease. Two genes, biphenyl hydrolase-like (BPHL) and CSRP3, contain positively selected sites in bear. Global analysis of evolution rates of hibernation-related genes in bear showed that they are largely conserved and slowly evolving genes, rather than novel and fast-evolving genes. We provide a genomic resource for an important mammalian organism and our study sheds new light on the possible functions and evolution of bear genes.
Evolution and Diversity of the Human Hepatitis D Virus Genome

PubMed Central

Huang, Chi-Ruei; Lo, Szecheng J.

2010-01-01

Human hepatitis delta virus (HDV) is the smallest RNA virus in terms of genome size. The HDV genome is divided into a viroid-like sequence and a protein-coding sequence, which could have originated from different sources, and the HDV genome was eventually constituted through RNA recombination. The genome subsequently diversified through accumulation of mutations, selected by interactions of the mutated RNA and proteins with host factors that allow infectious virions to form successfully. Therefore, we propose that the conservation of the HDV nucleotide sequence is closely related to its functionality. Genome analysis of known HDV isolates shows that the C-terminal coding sequences of the large delta antigen (LDAg) are more diverse than other regions of the protein-coding sequence, yet variants that retain the biological function of interacting with the clathrin heavy chain can be selected and maintained. Since viruses interact with many host factors, including those involved in escaping the host immune response, designing a program to predict RNA genome evolution remains a great challenge. PMID:20204073

Mobile element biology – new possibilities with high-throughput sequencing

PubMed Central

Xing, Jinchuan; Witherspoon, David J.; Jorde, Lynn B.

2014-01-01

Mobile elements compose more than half of the human genome, but until recently their large-scale detection was time-consuming and challenging. With the development of new high-throughput sequencing technologies, the complete spectrum of mobile element variation in humans can now be identified and analyzed. Thousands of new mobile element insertions have been discovered, yielding new insights into mobile element biology, evolution, and genomic variation. We review several high-throughput methods, with an emphasis on techniques that specifically target mobile element insertions in humans, and we highlight recent applications of these methods in evolutionary studies and in the analysis of somatic alterations in human cancers. PMID:23312846
Constraints in cancer evolution.

PubMed

Venkatesan, Subramanian; Birkbak, Nicolai J; Swanton, Charles

2017-02-08

Next-generation deep genome sequencing has only recently allowed us to quantitatively dissect the extent of heterogeneity within a tumour, resolving patterns of cancer evolution. Intratumour heterogeneity and natural selection contribute to resistance to anticancer therapies in the advanced setting. Recent evidence has also revealed that cancer evolution might be constrained. In this review, we discuss the origins of intratumour heterogeneity and subsequently focus on constraints imposed upon cancer evolution. The presence of (1) parallel evolution, (2) convergent evolution and (3) the biological impact of acquiring mutations in specific orders suggest that cancer evolution may be exploitable. These constraints on cancer evolution may help us identify cancer evolutionary rule books, which could eventually inform both diagnostic and therapeutic approaches to improve survival outcomes. © 2017 The Author(s); published by Portland Press Limited on behalf of the Biochemical Society.
Genome-wide survey of the seagrass Zostera muelleri suggests modification of the ethylene signalling network.

PubMed

Golicz, Agnieszka A; Schliep, Martin; Lee, Huey Tyng; Larkum, Anthony W D; Dolferus, Rudy; Batley, Jacqueline; Chan, Chon-Kit Kenneth; Sablok, Gaurav; Ralph, Peter J; Edwards, David

2015-03-01

Seagrasses are flowering plants which grow fully submerged in the marine environment. They have evolved a range of adaptations to environmental challenges including light attenuation through water, the physical stress of wave action and tidal currents, high concentrations of salt, oxygen deficiency in marine sediment, and water-borne pollination. Although seagrasses are keystone species of coastal ecosystems, many questions regarding seagrass biology and evolution remain unanswered. Genome sequence data for the widespread Australian seagrass species Zostera muelleri were generated and the unassembled data were compared with the annotated genes of five sequenced plant species (Arabidopsis thaliana, Oryza sativa, Phoenix dactylifera, Musa acuminata, and Spirodela polyrhiza). Genes which are conserved between Z. muelleri and the five plant species were identified, together with genes that have been lost in Z. muelleri. The effect of gene loss on biological processes was assessed on the gene ontology classification level. Gene loss in Z. muelleri appears to influence some core biological processes such as ethylene biosynthesis. This study provides a foundation for further studies of seagrass evolution as well as the hormonal regulation of plant growth and development. © The Author 2015. Published by Oxford University Press on behalf of the Society for Experimental Biology.
Ginkgo and Welwitschia Mitogenomes Reveal Extreme Contrasts in Gymnosperm Mitochondrial Evolution.

PubMed

Guo, Wenhu; Grewe, Felix; Fan, Weishu; Young, Gregory J; Knoop, Volker; Palmer, Jeffrey D; Mower, Jeffrey P

2016-06-01

Mitochondrial genomes (mitogenomes) of flowering plants are well known for their extreme diversity in size, structure, gene content, and rates of sequence evolution and recombination. In contrast, little is known about mitogenomic diversity and evolution within gymnosperms. Only a single complete genome sequence is available, from the cycad Cycas taitungensis, while limited information is available for the one draft sequence, from Norway spruce (Picea abies). To examine mitogenomic evolution in gymnosperms, we generated complete genome sequences for the ginkgo tree (Ginkgo biloba) and a gnetophyte (Welwitschia mirabilis). There is great disparity in size, sequence conservation, levels of shared DNA, and functional content among gymnosperm mitogenomes. The Cycas and Ginkgo mitogenomes are relatively small, have low substitution rates, and possess numerous genes, introns, and edit sites; we infer that these properties were present in the ancestral seed plant. By contrast, the Welwitschia mitogenome has an expanded size coupled with accelerated substitution rates and extensive loss of these functional features. The Picea genome has expanded further, to more than 4 Mb. With regard to structural evolution, the Cycas and Ginkgo mitogenomes share a remarkable amount of intergenic DNA, which may be related to the limited recombinational activity detected at repeats in Ginkgo. Conversely, the Welwitschia mitogenome shares almost no intergenic DNA with any other seed plant. By conducting the first measurements of rates of DNA turnover in seed plant mitogenomes, we discovered that turnover rates vary by orders of magnitude among species. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved.
An Evolutionary/Biochemical Connection Between Promoter- and Primer-Dependent Polymerases Revealed by Selective Evolution of Ligands by Exponential Enrichment (SELEX).

PubMed

Fenstermacher, Katherine J; Achuthan, Vasudevan; Schneider, Thomas D; DeStefano, Jeffrey J

2018-01-16

DNA polymerases (DNAPs) recognize 3' recessed termini on duplex DNA and carry out nucleotide catalysis. Unlike promoter-specific RNA polymerases (RNAPs), DNAPs require no sequence specificity for binding or initiation of catalysis. Despite this, previous results indicate that viral reverse transcriptases bind much more tightly to DNA primers that mimic the polypurine tract. In the current report, primer sequences that bind with high affinity to Taq and Klenow polymerases were identified using a modified Selective Evolution of Ligands by Exponential Enrichment (SELEX) approach. Two Taq-specific primers that bound ∼10 (Taq1) and over 100 (Taq2) times more stably than controls to Taq were identified. Taq1 contained 8 nucleotides (5'-CACTAAAG-3') that matched the phage T3 RNAP "core" promoter. Both primers dramatically outcompeted primers with similar binding thermodynamics in PCR reactions. Similarly, exonuclease-minus Klenow polymerase also selected a high-affinity primer that contained a related core promoter sequence from phage T7 RNAP (5'-ACTATAG-3'). For both Taq and Klenow, even small modifications to the sequence resulted in large losses in binding affinity, suggesting that binding was highly sequence-specific. The results are discussed in the context of possible effects on multi-primer (multiplex) PCR assays, molecular information theory, and the evolution of RNAPs and DNAPs. Importance: This work further demonstrates that primer-dependent DNA polymerases can have strong sequence biases leading to dramatically tighter binding to specific sequences. These may be related to biological function, or be a consequence of the structural architecture of the enzyme. New sequence specificities for Taq and Klenow polymerases were uncovered, and among them were sequences that contained the core promoter elements from T3 and T7 phage RNA polymerase promoters. This suggests the intriguing possibility that phage RNA polymerases exploited intrinsic binding affinities of ancestral DNA polymerases to develop their promoters. Conversely, DNA polymerases could have evolved from related RNA polymerases and retained the intrinsic binding preference despite there being no clear function for such a preference in DNA biology. Copyright © 2018 American Society for Microbiology.
Genomics, transcriptomics and proteomics: enabling insights into social evolution and disease challenges for managed and wild bees.

PubMed

Trapp, Judith; McAfee, Alison; Foster, Leonard J

2017-02-01

Globally, there are over 20 000 bee species (Hymenoptera: Apoidea: Anthophila) with a host of biologically fascinating characteristics. Although they have long been studied as models for social evolution, recent challenges to bee health (mainly diseases and pesticides) have gathered the attention of both public and research communities. Genome sequences of twelve bee species are now complete or in progress, facilitating the application of additional 'omic technologies. Here, we review recent developments in honey bee and native bee research in the genomic era. We discuss the progress in genome sequencing and functional annotation, followed by the comparative genomics, proteomics and transcriptomics applications this has enabled regarding social evolution and health. Finally, we end with comments on future challenges in the postgenomic era. © 2016 John Wiley & Sons Ltd.
Update on Genomic Databases and Resources at the National Center for Biotechnology Information.

PubMed

Tatusova, Tatiana

2016-01-01

The National Center for Biotechnology Information (NCBI), as a primary public repository of genomic sequence data, collects and maintains enormous amounts of heterogeneous data. Data for genomes, genes, gene expression, gene variation, gene families, proteins, and protein domains are integrated with the analytical, search, and retrieval resources through the NCBI website, whose text-based search and retrieval system provides a fast and easy way to navigate across diverse biological databases. Comparative genome analysis tools lead to further understanding of evolutionary processes, quickening the pace of discovery. Recent technological innovations have ignited an explosion in genome sequencing that has fundamentally changed our understanding of the biology of living organisms. This huge increase in DNA sequence data presents new challenges for the information management system and the visualization tools. New strategies have been designed to bring order to this genome sequence shockwave and improve the usability of associated data.
Fractal landscapes in biological systems: long-range correlations in DNA and interbeat heart intervals

NASA Technical Reports Server (NTRS)

Stanley, H. E.; Buldyrev, S. V.; Goldberger, A. L.; Hausdorff, J. M.; Havlin, S.; Mietus, J.; Sciortino, F.; Simons, M.

1992-01-01

Here we discuss recent advances in applying ideas of fractals and disordered systems to two topics of biological interest, both topics having in common the appearance of scale-free phenomena, i.e., correlations that have no characteristic length scale, typically exhibited by physical systems near a critical point and dynamical systems far from equilibrium. (i) DNA nucleotide sequences have traditionally been analyzed using models which incorporate the possibility of short-range nucleotide correlations. We found, instead, a remarkably long-range power law correlation. We found such long-range correlations in intron-containing genes and in non-transcribed regulatory DNA sequences as well as intragenomic DNA, but not in cDNA sequences or intron-less genes. We also found that the myosin heavy chain family gene evolution increases the fractal complexity of the DNA landscapes, consistent with the intron-late hypothesis of gene evolution. (ii) The healthy heartbeat is traditionally thought to be regulated according to the classical principle of homeostasis, whereby physiologic systems operate to reduce variability and achieve an equilibrium-like state. We found, however, that under normal conditions, beat-to-beat fluctuations in heart rate display long-range power law correlations.
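The "DNA landscape" idea in the abstract above can be illustrated with a short sketch. The abstract does not spell out the construction, so the mapping used here (step +1 for a pyrimidine, -1 for a purine) and the function names `dna_walk` and `rms_fluctuation` are assumptions made for illustration, not the authors' code. The sketch builds the cumulative walk and measures how the spread of displacements grows with window length; for an uncorrelated sequence that spread is expected to grow roughly as the square root of the window, and a steeper power law would point to long-range correlation.

```python
import math


def dna_walk(seq):
    """Cumulative walk over a DNA string: +1 for C/T, -1 for A/G (assumed mapping)."""
    positions = [0]
    current = 0
    for base in seq.upper():
        if base in "CT":
            current += 1
        elif base in "AG":
            current -= 1
        else:
            continue  # skip ambiguous symbols such as N
        positions.append(current)
    return positions


def rms_fluctuation(walk, window):
    """Standard deviation of the displacement walk[i + window] - walk[i] over all i."""
    if window < 1 or window >= len(walk):
        raise ValueError("window must be between 1 and len(walk) - 1")
    deltas = [walk[i + window] - walk[i] for i in range(len(walk) - window)]
    mean = sum(deltas) / len(deltas)
    variance = sum((d - mean) ** 2 for d in deltas) / len(deltas)
    return math.sqrt(variance)
```

In practice one would evaluate `rms_fluctuation` over a range of window lengths and fit the slope on a log-log plot; that fitting step is left out here to keep the sketch short.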
Evolution of the arginase fold and functional diversity

PubMed Central

Dowling, Daniel P.; Costanzo, Luigi Di; Gennadios, Heather A.; Christianson, David W.

2009-01-01

The large number of protein structures deposited in the Protein Data Bank allows for the identification of novel structural superfamilies based on conservation of fold in addition to conservation of amino acid sequence. Since sequence diverges more rapidly than fold in protein evolution, proteins with little or no significant sequence identity are occasionally observed to adopt similar folds, thereby reflecting unanticipated evolutionary relationships. Here, we review the unique α/β fold first observed in the manganese metalloenzyme rat liver arginase, consisting of a parallel 8 stranded β-sheet surrounded by several helices, and its evolutionary relationship with the zinc-requiring and/or iron-requiring histone deacetylases and acetylpolyamine amidohydrolases. Structural comparisons reveal key features of the core α/β fold that contribute to the divergent metal ion specificity and stoichiometry required for the chemical and biological functions of these enzymes. PMID:18360740
Preliminary Evolutionary Explanations: A Basic Framework for Conceptual Change and Explanatory Coherence in Evolution

NASA Astrophysics Data System (ADS)

Kampourakis, Kostas; Zogza, Vasso

2009-10-01

This study aimed to explore secondary students’ explanations of evolutionary processes, and to determine how consistent these were, after a specific evolution instruction. In a previous study it was found that before instruction students provided different explanations for similar processes to tasks with different content. Hence, it seemed that the structure and the content of the task may have had an effect on students’ explanations. The tasks given to students demanded evolutionary explanations, in particular explanations for the origin of homologies and adaptations. Based on the conclusions from the previous study, we developed a teaching sequence in order to overcome students’ preconceptions, as well as to achieve conceptual change and explanatory coherence. Students were taught about fundamental biological concepts and the several levels of biological organization, as well as about the mechanisms of heredity and of the origin of genetic variation. Then, all these concepts were used to teach about evolution, by relating micro-concepts (e.g. genotypes) to macro-concepts (e.g. phenotypes). Moreover, during instruction students were brought to a conceptual conflict situation, where their intuitive explanations were challenged as emphasis was put on two concepts entirely opposed to their preconceptions: chance and unpredictability. From the explanations that students provided in the post-test it is concluded that conceptual change and explanatory coherence in evolution can be achieved to a certain degree by lower secondary school students through the suggested teaching sequence and the explanatory framework, which may form a basis for teaching further about evolution.
Epistasis in protein evolution

PubMed Central

Starr, Tyler N.

2016-01-01

Abstract The structure, function, and evolution of proteins depend on physical and genetic interactions among amino acids. Recent studies have used new strategies to explore the prevalence, biochemical mechanisms, and evolutionary implications of these interactions—called epistasis—within proteins. Here we describe an emerging picture of pervasive epistasis in which the physical and biological effects of mutations change over the course of evolution in a lineage‐specific fashion. Epistasis can restrict the trajectories available to an evolving protein or open new paths to sequences and functions that would otherwise have been inaccessible. We describe two broad classes of epistatic interactions, which arise from different physical mechanisms and have different effects on evolutionary processes. Specific epistasis—in which one mutation influences the phenotypic effect of few other mutations—is caused by direct and indirect physical interactions between mutations, which nonadditively change the protein's physical properties, such as conformation, stability, or affinity for ligands. In contrast, nonspecific epistasis describes mutations that modify the effect of many others; these typically behave additively with respect to the physical properties of a protein but exhibit epistasis because of a nonlinear relationship between the physical properties and their biological effects, such as function or fitness. Both types of interaction are rampant, but specific epistasis has stronger effects on the rate and outcomes of evolution, because it imposes stricter constraints and modulates evolutionary potential more dramatically; it therefore makes evolution more contingent on low‐probability historical events and leaves stronger marks on the sequences, structures, and functions of protein families. PMID:26833806
The genome of the sea urchin Strongylocentrotus purpuratus.

PubMed

Sodergren, Erica; Weinstock, George M; Davidson, Eric H; Cameron, R Andrew; Gibbs, Richard A; Angerer, Robert C; Angerer, Lynne M; Arnone, Maria Ina; Burgess, David R; Burke, Robert D; Coffman, James A; Dean, Michael; Elphick, Maurice R; Ettensohn, Charles A; Foltz, Kathy R; Hamdoun, Amro; Hynes, Richard O; Klein, William H; Marzluff, William; McClay, David R; Morris, Robert L; Mushegian, Arcady; Rast, Jonathan P; Smith, L Courtney; Thorndyke, Michael C; Vacquier, Victor D; Wessel, Gary M; Wray, Greg; Zhang, Lan; Elsik, Christine G; Ermolaeva, Olga; Hlavina, Wratko; Hofmann, Gretchen; Kitts, Paul; Landrum, Melissa J; Mackey, Aaron J; Maglott, Donna; Panopoulou, Georgia; Poustka, Albert J; Pruitt, Kim; Sapojnikov, Victor; Song, Xingzhi; Souvorov, Alexandre; Solovyev, Victor; Wei, Zheng; Whittaker, Charles A; Worley, Kim; Durbin, K James; Shen, Yufeng; Fedrigo, Olivier; Garfield, David; Haygood, Ralph; Primus, Alexander; Satija, Rahul; Severson, Tonya; Gonzalez-Garay, Manuel L; Jackson, Andrew R; Milosavljevic, Aleksandar; Tong, Mark; Killian, Christopher E; Livingston, Brian T; Wilt, Fred H; Adams, Nikki; Bellé, Robert; Carbonneau, Seth; Cheung, Rocky; Cormier, Patrick; Cosson, Bertrand; Croce, Jenifer; Fernandez-Guerra, Antonio; Genevière, Anne-Marie; Goel, Manisha; Kelkar, Hemant; Morales, Julia; Mulner-Lorillon, Odile; Robertson, Anthony J; Goldstone, Jared V; Cole, Bryan; Epel, David; Gold, Bert; Hahn, Mark E; Howard-Ashby, Meredith; Scally, Mark; Stegeman, John J; Allgood, Erin L; Cool, Jonah; Judkins, Kyle M; McCafferty, Shawn S; Musante, Ashlan M; Obar, Robert A; Rawson, Amanda P; Rossetti, Blair J; Gibbons, Ian R; Hoffman, Matthew P; Leone, Andrew; Istrail, Sorin; Materna, Stefan C; Samanta, Manoj P; Stolc, Viktor; Tongprasit, Waraporn; Tu, Qiang; Bergeron, Karl-Frederik; Brandhorst, Bruce P; Whittle, James; Berney, Kevin; Bottjer, David J; Calestani, Cristina; Peterson, Kevin; Chow, Elly; Yuan, Qiu Autumn; Elhaik, Eran; Graur, Dan; Reese, Justin T; Bosdet, 
Ian; Heesun, Shin; Marra, Marco A; Schein, Jacqueline; Anderson, Michele K; Brockton, Virginia; Buckley, Katherine M; Cohen, Avis H; Fugmann, Sebastian D; Hibino, Taku; Loza-Coll, Mariano; Majeske, Audrey J; Messier, Cynthia; Nair, Sham V; Pancer, Zeev; Terwilliger, David P; Agca, Cavit; Arboleda, Enrique; Chen, Nansheng; Churcher, Allison M; Hallböök, F; Humphrey, Glen W; Idris, Mohammed M; Kiyama, Takae; Liang, Shuguang; Mellott, Dan; Mu, Xiuqian; Murray, Greg; Olinski, Robert P; Raible, Florian; Rowe, Matthew; Taylor, John S; Tessmar-Raible, Kristin; Wang, D; Wilson, Karen H; Yaguchi, Shunsuke; Gaasterland, Terry; Galindo, Blanca E; Gunaratne, Herath J; Juliano, Celina; Kinukawa, Masashi; Moy, Gary W; Neill, Anna T; Nomura, Mamoru; Raisch, Michael; Reade, Anna; Roux, Michelle M; Song, Jia L; Su, Yi-Hsien; Townley, Ian K; Voronina, Ekaterina; Wong, Julian L; Amore, Gabriele; Branno, Margherita; Brown, Euan R; Cavalieri, Vincenzo; Duboc, Véronique; Duloquin, Louise; Flytzanis, Constantin; Gache, Christian; Lapraz, François; Lepage, Thierry; Locascio, Annamaria; Martinez, Pedro; Matassi, Giorgio; Matranga, Valeria; Range, Ryan; Rizzo, Francesca; Röttinger, Eric; Beane, Wendy; Bradham, Cynthia; Byrum, Christine; Glenn, Tom; Hussain, Sofia; Manning, Gerard; Miranda, Esther; Thomason, Rebecca; Walton, Katherine; Wikramanayke, Athula; Wu, Shu-Yu; Xu, Ronghui; Brown, C Titus; Chen, Lili; Gray, Rachel F; Lee, Pei Yun; Nam, Jongmin; Oliveri, Paola; Smith, Joel; Muzny, Donna; Bell, Stephanie; Chacko, Joseph; Cree, Andrew; Curry, Stacey; Davis, Clay; Dinh, Huyen; Dugan-Rocha, Shannon; Fowler, Jerry; Gill, Rachel; Hamilton, Cerrissa; Hernandez, Judith; Hines, Sandra; Hume, Jennifer; Jackson, Laronda; Jolivet, Angela; Kovar, Christie; Lee, Sandra; Lewis, Lora; Miner, George; Morgan, Margaret; Nazareth, Lynne V; Okwuonu, Geoffrey; Parker, David; Pu, Ling-Ling; Thorn, Rachel; Wright, Rita

2006-11-10

We report the sequence and analysis of the 814-megabase genome of the sea urchin Strongylocentrotus purpuratus, a model for developmental and systems biology. The sequencing strategy combined whole-genome shotgun and bacterial artificial chromosome (BAC) sequences. This use of BAC clones, aided by a pooling strategy, overcame difficulties associated with high heterozygosity of the genome. The genome encodes about 23,300 genes, including many previously thought to be vertebrate innovations or known only outside the deuterostomes. This echinoderm genome provides an evolutionary outgroup for the chordates and yields insights into the evolution of deuterostomes.
Vibrational spectroscopy reveals the initial steps of biological hydrogen evolution.

PubMed

Katz, S; Noth, J; Horch, M; Shafaat, H S; Happe, T; Hildebrandt, P; Zebger, I

2016-11-01

[FeFe] hydrogenases are biocatalytic model systems for the exploitation and investigation of catalytic hydrogen evolution. Here, we used vibrational spectroscopic techniques to characterize, in detail, redox transformations of the [FeFe] and [4Fe4S] sub-sites of the catalytic centre (H-cluster) in a monomeric [FeFe] hydrogenase. Through the application of low-temperature resonance Raman spectroscopy, we discovered a novel metastable intermediate that is characterized by an oxidized [Fe(I)Fe(II)] centre and a reduced [4Fe4S]1+ cluster. Based on this unusual configuration, this species is assigned to the first, deprotonated H-cluster intermediate of the [FeFe] hydrogenase catalytic cycle. Providing insights into the sequence of initial reaction steps, the identification of this species represents a key finding towards the mechanistic understanding of biological hydrogen evolution.
Experimental evolution reveals genome-wide spectrum and dynamics of mutations in the rice blast fungus, Magnaporthe oryzae.

PubMed

Jeon, Junhyun; Choi, Jaeyoung; Lee, Gir-Won; Dean, Ralph A; Lee, Yong-Hwan

2013-01-01

Knowledge of mutation processes is central to interpreting genetic analysis data as well as understanding the underlying nature of almost all evolutionary phenomena. However, studies on genome-wide mutational spectrum and dynamics in fungal pathogens are scarce, hindering our understanding of their evolution and biology. Here, we explored changes in the phenotypes and genome sequences of the rice blast fungus Magnaporthe oryzae during forced in vitro evolution by weekly transfer of cultures on artificial media. Through combination of experimental evolution with high-throughput sequencing technology, we found that mutations accumulate rapidly prior to visible phenotypic changes and that both genetic drift and selection seem to contribute to shaping the mutational landscape, suggesting the buffering capacity of the fungal genome against mutations. Inference of mutational effects on phenotypes through the use of T-DNA insertion mutants suggested that at least some of the DNA sequence mutations are likely associated with the observed phenotypic changes. Furthermore, our data suggest oxidative damage and UV as major sources of mutation during subcultures. Taken together, our work revealed important properties of the original source of variation in the genome of the rice blast fungus. We believe that these results provide not only insights into stability of pathogenicity and genome evolution in plant pathogenic fungi but also a model in which evolution of fungal pathogens in natura can be comparatively investigated.
Protein-protein interaction network-based detection of functionally similar proteins within species.

PubMed

Song, Baoxing; Wang, Fen; Guo, Yang; Sang, Qing; Liu, Min; Li, Dengyun; Fang, Wei; Zhang, Deli

2012-07-01

Although functionally similar proteins across species have been widely studied, functionally similar proteins within species showing low sequence similarity have not been examined in detail. Identification of these proteins is of significant importance for understanding biological functions, evolution of protein families, progression of co-evolution, convergent evolution, and other phenomena that cannot be understood by detecting functionally similar proteins across species. Here, we explored a method of detecting functionally similar proteins within species based on graph theory. After denoting protein-protein interaction networks using graphs, we split the graphs into subgraphs using the 1-hop method. Proteins with functional similarities in a species were detected using a method of modified shortest path to compare these subgraphs and to find the eligible optimal results. Using seven protein-protein interaction networks and this method, some functionally similar proteins with low sequence similarity that cannot be detected by sequence alignment were identified. By analyzing the results, we found that, sometimes, it is difficult to separate homologous from convergent evolution. Evaluation of the performance of our method by gene ontology term overlap showed that the precision of our method was excellent. Copyright © 2012 Wiley Periodicals, Inc.
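The step of splitting an interaction network into 1-hop subgraphs can be sketched as follows. The abstract gives no implementation details, so this reads "1-hop" as the subgraph induced by a protein and its direct interaction partners; that reading, the function name `one_hop_subgraph`, and the toy edge list are assumptions for illustration only. The modified shortest-path comparison of subgraphs described by the authors is not reproduced here.

```python
def one_hop_subgraph(edges, center):
    """Return (nodes, edges) of the subgraph induced by `center` and its direct neighbours.

    `edges` is an iterable of undirected (protein_a, protein_b) pairs.
    """
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    nodes = {center} | adjacency.get(center, set())
    # Keep every interaction whose two endpoints both lie inside the neighbourhood.
    sub_edges = {
        tuple(sorted((u, v))) for u, v in edges if u in nodes and v in nodes
    }
    return nodes, sub_edges


# Toy network with hypothetical protein labels.
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
nodes_a, edges_a = one_hop_subgraph(toy_edges, "A")
```

For protein "A" in the toy network, the neighbourhood contains A, B and C but not D, because D is two interactions away.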
The medicago genome provides insight into evolution of rhizobial symbiosis

USDA-ARS's Scientific Manuscript database

Medicago truncatula is an excellent model for the study of legume-specific biology, especially endosymbiotic interactions with bacteria and fungi. This paper describes the sequence of the euchromatic portion of the M. truncatula genome based on a recently completed BAC-based assembly supplemented by...
Evolution of polyketide synthesis in a Dothideomycete forest pathogen

USDA-ARS's Scientific Manuscript database

Fungal secondary metabolites have many important biological roles and some, like the toxic polyketide aflatoxin, have been intensively studied at the genetic level. Complete sets of polyketide synthase (PKS) genes can now be identified in fungal pathogens by whole genome sequencing and studied in or...
Shedding new light on opsin evolution

PubMed Central

Porter, Megan L.; Blasic, Joseph R.; Bok, Michael J.; Cameron, Evan G.; Pringle, Thomas; Cronin, Thomas W.; Robinson, Phyllis R.

2012-01-01

Opsin proteins are essential molecules in mediating the ability of animals to detect and use light for diverse biological functions. Therefore, understanding the evolutionary history of opsins is key to understanding the evolution of light detection and photoreception in animals. As genomic data have appeared and rapidly expanded in quantity, it has become possible to analyse opsins that functionally and histologically are less well characterized, and thus to examine opsin evolution strictly from a genetic perspective. We have incorporated these new data into a large-scale, genome-based analysis of opsin evolution. We use an extensive phylogeny of currently known opsin sequence diversity as a foundation for examining the evolutionary distributions of key functional features within the opsin clade. This new analysis illustrates the lability of opsin protein-expression patterns, site-specific functionality (i.e. counterion position) and G-protein binding interactions. Further, it demonstrates the limitations of current model organisms, and highlights the need for further characterization of many of the opsin sequence groups with unknown function. PMID:22012981
Evolution and Diversity of Transposable Elements in Vertebrate Genomes.

PubMed

Sotero-Caio, Cibele G; Platt, Roy N; Suh, Alexander; Ray, David A

2017-01-01

Transposable elements (TEs) are selfish genetic elements that mobilize in genomes via transposition or retrotransposition and often make up large fractions of vertebrate genomes. Here, we review the current understanding of vertebrate TE diversity and evolution in the context of recent advances in genome sequencing and assembly techniques. TEs make up 4-60% of assembled vertebrate genomes, and deeply branching lineages such as ray-finned fishes and amphibians generally exhibit a higher TE diversity than the more recent radiations of birds and mammals. Furthermore, the list of taxa with exceptional TE landscapes is growing. We emphasize that the current bottleneck in genome analyses lies in the proper annotation of TEs and provide examples where superficial analyses led to misleading conclusions about genome evolution. Finally, recent advances in long-read sequencing will soon permit access to TE-rich genomic regions that previously resisted assembly including the gigantic, TE-rich genomes of salamanders and lungfishes. © The Author(s) 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Rates of molecular evolution in tree ferns are associated with body size, environmental temperature, and biological productivity.

PubMed

Barrera-Redondo, Josué; Ramírez-Barahona, Santiago; Eguiarte, Luis E

2018-05-01

Variation in rates of molecular evolution (heterotachy) is a common phenomenon among plants. Although multiple theoretical models have been proposed, fundamental questions remain regarding the combined effects of ecological and morphological traits on rate heterogeneity. Here, we used tree ferns to explore the correlation between rates of molecular evolution in chloroplast DNA sequences and several morphological and environmental factors within a Bayesian framework. We revealed direct and indirect effects of body size, biological productivity, and temperature on substitution rates, where smaller tree ferns living in warmer and less productive environments tend to have faster rates of molecular evolution. In addition, we found that variation in the ratio of nonsynonymous to synonymous substitution rates (dN/dS) in the chloroplast rbcL gene was significantly correlated with ecological and morphological variables. Heterotachy in tree ferns may be influenced by effective population size associated with variation in body size and productivity. Macroevolutionary hypotheses should go beyond explaining heterotachy in terms of mutation rates and instead, should integrate population-level factors to better understand the processes affecting the tempo of evolution at the molecular level. © 2018 The Author(s). Evolution © 2018 The Society for the Study of Evolution.
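As a rough illustration of the distinction behind dN/dS, the sketch below classifies codon differences between two aligned coding sequences as synonymous or nonsynonymous using the standard genetic code. It is not the method used in the study: a real dN/dS estimate also normalizes by the number of synonymous and nonsynonymous sites and corrects for multiple substitutions, both of which are omitted here. The function name `count_differences` and the example sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def count_differences(seq_a, seq_b):
    """Count (synonymous, nonsynonymous) single-nucleotide codon differences.

    Both sequences must be aligned, in frame, and of equal length.
    Codons differing at more than one position, or containing symbols
    outside TCAG, are skipped rather than guessed at.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be the same length and a multiple of 3")
    synonymous = nonsynonymous = 0
    for i in range(0, len(seq_a), 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        if codon_a == codon_b:
            continue
        if codon_a not in CODON_TABLE or codon_b not in CODON_TABLE:
            continue
        if sum(x != y for x, y in zip(codon_a, codon_b)) != 1:
            continue
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


# TGT -> TGC keeps cysteine (synonymous); AAA -> AGA changes lysine to arginine.
example = count_differences("TGTAAA", "TGCAGA")
```

These raw counts are only the first ingredient of a rate ratio; dedicated codon-model software would normally be used for the estimate itself.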

Evolutionary biology through the lens of budding yeast comparative genomics.

PubMed

Marsit, Souhir; Leducq, Jean-Baptiste; Durand, Éléonore; Marchant, Axelle; Filteau, Marie; Landry, Christian R

2017-10-01

The budding yeast Saccharomyces cerevisiae is a highly advanced model system for studying genetics, cell biology and systems biology. Over the past decade, the application of high-throughput sequencing technologies to this species has contributed to this yeast also becoming an important model for evolutionary genomics. Indeed, comparative genomic analyses of laboratory, wild and domesticated yeast populations are providing unprecedented detail about many of the processes that govern evolution, including long-term processes, such as reproductive isolation and speciation, and short-term processes, such as adaptation to natural and domestication-related environments.
Targeted sequencing for high-resolution evolutionary analyses following genome duplication in salmonid fish: Proof of concept for key components of the insulin-like growth factor axis.

PubMed

Lappin, Fiona M; Shaw, Rebecca L; Macqueen, Daniel J

2016-12-01

High-throughput sequencing has revolutionised comparative and evolutionary genome biology. It has now become relatively commonplace to generate multiple genomes and/or transcriptomes to characterize the evolution of large taxonomic groups of interest. Nevertheless, such efforts may be unsuited to some research questions or remain beyond the scope of some research groups. Here we show that targeted high-throughput sequencing offers a viable alternative to study genome evolution across a vertebrate family of great scientific interest. Specifically, we exploited sequence capture and Illumina sequencing to characterize the evolution of key components from the insulin-like growth factor (IGF) signalling axis of salmonid fish at unprecedented phylogenetic resolution. The IGF axis represents a central governor of vertebrate growth and its core components were expanded by whole genome duplication in the salmonid ancestor ~95 Ma. Using RNA baits synthesised to genes encoding the complete family of IGF binding proteins (IGFBP) and an IGF hormone (IGF2), we captured, sequenced and assembled orthologous and paralogous exons from species representing all ten salmonid genera. This approach generated 299 novel sequences, most as complete or near-complete protein-coding sequences. Phylogenetic analyses confirmed congruent evolutionary histories for all nineteen recognized salmonid IGFBP family members and identified novel salmonid-specific IGF2 paralogues. Moreover, we reconstructed the evolution of duplicated IGF axis paralogues across a replete salmonid phylogeny, revealing complex historic selection regimes - both ancestral to salmonids and lineage-restricted - that frequently involved asymmetric paralogue divergence under positive and/or relaxed purifying selection. Our findings add to an emerging literature highlighting diverse applications for targeted sequencing in comparative-evolutionary genomics. 
We also set out a viable approach to obtain large sets of nuclear genes for any member of the salmonid family, which should enable insights into the evolutionary role of whole genome duplication before additional nuclear genome sequences become available. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.
Whole-Genome Sequencing of Staphylococcus haemolyticus Uncovers the Extreme Plasticity of Its Genome and the Evolution of Human-Colonizing Staphylococcal Species

PubMed Central

Takeuchi, Fumihiko; Watanabe, Shinya; Baba, Tadashi; Yuzawa, Harumi; Ito, Teruyo; Morimoto, Yuh; Kuroda, Makoto; Cui, Longzhu; Takahashi, Mikio; Ankai, Akiho; Baba, Shin-ichi; Fukui, Shigehiro; Lee, Jean C.; Hiramatsu, Keiichi

2005-01-01

Staphylococcus haemolyticus is an opportunistic bacterial pathogen that colonizes human skin and is remarkable for its highly antibiotic-resistant phenotype. We determined the complete genome sequence of S. haemolyticus to better understand its pathogenicity and evolutionary relatedness to the other staphylococcal species. A large proportion of the open reading frames in the genomes of S. haemolyticus, Staphylococcus aureus, and Staphylococcus epidermidis were conserved in their sequence and order on the chromosome. We identified a region of the bacterial chromosome just downstream of the origin of replication that showed little homology among the species but was conserved among strains within a species. This novel region, designated the “oriC environ,” likely contributes to the evolution and differentiation of the staphylococcal species, since it was enriched for species-specific nonessential genes that contribute to the biological features of each staphylococcal species. A comparative analysis of the genomes of S. haemolyticus, S. aureus, and S. epidermidis elucidated differences in their biological and genetic characteristics and pathogenic potentials. We identified as many as 82 insertion sequences in the S. haemolyticus chromosome that probably mediated frequent genomic rearrangements, resulting in phenotypic diversification of the strain. Such rearrangements could have brought genomic plasticity to this species and contributed to its acquisition of antibiotic resistance. PMID:16237012
The evolution, diversity, and host associations of rhabdoviruses.

PubMed

Longdon, Ben; Murray, Gemma G R; Palmer, William J; Day, Jonathan P; Parker, Darren J; Welch, John J; Obbard, Darren J; Jiggins, Francis M

2015-01-01

Metagenomic studies are leading to the discovery of a hidden diversity of RNA viruses. These new viruses are poorly characterized and new approaches are needed to predict the host species these viruses pose a risk to. The rhabdoviruses are a diverse family of RNA viruses that includes important pathogens of humans, animals, and plants. We have discovered thirty-two new rhabdoviruses through a combination of our own RNA sequencing of insects and searching public sequence databases. Combining these with previously known sequences we reconstructed the phylogeny of 195 rhabdovirus sequences, and produced the most in-depth analysis of the family to date. In most cases we know nothing about the biology of the viruses beyond the host they were identified from, but our dataset provides a powerful phylogenetic approach to predict which are vector-borne viruses and which are specific to vertebrates or arthropods. By reconstructing ancestral and present host states we found that switches between major groups of hosts have occurred rarely during rhabdovirus evolution. This allowed us to propose seventy-six new likely vector-borne vertebrate viruses among viruses identified from vertebrates or biting insects. Based on currently available data, our analysis suggests it is likely there was a single origin of the known plant viruses and arthropod-borne vertebrate viruses, while vertebrate- and arthropod-specific viruses arose at least twice. There are also few transitions between aquatic and terrestrial ecosystems. Viruses also cluster together at a finer scale, with closely related viruses tending to be found in closely related hosts. Our data therefore suggest that throughout their evolution, rhabdoviruses have occasionally jumped between distantly related host species before spreading through related hosts in the same environment. This approach offers a way to predict the most probable biology and key traits of newly discovered viruses.
Is Mutation Random or Targeted?: No Evidence for Hypermutability in Snail Toxin Genes.

PubMed

Roy, Scott W

2016-10-01

Ever since Luria and Delbrück, the notion that mutation is random with respect to fitness has been foundational to modern biology. However, various studies have claimed striking exceptions to this rule. One influential case involves toxin-encoding genes in snails of the genus Conus, termed conotoxins, a large gene family that undergoes rapid diversification of its protein-coding sequences by positive selection. Previous reconstructions of the sequence evolution of conotoxin genes claimed striking patterns: (1) elevated synonymous change, interpreted as being due to targeted "hypermutation" in this region; (2) elevated transversion-to-transition ratios, interpreted as reflective of the particular mechanism of hypermutation; and (3) much lower rates of synonymous change in the codons encoding several highly conserved cysteine residues, interpreted as strong position-specific codon bias. This work has spawned a variety of studies on the potential mechanisms of hypermutation and on causes for cysteine codon bias, and has inspired hypermutation hypotheses for various other fast-evolving genes. Here, I show that all three findings are likely to be artifacts of statistical reconstruction. First, by simulating nonsynonymous change I show that high rates of dN can lead to overestimation of dS. Second, I show that there is no evidence for any of these three patterns in comparisons of closely related conotoxin sequences, suggesting that the reported findings are due to breakdown of statistical methods at high levels of sequence divergence. The current findings suggest that mutation and codon bias in conotoxin genes may not be atypical, and that random mutation and selection can explain the evolution of even these exceptional loci. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved.
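As a rough illustration of pattern (2) in the record above, the following minimal Python sketch counts raw transitions and transversions between two aligned DNA sequences. It is not the reconstruction method used in the study, which relies on statistical models of codon evolution; the function name `count_ts_tv` and the example sequences are hypothetical, and the raw counts are uncorrected for multiple substitutions at the same site.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_ts_tv(seq_a, seq_b):
    """Count raw transitions and transversions between two aligned DNA sequences.

    Hypothetical helper for illustration only: sites with gaps or ambiguity
    codes are skipped, and no correction for multiple hits is applied.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = 0
    transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue  # identical site, gap, or ambiguity code
        # A transition stays within purines (A<->G) or within pyrimidines (C<->T).
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions


if __name__ == "__main__":
    ts, tv = count_ts_tv("ACGTAC", "GCTTAA")
    print(f"transitions={ts}, transversions={tv}")
```

Raw counts of this kind saturate as sequences diverge, which is one reason ratios estimated from distantly related sequences need to be interpreted with care.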
PyEvolve: a toolkit for statistical modelling of molecular evolution.

PubMed

Butterfield, Andrew; Vedagiri, Vivek; Lang, Edward; Lawrence, Cath; Wakefield, Matthew J; Isaev, Alexander; Huttley, Gavin A

2004-01-05

Examining the distribution of variation has proven an extremely profitable technique in the effort to identify sequences of biological significance. Most approaches in the field, however, evaluate only the conserved portions of sequences - ignoring the biological significance of sequence differences. A suite of sophisticated likelihood based statistical models from the field of molecular evolution provides the basis for extracting the information from the full distribution of sequence variation. The number of different problems to which phylogeny-based maximum likelihood calculations can be applied is extensive. Available software packages that can perform likelihood calculations suffer from a lack of flexibility and scalability, or employ error-prone approaches to model parameterisation. Here we describe the implementation of PyEvolve, a toolkit for the application of existing, and development of new, statistical methods for molecular evolution. We present the object architecture and design schema of PyEvolve, which includes an adaptable multi-level parallelisation schema. The approach for defining new methods is illustrated by implementing a novel dinucleotide model of substitution that includes a parameter for mutation of methylated CpG's, which required 8 lines of standard Python code to define. Benchmarking was performed using either a dinucleotide or codon substitution model applied to an alignment of BRCA1 sequences from 20 mammals, or a 10 species subset. Up to five-fold parallel performance gains over serial were recorded. Compared to leading alternative software, PyEvolve exhibited significantly better real world performance for parameter rich models with a large data set, reducing the time required for optimisation from approximately 10 days to approximately 6 hours. PyEvolve provides flexible functionality that can be used either for statistical modelling of molecular evolution, or the development of new methods in the field. 
The toolkit can be used interactively or by writing and executing scripts. The toolkit uses efficient processes for specifying the parameterisation of statistical models, and implements numerous optimisations that make highly parameter rich likelihood functions solvable within hours on multi-cpu hardware. PyEvolve can be readily adapted in response to changing computational demands and hardware configurations to maximise performance. PyEvolve is released under the GPL and can be downloaded from http://cbis.anu.edu.au/software.
On the origin and early evolution of biological catalysis and other studies on chemical evolution

NASA Technical Reports Server (NTRS)

Oro, J.; Lazcano, A.

1991-01-01

One of the lines of research in molecular evolution which we have developed for the past three years is related to the experimental and theoretical study of the origin and early evolution of biological catalysis. In an attempt to understand the nature of the first peptidic catalysts and coenzymes, we have achieved the non-enzymatic synthesis of the coenzymes ADPG, GDPG, and CDP-ethanolamine, under conditions considered to have been prevalent on the primitive Earth. We have also accomplished the prebiotic synthesis of histidine, as well as histidyl-histidine, and we have measured the enhancing effects of this catalytic dipeptide on the dephosphorylation of deoxyribonucleotide monophosphates, the hydrolysis of oligo A, and the oligomerization of 2',3' cAMP. We reviewed and further developed the hypothesis that RNA preceded double stranded DNA molecules as a reservoir of cellular genetic information. This led us to undertake the study of extant RNA polymerases in an attempt to discover vestigial sequences preserved from early Archean times. In addition, we continued our studies on the chemical evolution of organic compounds in the solar system and beyond.
Ribosomal RNA Genes Contribute to the Formation of Pseudogenes and Junk DNA in the Human Genome.

PubMed

Robicheau, Brent M; Susko, Edward; Harrigan, Amye M; Snyder, Marlene

2017-02-01

Approximately 35% of the human genome can be identified as sequence devoid of a selected-effect function, and not derived from transposable elements or repeated sequences. We provide evidence supporting a known origin for a fraction of this sequence. We show that: 1) highly degraded, but near full length, ribosomal DNA (rDNA) units, including both 45S and Intergenic Spacer (IGS), can be found at multiple sites in the human genome on chromosomes without rDNA arrays, 2) that these rDNA sequences have a propensity for being centromere proximal, and 3) that sequence at all human functional rDNA array ends is divergent from canonical rDNA to the point that it is pseudogenic. We also show that small sequence strings of rDNA (from 45S + IGS) can be found distributed throughout the genome and are identifiable as an "rDNA-like signal", representing 0.26% of the q-arm of HSA21 and ∼2% of the total sequence of other regions tested. The size of sequence strings found in the rDNA-like signal intergrade into the size of sequence strings that make up the full-length degrading rDNA units found scattered throughout the genome. We conclude that the displaced and degrading rDNA sequences are likely of a similar origin but represent different stages in their evolution towards random sequence. Collectively, our data suggests that over vast evolutionary time, rDNA arrays contribute to the production of junk DNA. The concept that the production of rDNA pseudogenes is a by-product of concerted evolution represents a previously under-appreciated process; we demonstrate here its importance. © The Author(s) 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Unity and disunity in evolutionary sciences: process-based analogies open common research avenues for biology and linguistics.

PubMed

List, Johann-Mattis; Pathmanathan, Jananan Sylvestre; Lopez, Philippe; Bapteste, Eric

2016-08-20

For a long time biologists and linguists have been noticing surprising similarities between the evolution of life forms and languages. Most of the proposed analogies have been rejected. Some, however, have persisted, and some even turned out to be fruitful, inspiring the transfer of methods and models between biology and linguistics up to today. Most proposed analogies were based on a comparison of the research objects rather than the processes that shaped their evolution. Focusing on process-based analogies, however, has the advantage of minimizing the risk of overstating similarities, while at the same time reflecting the common strategy to use processes to explain the evolution of complexity in both fields. We compared important evolutionary processes in biology and linguistics and identified processes specific to only one of the two disciplines as well as processes which seem to be analogous, potentially reflecting core evolutionary processes. These new process-based analogies support novel methodological transfer, expanding the application range of biological methods to the field of historical linguistics. We illustrate this by showing (i) how methods dealing with incomplete lineage sorting offer an introgression-free framework to analyze highly mosaic word distributions across languages; (ii) how sequence similarity networks can be used to identify composite and borrowed words across different languages; (iii) how research on partial homology can inspire new methods and models in both fields; and (iv) how constructive neutral evolution provides an original framework for analyzing convergent evolution in languages resulting from common descent (Sapir's drift). Apart from new analogies between evolutionary processes, we also identified processes which are specific to either biology or linguistics. This shows that general evolution cannot be studied from within one discipline alone. 
In order to get a full picture of evolution, biologists and linguists need to complement their studies, trying to identify cross-disciplinary and discipline-specific evolutionary processes. The fact that we found many process-based analogies favoring transfer from biology to linguistics further shows that certain biological methods and models have a broader scope than previously recognized. This opens fruitful paths for collaboration between the two disciplines. This article was reviewed by W. Ford Doolittle and Eugene V. Koonin.
Question 7: Comparative Genomics and Early Cell Evolution: A Cautionary Methodological Note

NASA Astrophysics Data System (ADS)

Islas, Sara; Hernández-Morales, Ricardo; Lazcano, Antonio

2007-10-01

Inventories of the gene content of the last common ancestor (LCA), i.e., the cenancestor, include sequences that may have undergone horizontal transfer events, as well as sequences that have originated in different pre-cenancestral epochs. However, the universal distribution of highly conserved genes involved in RNA metabolism provides insights into early stages of cell evolution during which RNA played a much more conspicuous biological role, and is consistent with the hypothesis that extant living systems were preceded by an RNA/protein world. Insights into the traits of primitive entities from which the LCA evolved may be derived from the analysis of paralogous gene families, including those formed by sequences that resulted from internal elongation events. Three major types of paralogous gene families can be recognized. The importance of this grouping for understanding the traits of early cells is discussed.
Genome-wide characterization of centromeric satellites from multiple mammalian genomes.

PubMed

Alkan, Can; Cardone, Maria Francesca; Catacchio, Claudia Rita; Antonacci, Francesca; O'Brien, Stephen J; Ryder, Oliver A; Purgato, Stefania; Zoli, Monica; Della Valle, Giuliano; Eichler, Evan E; Ventura, Mario

2011-01-01

Despite its importance in cell biology and evolution, the centromere has remained the final frontier in genome assembly and annotation due to its complex repeat structure. However, isolation and characterization of the centromeric repeats from newly sequenced species are necessary for a complete understanding of genome evolution and function. In recent years, various genomes have been sequenced, but the characterization of the corresponding centromeric DNA has lagged behind. Here, we present a computational method (RepeatNet) to systematically identify higher-order repeat structures from unassembled whole-genome shotgun sequence and test whether these sequence elements correspond to functional centromeric sequences. We analyzed genome datasets from six species of mammals representing the diversity of the mammalian lineage, namely, horse, dog, elephant, armadillo, opossum, and platypus. We define candidate monomer satellite repeats and demonstrate centromeric localization for five of the six genomes. Our analysis revealed the greatest diversity of centromeric sequences in horse and dog in contrast to elephant and armadillo, which showed high centromeric sequence homogeneity. We could not isolate centromeric sequences within the platypus genome, suggesting that centromeres in platypus are not enriched in satellite DNA. Our method can be applied to the characterization of thousands of other vertebrate genomes anticipated for sequencing in the near future, providing an important tool for annotation of centromeres.
Molecular evolution of the CYP2D subfamily in primates: purifying selection on substrate recognition sites without the frequent or long-tract gene conversion.

PubMed

Yasukochi, Yoshiki; Satta, Yoko

2015-03-25

The human cytochrome P450 (CYP) 2D6 gene is a member of the CYP2D gene subfamily, along with the CYP2D7P and CYP2D8P pseudogenes. Although the CYP2D6 enzyme has been studied extensively because of its clinical importance, the evolution of the CYP2D subfamily has not yet been fully understood. Therefore, the goal of this study was to reveal the evolutionary process of the human drug metabolic system. Here, we investigate molecular evolution of the CYP2D subfamily in primates by comparing 14 CYP2D sequences from humans to New World monkey genomes. Window analysis and statistical tests revealed that entire genomic sequences of paralogous genes were extensively homogenized by gene conversion during molecular evolution of CYP2D genes in primates. A neighbor-joining tree based on genomic sequences at the nonsubstrate recognition sites showed that CYP2D6 and CYP2D8 genes were clustered together due to gene conversion. In contrast, a phylogenetic tree using amino acid sequences at substrate recognition sites did not cluster the CYP2D6 and CYP2D8 genes, suggesting that the functional constraint on substrate specificity is one of the causes for purifying selection at the substrate recognition sites. Our results suggest that the CYP2D gene subfamily in primates has evolved to maintain the regioselectivity for a substrate hydroxylation activity between individual enzymes, even though extensive gene conversion has occurred across CYP2D coding sequences. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Evolution of the vertebrate insulin receptor substrate (Irs) gene family.

PubMed

Al-Salam, Ahmad; Irwin, David M

2017-06-23

Insulin receptor substrate (Irs) proteins are essential for insulin signaling as they allow downstream effectors to dock with, and be activated by, the insulin receptor. A family of four Irs proteins has been identified in mice; however, the gene for one of these, IRS3, has been pseudogenized in humans. While it is known that the Irs gene family originated in vertebrates, it is not known when it originated and which members are most closely related to each other. A better understanding of the evolution of Irs genes and proteins should provide insight into the regulation of metabolism by insulin. Multiple genes for Irs proteins were identified in a wide variety of vertebrate species. Phylogenetic and genomic neighborhood analyses indicate that this gene family originated very early in vertebrate evolution. Most Irs genes were duplicated and retained in fish after the fish-specific genome duplication. Irs genes have been lost on various lineages, including Irs3 in primates and birds and Irs1 in most fish. Irs3 and Irs4 experienced an episode of more rapid protein sequence evolution on the ancestral mammalian lineage. Comparisons of the conservation of the protein sequences among Irs paralogs show that domains involved in binding to the plasma membrane and insulin receptors are most strongly conserved, while divergence has occurred in sequences involved in interacting with downstream effector proteins. The Irs gene family originated very early in vertebrate evolution, likely through genome duplications, and in parallel with duplications of other components of the insulin signaling pathway, including insulin and the insulin receptor. While the N-terminal sequences of these proteins are conserved among the paralogs, changes in the C-terminal sequences likely allowed changes in biological function.
Evolutionary connections of biological kingdoms based on protein and nucleic acid sequence evidence

NASA Technical Reports Server (NTRS)

Dayhoff, M. O.

1983-01-01

Prokaryotic and eukaryotic evolutionary trees are developed from protein and nucleic-acid sequences by the methods of numerical taxonomy. Trees are presented for bacterial ferredoxins, 5S ribosomal RNA, c-type cytochromes, cytochromes c2 and c', and 5.8S ribosomal RNA; the implications for early evolution are discussed; and a composite tree showing the branching of the anaerobes, aerobes, archaebacteria, and eukaryotes is shown. Single lines are found for all oxygen-evolving photosynthetic forms and for the salt-loving and high-temperature forms of archaebacteria. It is argued that the eukaryote mitochondria, chloroplasts, and cytoplasmic host material are descended from free-living prokaryotes that formed symbiotic associations, with more than one symbiotic event involved in the evolution of each organelle.
Universality of long-range correlations in expansion randomization systems

NASA Astrophysics Data System (ADS)

Messer, P. W.; Lässig, M.; Arndt, P. F.

2005-10-01

We study the stochastic dynamics of sequences evolving by single-site mutations, segmental duplications, deletions, and random insertions. These processes are relevant for the evolution of genomic DNA. They define a universality class of non-equilibrium 1D expansion-randomization systems with generic stationary long-range correlations in a regime of growing sequence length. We obtain explicitly the two-point correlation function of the sequence composition and the distribution function of the composition bias in sequences of finite length. The characteristic exponent χ of these quantities is determined by the ratio of two effective rates, which are explicitly calculated for several specific sequence evolution dynamics of the universality class. Depending on the value of χ, we find two different scaling regimes, which are distinguished by the detectability of the initial composition bias. All analytic results are accurately verified by numerical simulations. We also discuss the non-stationary build-up and decay of correlations, as well as more complex evolutionary scenarios, where the rates of the processes vary in time. Our findings provide a possible example for the emergence of universality in molecular biology.
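To make the notion of a two-point correlation of sequence composition concrete, here is a minimal Python sketch that estimates it for GC content at a given distance. The encoding (+1 for G or C, -1 for any other character) and the function name `composition_correlation` are assumptions made for illustration; the exact definition used by the authors is not given in the abstract.

```python
def composition_correlation(seq, r):
    """Estimate C(r) = <s_i * s_(i+r)> - <s>^2 for a GC/AT encoding of seq.

    Illustrative assumption: s_i = +1 if base i is G or C, otherwise -1
    (so non-ACGT characters are treated like A/T).
    """
    s = [1 if base in "GC" else -1 for base in seq.upper()]
    n = len(s)
    if not 0 < r < n:
        raise ValueError("r must satisfy 0 < r < len(seq)")
    mean = sum(s) / n
    # Average the product over all pairs of sites separated by distance r.
    pair_mean = sum(s[i] * s[i + r] for i in range(n - r)) / (n - r)
    return pair_mean - mean ** 2


if __name__ == "__main__":
    sequence = "GGGGAAAA" * 4  # toy sequence with blocks of uniform composition
    for distance in (1, 2, 4):
        print(distance, composition_correlation(sequence, distance))
```

Long-range correlations in the sense of the abstract would appear as a slow, power-law-like decay of this quantity with distance, which requires far longer sequences than the toy example to estimate reliably.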
A cricket Gene Index: a genomic resource for studying neurobiology, speciation, and molecular evolution

PubMed Central

Danley, Patrick D; Mullen, Sean P; Liu, Fenglong; Nene, Vishvanath; Quackenbush, John; Shaw, Kerry L

2007-01-01

Background As the developmental costs of genomic tools decline, genomic approaches to non-model systems are becoming more feasible. Many of these systems may lack advanced genetic tools but are extremely valuable models in other biological fields. Here we report the development of expressed sequence tags (EST's) in an orthopteroid insect, a model for the study of neurobiology, speciation, and evolution. Results We report the sequencing of 14,502 EST's from clones derived from a nerve cord cDNA library, and the subsequent construction of a Gene Index from these sequences, from the Hawaiian trigonidiine cricket Laupala kohalensis. The Gene Index contains 8607 unique sequences comprised of 2575 tentative consensus (TC) sequences and 6032 singletons. For each of the unique sequences, an attempt was made to assign a provisional annotation and to categorize its function using a Gene Ontology-based classification through a sequence-based comparison to known proteins. In addition, a set of unique 70 base pair oligomers that can be used for DNA microarrays was developed. All Gene Index information is posted at the DFCI Gene Indices web page. Conclusion Orthopterans are models used to understand the neurophysiological basis of complex motor patterns such as flight and stridulation. The sequences presented in the cricket Gene Index will provide neurophysiologists with many genetic tools that have been largely absent in this field. The cricket Gene Index is one of only two gene indices to be developed in an evolutionary model system. Species within the genus Laupala have speciated recently, rapidly, and extensively. Therefore, the genes identified in the cricket Gene Index can be used to study the genomics of speciation. Furthermore, this gene index represents a significant EST resource for basal insects. As such, this resource is a valuable comparative tool for the understanding of invertebrate molecular evolution.
The sequences presented here will provide much needed genomic resources for three distinct but overlapping fields of inquiry: neurobiology, speciation, and molecular evolution. PMID:17459168
The Genome of the Sea Urchin Strongylocentrotus purpuratus

PubMed Central

2011-01-01

We report the sequence and analysis of the 814-megabase genome of the sea urchin Strongylocentrotus purpuratus, a model for developmental and systems biology. The sequencing strategy combined whole-genome shotgun and bacterial artificial chromosome (BAC) sequences. This use of BAC clones, aided by a pooling strategy, overcame difficulties associated with high heterozygosity of the genome. The genome encodes about 23,300 genes, including many previously thought to be vertebrate innovations or known only outside the deuterostomes. This echinoderm genome provides an evolutionary outgroup for the chordates and yields insights into the evolution of deuterostomes. PMID:17095691
DNA is structured as a linear "jigsaw puzzle" in the genomes of Arabidopsis, rice, and budding yeast.

PubMed

Liu, Yun-Hua; Zhang, Meiping; Wu, Chengcang; Huang, James J; Zhang, Hong-Bin

2014-01-01

Knowledge of how a genome is structured and organized from its constituent elements is crucial to understanding its biology and evolution. Here, we report the genome structuring and organization pattern as revealed by systems analysis of the sequences of three model species, Arabidopsis, rice and yeast, at the whole-genome and chromosome levels. We found that all fundamental function elements (FFE) constituting the genomes, including genes (GEN), DNA transposable elements (DTE), retrotransposable elements (RTE), simple sequence repeats (SSR), and (or) low complexity repeats (LCR), are structured in a nonrandom and correlative manner, thus leading to a hypothesis that the DNA of the species is structured as a linear "jigsaw puzzle". Furthermore, we showed that different FFE differ in their importance in the formation and evolution of the DNA jigsaw puzzle structure between species. DTE and RTE play more important roles than GEN, LCR, and SSR in Arabidopsis, whereas GEN and RTE play more important roles than LCR, SSR, and DTE in rice. The genes having multiple recognized functions play more important roles than those having single functions. These results provide useful knowledge necessary for better understanding genome biology and evolution of the species and for effective molecular breeding of rice.
A model for genesis of transcription systems.

PubMed

Burton, Zachary F; Opron, Kristopher; Wei, Guowei; Geiger, James H

2016-01-01

Repeating sequences generated from RNA gene fusions/ligations dominate ancient life, indicating central importance of building structural complexity in evolving biological systems. A simple and coherent story of life on earth is told from tracking repeating motifs that generate α/β proteins, 2-double-Ψ-β-barrel (DPBB) type RNA polymerases (RNAPs), general transcription factors (GTFs), and promoters. A general rule that emerges is that biological complexity that arises through generation of repeats is often bounded by solubility and closure (i.e., to form a pseudo-dimer or a barrel). Because the first DNA genomes were replicated by DNA template-dependent RNA synthesis followed by RNA template-dependent DNA synthesis via reverse transcriptase, the first DNA replication origins were initially 2-DPBB type RNAP promoters. A simplifying model for evolution of promoters/replication origins via repetition of core promoter elements is proposed. The model can explain why Pribnow boxes in bacterial transcription (i.e., (-12)TATAATG(-6)) so closely resemble TATA boxes (i.e., (-31)TATAAAAG(-24)) in archaeal/eukaryotic transcription. The evolution of anchor DNA sequences in bacterial (i.e., (-35)TTGACA(-30)) and archaeal (BRE(up); BRE for TFB recognition element) promoters is potentially explained. The evolution of BRE(down) elements of archaeal promoters is potentially explained.
Large-Scale Analysis Exploring Evolution of Catalytic Machineries and Mechanisms in Enzyme Superfamilies.

PubMed

Furnham, Nicholas; Dawson, Natalie L; Rahman, Syed A; Thornton, Janet M; Orengo, Christine A

2016-01-29

Enzymes, as biological catalysts, form the basis of all forms of life. How these proteins have evolved their functions remains a fundamental question in biology. Over 100 years of detailed biochemistry studies, combined with the large volumes of sequence and protein structural data now available, means that we are able to perform large-scale analyses to address this question. Using a range of computational tools and resources, we have compiled information on all experimentally annotated changes in enzyme function within 379 structurally defined protein domain superfamilies, linking the changes observed in functions during evolution to changes in reaction chemistry. Many superfamilies show changes in function at some level, although one function often dominates one superfamily. We use quantitative measures of changes in reaction chemistry to reveal the various types of chemical changes occurring during evolution and to exemplify these by detailed examples. Additionally, we use structural information of the enzymes' active sites to examine how different superfamilies have changed their catalytic machinery during evolution. Some superfamilies have changed the reactions they perform without changing catalytic machinery. In others, large changes of enzyme function, in terms of both overall chemistry and substrate specificity, have been brought about by significant changes in catalytic machinery. Interestingly, in some superfamilies, relatives perform similar functions but with different catalytic machineries. This analysis highlights characteristics of functional evolution across a wide range of superfamilies, providing insights that will be useful in predicting the function of uncharacterised sequences and the design of new synthetic enzymes. Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.

Azolla--a model organism for plant genomic studies.

PubMed

Qiu, Yin-Long; Yu, Jun

2003-02-01

The aquatic ferns of the genus Azolla are nitrogen-fixing plants that have great potential in agricultural production and environmental conservation. Azolla in many aspects is qualified to serve as a model organism for genomic studies because of its importance in agriculture, its unique position in plant evolution, its symbiotic relationship with the N2-fixing cyanobacterium, Anabaena azollae, and its moderate-sized genome. The goals of this genome project are not only to understand the biology of the Azolla genome to promote its applications in biological research and agricultural practice but also to gain critical insights about evolution of plant genomes. Together with the strategic and technical improvement as well as cost reduction of DNA sequencing, the deciphering of their genetic code is imminent.
Homology and phylogeny and their automated inference

NASA Astrophysics Data System (ADS)

Fuellen, Georg

2008-06-01

The analysis of the ever-increasing amount of biological and biomedical data can be pushed forward by comparing the data within and among species. For example, an integrative analysis of data from the genome sequencing projects for various species traces the evolution of the genomes and identifies conserved and innovative parts. Here, I review the foundations and advantages of this “historical” approach and evaluate recent attempts at automating such analyses. Biological data is comparable if a common origin exists (homology), as is the case for members of a gene family originating via duplication of an ancestral gene. If the family has relatives in other species, we can assume that the ancestral gene was present in the ancestral species from which all the other species evolved. In particular, describing the relationships among the duplicated biological sequences found in the various species is often possible by a phylogeny, which is more informative than homology statements. Detecting and elaborating on common origins may answer how certain biological sequences developed, and predict what sequences are in a particular species and what their function is. Such knowledge transfer from sequences in one species to the homologous sequences of the other is based on the principle of ‘my closest relative looks and behaves like I do’, often referred to as ‘guilt by association’. To enable knowledge transfer on a large scale, several automated ‘phylogenomics pipelines’ have been developed in recent years, and seven of these will be described and compared. Overall, the examples in this review demonstrate that homology and phylogeny analyses, done on a large (and automated) scale, can give insights into function in biology and biomedicine.
Sequence co-evolution gives 3D contacts and structures of protein complexes

PubMed Central

Hopf, Thomas A; Schärfe, Charlotta P I; Rodrigues, João P G L M; Green, Anna G; Kohlbacher, Oliver; Sander, Chris; Bonvin, Alexandre M J J; Marks, Debora S

2014-01-01

Protein–protein interactions are fundamental to many biological processes. Experimental screens have identified tens of thousands of interactions, and structural biology has provided detailed functional insight for select 3D protein complexes. An alternative rich source of information about protein interactions is the evolutionary sequence record. Building on earlier work, we show that analysis of correlated evolutionary sequence changes across proteins identifies residues that are close in space with sufficient accuracy to determine the three-dimensional structure of the protein complexes. We evaluate prediction performance in blinded tests on 76 complexes of known 3D structure, predict protein–protein contacts in 32 complexes of unknown structure, and demonstrate how evolutionary couplings can be used to distinguish between interacting and non-interacting protein pairs in a large complex. With the current growth of sequences, we expect that the method can be generalized to genome-wide elucidation of protein–protein interaction networks and used for interaction predictions at residue resolution. DOI: http://dx.doi.org/10.7554/eLife.03430.001 PMID:25255213
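The idea of correlated sequence changes between positions can be sketched with a much simpler statistic than the evolutionary couplings used in the study: the mutual information between two alignment columns. The Python below is a minimal illustration under that substitution; the function name `column_mutual_information` and the toy columns are hypothetical, and plain mutual information does not separate direct from indirect couplings or correct for phylogenetic bias.

```python
from collections import Counter
from math import log


def column_mutual_information(col_x, col_y):
    """Mutual information (in nats) between two alignment columns.

    col_x and col_y hold one residue per sequence, in the same sequence
    order (for example, one column from each of two interacting proteins).
    Hypothetical helper for illustration; no pseudocounts or sequence
    reweighting are applied.
    """
    if len(col_x) != len(col_y) or len(col_x) == 0:
        raise ValueError("columns must be non-empty and of equal length")
    n = len(col_x)
    counts_x = Counter(col_x)
    counts_y = Counter(col_y)
    counts_xy = Counter(zip(col_x, col_y))
    mi = 0.0
    for (x, y), c in counts_xy.items():
        # p(x, y) * log( p(x, y) / (p(x) * p(y)) ), written in terms of counts
        mi += (c / n) * log(c * n / (counts_x[x] * counts_y[y]))
    return mi


if __name__ == "__main__":
    print(column_mutual_information("AAKK", "DDEE"))  # perfectly covarying columns
    print(column_mutual_information("AAKK", "DEDE"))  # statistically independent columns
```

Column pairs with high scores are candidates for residues that are close in space, but a global statistical model of the kind described in the abstract is generally needed to reach contact-level accuracy.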
Computational power and generative capacity of genetic systems.

PubMed

Igamberdiev, Abir U; Shklovskiy-Kordi, Nikita E

2016-01-01

Semiotic characteristics of genetic sequences are based on the general principles of linguistics formulated by Ferdinand de Saussure, such as the arbitrariness of sign and the linear nature of the signifier. Besides these semiotic features that are attributable to the basic structure of the genetic code, the principle of generativity of genetic language is important for understanding biological transformations. The problem of generativity in genetic systems arises to a possibility of different interpretations of genetic texts, and corresponds to what Alexander von Humboldt called "the infinite use of finite means". These interpretations appear in the individual development as the spatiotemporal sequences of realizations of different textual meanings, as well as the emergence of hyper-textual statements about the text itself, which underlies the process of biological evolution. These interpretations are accomplished at the level of the readout of genetic texts by the structures defined by Efim Liberman as "the molecular computer of cell", which includes DNA, RNA and the corresponding enzymes operating with molecular addresses. The molecular computer performs physically manifested mathematical operations and possesses both reading and writing capacities. Generativity paradoxically resides in the biological computational system as a possibility to incorporate meta-statements about the system, and thus establishes the internal capacity for its evolution. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Defining functional distance using manifold embeddings of gene ontology annotations

PubMed Central

Lerman, Gilad; Shakhnovich, Boris E.

2007-01-01

Although rigorous measures of similarity for sequence and structure are now well established, the problem of defining functional relationships has been particularly daunting. Here, we present several manifold embedding techniques to compute distances between Gene Ontology (GO) functional annotations and consequently estimate functional distances between protein domains. To evaluate accuracy, we correlate the functional distance to the well established measures of sequence, structural, and phylogenetic similarities. Finally, we show that manual classification of structures into folds and superfamilies is mirrored by proximity in the newly defined function space. We show how functional distances place structure–function relationships in biological context resulting in insight into divergent and convergent evolution. The methods and results in this paper can be readily generalized and applied to a wide array of biologically relevant investigations, such as accuracy of annotation transference, the relationship between sequence, structure, and function, or coherence of expression modules. PMID:17595300
Positive Selection Underlies Faster-Z Evolution of Gene Expression in Birds.

PubMed

Dean, Rebecca; Harrison, Peter W; Wright, Alison E; Zimmer, Fabian; Mank, Judith E

2015-10-01

The elevated rate of evolution for genes on sex chromosomes compared with autosomes (Fast-X or Fast-Z evolution) can result either from positive selection in the heterogametic sex or from nonadaptive consequences of reduced relative effective population size. Recent work in birds suggests that Fast-Z of coding sequence is primarily due to relaxed purifying selection resulting from reduced relative effective population size. However, gene sequence and gene expression are often subject to distinct evolutionary pressures; therefore, we tested for Fast-Z in gene expression using next-generation RNA-sequencing data from multiple avian species. Similar to studies of Fast-Z in coding sequence, we recover clear signatures of Fast-Z in gene expression; however, in contrast to coding sequence, our data indicate that Fast-Z in expression is due to positive selection acting primarily in females. In the soma, where gene expression is highly correlated between the sexes, we detected Fast-Z in both sexes, although at a higher rate in females, suggesting that many positively selected expression changes in females are also expressed in males. In the gonad, where intersexual correlations in expression are much lower, we detected Fast-Z for female gene expression, but crucially, not males. This suggests that a large amount of expression variation is sex-specific in its effects within the gonad. Taken together, our results indicate that Fast-Z evolution of gene expression is the product of positive selection acting on recessive beneficial alleles in the heterogametic sex. More broadly, our analysis suggests that the adaptive potential of Z chromosome gene expression may be much greater than that of gene sequence, results which have important implications for the role of sex chromosomes in speciation and sexual selection. © The Author 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Time-Sampled Population Sequencing Reveals the Interplay of Selection and Genetic Drift in Experimental Evolution of Potato Virus Y

PubMed Central

2017-01-01

ABSTRACT RNA viruses are one of the fastest-evolving biological entities. Within their hosts, they exist as genetically diverse populations (i.e., viral mutant swarms), which are sculpted by different evolutionary mechanisms, such as mutation, natural selection, and genetic drift, and also the interactions between genetic variants within the mutant swarms. To elucidate the mechanisms that modulate the population diversity of an important plant-pathogenic virus, we performed evolution experiments with Potato virus Y (PVY) in potato genotypes that differ in their defense response against the virus. Using deep sequencing of small RNAs, we followed the temporal dynamics of standing and newly generated variations in the evolving viral lineages. A time-sampled approach allowed us to (i) reconstruct theoretical haplotypes in the starting population by using clustering of single nucleotide polymorphisms' trajectories and (ii) use quantitative population genetics approaches to estimate the contribution of selection and genetic drift, and their interplay, to the evolution of the virus. We detected imprints of strong selective sweeps and narrow genetic bottlenecks, followed by the shift in frequency of selected haplotypes. Comparison of patterns of viral evolution in differently susceptible host genotypes indicated possible diversifying evolution of PVY in the less-susceptible host (efficient in the accumulation of salicylic acid). IMPORTANCE High diversity of within-host populations of RNA viruses is an important aspect of their biology, since they represent a reservoir of genetic variants, which can enable quick adaptation of viruses to a changing environment. This study focuses on an important plant virus, Potato virus Y, and describes, at high resolution, temporal changes in the structure of viral populations within different potato genotypes. 
A novel and easy-to-implement computational approach was established to cluster single nucleotide polymorphisms into viral haplotypes from very short sequencing reads. During the experiment, a shift in the frequency of selected viral haplotypes was observed after a narrow genetic bottleneck, indicating an important role of the genetic drift in the evolution of the virus. On the other hand, a possible case of diversifying selection of the virus was observed in less susceptible host genotypes. PMID:28592544
Recapitulating phylogenies using k-mers: from trees to networks.

PubMed

Bernard, Guillaume; Ragan, Mark A; Chan, Cheong Xin

2016-01-01

Ernst Haeckel based his landmark Tree of Life on the supposed ontogenic recapitulation of phylogeny, i.e. that successive embryonic stages during the development of an organism re-trace the morphological forms of its ancestors over the course of evolution. Much of this idea has since been discredited. Today, phylogenies are often based on families of molecular sequences. The standard approach starts with a multiple sequence alignment, in which the sequences are arranged relative to each other in a way that maximises a measure of similarity position-by-position along their entire length. A tree (or sometimes a network) is then inferred. Rigorous multiple sequence alignment is computationally demanding, and evolutionary processes that shape the genomes of many microbes (bacteria, archaea and some morphologically simple eukaryotes) can add further complications. In particular, recombination, genome rearrangement and lateral genetic transfer undermine the assumptions that underlie multiple sequence alignment, and imply that a tree-like structure may be too simplistic. Here, using genome sequences of 143 bacterial and archaeal genomes, we construct a network of phylogenetic relatedness based on the number of shared k-mers (subsequences at fixed length k). Our findings suggest that the network captures not only key aspects of microbial genome evolution as inferred from a tree, but also features that are not treelike. The method is highly scalable, allowing for investigation of genome evolution across a large number of genomes. Instead of using specific regions or sequences from genome sequences, or indeed Haeckel's idea of ontogeny, we argue that genome phylogenies can be inferred using k-mers from whole-genome sequences. Representing these networks dynamically allows biological questions of interest to be formulated and addressed quickly and in a visually intuitive manner.
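The idea of relating genomes by the number of k-mers they share can be sketched in a few lines. The following is a minimal illustration only, not the authors' pipeline: the function names, the toy sequences and the `min_shared` cutoff are assumptions, and a real analysis would also need to handle reverse complements, ambiguous bases and normalisation by genome size.

```python
from itertools import combinations


def kmers(seq: str, k: int) -> set[str]:
    """Return the set of distinct length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_kmer_edges(genomes: dict[str, str], k: int, min_shared: int = 1):
    """Return (name_a, name_b, n_shared) for every pair of genomes that
    shares at least min_shared distinct k-mers (hypothetical cutoff)."""
    profiles = {name: kmers(seq, k) for name, seq in genomes.items()}
    edges = []
    for a, b in combinations(sorted(profiles), 2):
        n_shared = len(profiles[a] & profiles[b])
        if n_shared >= min_shared:
            edges.append((a, b, n_shared))
    return edges


# Toy sequences standing in for whole genomes.
toy = {"g1": "ACGTACGT", "g2": "ACGTTTTT", "g3": "GGGGGGGG"}
edges = shared_kmer_edges(toy, k=3)
```

The resulting edge list is a weighted network rather than a tree, so a genome can be linked to several others at once, which is the property the abstract highlights for non-treelike signal.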
The others: our biased perspective of eukaryotic genomes

PubMed Central

del Campo, Javier; Sieracki, Michael E.; Molestina, Robert; Keeling, Patrick; Massana, Ramon; Ruiz-Trillo, Iñaki

2015-01-01

Understanding the origin and evolution of the eukaryotic cell and the full diversity of eukaryotes is relevant to many biological disciplines. However, our current understanding of eukaryotic genomes is extremely biased, leading to a skewed view of eukaryotic biology. We argue that a phylogeny-driven initiative to cover the full eukaryotic diversity is needed to overcome this bias. We encourage the community: (i) to sequence a representative of the neglected groups available at public culture collections, (ii) to increase our culturing efforts, and (iii) to embrace single cell genomics to access organisms refractory to propagation in culture. We hope that the community will welcome this proposal, explore the approaches suggested, and join efforts to sequence the full diversity of eukaryotes. PMID:24726347
Tandem Repeats in Proteins: Prediction Algorithms and Biological Role.

PubMed

Pellegrini, Marco

2015-01-01

Tandem repetitions in protein sequence and structure are a fascinating subject of research which has been a focus of study since the late 1990s. In this survey, we give an overview of the multi-faceted aspects of research on protein tandem repeats (PTR for short), including prediction algorithms, databases, early classification efforts, mechanisms of PTR formation and evolution, and synthetic PTR design. We also touch on the rather open issue of the relationship between PTR and flexibility (or disorder) in proteins. Detection of PTR either from protein sequence or structure data is challenging due to inherent high (biological) signal-to-noise ratio that is a key feature of this problem. As early in silico analytic tools have been key enablers for starting this field of study, we expect that current and future algorithmic and statistical breakthroughs will have a high impact on the investigations of the biological role of PTR.
The Role of the Y-Chromosome in the Establishment of Murine Hybrid Dysgenesis and in the Analysis of the Nucleotide Sequence Organization, Genetic Transmission and Evolution of Repeated Sequences.

NASA Astrophysics Data System (ADS)

Nallaseth, Ferez Soli

The Y-chromosome presents a unique cytogenetic framework for the evolution of nucleotide sequences. Alignment of nine Y-chromosomal fragments in their increasing Y-specific/non Y-specific (male/female) sequence divergence ratios was directly and inversely related to their interspersion on these two respective genomic fractions. Sequence analysis confirmed a direct relationship between divergence ratios and the Alu, LINE-1, Satellite and their derivative oligonucleotide contents. Thus their relocation on the Y-chromosome is followed by sequence divergence rather than the well documented concerted evolution of these non-coding progenitor repeated sequences. Five of the nine Y-chromosomal fragments are non-pseudoautosomal and transcribed into heterogeneous PolyA+ RNA and thus can be retrotransposed. Evolutionary and computer analysis identified homologous oligonucleotide tracts in several human loci suggesting common and random mechanistic origins. Dysgenic genomes represent the accelerated evolution driving sequence divergence (McClintock, 1984). Sex reversal and sterility characterizing dysgenesis occurs in C57BL/6JY^Pos but not in 129/SvY^Pos derivative strains. High frequency, random, multi-locus deletion products of the feral Y^Pos-chromosome are generated in the germlines of F1(C57BL/6J X 129/SvY^Pos)(male) and C57BL/6JY^Pos(male) but not in 129/SvY^Pos(male). Equal, 10^-1, 10^-2, and 0 copies (relative to males) of Y^Pos-specific deletion products respectively characterize C57BL/6JY^Pos (HC), (LC), (T) and (F) females. The testes determining loci of inactive Y^Pos-chromosomes in C57BL/6JY^Pos HC females are the preferentially deleted/rearranged Y^Pos-sequences. Disruption of regulation of plasma testosterone and hepatic MUP-A mRNA levels, TRD of a 4.7 Kbp EcoR1 fragment suggest disruption of autosomal/X-chromosomal sequences. 
These data and the highly repeated progenitor (Alu, GATA, LINE-1) sequence content of deletion products confirmed the previously unidentified loss of genetic control of mammalian chromosome biology and hybrid dysgenesis.
Comparative genomics reveals high biological diversity and specific adaptations in the industrially and medically important fungal genus Aspergillus.

PubMed

de Vries, Ronald P; Riley, Robert; Wiebenga, Ad; Aguilar-Osorio, Guillermo; Amillis, Sotiris; Uchima, Cristiane Akemi; Anderluh, Gregor; Asadollahi, Mojtaba; Askin, Marion; Barry, Kerrie; Battaglia, Evy; Bayram, Özgür; Benocci, Tiziano; Braus-Stromeyer, Susanna A; Caldana, Camila; Cánovas, David; Cerqueira, Gustavo C; Chen, Fusheng; Chen, Wanping; Choi, Cindy; Clum, Alicia; Dos Santos, Renato Augusto Corrêa; Damásio, André Ricardo de Lima; Diallinas, George; Emri, Tamás; Fekete, Erzsébet; Flipphi, Michel; Freyberg, Susanne; Gallo, Antonia; Gournas, Christos; Habgood, Rob; Hainaut, Matthieu; Harispe, María Laura; Henrissat, Bernard; Hildén, Kristiina S; Hope, Ryan; Hossain, Abeer; Karabika, Eugenia; Karaffa, Levente; Karányi, Zsolt; Kraševec, Nada; Kuo, Alan; Kusch, Harald; LaButti, Kurt; Lagendijk, Ellen L; Lapidus, Alla; Levasseur, Anthony; Lindquist, Erika; Lipzen, Anna; Logrieco, Antonio F; MacCabe, Andrew; Mäkelä, Miia R; Malavazi, Iran; Melin, Petter; Meyer, Vera; Mielnichuk, Natalia; Miskei, Márton; Molnár, Ákos P; Mulé, Giuseppina; Ngan, Chew Yee; Orejas, Margarita; Orosz, Erzsébet; Ouedraogo, Jean Paul; Overkamp, Karin M; Park, Hee-Soo; Perrone, Giancarlo; Piumi, Francois; Punt, Peter J; Ram, Arthur F J; Ramón, Ana; Rauscher, Stefan; Record, Eric; Riaño-Pachón, Diego Mauricio; Robert, Vincent; Röhrig, Julian; Ruller, Roberto; Salamov, Asaf; Salih, Nadhira S; Samson, Rob A; Sándor, Erzsébet; Sanguinetti, Manuel; Schütze, Tabea; Sepčić, Kristina; Shelest, Ekaterina; Sherlock, Gavin; Sophianopoulou, Vicky; Squina, Fabio M; Sun, Hui; Susca, Antonia; Todd, Richard B; Tsang, Adrian; Unkles, Shiela E; van de Wiele, Nathalie; van Rossen-Uffink, Diana; Oliveira, Juliana Velasco de Castro; Vesth, Tammi C; Visser, Jaap; Yu, Jae-Hyuk; Zhou, Miaomiao; Andersen, Mikael R; Archer, David B; Baker, Scott E; Benoit, Isabelle; Brakhage, Axel A; Braus, Gerhard H; Fischer, Reinhard; Frisvad, Jens C; Goldman, Gustavo H; Houbraken, Jos; Oakley, Berl; Pócsi, István; Scazzocchio, 
Claudio; Seiboth, Bernhard; vanKuyk, Patricia A; Wortman, Jennifer; Dyer, Paul S; Grigoriev, Igor V

2017-02-14

The fungal genus Aspergillus is of critical importance to humankind. Species include those with industrial applications, important pathogens of humans, animals and crops, a source of potent carcinogenic contaminants of food, and an important genetic model. The genome sequences of eight aspergilli have already been explored to investigate aspects of fungal biology, raising questions about evolution and specialization within this genus. We have generated genome sequences for ten novel, highly diverse Aspergillus species and compared these in detail to sister and more distant genera. Comparative studies of key aspects of fungal biology, including primary and secondary metabolism, stress response, biomass degradation, and signal transduction, revealed both conservation and diversity among the species. Observed genomic differences were validated with experimental studies. This revealed several highlights, such as the potential for sex in asexual species, organic acid production genes being a key feature of black aspergilli, alternative approaches for degrading plant biomass, and indications for the genetic basis of stress response. A genome-wide phylogenetic analysis demonstrated in detail the relationship of the newly genome sequenced species with other aspergilli. Many aspects of biological differences between fungal species cannot be explained by current knowledge obtained from genome sequences. The comparative genomics and experimental study, presented here, allows for the first time a genus-wide view of the biological diversity of the aspergilli and in many, but not all, cases linked genome differences to phenotype. Insights gained could be exploited for biotechnological and medical applications of fungi.
Evolutionary interrogation of human biology in well-annotated genomic framework of rhesus macaque.

PubMed

Zhang, Shi-Jian; Liu, Chu-Jun; Yu, Peng; Zhong, Xiaoming; Chen, Jia-Yu; Yang, Xinzhuang; Peng, Jiguang; Yan, Shouyu; Wang, Chenqu; Zhu, Xiaotong; Xiong, Jingwei; Zhang, Yong E; Tan, Bertrand Chin-Ming; Li, Chuan-Yun

2014-05-01

With genome sequence and composition highly analogous to human, rhesus macaque represents a unique reference for evolutionary studies of human biology. Here, we developed a comprehensive genomic framework of rhesus macaque, the RhesusBase2, for evolutionary interrogation of human genes and the associated regulations. A total of 1,667 next-generation sequencing (NGS) data sets were processed, integrated, and evaluated, generating 51.2 million new functional annotation records. With extensive NGS annotations, RhesusBase2 refined the fine-scale structures in 30% of the macaque Ensembl transcripts, reporting an accurate, up-to-date set of macaque gene models. On the basis of these annotations and accurate macaque gene models, we further developed an NGS-oriented Molecular Evolution Gateway to access and visualize macaque annotations in reference to human orthologous genes and associated regulations (www.rhesusbase.org/molEvo). We highlighted the application of this well-annotated genomic framework in generating hypothetical link of human-biased regulations to human-specific traits, by using mechanistic characterization of the DIEXF gene as an example that provides novel clues to the understanding of digestive system reduction in human evolution. On a global scale, we also identified a catalog of 9,295 human-biased regulatory events, which may represent novel elements that have a substantial impact on shaping human transcriptome and possibly underpin recent human phenotypic evolution. Taken together, we provide an NGS data-driven, information-rich framework that will broadly benefit genomics research in general and serves as an important resource for in-depth evolutionary studies of human biology.
Satellite DNA: An Evolving Topic

PubMed Central

Garrido-Ramos, Manuel A.

2017-01-01

Satellite DNA represents one of the most fascinating parts of the repetitive fraction of the eukaryotic genome. Since the discovery of highly repetitive tandem DNA in the 1960s, a lot of literature has extensively covered various topics related to the structure, organization, function, and evolution of such sequences. Today, with the advent of genomic tools, the study of satellite DNA has regained great interest. Thus, Next-Generation Sequencing (NGS), together with high-throughput in silico analysis of the information contained in NGS reads, has revolutionized the analysis of the repetitive fraction of the eukaryotic genomes. The whole of the historical and current approaches to the topic gives us a broad view of the function and evolution of satellite DNA and its role in chromosomal evolution. Currently, we have extensive information on the molecular, chromosomal, biological, and population factors that affect the evolutionary fate of satellite DNA, knowledge that gives rise to a series of mutually compatible hypotheses about the origin, spreading, and evolution of satellite DNA. In this paper, I review these hypotheses from a methodological, conceptual, and historical perspective and frame them in the context of chromosomal organization and evolution. PMID:28926993
Using evolutionary computations to understand the design and evolution of gene and cell regulatory networks.

PubMed

Spirov, Alexander; Holloway, David

2013-07-15

This paper surveys modeling approaches for studying the evolution of gene regulatory networks (GRNs). Modeling of the design or 'wiring' of GRNs has become increasingly common in developmental and medical biology, as a means of quantifying gene-gene interactions, the response to perturbations, and the overall dynamic motifs of networks. Drawing from developments in GRN 'design' modeling, a number of groups are now using simulations to study how GRNs evolve, both for comparative genomics and to uncover general principles of evolutionary processes. Such work can generally be termed evolution in silico. Complementary to these biologically-focused approaches, a now well-established field of computer science is Evolutionary Computations (ECs), in which highly efficient optimization techniques are inspired from evolutionary principles. In surveying biological simulation approaches, we discuss the considerations that must be taken with respect to: (a) the precision and completeness of the data (e.g. are the simulations for very close matches to anatomical data, or are they for more general exploration of evolutionary principles); (b) the level of detail to model (we proceed from 'coarse-grained' evolution of simple gene-gene interactions to 'fine-grained' evolution at the DNA sequence level); (c) to what degree is it important to include the genome's cellular context; and (d) the efficiency of computation. With respect to the latter, we argue that developments in computer science EC offer the means to perform more complete simulation searches, and will lead to more comprehensive biological predictions. Copyright © 2013 Elsevier Inc. All rights reserved.
Lineage-Specific Biology Revealed by a Finished Genome Assembly of the Mouse

PubMed Central

Hillier, LaDeana W.; Zody, Michael C.; Goldstein, Steve; She, Xinwe; Bult, Carol J.; Agarwala, Richa; Cherry, Joshua L.; DiCuccio, Michael; Hlavina, Wratko; Kapustin, Yuri; Meric, Peter; Maglott, Donna; Birtle, Zoë; Marques, Ana C.; Graves, Tina; Zhou, Shiguo; Teague, Brian; Potamousis, Konstantinos; Churas, Christopher; Place, Michael; Herschleb, Jill; Runnheim, Ron; Forrest, Daniel; Amos-Landgraf, James; Schwartz, David C.; Cheng, Ze; Lindblad-Toh, Kerstin; Eichler, Evan E.; Ponting, Chris P.

2009-01-01

The mouse (Mus musculus) is the premier animal model for understanding human disease and development. Here we show that a comprehensive understanding of mouse biology is only possible with the availability of a finished, high-quality genome assembly. The finished clone-based assembly of the mouse strain C57BL/6J reported here has over 175,000 fewer gaps and over 139 Mb more of novel sequence, compared with the earlier MGSCv3 draft genome assembly. In a comprehensive analysis of this revised genome sequence, we are now able to define 20,210 protein-coding genes, over a thousand more than predicted in the human genome (19,042 genes). In addition, we identified 439 long, non–protein-coding RNAs with evidence for transcribed orthologs in human. We analyzed the complex and repetitive landscape of 267 Mb of sequence that was missing or misassembled in the previously published assembly, and we provide insights into the reasons for its resistance to sequencing and assembly by whole-genome shotgun approaches. Duplicated regions within newly assembled sequence tend to be of more recent ancestry than duplicates in the published draft, correcting our initial understanding of recent evolution on the mouse lineage. These duplicates appear to be largely composed of sequence regions containing transposable elements and duplicated protein-coding genes; of these, some may be fixed in the mouse population, but at least 40% of segmentally duplicated sequences are copy number variable even among laboratory mouse strains. Mouse lineage-specific regions contain 3,767 genes drawn mainly from rapidly-changing gene families associated with reproductive functions. The finished mouse genome assembly, therefore, greatly improves our understanding of rodent-specific biology and allows the delineation of ancestral biological functions that are shared with human from derived functions that are not. PMID:19468303
Clustering and visualizing similarity networks of membrane proteins.

PubMed

Hu, Geng-Ming; Mai, Te-Lun; Chen, Chi-Ming

2015-08-01

We proposed a fast and unsupervised clustering method, minimum span clustering (MSC), for analyzing the sequence-structure-function relationship of biological networks, and demonstrated its validity in clustering the sequence/structure similarity networks (SSN) of 682 membrane protein (MP) chains. The MSC clustering of MPs based on their sequence information was found to be consistent with their tertiary structures and functions. For the largest seven clusters predicted by MSC, the consistency in chain function within the same cluster is found to be 100%. From analyzing the edge distribution of SSN for MPs, we found a characteristic threshold distance for the boundary between clusters, over which SSN of MPs could be properly clustered by an unsupervised sparsification of the network distance matrix. The clustering results of MPs from both MSC and the unsupervised sparsification methods are consistent with each other, and have high intracluster similarity and low intercluster similarity in sequence, structure, and function. Our study showed a strong sequence-structure-function relationship of MPs. We discussed evidence of convergent evolution of MPs and suggested applications in finding structural similarities and predicting biological functions of MP chains based on their sequence information. © 2015 Wiley Periodicals, Inc.
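The abstract mentions clustering by sparsifying the network distance matrix at a characteristic threshold. The sketch below shows one simple way such a step could look: drop every edge longer than a cutoff and report the connected components that remain. It is not the minimum span clustering (MSC) algorithm itself, whose details are not given here. The function name, the toy matrix and the threshold value are assumptions made for illustration.

```python
def threshold_clusters(dist: list[list[float]], threshold: float) -> list[list[int]]:
    """Cluster items by keeping only edges with distance <= threshold and
    returning the connected components (as sorted lists of item indices)."""
    n = len(dist)
    parent = list(range(n))

    def find(i: int) -> int:
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if dist[i][j] <= threshold:
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    parent[root_j] = root_i

    clusters: dict[int, list[int]] = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Toy symmetric distance matrix for four protein chains.
toy_dist = [
    [0.0, 0.1, 0.9, 0.9],
    [0.1, 0.0, 0.9, 0.9],
    [0.9, 0.9, 0.0, 0.2],
    [0.9, 0.9, 0.2, 0.0],
]
clusters = threshold_clusters(toy_dist, threshold=0.5)
```

In this toy case any cutoff between 0.2 and 0.9 gives the same two groups, which mirrors the idea of a characteristic boundary distance between clusters.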
[Scale Relativity Theory in living beings morphogenesis: fractal, determinism and chance].

PubMed

Chaline, J

2012-10-01

The Scale Relativity Theory has many biological applications, from linear to non-linear and from classical mechanics to quantum mechanics. Self-similar laws have been used as a model for the description of a huge number of biological systems. These laws may explain the origin of basal life structures. Log-periodic behaviors of acceleration or deceleration can be applied to branching macroevolution, to the time sequences of major evolutionary leaps. The existence of such a law does not mean that the role of chance in evolution is reduced, but instead that randomness and contingency may occur within a framework which may itself be structured in a partly statistical way. The scale relativity theory can open new perspectives in evolution. Copyright © 2012 Elsevier Masson SAS. All rights reserved.
Lineage Tracking for Probing Heritable Phenotypes at Single-Cell Resolution

PubMed Central

Cottinet, Denis; Condamine, Florence; Bremond, Nicolas; Griffiths, Andrew D.; Rainey, Paul B.; de Visser, J. Arjan G. M.; Baudry, Jean; Bibette, Jérôme

2016-01-01

Determining the phenotype and genotype of single cells is central to understanding microbial evolution. DNA sequencing technologies allow the detection of mutants at high resolution, but similar approaches for phenotypic analyses are still lacking. We show that a drop-based millifluidic system enables the detection of heritable phenotypic changes in evolving bacterial populations. At time intervals, cells were sampled and individually compartmentalized in 100 nL drops. Growth through 15 generations was monitored using a fluorescent protein reporter. Amplification of heritable changes–via growth–over multiple generations yields phenotypically distinct clusters reflecting variation relevant for evolution. To demonstrate the utility of this approach, we follow the evolution of Escherichia coli populations during 30 days of starvation. Phenotypic diversity was observed to rapidly increase upon starvation with the emergence of heritable phenotypes. Mutations corresponding to each phenotypic class were identified by DNA sequencing. This scalable lineage-tracking technology opens the door to large-scale phenotyping methods with special utility for microbiology and microbial population biology. PMID:27077662
An In Vitro Translation, Selection, and Amplification System for Peptide Nucleic Acids

PubMed Central

Brudno, Yevgeny; Birnbaum, Michael E.; Kleiner, Ralph E.; Liu, David R.

2009-01-01

Methods to evolve synthetic, rather than biological, polymers could significantly expand the functional potential of polymers that emerge from in vitro evolution. Requirements for synthetic polymer evolution include: (i) sequence-specific polymerization of synthetic building blocks on an amplifiable template; (ii) display of the newly translated polymer strand in a manner that allows it to adopt folded structures; (iii) selection of synthetic polymer libraries for desired binding or catalytic properties; and (iv) amplification of template sequences surviving selection in a manner that allows subsequent translation. Here we report the development of such a system for peptide nucleic acids (PNAs) using a set of twelve PNA pentamer building blocks. We validated the system by performing six iterated cycles of translation, selection, and amplification on a library of 4.3 × 10^8 PNA-encoding DNA templates and observed >1,000,000-fold overall enrichment of a template encoding a biotinylated (streptavidin-binding) PNA. These results collectively provide an experimental foundation for PNA evolution in the laboratory. PMID:20081830
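To put the reported enrichment in perspective, the short calculation below converts an overall fold-enrichment into a geometric-mean enrichment per cycle. It assumes, purely for illustration, that enrichment compounds multiplicatively and evenly across cycles, which the abstract does not state. The function name is hypothetical, and because the reported figure is a lower bound (>1,000,000-fold), the result is a lower bound as well.

```python
def mean_fold_per_cycle(overall_fold: float, cycles: int) -> float:
    """Geometric-mean enrichment per cycle, assuming multiplicative compounding."""
    if overall_fold <= 0 or cycles <= 0:
        raise ValueError("overall_fold and cycles must be positive")
    return overall_fold ** (1.0 / cycles)


# Six cycles giving at least 1,000,000-fold overall enrichment.
per_cycle = mean_fold_per_cycle(1_000_000, 6)
print(f"at least ~{per_cycle:.1f}-fold per cycle on average")
```

Under that assumption, six cycles at roughly ten-fold each are enough to account for a million-fold overall enrichment.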
60 years ago, Francis Crick changed the logic of biology

PubMed Central

2017-01-01

In September 1957, Francis Crick gave a lecture in which he outlined key ideas about gene function, in particular what he called the central dogma. These ideas still frame how we understand life. This essay explores the concepts he developed in this influential lecture, including his prediction that we would study evolution by comparing sequences. PMID:28922352
Some Physical Principles Governing Spatial and Temporal Organization in Living Systems

NASA Astrophysics Data System (ADS)

Ali, Md Zulfikar

Spatial and temporal organization in living organisms are crucial for a variety of biological functions and arise from the interplay of a large number of interacting molecules. One of the central questions in systems biology is to understand how such an intricate organization emerges from the molecular biochemistry of the cell. In this dissertation we explore two projects. The first project relates to pattern formation in a cell membrane as an example of spatial organization, and the second project relates to the evolution of oscillatory networks as a simple example of temporal organization. For the first project, we introduce a model for pattern formation in a two-component lipid bilayer and study the interplay between membrane composition and membrane geometry, demonstrating the existence of a rich phase diagram. Pattern formation is governed by the interplay between phase separation driven by lipid-lipid interactions and the tendency of lipid domains with high intrinsic curvature to deform the membrane away from its preferred position. Depending on membrane parameters, we find the formation of compact lipid micro-clusters or of striped domains. We calculate the stripe width analytically and find good agreement with stripe widths obtained from the simulations. For the second project, we introduce a minimal model for the evolution of functional protein-interaction networks using a sequence-based mutational algorithm and apply it to study the following problems. Using the model, we study robustness and designability of a 2-component network that generates oscillations. We completely enumerate the sequence space and the phenotypic space, and discuss the relationship between designability, robustness and evolvability. We further apply the model to studies of neutral drift in networks that yield oscillatory dynamics, e.g. starting with a relatively simple network and allowing it to evolve by adding nodes and connections while requiring that oscillatory dynamics be preserved.
Our studies demonstrate both the importance of employing a sequence-based evolutionary scheme and the relative rapidity (in evolutionary time) for the redistribution of function over new nodes via neutral drift. In addition we discovered another much slower timescale for network evolution, reflecting hidden order in sequence space that we interpret in terms of sparsely connected domains. Finally, we use the model to study the evolution of an oscillator from a non-oscillatory network under the influence of external periodic forcing as a model for evolution of circadian rhythm in living systems. We use a greedy algorithm based on optimizing biologically motivated fitness functions and find that the algorithm successfully produces oscillators. However, the distribution of free-period of evolved oscillators depends on the choice of fitness functions and the nature of forcing.
Temporal variations in the gene expression levels of cyanobacterial anti-oxidant enzymes through geological history: implications for biological evolution during the Great Oxidation Event

NASA Astrophysics Data System (ADS)

Harada, M.; Furukawa, R.; Yokobori, S. I.; Tajika, E.; Yamagishi, A.

2016-12-01

A significant rise in atmospheric O2 levels during the GOE (Great Oxidation Event), ca. 2.45-2.0 Ga, must have caused great stress to the biosphere, forcing life to adapt to oxic conditions. Cyanobacteria, oxygenic photosynthetic bacteria that had been responsible for the GOE, are at the same time one of the organisms that would have been greatly affected by the rise of O2 level in the surface environments. Knowledge of the evolution of cyanobacteria is not only important to elucidate the cause of the GOE, but also helps us to better understand the adaptive evolution of life in response to the GOE. Here we performed phylogenetic analysis of an anti-oxidant enzyme Fe-SOD (iron superoxide dismutase) of cyanobacteria, to assess the adaptive evolution of life under the GOE. The rise of O2 level must have increased the level of toxic reactive oxygen species in cyanobacterial cells, thus forcing them to change activities or the gene expression levels of Fe-SOD. In the present study, we focus on the change in the gene expression levels of the enzyme, which can be estimated from the promoter sequences of the gene. Promoters are DNA sequences found upstream of protein encoding regions, where RNA polymerase binds and initiates transcription. "Strong" promoters that efficiently interact with RNA polymerase induce high rates of transcription, leading to high levels of gene expression. Thus, from the temporal changes in the promoter sequences, we can estimate the variations in the gene expression levels over geological time. Promoter sequences of Fe-SOD at each ancestral node of cyanobacteria were predicted from phylogenetic analysis, and the ancestral promoter sequences were compared to the promoters of known highly expressed genes. The similarity was low at the time of the emergence of cyanobacteria; however, it increased at the branching nodes that diverged 2.4 billion years ago.
This roughly coincided with the onset of the GOE, implying that the transition from low to high gene expression levels of Fe-SOD occurred in response to the GOE. We propose that this is the first direct evidence of the evolution of cyanobacteria related to the rise of O2, and that the methodologies of ancestral promoter analysis used in this study can be a novel tool to reveal the biological adaptation to such a significant geologic event.
Mapping the Geometric Evolution of Protein Folding Motor.

PubMed

Jerath, Gaurav; Hazam, Prakash Kishore; Shekhar, Shashi; Ramakrishnan, Vibin

2016-01-01

A polypeptide chain has an invariant main-chain and a variant side-chain sequence. How the side-chain sequence determines fold in terms of its chemical constitution has been scrutinized extensively and verified periodically. However, a focussed investigation on the directive effect of side-chain geometry may provide important insights supplementing existing algorithms in mapping the geometrical evolution of protein chains and their structural preferences. Geometrically, folding of protein structure may be envisaged as the evolution of its geometric variables: the ϕ and ψ dihedral angles of the polypeptide main-chain, directed by χ1 and χ2 of the side chain. In this work, a protein molecule is metaphorically modelled as a machine with 4 rotors ϕ, ψ, χ1 and χ2, whose evolution to the functional fold is directed by combinations of its rotor directions. We observe that differential rotor motions lead to different secondary structure formations and the combinatorial pattern is unique and consistent for a particular secondary structure type. Further, we found that the combination of rotor geometries of each amino acid is unique, which partly explains how different amino acid sequence combinations have unique structural evolution and functional adaptation. Quantification of these amino acid rotor preferences resulted in the generation of 3 substitution matrices, which were later plugged into the BLAST tool to evaluate their efficiency in aligning sequences. We have employed BLOSUM62 and PAM30 as standards for primary evaluation. Generation of substitution matrices is a logical extension of the conceptual framework we attempted to build during the development of this work. Optimization of matrices following the conventional routines and possible application with biologically relevant data sets are beyond the scope of this manuscript, though it is a part of the larger project design.
How Life and Rocks Have Co-Evolved

NASA Astrophysics Data System (ADS)

Hazen, R.

2014-04-01

The near-surface environment of terrestrial planets and moons evolves as a consequence of selective physical, chemical, and biological processes - an evolution that is preserved in the mineralogical record. Mineral evolution begins with approximately 12 different refractory minerals that form in the cooling envelopes of exploding stars. Subsequent aqueous and thermal alteration of planetesimals results in the approximately 250 minerals now found in unweathered lunar and meteorite samples. Following Earth's accretion and differentiation, mineral evolution resulted from a sequence of geochemical and petrologic processes, which led to perhaps 1500 mineral species. According to some origin-of-life scenarios, a planet must progress through at least some of these stages of chemical processing as a prerequisite for life. Once life emerged, mineralogy and biology co-evolved and dramatically increased Earth's mineral diversity to >4000 species. Sequential stages of a planet's near-surface evolution arise from three primary mechanisms: (1) the progressive separation and concentration of the elements from their original relatively uniform distribution in the presolar nebula; (2) the increase in range of intensive variables such as pressure, temperature, and volatile activities; and (3) the generation of far-from-equilibrium conditions by living systems. Remote observations of the mineralogy of other terrestrial bodies may thus provide evidence for biological influences beyond Earth. Recent studies of mineral diversification through time reveal striking correlations with major geochemical, tectonic, and biological events, including large changes in ocean chemistry, the supercontinent cycle, the increase of atmospheric oxygen, and the rise of the terrestrial biosphere.
Denisovans, Melanesians, Europeans, and Neandertals: The Confusion of DNA Assumptions and the Biological Species Concept.

PubMed

Caldararo, Niccolo

2016-08-01

A number of recent articles have appeared on the Denisova fossil remains and attempts to produce DNA sequences from them. One of these recently appeared in Science by Vernot et al. (Science 352:235-239, 2016). We would like to advance an alternative interpretation of the data presented. One concerns the problem of contamination/degradation of the DNA sequences determined. Just as the publication of the first Neandertal sequence included an interpretation that argued that Neandertals had not contributed any genes to modern humans, the Denisovan interpretation has considerable influence on ideas regarding human evolution. The new papers, however, confuse established ideas concerning the nature of species, as well as the use of terms like premodern, Archaic Homo, and Homo heidelbergensis. Examination of these problems presents a solution by means of reinterpreting the results. Given the claims for gene transfer among a number of Mid Pleistocene hominids, it may be time to reexamine the idea of anagenesis in hominid evolution.
Markov-modulated Markov chains and the covarion process of molecular evolution.

PubMed

Galtier, N; Jean-Marie, A

2004-01-01

The covarion (or site specific rate variation, SSRV) process of biological sequence evolution is a process by which the evolutionary rate of a nucleotide/amino acid/codon position can change in time. In this paper, we introduce time-continuous, space-discrete, Markov-modulated Markov chains as a model for representing SSRV processes, generalizing existing theory to any model of rate change. We propose a fast algorithm for diagonalizing the generator matrix of relevant Markov-modulated Markov processes. This algorithm makes phylogeny likelihood calculation tractable even for a large number of rate classes and a large number of states, so that SSRV models become applicable to amino acid or codon sequence datasets. Using this algorithm, we investigate the accuracy of the discrete approximation to the Gamma distribution of evolutionary rates, widely used in molecular phylogeny. We show that a relatively large number of classes is required to achieve accurate approximation of the exact likelihood when the number of analyzed sequences exceeds 20, both under the SSRV and among site rate variation (ASRV) models.
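The construction described in the abstract above can be illustrated with a minimal sketch of a Markov-modulated generator matrix. Everything specific here is an assumption for illustration and not taken from the paper: a Jukes-Cantor substitution process, two rate classes ("off" with rate 0 and "on" with rate 1), a symmetric switching rate of 0.5, and the helper names jukes_cantor and modulated_generator. The sketch only builds the generator on the product space (rate class, nucleotide); it does not implement the authors' fast diagonalization algorithm.

```python
# Sketch of a Markov-modulated Markov chain generator (covarion-style model).
# Assumed for illustration: Jukes-Cantor substitutions, two rate classes,
# symmetric switching between classes. Not the algorithm from the paper.

def jukes_cantor(mu=1.0):
    """4x4 Jukes-Cantor generator: total exit rate mu, split evenly."""
    return [[-mu if i == j else mu / 3.0 for j in range(4)] for i in range(4)]

def modulated_generator(q, rates, switch):
    """Generator on the product space (rate class, state).

    q      : n x n generator of the substitution process
    rates  : list of K rate multipliers, one per rate class
    switch : K x K generator of the rate-switching process
    Returns the (K*n) x (K*n) matrix  diag(rates) (x) q + switch (x) I.
    """
    k, n = len(rates), len(q)
    g = [[0.0] * (k * n) for _ in range(k * n)]
    for a in range(k):
        for i in range(n):
            row = a * n + i
            # Substitutions within class a, scaled by that class's rate.
            for j in range(n):
                g[row][a * n + j] += rates[a] * q[i][j]
            # Switches between rate classes; the nucleotide is unchanged.
            for b in range(k):
                g[row][b * n + i] += switch[a][b]
    return g

Q = jukes_cantor()
S = [[-0.5, 0.5], [0.5, -0.5]]           # hypothetical switching generator
G = modulated_generator(Q, [0.0, 1.0], S)
```

Each row of G sums to zero, as required of a generator: a site in the "off" class can only switch class, while a site in the "on" class can either substitute or switch.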
Anticipatory Mechanisms in Evolutionary Living Systems

NASA Astrophysics Data System (ADS)

Dubois, Daniel M.; Holmberg, Stig C.

2010-11-01

This paper deals firstly with a revisiting of Darwin's theory of Natural Selection. Darwin in his book never uses the word "evolution", but shows a clear position about mutability of species. Darwin's Natural Selection was mainly inspired by the anticipatory Artificial Selection by humans in domestication, and the Malthus struggle for existence. Darwin showed that the struggle for existence leads to the preservation of the most divergent offspring of any one species. He cited several times the canon of "Natura non facit saltum". He spoke about the origin of life from some one primordial form, into which life was first breathed. Finally, Darwin anticipated future research in psychology. This paper cites the work of Ernst Mayr who was the first, after 90 years of an intense scientific debate, to present a new and stable Darwinian paradigm as the "Evolutionary Synthesis" in 1942. To explain what life is, the Living Systems Theory (LST) by J. G. Miller is presented. It is shown that the Autopoietic Systems Theory of Varela et al is also a fundamental component of living systems. In agreement with Darwin, the natural selection is a necessary condition for transformation of biological systems, but is not a sufficient condition. Thus, in this paper we conjecture that an anticipatory evolutionary mechanism exists with the genetic code that is a self-replicating and self-modifying anticipatory program. As demonstrated by Nobel laureate McClintock, evolution in genomes is programmed. The word "program" comes from "pro-gram" meaning to write before, by anticipation, and means a plan for the programming of a mechanism, or a sequence of coded instructions that can be inserted into a mechanism, or a sequence of coded instructions, as genes of behavioural responses, that is part of an organism. For example, cell death may be programmed by what is called apoptosis. This definitively is a great breakthrough in our understanding of biological evolution.
Hence, it is possible to formulate a new principle of evolution, i.e. the principle of Double Anticipatory Loop (DAL) of evolution: Biological evolution is driven by interaction between a mindless environment that is passively selecting the fittest inhabitants and purposeful anticipatory living systems, which are actively selecting and creating their own environment. Evolution on the genome level is triggered by environmental stress but guided by an inherent program.
Domain organizations of modular extracellular matrix proteins and their evolution.

PubMed

Engel, J

1996-11-01

Multidomain proteins which are composed of modular units are a rather recent invention of evolution. Domains are defined as autonomously folding regions of a protein, and many of them are similar in sequence and structure, indicating common ancestry. Their modular nature is emphasized by frequent repetitions in identical or in different proteins and by a large number of different combinations with other domains. The extracellular matrix is perhaps the largest biological system composed of modular mosaic proteins, and its astonishing complexity and diversity are based on them. A cluster of minireviews on modular proteins is being published in Matrix Biology. These deal with the evolution of modular proteins, the three-dimensional structure of domains and the ways in which these interact in a multidomain protein. They discuss structure-function relationships in calcium binding domains, collagen helices, alpha-helical coiled-coil domains and C-lectins. The present minireview is focused on some general aspects and serves as an introduction to the cluster.
Fossil rhabdoviral sequences integrated into arthropod genomes: ontogeny, evolution, and potential functionality.

PubMed

Fort, Philippe; Albertini, Aurélie; Van-Hua, Aurélie; Berthomieu, Arnaud; Roche, Stéphane; Delsuc, Frédéric; Pasteur, Nicole; Capy, Pierre; Gaudin, Yves; Weill, Mylène

2012-01-01

Retroelements represent a considerable fraction of many eukaryotic genomes and are considered major drivers of adaptive genetic innovations. Recent discoveries showed that despite not normally using DNA intermediates like retroviruses do, Mononegaviruses (i.e., viruses with nonsegmented, negative-sense RNA genomes) can integrate gene fragments into the genomes of their hosts. This was shown for Bornaviridae and Filoviridae, the sequences of which have been found integrated into the germ line cells of many vertebrate hosts. Here, we show that Rhabdoviridae sequences, the major Mononegavirales family, have integrated only into the genomes of arthropod species. We identified 185 integrated rhabdoviral elements (IREs) coding for nucleoproteins, glycoproteins, or RNA-dependent RNA polymerases; they were mostly found in the genomes of the mosquito Aedes aegypti and the blacklegged tick Ixodes scapularis. Phylogenetic analyses showed that most IREs in A. aegypti derived from multiple independent integration events. Since RNA viruses are subject to much higher substitution rates as compared with their hosts, IREs thus represent fossil traces of the diversity of extinct Rhabdoviruses. Furthermore, analyses of orthologous IREs in A. aegypti field mosquitoes sampled worldwide identified an integrated polymerase IRE fragment that appeared under purifying selection within several million years, which supports a functional role in the host's biology. These results show that A. aegypti was subjected to repeated Rhabdovirus infectious episodes during its evolutionary history, which led to the accumulation of many integrated sequences. They also suggest that like retroviruses, integrated rhabdoviral sequences may participate actively in the evolution of their hosts.
Early animal evolution: emerging views from comparative biology and geology

NASA Technical Reports Server (NTRS)

Knoll, A. H.; Carroll, S. B.

1999-01-01

The Cambrian appearance of fossils representing diverse phyla has long inspired hypotheses about possible genetic or environmental catalysts of early animal evolution. Only recently, however, have data begun to emerge that can resolve the sequence of genetic and morphological innovations, environmental events, and ecological interactions that collectively shaped Cambrian evolution. Assembly of the modern genetic tool kit for development and the initial divergence of major animal clades occurred during the Proterozoic Eon. Crown group morphologies diversified in the Cambrian through changes in the genetic regulatory networks that organize animal ontogeny. Cambrian radiation may have been triggered by environmental perturbation near the Proterozoic-Cambrian boundary and subsequently amplified by ecological interactions within reorganized ecosystems.
Comparative Analysis of the Peanut Witches'-Broom Phytoplasma Genome Reveals Horizontal Transfer of Potential Mobile Units and Effectors

PubMed Central

Lo, Wen-Sui; Lin, Chan-Pin; Kuo, Chih-Horng

2013-01-01

Phytoplasmas are a group of bacteria that are associated with hundreds of plant diseases. Due to their economical importance and the difficulties involved in the experimental study of these obligate pathogens, genome sequencing and comparative analysis have been utilized as powerful tools to understand phytoplasma biology. To date four complete phytoplasma genome sequences have been published. However, these four strains represent limited phylogenetic diversity. In this study, we report the shotgun sequencing and evolutionary analysis of a peanut witches'-broom (PnWB) phytoplasma genome. The availability of this genome provides the first representative of the 16SrII group and substantially improves the taxon sampling to investigate genome evolution. The draft genome assembly contains 13 chromosomal contigs with a total size of 562,473 bp, covering ∼90% of the chromosome. Additionally, a complete plasmid sequence is included. Comparisons among the five available phytoplasma genomes reveal the differentiations in gene content and metabolic capacity. Notably, phylogenetic inferences of the potential mobile units (PMUs) in these genomes indicate that horizontal transfer may have occurred between divergent phytoplasma lineages. Because many effectors are associated with PMUs, the horizontal transfer of these transposon-like elements can contribute to the adaptation and diversification of these pathogens. In summary, the findings from this study highlight the importance of improving taxon sampling when investigating genome evolution. Moreover, the currently available sequences are inadequate to fully characterize the pan-genome of phytoplasmas. Future genome sequencing efforts to expand phylogenetic diversity are essential in improving our understanding of phytoplasma evolution. PMID:23626855
Evolutionary distances in the twilight zone--a rational kernel approach.

PubMed

Schwarz, Roland F; Fletcher, William; Förster, Frank; Merget, Benjamin; Wolf, Matthias; Schultz, Jörg; Markowetz, Florian

2010-12-31

Phylogenetic tree reconstruction is traditionally based on multiple sequence alignments (MSAs) and heavily depends on the validity of this information bottleneck. With increasing sequence divergence, the quality of MSAs decays quickly. Alignment-free methods, on the other hand, are based on abstract string comparisons and avoid potential alignment problems. However, in general they are not biologically motivated and ignore our knowledge about the evolution of sequences. Thus, it is still a major open question how to define an evolutionary distance metric between divergent sequences that makes use of indel information and known substitution models without the need for a multiple alignment. Here we propose a new evolutionary distance metric to close this gap. It uses finite-state transducers to create a biologically motivated similarity score which models substitutions and indels, and does not depend on a multiple sequence alignment. The sequence similarity score is defined in analogy to pairwise alignments and additionally has the positive semi-definite property. We describe its derivation and show in simulation studies and real-world examples that it is more accurate in reconstructing phylogenies than competing methods. The result is a new and accurate way of determining evolutionary distances in and beyond the twilight zone of sequence alignments that is suitable for large datasets.
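As a rough illustration of how a positive semi-definite similarity score can yield an alignment-free distance, the sketch below uses a k-mer spectrum kernel, which is commonly described as one of the simplest rational kernels. This is a stand-in technique chosen for brevity: it ignores indels and substitution models and is not the transducer-based score proposed in the abstract above. The function names and the choice k=3 are assumptions for illustration.

```python
# Sketch: alignment-free distance from a positive semi-definite string kernel.
# Assumed for illustration: a plain k-mer spectrum kernel (no indel or
# substitution model), which is NOT the transducer kernel from the abstract.
from collections import Counter
import math

def kmer_counts(seq, k):
    """Count every overlapping k-mer in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def spectrum_kernel(a, b, k=3):
    """Inner product of the k-mer count vectors of a and b."""
    ca, cb = kmer_counts(a, k), kmer_counts(b, k)
    return sum(ca[w] * cb[w] for w in ca if w in cb)

def kernel_distance(a, b, k=3):
    """Distance induced by the kernel: sqrt(K(a,a) + K(b,b) - 2 K(a,b))."""
    d2 = spectrum_kernel(a, a, k) + spectrum_kernel(b, b, k) \
        - 2 * spectrum_kernel(a, b, k)
    return math.sqrt(max(d2, 0.0))  # guard against tiny negative round-off

d_same = kernel_distance("ACGTACGT", "ACGTACGT")
d_diff = kernel_distance("ACGT", "ACGA")
```

Because the kernel is an inner product of count vectors, the induced distance is zero for identical sequences and grows as the k-mer content diverges; a matrix of such pairwise distances could then be passed to a distance-based tree-building method.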
Project 1: Microbial Genomes: A Genomic Approach to Understanding the Evolution of Virulence. Project 2: From Genomes to Life: Drosophilia Development in Space and Time

DOE Office of Scientific and Technical Information (OSTI.GOV)

Robert DeSalle

2004-09-10

This project seeks to use the genomes of two close relatives, A. actinomycetemcomitans and H. aphrophilus, to understand the evolutionary changes that take place in a genome to make it more or less virulent. Our primary specific aim of this project was to sequence, annotate, and analyze the genomes of Actinobacillus actinomycetemcomitans (CU1000, serotype f) and Haemophilus aphrophilus. With these genome sequences we have then compared the whole genome sequences to each other and to the current Aa (HK1651 www.genome.ou.edu) genome project sequence along with other fully sequenced Pasteurellaceae to determine inter and intra species differences that may account for the differences and similarities in disease. We also propose to create and curate a comprehensive database where sequence information and analysis for the Pasteurellaceae (family that includes the genera Actinobacillus and Haemophilus) are readily accessible. And finally we have proposed to develop phylogenetic techniques that can be used to efficiently and accurately examine the evolution of genomes. Below we report on progress we have made on these major specific aims. Progress on the specific aims is reported below under two major headings--experimental approaches and bioinformatics and systematic biology approaches.
Progress in Understanding and Sequencing the Genome of Brassica rapa

PubMed Central

Hong, Chang Pyo; Kwon, Soo-Jin; Kim, Jung Sun; Yang, Tae-Jin; Park, Beom-Seok; Lim, Yong Pyo

2008-01-01

Brassica rapa, which is closely related to Arabidopsis thaliana, is an important crop and a model plant for studying genome evolution via polyploidization. We report the current understanding of the genome structure of B. rapa and efforts for the whole-genome sequencing of the species. The tribe Brassicaceae, which comprises ca. 240 species, descended from a common hexaploid ancestor with a basic genome similar to that of Arabidopsis. Chromosome rearrangements, including fusions and/or fissions, resulted in the present-day “diploid” Brassica species with variation in chromosome number and phenotype. Triplicated genomic segments of B. rapa are collinear to those of A. thaliana with InDels. The genome triplication has led to an approximately 1.7-fold increase in the B. rapa gene number compared to that of A. thaliana. Repetitive DNA of B. rapa has also been extensively amplified and has diverged from that of A. thaliana. For its whole-genome sequencing, the Brassica rapa Genome Sequencing Project (BrGSP) consortium has developed suitable genomic resources and constructed genetic and physical maps. Ten chromosomes of B. rapa are being allocated to BrGSP consortium participants, and each chromosome will be sequenced by a BAC-by-BAC approach. Genome sequencing of B. rapa will offer a new perspective for plant biology and evolution in the context of polyploidization. PMID:18288250
Sperm Bindin Divergence under Sexual Selection and Concerted Evolution in Sea Stars.

PubMed

Patiño, Susana; Keever, Carson C; Sunday, Jennifer M; Popovic, Iva; Byrne, Maria; Hart, Michael W

2016-08-01

Selection associated with competition among males or sexual conflict between mates can create positive selection for high rates of molecular evolution of gamete recognition genes and lead to reproductive isolation between species. We analyzed coding sequence and repetitive domain variation in the gene encoding the sperm acrosomal protein bindin in 13 diverse sea star species. We found that bindin has a conserved coding sequence domain structure in all 13 species, with several repeated motifs in a large central region that is similar among all sea stars in organization but highly divergent among genera in nucleotide and predicted amino acid sequence. More bindin codons and lineages showed positive selection for high relative rates of amino acid substitution in genera with gonochoric outcrossing adults (and greater expected strength of sexual selection) than in selfing hermaphrodites. That difference is consistent with the expectation that selfing (a highly derived mating system) may moderate the strength of sexual selection and limit the accumulation of bindin amino acid differences. The results implicate both positive selection on single codons and concerted evolution within the repetitive region in bindin divergence, and suggest that both single amino acid differences and repeat differences may affect sperm-egg binding and reproductive compatibility. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
MOLECULAR CLONING, SEQUENCING, EXPRESSION AND BIOLOGICAL ACTIVITY OF GIANT PANDA (AILUROPODA MELANOLEUCA) INTERFERON-GAMMA.

PubMed

Zhu, Hui; Wang, Wen-Xiu; Wang, Bao-Qin; Zhu, Xiao-Fu; Wu, Xu-Jin; Ma, Qing-Yi; Chen, De-Kun

2012-06-29

The giant panda (Ailuropoda melanoleuca) is an endangered species and indigenous to China. Interferon-gamma (IFN-γ) is the only member of type II IFN and is vital for the regulation of host adaptive immunity and inflammatory response. Little is known about the IFN-γ gene and its roles in giant panda. In this study, the IFN-γ gene of Qinling giant panda was amplified from total blood RNA by RT-PCR, cloned, sequenced and analysed. The open reading frame (ORF) of Qinling giant panda IFN-γ encodes 152 amino acids and is highly similar to that of Sichuan giant panda, with an identity of 99.3% in cDNA sequence. The IFN-γ cDNA sequence was ligated to the pET32a vector and transformed into E. coli BL21 competent cells. Expression of recombinant IFN-γ protein of Qinling giant panda in E. coli was confirmed by SDS-PAGE and Western blot analysis. A biological activity assay indicated that the recombinant IFN-γ protein at concentrations of 4-10 µg/ml activated giant panda peripheral blood lymphocytes, while at 12 µg/ml it inhibited the activation of the lymphocytes. These findings provide insights into the evolution of giant panda IFN-γ and information regarding amino acid residues essential for its biological activity.
Short reads from honey bee (Apis sp.) sequencing projects reflect microbial associate diversity

PubMed Central

Hurst, Gregory D.D.

2017-01-01

High throughput (or ‘next generation’) sequencing has transformed most areas of biological research and is now a standard method that underpins empirical study of organismal biology, and (through comparison of genomes), reveals patterns of evolution. For projects focused on animals, these sequencing methods do not discriminate between the primary target of sequencing (the animal genome) and ‘contaminating’ material, such as associated microbes. A common first step is to filter out these contaminants to allow better assembly of the animal genome or transcriptome. Here, we aimed to assess if these ‘contaminations’ provide information with regard to biologically important microorganisms associated with the individual. To achieve this, we examined whether the short read data from Apis retrieved elements of its well established microbiome. To this end, we screened almost 1,000 short read libraries of honey bee (Apis sp.) DNA sequencing project for the presence of microbial sequences, and find sequences from known honey bee microbial associates in at least 11% of them. Further to this, we screened ∼500 Apis RNA sequencing libraries for evidence of viral infections, which were found to be present in about half of them. We then used the data to reconstruct draft genomes of three Apis associated bacteria, as well as several viral strains de novo. We conclude that ‘contamination’ in short read sequencing libraries can provide useful genomic information on microbial taxa known to be associated with the target organisms, and may even lead to the discovery of novel associations. Finally, we demonstrate that RNAseq samples from experiments commonly carry uneven viral loads across libraries. We note variation in viral presence and load may be a confounding feature of differential gene expression analyses, and as such it should be incorporated as a random factor in analyses. PMID:28717593
Short reads from honey bee (Apis sp.) sequencing projects reflect microbial associate diversity.

PubMed

Gerth, Michael; Hurst, Gregory D D

2017-01-01

High throughput (or 'next generation') sequencing has transformed most areas of biological research and is now a standard method that underpins empirical study of organismal biology, and (through comparison of genomes), reveals patterns of evolution. For projects focused on animals, these sequencing methods do not discriminate between the primary target of sequencing (the animal genome) and 'contaminating' material, such as associated microbes. A common first step is to filter out these contaminants to allow better assembly of the animal genome or transcriptome. Here, we aimed to assess if these 'contaminations' provide information with regard to biologically important microorganisms associated with the individual. To achieve this, we examined whether the short read data from Apis retrieved elements of its well established microbiome. To this end, we screened almost 1,000 short read libraries of honey bee (Apis sp.) DNA sequencing project for the presence of microbial sequences, and find sequences from known honey bee microbial associates in at least 11% of them. Further to this, we screened ∼500 Apis RNA sequencing libraries for evidence of viral infections, which were found to be present in about half of them. We then used the data to reconstruct draft genomes of three Apis associated bacteria, as well as several viral strains de novo. We conclude that 'contamination' in short read sequencing libraries can provide useful genomic information on microbial taxa known to be associated with the target organisms, and may even lead to the discovery of novel associations. Finally, we demonstrate that RNAseq samples from experiments commonly carry uneven viral loads across libraries. We note variation in viral presence and load may be a confounding feature of differential gene expression analyses, and as such it should be incorporated as a random factor in analyses.

Building a model: developing genomic resources for common milkweed (Asclepias syriaca) with low coverage genome sequencing

PubMed Central

2011-01-01

Background Milkweeds (Asclepias L.) have been extensively investigated in diverse areas of evolutionary biology and ecology; however, there are few genetic resources available to facilitate and complement these studies. This study explored how low coverage genome sequencing of the common milkweed (Asclepias syriaca L.) could be useful in characterizing the genome of a plant without prior genomic information and for development of genomic resources as a step toward further developing A. syriaca as a model in ecology and evolution. Results A 0.5× genome of A. syriaca was produced using Illumina sequencing. A virtually complete chloroplast genome of 158,598 bp was assembled, revealing few repeats and loss of three genes: accD, clpP, and ycf1. A nearly complete rDNA cistron (18S-5.8S-26S; 7,541 bp) and 5S rDNA (120 bp) sequence were obtained. Assessment of polymorphism revealed that the rDNA cistron and 5S rDNA had 0.3% and 26.7% polymorphic sites, respectively. A partial mitochondrial genome sequence (130,764 bp), with identical gene content to tobacco, was also assembled. An initial characterization of repeat content indicated that Ty1/copia-like retroelements are the most common repeat type in the milkweed genome. At least one A. syriaca microread hit 88% of Catharanthus roseus (Apocynaceae) unigenes (median coverage of 0.29×) and 66% of single copy orthologs (COSII) in asterids (median coverage of 0.14×). From this partial characterization of the A. syriaca genome, markers for population genetics (microsatellites) and phylogenetics (low-copy nuclear genes) studies were developed. Conclusions The results highlight the promise of next generation sequencing for development of genomic resources for any organism. Low coverage genome sequencing allows characterization of the high copy fraction of the genome and exploration of the low copy fraction of the genome, which facilitate the development of molecular tools for further study of a target species and its relatives.
This study represents a first step in the development of a community resource for further study of plant-insect co-evolution, anti-herbivore defense, floral developmental genetics, reproductive biology, chemical evolution, population genetics, and comparative genomics using milkweeds, and A. syriaca in particular, as ecological and evolutionary models. PMID:21542930
Building a model: developing genomic resources for common milkweed (Asclepias syriaca) with low coverage genome sequencing.

PubMed

Straub, Shannon C K; Fishbein, Mark; Livshultz, Tatyana; Foster, Zachary; Parks, Matthew; Weitemier, Kevin; Cronn, Richard C; Liston, Aaron

2011-05-04

Milkweeds (Asclepias L.) have been extensively investigated in diverse areas of evolutionary biology and ecology; however, there are few genetic resources available to facilitate and complement these studies. This study explored how low coverage genome sequencing of the common milkweed (Asclepias syriaca L.) could be useful in characterizing the genome of a plant without prior genomic information and for development of genomic resources as a step toward further developing A. syriaca as a model in ecology and evolution. A 0.5× genome of A. syriaca was produced using Illumina sequencing. A virtually complete chloroplast genome of 158,598 bp was assembled, revealing few repeats and loss of three genes: accD, clpP, and ycf1. A nearly complete rDNA cistron (18S-5.8S-26S; 7,541 bp) and 5S rDNA (120 bp) sequence were obtained. Assessment of polymorphism revealed that the rDNA cistron and 5S rDNA had 0.3% and 26.7% polymorphic sites, respectively. A partial mitochondrial genome sequence (130,764 bp), with identical gene content to tobacco, was also assembled. An initial characterization of repeat content indicated that Ty1/copia-like retroelements are the most common repeat type in the milkweed genome. At least one A. syriaca microread hit 88% of Catharanthus roseus (Apocynaceae) unigenes (median coverage of 0.29×) and 66% of single copy orthologs (COSII) in asterids (median coverage of 0.14×). From this partial characterization of the A. syriaca genome, markers for population genetics (microsatellites) and phylogenetics (low-copy nuclear genes) studies were developed. The results highlight the promise of next generation sequencing for development of genomic resources for any organism. Low coverage genome sequencing allows characterization of the high copy fraction of the genome and exploration of the low copy fraction of the genome, which facilitate the development of molecular tools for further study of a target species and its relatives.
This study represents a first step in the development of a community resource for further study of plant-insect co-evolution, anti-herbivore defense, floral developmental genetics, reproductive biology, chemical evolution, population genetics, and comparative genomics using milkweeds, and A. syriaca in particular, as ecological and evolutionary models.
Codon usage bias: causative factors, quantification methods and genome-wide patterns: with emphasis on insect genomes.

PubMed

Behura, Susanta K; Severson, David W

2013-02-01

Codon usage bias refers to the phenomenon where specific codons are used more often than other synonymous codons during translation of genes, the extent of which varies within and among species. Molecular evolutionary investigations suggest that codon bias is manifested as a result of balance between mutational and translational selection of such genes and that this phenomenon is widespread across species and may contribute to genome evolution in a significant manner. With the advent of whole-genome sequencing of numerous species, both prokaryotes and eukaryotes, genome-wide patterns of codon bias are emerging in different organisms. Various factors such as expression level, GC content, recombination rates, RNA stability, codon position, gene length and others (including environmental stress and population size) can influence codon usage bias within and among species. Moreover, there has been a continuous quest towards developing new concepts and tools to measure the extent of codon usage bias of genes. In this review, we outline the fundamental concepts of evolution of the genetic code, discuss various factors that may influence biased usage of synonymous codons and then outline different principles and methods of measurement of codon usage bias. Finally, we discuss selected studies performed using whole-genome sequences of different insect species to show how codon bias patterns vary within and among genomes. We conclude with generalized remarks on specific emerging aspects of codon bias studies and highlight the recent explosion of genome-sequencing efforts on arthropods (such as twelve Drosophila species, species of ants, honeybee, Nasonia and Anopheles mosquitoes as well as the recent launch of a genome-sequencing project involving 5000 insects and other arthropods) that may help us to understand better the evolution of codon bias and its biological significance. © 2012 The Authors. Biological Reviews © 2012 Cambridge Philosophical Society.
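As an illustration of what "measuring codon usage bias" can look like in practice, the sketch below computes relative synonymous codon usage (RSCU), one widely used measure: the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The function name `rscu` and the deliberately partial codon table are assumptions made for this example; the abstract does not say which measures the review covers.

```python
from collections import Counter

# Partial codon table for illustration only: three amino acids.
# A real analysis would use the full genetic code.
SYNONYMOUS = {
    "Lys": ["AAA", "AAG"],
    "Phe": ["TTT", "TTC"],
    "Gly": ["GGT", "GGC", "GGA", "GGG"],
}

def rscu(cds: str) -> dict[str, float]:
    """Return RSCU values for the codons in SYNONYMOUS.

    RSCU = observed count / mean count over the codon's synonymous family.
    Families that do not occur in the sequence are skipped.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3          # ignore a trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    values = {}
    for family in SYNONYMOUS.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue
        expected = total / len(family)        # equal usage within the family
        for codon in family:
            values[codon] = counts[codon] / expected
    return values

print(rscu("AAAAAAAAGTTT"))
```

A value of 1.0 means a codon is used as often as expected under equal usage; values above 1.0 indicate a preferred codon within its family.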
Networking Omic Data to Envisage Systems Biological Regulation.

PubMed

Kalapanulak, Saowalak; Saithong, Treenut; Thammarongtham, Chinae

To understand how biological processes work, it is necessary to explore the systematic regulation governing the behaviour of the processes. Not only driving the normal behavior of organisms, the systematic regulation evidently underlies the temporal responses to surrounding environments (dynamics) and long-term phenotypic adaptation (evolution). The systematic regulation is, in effect, formulated from the regulatory components which collaboratively work together as a network. In the drive to decipher such a code of lives, a spectrum of technologies has continuously been developed in the post-genomic era. With current advances, high-throughput sequencing technologies are tremendously powerful for facilitating genomics and systems biology studies in the attempt to understand system regulation inside the cells. The ability to explore relevant regulatory components which infer transcriptional and signaling regulation, driving core cellular processes, is thus enhanced. This chapter reviews high-throughput sequencing technologies, including second and third generation sequencing technologies, which support the investigation of genomics and transcriptomics data. Utilization of this high-throughput data to form the virtual network of systems regulation is explained, particularly transcriptional regulatory networks. Analysis of the resulting regulatory networks could lead to an understanding of cellular systems regulation at the mechanistic and dynamics levels. The great contribution of the biological networking approach to envisage systems regulation is finally demonstrated by a broad range of examples.
Fitness in time-dependent environments includes a geometric phase contribution

PubMed Central

Tănase-Nicola, Sorin; Nemenman, Ilya

2012-01-01

Phenotypic evolution implies sequential rise in frequency of new genomic sequences. The speed of the rise depends, in part, on the relative fitness (selection coefficient) of the mutant versus the ancestor. Using a simple population dynamics model, we show that the relative fitness in dynamical environments is not equal to the geometric average of the fitness over individual environments. Instead, it includes a term that explicitly depends on the sequence of the environments. For slowly varying environments, this term depends only on the oriented area enclosed by the trajectory taken by the system in the environment state space. It is closely related to the well-studied geometric phases in classical and quantum physical systems. We discuss possible biological implications of these observations, focusing on evolution of novel metabolic or stress-resistant functions. PMID:22112653
A genomic survey of the fish parasite Spironucleus salmonicida indicates genomic plasticity among diplomonads and significant lateral gene transfer in eukaryote genome evolution

PubMed Central

Andersson, Jan O; Sjögren, Åsa M; Horner, David S; Murphy, Colleen A; Dyal, Patricia L; Svärd, Staffan G; Logsdon, John M; Ragan, Mark A; Hirt, Robert P; Roger, Andrew J

2007-01-01

Background Comparative genomic studies of the mitochondrion-lacking protist group Diplomonadida (diplomonads) have been lacking, although Giardia lamblia has been intensively studied. We have performed a sequence survey project resulting in 2341 expressed sequence tags (EST) corresponding to 853 unique clones, 5275 genome survey sequences (GSS), and eleven finished contigs from the diplomonad fish parasite Spironucleus salmonicida (previously described as S. barkhanus). Results The analyses revealed a compact genome with few, if any, introns and very short 3' untranslated regions. Strikingly different patterns of codon usage were observed in genes corresponding to frequently sampled ESTs versus genes poorly sampled, indicating that translational selection is influencing the codon usage of highly expressed genes. Rigorous phylogenomic analyses identified 84 genes – mostly encoding metabolic proteins – that have been acquired by diplomonads or their relatively close ancestors via lateral gene transfer (LGT). Although most acquisitions were from prokaryotes, more than a dozen represent likely transfers of genes between eukaryotic lineages. Many genes that provide novel insights into the genetic basis of the biology and pathogenicity of this parasitic protist were identified including 149 that putatively encode variant-surface cysteine-rich proteins which are candidate virulence factors. A number of genomic properties that distinguish S. salmonicida from its human parasitic relative G. lamblia were identified such as nineteen putative lineage-specific gene acquisitions, distinct mutational biases and codon usage and distinct polyadenylation signals. Conclusion Our results highlight the power of comparative genomic studies to yield insights into the biology of parasitic protists and the evolution of their genomes, and suggest that genetic exchange between distantly-related protist lineages may be occurring at an appreciable rate in eukaryote genome evolution. PMID:17298675
Sequence Evolution and Expression Regulation of Stress-Responsive Genes in Natural Populations of Wild Tomato

PubMed Central

Fischer, Iris; Steige, Kim A.; Stephan, Wolfgang; Mboup, Mamadou

2013-01-01

The wild tomato species Solanum chilense and S. peruvianum are a valuable non-model system for studying plant adaptation since they grow in diverse environments facing many abiotic constraints. Here we investigate the sequence evolution of regulatory regions of drought and cold responsive genes and their expression regulation. The coding regions of these genes were previously shown to exhibit signatures of positive selection. Expression profiles and sequence evolution of regulatory regions of members of the Asr (ABA/water stress/ripening induced) gene family and the dehydrin gene pLC30-15 were analyzed in wild tomato populations from contrasting environments. For S. chilense, we found that Asr4 and pLC30-15 appear to respond much faster to drought conditions in accessions from very dry environments than accessions from more mesic locations. Sequence analysis suggests that the promoter of Asr2 and the downstream region of pLC30-15 are under positive selection in some local populations of S. chilense. By investigating gene expression differences at the population level we provide further support of our previous conclusions that Asr2, Asr4, and pLC30-15 are promising candidates for functional studies of adaptation. Our analysis also demonstrates the power of the candidate gene approach in evolutionary biology research and highlights the importance of wild Solanum species as a genetic resource for their cultivated relatives. PMID:24205149
Lateral Gene Transfer in a Heavy Metal-Contaminated-Groundwater Microbial Community

PubMed Central

Hemme, Christopher L.; Green, Stefan J.; Rishishwar, Lavanya; Prakash, Om; Pettenato, Angelica; Chakraborty, Romy; Deutschbauer, Adam M.; Van Nostrand, Joy D.; Wu, Liyou; He, Zhili; Jordan, I. King; Arkin, Adam P.; Kostka, Joel E.

2016-01-01

ABSTRACT Unraveling the drivers controlling the response and adaptation of biological communities to environmental change, especially anthropogenic activities, is a central but poorly understood issue in ecology and evolution. Comparative genomics studies suggest that lateral gene transfer (LGT) is a major force driving microbial genome evolution, but its role in the evolution of microbial communities remains elusive. To delineate the importance of LGT in mediating the response of a groundwater microbial community to heavy metal contamination, representative Rhodanobacter reference genomes were sequenced and compared to shotgun metagenome sequences. 16S rRNA gene-based amplicon sequence analysis indicated that Rhodanobacter populations were highly abundant in contaminated wells with low pHs and high levels of nitrate and heavy metals but remained rare in the uncontaminated wells. Sequence comparisons revealed that multiple geochemically important genes, including genes encoding Fe2+/Pb2+ permeases, most denitrification enzymes, and cytochrome c553, were native to Rhodanobacter and not subjected to LGT. In contrast, the Rhodanobacter pangenome contained a recombinational hot spot in which numerous metal resistance genes were subjected to LGT and/or duplication. In particular, Co2+/Zn2+/Cd2+ efflux and mercuric resistance operon genes appeared to be highly mobile within Rhodanobacter populations. Evidence of multiple duplications of a mercuric resistance operon common to most Rhodanobacter strains was also observed. Collectively, our analyses indicated the importance of LGT during the evolution of groundwater microbial communities in response to heavy metal contamination, and a conceptual model was developed to display such adaptive evolutionary processes for explaining the extreme dominance of Rhodanobacter populations in the contaminated groundwater microbiome. PMID:27048805
Quantifying the Number of Independent Organelle DNA Insertions in Genome Evolution and Human Health.

PubMed

Hazkani-Covo, Einat; Martin, William F

2017-05-01

Fragments of organelle genomes are often found as insertions in nuclear DNA. These fragments of mitochondrial DNA (numts) and plastid DNA (nupts) are ubiquitous components of eukaryotic genomes. They are, however, often edited out during the genome assembly process, leading to systematic underestimation of their frequency. Numts and nupts, once inserted, can become further fragmented through subsequent insertion of mobile elements or other recombinational events that disrupt the continuity of the inserted sequence relative to the genuine organelle DNA copy. Because numts and nupts are typically identified through sequence comparison tools such as BLAST, disruption of insertions into smaller fragments can lead to systematic overestimation of numt and nupt frequencies. Accurate identification of numts and nupts is important, however, both for better understanding of their role during evolution, and for monitoring their increasingly evident role in human disease. Human populations are polymorphic for 141 numt loci, five numts are causal to genetic disease, and cancer genomic studies are revealing an abundance of numts associated with tumor progression. Here, we report investigation of salient parameters involved in obtaining accurate estimates of numt and nupt numbers in genome sequence data. Numts and nupts from 44 sequenced eukaryotic genomes reveal lineage-specific differences in the number, relative age and frequency of insertional events as well as lineage-specific dynamics of their postinsertional fragmentation. Our findings outline the main technical parameters influencing accurate identification and frequency estimation of numts in genomic studies pertinent to both evolution and human health. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
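The overestimation problem described above (one insertion later split into several BLAST hits) can be illustrated with a small chaining sketch. It counts neighbouring hits as a single insertion event when they lie close together on the nuclear chromosome and continue in the same order on the organelle genome. The `Hit` record, the `count_insertions` function and the 2,000 bp `max_gap` threshold are all hypothetical choices for this example, and the sketch assumes one chromosome and forward-strand hits only; it is not the procedure used by the authors.

```python
from typing import NamedTuple

class Hit(NamedTuple):
    nuc_start: int   # hit start on the nuclear chromosome
    nuc_end: int     # hit end on the nuclear chromosome
    org_start: int   # matching start on the organelle genome
    org_end: int     # matching end on the organelle genome

def count_insertions(hits: list[Hit], max_gap: int = 2000) -> int:
    """Count insertion events after chaining nearby, collinear hits."""
    events = 0
    prev = None
    for hit in sorted(hits, key=lambda h: h.nuc_start):
        same_event = (
            prev is not None
            and hit.nuc_start - prev.nuc_end <= max_gap      # close in the nucleus
            and hit.org_start >= prev.org_end                # same order in organelle
            and hit.org_start - prev.org_end <= max_gap      # close in the organelle
        )
        if not same_event:
            events += 1
        prev = hit
    return events

hits = [
    Hit(100, 300, 1000, 1200),       # first fragment of an insertion
    Hit(800, 1000, 1210, 1410),      # second fragment, split by a later insert
    Hit(50000, 50200, 5000, 5200),   # unrelated insertion far away
]
print(len(hits), "raw hits,", count_insertions(hits), "chained events")
```

Raising `max_gap` merges more aggressively and lowers the count, so the estimate depends directly on this parameter, which is the kind of sensitivity the abstract points to.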
Measuring the Evolutionary Rewiring of Biological Networks

PubMed Central

Shou, Chong; Bhardwaj, Nitin; Lam, Hugo Y. K.; Yan, Koon-Kiu; Kim, Philip M.; Snyder, Michael; Gerstein, Mark B.

2011-01-01

We have accumulated a large amount of biological network data and expect even more to come. Soon, we anticipate being able to compare many different biological networks as we commonly do for molecular sequences. It has long been believed that many of these networks change, or “rewire”, at different rates. It is therefore important to develop a framework to quantify the differences between networks in a unified fashion. We developed such a formalism based on analogy to simple models of sequence evolution, and used it to conduct a systematic study of network rewiring on all the currently available biological networks. We found that, similar to sequences, biological networks show a decreased rate of change at large time divergences, because of saturation in potential substitutions. However, different types of biological networks consistently rewire at different rates. Using comparative genomics and proteomics data, we found a consistent ordering of the rewiring rates: transcription regulatory, phosphorylation regulatory, genetic interaction, miRNA regulatory, protein interaction, and metabolic pathway network, from fast to slow. This ordering was found in all comparisons we did of matched networks between organisms. To gain further intuition on network rewiring, we compared our observed rewirings with those obtained from simulation. We also investigated how readily our formalism could be mapped to other network contexts; in particular, we showed how it could be applied to analyze changes in a range of “commonplace” networks such as family trees, co-authorships and linux-kernel function dependencies. PMID:21253555
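To make the idea of quantifying differences between networks concrete, here is a minimal sketch that scores two networks by the fraction of edges, among nodes present in both, that occur in only one of them. This is a simple set-based distance chosen for illustration, not the formalism developed in the paper; the function name `rewiring_fraction` is hypothetical, and edges are treated as undirected, which is a simplification for regulatory networks.

```python
def rewiring_fraction(edges_a, edges_b) -> float:
    """Fraction of edges among shared nodes that appear in only one network."""
    # Undirected edges: ("g1", "g2") and ("g2", "g1") are the same edge.
    a = {frozenset(e) for e in edges_a}
    b = {frozenset(e) for e in edges_b}
    # Compare only nodes found in both networks (e.g. matched orthologs).
    shared = {n for e in a for n in e} & {n for e in b for n in e}
    a = {e for e in a if e <= shared}
    b = {e for e in b if e <= shared}
    union = a | b
    if not union:
        return 0.0
    return len(a ^ b) / len(union)   # symmetric difference over union

net_a = [("g1", "g2"), ("g2", "g3"), ("g3", "g4")]
net_b = [("g2", "g1"), ("g2", "g3"), ("g1", "g3"), ("g5", "g1")]
print(rewiring_fraction(net_a, net_b))
```

Like an uncorrected proportion of differing sites between two sequences, a raw fraction of this kind cannot exceed 1, so it would be expected to level off as divergence grows.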
On Developing Content-Oriented Theories Taking Biological Evolution as an Example

ERIC Educational Resources Information Center

Andersson, Bjorn; Wallin, Anita

2006-01-01

Both in Europe and the United States there is a growing interest in design research. One example is the design and validation of topic-oriented teaching-learning sequences. This research may be said to have two objectives. One is to design and test "useful products", such as teachers guides and study material for students, which may be…
Phylogeny and systematics of the bee genus Osmia (Hymenoptera: Megachilidae) with emphasis on North American Melanosmia: new subgenera, synonymies, and nesting biology revisited

USDA-ARS's Scientific Manuscript database

The predominantly holarctic bee genus Osmia is species-rich and behaviorally diverse. A robust phylogeny of this genus is important for understanding the evolution of the immense variety of morphological and behavioral traits exhibited by this group. We infer a phylogeny of Osmia using DNA sequenc...
De novo selection of oncogenes.

PubMed

Chacón, Kelly M; Petti, Lisa M; Scheideman, Elizabeth H; Pirazzoli, Valentina; Politi, Katerina; DiMaio, Daniel

2014-01-07

All cellular proteins are derived from preexisting ones by natural selection. Because of the random nature of this process, many potentially useful protein structures never arose or were discarded during evolution. Here, we used a single round of genetic selection in mouse cells to isolate chemically simple, biologically active transmembrane proteins that do not contain any amino acid sequences from preexisting proteins. We screened a retroviral library expressing hundreds of thousands of proteins consisting of hydrophobic amino acids in random order to isolate four 29-aa proteins that induced focus formation in mouse and human fibroblasts and tumors in mice. These proteins share no amino acid sequences with known cellular or viral proteins, and the simplest of them contains only seven different amino acids. They transformed cells by forming a stable complex with the platelet-derived growth factor β receptor transmembrane domain and causing ligand-independent receptor activation. We term this approach de novo selection and suggest that it can be used to generate structures and activities not observed in nature, create prototypes for novel research reagents and therapeutics, and provide insight into cell biology, transmembrane protein-protein interactions, and possibly virus evolution and the origin of life.
The post-genomic era of biological network alignment.

PubMed

Faisal, Fazle E; Meng, Lei; Crawford, Joseph; Milenković, Tijana

2015-12-01

Biological network alignment aims to find regions of topological and functional (dis)similarities between molecular networks of different species. Then, network alignment can guide the transfer of biological knowledge from well-studied model species to less well-studied species between conserved (aligned) network regions, thus complementing valuable insights that have already been provided by genomic sequence alignment. Here, we review computational challenges behind the network alignment problem, existing approaches for solving the problem, ways of evaluating their alignment quality, and the approaches' biomedical applications. We discuss recent innovative efforts of improving the existing view of network alignment. We conclude with open research questions in comparative biological network research that could further our understanding of principles of life, evolution, disease, and therapeutics.
Delayed Gratification Habitable Zones (DG-HZs): When Deep Outer Solar System Regions Become Balmy During Post-Main Sequence Stellar Evolution

NASA Astrophysics Data System (ADS)

Stern, S. A.

2002-09-01

Late in its evolution the Sun, like all low and moderate mass stars, will burn as a red giant, generating thousands of solar luminosities for a few tens of millions of years. A dozen years ago this stage of stellar evolution was predicted to create observable sublimation signatures in systems where Kuiper Belts (KBs) are extant (Stern et al. 1990, Nature, 345, 305); recently, the SWAS spacecraft detected such systems (Melnick et al. 2001, 412, 160). During the red giant phase, the habitable zone of our solar system will lie in the region where Triton, Pluto-Charon, and KBOs orbit. Compared to the 1 AU habitable zone where Earth resided early in the solar system's history, this "delayed gratification habitable zone (DG-HZ)" will enjoy a far less biologically hazardous environment, with far lower harmful UV radiation levels from the Sun and a far quieter collisional environment. Objects like Triton, Pluto-Charon, and KBOs, which are known to be rich in both water and organics, will then become possible sites for biochemical and perhaps even biological evolution. The Sun's DG-HZ may only be of academic interest owing to its great separation from us in time. However, several 10^8 approximately solar-type Milky Way stars burn as luminous red giants today. Thus, if icy-organic objects are common in the 20-50 AU zones of these stars, as they are in our solar system (and as inferred in numerous main sequence stellar disk systems), then DG-HZs form a kind of niche habitable zone that is likely to be numerically common in the galaxy. I will show the calculated temporal evolution of DG-HZs around various stellar types using modern stellar evolution luminosity tracks, and then discuss various aspects of DG-HZs, including the effects of stellar pulsations and mass loss winds. This work was supported by NASA's Origins of Solar Systems Program.
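The link between red-giant luminosity and the 20-50 AU region can be checked with a back-of-the-envelope calculation. Stellar flux falls off as the inverse square of distance, so the distance receiving the same flux as a reference orbit scales with the square root of luminosity. The function name below is hypothetical, and the calculation ignores albedo, atmospheric effects and the change in the star's spectrum, so it is only a rough sketch and not the luminosity-track modelling described in the abstract.

```python
import math

def equal_flux_distance_au(luminosity_solar: float, reference_au: float = 1.0) -> float:
    """Distance (AU) that receives the flux the reference orbit gets at 1 L_sun.

    Follows from flux being proportional to L / d**2, so d scales as sqrt(L).
    """
    return reference_au * math.sqrt(luminosity_solar)

for lum in (400, 1000, 2500):
    print(f"{lum:>5} L_sun -> {equal_flux_distance_au(lum):.1f} AU")
```

Under this scaling, luminosities of roughly 400 to 2,500 times solar place Earth-like flux between 20 and 50 AU, which is consistent with the zone named in the abstract.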
Evolution of the VEGF-regulated vascular network from a neural guidance system.

PubMed

Ponnambalam, Sreenivasan; Alberghina, Mario

2011-06-01

The vascular network is closely linked to the neural system, and an interdependence is displayed in healthy and in pathophysiological responses. How has close apposition of two such functionally different systems occurred? Here, we present a hypothesis for the evolution of the vascular network from an ancestral neural guidance system. Biological cornerstones of this hypothesis are the vascular endothelial growth factor (VEGF) protein family and cognate receptors. The primary sequences of such proteins are conserved from invertebrates such as worms and flies, which lack discernible vascular systems, to mammals, and all of these organisms have sophisticated neuronal wiring involving such molecules. Ancestral VEGFs and receptors (VEGFRs) could have been used to develop and maintain the nervous system in primitive eukaryotes. During evolution, the demands of increased morphological complexity required systems for transporting molecules and cells, i.e., biological conductive tubes. We propose that the VEGF-VEGFR axis was subverted by evolution to mediate the formation of biological tubes necessary for transport of fluids, e.g., blood. Increasingly, there is evidence that aberrant VEGF-mediated responses are also linked to neuronal dysfunctions including motor neuron disease, stroke, Parkinson's disease, Alzheimer's disease, ischemic brain disease, epilepsy and multiple sclerosis, to neuronal repair after injury, and to common vascular diseases (e.g., retinal disease). Manipulation and correction of the VEGF response in different neural tissues could be an effective strategy to treat different neurological diseases.
Identifying structural variation in haploid microbial genomes from short-read resequencing data using breseq.

PubMed

Barrick, Jeffrey E; Colburn, Geoffrey; Deatherage, Daniel E; Traverse, Charles C; Strand, Matthew D; Borges, Jordan J; Knoester, David B; Reba, Aaron; Meyer, Austin G

2014-11-29

Mutations that alter chromosomal structure play critical roles in evolution and disease, including in the origin of new lifestyles and pathogenic traits in microbes. Large-scale rearrangements in genomes are often mediated by recombination events involving new or existing copies of mobile genetic elements, recently duplicated genes, or other repetitive sequences. Most current software programs for predicting structural variation from short-read DNA resequencing data are intended primarily for use on human genomes. They typically disregard information in reads mapping to repeat sequences, and significant post-processing and manual examination of their output is often required to rule out false-positive predictions and precisely describe mutational events. We have implemented an algorithm for identifying structural variation from DNA resequencing data as part of the breseq computational pipeline for predicting mutations in haploid microbial genomes. Our method evaluates the support for new sequence junctions present in a clonal sample from split-read alignments to a reference genome, including matches to repeat sequences. Then, it uses a statistical model of read coverage evenness to accept or reject these predictions. Finally, breseq combines predictions of new junctions and deleted chromosomal regions to output biologically relevant descriptions of mutations and their effects on genes. We demonstrate the performance of breseq on simulated Escherichia coli genomes with deletions generating unique breakpoint sequences, new insertions of mobile genetic elements, and deletions mediated by mobile elements. Then, we reanalyze data from an E. coli K-12 mutation accumulation evolution experiment in which structural variation was not previously identified. Transposon insertions and large-scale chromosomal changes detected by breseq account for ~25% of spontaneous mutations in this strain. 
In all cases, we find that breseq is able to reliably predict structural variation with modest read-depth coverage of the reference genome (>40-fold). Using breseq to predict structural variation should be useful for studies of microbial epidemiology, experimental evolution, synthetic biology, and genetics when a reference genome for a closely related strain is available. In these cases, breseq can discover mutations that may be responsible for important or unintended changes in genomes that might otherwise go undetected.
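One ingredient mentioned above, identifying deleted chromosomal regions from read coverage, can be illustrated with a toy example. The sketch below reports runs of consecutive reference positions with zero read depth as candidate deletions. The function name `zero_coverage_runs` and the `min_len` cutoff are hypothetical, and this is not breseq's algorithm, which combines junction evidence with a statistical model of coverage evenness.

```python
def zero_coverage_runs(depth: list[int], min_len: int = 3) -> list[tuple[int, int]]:
    """Return half-open, 0-based intervals where read depth is zero.

    Runs shorter than min_len positions are ignored as likely noise.
    """
    runs = []
    start = None
    for pos, d in enumerate(depth):
        if d == 0 and start is None:
            start = pos                       # a zero-depth run begins
        elif d != 0 and start is not None:
            if pos - start >= min_len:
                runs.append((start, pos))     # run ended just before pos
            start = None
    if start is not None and len(depth) - start >= min_len:
        runs.append((start, len(depth)))      # run reaches the end
    return runs

depth = [40, 38, 0, 0, 0, 0, 41, 0, 39, 0, 0, 0]
print(zero_coverage_runs(depth))
```

In real data, reads mapping to repeats make zero depth an unreliable signal on its own, which is one reason a tool would pair coverage with split-read junction evidence.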
A Guide to the PLAZA 3.0 Plant Comparative Genomic Database.

PubMed

Vandepoele, Klaas

2017-01-01

PLAZA 3.0 is an online resource for comparative genomics and offers a versatile platform to study gene functions and gene families or to analyze genome organization and evolution in the green plant lineage. Starting from genome sequence information for over 35 plant species, precomputed comparative genomic data sets cover homologous gene families, multiple sequence alignments, phylogenetic trees, and genomic colinearity information within and between species. Complementary functional data sets, a Workbench, and interactive visualization tools are available through a user-friendly web interface, making PLAZA an excellent starting point to translate sequence or omics data sets into biological knowledge. PLAZA is available at http://bioinformatics.psb.ugent.be/plaza/.
A single determinant dominates the rate of yeast protein evolution.

PubMed

Drummond, D Allan; Raval, Alpan; Wilke, Claus O

2006-02-01

A gene's rate of sequence evolution is among the most fundamental evolutionary quantities in common use, but what determines evolutionary rates has remained unclear. Here, we carry out the first combined analysis of seven predictors (gene expression level, dispensability, protein abundance, codon adaptation index, gene length, number of protein-protein interactions, and the gene's centrality in the interaction network) previously reported to have independent influences on protein evolutionary rates. Strikingly, our analysis reveals a single dominant variable linked to the number of translation events which explains 40-fold more variation in evolutionary rate than any other, suggesting that protein evolutionary rate has a single major determinant among the seven predictors. The dominant variable explains nearly half the variation in the rate of synonymous and protein evolution. We show that the two most commonly used methods to disentangle the determinants of evolutionary rate, partial correlation analysis and ordinary multivariate regression, produce misleading or spurious results when applied to noisy biological data. We overcome these difficulties by employing principal component regression, a multivariate regression of evolutionary rate against the principal components of the predictor variables. Our results support the hypothesis that translational selection governs the rate of synonymous and protein sequence evolution in yeast.
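Principal component regression, as described above, replaces correlated predictors with their principal components before regressing. The sketch below is a reduced version under stated assumptions: it keeps only the first principal component, finds it by power iteration rather than a full eigendecomposition, and uses toy numbers rather than any yeast data. All function names are hypothetical, and the power iteration assumes the predictors are not all negatively related to one another.

```python
import math

def standardize(col: list[float]) -> list[float]:
    """Centre to mean 0 and scale to sample standard deviation 1."""
    mean = sum(col) / len(col)
    sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (len(col) - 1))
    return [(v - mean) / sd for v in col]

def first_pc(cols: list[list[float]], iters: int = 200) -> list[float]:
    """Leading eigenvector of the correlation matrix via power iteration."""
    n, k = len(cols[0]), len(cols)
    corr = [[sum(a * b for a, b in zip(cols[i], cols[j])) / (n - 1)
             for j in range(k)] for i in range(k)]
    v = [1.0 / math.sqrt(k)] * k
    for _ in range(iters):
        w = [sum(corr[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def pc_regression(predictors: list[list[float]], response: list[float]):
    """Regress the response on the first principal component of the predictors.

    Returns (loadings, slope, r_squared).
    """
    cols = [standardize(c) for c in predictors]
    y = standardize(response)
    loadings = first_pc(cols)
    scores = [sum(loadings[j] * cols[j][i] for j in range(len(cols)))
              for i in range(len(y))]
    sxy = sum(s * t for s, t in zip(scores, y))
    sxx = sum(s * s for s in scores)
    syy = sum(t * t for t in y)
    return loadings, sxy / sxx, sxy * sxy / (sxx * syy)

# Toy data: two strongly correlated predictors and a noisy response.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [1.2, 1.9, 3.3, 3.8, 5.1, 6.2]
y = [2.1, 3.9, 6.2, 8.1, 9.8, 12.3]
loadings, slope, r2 = pc_regression([x1, x2], y)
print(loadings, slope, r2)
```

Because the components are uncorrelated by construction, each component's contribution to explained variance can be read off separately, which avoids the instability that correlated predictors cause in ordinary multiple regression.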
Evolution of transcriptional enhancers and animal diversity

PubMed Central

Rubinstein, Marcelo; de Souza, Flávio S. J.

2013-01-01

Deciphering the genetic bases that drive animal diversity is one of the major challenges of modern biology. Although four decades ago it was proposed that animal evolution was mainly driven by changes in cis-regulatory DNA elements controlling gene expression rather than in protein-coding sequences, only now are powerful bioinformatics and experimental approaches available to accelerate studies into how the evolution of transcriptional enhancers contributes to novel forms and functions. In the introduction to this Theme Issue, we start by defining the general properties of transcriptional enhancers, such as modularity and the coexistence of tight sequence conservation with transcription factor-binding site shuffling as different mechanisms that maintain the enhancer grammar over evolutionary time. We discuss past and current methods used to identify cell-type-specific enhancers and provide examples of how enhancers originate de novo, change and are lost in particular lineages. We then focus in the central part of this Theme Issue on analysing examples of how the molecular evolution of enhancers may change form and function. Throughout this introduction, we present the main findings of the articles, reviews and perspectives contributed to this Theme Issue that together illustrate some of the great advances and current frontiers in the field. PMID:24218630

EdiPy: a resource to simulate the evolution of plant mitochondrial genes under the RNA editing.

PubMed

Picardi, Ernesto; Quagliariello, Carla

2006-02-01

EdiPy is an online resource designed to simulate the evolution of plant mitochondrial genes in a biologically realistic fashion. EdiPy takes into account the presence of sites subjected to RNA editing and provides multiple artificial alignments corresponding to both genomic and cDNA sequences. Each artificial data set can then be submitted to widely used evolutionary and phylogenetic software packages such as PAUP, Phyml, PAML and Phylip. As an online bioinformatic resource, EdiPy is available at the following web page: http://biologia.unical.it/py_script/index.html.
Using the Tools and Resources of the RCSB Protein Data Bank.

PubMed

Costanzo, Luigi Di; Ghosh, Sutapa; Zardecki, Christine; Burley, Stephen K

2016-09-07

The Protein Data Bank (PDB) archive is the worldwide repository of experimentally determined three-dimensional structures of large biological molecules found in all three kingdoms of life. Atomic-level structures of these proteins, nucleic acids, and complex assemblies thereof are central to research and education in molecular, cellular, and organismal biology, biochemistry, biophysics, materials science, bioengineering, ecology, and medicine. Several types of information are associated with each PDB archival entry, including atomic coordinates, primary experimental data, polymer sequence(s), and summary metadata. The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB) serves as the U.S. data center for the PDB, distributing archival data and supporting both simple and complex queries that return results. These data can be freely downloaded, analyzed, and visualized using RCSB PDB tools and resources to gain a deeper understanding of fundamental biological processes, molecular evolution, human health and disease, and drug discovery. © 2016 by John Wiley & Sons, Inc.
Biological data sciences in genome research

PubMed Central

Schatz, Michael C.

2015-01-01

The last 20 years have been a remarkable era for biology and medicine. One of the most significant achievements has been the sequencing of the first human genomes, which has laid the foundation for profound insights into human genetics, the intricacies of regulation and development, and the forces of evolution. Incredibly, as we look into the future over the next 20 years, we see the very real potential for sequencing more than 1 billion genomes, bringing even deeper insight into human genetics as well as the genetics of millions of other species on the planet. Realizing this great potential for medicine and biology, though, will only be achieved through the integration and development of highly scalable computational and quantitative approaches that can keep pace with the rapid improvements to biotechnology. In this perspective, I aim to chart out these future technologies, anticipate the major themes of research, and call out the challenges ahead. One of the largest shifts will be in the training used to prepare the class of 2035 for their highly interdisciplinary world. PMID:26430150
Evolution viewed from physics, physiology and medicine.

PubMed

Noble, Denis

2017-10-06

Stochasticity is harnessed by organisms to generate functionality. Randomness does not, therefore, necessarily imply lack of function or 'blind chance' at higher levels. In this respect, biology must resemble physics in generating order from disorder. This fact is contrary to Schrödinger's idea of biology generating phenotypic order from molecular-level order, which inspired the central dogma of molecular biology. The order originates at higher levels, which constrain the components at lower levels. We now know that this includes the genome, which is controlled by patterns of transcription factors and various epigenetic and reorganization mechanisms. These processes can occur in response to environmental stress, so that the genome becomes 'a highly sensitive organ of the cell' (McClintock). Organisms have evolved to be able to cope with many variations at the molecular level. Organisms also make use of physical processes in evolution and development when it is possible to arrive at functional development without the necessity to store all information in DNA sequences. This view of development and evolution differs radically from that of neo-Darwinism with its emphasis on blind chance as the origin of variation. Blind chance is necessary, but the origin of functional variation is not at the molecular level. These observations derive from and reinforce the principle of biological relativity, which holds that there is no privileged level of causation. They also have important implications for medical science.
Integration and macroevolutionary patterns in the pollination biology of conifers.

PubMed

Leslie, Andrew B; Beaulieu, Jeremy M; Crane, Peter R; Knopf, Patrick; Donoghue, Michael J

2015-06-01

Integration influences patterns of trait evolution, but the relationship between these patterns and the degree of trait integration is not well understood. To explore this further, we study a specialized pollination mechanism in conifers whose traits are linked through function but not development. This mechanism depends on interactions among three characters: pollen that is buoyant, ovules that face downward at pollination, and the production of a liquid droplet that buoyant grains float through to enter the ovule. We use a well-sampled phylogeny of conifers to test correlated evolution among these characters and specific sequences of character change. Using likelihood models of character evolution, we find that pollen morphology and ovule characters evolve in a concerted manner, where the flotation mechanism breaks down irreversibly following changes in orientation or drop production. The breakdown of this functional constraint, which may be facilitated by the lack of developmental integration among the constituent traits, is associated with increased trait variation and more diverse pollination strategies. Although this functional "release" increases diversity in some ways, the irreversible way in which the flotation mechanism is lost may eventually result in its complete disappearance from seed plant reproductive biology. © 2015 The Author(s). Evolution © 2015 The Society for the Study of Evolution.
Single nucleotide variations: Biological impact and theoretical interpretation

PubMed Central

Katsonis, Panagiotis; Koire, Amanda; Wilson, Stephen Joseph; Hsu, Teng-Kuei; Lua, Rhonald C; Wilkins, Angela Dawn; Lichtarge, Olivier

2014-01-01

Genome-wide association studies (GWAS) and whole-exome sequencing (WES) generate massive amounts of genomic variant information, and a major challenge is to identify which variations drive disease or contribute to phenotypic traits. Because the majority of known disease-causing mutations are exonic non-synonymous single nucleotide variations (nsSNVs), most studies focus on whether these nsSNVs affect protein function. Computational studies show that the impact of nsSNVs on protein function reflects sequence homology and structural information and predict the impact through statistical methods, machine learning techniques, or models of protein evolution. Here, we review impact prediction methods and discuss their underlying principles, their advantages and limitations, and how they compare to and complement one another. Finally, we present current applications and future directions for these methods in biological research and medical genetics. PMID:25234433
Human development, heredity and evolution.

PubMed

Nishinakamura, Ryuichi; Takasato, Minoru

2017-06-15

From March 27-29 2017, the RIKEN Center for Developmental Biology held a symposium entitled 'Towards Understanding Human Development, Heredity, and Evolution' in Kobe, Japan. Recent advances in technologies including stem cell culture, live imaging, single-cell approaches, next-generation sequencing and genome editing have led to an expansion in our knowledge of human development. Organized by Yoshiya Kawaguchi, Mitinori Saitou, Mototsugu Eiraku, Tomoya Kitajima, Fumio Matsuzaki, Takashi Tsuji and Edith Heard, the symposium covered a broad range of topics including human germline development, epigenetics, organogenesis and evolution. This Meeting Review provides a summary of this timely and exciting symposium, which has convinced us that we are moving into the era of science targeted on humans. © 2017. Published by The Company of Biologists Ltd.
Understanding dengue virus evolution to support epidemic surveillance and counter-measure development.

PubMed

Pollett, S; Melendrez, M C; Maljkovic Berry, I; Duchêne, S; Salje, H; Dat, Cummings; Jarman, R G

2018-04-25

Dengue virus (DENV) causes a profound burden of morbidity and mortality, and its global burden is rising due to the co-circulation of four divergent DENV serotypes in the ecological context of globalization, travel, climate change, urbanization, and expansion of the geographic range of the Ae. aegypti and Ae. albopictus vectors. Understanding DENV evolution offers valuable opportunities to enhance surveillance and response to DENV epidemics via advances in RNA virus sequencing, bioinformatics, phylogenetic and other computational biology methods. Here we provide a scoping overview of the evolution and molecular epidemiology of DENV and the range of ways that evolutionary analyses can be applied as a public health tool against this arboviral pathogen. Copyright © 2018. Published by Elsevier B.V.
Molecular mechanisms of adaptation emerging from the physics and evolution of nucleic acids and proteins.

PubMed

Goncearenco, Alexander; Ma, Bin-Guang; Berezovsky, Igor N

2014-03-01

DNA, RNA and proteins are major biological macromolecules that coevolve and adapt to environments as components of one highly interconnected system. We explore here sequence/structure determinants of mechanisms of adaptation of these molecules, links between them, and results of their mutual evolution. We complemented statistical analysis of genomic and proteomic sequences with folding simulations of RNA molecules, unraveling causal relations between compositional and sequence biases reflecting molecular adaptation on DNA, RNA and protein levels. We found many compositional peculiarities related to environmental adaptation and the life style. Specifically, thermal adaptation of protein-coding sequences in Archaea is characterized by a stronger codon bias than in Bacteria. Guanine and cytosine load in the third codon position is important for supporting the aerobic life style, and it is highly pronounced in Bacteria. The third codon position also provides a tradeoff between arginine and lysine, which are favorable for thermal adaptation and aerobicity, respectively. Dinucleotide composition provides stability of nucleic acids via strong base-stacking in ApG dinucleotides. In relation to coevolution of nucleic acids and proteins, thermostability-related demands on the amino acid composition affect the nucleotide content in the second codon position in Archaea.
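The compositional statistics this abstract relies on, such as guanine and cytosine load at a given codon position, are straightforward to compute. The sketch below is illustrative only and is not taken from the paper: the function name and the toy sequence are assumptions, and it expects an in-frame coding sequence that starts at the first codon position.

```python
def gc_by_codon_position(cds):
    """Return the G+C fraction at codon positions 1, 2 and 3.

    `cds` is assumed to be an in-frame coding sequence; any trailing
    partial codon is ignored.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3
    if usable == 0:
        raise ValueError("need at least one complete codon")
    fractions = []
    for offset in range(3):
        # Every third base starting at this offset is one codon position.
        bases = seq[offset:usable:3]
        fractions.append(sum(base in "GC" for base in bases) / len(bases))
    return tuple(fractions)


# Toy sequence with codons ATG GCC AAG TAA (hypothetical, for illustration).
gc1, gc2, gc3 = gc_by_codon_position("ATGGCCAAGTAA")
print(gc1, gc2, gc3)
```

Comparing the third-position value (often called GC3) across genomes grouped by lifestyle or growth temperature is the kind of analysis the abstract describes, although the study itself combines many more statistics.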
Molecular mechanisms of adaptation emerging from the physics and evolution of nucleic acids and proteins

PubMed Central

Goncearenco, Alexander; Ma, Bin-Guang; Berezovsky, Igor N.

2014-01-01

DNA, RNA and proteins are major biological macromolecules that coevolve and adapt to environments as components of one highly interconnected system. We explore here sequence/structure determinants of mechanisms of adaptation of these molecules, links between them, and results of their mutual evolution. We complemented statistical analysis of genomic and proteomic sequences with folding simulations of RNA molecules, unraveling causal relations between compositional and sequence biases reflecting molecular adaptation on DNA, RNA and protein levels. We found many compositional peculiarities related to environmental adaptation and the life style. Specifically, thermal adaptation of protein-coding sequences in Archaea is characterized by a stronger codon bias than in Bacteria. Guanine and cytosine load in the third codon position is important for supporting the aerobic life style, and it is highly pronounced in Bacteria. The third codon position also provides a tradeoff between arginine and lysine, which are favorable for thermal adaptation and aerobicity, respectively. Dinucleotide composition provides stability of nucleic acids via strong base-stacking in ApG dinucleotides. In relation to coevolution of nucleic acids and proteins, thermostability-related demands on the amino acid composition affect the nucleotide content in the second codon position in Archaea. PMID:24371267
MEvoLib v1.0: the first molecular evolution library for Python.

PubMed

Álvarez-Jarreta, Jorge; Ruiz-Pesini, Eduardo

2016-10-28

Molecular evolution studies involve many different hard computational problems solved, in most cases, with heuristic algorithms that provide a nearly optimal solution. Hence, diverse software tools exist for the different stages involved in a molecular evolution workflow. We present MEvoLib, the first molecular evolution library for Python, providing a framework to work with different tools and methods involved in the common tasks of molecular evolution workflows. In contrast with already existing bioinformatics libraries, MEvoLib is focused on the stages involved in molecular evolution studies, enclosing the set of tools with a common purpose in a single high-level interface with fast access to their frequent parameterizations. The gene clustering from partial or complete sequences has been improved with a new method that integrates accessible external information (e.g. GenBank's features data). Moreover, MEvoLib adjusts the fetching process from NCBI databases to optimize the download bandwidth usage. In addition, it has been implemented using parallelization techniques to cope with even large-scale scenarios. MEvoLib is the first library for Python designed to facilitate molecular evolution research for both expert and novice users. Its unique interface for each common task comprises several tools with their most used parameterizations. It also includes a method to take advantage of biological knowledge to improve the gene partition of sequence datasets. Additionally, its implementation incorporates parallelization techniques to reduce computational costs when handling very large input datasets.
An Exploration into Fern Genome Space.

PubMed

Wolf, Paul G; Sessa, Emily B; Marchant, Daniel Blaine; Li, Fay-Wei; Rothfels, Carl J; Sigel, Erin M; Gitzendanner, Matthew A; Visger, Clayton J; Banks, Jo Ann; Soltis, Douglas E; Soltis, Pamela S; Pryer, Kathleen M; Der, Joshua P

2015-08-26

Ferns are one of the few remaining major clades of land plants for which a complete genome sequence is lacking. Knowledge of genome space in ferns will enable broad-scale comparative analyses of land plant genes and genomes, provide insights into genome evolution across green plants, and shed light on genetic and genomic features that characterize ferns, such as their high chromosome numbers and large genome sizes. As part of an initial exploration into fern genome space, we used a whole genome shotgun sequencing approach to obtain low-density coverage (∼0.4X to 2X) for six fern species from the Polypodiales (Ceratopteris, Pteridium, Polypodium, Cystopteris), Cyatheales (Plagiogyria), and Gleicheniales (Dipteris). We explore these data to characterize the proportion of the nuclear genome represented by repetitive sequences (including DNA transposons, retrotransposons, ribosomal DNA, and simple repeats) and protein-coding genes, and to extract chloroplast and mitochondrial genome sequences. Such initial sweeps of fern genomes can provide information useful for selecting a promising candidate fern species for whole genome sequencing. We also describe variation of genomic traits across our sample and highlight some differences and similarities in repeat structure between ferns and seed plants. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Multi-step formation, evolution, and functionalization of new cytoplasmic male sterility genes in the plant mitochondrial genomes

PubMed Central

Tang, Huiwu; Zheng, Xingmei; Li, Chuliang; Xie, Xianrong; Chen, Yuanling; Chen, Letian; Zhao, Xiucai; Zheng, Huiqi; Zhou, Jiajian; Ye, Shan; Guo, Jingxin; Liu, Yao-Guang

2017-01-01

New gene origination is a major source of genomic innovations that confer phenotypic changes and biological diversity. Generation of new mitochondrial genes in plants may cause cytoplasmic male sterility (CMS), which can promote outcrossing and increase fitness. However, how mitochondrial genes originate and evolve in structure and function remains unclear. The rice Wild Abortive type of CMS is conferred by the mitochondrial gene WA352c (previously named WA352) and has been widely exploited in hybrid rice breeding. Here, we reconstruct the evolutionary trajectory of WA352c by the identification and analyses of 11 mitochondrial genomic recombinant structures related to WA352c in wild and cultivated rice. We deduce that these structures arose through multiple rearrangements among conserved mitochondrial sequences in the mitochondrial genome of the wild rice Oryza rufipogon, coupled with substoichiometric shifting and sequence variation. We identify two expressed but nonfunctional protogenes among these structures, and show that they could evolve into functional CMS genes via sequence variations that could relieve the self-inhibitory potential of the proteins. These sequence changes would endow the proteins the ability to interact with the nucleus-encoded mitochondrial protein COX11, resulting in premature programmed cell death in the anther tapetum and male sterility. Furthermore, we show that the sequences that encode the COX11-interaction domains in these WA352c-related genes have experienced purifying selection during evolution. We propose a model for the formation and evolution of new CMS genes via a “multi-recombination/protogene formation/functionalization” mechanism involving gradual variations in the structure, sequence, copy number, and function. PMID:27725674
A Lossy Compression Technique Enabling Duplication-Aware Sequence Alignment

PubMed Central

Freschi, Valerio; Bogliolo, Alessandro

2012-01-01

In spite of the recognized importance of tandem duplications in genome evolution, commonly adopted sequence comparison algorithms do not take into account complex mutation events involving more than one residue at a time, since they are not compliant with the underlying assumption of statistical independence of adjacent residues. As a consequence, the presence of tandem repeats in sequences under comparison may impair the biological significance of the resulting alignment. Although solutions have been proposed, repeat-aware sequence alignment is still considered to be an open problem and new efficient and effective methods have been advocated. The present paper describes an alternative lossy compression scheme for genomic sequences which iteratively collapses repeats of increasing length. The resulting approximate representations do not contain tandem duplications, while retaining enough information for making their comparison even more significant than the edit distance between the original sequences. This allows us to exploit traditional alignment algorithms directly on the compressed sequences. Results confirm the validity of the proposed approach for the problem of duplication-aware sequence alignment. PMID:22518086
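The general idea of collapsing tandem repeats of increasing unit length before comparison can be sketched as follows. This is a simplified illustration and not the authors' algorithm: the greedy left-to-right scan, the `max_unit` parameter, and the function names are all assumptions, and the published scheme may differ in how it chooses and records collapsed repeats.

```python
def collapse_unit(seq, k):
    """Collapse each run of adjacent identical k-mers into a single copy."""
    out = []
    i = 0
    while i < len(seq):
        unit = seq[i:i + k]
        j = i + k
        # Skip over any further adjacent copies of the same unit.
        while len(unit) == k and seq[j:j + k] == unit:
            j += k
        if j > i + k:
            out.append(unit)  # keep one copy of the repeated unit
            i = j
        else:
            out.append(seq[i])  # no repeat here; advance by one residue
            i += 1
    return "".join(out)


def collapse_tandem_repeats(seq, max_unit=3):
    """Iteratively collapse tandem repeats with unit lengths 1..max_unit."""
    for k in range(1, max_unit + 1):
        seq = collapse_unit(seq, k)
    return seq


def edit_distance(a, b):
    """Standard Levenshtein distance by dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


# Two toy sequences that differ only by tandem duplications.
a, b = "GATTTACACACAG", "GATACAG"
print(edit_distance(a, b))
print(edit_distance(collapse_tandem_repeats(a), collapse_tandem_repeats(b)))
```

On the raw strings every duplicated residue counts as a separate edit, whereas the collapsed forms are identical. That is the sense in which an ordinary alignment algorithm can be reused unchanged on the compressed sequences.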
Chloroplast genomes: diversity, evolution, and applications in genetic engineering

DOE Office of Scientific and Technical Information (OSTI.GOV)

Daniell, Henry; Lin, Choun-Sea; Yu, Ming

Chloroplasts play a crucial role in sustaining life on earth. The availability of over 800 sequenced chloroplast genomes from a variety of land plants has enhanced our understanding of chloroplast biology, intracellular gene transfer, conservation, diversity, and the genetic basis by which chloroplast transgenes can be engineered to enhance plant agronomic traits or to produce high-value agricultural or biomedical products. In this review, we discuss the impact of chloroplast genome sequences on understanding the origins of economically important cultivated species and changes that have taken place during domestication. Here, we also discuss the potential biotechnological applications of chloroplast genomes.
Chloroplast genomes: diversity, evolution, and applications in genetic engineering

DOE PAGES

Daniell, Henry; Lin, Choun-Sea; Yu, Ming; ...

2016-06-23

Chloroplasts play a crucial role in sustaining life on earth. The availability of over 800 sequenced chloroplast genomes from a variety of land plants has enhanced our understanding of chloroplast biology, intracellular gene transfer, conservation, diversity, and the genetic basis by which chloroplast transgenes can be engineered to enhance plant agronomic traits or to produce high-value agricultural or biomedical products. In this review, we discuss the impact of chloroplast genome sequences on understanding the origins of economically important cultivated species and changes that have taken place during domestication. Here, we also discuss the potential biotechnological applications of chloroplast genomes.
Comparative analysis of gene regulatory networks: from network reconstruction to evolution.

PubMed

Thompson, Dawn; Regev, Aviv; Roy, Sushmita

2015-01-01

Regulation of gene expression is central to many biological processes. Although reconstruction of regulatory circuits from genomic data alone is therefore desirable, this remains a major computational challenge. Comparative approaches that examine the conservation and divergence of circuits and their components across strains and species can help reconstruct circuits as well as provide insights into the evolution of gene regulatory processes and their adaptive contribution. In recent years, advances in genomic and computational tools have led to a wealth of methods for such analysis at the sequence, expression, pathway, module, and entire network level. Here, we review computational methods developed to study transcriptional regulatory networks using comparative genomics, from sequence to functional data. We highlight how these methods use evolutionary conservation and divergence to reliably detect regulatory components as well as estimate the extent and rate of divergence. Finally, we discuss the promise and open challenges in linking regulatory divergence to phenotypic divergence and adaptation.
An Evolution-Based Approach to De Novo Protein Design and Case Study on Mycobacterium tuberculosis

PubMed Central

Brender, Jeffrey R.; Czajka, Jeff; Marsh, David; Gray, Felicia; Cierpicki, Tomasz; Zhang, Yang

2013-01-01

Computational protein design is a reverse procedure of protein folding and structure prediction, where constructing structures from evolutionarily related proteins has been demonstrated to be the most reliable method for protein 3-dimensional structure prediction. Following this spirit, we developed a novel method to design new protein sequences based on evolutionarily related protein families. For a given target structure, a set of proteins having similar fold are identified from the PDB library by structural alignments. A structural profile is then constructed from the protein templates and used to guide the conformational search of amino acid sequence space, where physicochemical packing is accommodated by single-sequence based solvation, torsion angle, and secondary structure predictions. The method was tested on a computational folding experiment based on a large set of 87 protein structures covering different fold classes, which showed that the evolution-based design significantly enhances the foldability and biological functionality of the designed sequences compared to the traditional physics-based force field methods. Without using homologous proteins, the designed sequences can be folded with an average root-mean-square-deviation of 2.1 Å to the target. As a case study, the method is extended to redesign all 243 structurally resolved proteins in the pathogenic bacterium Mycobacterium tuberculosis, which is the second leading cause of death from infectious disease. On a smaller scale, five sequences were randomly selected from the design pool and subjected to experimental validation. The results showed that all the designed proteins are soluble with distinct secondary structure and three have well ordered tertiary structure, as demonstrated by circular dichroism and NMR spectroscopy. Together, these results demonstrate a new avenue in computational protein design that uses knowledge of evolutionary conservation from protein structural families to engineer new protein molecules of improved fold stability and biological functionality. PMID:24204234
Coevolutionary modeling of protein sequences: Predicting structure, function, and mutational landscapes

NASA Astrophysics Data System (ADS)

Weigt, Martin

Over the last years, biological research has been revolutionized by experimental high-throughput techniques, in particular by next-generation sequencing technology. Unprecedented amounts of data are accumulating, and there is a growing demand for computational methods unveiling the information hidden in raw data, thereby increasing our understanding of complex biological systems. Statistical-physics models based on the maximum-entropy principle have, in the last few years, played an important role in this context. To give a specific example, proteins and many non-coding RNA show a remarkable degree of structural and functional conservation in the course of evolution, despite a large variability in amino acid sequences. We have developed a statistical-mechanics inspired inference approach - called Direct-Coupling Analysis - to link this sequence variability (easy to observe in sequence alignments, which are available in public sequence databases) to bio-molecular structure and function. In my presentation I will show how this methodology can be used (i) to infer contacts between residues and thus to guide tertiary and quaternary protein structure prediction and RNA structure prediction, (ii) to discriminate interacting from non-interacting protein families, and thus to infer conserved protein-protein interaction networks, and (iii) to reconstruct mutational landscapes and thus to predict the phenotypic effect of mutations. References [1] M. Figliuzzi, H. Jacquier, A. Schug, O. Tenaillon and M. Weigt, "Coevolutionary landscape inference and the context-dependence of mutations in beta-lactamase TEM-1", Mol. Biol. Evol. (2015), doi: 10.1093/molbev/msv211 [2] E. De Leonardis, B. Lutz, S. Ratz, S. Cocco, R. Monasson, A. Schug, M. Weigt, "Direct-Coupling Analysis of nucleotide coevolution facilitates RNA secondary and tertiary structure prediction", Nucleic Acids Research (2015), doi: 10.1093/nar/gkv932 [3] F. Morcos, A. Pagnani, B. Lunt, A. Bertolino, D. Marks, C. Sander, R. Zecchina, J.N. Onuchic, T. Hwa, M. Weigt, "Direct-coupling analysis of residue co-evolution captures native contacts across many protein families", Proc. Natl. Acad. Sci. 108, E1293-E1301 (2011).
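The sequence variability that this abstract describes as easy to observe in alignments can be made concrete with a small covariation calculation. The sketch below computes plain mutual information between two alignment columns. It is deliberately not Direct-Coupling Analysis: mutual information is a local pairwise statistic, whereas DCA fits a global maximum-entropy model in order to separate direct from indirect couplings. The function name and the toy alignment are assumptions for illustration.

```python
import math
from collections import Counter


def column_mutual_information(alignment, i, j):
    """Mutual information (in nats) between columns i and j of an alignment.

    `alignment` is a list of equal-length aligned sequences. No pseudocounts
    or sequence reweighting are applied, which a real analysis would need.
    """
    n = len(alignment)
    count_i = Counter(seq[i] for seq in alignment)
    count_j = Counter(seq[j] for seq in alignment)
    count_ij = Counter((seq[i], seq[j]) for seq in alignment)
    mi = 0.0
    for (a, b), c in count_ij.items():
        p_ab = c / n
        # Compare the observed pair frequency with the product of marginals.
        mi += p_ab * math.log(p_ab / ((count_i[a] / n) * (count_j[b] / n)))
    return mi


# Toy alignment: columns 0 and 1 change together, column 2 is conserved.
toy_alignment = ["AKG", "AKG", "DRG", "DRG"]
print(column_mutual_information(toy_alignment, 0, 1))
print(column_mutual_information(toy_alignment, 0, 2))
```

Column pairs that covary score high and pairs involving a conserved column score zero. Pairwise scores like this also respond to indirect correlations mediated by a third position, which is the limitation that motivates a global model such as the one described in the abstract.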
The whole genome sequence of the Mediterranean fruit fly, Ceratitis capitata (Wiedemann), reveals insights into the biology and adaptive evolution of a highly invasive pest species

USDA-ARS's Scientific Manuscript database

The Mediterranean fruit fly is one of the most destructive agricultural pests throughout the world due to its broad host plant range that includes more than 260 different fruits, flowers, vegetables, and nuts. Host preferences vary in different regions of the world, which can be associated with its ...

Evolution of MHC class I genes in the endangered loggerhead sea turtle (Caretta caretta) revealed by 454 amplicon sequencing.

PubMed

Stiebens, Victor A; Merino, Sonia E; Chain, Frédéric J J; Eizaguirre, Christophe

2013-04-30

In evolutionary and conservation biology, parasitism is often highlighted as a major selective pressure. To fight against parasites and pathogens, genetic diversity of the immune genes of the major histocompatibility complex (MHC) is particularly important. However, the extensive degree of polymorphism observed in these genes makes it difficult to conduct thorough population screenings. We utilized a genotyping protocol that uses 454 amplicon sequencing to characterize the MHC class I in the endangered loggerhead sea turtle (Caretta caretta) and to investigate their evolution at multiple relevant levels of organization. MHC class I genes revealed signatures of trans-species polymorphism across several reptile species. In the studied loggerhead turtle individuals, it results in the maintenance of two ancient allelic lineages. We also found that individuals carrying an intermediate number of MHC class I alleles are larger than those with either a low or high number of alleles. Multiple modes of evolution seem to maintain MHC diversity in the loggerhead turtles, with relatively high polymorphism for an endangered species.
Habitability of super-Earth planets around other suns: models including Red Giant Branch evolution.

PubMed

von Bloh, W; Cuntz, M; Schröder, K-P; Bounama, C; Franck, S

2009-01-01

The unexpected diversity of exoplanets includes a growing number of super-Earth planets, i.e., exoplanets with masses of up to several Earth masses and a similar chemical and mineralogical composition as Earth. We present a thermal evolution model for a 10 Earth-mass planet orbiting a star like the Sun. Our model is based on the integrated system approach, which describes the photosynthetic biomass production and takes into account a variety of climatological, biogeochemical, and geodynamical processes. This allows us to identify a so-called photosynthesis-sustaining habitable zone (pHZ), as determined by the limits of biological productivity on the planetary surface. Our model considers solar evolution during the main-sequence stage and along the Red Giant Branch as described by the most recent solar model. We obtain a large set of solutions consistent with the principal possibility of life. The highest likelihood of habitability is found for "water worlds." Only mass-rich water worlds are able to realize pHZ-type habitability beyond the stellar main sequence on the Red Giant Branch.
Multidimensional metrics for estimating phage abundance, distribution, gene density, and sequence coverage in metagenomes

PubMed Central

Aziz, Ramy K.; Dwivedi, Bhakti; Akhter, Sajia; Breitbart, Mya; Edwards, Robert A.

2015-01-01

Phages are the most abundant biological entities on Earth and play major ecological roles, yet the current sequenced phage genomes do not adequately represent their diversity, and little is known about the abundance and distribution of these sequenced genomes in nature. Although the study of phage ecology has benefited tremendously from the emergence of metagenomic sequencing, a systematic survey of phage genes and genomes in various ecosystems is still lacking, and fundamental questions about phage biology, lifestyle, and ecology remain unanswered. To address these questions and improve comparative analysis of phages in different metagenomes, we screened a core set of publicly available metagenomic samples for sequences related to completely sequenced phages using the web tool, Phage Eco-Locator. We then adopted and deployed an array of mathematical and statistical metrics for a multidimensional estimation of the abundance and distribution of phage genes and genomes in various ecosystems. Experiments using those metrics individually showed their usefulness in emphasizing the pervasive, yet uneven, distribution of known phage sequences in environmental metagenomes. Using these metrics in combination allowed us to resolve phage genomes into clusters that correlated with their genotypes and taxonomic classes as well as their ecological properties. We propose adding this set of metrics to current metaviromic analysis pipelines, where they can provide insight regarding phage mosaicism, habitat specificity, and evolution. PMID:26005436
Multidimensional metrics for estimating phage abundance, distribution, gene density, and sequence coverage in metagenomes

DOE PAGES

Aziz, Ramy K.; Dwivedi, Bhakti; Akhter, Sajia; ...

2015-05-08

Phages are the most abundant biological entities on Earth and play major ecological roles, yet the current sequenced phage genomes do not adequately represent their diversity, and little is known about the abundance and distribution of these sequenced genomes in nature. Although the study of phage ecology has benefited tremendously from the emergence of metagenomic sequencing, a systematic survey of phage genes and genomes in various ecosystems is still lacking, and fundamental questions about phage biology, lifestyle, and ecology remain unanswered. To address these questions and improve comparative analysis of phages in different metagenomes, we screened a core set of publicly available metagenomic samples for sequences related to completely sequenced phages using the web tool, Phage Eco-Locator. We then adopted and deployed an array of mathematical and statistical metrics for a multidimensional estimation of the abundance and distribution of phage genes and genomes in various ecosystems. Experiments using those metrics individually showed their usefulness in emphasizing the pervasive, yet uneven, distribution of known phage sequences in environmental metagenomes. Using these metrics in combination allowed us to resolve phage genomes into clusters that correlated with their genotypes and taxonomic classes as well as their ecological properties. We propose adding this set of metrics to current metaviromic analysis pipelines, where they can provide insight regarding phage mosaicism, habitat specificity, and evolution.
The role of Bh4 in parallel evolution of hull colour in domesticated and weedy rice.

PubMed

Vigueira, C C; Li, W; Olsen, K M

2013-08-01

The two independent domestication events in the genus Oryza that led to African and Asian rice offer an extremely useful system for studying the genetic basis of parallel evolution. This system is also characterized by parallel de-domestication events, with two genetically distinct weedy rice biotypes in the U.S. derived from the Asian domesticate. One important trait that has been altered by rice domestication and de-domestication is hull colour. The wild progenitors of the two cultivated rice species have predominantly black-coloured hulls, as does one of the two U.S. weed biotypes; both cultivated species and one of the U.S. weedy biotypes are characterized by straw-coloured hulls. Using Black hull 4 (Bh4) as a hull colour candidate gene, we examined DNA sequence variation at this locus to study the parallel evolution of hull colour variation in the domesticated and weedy rice system. We find that independent Bh4-coding mutations have arisen in African and Asian rice that are correlated with the straw hull phenotype, suggesting that the same gene is responsible for parallel trait evolution. For the U.S. weeds, Bh4 haplotype sequences support current hypotheses on the phylogenetic relationship between the two biotypes and domesticated Asian rice; straw hull weeds are most similar to indica crops, and black hull weeds are most similar to aus crops. Tests for selection indicate that Asian crops and straw hull weeds deviate from neutrality at this gene, suggesting possible selection on Bh4 during both rice domestication and de-domestication. © 2013 The Authors. Journal of Evolutionary Biology © 2013 European Society For Evolutionary Biology.
The Evolution of Campylobacter jejuni and Campylobacter coli

PubMed Central

Sheppard, Samuel K.; Maiden, Martin C.J.

2015-01-01

The global significance of Campylobacter jejuni and Campylobacter coli as gastrointestinal human pathogens has motivated numerous studies to characterize their population biology and evolution. These bacteria are a common component of the intestinal microbiota of numerous bird and mammal species and cause disease in humans, typically via consumption of contaminated meat products, especially poultry meat. Sequence-based molecular typing methods, such as multilocus sequence typing (MLST) and whole genome sequencing (WGS), have been instructive for understanding the epidemiology and evolution of these bacteria and how phenotypic variation relates to the high degree of genetic structuring in C. coli and C. jejuni populations. Here, we describe aspects of the relatively short history of coevolution between humans and pathogenic Campylobacter, by reviewing research investigating how mutation and lateral or horizontal gene transfer (LGT or HGT, respectively) interact to create the observed population structure. These genetic changes occur in a complex fitness landscape with divergent ecologies, including multiple host species, which can lead to rapid adaptation, for example, through frame-shift mutations that alter gene expression or the acquisition of novel genetic elements by HGT. Recombination is a particularly strong evolutionary force in Campylobacter, leading to the emergence of new lineages and even large-scale genome-wide interspecies introgression between C. jejuni and C. coli. The increasing availability of large genome datasets is enhancing understanding of Campylobacter evolution through the application of methods, such as genome-wide association studies, but MLST-derived clonal complex designations remain a useful method for describing population structure. PMID:26101080
The king cobra genome reveals dynamic gene evolution and adaptation in the snake venom system.

PubMed

Vonk, Freek J; Casewell, Nicholas R; Henkel, Christiaan V; Heimberg, Alysha M; Jansen, Hans J; McCleary, Ryan J R; Kerkkamp, Harald M E; Vos, Rutger A; Guerreiro, Isabel; Calvete, Juan J; Wüster, Wolfgang; Woods, Anthony E; Logan, Jessica M; Harrison, Robert A; Castoe, Todd A; de Koning, A P Jason; Pollock, David D; Yandell, Mark; Calderon, Diego; Renjifo, Camila; Currier, Rachel B; Salgado, David; Pla, Davinia; Sanz, Libia; Hyder, Asad S; Ribeiro, José M C; Arntzen, Jan W; van den Thillart, Guido E E J M; Boetzer, Marten; Pirovano, Walter; Dirks, Ron P; Spaink, Herman P; Duboule, Denis; McGlinn, Edwina; Kini, R Manjunatha; Richardson, Michael K

2013-12-17

Snakes are limbless predators, and many species use venom to help overpower relatively large, agile prey. Snake venoms are complex protein mixtures encoded by several multilocus gene families that function synergistically to cause incapacitation. To examine venom evolution, we sequenced and interrogated the genome of a venomous snake, the king cobra (Ophiophagus hannah), and compared it, together with our unique transcriptome, microRNA, and proteome datasets from this species, with data from other vertebrates. In contrast to the platypus, the only other venomous vertebrate with a sequenced genome, we find that snake toxin genes evolve through several distinct co-option mechanisms and exhibit surprisingly variable levels of gene duplication and directional selection that correlate with their functional importance in prey capture. The enigmatic accessory venom gland shows a very different pattern of toxin gene expression from the main venom gland and seems to have recruited toxin-like lectin genes repeatedly for new nontoxic functions. In addition, tissue-specific microRNA analyses suggested the co-option of core genetic regulatory components of the venom secretory system from a pancreatic origin. Although the king cobra is limbless, we recovered coding sequences for all Hox genes involved in amniote limb development, with the exception of Hoxd12. Our results provide a unique view of the origin and evolution of snake venom and reveal multiple genome-level adaptive responses to natural selection in this complex biological weapon system. More generally, they provide insight into mechanisms of protein evolution under strong selection.
Evolution of rDNA in Nicotiana Allopolyploids: A Potential Link between rDNA Homogenization and Epigenetics

PubMed Central

Kovarik, Ales; Dadejova, Martina; Lim, Yoong K.; Chase, Mark W.; Clarkson, James J.; Knapp, Sandra; Leitch, Andrew R.

2008-01-01

Background The evolution and biology of rDNA have interested biologists for many years, in part, because of two intriguing processes: (1) nucleolar dominance and (2) sequence homogenization. We review patterns of evolution in rDNA in the angiosperm genus Nicotiana to determine consequences of allopolyploidy on these processes. Scope Allopolyploid species of Nicotiana are ideal for studying rDNA evolution because phylogenetic reconstruction of DNA sequences has revealed patterns of species divergence and their parents. From these studies we also know that polyploids formed over widely different timeframes (thousands to millions of years), enabling comparative and temporal studies of rDNA structure, activity and chromosomal distribution. In addition studies on synthetic polyploids enable the consequences of de novo polyploidy on rDNA activity to be determined. Conclusions We propose that rDNA epigenetic expression patterns established even in F1 hybrids have a material influence on the likely patterns of divergence of rDNA. It is the active rDNA units that are vulnerable to homogenization, which probably acts to reduce mutational load across the active array. Those rDNA units that are epigenetically silenced may be less vulnerable to sequence homogenization. Selection cannot act on these silenced genes, and they are likely to accumulate mutations and eventually be eliminated from the genome. It is likely that whole silenced arrays will be deleted in polyploids of 1 million years of age and older. PMID:18310159
The genome sequence of the model ascomycete fungus Podospora anserina.

PubMed

Espagne, Eric; Lespinet, Olivier; Malagnac, Fabienne; Da Silva, Corinne; Jaillon, Olivier; Porcel, Betina M; Couloux, Arnaud; Aury, Jean-Marc; Ségurens, Béatrice; Poulain, Julie; Anthouard, Véronique; Grossetete, Sandrine; Khalili, Hamid; Coppin, Evelyne; Déquard-Chablat, Michelle; Picard, Marguerite; Contamine, Véronique; Arnaise, Sylvie; Bourdais, Anne; Berteaux-Lecellier, Véronique; Gautheret, Daniel; de Vries, Ronald P; Battaglia, Evy; Coutinho, Pedro M; Danchin, Etienne Gj; Henrissat, Bernard; Khoury, Riyad El; Sainsard-Chanet, Annie; Boivin, Antoine; Pinan-Lucarré, Bérangère; Sellem, Carole H; Debuchy, Robert; Wincker, Patrick; Weissenbach, Jean; Silar, Philippe

2008-01-01

The dung-inhabiting ascomycete fungus Podospora anserina is a model used to study various aspects of eukaryotic and fungal biology, such as ageing, prions and sexual development. We present a 10X draft sequence of the P. anserina genome, linked to the sequences of a large expressed sequence tag collection. Similar to higher eukaryotes, the P. anserina transcription/splicing machinery generates numerous non-conventional transcripts. Comparison of the P. anserina genome and orthologous gene set with those of its close relative, Neurospora crassa, shows that synteny is poorly conserved, the main result of evolution being gene shuffling in the same chromosome. The P. anserina genome contains fewer repeated sequences and has evolved new genes by duplication since its separation from N. crassa, despite the presence of the repeat induced point mutation mechanism that mutates duplicated sequences. We also provide evidence that frequent gene loss took place in the lineages leading to P. anserina and N. crassa. P. anserina contains a large and highly specialized set of genes involved in utilization of natural carbon sources commonly found in its natural biotope. It includes genes potentially involved in lignin degradation and efficient cellulose breakdown. The features of the P. anserina genome indicate a highly dynamic evolution since the divergence of P. anserina and N. crassa, leading to the ability of the former to use specific complex carbon sources that match its needs in its natural biotope.
Proteomics Improves the New Understanding of Honeybee Biology.

PubMed

Hora, Zewdu Ararso; Altaye, Solomon Zewdu; Wubie, Abebe Jemberie; Li, Jianke

2018-04-11

The honeybee is one of the most valuable insect pollinators, playing a key role in pollinating wild vegetation and agricultural crops, with a significant contribution to the world's food production. Although honeybees have long been studied as a model for social evolution, honeybee biology at the molecular level remained poorly understood until 2006. With the availability of the honeybee genome sequence and technological advancements in protein separation, mass spectrometry, and bioinformatics, aspects of honeybee biology such as developmental biology, physiology, behavior, neurobiology, and immunology have been explored to new depths at molecular and biochemical levels. This Review comprehensively summarizes the recent progress in honeybee biology using proteomics to study developmental physiology, task transition, and physiological changes in some of the organs, tissues, and cells based on achievements from the authors' laboratory in this field. The research advances of honeybee proteomics provide new insights for understanding honeybee biology and future research directions.
Adaptive laboratory evolution -- principles and applications for biotechnology.

PubMed

Dragosits, Martin; Mattanovich, Diethard

2013-07-01

Adaptive laboratory evolution is a frequently used method in biological studies to gain insights into the basic mechanisms of molecular evolution and adaptive changes that accumulate in microbial populations during long-term selection under specified growth conditions. Although regularly performed for more than 25 years, the advent of transcript and cheap next-generation sequencing technologies has resulted in many recent studies, which successfully applied this technique in order to engineer microbial cells for biotechnological applications. Adaptive laboratory evolution has some major benefits as compared with classical genetic engineering but also some inherent limitations. However, recent studies show how some of the limitations may be overcome in order to successfully incorporate adaptive laboratory evolution in microbial cell factory design. Over the last two decades important insights into nutrient and stress metabolism of relevant model species were acquired, whereas some other aspects such as niche-specific differences of non-conventional cell factories are not completely understood. Altogether the current status and its future perspectives highlight the importance and potential of adaptive laboratory evolution as an approach in biotechnological engineering.
Gene network polymorphism is the raw material of natural selection: the selfish gene network hypothesis.

PubMed

Boldogköi, Zsolt

2004-09-01

Population genetics, the mathematical theory of modern evolutionary biology, defines evolution as the alteration of the frequency of distinct gene variants (alleles) differing in fitness over time. The major problem with this view is that in gene and protein sequences we can find little evidence concerning the molecular basis of phenotypic variance, especially those that would confer adaptive benefit to the bearers. Some novel data, however, suggest that a large amount of genetic variation exists in the regulatory region of genes within populations. In addition, comparison of homologous DNA sequences of various species shows that evolution appears to depend more strongly on gene expression than on the genes themselves. Furthermore, it has been demonstrated in several systems that genes form functional networks, whose products exhibit interrelated expression profiles. Finally, it has been found that regulatory circuits of development behave as evolutionary units. These data demonstrate that our view of evolution calls for a new synthesis. In this article I propose a novel concept, termed the selfish gene network hypothesis, which is based on an overall consideration of the above findings. The major statements of this hypothesis are as follows. (1) Instead of individual genes, gene networks (GNs) are responsible for the determination of traits and behaviors. (2) The primary source of microevolution is the intraspecific polymorphism in GNs and not the allelic variation in either the coding or the regulatory sequences of individual genes. (3) GN polymorphism is generated by the variation in the regulatory regions of the component genes and not by the variance in their coding sequences. (4) Evolution proceeds through continuous restructuring of the composition of GNs rather than fixing of specific alleles or GN variants.
Testing Convergent Evolution in Auditory Processing Genes between Echolocating Mammals and the Aye-Aye, a Percussive-Foraging Primate.

PubMed

Bankoff, Richard J; Jerjos, Michael; Hohman, Baily; Lauterbur, M Elise; Kistler, Logan; Perry, George H

2017-07-01

Several taxonomically distinct mammalian groups, certain microbats and cetaceans (e.g., dolphins), share both morphological adaptations related to echolocation behavior and strong signatures of convergent evolution at the amino acid level across seven genes related to auditory processing. Aye-ayes (Daubentonia madagascariensis) are nocturnal lemurs with a specialized auditory processing system. Aye-ayes tap rapidly along the surfaces of trees, listening to reverberations to identify the mines of wood-boring insect larvae; this behavior has been hypothesized to functionally mimic echolocation. Here we investigated whether there are signals of convergence in auditory processing genes between aye-ayes and known mammalian echolocators. We developed a computational pipeline (Basic Exon Assembly Tool) that produces consensus sequences for regions of interest from shotgun genomic sequencing data for nonmodel organisms without requiring de novo genome assembly. We reconstructed complete coding region sequences for the seven convergent echolocating bat-dolphin genes for aye-ayes and another lemur. We compared sequences from these two lemurs in a phylogenetic framework with those of bat and dolphin echolocators and appropriate nonecholocating outgroups. Our analysis reaffirms the existence of amino acid convergence at these loci among echolocating bats and dolphins; some methods also detected signals of convergence between echolocating bats and both mice and elephants. However, we observed no significant signal of amino acid convergence between aye-ayes and echolocating bats and dolphins, suggesting that aye-aye tap-foraging auditory adaptations represent distinct evolutionary innovations. These results are also consistent with a developing consensus that convergent behavioral ecology does not reliably predict convergent molecular evolution. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
A review of the evolution of viviparity in squamate reptiles: the past, present and future role of molecular biology and genomics.

PubMed

Murphy, Bridget F; Thompson, Michael B

2011-07-01

Squamate reptiles (lizards and snakes) offer a unique model system for testing hypotheses about the evolutionary transition from oviparity (egg-laying) to viviparity (live-bearing) in amniote vertebrates. The evolution of squamate viviparity has occurred remarkably frequently (>108 times) and has resulted in major changes in reproductive physiology. Such frequent changes in reproductive strategy pose two questions: (1) what are the molecular mechanisms responsible for the evolution of squamate viviparity? (2) Are these molecular mechanisms the same for separate origins of viviparity? Molecular approaches, such as RT-PCR, in situ hybridisation, Western blotting and immunofluorescence, have been invaluable for identifying genes and proteins that are involved in squamate placental development, materno-foetal immunotolerance, placental transport, placental angiogenesis, hormone synthesis and hormone receptor expression. However, the candidate-gene or -protein approach that has been used until now does not allow for de novo gene/protein discovery; results to date suggest that the reproductive physiologies of mammals and squamate reptiles are very similar, but this conclusion may simply be due to a limited capacity to study the subset of genes and proteins that are unique to reptiles. Progress has also been slowed by the lack of appropriate molecular and genomic resources for squamate reptiles. The advent of next-generation sequencing provides a relatively inexpensive way to conduct rapid high-throughput sequencing of genomes and transcriptomes. We discuss the potential use of next-generation sequencing technologies to analyse differences in gene expression between oviparous and viviparous squamates, provide important sequence information for reptiles, and generate testable hypotheses for the evolution of viviparity.
Physiology is rocking the foundations of evolutionary biology.

PubMed

Noble, Denis

2013-08-01

The 'Modern Synthesis' (Neo-Darwinism) is a mid-20th century gene-centric view of evolution, based on random mutations accumulating to produce gradual change through natural selection. Any role of physiological function in influencing genetic inheritance was excluded. The organism became a mere carrier of the real objects of selection, its genes. We now know that genetic change is far from random and often not gradual. Molecular genetics and genome sequencing have deconstructed this unnecessarily restrictive view of evolution in a way that reintroduces physiological function and interactions with the environment as factors influencing the speed and nature of inherited change. Acquired characteristics can be inherited, and in a small but growing number of cases that inheritance has now been shown to be robust for many generations. The 21st century can look forward to a new synthesis that will reintegrate physiology with evolutionary biology.
Biological and serological variability, evolution and molecular epidemiology of Zucchini yellow mosaic virus (ZYMV, Potyvirus) with special reference to Caribbean islands.

PubMed

Desbiez, C; Wipf-Scheibel, C; Lecoq, H

2002-04-23

Zucchini yellow mosaic virus (ZYMV, Potyvirus) emerged as an important pathogen of cucurbits within the last 20 years. Its origins and mechanisms for evolution and worldwide spread represent important questions to understand plant virus emergence. Sequence analysis on a 250 nucleotide fragment including the N-terminal part of the coat protein coding region, revealed one major group of strains, and some highly divergent isolates from distinct origins. Within the major group, three subsets of strains were defined without correlation with geographic origin, year of collection or biological properties. ZYMV was first observed in Martinique and Guadeloupe in 1992 and 1994, respectively. We studied the evolution of ZYMV variability on both islands in the few years following the putative virus introduction. In Martinique, molecular divergence remained low even after 6 years, suggesting a lack of new introductions. Interactions between strains resulted in a stability of the high biological variability, while the serological diversity decreased and molecular divergence remained low. In Guadeloupe, as in Martinique in 1993, serological variability was high shortly after virus introduction. While the first introduction in Guadeloupe was independent from Martinique, the 'Martinique' type was detected in 1998, suggesting further introductions, maybe through viruliferous aphids or imported plant material.
Sexual Selection of Protamine 1 in Mammals.

PubMed

Lüke, Lena; Tourmente, Maximiliano; Roldan, Eduardo R S

2016-01-01

Protamines have a crucial role in male fertility. They are involved in sperm chromatin packaging and influence the shape of the sperm head and, hence, are important for sperm performance. Protamine structure is basic with numerous arginine-rich DNA-binding domains. Postcopulatory sexual selection is thought to play an important role in protamine sequence evolution and expression. Here, we analyze patterns of evolution and sexual selection (in the form of sperm competition) acting on protamine 1 gene sequence in 237 mammalian species. We assessed common patterns as well as differences between the major mammalian subclasses (Eutheria, Metatheria) and clades. We found that a high arginine content in protamine 1 associates with a lower sperm head width, which may have an impact on sperm swimming velocity. Increase in arginine content in protamine 1 across mammals appears to take place in a way consistent with sexual selection. In metatherians, increase in sequence length correlates with sexual selection. Differences in selective pressures on sequences and codon sites were observed between mammalian clades. Our study revealed a complex evolutionary pattern of protamine 1, with different selective constraints, and effects of sexual selection, between mammalian groups. In contrast, the effect of arginine content on head shape, and the possible involvement of sperm competition, was identified across all mammals. © The Author 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Ancient Recombination Events between Human Herpes Simplex Viruses.

PubMed

Burrel, Sonia; Boutolleau, David; Ryu, Diane; Agut, Henri; Merkel, Kevin; Leendertz, Fabian H; Calvignac-Spencer, Sébastien

2017-07-01

Herpes simplex viruses 1 and 2 (HSV-1 and HSV-2) are seen as close relatives but also unambiguously considered as evolutionarily independent units. Here, we sequenced the genomes of 18 HSV-2 isolates characterized by divergent UL30 gene sequences to further elucidate the evolutionary history of this virus. Surprisingly, genome-wide recombination analyses showed that all HSV-2 genomes sequenced to date contain HSV-1 fragments. Using phylogenomic analyses, we could also show that two main HSV-2 lineages exist. One lineage is mostly restricted to sub-Saharan Africa whereas the other has reached a global distribution. Interestingly, only the worldwide lineage is characterized by ancient recombination events with HSV-1. Our findings highlight the complexity of HSV-2 evolution, a virus of putative zoonotic origin which later recombined with its human-adapted relative. They also suggest that coinfections with HSV-1 and 2 may have genomic and potentially functional consequences and should therefore be monitored more closely. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Phylogenetics beyond biology.

PubMed

Retzlaff, Nancy; Stadler, Peter F

2018-06-21

Evolutionary processes have been described not only in biology but also for a wide range of human cultural activities including languages and law. In contrast to the evolution of DNA or protein sequences, the detailed mechanisms giving rise to the observed evolution-like processes are not or only partially known. The absence of a mechanistic model of evolution implies that it remains unknown how the distances between different taxa have to be quantified. Considering distortions of metric distances, we first show that poor choices of the distance measure can lead to incorrect phylogenetic trees. Based on the well-known fact that phylogenetic inference requires additive metrics, we then show that the correct phylogeny can be computed from a distance matrix D if there is a monotonic, subadditive function ζ such that ζ⁻¹(D) is additive. The required metric-preserving transformation ζ can be computed as the solution of an optimization problem. This result shows that the problem of phylogeny reconstruction is well defined even if a detailed mechanistic model of the evolutionary process remains elusive.
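
The additivity requirement mentioned in this abstract can be illustrated with the classical four-point condition: a distance matrix is additive (tree-like) when, for every four taxa, the two largest of the three pairwise sums are equal. The following is a minimal sketch, not the authors' method; the function name `is_additive`, the example tree, and the choice of the square root as a monotonic, subadditive distortion are all assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt

def is_additive(D, tol=1e-9):
    """Check the four-point condition on a symmetric distance matrix D."""
    n = len(D)
    for i, j, k, l in combinations(range(n), 4):
        # The three ways of pairing up four taxa.
        sums = sorted([D[i][j] + D[k][l],
                       D[i][k] + D[j][l],
                       D[i][l] + D[j][k]])
        # Additive metrics require the two largest sums to coincide.
        if abs(sums[2] - sums[1]) > tol:
            return False
    return True

# Hypothetical tree ((a,b),(c,d)) with pendant edges 1, 2, 3, 4 and an
# internal edge of length 1; path lengths give an additive matrix.
tree_dist = [
    [0, 3, 5, 6],
    [3, 0, 6, 7],
    [5, 6, 0, 7],
    [6, 7, 7, 0],
]

# Distort every distance with a monotonic, subadditive function (sqrt).
distorted = [[sqrt(x) for x in row] for row in tree_dist]

# Applying the inverse transformation (squaring) recovers additivity.
restored = [[x ** 2 for x in row] for row in distorted]

print(is_additive(tree_dist), is_additive(distorted), is_additive(restored))
```

In this toy case the inverse transformation is known in advance; the abstract describes the harder situation in which it has to be found by optimization.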

An experimental phylogeny to benchmark ancestral sequence reconstruction

PubMed Central

Randall, Ryan N.; Radford, Caelan E.; Roof, Kelsey A.; Natarajan, Divya K.; Gaucher, Eric A.

2016-01-01

Ancestral sequence reconstruction (ASR) is a still-burgeoning method that has revealed many key mechanisms of molecular evolution. One criticism of the approach is an inability to validate its algorithms within a biological context as opposed to a computer simulation. Here we build an experimental phylogeny using the gene of a single red fluorescent protein to address this criticism. The evolved phylogeny consists of 19 operational taxonomic units (leaves) and 17 ancestral bifurcations (nodes) that display a wide variety of fluorescent phenotypes. The 19 leaves then serve as 'modern' sequences that we subject to ASR analyses using various algorithms and benchmark against the known ancestral genotypes and ancestral phenotypes. We confirm computer simulations that show all algorithms infer ancient sequences with high accuracy, yet we also reveal wide variation in the phenotypes encoded by incorrectly inferred sequences. Specifically, Bayesian methods incorporating rate variation significantly outperform the maximum parsimony criterion in phenotypic accuracy. Subsampling of extant sequences had a minor effect on the inference of ancestral sequences. PMID:27628687
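
For readers unfamiliar with the maximum parsimony criterion named in this abstract, the sketch below shows the bottom-up pass of Fitch's small-parsimony algorithm for a single alignment site. It is an illustration only and makes no claim about the software used in the study; the nested-tuple tree encoding, the function name `fitch`, and the example tree are assumptions.

```python
def fitch(tree):
    """Bottom-up Fitch pass for one site.

    `tree` is either a leaf (a one-character string holding the observed
    state) or a pair (left, right) of subtrees. Returns the set of
    candidate states at the root of `tree` and the minimum number of
    state changes required within it.
    """
    if isinstance(tree, str):
        return {tree}, 0
    left, right = tree
    left_states, left_changes = fitch(left)
    right_states, right_changes = fitch(right)
    shared = left_states & right_states
    if shared:
        # Children agree on at least one state: no extra change needed.
        return shared, left_changes + right_changes
    # Children disagree: keep the union and count one change.
    return left_states | right_states, left_changes + right_changes + 1

# Hypothetical five-leaf tree with observed nucleotides at one site.
example = ((("A", "A"), "G"), ("G", "G"))
root_states, n_changes = fitch(example)
print(root_states, n_changes)
```

A full reconstruction would add a top-down pass to assign one state to each internal node and repeat the procedure for every site.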
Optimal network alignment with graphlet degree vectors.

PubMed

Milenković, Tijana; Ng, Weng Leong; Hayes, Wayne; Przulj, Natasa

2010-06-30

Important biological information is encoded in the topology of biological networks. Comparative analyses of biological networks are proving to be valuable, as they can lead to transfer of knowledge between species and give deeper insights into biological function, disease, and evolution. We introduce a new method that uses the Hungarian algorithm to produce optimal global alignment between two networks using any cost function. We design a cost function based solely on network topology and use it in our network alignment. Our method can be applied to any two networks, not just biological ones, since it is based only on network topology. We use our new method to align protein-protein interaction networks of two eukaryotic species and demonstrate that our alignment exposes large and topologically complex regions of network similarity. At the same time, our alignment is biologically valid, since many of the aligned protein pairs perform the same biological function. From the alignment, we predict function of yet unannotated proteins, many of which we validate in the literature. Also, we apply our method to find topological similarities between metabolic networks of different species and build phylogenetic trees based on our network alignment score. The phylogenetic trees obtained in this way bear a striking resemblance to the ones obtained by sequence alignments. Our method detects topologically similar regions in large networks that are statistically significant. It does this independent of protein sequence or any other information external to network topology.
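The idea of aligning two networks by minimizing a topology-only cost can be sketched on a toy example. The sketch below makes two loud assumptions: it uses the difference in node degree as a stand-in cost (the paper's cost is based on graphlet degree vectors, which are not reproduced here), and it finds the optimal assignment by brute force over permutations rather than with the Hungarian algorithm, which is feasible only for very small networks. All names are hypothetical.

```python
from itertools import permutations

def degrees(adj):
    """Node degree for an undirected network given as {node: set(neighbors)}."""
    return {node: len(neighbors) for node, neighbors in adj.items()}

def optimal_alignment(adj_a, adj_b):
    """Exhaustively find the minimum-cost injective mapping of nodes in A to B.

    Cost of pairing a with b is |deg(a) - deg(b)| (toy stand-in cost).
    Requires len(adj_a) <= len(adj_b).
    """
    nodes_a, nodes_b = sorted(adj_a), sorted(adj_b)
    deg_a, deg_b = degrees(adj_a), degrees(adj_b)
    best_cost, best_map = None, None
    for perm in permutations(nodes_b, len(nodes_a)):
        cost = sum(abs(deg_a[a] - deg_b[b]) for a, b in zip(nodes_a, perm))
        if best_cost is None or cost < best_cost:
            best_cost, best_map = cost, dict(zip(nodes_a, perm))
    return best_cost, best_map

# Hypothetical networks: a hub with two neighbors, and a three-node path.
network_a = {"h": {"x", "y"}, "x": {"h"}, "y": {"h"}}
network_b = {"p": {"q"}, "q": {"p", "r"}, "r": {"q"}}

cost, mapping = optimal_alignment(network_a, network_b)
print(cost, mapping)
```

For realistic network sizes an assignment solver such as the Hungarian algorithm would replace the permutation loop, since it solves the same minimization in polynomial time.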
Evolution of puma lentivirus in bobcats (Lynx rufus) and mountain lions (Puma concolor) in North America.

PubMed

Lee, Justin S; Bevins, Sarah N; Serieys, Laurel E K; Vickers, Winston; Logan, Ken A; Aldredge, Mat; Boydston, Erin E; Lyren, Lisa M; McBride, Roy; Roelke-Parker, Melody; Pecon-Slattery, Jill; Troyer, Jennifer L; Riley, Seth P; Boyce, Walter M; Crooks, Kevin R; VandeWoude, Sue

2014-07-01

Mountain lions (Puma concolor) throughout North and South America are infected with puma lentivirus clade B (PLVB). A second, highly divergent lentiviral clade, PLVA, infects mountain lions in southern California and Florida. Bobcats (Lynx rufus) in these two geographic regions are also infected with PLVA, and to date, this is the only strain of lentivirus identified in bobcats. We sequenced full-length PLV genomes in order to characterize the molecular evolution of PLV in bobcats and mountain lions. Low sequence homology (88% average pairwise identity) and frequent recombination (1 recombination breakpoint per 3 isolates analyzed) were observed in both clades. Viral proteins have markedly different patterns of evolution; sequence homology and negative selection were highest in Gag and Pol and lowest in Vif and Env. A total of 1.7% of sites across the PLV genome evolve under positive selection, indicating that host-imposed selection pressure is an important force shaping PLV evolution. PLVA strains are highly spatially structured, reflecting the population dynamics of their primary host, the bobcat. In contrast, the phylogeography of PLVB reflects the highly mobile mountain lion, with diverse PLVB isolates cocirculating in some areas and genetically related viruses being present in populations separated by thousands of kilometers. We conclude that PLVA and PLVB are two different viral species with distinct feline hosts and evolutionary histories. Importance: An understanding of viral evolution in natural host populations is a fundamental goal of virology, molecular biology, and disease ecology. Here we provide a detailed analysis of puma lentivirus (PLV) evolution in two natural carnivore hosts, the bobcat and mountain lion. Our results illustrate that PLV evolution is a dynamic process that results from high rates of viral mutation/recombination and host-imposed selection pressure. Copyright © 2014, American Society for Microbiology. All Rights Reserved.
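The "average pairwise identity" statistic quoted in this abstract can be computed as sketched below. This is a minimal illustration, assuming equal-length, already-aligned sequences without gaps; the three short sequences are made up and are not PLV data, and the authors' exact handling of gaps and ambiguous bases is not stated in the abstract.

```python
from itertools import combinations

def pairwise_identity(seq_a, seq_b):
    """Fraction of aligned positions at which two sequences carry the same base."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return matches / len(seq_a)

def average_pairwise_identity(sequences):
    """Mean identity over all unordered pairs of sequences."""
    pairs = list(combinations(sequences, 2))
    return sum(pairwise_identity(a, b) for a, b in pairs) / len(pairs)

# Hypothetical aligned sequences (not real viral data).
toy_alignment = ["ACGTACGT", "ACGTACGA", "ACGAACGA"]

avg_identity = average_pairwise_identity(toy_alignment)
print(round(avg_identity, 4))
```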
Positioning Genomics in Biology Education: Content Mapping of Undergraduate Biology Textbooks

PubMed Central

Wernick, Naomi L. B.; Ndung’u, Eric; Haughton, Dominique; Ledley, Fred D.

2014-01-01

Biological thought increasingly recognizes the centrality of the genome in constituting and regulating processes ranging from cellular systems to ecology and evolution. In this paper, we ask whether genomics is similarly positioned as a core concept in the instructional sequence for undergraduate biology. Using quantitative methods, we analyzed the order in which core biological concepts were introduced in textbooks for first-year general and human biology. Statistical analysis was performed using self-organizing map algorithms and conventional methods to identify clusters of terms and their relative position in the books. General biology textbooks for both majors and nonmajors introduced genome-related content after text related to cell biology and biological chemistry, but before content describing higher-order biological processes. However, human biology textbooks most often introduced genomic content near the end of the books. These results suggest that genomics is not yet positioned as a core concept in commonly used textbooks for first-year biology and raise questions about whether such textbooks, or courses based on the outline of these textbooks, provide an appropriate foundation for understanding contemporary biological science. PMID:25574293
Positioning genomics in biology education: content mapping of undergraduate biology textbooks.

PubMed

Wernick, Naomi L B; Ndung'u, Eric; Haughton, Dominique; Ledley, Fred D

2014-12-01

Biological thought increasingly recognizes the centrality of the genome in constituting and regulating processes ranging from cellular systems to ecology and evolution. In this paper, we ask whether genomics is similarly positioned as a core concept in the instructional sequence for undergraduate biology. Using quantitative methods, we analyzed the order in which core biological concepts were introduced in textbooks for first-year general and human biology. Statistical analysis was performed using self-organizing map algorithms and conventional methods to identify clusters of terms and their relative position in the books. General biology textbooks for both majors and nonmajors introduced genome-related content after text related to cell biology and biological chemistry, but before content describing higher-order biological processes. However, human biology textbooks most often introduced genomic content near the end of the books. These results suggest that genomics is not yet positioned as a core concept in commonly used textbooks for first-year biology and raise questions about whether such textbooks, or courses based on the outline of these textbooks, provide an appropriate foundation for understanding contemporary biological science.
Mind the gap; seven reasons to close fragmented genome assemblies.

PubMed

Thomma, Bart P H J; Seidl, Michael F; Shi-Kunne, Xiaoqian; Cook, David E; Bolton, Melvin D; van Kan, Jan A L; Faino, Luigi

2016-05-01

Like other domains of life, research into the biology of filamentous microbes has greatly benefited from the advent of whole-genome sequencing. Next-generation sequencing (NGS) technologies have revolutionized sequencing, making genomic sciences accessible to many academic laboratories including those that study non-model organisms. Thus, hundreds of fungal genomes have been sequenced and are publicly available today, although these initiatives have typically yielded considerably fragmented genome assemblies that often lack large contiguous genomic regions. Many important genomic features are contained in intergenic DNA that is often missing in current genome assemblies, and recent studies underscore the significance of non-coding regions and repetitive elements for the life style, adaptability and evolution of many organisms. The study of particular types of genetic elements, such as telomeres, centromeres, repetitive elements, effectors, and clusters of co-regulated genes, but also of phenomena such as structural rearrangements, genome compartmentalization and epigenetics, greatly benefits from having a contiguous and high-quality, preferably even complete and gapless, genome assembly. Here we discuss a number of important reasons to produce gapless, finished, genome assemblies to help answer important biological questions. Copyright © 2015 Elsevier Inc. All rights reserved.
RNA regulators responding to ribosomal protein S15 are frequent in sequence space

PubMed Central

Slinger, Betty L.; Meyer, Michelle M.

2016-01-01

There are several natural examples of distinct RNA structures that interact with the same ligand to regulate the expression of homologous genes in different organisms. One essential question regarding this phenomenon is whether such RNA regulators are the result of convergent or divergent evolution. Are the RNAs derived from some common ancestor and diverged to the point where we cannot identify the similarity, or have multiple solutions to the same biological problem arisen independently? A key variable in assessing these alternatives is how frequently such regulators arise within sequence space. Ribosomal protein S15 is autogenously regulated via an RNA regulator in many bacterial species; four apparently distinct regulators have been functionally validated in different bacterial phyla. Here, we explore how frequently such regulators arise within a partially randomized sequence population. We find many RNAs that interact specifically with ribosomal protein S15 from Geobacillus kaustophilus with biologically relevant dissociation constants. Furthermore, of the six sequences we characterize, four show regulatory activity in an Escherichia coli reporter assay. Subsequent footprinting and mutagenesis analysis indicates that protein binding proximal to regulatory features such as the Shine–Dalgarno sequence is sufficient to enable regulation, suggesting that regulation in response to S15 is relatively easily acquired. PMID:27580716
Evolutionary crossroads in developmental biology: Cnidaria

PubMed Central

Technau, Ulrich; Steele, Robert E.

2011-01-01

There is growing interest in the use of cnidarians (corals, sea anemones, jellyfish and hydroids) to investigate the evolution of key aspects of animal development, such as the formation of the third germ layer (mesoderm), the nervous system and the generation of bilaterality. The recent sequencing of the Nematostella and Hydra genomes, and the establishment of methods for manipulating gene expression, have inspired new research efforts using cnidarians. Here, we present the main features of cnidarian models and their advantages for research, and summarize key recent findings using these models that have informed our understanding of the evolution of the developmental processes underlying metazoan body plan formation. PMID:21389047
Evolutionary crossroads in developmental biology: Cnidaria.

PubMed

Technau, Ulrich; Steele, Robert E

2011-04-01

There is growing interest in the use of cnidarians (corals, sea anemones, jellyfish and hydroids) to investigate the evolution of key aspects of animal development, such as the formation of the third germ layer (mesoderm), the nervous system and the generation of bilaterality. The recent sequencing of the Nematostella and Hydra genomes, and the establishment of methods for manipulating gene expression, have inspired new research efforts using cnidarians. Here, we present the main features of cnidarian models and their advantages for research, and summarize key recent findings using these models that have informed our understanding of the evolution of the developmental processes underlying metazoan body plan formation.
The genome of Eucalyptus grandis.

PubMed

Myburg, Alexander A; Grattapaglia, Dario; Tuskan, Gerald A; Hellsten, Uffe; Hayes, Richard D; Grimwood, Jane; Jenkins, Jerry; Lindquist, Erika; Tice, Hope; Bauer, Diane; Goodstein, David M; Dubchak, Inna; Poliakov, Alexandre; Mizrachi, Eshchar; Kullan, Anand R K; Hussey, Steven G; Pinard, Desre; van der Merwe, Karen; Singh, Pooja; van Jaarsveld, Ida; Silva-Junior, Orzenil B; Togawa, Roberto C; Pappas, Marilia R; Faria, Danielle A; Sansaloni, Carolina P; Petroli, Cesar D; Yang, Xiaohan; Ranjan, Priya; Tschaplinski, Timothy J; Ye, Chu-Yu; Li, Ting; Sterck, Lieven; Vanneste, Kevin; Murat, Florent; Soler, Marçal; Clemente, Hélène San; Saidi, Naijib; Cassan-Wang, Hua; Dunand, Christophe; Hefer, Charles A; Bornberg-Bauer, Erich; Kersting, Anna R; Vining, Kelly; Amarasinghe, Vindhya; Ranik, Martin; Naithani, Sushma; Elser, Justin; Boyd, Alexander E; Liston, Aaron; Spatafora, Joseph W; Dharmwardhana, Palitha; Raja, Rajani; Sullivan, Christopher; Romanel, Elisson; Alves-Ferreira, Marcio; Külheim, Carsten; Foley, William; Carocha, Victor; Paiva, Jorge; Kudrna, David; Brommonschenkel, Sergio H; Pasquali, Giancarlo; Byrne, Margaret; Rigault, Philippe; Tibbits, Josquin; Spokevicius, Antanas; Jones, Rebecca C; Steane, Dorothy A; Vaillancourt, René E; Potts, Brad M; Joubert, Fourie; Barry, Kerrie; Pappas, Georgios J; Strauss, Steven H; Jaiswal, Pankaj; Grima-Pettenati, Jacqueline; Salse, Jérôme; Van de Peer, Yves; Rokhsar, Daniel S; Schmutz, Jeremy

2014-06-19

Eucalypts are the world's most widely planted hardwood trees. Their outstanding diversity, adaptability and growth have made them a global renewable resource of fibre and energy. We sequenced and assembled >94% of the 640-megabase genome of Eucalyptus grandis. Of 36,376 predicted protein-coding genes, 34% occur in tandem duplications, the largest proportion thus far in plant genomes. Eucalyptus also shows the highest diversity of genes for specialized metabolites such as terpenes that act as chemical defence and provide unique pharmaceutical oils. Genome sequencing of the E. grandis sister species E. globulus and a set of inbred E. grandis tree genomes reveals dynamic genome evolution and hotspots of inbreeding depression. The E. grandis genome is the first reference for the eudicot order Myrtales and is placed here sister to the eurosids. This resource expands our understanding of the unique biology of large woody perennials and provides a powerful tool to accelerate comparative biology, breeding and biotechnology.
From Sequence and Forces to Structure, Function and Evolution of Intrinsically Disordered Proteins

PubMed Central

Forman-Kay, Julie D.; Mittag, Tanja

2015-01-01

Intrinsically disordered proteins (IDPs), which lack persistent structure, are a challenge to structural biology due to the inapplicability of standard methods for characterization of folded proteins as well as their deviation from the dominant structure/function paradigm. Their widespread presence and involvement in biological function, however, has spurred the growing acceptance of the importance of IDPs and the development of new tools for studying their structure, dynamics and function. The interplay of folded and disordered domains or regions for function and the existence of a continuum of protein states with respect to conformational energetics, motional timescales and compactness is shaping a unified understanding of structure-dynamics-disorder/function relationships. On the 20th anniversary of this journal, Structure, we provide a historical perspective on the investigation of IDPs and summarize the sequence features and physical forces that underlie their unique structural, functional and evolutionary properties. PMID:24010708
From sequence and forces to structure, function, and evolution of intrinsically disordered proteins.

PubMed

Forman-Kay, Julie D; Mittag, Tanja

2013-09-03

Intrinsically disordered proteins (IDPs), which lack persistent structure, are a challenge to structural biology due to the inapplicability of standard methods for characterization of folded proteins as well as their deviation from the dominant structure/function paradigm. Their widespread presence and involvement in biological function, however, has spurred the growing acceptance of the importance of IDPs and the development of new tools for studying their structure, dynamics, and function. The interplay of folded and disordered domains or regions for function and the existence of a continuum of protein states with respect to conformational energetics, motional timescales, and compactness are shaping a unified understanding of structure-dynamics-disorder/function relationships. On the 20th anniversary of Structure, we provide a historical perspective on the investigation of IDPs and summarize the sequence features and physical forces that underlie their unique structural, functional, and evolutionary properties. Copyright © 2013 Elsevier Ltd. All rights reserved.
Genome analysis of the platypus reveals unique signatures of evolution.

PubMed

Warren, Wesley C; Hillier, LaDeana W; Marshall Graves, Jennifer A; Birney, Ewan; Ponting, Chris P; Grützner, Frank; Belov, Katherine; Miller, Webb; Clarke, Laura; Chinwalla, Asif T; Yang, Shiaw-Pyng; Heger, Andreas; Locke, Devin P; Miethke, Pat; Waters, Paul D; Veyrunes, Frédéric; Fulton, Lucinda; Fulton, Bob; Graves, Tina; Wallis, John; Puente, Xose S; López-Otín, Carlos; Ordóñez, Gonzalo R; Eichler, Evan E; Chen, Lin; Cheng, Ze; Deakin, Janine E; Alsop, Amber; Thompson, Katherine; Kirby, Patrick; Papenfuss, Anthony T; Wakefield, Matthew J; Olender, Tsviya; Lancet, Doron; Huttley, Gavin A; Smit, Arian F A; Pask, Andrew; Temple-Smith, Peter; Batzer, Mark A; Walker, Jerilyn A; Konkel, Miriam K; Harris, Robert S; Whittington, Camilla M; Wong, Emily S W; Gemmell, Neil J; Buschiazzo, Emmanuel; Vargas Jentzsch, Iris M; Merkel, Angelika; Schmitz, Juergen; Zemann, Anja; Churakov, Gennady; Kriegs, Jan Ole; Brosius, Juergen; Murchison, Elizabeth P; Sachidanandam, Ravi; Smith, Carly; Hannon, Gregory J; Tsend-Ayush, Enkhjargal; McMillan, Daniel; Attenborough, Rosalind; Rens, Willem; Ferguson-Smith, Malcolm; Lefèvre, Christophe M; Sharp, Julie A; Nicholas, Kevin R; Ray, David A; Kube, Michael; Reinhardt, Richard; Pringle, Thomas H; Taylor, James; Jones, Russell C; Nixon, Brett; Dacheux, Jean-Louis; Niwa, Hitoshi; Sekita, Yoko; Huang, Xiaoqiu; Stark, Alexander; Kheradpour, Pouya; Kellis, Manolis; Flicek, Paul; Chen, Yuan; Webber, Caleb; Hardison, Ross; Nelson, Joanne; Hallsworth-Pepin, Kym; Delehaunty, Kim; Markovic, Chris; Minx, Pat; Feng, Yucheng; Kremitzki, Colin; Mitreva, Makedonka; Glasscock, Jarret; Wylie, Todd; Wohldmann, Patricia; Thiru, Prathapan; Nhan, Michael N; Pohl, Craig S; Smith, Scott M; Hou, Shunfeng; Nefedov, Mikhail; de Jong, Pieter J; Renfree, Marilyn B; Mardis, Elaine R; Wilson, Richard K

2008-05-08

We present a draft genome sequence of the platypus, Ornithorhynchus anatinus. This monotreme exhibits a fascinating combination of reptilian and mammalian characters. For example, platypuses have a coat of fur adapted to an aquatic lifestyle; platypus females lactate, yet lay eggs; and males are equipped with venom similar to that of reptiles. Analysis of the first monotreme genome aligned these features with genetic innovations. We find that reptile and platypus venom proteins have been co-opted independently from the same gene families; milk protein genes are conserved despite platypuses laying eggs; and immune gene family expansions are directly related to platypus biology. Expansions of protein, non-protein-coding RNA and microRNA families, as well as repeat elements, are identified. Sequencing of this genome now provides a valuable resource for deep mammalian comparative analyses, as well as for monotreme biology and conservation.
Genome analysis of the platypus reveals unique signatures of evolution

PubMed Central

Warren, Wesley C.; Hillier, LaDeana W.; Marshall Graves, Jennifer A.; Birney, Ewan; Ponting, Chris P.; Grützner, Frank; Belov, Katherine; Miller, Webb; Clarke, Laura; Chinwalla, Asif T.; Yang, Shiaw-Pyng; Heger, Andreas; Locke, Devin P.; Miethke, Pat; Waters, Paul D.; Veyrunes, Frédéric; Fulton, Lucinda; Fulton, Bob; Graves, Tina; Wallis, John; Puente, Xose S.; López-Otín, Carlos; Ordóñez, Gonzalo R.; Eichler, Evan E.; Chen, Lin; Cheng, Ze; Deakin, Janine E.; Alsop, Amber; Thompson, Katherine; Kirby, Patrick; Papenfuss, Anthony T.; Wakefield, Matthew J.; Olender, Tsviya; Lancet, Doron; Huttley, Gavin A.; Smit, Arian F. A.; Pask, Andrew; Temple-Smith, Peter; Batzer, Mark A.; Walker, Jerilyn A.; Konkel, Miriam K.; Harris, Robert S.; Whittington, Camilla M.; Wong, Emily S. W.; Gemmell, Neil J.; Buschiazzo, Emmanuel; Vargas Jentzsch, Iris M.; Merkel, Angelika; Schmitz, Juergen; Zemann, Anja; Churakov, Gennady; Kriegs, Jan Ole; Brosius, Juergen; Murchison, Elizabeth P.; Sachidanandam, Ravi; Smith, Carly; Hannon, Gregory J.; Tsend-Ayush, Enkhjargal; McMillan, Daniel; Attenborough, Rosalind; Rens, Willem; Ferguson-Smith, Malcolm; Lefèvre, Christophe M.; Sharp, Julie A.; Nicholas, Kevin R.; Ray, David A.; Kube, Michael; Reinhardt, Richard; Pringle, Thomas H.; Taylor, James; Jones, Russell C.; Nixon, Brett; Dacheux, Jean-Louis; Niwa, Hitoshi; Sekita, Yoko; Huang, Xiaoqiu; Stark, Alexander; Kheradpour, Pouya; Kellis, Manolis; Flicek, Paul; Chen, Yuan; Webber, Caleb; Hardison, Ross; Nelson, Joanne; Hallsworth-Pepin, Kym; Delehaunty, Kim; Markovic, Chris; Minx, Pat; Feng, Yucheng; Kremitzki, Colin; Mitreva, Makedonka; Glasscock, Jarret; Wylie, Todd; Wohldmann, Patricia; Thiru, Prathapan; Nhan, Michael N.; Pohl, Craig S.; Smith, Scott M.; Hou, Shunfeng; Renfree, Marilyn B.; Mardis, Elaine R.; Wilson, Richard K.

2009-01-01

We present a draft genome sequence of the platypus, Ornithorhynchus anatinus. This monotreme exhibits a fascinating combination of reptilian and mammalian characters. For example, platypuses have a coat of fur adapted to an aquatic lifestyle; platypus females lactate, yet lay eggs; and males are equipped with venom similar to that of reptiles. Analysis of the first monotreme genome aligned these features with genetic innovations. We find that reptile and platypus venom proteins have been co-opted independently from the same gene families; milk protein genes are conserved despite platypuses laying eggs; and immune gene family expansions are directly related to platypus biology. Expansions of protein, non-protein-coding RNA and microRNA families, as well as repeat elements, are identified. Sequencing of this genome now provides a valuable resource for deep mammalian comparative analyses, as well as for monotreme biology and conservation. PMID:18464734
Naumovozyma castellii: an alternative model for budding yeast molecular biology.

PubMed

Karademir Andersson, Ahu; Cohn, Marita

2017-03-01

Naumovozyma castellii (Saccharomyces castellii) is a member of the budding yeast family Saccharomycetaceae. It has been extensively used as a model organism for telomere biology research and has gained increasing interest as a budding yeast model for functional analyses owing to its amenability to genetic modifications. Owing to the suitable phylogenetic distance to S. cerevisiae, the whole genome sequence of N. castellii has provided unique data for comparative genomic studies, and it played a key role in the establishment of the timing of the whole genome duplication and the evolutionary events that took place in the subsequent genomic evolution of the Saccharomyces lineage. Here we summarize the historical background of its establishment as a laboratory yeast species, and the development of genetic and molecular tools and strains. We review the research performed on N. castellii, focusing on areas where it has significantly contributed to the discovery of new features of molecular biology and to the advancement of our understanding of molecular evolution. Copyright © 2016 John Wiley & Sons, Ltd.
Emerging Concepts of Data Integration in Pathogen Phylodynamics.

PubMed

Baele, Guy; Suchard, Marc A; Rambaut, Andrew; Lemey, Philippe

2017-01-01

Phylodynamics has become an increasingly popular statistical framework to extract evolutionary and epidemiological information from pathogen genomes. By harnessing such information, epidemiologists aim to shed light on the spatio-temporal patterns of spread and to test hypotheses about the underlying interaction of evolutionary and ecological dynamics in pathogen populations. Although the field has witnessed a rich development of statistical inference tools with increasing levels of sophistication, these tools initially focused on sequences as their sole primary data source. Integrating various sources of information, however, promises to deliver more precise insights in infectious diseases and to increase opportunities for statistical hypothesis testing. Here, we review how the emerging concept of data integration is stimulating new advances in Bayesian evolutionary inference methodology which formalize a marriage of statistical thinking and evolutionary biology. These approaches include connecting sequence to trait evolution, such as for host, phenotypic and geographic sampling information, but also the incorporation of covariates of evolutionary and epidemic processes in the reconstruction procedures. We highlight how a full Bayesian approach to covariate modeling and testing can generate further insights into sequence evolution, trait evolution, and population dynamics in pathogen populations. Specific examples demonstrate how such approaches can be used to test the impact of host on rabies and HIV evolutionary rates, to identify the drivers of influenza dispersal as well as the determinants of rabies cross-species transmissions, and to quantify the evolutionary dynamics of influenza antigenicity. Finally, we briefly discuss how data integration is now also permeating through the inference of transmission dynamics, leading to novel insights into tree-generative processes and detailed reconstructions of transmission trees. 
[Bayesian inference; birth–death models; coalescent models; continuous trait evolution; covariates; data integration; discrete trait evolution; pathogen phylodynamics.]
Emerging Concepts of Data Integration in Pathogen Phylodynamics

PubMed Central

Baele, Guy; Suchard, Marc A.; Rambaut, Andrew; Lemey, Philippe

2017-01-01

Phylodynamics has become an increasingly popular statistical framework to extract evolutionary and epidemiological information from pathogen genomes. By harnessing such information, epidemiologists aim to shed light on the spatio-temporal patterns of spread and to test hypotheses about the underlying interaction of evolutionary and ecological dynamics in pathogen populations. Although the field has witnessed a rich development of statistical inference tools with increasing levels of sophistication, these tools initially focused on sequences as their sole primary data source. Integrating various sources of information, however, promises to deliver more precise insights in infectious diseases and to increase opportunities for statistical hypothesis testing. Here, we review how the emerging concept of data integration is stimulating new advances in Bayesian evolutionary inference methodology which formalize a marriage of statistical thinking and evolutionary biology. These approaches include connecting sequence to trait evolution, such as for host, phenotypic and geographic sampling information, but also the incorporation of covariates of evolutionary and epidemic processes in the reconstruction procedures. We highlight how a full Bayesian approach to covariate modeling and testing can generate further insights into sequence evolution, trait evolution, and population dynamics in pathogen populations. Specific examples demonstrate how such approaches can be used to test the impact of host on rabies and HIV evolutionary rates, to identify the drivers of influenza dispersal as well as the determinants of rabies cross-species transmissions, and to quantify the evolutionary dynamics of influenza antigenicity. Finally, we briefly discuss how data integration is now also permeating through the inference of transmission dynamics, leading to novel insights into tree-generative processes and detailed reconstructions of transmission trees. 
[Bayesian inference; birth–death models; coalescent models; continuous trait evolution; covariates; data integration; discrete trait evolution; pathogen phylodynamics.] PMID:28173504
Vertebrate Genome Evolution in the Light of Fish Cytogenomics and rDNAomics

PubMed Central

Howell, W. Mike

2018-01-01

To understand the cytogenomic evolution of vertebrates, we must first unravel the complex genomes of fishes, which were the first vertebrates to evolve and were ancestors to all other vertebrates. We must not forget the immense time span during which the fish genomes had to evolve. Fish cytogenomics is endowed with unique features which offer irreplaceable insights into the evolution of the vertebrate genome. Due to the general DNA base compositional homogeneity of fish genomes, fish cytogenomics is largely based on mapping DNA repeats that still represent serious obstacles in genome sequencing and assembling, even in model species. Localization of repeats on chromosomes of hundreds of fish species and populations originating from diversified environments has revealed the biological importance of this genomic fraction. Ribosomal genes (rDNA) belong to the most informative repeats and in fish, they are subject to a more relaxed regulation than in higher vertebrates. This can result in formation of a literal ‘rDNAome’ consisting of more than 20,000 copies with their high proportion employed in extra-coding functions. Because rDNA has high rates of transcription and recombination, it contributes to genome diversification and can form a reproductive barrier. Our overall knowledge of fish cytogenomics grows rapidly by a continuously increasing number of fish genomes sequenced and by use of novel sequencing methods improving genome assembly. The recently revealed exceptional compositional heterogeneity in an ancient fish lineage (gars) sheds new light on the compositional genome evolution in vertebrates generally. We highlight the power of synergy of cytogenetics and genomics in fish cytogenomics, its potential to understand the complexity of genome evolution in vertebrates, which is also linked to clinical applications and the chromosomal backgrounds of speciation. We also summarize the current knowledge on fish cytogenomics and outline its main future avenues.
PMID:29443947
Clonal evolution of chemotherapy-resistant urothelial carcinoma.

PubMed

Faltas, Bishoy M; Prandi, Davide; Tagawa, Scott T; Molina, Ana M; Nanus, David M; Sternberg, Cora; Rosenberg, Jonathan; Mosquera, Juan Miguel; Robinson, Brian; Elemento, Olivier; Sboner, Andrea; Beltran, Himisha; Demichelis, Francesca; Rubin, Mark A

2016-12-01

Chemotherapy-resistant urothelial carcinoma has no uniformly curative therapy. Understanding how selective pressure from chemotherapy directs the evolution of urothelial carcinoma and shapes its clonal architecture is a central biological question with clinical implications. To address this question, we performed whole-exome sequencing and clonality analysis of 72 urothelial carcinoma samples, including 16 matched sets of primary and advanced tumors prospectively collected before and after chemotherapy. Our analysis provided several insights: (i) chemotherapy-treated urothelial carcinoma is characterized by intra-patient mutational heterogeneity, and the majority of mutations are not shared; (ii) both branching evolution and metastatic spread are very early events in the natural history of urothelial carcinoma; (iii) chemotherapy-treated urothelial carcinoma is enriched with clonal mutations involving L1 cell adhesion molecule (L1CAM) and integrin signaling pathways; and (iv) APOBEC-induced mutagenesis is clonally enriched in chemotherapy-treated urothelial carcinoma and continues to shape the evolution of urothelial carcinoma throughout its lifetime.
Clonal Evolution of Chemotherapy-resistant Urothelial Carcinoma

PubMed Central

Faltas, Bishoy M.; Prandi, Davide; Tagawa, Scott T.; Molina, Ana M.; Nanus, David M.; Sternberg, Cora; Rosenberg, Jonathan; Mosquera, Juan Miguel; Robinson, Brian; Elemento, Olivier; Sboner, Andrea; Beltran, Himisha; Demichelis, Francesca; Rubin, Mark A.

2017-01-01

Chemotherapy-resistant urothelial carcinoma (UC) has no uniformly curative therapy. Understanding how selective pressure from chemotherapy directs UC’s evolution and shapes its clonal architecture is a central biological question with clinical implications. To address this question, we performed whole-exome sequencing and clonality analysis of 72 UCs including 16 matched sets of primary and advanced tumors prospectively collected before and after chemotherapy. Our analysis provided several insights: (i) chemotherapy-treated UC is characterized by intra-patient mutational heterogeneity and the majority of mutations are not shared, (ii) both branching evolution and metastatic spread are very early events in the natural history of UC; (iii) chemotherapy-treated UC is enriched with clonal mutations involving L1-cell adhesion molecule (L1CAM) and integrin signaling pathways; (iv) APOBEC induced-mutagenesis is clonally-enriched in chemotherapy-treated UC and continues to shape UC’s evolution throughout its lifetime. PMID:27749842

Driving in the Dark: Ten Propositions About Prediction and National Security

DTIC Science & Technology

2011-10-01

to a predicted threat list. The evolution of modern biology has produced techniques of genetic sequencing and synthesis that will permit the...and Australia, often under the rubric of Capability Based Planning. See, for example, the work of The Technical Cooperation Program at www...attacking humans. See, for example, the website of Functional Genetics , www.functional-genetics.com. 143. Stewart Brand, How Buildings Learn: What
The evolution of microRNAs in plants

PubMed Central

Cui, Jie; You, Chenjiang; Chen, Xuemei

2016-01-01

MicroRNAs (miRNAs) are a central player in post-transcriptional regulation of gene expression and are involved in numerous biological processes in eukaryotes. Knowledge of the origins and divergence of miRNAs paves the way for a better understanding of the complexity of the regulatory networks that they participate in. The biogenesis, degradation, and regulatory activities of miRNAs are relatively better understood, but the evolutionary history of miRNAs still needs more exploration. Inverted duplication of target genes, random hairpin sequences and small transposable elements constitute three main models that explain the origination of miRNA genes (MIR). Both inter- and intra-species divergence of miRNAs exhibits functional adaptation and adaptation to changing environments in evolution. Here we summarize recent progress in studies on the evolution of MIR and related genes. PMID:27886593
Prions are affected by evolution at two levels.

PubMed

Wickner, Reed B; Kelly, Amy C

2016-03-01

Prions, infectious proteins, can transmit diseases or be the basis of heritable traits (or both), mostly based on amyloid forms of the prion protein. A single protein sequence can be the basis for many prion strains/variants, with different biological properties based on different amyloid conformations, each rather stably propagating. Prions are unique in that evolution and selection work at both the level of the chromosomal gene encoding the protein, and on the prion itself selecting prion variants. Here, we summarize what is known about the evolution of prion proteins, both the genes and the prions themselves. We contrast the one known functional prion, [Het-s] of Podospora anserina, with the known disease prions, the yeast prions [PSI+] and [URE3] and the transmissible spongiform encephalopathies of mammals.
Mapping Phylogenetic Trees to Reveal Distinct Patterns of Evolution.

PubMed

Kendall, Michelle; Colijn, Caroline

2016-10-01

Evolutionary relationships are frequently described by phylogenetic trees, but a central barrier in many fields is the difficulty of interpreting data containing conflicting phylogenetic signals. We present a metric-based method for comparing trees which extracts distinct alternative evolutionary relationships embedded in data. We demonstrate detection and resolution of phylogenetic uncertainty in a recent study of anole lizards, leading to alternate hypotheses about their evolutionary relationships. We use our approach to compare trees derived from different genes of Ebolavirus and find that the VP30 gene has a distinct phylogenetic signature composed of three alternatives that differ in the deep branching structure. Keywords: phylogenetics, evolution, tree metrics, genetics, sequencing. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
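A metric-based comparison of rooted trees can be illustrated with a small sketch. The version below assumes one simple construction: for every pair of tips, record the number of edges between the root and the pair's most recent common ancestor, then take the Euclidean distance between the two resulting vectors. Whether this topology-only form matches the authors' exact definition is an assumption, and the nested-tuple tree encoding and the function names (tips, mrca_depths, tree_distance) are illustrative only.

```python
from itertools import combinations
from math import sqrt


def tips(tree):
    """Return the tip labels of a rooted tree given as nested tuples of strings."""
    if isinstance(tree, str):
        return [tree]
    return [label for child in tree for label in tips(child)]


def mrca_depths(tree, depth=0, out=None):
    """Map each unordered tip pair to the root-to-MRCA edge count."""
    if out is None:
        out = {}
    if isinstance(tree, str):
        return out
    groups = [tips(child) for child in tree]
    # Tips that fall in different child subtrees have this node as their MRCA.
    for left, right in combinations(groups, 2):
        for x in left:
            for y in right:
                out[frozenset((x, y))] = depth
    for child in tree:
        mrca_depths(child, depth + 1, out)
    return out


def tree_distance(t1, t2):
    """Euclidean distance between the MRCA-depth vectors of two trees."""
    d1, d2 = mrca_depths(t1), mrca_depths(t2)
    if d1.keys() != d2.keys():
        raise ValueError("trees must share the same set of tip labels")
    return sqrt(sum((d1[pair] - d2[pair]) ** 2 for pair in d1))


if __name__ == "__main__":
    tree_a = (("A", "B"), ("C", "D"))
    tree_b = (("A", "C"), ("B", "D"))
    print(tree_distance(tree_a, tree_a))
    print(tree_distance(tree_a, tree_b))
```

Because each tree becomes a point in an ordinary vector space, a collection of trees (for example, one per gene) can then be clustered or projected to look for distinct groups of topologies.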
Temporal variations in early developmental decisions: an engine of forebrain evolution.

PubMed

Bielen, H; Pal, S; Tole, S; Houart, C

2017-02-01

Tight control of developmental timing is pivotal to many major processes in developmental biology, such as patterning, fate specification, cell cycle dynamics, cell migration and connectivity. Temporal change in these ontogenetic sequences is known as heterochrony, a major force in the evolution of body plans and organogenesis. In the last 5 years, studies in fish and rodents indicate that heterochrony in signaling during early development generates diversity in forebrain size and complexity. Here, we summarize these findings and propose that, in addition to spatio-temporal tuning of neurogenesis, temporal and quantitative modulation of signaling events drives pivotal changes in shape, size and complexity of the forebrain across evolution, contributing to the generation of diversity in animal behavior and the emergence of cognition. Copyright © 2017 Elsevier Ltd. All rights reserved.
The evolutionary and integrative roles of transthyretin in thyroid hormone homeostasis.

PubMed

Schreiber, G

2002-10-01

In larger mammals, thyroid hormone-binding plasma proteins are albumin, transthyretin (TTR) and thyroxine (T4)-binding globulin. They differ characteristically in affinities and release rates for T4 and triiodothyronine (T3). Together, they form a 'buffering' system counteracting thyroid hormone permeation from aqueous to lipid phases. Evolution led to important differences in the expression pattern of these three proteins in tissues. In adult liver, TTR is only made in eutherians and herbivorous marsupials. During development, it is also made in tadpole and fish liver. More intense TTR synthesis than in liver is found in the choroid plexus of reptilians, birds and mammals, but none in the choroid plexus of amphibians and fish, i.e. species without a neocortex. All brain-made TTR is secreted into the cerebrospinal fluid, where it becomes the major thyroid hormone-binding protein. During ontogeny, the maximum TTR synthesis in the choroid plexus precedes that of the growth rate of the brain and occurs during the period of maximum neuroblast replication. TTR is only one component in a network of factors determining thyroid hormone distribution. This explains why, under laboratory conditions, TTR-knockout mice show no major abnormalities. The ratio of TTR affinity for T4 over affinity for T3 is higher in eutherians than in reptiles and birds. This favors T4 transport from blood to brain providing more substrate for conversion of the biologically less active T4 into the biologically more active T3 by the tissue-specific brain deiodinases. The change in affinity of TTR during evolution involves a shortening and an increase in the hydrophilicity of the N-terminal regions of the TTR subunits. The molecular mechanism for this change is a stepwise shift of the splice site at the intron 1/exon 2 border of the TTR gene. The shift probably results from a sequence of single base mutations. 
Thus, TTR evolution provides an example for a molecular mechanism of positive Darwinian evolution. The amino acid sequences of fish and amphibian TTRs are very similar to those in mammals, suggesting that substantial TTR evolution occurred before the vertebrate stage. Open reading frames for TTR-like sequences already exist in Caenorhabditis elegans, yeast and Escherichia coli genomes.
Efficient identification of Y chromosome sequences in the human and Drosophila genomes.

PubMed

Carvalho, Antonio Bernardo; Clark, Andrew G

2013-11-01

Notwithstanding their biological importance, Y chromosomes remain poorly known in most species. A major obstacle to their study is the identification of Y chromosome sequences; due to its high content of repetitive DNA, in most genome projects, the Y chromosome sequence is fragmented into a large number of small, unmapped scaffolds. Identification of Y-linked genes among these fragments has yielded important insights about the origin and evolution of Y chromosomes, but the process is labor intensive, restricting studies to a small number of species. Apart from these fragmentary assemblies, in a few mammalian species, the euchromatic sequence of the Y is essentially complete, owing to painstaking BAC mapping and sequencing. Here we use female short-read sequencing and k-mer comparison to identify Y-linked sequences in two very different genomes, Drosophila virilis and human. Using this method, essentially all D. virilis scaffolds were unambiguously classified as Y-linked or not Y-linked. We found 800 new scaffolds (totaling 8.5 Mbp), and four new genes in the Y chromosome of D. virilis, including JYalpha, a gene involved in hybrid male sterility. Our results also strongly support the preponderance of gene gains over gene losses in the evolution of the Drosophila Y. In the intensively studied human genome, used here as a positive control, we recovered all previously known genes or gene families, plus a small amount (283 kb) of new, unfinished sequence. Hence, this method works in large and complex genomes and can be applied to any species with sex chromosomes.
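The k-mer comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes that a scaffold is scored by the fraction of its k-mers that never occur in reads from a female, so Y-linked scaffolds should score close to 1. The function names and the small k used in the demo are hypothetical, and a real analysis would use a much larger k and handle sequencing errors and repeats.

```python
# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical_kmers(seq, k):
    """Yield strand-independent k-mers, skipping any that contain non-ACGT symbols."""
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            # Use the lexicographically smaller strand as the canonical form.
            yield min(kmer, reverse_complement(kmer))


def female_kmer_set(reads, k):
    """Collect every canonical k-mer observed in the female reads."""
    return {kmer for read in reads for kmer in canonical_kmers(read, k)}


def unmatched_fraction(scaffold, female_kmers, k):
    """Fraction of scaffold k-mers absent from the female k-mer set (None if no k-mers)."""
    kmers = list(canonical_kmers(scaffold, k))
    if not kmers:
        return None
    missing = sum(kmer not in female_kmers for kmer in kmers)
    return missing / len(kmers)


if __name__ == "__main__":
    female = female_kmer_set(["ACGTACGTAC"], k=4)
    print(unmatched_fraction("GTACGTACGT", female, k=4))    # shared with the female reads
    print(unmatched_fraction("GGGGCCCCTTTT", female, k=4))  # absent from the female reads
```

Scaffolds could then be classified by comparing the score to a cutoff; any such cutoff is a tuning choice not given in the abstract.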
Evolution of the Largest Mammalian Genome.

PubMed

Evans, Ben J; Upham, Nathan S; Golding, Goeffrey B; Ojeda, Ricardo A; Ojeda, Agustina A

2017-06-01

The genome of the red vizcacha rat (Rodentia, Octodontidae, Tympanoctomys barrerae) is the largest of all mammals, and about double the size of their close relative, the mountain vizcacha rat Octomys mimax, even though the lineages that gave rise to these species diverged from each other only about 5 Ma. The mechanism for this rapid genome expansion is controversial, and hypothesized to be a consequence of whole genome duplication or accumulation of repetitive elements. To test these alternative but nonexclusive hypotheses, we gathered and evaluated evidence from whole transcriptome and whole genome sequences of T. barrerae and O. mimax. We recovered support for genome expansion due to accumulation of a diverse assemblage of repetitive elements, which represent about one half and one fifth of the genomes of T. barrerae and O. mimax, respectively, but we found no strong signal of whole genome duplication. In both species, repetitive sequences were rare in transcribed regions as compared with the rest of the genome, and mostly had no close match to annotated repetitive sequences from other rodents. These findings raise new questions about the genomic dynamics of these repetitive elements, their connection to widespread chromosomal fissions that occurred in the T. barrerae ancestor, and their fitness effects, including during the evolution of hypersaline dietary tolerance in T. barrerae. © The Author(s) 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Comparative transcriptomics of Entelegyne spiders (Araneae, Entelegynae), with emphasis on molecular evolution of orphan genes.

PubMed

Carlson, David E; Hedin, Marshal

2017-01-01

Next-generation sequencing technology is rapidly transforming the landscape of evolutionary biology, and has become a cost-effective and efficient means of collecting exome information for non-model organisms. Due to their taxonomic diversity, production of interesting venom and silk proteins, and the relative scarcity of existing genomic resources, spiders in particular are excellent targets for next-generation sequencing (NGS) methods. In this study, the transcriptomes of six entelegyne spider species from three genera (Cicurina travisae, C. vibora, Habronattus signatus, H. ustulatus, Nesticus bishopi, and N. cooperi) were sequenced and de novo assembled. Each assembly was assessed for quality and completeness and functionally annotated using gene ontology information. Approximately 100 transcripts with evidence of homology to venom proteins were discovered. After identifying more than 3,000 putatively orthologous genes across all six taxa, we used comparative analyses to identify 24 instances of positively selected genes. In addition, between ~550 and 1,100 unique orphan genes were found in each genus. These unique, uncharacterized genes exhibited elevated rates of amino acid substitution, potentially consistent with lineage-specific adaptive evolution. The data generated for this study represent a valuable resource for future phylogenetic and molecular evolutionary research, and our results provide new insight into the forces driving genome evolution in taxa that span the root of entelegyne spider phylogeny.
Novel features of ARS selection in budding yeast Lachancea kluyveri

PubMed Central

2011-01-01

Background The characterization of DNA replication origins in yeast has shed much light on the mechanisms of initiation of DNA replication. However, very little is known about the evolution of origins or the evolution of mechanisms through which origins are recognized by the initiation machinery. This lack of understanding is largely due to the vast evolutionary distances between model organisms in which origins have been examined. Results In this study we have isolated and characterized autonomously replicating sequences (ARSs) in Lachancea kluyveri - a pre-whole genome duplication (WGD) budding yeast. Through a combination of experimental work and rigorous computational analysis, we show that L. kluyveri ARSs require a sequence that is similar but much longer than the ARS Consensus Sequence well defined in Saccharomyces cerevisiae. Moreover, compared with S. cerevisiae and K. lactis, the replication licensing machinery in L. kluyveri seems more tolerant to variations in the ARS sequence composition. It is able to initiate replication from almost all S. cerevisiae ARSs tested and most Kluyveromyces lactis ARSs. In contrast, only about half of the L. kluyveri ARSs function in S. cerevisiae and less than 10% function in K. lactis. Conclusions Our findings demonstrate a replication initiation system with novel features and underscore the functional diversity within the budding yeasts. Furthermore, we have developed new approaches for analyzing biologically functional DNA sequences with ill-defined motifs. PMID:22204614
Novel features of ARS selection in budding yeast Lachancea kluyveri.

PubMed

Liachko, Ivan; Tanaka, Emi; Cox, Katherine; Chung, Shau Chee Claire; Yang, Lu; Seher, Arael; Hallas, Lindsay; Cha, Eugene; Kang, Gina; Pace, Heather; Barrow, Jasmine; Inada, Maki; Tye, Bik-Kwoon; Keich, Uri

2011-12-28

The characterization of DNA replication origins in yeast has shed much light on the mechanisms of initiation of DNA replication. However, very little is known about the evolution of origins or the evolution of mechanisms through which origins are recognized by the initiation machinery. This lack of understanding is largely due to the vast evolutionary distances between model organisms in which origins have been examined. In this study we have isolated and characterized autonomously replicating sequences (ARSs) in Lachancea kluyveri - a pre-whole genome duplication (WGD) budding yeast. Through a combination of experimental work and rigorous computational analysis, we show that L. kluyveri ARSs require a sequence that is similar but much longer than the ARS Consensus Sequence well defined in Saccharomyces cerevisiae. Moreover, compared with S. cerevisiae and K. lactis, the replication licensing machinery in L. kluyveri seems more tolerant to variations in the ARS sequence composition. It is able to initiate replication from almost all S. cerevisiae ARSs tested and most Kluyveromyces lactis ARSs. In contrast, only about half of the L. kluyveri ARSs function in S. cerevisiae and less than 10% function in K. lactis. Our findings demonstrate a replication initiation system with novel features and underscore the functional diversity within the budding yeasts. Furthermore, we have developed new approaches for analyzing biologically functional DNA sequences with ill-defined motifs.
Biological and genetic evolution of HIV type 1 in two siblings with different patterns of disease progression.

PubMed

Ripamonti, Chiara; Leitner, Thomas; Laurén, Anna; Karlsson, Ingrid; Pastore, Angela; Cavarelli, Mariangela; Antonsson, Liselotte; Plebani, Anna; Fenyö, Eva Maria; Scarlatti, Gabriella

2007-12-01

To investigate the immunological and virological factors that may lead to different patterns of disease progression characteristic of HIV-1-infected children, two HIV-1-infected siblings, a slow and a fast progressor, were followed prospectively before the onset of highly active antiretroviral therapy. Viral coreceptor usage, including the use of CCR5/CXCR4 chimeric receptors, macrophage tropism, and sensitivity to the CC-chemokine RANTES, has been studied. An autologous and heterologous neutralizing antibody response has been documented using peripheral blood mononuclear cells- and GHOST(3) cell line-based assays. Viral evolution was investigated by env C2-V3 region sequence analysis. Although both siblings were infected with HIV-1 of the R5 phenotype, their viruses showed important biological differences. In the fast progressor there was a higher RANTES sensitivity of the early virus, an increased trend to change the mode of CCR5 receptor use, and a larger genetic evolution. Both children developed an autologous neutralizing antibody response starting from the second year with evidence of the continuous emergence of resistant variants. A marked viral genetic and phenotypic evolution was documented in the fast progressor sibling, which is accompanied by a high viral RANTES sensitivity and persistent neutralizing antibodies.
Biological data sciences in genome research.

PubMed

Schatz, Michael C

2015-10-01

The last 20 years have been a remarkable era for biology and medicine. One of the most significant achievements has been the sequencing of the first human genomes, which has laid the foundation for profound insights into human genetics, the intricacies of regulation and development, and the forces of evolution. Incredibly, as we look into the future over the next 20 years, we see the very real potential for sequencing more than 1 billion genomes, bringing even deeper insight into human genetics as well as the genetics of millions of other species on the planet. Realizing this great potential for medicine and biology, though, will only be achieved through the integration and development of highly scalable computational and quantitative approaches that can keep pace with the rapid improvements to biotechnology. In this perspective, I aim to chart out these future technologies, anticipate the major themes of research, and call out the challenges ahead. One of the largest shifts will be in the training used to prepare the class of 2035 for their highly interdisciplinary world. © 2015 Schatz; Published by Cold Spring Harbor Laboratory Press.
Spinning Gland Transcriptomics from Two Main Clades of Spiders (Order: Araneae) - Insights on Their Molecular, Anatomical and Behavioral Evolution

PubMed Central

Prosdocimi, Francisco; Bittencourt, Daniela; da Silva, Felipe Rodrigues; Kirst, Matias; Motta, Paulo C.; Rech, Elibio L.

2011-01-01

Characterized by distinctive evolutionary adaptations, spiders provide a comprehensive system for evolutionary and developmental studies of anatomical organs, including silk and venom production. Here we performed cDNA sequencing using massively parallel sequencers (454 GS-FLX Titanium) to generate ∼80,000 reads from the spinning gland of Actinopus spp. (infraorder: Mygalomorphae) and Gasteracantha cancriformis (infraorder: Araneomorphae, Orbiculariae clade). Actinopus spp. retains primitive characteristics on web usage and presents a single undifferentiated spinning gland while the orbiculariae spiders have seven differentiated spinning glands and complex patterns of web usage. MIRA, Celera Assembler and CAP3 software were used to cluster NGS reads for each spider. CAP3 unigenes passed through a pipeline for automatic annotation, classification by biological function, and comparative transcriptomics. Genes related to spider silks were manually curated and analyzed. Although a single spidroin gene family was found in Actinopus spp., a vast repertoire of specialized spider silk proteins was encountered in orbiculariae. Astacin-like metalloproteases (meprin subfamily) were shown to be some of the most sampled unigenes and duplicated gene families in G. cancriformis since its evolutionary split from mygalomorphs. Our results confirm that the evolution of the molecular repertoire of silk proteins was accompanied by the (i) anatomical differentiation of spinning glands and (ii) behavioral complexification in the web usage. Finally, a phylogenetic tree was constructed to cluster most of the known spidroins in gene clades. This is the first large-scale, multi-organism transcriptome for spider spinning glands and a first step into a broad understanding of spider web systems biology and evolution. PMID:21738742
Modeling the evolution of regulatory elements by simultaneous detection and alignment with phylogenetic pair HMMs.

PubMed

Majoros, William H; Ohler, Uwe

2010-12-16

The computational detection of regulatory elements in DNA is a difficult but important problem impacting our progress in understanding the complex nature of eukaryotic gene regulation. Attempts to utilize cross-species conservation for this task have been hampered both by evolutionary changes of functional sites and poor performance of general-purpose alignment programs when applied to non-coding sequence. We describe a new and flexible framework for modeling binding site evolution in multiple related genomes, based on phylogenetic pair hidden Markov models which explicitly model the gain and loss of binding sites along a phylogeny. We demonstrate the value of this framework for both the alignment of regulatory regions and the inference of precise binding-site locations within those regions. As the underlying formalism is a stochastic, generative model, it can also be used to simulate the evolution of regulatory elements. Our implementation is scalable in terms of numbers of species and sequence lengths and can produce alignments and binding-site predictions with accuracy rivaling or exceeding current systems that specialize in only alignment or only binding-site prediction. We demonstrate the validity and power of various model components on extensive simulations of realistic sequence data and apply a specific model to study Drosophila enhancers in as many as ten related genomes and in the presence of gain and loss of binding sites. Different models and modeling assumptions can be easily specified, thus providing an invaluable tool for the exploration of biological hypotheses that can drive improvements in our understanding of the mechanisms and evolution of gene regulation.
The king cobra genome reveals dynamic gene evolution and adaptation in the snake venom system

PubMed Central

Vonk, Freek J.; Casewell, Nicholas R.; Henkel, Christiaan V.; Heimberg, Alysha M.; Jansen, Hans J.; McCleary, Ryan J. R.; Kerkkamp, Harald M. E.; Vos, Rutger A.; Guerreiro, Isabel; Calvete, Juan J.; Wüster, Wolfgang; Woods, Anthony E.; Logan, Jessica M.; Harrison, Robert A.; Castoe, Todd A.; de Koning, A. P. Jason; Pollock, David D.; Yandell, Mark; Calderon, Diego; Renjifo, Camila; Currier, Rachel B.; Salgado, David; Pla, Davinia; Sanz, Libia; Hyder, Asad S.; Ribeiro, José M. C.; Arntzen, Jan W.; van den Thillart, Guido E. E. J. M.; Boetzer, Marten; Pirovano, Walter; Dirks, Ron P.; Spaink, Herman P.; Duboule, Denis; McGlinn, Edwina; Kini, R. Manjunatha; Richardson, Michael K.

2013-01-01

Snakes are limbless predators, and many species use venom to help overpower relatively large, agile prey. Snake venoms are complex protein mixtures encoded by several multilocus gene families that function synergistically to cause incapacitation. To examine venom evolution, we sequenced and interrogated the genome of a venomous snake, the king cobra (Ophiophagus hannah), and compared it, together with our unique transcriptome, microRNA, and proteome datasets from this species, with data from other vertebrates. In contrast to the platypus, the only other venomous vertebrate with a sequenced genome, we find that snake toxin genes evolve through several distinct co-option mechanisms and exhibit surprisingly variable levels of gene duplication and directional selection that correlate with their functional importance in prey capture. The enigmatic accessory venom gland shows a very different pattern of toxin gene expression from the main venom gland and seems to have recruited toxin-like lectin genes repeatedly for new nontoxic functions. In addition, tissue-specific microRNA analyses suggested the co-option of core genetic regulatory components of the venom secretory system from a pancreatic origin. Although the king cobra is limbless, we recovered coding sequences for all Hox genes involved in amniote limb development, with the exception of Hoxd12. Our results provide a unique view of the origin and evolution of snake venom and reveal multiple genome-level adaptive responses to natural selection in this complex biological weapon system. More generally, they provide insight into mechanisms of protein evolution under strong selection. PMID:24297900
The genome sequence of the model ascomycete fungus Podospora anserina

PubMed Central

Espagne, Eric; Lespinet, Olivier; Malagnac, Fabienne; Da Silva, Corinne; Jaillon, Olivier; Porcel, Betina M; Couloux, Arnaud; Aury, Jean-Marc; Ségurens, Béatrice; Poulain, Julie; Anthouard, Véronique; Grossetete, Sandrine; Khalili, Hamid; Coppin, Evelyne; Déquard-Chablat, Michelle; Picard, Marguerite; Contamine, Véronique; Arnaise, Sylvie; Bourdais, Anne; Berteaux-Lecellier, Véronique; Gautheret, Daniel; de Vries, Ronald P; Battaglia, Evy; Coutinho, Pedro M; Danchin, Etienne GJ; Henrissat, Bernard; Khoury, Riyad EL; Sainsard-Chanet, Annie; Boivin, Antoine; Pinan-Lucarré, Bérangère; Sellem, Carole H; Debuchy, Robert; Wincker, Patrick; Weissenbach, Jean; Silar, Philippe

2008-01-01

Background The dung-inhabiting ascomycete fungus Podospora anserina is a model used to study various aspects of eukaryotic and fungal biology, such as ageing, prions and sexual development. Results We present a 10X draft sequence of P. anserina genome, linked to the sequences of a large expressed sequence tag collection. Similar to higher eukaryotes, the P. anserina transcription/splicing machinery generates numerous non-conventional transcripts. Comparison of the P. anserina genome and orthologous gene set with the one of its close relatives, Neurospora crassa, shows that synteny is poorly conserved, the main result of evolution being gene shuffling in the same chromosome. The P. anserina genome contains fewer repeated sequences and has evolved new genes by duplication since its separation from N. crassa, despite the presence of the repeat induced point mutation mechanism that mutates duplicated sequences. We also provide evidence that frequent gene loss took place in the lineages leading to P. anserina and N. crassa. P. anserina contains a large and highly specialized set of genes involved in utilization of natural carbon sources commonly found in its natural biotope. It includes genes potentially involved in lignin degradation and efficient cellulose breakdown. Conclusion The features of the P. anserina genome indicate a highly dynamic evolution since the divergence of P. anserina and N. crassa, leading to the ability of the former to use specific complex carbon sources that match its needs in its natural biotope. PMID:18460219
Reconstructing evolutionary trees in parallel for massive sequences.

PubMed

Zou, Quan; Wan, Shixiang; Zeng, Xiangxiang; Ma, Zhanshan Sam

2017-12-14

Building evolutionary trees for massive unaligned DNA sequences is both crucial and challenging: reconstructing a tree for ultra-large sequence sets is hard, and massive multiple sequence alignment is costly in time and space. The recently developed Hadoop and Spark platforms offer new options for such classical computational biology problems. In this paper, we tried to solve multiple sequence alignment and evolutionary reconstruction in parallel. HPTree, which is developed in this paper, can deal with big DNA sequence files quickly. It works well on files larger than 1 GB and performs better than other evolutionary reconstruction tools. Users can use HPTree to reconstruct evolutionary trees on computer clusters or cloud platforms (e.g., Amazon Cloud). HPTree could help with population evolution research and metagenomics analysis. We employ the Hadoop and Spark platforms and design an evolutionary tree reconstruction software tool for unaligned massive DNA sequences. Clustering and multiple sequence alignment are done in parallel. The neighbour-joining method was employed for tree building. The software and its source code are available at http://lab.malab.cn/soft/HPtree/ .
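The neighbour-joining step mentioned above can be sketched on a single machine as follows. This is a textbook-style illustration that returns only the tree topology from a precomputed distance matrix; it is not HPTree's code, it does not use Hadoop or Spark, and the function name, the dict-of-dicts input format, and the toy distances are assumptions made for the example.

```python
def neighbor_joining(names, dist):
    """Return a nested-tuple topology from symmetric pairwise distances.

    names: list of tip labels; dist: dict of dicts with dist[a][b] == dist[b][a].
    """
    nodes = list(names)
    d = {a: dict(dist[a]) for a in names}  # working copy, extended with new nodes
    while len(nodes) > 2:
        n = len(nodes)
        # Total distance from each active node to all other active nodes.
        total = {a: sum(d[a][b] for b in nodes if b != a) for a in nodes}
        best = None
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                # Neighbour-joining criterion: the pair with the smallest Q is joined.
                q = (n - 2) * d[a][b] - total[a] - total[b]
                if best is None or q < best[0]:
                    best = (q, a, b)
        _, a, b = best
        merged = (a, b)
        d[merged] = {}
        for c in nodes:
            if c not in (a, b):
                # Distance from the new internal node to every remaining node.
                dc = 0.5 * (d[a][c] + d[b][c] - d[a][b])
                d[merged][c] = dc
                d[c][merged] = dc
        nodes = [c for c in nodes if c not in (a, b)] + [merged]
    return tuple(nodes)


if __name__ == "__main__":
    labels = ["A", "B", "C", "D"]
    # Toy additive distances: A and B are close, C and D are close.
    close, far = 2.0, 4.0
    pairs = {("A", "B"): close, ("C", "D"): close}
    matrix = {a: {} for a in labels}
    for a in labels:
        for b in labels:
            if a != b:
                matrix[a][b] = pairs.get((a, b), pairs.get((b, a), far))
    print(neighbor_joining(labels, matrix))
```

Each iteration scans all active pairs, which is the part a cluster implementation would presumably distribute; the sketch keeps everything in memory for clarity.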
The Human Genome Project: applications in the diagnosis and treatment of neurologic disease.

PubMed

Evans, G A

1998-10-01

The Human Genome Project (HGP), an international program to decode the entire DNA sequence of the human genome in 15 years, represents the largest biological experiment ever conducted. This set of information will contain the blueprint for the construction and operation of a human being. While the primary driving force behind the genome project is the potential to vastly expand the amount of genetic information available for biomedical research, the ramifications for other fields of study in biological research, the biotechnology and pharmaceutical industry, our understanding of evolution, effects on agriculture, and implications for bioethics are likely to be profound.
Examination of Signatures of Recent Positive Selection on Genes Involved in Human Sialic Acid Biology.

PubMed

Moon, Jiyun M; Aronoff, David M; Capra, John A; Abbot, Patrick; Rokas, Antonis

2018-03-28

Sialic acids are nine carbon sugars ubiquitously found on the surfaces of vertebrate cells and are involved in various immune response-related processes. In humans, at least 58 genes spanning diverse functions, from biosynthesis and activation to recycling and degradation, are involved in sialic acid biology. Because of their role in immunity, sialic acid biology genes have been hypothesized to exhibit elevated rates of evolutionary change. Consistent with this hypothesis, several genes involved in sialic acid biology have experienced higher rates of non-synonymous substitutions in the human lineage than their counterparts in other great apes, perhaps in response to ancient pathogens that infected hominins millions of years ago (paleopathogens). To test whether sialic acid biology genes have also experienced more recent positive selection during the evolution of the modern human lineage, reflecting adaptation to contemporary cosmopolitan or geographically-restricted pathogens, we examined whether their protein-coding regions showed evidence of recent hard and soft selective sweeps. This examination involved the calculation of four measures that quantify changes in allele frequency spectra, extent of population differentiation, and haplotype homozygosity caused by recent hard and soft selective sweeps for 55 sialic acid biology genes using publicly available whole genome sequencing data from 1,668 humans from three ethnic groups. To disentangle evidence for selection from confounding demographic effects, we compared the observed patterns in sialic acid biology genes to simulated sequences of the same length under a model of neutral evolution that takes into account human demographic history. We found that the patterns of genetic variation of most sialic acid biology genes did not significantly deviate from neutral expectations and were not significantly different among genes belonging to different functional categories. 
Those few sialic acid biology genes that significantly deviated from neutrality either experienced soft sweeps or population-specific hard sweeps. Interestingly, while most hard sweeps occurred on genes involved in sialic acid recognition, most soft sweeps involved genes associated with recycling, degradation and activation, transport, and transfer functions. We propose that the lack of signatures of recent positive selection for the majority of the sialic acid biology genes is consistent with the view that these genes regulate immune responses against ancient rather than contemporary cosmopolitan or geographically restricted pathogens. Copyright © 2018 Moon et al.
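As a rough illustration of the haplotype homozygosity measures mentioned in the abstract above, the sketch below computes H1 (the sum of squared haplotype frequencies) and H12 (the same sum after pooling the two most common haplotypes). The abstract does not name its four statistics, so using H12 here is an assumption, and the function name and input format are hypothetical.

```python
from collections import Counter


def haplotype_homozygosity(haplotypes):
    """Return (H1, H12) for haplotype strings sampled in one genomic window.

    Hypothetical helper: each string is one sampled haplotype.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("need at least one haplotype")
    # Haplotype frequencies, sorted from most to least common.
    freqs = sorted((c / n for c in Counter(haplotypes).values()), reverse=True)
    # H1: probability that two haplotypes drawn with replacement are identical.
    h1 = sum(p * p for p in freqs)
    # H12: pool the two most common haplotypes into a single class first.
    if len(freqs) >= 2:
        h12 = (freqs[0] + freqs[1]) ** 2 + sum(p * p for p in freqs[2:])
    else:
        h12 = h1
    return h1, h12


# Toy sample (not real data): two common haplotypes plus one rarer one.
sample = ["AAAA"] * 4 + ["AATA"] * 4 + ["GATA"] * 2
h1, h12 = haplotype_homozygosity(sample)
```

With a single dominant haplotype H1 and H12 are both high, whereas two common haplotypes raise H12 much more than H1, which is why a pooled statistic of this kind is used to look for soft sweeps.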

Emergence and Evolution of Hominidae-Specific Coding and Noncoding Genomic Sequences.

PubMed

Saber, Morteza Mahmoudi; Adeyemi Babarinde, Isaac; Hettiarachchi, Nilmini; Saitou, Naruya

2016-07-12

Family Hominidae, which includes humans and great apes, is recognized for unique complex social behavior and intellectual abilities. Despite the increasing genome data, however, the genomic origin of its phenotypic uniqueness has remained elusive. Clade-specific genes and highly conserved noncoding sequences (HCNSs) are among the high-potential evolutionary candidates involved in driving clade-specific characters and phenotypes. On this premise, we analyzed whole genome sequences along with gene orthology data retrieved from major DNA databases to find Hominidae-specific (HS) genes and HCNSs. We discovered that Down syndrome critical region 4 (DSCR4) is the only experimentally verified gene uniquely present in Hominidae. DSCR4 has no structural homology to any known protein and was inferred to have emerged in several steps through LTR/ERV1, LTR/ERVL retrotransposition, and transversion. Using the genomic distance as neutral evolution threshold, we identified 1,658 HS HCNSs. Polymorphism coverage and derived allele frequency analysis of HS HCNSs showed that these HCNSs are under purifying selection, indicating that they may harbor important functions. They are overrepresented in promoters/untranslated regions, in close proximity of genes involved in sensory perception of sound and developmental process, and also showed a significantly lower nucleosome occupancy probability. Interestingly, many ancestral sequences of the HS HCNSs showed very high evolutionary rates. This suggests that new functions emerged through some kind of positive selection, and then purifying selection started to operate to keep these functions. © The Author(s) 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Draft genome sequences of the oomycete Pythium insidiosum strain CBS 573.85 from a horse with pythiosis and strain CR02 from the environment.

PubMed

Patumcharoenpol, Preecha; Rujirawat, Thidarat; Lohnoo, Tassanee; Yingyong, Wanta; Vanittanakom, Nongnuch; Kittichotirat, Weerayuth; Krajaejun, Theerapong

2018-02-01

Pythium insidiosum is an aquatic oomycete microorganism that causes the fatal infectious disease, pythiosis, in humans and animals. The organism has been successfully isolated from the environment worldwide. Diagnosis and treatment of pythiosis are difficult and challenging. Genome sequences of P. insidiosum, isolated from humans, are available and accessible in public databases. To further facilitate biology-, pathogenicity-, and evolution-related genomic and genetic studies of P. insidiosum, we report two additional draft genome sequences of the P. insidiosum strain CBS 573.85 (35.6 Mb in size; accession number, BCFO00000000.1) isolated from a horse with pythiosis, and strain CR02 (37.7 Mb in size; accession number, BCFR00000000.1) isolated from the environment.
Algorithm to find distant repeats in a single protein sequence

PubMed Central

Banerjee, Nirjhar; Sarani, Rangarajan; Ranjani, Chellamuthu Vasuki; Sowmiya, Govindaraj; Michael, Daliah; Balakrishnan, Narayanasamy; Sekar, Kanagaraj

2008-01-01

Distant repeats in protein sequences play an important role in various aspects of protein analysis. A careful analysis of distant repeats would make it possible to establish a firm relationship between the repeats and their function and three-dimensional structure over the course of evolution. It would also shed light on the diversity of duplication during evolution. To this end, an algorithm has been developed to find all distant repeats in a protein sequence. Scores from the Point Accepted Mutation (PAM) matrix are used to identify amino acid substitutions while detecting the distant repeats. Given the biological importance of distant repeats, the proposed algorithm will be of use to structural biologists, molecular biologists, biochemists and researchers involved in phylogenetic and evolutionary studies. PMID:19052663
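The general idea of scoring substitutions while searching one sequence for repeated segments can be sketched as follows. This is not the authors' algorithm: it is a brute-force window comparison, and the scoring function is a hypothetical stand-in (identity and coarse residue groups) for a real PAM matrix. The names `GROUPS`, `toy_score` and `find_repeats`, and the default window and threshold, are assumptions made for illustration.

```python
# Hypothetical similarity groups standing in for real PAM scores.
GROUPS = ["ILVM", "FWY", "KRH", "DE", "ST", "NQ", "AG"]
GROUP_OF = {aa: i for i, group in enumerate(GROUPS) for aa in group}


def toy_score(a, b):
    """Score a residue pair: 2 if identical, 1 if same group, else -1."""
    if a == b:
        return 2
    ga, gb = GROUP_OF.get(a), GROUP_OF.get(b)
    if ga is not None and ga == gb:
        return 1
    return -1


def find_repeats(seq, window=5, min_score=8):
    """Return (i, j, score) for non-overlapping window pairs scoring >= min_score."""
    hits = []
    last = len(seq) - window  # last valid window start
    for i in range(last + 1):
        # Start j at i + window so the two windows never overlap.
        for j in range(i + window, last + 1):
            score = sum(toy_score(seq[i + k], seq[j + k]) for k in range(window))
            if score >= min_score:
                hits.append((i, j, score))
    return hits


# "KVLDE" at position 0 and "RILDE" at position 10 differ by two
# conservative substitutions, so they are reported as a distant repeat.
print(find_repeats("KVLDEPPPPPRILDE"))
```

An exact-match search would miss this pair; allowing scored substitutions is what lets diverged copies be detected. A practical implementation would swap in real matrix scores and avoid the quadratic scan.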
The whole genome sequence of the Mediterranean fruit fly, Ceratitis capitata (Wiedemann), reveals insights into the biology and adaptive evolution of a highly invasive pest species.

PubMed

Papanicolaou, Alexie; Schetelig, Marc F; Arensburger, Peter; Atkinson, Peter W; Benoit, Joshua B; Bourtzis, Kostas; Castañera, Pedro; Cavanaugh, John P; Chao, Hsu; Childers, Christopher; Curril, Ingrid; Dinh, Huyen; Doddapaneni, HarshaVardhan; Dolan, Amanda; Dugan, Shannon; Friedrich, Markus; Gasperi, Giuliano; Geib, Scott; Georgakilas, Georgios; Gibbs, Richard A; Giers, Sarah D; Gomulski, Ludvik M; González-Guzmán, Miguel; Guillem-Amat, Ana; Han, Yi; Hatzigeorgiou, Artemis G; Hernández-Crespo, Pedro; Hughes, Daniel S T; Jones, Jeffery W; Karagkouni, Dimitra; Koskinioti, Panagiota; Lee, Sandra L; Malacrida, Anna R; Manni, Mosè; Mathiopoulos, Kostas; Meccariello, Angela; Munoz-Torres, Monica; Murali, Shwetha C; Murphy, Terence D; Muzny, Donna M; Oberhofer, Georg; Ortego, Félix; Paraskevopoulou, Maria D; Poelchau, Monica; Qu, Jiaxin; Reczko, Martin; Robertson, Hugh M; Rosendale, Andrew J; Rosselot, Andrew E; Saccone, Giuseppe; Salvemini, Marco; Savini, Grazia; Schreiner, Patrick; Scolari, Francesca; Siciliano, Paolo; Sim, Sheina B; Tsiamis, George; Ureña, Enric; Vlachos, Ioannis S; Werren, John H; Wimmer, Ernst A; Worley, Kim C; Zacharopoulou, Antigone; Richards, Stephen; Handler, Alfred M

2016-09-22

The Mediterranean fruit fly (medfly), Ceratitis capitata, is a major destructive insect pest due to its broad host range, which includes hundreds of fruits and vegetables. It exhibits a unique ability to invade and adapt to ecological niches throughout tropical and subtropical regions of the world, though medfly infestations have been prevented and controlled by the sterile insect technique (SIT) as part of integrated pest management programs (IPMs). The genetic analysis and manipulation of medfly has been subject to intensive study in an effort to improve SIT efficacy and other aspects of IPM control. The 479 Mb medfly genome is sequenced from adult flies from lines inbred for 20 generations. A high-quality assembly is achieved having a contig N50 of 45.7 kb and scaffold N50 of 4.06 Mb. In-depth curation of more than 1800 messenger RNAs shows specific gene expansions that can be related to invasiveness and host adaptation, including gene families for chemoreception, toxin and insecticide metabolism, cuticle proteins, opsins, and aquaporins. We identify genes relevant to IPM control, including those required to improve SIT. The medfly genome sequence provides critical insights into the biology of one of the most serious and widespread agricultural pests. This knowledge should significantly advance the means of controlling the size and invasive potential of medfly populations. Its close relationship to Drosophila, and other insect species important to agriculture and human health, will further comparative functional and structural studies of insect genomes that should broaden our understanding of gene family evolution.
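The contig and scaffold N50 values quoted above are summary statistics of assembly contiguity. A minimal sketch of the usual definition, assuming a plain list of sequence lengths as input (the function name and the toy lengths are illustrative, not taken from the medfly assembly):

```python
def n50(lengths):
    """Return the N50 of a list of contig or scaffold lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy lengths: total 290, and 80 + 70 = 150 already covers half of it.
print(n50([80, 70, 50, 40, 30, 20]))
```

Because N50 depends only on the length distribution, it says nothing about correctness of the assembly, which is why it is usually reported alongside curation or completeness checks.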
Student Acquisition of Biological Evolution-Related Misconceptions: The Role of Public High School Introductory Biology Teachers

ERIC Educational Resources Information Center

Yates, Tony Brett

2011-01-01

In order to eliminate student misconceptions concerning biological evolution, it is important to identify their sources. The purposes of this study were to: (a) identify biological evolution-related misconceptions held by Oklahoma public high school Biology I teachers; (b) identify biological evolution-related misconceptions held by Oklahoma…
Alignment-free microbial phylogenomics under scenarios of sequence divergence, genome rearrangement and lateral genetic transfer.

PubMed

Bernard, Guillaume; Chan, Cheong Xin; Ragan, Mark A

2016-07-01

Alignment-free (AF) approaches have recently been highlighted as alternatives to methods based on multiple sequence alignment in phylogenetic inference. However, the sensitivity of AF methods to genome-scale evolutionary scenarios is little known. Here, using simulated microbial genome data we systematically assess the sensitivity of nine AF methods to three important evolutionary scenarios: sequence divergence, lateral genetic transfer (LGT) and genome rearrangement. Among these, AF methods are most sensitive to the extent of sequence divergence, less sensitive to low and moderate frequencies of LGT, and most robust against genome rearrangement. We describe the application of AF methods to three well-studied empirical genome datasets, and introduce a new application of the jackknife to assess node support. Our results demonstrate that AF phylogenomics is computationally scalable to multi-genome data and can generate biologically meaningful phylogenies and insights into microbial evolution.
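To make the notion of an alignment-free comparison concrete, the sketch below computes a Jaccard distance between the k-mer sets of two sequences. It is a generic illustration only: the abstract does not name the nine AF methods assessed, and this simple measure is not claimed to be one of them. The function names and the choice of k are assumptions.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard_distance(a, b, k=3):
    """Return 1 - |shared k-mers| / |all k-mers| for two sequences."""
    ka, kb = kmers(a, k), kmers(b, k)
    union = ka | kb
    if not union:
        raise ValueError("sequences are shorter than k")
    return 1.0 - len(ka & kb) / len(union)


# Toy sequences differing at one position; no alignment is computed.
d = jaccard_distance("ACGTACGT", "ACGTTCGT", k=3)
```

A matrix of such pairwise distances can be passed to a distance-based tree-building method. Since k-mer content is unaffected by the order of large blocks, measures of this kind tend to tolerate rearrangement better than they tolerate high sequence divergence, in line with the sensitivities reported above.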
A decade of pig genome sequencing: a window on pig domestication and evolution.

PubMed

Groenen, Martien A M

2016-03-29

Insight into how genomes change and adapt due to selection addresses key questions in evolutionary biology and in domestication of animals and plants by humans. In that regard, the pig and its close relatives found in Africa and Eurasia represent an excellent group of species that enables studies of the effect of both natural and human-mediated selection on the genome. The recent completion of the draft genome sequence of a domestic pig and the development of next-generation sequencing technology during the past decade have created unprecedented possibilities to address these questions in great detail. In this paper, I review recent whole-genome sequencing studies in the pig and closely-related species that provide insight into the demography, admixture and selection of these species and, in particular, how domestication and subsequent selection of Sus scrofa have shaped the genomes of these animals.
Gene Tree Discordance Does Not Explain Away the Temporal Decline of Convergence in Mammalian Protein Sequence Evolution.

PubMed

Zou, Zhengting; Zhang, Jianzhi

2017-07-01

Several authors reported lower frequencies of protein sequence convergence between more distantly related evolutionary lineages and attributed this trend to epistasis, which renders the acceptable amino acids at a site more different and convergence less likely in more divergent lineages. A recent primate study, however, suggested that this trend is at least partially and potentially entirely an artifact of gene tree discordance (GTD). Here, we demonstrate in a genome-wide data set from 17 mammals that the temporal trend remains (1) upon the control of the GTD level, (2) in genes whose genealogies are concordant with the species tree, and (3) for convergent changes, which are extremely unlikely to be caused by GTD. Similar results are observed in a comparable data set of 12 fruit flies in some but not all of these tests. We conclude that, at least in some cases, the temporal decline of convergence is genuine, reflecting an impact of epistasis on protein evolution. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Gene Chips: A New Tool for Biology

NASA Astrophysics Data System (ADS)

Botstein, David

2005-03-01

The knowledge of many complete genomic sequences has led to a "grand unification of biology," consisting of direct evidence that most of the basic cellular functions of all organisms are carried out by genes and proteins whose primary sequences are directly related by descent (i.e. orthologs). Further, genome sequences have made it possible to study all the genes of a single organism simultaneously. We have been using DNA microarrays (sometimes referred to as "gene chips") to study patterns of gene expression and genome rearrangement in yeast and human cells under a variety of conditions and in human tumors and normal tissues. These experiments produce huge volumes of data; new computational and statistical methods are required to analyze them properly. Examples from this work will be presented to illustrate how genome-scale experiments and analysis can result in new biological insights not obtainable by traditional analyses of genes and proteins one by one. For lymphomas, breast tumors, lung tumors, liver tumors, gastric tumors, brain tumors and soft tissue tumors we have been able, by the application of clustering algorithms, to subclassify tumors of similar anatomical origin on the basis of their gene expression patterns. These subclassifications appear to be reproducible and clinically as well as biologically meaningful. By studying synchronized cells growing in culture, we have identified many hundreds of yeast and human genes that are expressed periodically, at characteristically different points in the cell division cycle. In humans, it turns out that most of these genes are the same genes that comprise the "proliferation cluster," i.e. the genes whose expression is specifically associated with the proliferativeness of tumors and tumor cell lines. Finally, we have been applying a variant of our DNA microarray technology (which we call "array comparative hybridization") to follow the DNA copy number of genes, both in tumors and in yeast cells undergoing adaptive evolution during hundreds of generations of growth in continuous culture. These studies suggest a basic similarity in mechanism between adaptive evolution in yeast and tumor progression in humans.
Environmental Epigenetics and a Unified Theory of the Molecular Aspects of Evolution: A Neo-Lamarckian Concept that Facilitates Neo-Darwinian Evolution.

PubMed

Skinner, Michael K

2015-04-26

Environment has a critical role in the natural selection process for Darwinian evolution. The primary molecular component currently considered for neo-Darwinian evolution involves genetic alterations and random mutations that generate the phenotypic variation required for natural selection to act. The vast majority of environmental factors cannot directly alter DNA sequence. Epigenetic mechanisms directly regulate genetic processes and can be dramatically altered by environmental factors. Therefore, environmental epigenetics provides a molecular mechanism to directly alter phenotypic variation generationally. Lamarck proposed in 1802 the concept that environment can directly alter phenotype in a heritable manner. Environmental epigenetics and epigenetic transgenerational inheritance provide molecular mechanisms for this process. Therefore, environment can on a molecular level influence the phenotypic variation directly. The ability of environmental epigenetics to alter phenotypic and genotypic variation directly can significantly impact natural selection. Neo-Lamarckian concept can facilitate neo-Darwinian evolution. A unified theory of evolution is presented to describe the integration of environmental epigenetic and genetic aspects of evolution. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Evolutionary Dynamics on Protein Bi-stability Landscapes can Potentially Resolve Adaptive Conflicts

PubMed Central

Sikosek, Tobias; Bornberg-Bauer, Erich; Chan, Hue Sun

2012-01-01

Experimental studies have shown that some proteins exist in two alternative native-state conformations. It has been proposed that such bi-stable proteins can potentially function as evolutionary bridges at the interface between two neutral networks of protein sequences that fold uniquely into the two different native conformations. Under adaptive conflict scenarios, bi-stable proteins may be of particular advantage if they simultaneously provide two beneficial biological functions. However, computational models that simulate protein structure evolution do not yet recognize the importance of bi-stability. Here we use a biophysical model to analyze sequence space to identify bi-stable or multi-stable proteins with two or more equally stable native-state structures. The inclusion of such proteins enhances phenotype connectivity between neutral networks in sequence space. Consideration of the sequence space neighborhood of bridge proteins revealed that bi-stability decreases gradually with each mutation that takes the sequence further away from an exactly bi-stable protein. With relaxed selection pressures, we found that bi-stable proteins in our model are highly successful under simulated adaptive conflict. Inspired by these model predictions, we developed a method to identify real proteins in the PDB with bridge-like properties, and have verified a clear bi-stability gradient for a series of mutants studied by Alexander et al. (Proc Nat Acad Sci USA 2009, 106:21149–21154) that connect two sequences that fold uniquely into two different native structures via a bridge-like intermediate mutant sequence. Based on these findings, new testable predictions for future studies on protein bi-stability and evolution are discussed. PMID:23028272
Adaptive laboratory evolution – principles and applications for biotechnology

PubMed Central

2013-01-01

Adaptive laboratory evolution is a frequently used method in biological studies to gain insights into the basic mechanisms of molecular evolution and adaptive changes that accumulate in microbial populations during long term selection under specified growth conditions. Although regularly performed for more than 25 years, the advent of transcript and cheap next-generation sequencing technologies has resulted in many recent studies, which successfully applied this technique in order to engineer microbial cells for biotechnological applications. Adaptive laboratory evolution has some major benefits as compared with classical genetic engineering but also some inherent limitations. However, recent studies show how some of the limitations may be overcome in order to successfully incorporate adaptive laboratory evolution in microbial cell factory design. Over the last two decades important insights into nutrient and stress metabolism of relevant model species were acquired, whereas some other aspects such as niche-specific differences of non-conventional cell factories are not completely understood. Altogether the current status and its future perspectives highlight the importance and potential of adaptive laboratory evolution as an approach in biotechnological engineering. PMID:23815749
The Ciona intestinalis genome: when the constraints are off

NASA Technical Reports Server (NTRS)

Holland, Linda Z.; Gibson-Brown, Jeremy J.

2003-01-01

The recent genome sequencing of a non-vertebrate deuterostome, the ascidian tunicate Ciona intestinalis, makes a substantial contribution to the fields of evolutionary and developmental biology. Tunicates have some of the smallest bilaterian genomes, embryos with relatively few cells, fixed lineages and early determination of cell fates. Initial analyses of the C. intestinalis genome indicate that it has been evolving rapidly. Comparisons with other bilaterians show that C. intestinalis has lost a number of genes, and that many genes linked together in most other bilaterians have become uncoupled. In addition, a number of independent, lineage-specific gene duplications have been detected. These new results, although interesting in themselves, will take on a deeper significance once the genomes of additional invertebrate deuterostomes (e.g. echinoderms, hemichordates and amphioxus) have been sequenced. With such a broadened database, comparative genomics can begin to ask pointed questions about the relationship between the evolution of genomes and the evolution of body plans. Copyright 2003 Wiley Periodicals, Inc.
Assessing the determinants of evolutionary rates in the presence of noise.

PubMed

Plotkin, Joshua B; Fraser, Hunter B

2007-05-01

Although protein sequences are known to evolve at vastly different rates, little is known about what determines their rate of evolution. However, a recent study using principal component regression (PCR) has concluded that evolutionary rates in yeast are primarily governed by a single determinant related to translation frequency. Here, we demonstrate that noise in biological data can confound PCRs, leading to spurious conclusions. When equalizing noise levels across 7 predictor variables used in previous studies, we find no evidence that protein evolution is dominated by a single determinant. Our results indicate that a variety of factors--including expression level, gene dispensability, and protein-protein interactions--may independently affect evolutionary rates in yeast. More accurate measurements or more sophisticated statistical techniques will be required to determine which one, if any, of these factors dominates protein evolution.
New lives for old: evolution of pseudoenzyme function illustrated by iRhoms.

PubMed

Adrain, Colin; Freeman, Matthew

2012-07-11

Large-scale sequencing of genomes has revealed that most enzyme families include inactive homologues. These pseudoenzymes are often well conserved, implying a selective pressure to retain them during evolution, and therefore that they have significant function. Mechanistic insights and evolutionary lessons are now emerging from the study of a broad range of such 'dead' enzymes. The recently discovered iRhoms - inactive homologues of rhomboid proteases - have joined derlins and other members of the rhomboid-like clan in regulating the fate of proteins as they pass through the secretory pathway. There is a strong case that dead enzymes, which have been rather overlooked, may be a rich source of biological regulators.
A self-triggered picoinjector in microfluidics

NASA Astrophysics Data System (ADS)

Yang, Yiming; Liu, Songsheng; Jia, Chunping; Mao, Hongju; Jin, Qinghui; Zhao, Jianlong; Zhou, Hongbo

2016-12-01

Droplet-based microfluidics has recently emerged as a potential platform for studies of single-cell, directed evolution, and genetic sequencing. In droplet-based microfluidics, adding reagents into drops is one of the most important functions. In this paper, we develop a new self-triggered picoinjector to add controlled volumes of reagent into droplets at kilohertz rates. In the picoinjector, the reagent injecting is triggered by the coming droplet itself, without needing a droplet detection module. Meanwhile, the dosing volume can be precisely controlled. These features make the system more practical and reliable. We expect the new picoinjector will find important applications of droplet-based microfluidics in automated biological assay, directed evolution, enzyme assay, and so on.
Genomic evolution of Saccharomyces cerevisiae under Chinese rice wine fermentation.

PubMed

Li, Yudong; Zhang, Weiping; Zheng, Daoqiong; Zhou, Zhan; Yu, Wenwen; Zhang, Lei; Feng, Lifang; Liang, Xinle; Guan, Wenjun; Zhou, Jingwen; Chen, Jian; Lin, Zhenguo

2014-09-10

Rice wine fermentation represents a unique environment for the evolution of the budding yeast, Saccharomyces cerevisiae. To understand how the selection pressure shaped the yeast genome and gene regulation, we determined the genome sequence and transcriptome of a S. cerevisiae strain YHJ7 isolated from Chinese rice wine (Huangjiu), a popular traditional alcoholic beverage in China. By comparing the genome of YHJ7 to the lab strain S288c, a Japanese sake strain K7, and a Chinese industrial bioethanol strain YJSH1, we identified many genomic sequence and structural variations in YHJ7, which are mainly located in subtelomeric regions, suggesting that these regions play an important role in genomic evolution between strains. In addition, our comparative transcriptome analysis between YHJ7 and S288c revealed a set of differentially expressed genes, including those involved in glucose transport (e.g., HXT2, HXT7) and oxidoreductase activity (e.g., AAD10, ADH7). Interestingly, many of these genomic and transcriptional variations are directly or indirectly associated with the adaptation of the YHJ7 strain to its specific niches. Our molecular evolution analysis suggested that Japanese sake strains (K7/UC5) were derived from Chinese rice wine strains (YHJ7) at least approximately 2,300 years ago, providing the first molecular evidence elucidating the origin of Japanese sake strains. Our results offer interesting insights into the evolution of yeast during rice wine fermentation, and provide a valuable resource for genetic engineering to improve industrial wine-making strains. © The Author(s) 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Signal Correlations in Ecological Niches Can Shape the Organization and Evolution of Bacterial Gene Regulatory Networks

PubMed Central

Dufour, Yann S.; Donohue, Timothy J.

2015-01-01

Transcriptional regulation plays a significant role in the biological response of bacteria to changing environmental conditions. Therefore, mapping transcriptional regulatory networks is an important step not only in understanding how bacteria sense and interpret their environment but also to identify the functions involved in biological responses to specific conditions. Recent experimental and computational developments have facilitated the characterization of regulatory networks on a genome-wide scale in model organisms. In addition, the multiplication of complete genome sequences has encouraged comparative analyses to detect conserved regulatory elements and infer regulatory networks in other less well-studied organisms. However, transcription regulation appears to evolve rapidly, thus, creating challenges for the transfer of knowledge to nonmodel organisms. Nevertheless, the mechanisms and constraints driving the evolution of regulatory networks have been the subjects of numerous analyses, and several models have been proposed. Overall, the contributions of mutations, recombination, and horizontal gene transfer are complex. Finally, the rapid evolution of regulatory networks plays a significant role in the remarkable capacity of bacteria to adapt to new or changing environments. Conversely, the characteristics of environmental niches determine the selective pressures and can shape the structure of regulatory network accordingly. PMID:23046950
The interface of protein structure, protein biophysics, and molecular evolution

PubMed Central

Liberles, David A; Teichmann, Sarah A; Bahar, Ivet; Bastolla, Ugo; Bloom, Jesse; Bornberg-Bauer, Erich; Colwell, Lucy J; de Koning, A P Jason; Dokholyan, Nikolay V; Echave, Julian; Elofsson, Arne; Gerloff, Dietlind L; Goldstein, Richard A; Grahnen, Johan A; Holder, Mark T; Lakner, Clemens; Lartillot, Nicholas; Lovell, Simon C; Naylor, Gavin; Perica, Tina; Pollock, David D; Pupko, Tal; Regan, Lynne; Roger, Andrew; Rubinstein, Nimrod; Shakhnovich, Eugene; Sjölander, Kimmen; Sunyaev, Shamil; Teufel, Ashley I; Thorne, Jeffrey L; Thornton, Joseph W; Weinreich, Daniel M; Whelan, Simon

2012-01-01

The interface of protein structural biology, protein biophysics, molecular evolution, and molecular population genetics forms the foundations for a mechanistic understanding of many aspects of protein biochemistry. Current efforts in interdisciplinary protein modeling are in their infancy and the state of the art of such models is described. Beyond the relationship between amino acid substitution and static protein structure, protein function, and corresponding organismal fitness, other considerations are also discussed. More complex mutational processes such as insertion and deletion and domain rearrangements and even circular permutations should be evaluated. The role of intrinsically disordered proteins is still controversial, but may be increasingly important to consider. Protein geometry and protein dynamics as a deviation from static considerations of protein structure are also important. Protein expression level is known to be a major determinant of evolutionary rate and several considerations including selection at the mRNA level and the role of interaction specificity are discussed. Lastly, the relationship between modeling and needed high-throughput experimental data as well as experimental examination of protein evolution using ancestral sequence resurrection and in vitro biochemistry are presented, towards an aim of ultimately generating better models for biological inference and prediction. PMID:22528593
The essence of life.

PubMed

Ma, Wentao

2016-09-26

Although biology has achieved great successes in recent years, we still lack a clear idea of "what is life?" As explained here, the main reason for this situation is that there are two completely distinct aspects of "life", which are usually discussed together. With respect to these two aspects, Darwinian evolution and self-sustaining, we must split the concept of life correspondingly, for example, by defining "life form" and "living entity" separately. For life's implementation (related to the two aspects) in nature, three mechanisms are crucial: the replication of DNA/RNA-like polymers by residue-pairing, the sequence-dependent folding of RNA/protein-like polymers engendering special functions, and the assembly of phospholipid-like amphiphiles forming vesicles. The notion of "information" is significant for understanding the phenomenon of life: the life form of a living entity can be defined simply by its genetic information; Darwinian evolution is essentially an evolution of such information, transferred across generations. An in-depth analysis of the essence of life would improve our understanding across the whole field of biology, and may have a direct influence on subfields such as the origin of life, artificial life and astrobiology. This article was reviewed by Anthony Poole and Thomas Dandekar.

Genome-Wide Convergence during Evolution of Mangroves from Woody Plants.

PubMed

Xu, Shaohua; He, Ziwen; Guo, Zixiao; Zhang, Zhang; Wyckoff, Gerald J; Greenberg, Anthony; Wu, Chung-I; Shi, Suhua

2017-04-01

When living organisms independently invade a new environment, the evolution of similar phenotypic traits is often observed. An interesting but contentious issue is whether the underlying molecular biology also converges in the new habitat. Independent invasions of tropical intertidal zones by woody plants, collectively referred to as mangrove trees, represent some dramatic examples. The high salinity, hypoxia, and other stressors in the new habitat might have affected both genomic features and protein structures. Here, we developed a new method for detecting convergence at conservative sites (CCS) and applied it to the genomic sequences of mangroves. In simulations, the CCS method drastically reduces random convergence at rapidly evolving sites as well as falsely inferred convergence caused by misinference of the ancestral character. In mangrove genomes, we estimated ∼400 genes that have experienced convergence over the background level of convergence in the nonmangrove relatives. The convergent genes are enriched in pathways related to stress response and embryo development, which could be important for mangroves' adaptation to the new habitat. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Quantifying the relationship between sequence and three-dimensional structure conservation in RNA

PubMed Central

2010-01-01

Background In recent years, the number of available RNA structures has grown rapidly, reflecting the increased interest in RNA biology. Similarly to the studies carried out two decades ago for proteins, which gave the fundamental grounds for developing comparative protein structure prediction methods, we are now able to quantify the relationship between sequence and structure conservation in RNA. Results Here we introduce an all-against-all sequence- and three-dimensional (3D) structure-based comparison of a representative set of RNA structures, which has allowed us to quantitatively confirm that: (i) there is a measurable relationship between sequence and structure conservation that weakens for alignments resulting in below 60% sequence identity, (ii) evolution tends to conserve more RNA structure than sequence, and (iii) there is a twilight zone for RNA homology detection. Discussion The computational analysis presented here quantitatively describes the relationship between sequence and structure for RNA molecules and defines a twilight zone region for detecting RNA homology. Our work could represent the theoretical basis and limitations for future developments in comparative RNA 3D structure prediction. PMID:20550657
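The 60% sequence-identity threshold mentioned in this record depends on how identity is computed from an alignment. The following is a minimal sketch, not the authors' procedure: it assumes two already-aligned, equal-length sequences with "-" as the gap character, and it assumes identity is taken over columns where neither sequence has a gap (other denominators are also in common use). The function name `percent_identity` is illustrative.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Assumption: the inputs are rows of one pairwise alignment, so they have
    equal length and use '-' for gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns where both sequences contribute a residue.
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x != "-" and y != "-"
    ]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # 7 gap-free columns, 6 of them identical.
    print(round(percent_identity("GGCA-UCC", "GGCAAUGC"), 1))
```

Under this definition, a pair scoring below 60 would fall in the range where the abstract reports that the sequence-structure relationship weakens.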
From cultured to uncultured genome sequences: metagenomics and modeling microbial ecosystems.

PubMed

Garza, Daniel R; Dutilh, Bas E

2015-11-01

Microorganisms and the viruses that infect them are the most numerous biological entities on Earth and enclose its greatest biodiversity and genetic reservoir. With strength in their numbers, these microscopic organisms are major players in the cycles of energy and matter that sustain all life. Scientists have only scratched the surface of this vast microbial world through culture-dependent methods. Recent developments in generating metagenomes, large random samples of nucleic acid sequences isolated directly from the environment, are providing comprehensive portraits of the composition, structure, and functioning of microbial communities. Moreover, advances in metagenomic analysis have created the possibility of obtaining complete or nearly complete genome sequences from uncultured microorganisms, providing important means to study their biology, ecology, and evolution. Here we review some of the recent developments in the field of metagenomics, focusing on the discovery of genetic novelty and on methods for obtaining uncultured genome sequences, including through the recycling of previously published datasets. Moreover we discuss how metagenomics has become a core scientific tool to characterize eco-evolutionary patterns of microbial ecosystems, thus allowing us to simultaneously discover new microbes and study their natural communities. We conclude by discussing general guidelines and challenges for modeling the interactions between uncultured microorganisms and viruses based on the information contained in their genome sequences. These models will significantly advance our understanding of the functioning of microbial ecosystems and the roles of microbes in the environment.
Novel Insights into Tree Biology and Genome Evolution as Revealed Through Genomics.

PubMed

Neale, David B; Martínez-García, Pedro J; De La Torre, Amanda R; Montanari, Sara; Wei, Xiao-Xin

2017-04-28

Reference genome sequences are the key to the discovery of genes and gene families that determine traits of interest. Recent progress in sequencing technologies has enabled a rapid increase in genome sequencing of tree species, allowing the dissection of complex characters of economic importance, such as fruit and wood quality and resistance to biotic and abiotic stresses. Although the number of reference genome sequences for trees lags behind those for other plant species, it is not too early to gain insight into the unique features that distinguish trees from nontree plants. Our review of the published data suggests that, although many gene families are conserved among herbaceous and tree species, some gene families, such as those involved in resistance to biotic and abiotic stresses and in the synthesis and transport of sugars, are often expanded in tree genomes. As the genomes of more tree species are sequenced, comparative genomics will further elucidate the complexity of tree genomes and how this relates to traits unique to trees.
Conserved intergenic sequences revealed by CTAG-profiling in Salmonella: thermodynamic modeling for function prediction

NASA Astrophysics Data System (ADS)

Tang, Le; Zhu, Songling; Mastriani, Emilio; Fang, Xin; Zhou, Yu-Jie; Li, Yong-Guo; Johnston, Randal N.; Guo, Zheng; Liu, Gui-Rong; Liu, Shu-Lin

2017-03-01

Highly conserved short sequences help identify functional genomic regions and facilitate genomic annotation. We used Salmonella as the model to search the genome for evolutionarily conserved regions and focused on the tetranucleotide sequence CTAG for its potentially important functions. In Salmonella, CTAG is highly conserved across the lineages and large numbers of CTAG-containing short sequences fall in intergenic regions, strongly indicating their biological importance. Computer modeling demonstrated stable stem-loop structures in some of the CTAG-containing intergenic regions, and substitution of a nucleotide of the CTAG sequence would radically rearrange the free energy and disrupt the structure. The postulated degeneration of CTAG takes distinct patterns among Salmonella lineages and provides novel information about genomic divergence and evolution of these bacterial pathogens. Comparison of the vertically and horizontally transmitted genomic segments showed different CTAG distribution landscapes, with the genome amelioration process to remove CTAG taking place inward from both terminals of the horizontally acquired segment.
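As an illustration of the kind of tetranucleotide profiling this record describes, the sketch below counts CTAG occurrences in a DNA string and reports a density per kilobase. It is not the authors' CTAG-profiling pipeline; the function names and the per-kb normalization are assumptions made for the example. Because CTAG is its own reverse complement, scanning one strand is sufficient.

```python
def count_motif(seq: str, motif: str = "CTAG") -> int:
    """Count occurrences of motif in seq, allowing overlaps, case-insensitively."""
    seq = seq.upper()
    motif = motif.upper()
    width = len(motif)
    return sum(
        1 for i in range(len(seq) - width + 1) if seq[i:i + width] == motif
    )


def motif_density_per_kb(seq: str, motif: str = "CTAG") -> float:
    """Motif occurrences per 1000 bases (hypothetical normalization)."""
    if not seq:
        return 0.0
    return 1000.0 * count_motif(seq, motif) / len(seq)


if __name__ == "__main__":
    example = "ACTAGCTAGG"  # contains CTAG at positions 1 and 5 (0-based)
    print(count_motif(example), motif_density_per_kb(example))
```

Comparing such densities between intergenic and coding regions, or between vertically and horizontally transmitted segments, would be one simple way to explore the distribution patterns the abstract mentions.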
Creation of a data base for sequences of ribosomal nucleic acids and detection of conserved restriction endonucleases sites through computerized processing.

PubMed Central

Patarca, R; Dorta, B; Ramirez, J L

1982-01-01

As part of a project pertaining to the organization of ribosomal genes in Kinetoplastidae, we have created a data base for published sequences of ribosomal nucleic acids, with information in Spanish. As a first step in their processing, we have written a computer program which introduces the new feature of determining the length of the fragments produced after single or multiple digestion with any of the known restriction enzymes. With this information we have detected conserved SAU 3A sites: (i) at the 5' end of the 5.8S rRNA and at the 3' end of the small subunit rRNA, both included in similar larger sequences; (ii) in the 5.8S rRNA of vertebrates (a second one), which is not present in lower eukaryotes, showing a clear evolutionary divergence; and (iii) at the 5' terminal of the small subunit rRNA, included in a larger conserved sequence. The possible biological importance of these sequences is discussed. PMID:6278402
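The fragment-length computation described in this record can be sketched as follows. This is an illustrative reconstruction, not the 1982 program: it assumes a linear molecule, complete digestion, and enzymes given as a mapping from recognition site to the cut position within the site on the strand shown. The sites used in the example (GATC cut before the G for Sau3A, GAATTC cut after the first G for EcoRI) are standard recognition sequences, but the function name and interface are hypothetical.

```python
def fragment_lengths(seq: str, enzymes: dict) -> list:
    """Fragment lengths after complete digestion of a linear sequence.

    enzymes maps a recognition site (e.g. "GATC") to the 0-based offset of the
    cut within that site; passing several entries models a multiple digest.
    """
    seq = seq.upper()
    cuts = set()
    for site, offset in enzymes.items():
        start = seq.find(site)
        while start != -1:
            cuts.add(start + offset)
            start = seq.find(site, start + 1)
    # Cuts at the very ends do not produce an extra fragment.
    bounds = [0] + sorted(c for c in cuts if 0 < c < len(seq)) + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    single = fragment_lengths("AAGATCTTTTGATCCC", {"GATC": 0})
    double = fragment_lengths("AAGATCTTGAATTCGG", {"GATC": 0, "GAATTC": 1})
    print(single, double)
```

Collecting cut positions in a set before sorting means that sites shared by two enzymes in a multiple digest are not counted twice.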
RNA-Seq Technology and Its Application in Fish Transcriptomics

PubMed Central

Ba, Yi; Zhuang, Qianfeng

2014-01-01

High-throughput sequencing technologies, also known as next-generation sequencing (NGS) technologies, have revolutionized the way that genomic research is advancing. In addition to the static genome, these state-of-the-art technologies have been recently exploited to analyze the dynamic transcriptome, and the resulting technology is termed RNA sequencing (RNA-seq). RNA-seq is free from many limitations of other transcriptomic approaches, such as microarray and tag-based sequencing methods. Although RNA-seq has only been available for a short time, studies using this method have completely changed our perspective of the breadth and depth of eukaryotic transcriptomes. In terms of the transcriptomics of teleost fishes, both model and non-model species have benefited from the RNA-seq approach and have undergone tremendous advances in the past several years. RNA-seq has helped not only in mapping and annotating fish transcriptomes but also in our understanding of many biological processes in fish, such as development, adaptive evolution, host immune response, and stress response. In this review, we first provide an overview of each step of RNA-seq from library construction to the bioinformatic analysis of the data. We then summarize and discuss the recent biological insights obtained from the RNA-seq studies in a variety of fish species. PMID:24380445
Dynamic Convergent Evolution Drives the Passage Adaptation across 48 Years' History of H3N2 Influenza Evolution.

PubMed

Chen, Hui; Deng, Qiang; Ng, Sock Hoon; Lee, Raphael Tze Chuen; Maurer-Stroh, Sebastian; Zhai, Weiwei

2016-12-01

Influenza viruses are often propagated in a diverse set of culturing media and additional substitutions known as passage adaptation can cause extra evolution in the target strain, leading to ineffective vaccines. Using 25,482 H3N2 HA1 sequences curated from Global Initiative on Sharing All Influenza Data and National Center for Biotechnology Information databases, we found that passage adaptation is a very dynamic process that changes over time and evolves in a seesaw like pattern. After crossing the species boundary from bird to human in 1968, the influenza H3N2 virus evolves to be better adapted to the human environment and passaging them in embryonated eggs (i.e., an avian environment) leads to increasingly stronger positive selection. On the contrary, passage adaptation to the mammalian cell lines changes from positive selection to negative selection. Using two statistical tests, we identified 19 codon positions around the receptor binding domain strongly contributing to passage adaptation in the embryonated egg. These sites show strong convergent evolution and overlap extensively with positively selected sites identified in humans, suggesting that passage adaptation can confound many of the earlier studies on influenza evolution. Interestingly, passage adaptation in recent years seems to target a few codon positions in antigenic surface epitopes, which makes it difficult to produce antigenically unaltered vaccines using embryonic eggs. Our study outlines another interesting scenario whereby both convergent and adaptive evolution are working in synchrony driving viral adaptation. Future studies from sequence analysis to vaccine production need to take careful consideration of passage adaptation. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Vibrational spectroscopy reveals the initial steps of biological hydrogen evolution

PubMed Central

Katz, S.; Noth, J.; Shafaat, H. S.; Happe, T.; Hildebrandt, P.

2016-01-01

[FeFe] hydrogenases are biocatalytic model systems for the exploitation and investigation of catalytic hydrogen evolution. Here, we used vibrational spectroscopic techniques to characterize, in detail, redox transformations of the [FeFe] and [4Fe4S] sub-sites of the catalytic centre (H-cluster) in a monomeric [FeFe] hydrogenase. Through the application of low-temperature resonance Raman spectroscopy, we discovered a novel metastable intermediate that is characterized by an oxidized [FeIFeII] centre and a reduced [4Fe4S]1+ cluster. Based on this unusual configuration, this species is assigned to the first, deprotonated H-cluster intermediate of the [FeFe] hydrogenase catalytic cycle. Providing insights into the sequence of initial reaction steps, the identification of this species represents a key finding towards the mechanistic understanding of biological hydrogen evolution. PMID:28451119
The scope and strength of sex-specific selection in genome evolution.

PubMed

Wright, A E; Mank, J E

2013-09-01

Males and females share the vast majority of their genomes and yet are often subject to different, even conflicting, selection. Genomic and transcriptomic developments have made it possible to assess sex-specific selection at the molecular level, and it is clear that sex-specific selection shapes the evolutionary properties of several genomic characteristics, including transcription, post-transcriptional regulation, imprinting, genome structure and gene sequence. Sex-specific selection is strongly influenced by mating system, which also causes neutral evolutionary changes that affect different regions of the genome in different ways. Here, we synthesize theoretical and molecular work in order to provide a cohesive view of the role of sex-specific selection and mating system in genome evolution. We also highlight the need for a combined approach, incorporating both genomic data and experimental phenotypic studies, in order to understand precisely how sex-specific selection drives evolutionary change across the genome. © 2013 The Authors. Journal of Evolutionary Biology © 2013 European Society For Evolutionary Biology.
The Effectiveness of an Online Curriculum on High School Students' Understanding of Biological Evolution

ERIC Educational Resources Information Center

Marsteller, Robert B.; Bodzin, Alec M.

2015-01-01

An online curriculum about biological evolution was designed to promote increased student content knowledge and evidentiary reasoning. A feasibility study was conducted with 77 rural high school biology students who learned with the online biological evolution unit. Data sources included the Biological Evolution Assessment Measure (BEAM), an…
A new method to improve network topological similarity search: applied to fold recognition

PubMed Central

Lhota, John; Hauptman, Ruth; Hart, Thomas; Ng, Clara; Xie, Lei

2015-01-01

Motivation: Similarity search is the foundation of bioinformatics. It plays a key role in establishing structural, functional and evolutionary relationships between biological sequences. Although the power of the similarity search has increased steadily in recent years, a high percentage of sequences remain uncharacterized in the protein universe. Thus, new similarity search strategies are needed to efficiently and reliably infer the structure and function of new sequences. The existing paradigm for studying protein sequence, structure, function and evolution has been established based on the assumption that the protein universe is discrete and hierarchical. Cumulative evidence suggests that the protein universe is continuous. As a result, conventional sequence homology search methods may not be able to detect novel structural, functional and evolutionary relationships between proteins from weak and noisy sequence signals. To overcome the limitations in existing similarity search methods, we propose a new algorithmic framework, Enrichment of Network Topological Similarity (ENTS), to improve the performance of large scale similarity searches in bioinformatics. Results: We apply ENTS to a challenging unsolved problem: protein fold recognition. Our rigorous benchmark studies demonstrate that ENTS considerably outperforms state-of-the-art methods. As the concept of ENTS can be applied to any similarity metric, it may provide a general framework for similarity search on any set of biological entities, given their representation as a network. Availability and implementation: Source code freely available upon request. Contact: lxie@iscb.org PMID:25717198
[The nineteenth century roots of the contemporary biological revolution].

PubMed

Swynghedauw, Bernard

2006-01-01

The recent publication of the human genomic sequence is the most important advance in biology. It originates from four major watersheds between 1860 and 1865, namely biological evolution by Darwin in 1858, Mendel's laws of heredity in 1865, the basis of physiology established by Claude Bernard also in 1865, and the discoveries of microbes by Louis Pasteur around 1857. Before 1860, biology did not exist as a science. After 1860, Darwin's theory progressively became a law after the discovery of DNA polymorphism and of the mechanisms of genetic mixing. Mendel's laws were confirmed in parallel with the development of molecular genetics after the discovery of DNA structure and the genetic code. The discovery of hormones is one example, amongst several, of how integrative physiology builds on Claude Bernard's basis. Finally, based on Pasteur's discovery and the Pasteur Institutes, microbiology became a tool for molecular biologists.
GEM System: automatic prototyping of cell-wide metabolic pathway models from genomes.

PubMed

Arakawa, Kazuharu; Yamada, Yohei; Shinoda, Kosaku; Nakayama, Yoichi; Tomita, Masaru

2006-03-23

Successful realization of a "systems biology" approach to analyzing cells is a grand challenge for our understanding of life. However, current modeling approaches to cell simulation are labor-intensive, manual affairs, and therefore constitute a major bottleneck in the evolution of computational cell biology. We developed the Genome-based Modeling (GEM) System for the purpose of automatically prototyping simulation models of cell-wide metabolic pathways from genome sequences and other public biological information. Models generated by the GEM System include an entire Escherichia coli metabolism model comprising 968 reactions of 1195 metabolites, achieving 100% coverage when compared with the KEGG database, 92.38% with the EcoCyc database, and 95.06% with iJR904 genome-scale model. The GEM System prototypes qualitative models to reduce the labor-intensive tasks required for systems biology research. Models of over 90 bacterial genomes are available at our web site.
A stochastic evolution model for residue Insertion-Deletion Independent from Substitution.

PubMed

Lèbre, Sophie; Michel, Christian J

2010-12-01

We develop here a new class of stochastic models of gene evolution based on residue Insertion-Deletion Independent from Substitution (IDIS). Indeed, in contrast to all existing evolution models, insertions and deletions are modeled here by a concept in population dynamics. Therefore, they are not only independent from each other, but also independent from the substitution process. After a separate stochastic analysis of the substitution and the insertion-deletion processes, we obtain a matrix differential equation combining these two processes defining the IDIS model. By deriving a general solution, we give an analytical expression of the residue occurrence probability at evolution time t as a function of a substitution rate matrix, an insertion rate vector, a deletion rate and an initial residue probability vector. Various mathematical properties of the IDIS model in relation with time t are derived: time scale, time step, time inversion and sequence length. Particular expressions of the nucleotide occurrence probability at time t are given for classical substitution rate matrices in various biological contexts: equal insertion rate, insertion-deletion only and substitution only. All these expressions can be directly used for biological evolutionary applications. The IDIS model shows a strongly different stochastic behavior from the classical substitution only model when compared on a gene dataset. Indeed, by considering three processes of residue insertion, deletion and substitution independently from each other, it allows a more realistic representation of gene evolution and opens new directions and applications in this research field. Copyright © 2010 Elsevier Ltd. All rights reserved.
Natural selection in avian protein-coding genes expressed in brain.

PubMed

Axelsson, Erik; Hultin-Rosenberg, Lina; Brandström, Mikael; Zwahlén, Martin; Clayton, David F; Ellegren, Hans

2008-06-01

The evolution of birds from theropod dinosaurs took place approximately 150 million years ago, and was associated with a number of specific adaptations that are still evident among extant birds, including feathers, song and extravagant secondary sexual characteristics. Knowledge about the molecular evolutionary background to such adaptations is lacking. Here, we analyse the evolution of > 5000 protein-coding gene sequences expressed in zebra finch brain by comparison to orthologous sequences in chicken. Mean d(N)/d(S) is 0.085 and genes with their maximal expression in the eye and central nervous system have the lowest mean d(N)/d(S) value, while those expressed in digestive and reproductive tissues exhibit the highest. We find that fast-evolving genes (those which have higher than expected rate of nonsynonymous substitution, indicative of adaptive evolution) are enriched for biological functions such as fertilization, muscle contraction, defence response, response to stress, wounding and endogenous stimulus, and cell death. After alignment to mammalian orthologues, we identify a catalogue of 228 genes that show a significantly higher rate of protein evolution in the two bird lineages than in mammals. These accelerated bird genes, representing candidates for avian-specific adaptations, include genes implicated in vocal learning and other cognitive processes. Moreover, colouration genes evolve faster in birds than in mammals, which may have been driven by sexual selection for extravagant plumage characteristics.
The genetic evolution of canine parvovirus - A new perspective.

PubMed

Zhou, Pei; Zeng, Weijie; Zhang, Xin; Li, Shoujun

2017-01-01

To trace the evolution of CPV-2, all VP2 gene sequences of CPV-2 and FPV (from 1978 to 2015) in GenBank were analyzed in this study, and several new ideas regarding CPV-2 evolution are presented. First, VP2 amino acid positions 555 and 375 were ruled out as universal mutation sites in CPV-2a, and amino acid position 101 of FPV was found to be I or T rather than only I as in the existing rule. Second, the recently confusing nomenclature of CPV-2 variants was replaced with an optional nomenclature intended to serve future CPV-2 research. Third, after checking the global distribution of variants, CPV-2a was found to be the predominant variant in Asia and CPV-2c the predominant variant in Europe and Latin America. Fourth, a series of CPV-2-like strains were identified and deduced to have evolved from modified live vaccine strains. Finally, strains with three single VP2 mutations (F267Y, Y324I, and T440A) raised concern. These three new VP2 mutation strains may be responsible for vaccine failure, and strains with VP2 440A may become a novel CPV sub-variant. In conclusion, a summary of all VP2 sequences provides a new perspective on CPV-2 evolution, and the correlative biological studies need to be performed further.
Phylogeny of Anophelinae (Diptera: Culicidae) Based on Nuclear Ribosomal and Mitochondrial DNA Sequences

DTIC Science & Technology

2002-01-01

numerous animal clades, including arthropods (Giribet & Ribera, 1998, 2000). The mitochondrial cytochrome oxidase subunits I and II have proven useful as...16S and 28S, D2 rRNA. Insect Molecular Biology, 6, 273-284. Giribet, G. & Ribera, C. (1998) The position of arthropods in animal kingdom: a search...for a reliable outgroup for internal arthropod phylogeny. Molecular Phylogenetics and Evolution, 9, 481-488. Giribet, G. & Ribera, C. (2000) A review
The Coding of Biological Information: From Nucleotide Sequence to Protein Recognition

NASA Astrophysics Data System (ADS)

Štambuk, Nikola

The paper reviews the classic results of Swanson, Dayhoff, Grantham, Blalock and Root-Bernstein, which link genetic code nucleotide patterns to the protein structure, evolution and molecular recognition. Symbolic representation of the binary addresses defining particular nucleotide and amino acid properties is discussed, with consideration of: structure and metric of the code, direct correspondence between amino acid and nucleotide information, and molecular recognition of the interacting protein motifs coded by the complementary DNA and RNA strands.
ISOL@: an Italian SOLAnaceae genomics resource.

PubMed

Chiusano, Maria Luisa; D'Agostino, Nunzio; Traini, Alessandra; Licciardello, Concetta; Raimondo, Enrico; Aversano, Mario; Frusciante, Luigi; Monti, Luigi

2008-03-26

Present-day '-omics' technologies produce overwhelming amounts of data which include genome sequences, information on gene expression (transcripts and proteins) and on cell metabolic status. These data represent multiple aspects of a biological system and need to be investigated as a whole to shed light on the mechanisms which underpin the system functionality. The gathering and convergence of data generated by high-throughput technologies, the effective integration of different data-sources and the analysis of the information content based on comparative approaches are key methods for meaningful biological interpretations. In the frame of the International Solanaceae Genome Project, we propose here ISOLA, an Italian SOLAnaceae genomics resource. ISOLA (available at http://biosrv.cab.unina.it/isola) represents a trial platform and it is conceived as a multi-level computational environment. ISOLA currently consists of two main levels: the genome and the expression level. The cornerstone of the genome level is represented by the Solanum lycopersicum genome draft sequences generated by the International Tomato Genome Sequencing Consortium. Instead, the basic element of the expression level is the transcriptome information from different Solanaceae species, mainly in the form of species-specific comprehensive collections of Expressed Sequence Tags (ESTs). The cross-talk between the genome and the expression levels is based on data source sharing and on tools that enhance data quality, that extract information content from the levels' under parts and produce value-added biological knowledge. ISOLA is the result of a bioinformatics effort that addresses the challenges of the post-genomics era. It is designed to exploit '-omics' data based on effective integration to acquire biological knowledge and to approach a systems biology view. Beyond providing experimental biologists with a preliminary annotation of the tomato genome, this effort aims to produce a trial computational environment where different aspects and details are maintained as they are relevant for the analysis of the organization, the functionality and the evolution of the Solanaceae family.

Revisiting Robustness and Evolvability: Evolution in Weighted Genotype Spaces

PubMed Central

Partha, Raghavendran; Raman, Karthik

2014-01-01

Robustness and evolvability are highly intertwined properties of biological systems. The relationship between these properties determines how biological systems are able to withstand mutations and show variation in response to them. Computational studies have explored the relationship between these two properties using neutral networks of RNA sequences (genotype) and their secondary structures (phenotype) as a model system. However, these studies have assumed every mutation to a sequence to be equally likely; the differences in the likelihood of the occurrence of various mutations, and the consequence of probabilistic nature of the mutations in such a system have previously been ignored. Associating probabilities to mutations essentially results in the weighting of genotype space. We here perform a comparative analysis of weighted and unweighted neutral networks of RNA sequences, and subsequently explore the relationship between robustness and evolvability. We show that assuming an equal likelihood for all mutations (as in an unweighted network), underestimates robustness and overestimates evolvability of a system. In spite of discarding this assumption, we observe that a negative correlation between sequence (genotype) robustness and sequence evolvability persists, and also that structure (phenotype) robustness promotes structure evolvability, as observed in earlier studies using unweighted networks. We also study the effects of base composition bias on robustness and evolvability. Particularly, we explore the association between robustness and evolvability in a sequence space that is AU-rich – sequences with an AU content of 80% or higher, compared to a normal (unbiased) sequence space. We find that evolvability of both sequences and structures in an AU-rich space is lesser compared to the normal space, and robustness higher. 
We also observe that AU-rich populations evolving on neutral networks of phenotypes, can access less phenotypic variation compared to normal populations evolving on neutral networks. PMID:25390641
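The contrast between weighted and unweighted genotype spaces in this record can be illustrated with a toy calculation. The sketch below is not the authors' method: real studies fold RNA sequences into secondary structures, whereas here the phenotype is a deliberately contrived stand-in (the purine/pyrimidine pattern of the sequence), and the mutation weights assume only that transitions are `kappa` times as likely as transversions. All names and parameters are hypothetical.

```python
BASES = "ACGU"
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "U"), ("U", "C")}


def toy_phenotype(seq: str) -> tuple:
    """Contrived stand-in for a structure: the purine/pyrimidine pattern."""
    return tuple(base in "AG" for base in seq)


def robustness(seq: str, phenotype=toy_phenotype, kappa: float = 1.0) -> float:
    """Probability-weighted fraction of point mutants that keep the phenotype.

    kappa is the assumed weight of a transition relative to a transversion;
    kappa = 1.0 reproduces the unweighted (all mutations equally likely) case.
    """
    target = phenotype(seq)
    neutral = 0.0
    total = 0.0
    for i, old in enumerate(seq):
        for new in BASES:
            if new == old:
                continue
            weight = kappa if (old, new) in TRANSITIONS else 1.0
            total += weight
            mutant = seq[:i] + new + seq[i + 1:]
            if phenotype(mutant) == target:
                neutral += weight
    return neutral / total if total else 0.0


if __name__ == "__main__":
    print(robustness("GCAU", kappa=1.0), robustness("GCAU", kappa=2.0))
```

In this toy map only transitions are neutral, so weighting them more heavily raises the robustness estimate; this merely shows how the weighting enters the calculation and says nothing about the magnitude of the effect in real RNA neutral networks.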
Periodic Pattern of Genetic and Fitness Diversity during Evolution of an Artificial Cell-Like System.

PubMed

Ichihashi, Norikazu; Aita, Takuyo; Motooka, Daisuke; Nakamura, Shota; Yomo, Tetsuya

2015-12-01

Genetic and phenotypic diversity are the basis of evolution. Despite their importance, however, little is known about how they change over the course of evolution. In this study, we analyzed the dynamics of the adaptive evolution of a simple evolvable artificial cell-like system using single-molecule real-time sequencing technology that reads an entire single artificial genome. We found that the genomic RNA population increases in fitness intermittently, correlating with a periodic pattern of genetic and fitness diversity produced by repeated diversification and domination. In the diversification phase, a genomic RNA population spreads within a genetic space by accumulating mutations until mutants with higher fitness are generated, resulting in an increase in fitness diversity. In the domination phase, the mutants with higher fitness dominate, decreasing both the fitness and genetic diversity. This study reveals the dynamic nature of genetic and fitness diversity during adaptive evolution and demonstrates the utility of a simplified artificial cell-like system to study evolution at an unprecedented resolution. © The Author 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Structural insights into the evolution of a sexy protein: novel topology and restricted backbone flexibility in a hypervariable pheromone from the red-legged salamander, Plethodon shermani.

PubMed

Wilburn, Damien B; Bowen, Kathleen E; Doty, Kari A; Arumugam, Sengodagounder; Lane, Andrew N; Feldhoff, Pamela W; Feldhoff, Richard C

2014-01-01

In response to pervasive sexual selection, protein sex pheromones often display rapid mutation and accelerated evolution of corresponding gene sequences. For proteins, the general dogma is that structure is maintained even as sequence or function may rapidly change. This phenomenon is well exemplified by the three-finger protein (TFP) superfamily: a diverse class of vertebrate proteins co-opted for many biological functions - such as components of snake venoms, regulators of the complement system, and coordinators of amphibian limb regeneration. All of the >200 structurally characterized TFPs adopt the namesake "three-finger" topology. In male red-legged salamanders, the TFP pheromone Plethodontid Modulating Factor (PMF) is a hypervariable protein such that, through extensive gene duplication and pervasive sexual selection, individual male salamanders express more than 30 unique isoforms. However, it remained unclear how this accelerated evolution affected the protein structure of PMF. Using LC/MS-MS and multidimensional NMR, we report the 3D structure of the most abundant PMF isoform, PMF-G. The high resolution structural ensemble revealed a highly modified TFP structure, including a unique disulfide bonding pattern and loss of secondary structure, that define a novel protein topology with greater backbone flexibility in the third peptide finger. Sequence comparison, models of molecular evolution, and homology modeling together support that this flexible third finger is the most rapidly evolving segment of PMF. Combined with PMF sequence hypervariability, this structural flexibility may enhance the plasticity of PMF as a chemical signal by permitting potentially thousands of structural conformers. We propose that the flexible third finger plays a critical role in PMF:receptor interactions. As female receptors co-evolve, this flexibility may allow PMF to still bind its receptor(s) without the immediate need for complementary mutations. 
Consequently, this unique adaptation may establish new paradigms for how receptor:ligand pairs co-evolve, in particular with respect to sexual conflict.
In silico evolution of the hunchback gene indicates redundancy in cis-regulatory organization and spatial gene expression

PubMed Central

Zagrijchuk, Elizaveta A.; Sabirov, Marat A.; Holloway, David M.; Spirov, Alexander V.

2014-01-01

Biological development depends on the coordinated expression of genes in time and space. Developmental genes have extensive cis-regulatory regions which control their expression. These regions are organized in a modular manner, with different modules controlling expression at different times and locations. Both how modularity evolved and what function it serves are open questions. We present a computational model for the cis-regulation of the hunchback (hb) gene in the fruit fly (Drosophila). We simulate evolution (using an evolutionary computation approach from computer science) to find the optimal cis-regulatory arrangements for fitting experimental hb expression patterns. We find that the cis-regulatory region tends to readily evolve modularity. These cis-regulatory modules (CRMs) do not tend to control single spatial domains, but show a multi-CRM/multi-domain correspondence. We find that the CRM-domain correspondence seen in Drosophila evolves with a high probability in our model, supporting the biological relevance of the approach. The partial redundancy resulting from multi-CRM control may confer some biological robustness against corruption of regulatory sequences. The technique developed on hb could readily be applied to other multi-CRM developmental genes. PMID:24712536
The dynamics of correlated novelties.

PubMed

Tria, F; Loreto, V; Servedio, V D P; Strogatz, S H

2014-07-31

Novelties are a familiar part of daily life. They are also fundamental to the evolution of biological systems, human society, and technology. By opening new possibilities, one novelty can pave the way for others in a process that Kauffman has called "expanding the adjacent possible". The dynamics of correlated novelties, however, have yet to be quantified empirically or modeled mathematically. Here we propose a simple mathematical model that mimics the process of exploring a physical, biological, or conceptual space that enlarges whenever a novelty occurs. The model, a generalization of Polya's urn, predicts statistical laws for the rate at which novelties happen (Heaps' law) and for the probability distribution on the space explored (Zipf's law), as well as signatures of the process by which one novelty sets the stage for another. We test these predictions on four data sets of human activity: the edit events of Wikipedia pages, the emergence of tags in annotation systems, the sequence of words in texts, and listening to new songs in online music catalogues. By quantifying the dynamics of correlated novelties, our results provide a starting point for a deeper understanding of the adjacent possible and its role in biological, cultural, and technological evolution.
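The urn process mentioned in the abstract can be illustrated with a short simulation. The sketch below is only one plausible reading of "a generalization of Polya's urn" in which the space enlarges whenever a novelty occurs: each drawn ball is returned with `rho` extra copies, and the first draw of any color also adds `nu + 1` balls of brand-new colors. The names `simulate_urn`, `rho`, `nu`, and `n0`, and their default values, are assumptions made for illustration and are not taken from the abstract.

```python
import random

def simulate_urn(steps, rho=4, nu=2, n0=3, seed=0):
    """Simulate an urn that grows whenever a never-seen color is drawn.

    Returns (sequence, distinct_counts): the colors drawn, and the number
    of distinct colors observed after each draw.
    """
    rng = random.Random(seed)
    urn = list(range(n0))        # start with one ball per initial color
    next_color = n0              # label reserved for the next unseen color
    seen = set()
    sequence, distinct_counts = [], []
    for _ in range(steps):
        color = rng.choice(urn)
        sequence.append(color)
        urn.extend([color] * rho)            # reinforcement (assumed rule)
        if color not in seen:                # a novelty enlarges the space
            seen.add(color)
            urn.extend(range(next_color, next_color + nu + 1))
            next_color += nu + 1
        distinct_counts.append(len(seen))
    return sequence, distinct_counts

if __name__ == "__main__":
    seq, counts = simulate_urn(10_000)
    print(counts[99], counts[999], counts[9999])
```

Plotting `distinct_counts` against the draw index on log-log axes is one way to check for the sublinear, Heaps-like growth the abstract describes; under these assumed rules that growth is expected when `nu` is smaller than `rho`.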
The dynamics of correlated novelties

NASA Astrophysics Data System (ADS)

Tria, F.; Loreto, V.; Servedio, V. D. P.; Strogatz, S. H.

2014-07-01

Novelties are a familiar part of daily life. They are also fundamental to the evolution of biological systems, human society, and technology. By opening new possibilities, one novelty can pave the way for others in a process that Kauffman has called "expanding the adjacent possible". The dynamics of correlated novelties, however, have yet to be quantified empirically or modeled mathematically. Here we propose a simple mathematical model that mimics the process of exploring a physical, biological, or conceptual space that enlarges whenever a novelty occurs. The model, a generalization of Polya's urn, predicts statistical laws for the rate at which novelties happen (Heaps' law) and for the probability distribution on the space explored (Zipf's law), as well as signatures of the process by which one novelty sets the stage for another. We test these predictions on four data sets of human activity: the edit events of Wikipedia pages, the emergence of tags in annotation systems, the sequence of words in texts, and listening to new songs in online music catalogues. By quantifying the dynamics of correlated novelties, our results provide a starting point for a deeper understanding of the adjacent possible and its role in biological, cultural, and technological evolution.
The dynamics of correlated novelties

PubMed Central

Tria, F.; Loreto, V.; Servedio, V. D. P.; Strogatz, S. H.

2014-01-01

Novelties are a familiar part of daily life. They are also fundamental to the evolution of biological systems, human society, and technology. By opening new possibilities, one novelty can pave the way for others in a process that Kauffman has called “expanding the adjacent possible”. The dynamics of correlated novelties, however, have yet to be quantified empirically or modeled mathematically. Here we propose a simple mathematical model that mimics the process of exploring a physical, biological, or conceptual space that enlarges whenever a novelty occurs. The model, a generalization of Polya's urn, predicts statistical laws for the rate at which novelties happen (Heaps' law) and for the probability distribution on the space explored (Zipf's law), as well as signatures of the process by which one novelty sets the stage for another. We test these predictions on four data sets of human activity: the edit events of Wikipedia pages, the emergence of tags in annotation systems, the sequence of words in texts, and listening to new songs in online music catalogues. By quantifying the dynamics of correlated novelties, our results provide a starting point for a deeper understanding of the adjacent possible and its role in biological, cultural, and technological evolution. PMID:25080941
EvoluCode: Evolutionary Barcodes as a Unifying Framework for Multilevel Evolutionary Data.

PubMed

Linard, Benjamin; Nguyen, Ngoc Hoan; Prosdocimi, Francisco; Poch, Olivier; Thompson, Julie D

2012-01-01

Evolutionary systems biology aims to uncover the general trends and principles governing the evolution of biological networks. An essential part of this process is the reconstruction and analysis of the evolutionary histories of these complex, dynamic networks. Unfortunately, the methodologies for representing and exploiting such complex evolutionary histories in large scale studies are currently limited. Here, we propose a new formalism, called EvoluCode (Evolutionary barCode), which allows the integration of different evolutionary parameters (eg, sequence conservation, orthology, synteny …) in a unifying format and facilitates the multilevel analysis and visualization of complex evolutionary histories at the genome scale. The advantages of the approach are demonstrated by constructing barcodes representing the evolution of the complete human proteome. Two large-scale studies are then described: (i) the mapping and visualization of the barcodes on the human chromosomes and (ii) automatic clustering of the barcodes to highlight protein subsets sharing similar evolutionary histories and their functional analysis. The methodologies developed here open the way to the efficient application of other data mining and knowledge extraction techniques in evolutionary systems biology studies. A database containing all EvoluCode data is available at: http://lbgi.igbmc.fr/barcodes.
Nearly complete 28S rRNA gene sequences confirm new hypotheses of sponge evolution.

PubMed

Thacker, Robert W; Hill, April L; Hill, Malcolm S; Redmond, Niamh E; Collins, Allen G; Morrow, Christine C; Spicer, Lori; Carmack, Cheryl A; Zappe, Megan E; Pohlmann, Deborah; Hall, Chelsea; Diaz, Maria C; Bangalore, Purushotham V

2013-09-01

The highly collaborative research sponsored by the NSF-funded Assembling the Porifera Tree of Life (PorToL) project is providing insights into some of the most difficult questions in metazoan systematics. Our understanding of phylogenetic relationships within the phylum Porifera has changed considerably with increased taxon sampling and data from additional molecular markers. PorToL researchers have falsified earlier phylogenetic hypotheses, discovered novel phylogenetic alliances, found phylogenetic homes for enigmatic taxa, and provided a more precise understanding of the evolution of skeletal features, secondary metabolites, body organization, and symbioses. Some of these exciting new discoveries are shared in the papers that form this issue of Integrative and Comparative Biology. Our analyses of over 300 nearly complete 28S ribosomal subunit gene sequences provide specific case studies that illustrate how our dataset confirms new hypotheses of sponge evolution. We recovered monophyletic clades for all 4 classes of sponges, as well as the 4 major clades of Demospongiae (Keratosa, Myxospongiae, Haploscleromorpha, and Heteroscleromorpha), but our phylogeny differs in several aspects from traditional classifications. In most major clades of sponges, families within orders appear to be paraphyletic. Although additional sampling of genes and taxa are needed to establish whether this pattern results from a lack of phylogenetic resolution or from a paraphyletic classification system, many of our results are congruent with those obtained from 18S ribosomal subunit gene sequences and complete mitochondrial genomes. These data provide further support for a revision of the traditional classification of sponges.
Nearly Complete 28S rRNA Gene Sequences Confirm New Hypotheses of Sponge Evolution

PubMed Central

Thacker, Robert W.; Hill, April L.; Hill, Malcolm S.; Redmond, Niamh E.; Collins, Allen G.; Morrow, Christine C.; Spicer, Lori; Carmack, Cheryl A.; Zappe, Megan E.; Pohlmann, Deborah; Hall, Chelsea; Diaz, Maria C.; Bangalore, Purushotham V.

2013-01-01

The highly collaborative research sponsored by the NSF-funded Assembling the Porifera Tree of Life (PorToL) project is providing insights into some of the most difficult questions in metazoan systematics. Our understanding of phylogenetic relationships within the phylum Porifera has changed considerably with increased taxon sampling and data from additional molecular markers. PorToL researchers have falsified earlier phylogenetic hypotheses, discovered novel phylogenetic alliances, found phylogenetic homes for enigmatic taxa, and provided a more precise understanding of the evolution of skeletal features, secondary metabolites, body organization, and symbioses. Some of these exciting new discoveries are shared in the papers that form this issue of Integrative and Comparative Biology. Our analyses of over 300 nearly complete 28S ribosomal subunit gene sequences provide specific case studies that illustrate how our dataset confirms new hypotheses of sponge evolution. We recovered monophyletic clades for all 4 classes of sponges, as well as the 4 major clades of Demospongiae (Keratosa, Myxospongiae, Haploscleromorpha, and Heteroscleromorpha), but our phylogeny differs in several aspects from traditional classifications. In most major clades of sponges, families within orders appear to be paraphyletic. Although additional sampling of genes and taxa are needed to establish whether this pattern results from a lack of phylogenetic resolution or from a paraphyletic classification system, many of our results are congruent with those obtained from 18S ribosomal subunit gene sequences and complete mitochondrial genomes. These data provide further support for a revision of the traditional classification of sponges. PMID:23748742
Thermodynamic Basis for the Emergence of Genomes during Prebiotic Evolution

PubMed Central

Woo, Hyung-June; Vijaya Satya, Ravi; Reifman, Jaques

2012-01-01

The RNA world hypothesis views modern organisms as descendants of RNA molecules. The earliest RNA molecules must have been random sequences, from which the first genomes that coded for polymerase ribozymes emerged. The quasispecies theory by Eigen predicts the existence of an error threshold limiting genomic stability during such transitions, but does not address the spontaneity of changes. Following a recent theoretical approach, we applied the quasispecies theory combined with kinetic/thermodynamic descriptions of RNA replication to analyze the collective behavior of RNA replicators based on known experimental kinetics data. We find that, with increasing fidelity (relative rate of base-extension for Watson-Crick versus mismatched base pairs), replications without enzymes, with ribozymes, and with protein-based polymerases are above, near, and below a critical point, respectively. The prebiotic evolution therefore must have crossed this critical region. Over large regions of the phase diagram, fitness increases with increasing fidelity, biasing random drifts in sequence space toward ‘crystallization.’ This region encloses the experimental nonenzymatic fidelity value, favoring evolutions toward polymerase sequences with ever higher fidelity, despite error rates above the error catastrophe threshold. Our work shows that experimentally characterized kinetics and thermodynamics of RNA replication allow us to determine the physicochemical conditions required for the spontaneous crystallization of biological information. Our findings also suggest that among many potential oligomers capable of templated replication, RNAs may have evolved to form prebiotic genomes due to the value of their nonenzymatic fidelity. PMID:22693440
Evolution of HIV-1 coreceptor usage and coreceptor switching during pregnancy.

PubMed

Ransy, Doris G; Motorina, Alena; Merindol, Natacha; Akouamba, Bertine S; Samson, Johanne; Lie, Yolanda; Napolitano, Laura A; Lapointe, Normand; Boucher, Marc; Soudeyns, Hugo

2014-03-01

Coreceptor switch from CCR5 to CXCR4 is associated with HIV disease progression. To document the evolution of coreceptor tropism during pregnancy, a longitudinal study of envelope gene sequences was performed in a group of pregnant women infected with HIV-1 of clade B (n=10) or non-B (n=9). Polymerase chain reaction (PCR) amplification of the V1-V3 region was performed on plasma viral RNA, followed by cloning and sequencing. Using geno2pheno and PSSMX4R5, the presence of X4 variants was predicted in nine of 19 subjects (X4 subjects) independent of HIV-1 clade. Six of nine X4 subjects exhibited CD4+ T cell counts <200 cells/mm³, and the presence of X4-capable virus was confirmed using a recombinant phenotypic assay in four of seven cases where testing was successful. In five of nine X4 subjects, a statistically significant decline in the geno2pheno false-positive rate was observed during the course of pregnancy, invariably accompanied by progressive increases in the PSSMX4R5 score, the net charge of V3, and the relative representation of X4 sequences. Evolution toward X4 tropism was also echoed in the primary structure of V2, as an accumulation of substitutions associated with CXCR4 tropism was seen in X4 subjects. Results from these experiments provide the first evidence of the ongoing evolution of coreceptor utilization from CCR5 to CXCR4 during pregnancy in a significant fraction of HIV-infected women. These results inform changes in host-pathogen interactions that lead to a directional shaping of viral populations and viral tropism during pregnancy, and provide insights into the biology of HIV transmission from mother to child.
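One of the markers tracked in this abstract, the net charge of V3, is simple to compute from an amino acid sequence. The sketch below uses a common approximation, counting lysine and arginine as +1 and aspartate and glutamate as -1 while ignoring histidine; the abstract does not say how the authors computed the charge, so this rule is an assumption. The function name and the example string are illustrative only and are not data from the study.

```python
def v3_net_charge(seq):
    """Approximate net charge of a peptide: (K + R) minus (D + E).

    Histidine and the termini are ignored (a simplifying assumption).
    """
    seq = seq.upper()
    positive = sum(seq.count(aa) for aa in "KR")
    negative = sum(seq.count(aa) for aa in "DE")
    return positive - negative

if __name__ == "__main__":
    # Illustrative 35-residue V3-like string, not a sequence from the study.
    example_v3 = "CTRPNNNTRKSIHIGPGRAFYTTGEIIGDIRQAHC"
    print(v3_net_charge(example_v3))  # 3
```

Applied to clones sampled at successive time points, a rising value would correspond to the progressive increase in V3 net charge that the abstract reports in X4 subjects.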
Phylogeny of zebrafish, a "model species," within Danio, a "model genus".

PubMed

McCluskey, Braedan M; Postlethwait, John H

2015-03-01

Zebrafish (Danio rerio) is an important model for vertebrate development, genomics, physiology, behavior, toxicology, and disease. Additionally, work on numerous Danio species is elucidating evolutionary mechanisms for morphological development. Yet, the relationships of zebrafish and its closest relatives remain unclear possibly due to incomplete lineage sorting, speciation with gene flow, and interspecies hybridization. To clarify these relationships, we first constructed phylogenomic data sets from 30,801 restriction-associated DNA (RAD)-tag loci (483,026 variable positions) with clear orthology to a single location in the sequenced zebrafish genome. We then inferred a well-supported species tree for Danio and tested for gene flow during the diversification of the genus. An approach independent of the sequenced zebrafish genome verified all inferred relationships. Although identification of the sister taxon to zebrafish has been contentious, multiple RAD-tag data sets and several analytical methods provided strong evidence for Danio aesculapii as the most closely related extant zebrafish relative studied to date. Data also displayed patterns consistent with gene flow during speciation and postspeciation introgression in the lineage leading to zebrafish. The incorporation of biogeographic data with phylogenomic analyses put these relationships in a phylogeographic context and supplied additional support for D. aesculapii as the sister species to D. rerio. The clear resolution of this study establishes a framework for investigating the evolutionary biology of Danio and the heterogeneity of genome evolution in the recent history of a model organism within an emerging model genus for genetics, development, and evolution. © The Author 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
The Widespread Prevalence and Functional Significance of Silk-Like Structural Proteins in Metazoan Biological Materials

PubMed Central

McDougall, Carmel; Woodcroft, Ben J.

2016-01-01

In nature, numerous mechanisms have evolved by which organisms fabricate biological structures with an impressive array of physical characteristics. Some examples of metazoan biological materials include the highly elastic byssal threads by which bivalves attach themselves to rocks, biomineralized structures that form the skeletons of various animals, and spider silks that are renowned for their exceptional strength and elasticity. The remarkable properties of silks, which are perhaps the best studied biological materials, are the result of the highly repetitive, modular, and biased amino acid composition of the proteins that compose them. Interestingly, similar levels of modularity/repetitiveness and similar bias in amino acid compositions have been reported in proteins that are components of structural materials in other organisms, however the exact nature and extent of this similarity, and its functional and evolutionary relevance, is unknown. Here, we investigate this similarity and use sequence features common to silks and other known structural proteins to develop a bioinformatics-based method to identify similar proteins from large-scale transcriptome and whole-genome datasets. We show that a large number of proteins identified using this method have roles in biological material formation throughout the animal kingdom. Despite the similarity in sequence characteristics, most of the silk-like structural proteins (SLSPs) identified in this study appear to have evolved independently and are restricted to a particular animal lineage. Although the exact function of many of these SLSPs is unknown, the apparent independent evolution of proteins with similar sequence characteristics in divergent lineages suggests that these features are important for the assembly of biological materials. 
The identification of these characteristics enables the generation of testable hypotheses regarding the mechanisms by which these proteins assemble and direct the construction of biological materials with diverse morphologies. The SilkSlider predictor software developed here is available at https://github.com/wwood/SilkSlider. PMID:27415783
Process modeling of a HLA research lab

NASA Astrophysics Data System (ADS)

Ribeiro, Bruna G. C.; Sena, Alexandre C.; Silva, Dilson; Marzulo, Leandro A. J.

2017-11-01

Bioinformatics has provided tremendous breakthroughs in the field of molecular biology. This evolution has generated a large volume of biological data that increasingly requires computing for its analysis and storage. The identification of human leukocyte antigen (HLA) genotypes is critical to the success of organ transplants in humans. HLA typing involves not only laboratory tests but also DNA sequencing, with the participation of several professionals responsible for different stages of the process. Thus, the objective of this paper is to map the main steps of HLA typing in a laboratory specialized in performing such procedures, analyzing each process and proposing solutions to speed up these steps and avoid mistakes.
Synergy and contingency as driving forces for the evolution of multiple secondary metabolite production by Streptomyces species.

PubMed

Challis, Gregory L; Hopwood, David A

2003-11-25

In this article we briefly review theories about the ecological roles of microbial secondary metabolites and discuss the prevalence of multiple secondary metabolite production by strains of Streptomyces, highlighting results from analysis of the recently sequenced Streptomyces coelicolor and Streptomyces avermitilis genomes. We address this question: Why is multiple secondary metabolite production in Streptomyces species so commonplace? We argue that synergy or contingency in the action of individual metabolites against biological competitors may, in some cases, be a powerful driving force for the evolution of multiple secondary metabolite production. This argument is illustrated with examples of the coproduction of synergistically acting antibiotics and contingently acting siderophores: two well-known classes of secondary metabolite. We focus, in particular, on the coproduction of beta-lactam antibiotics and beta-lactamase inhibitors, the coproduction of type A and type B streptogramins, and the coregulated production and independent uptake of structurally distinct siderophores by species of Streptomyces. Possible mechanisms for the evolution of multiple synergistic and contingent metabolite production in Streptomyces species are discussed. It is concluded that the production by Streptomyces species of two or more secondary metabolites that act synergistically or contingently against biological competitors may be far more common than has previously been recognized, and that synergy and contingency may be common driving forces for the evolution of multiple secondary metabolite production by these sessile saprophytes.
Synergy and contingency as driving forces for the evolution of multiple secondary metabolite production by Streptomyces species

PubMed Central

Challis, Gregory L.; Hopwood, David A.

2003-01-01

In this article we briefly review theories about the ecological roles of microbial secondary metabolites and discuss the prevalence of multiple secondary metabolite production by strains of Streptomyces, highlighting results from analysis of the recently sequenced Streptomyces coelicolor and Streptomyces avermitilis genomes. We address this question: Why is multiple secondary metabolite production in Streptomyces species so commonplace? We argue that synergy or contingency in the action of individual metabolites against biological competitors may, in some cases, be a powerful driving force for the evolution of multiple secondary metabolite production. This argument is illustrated with examples of the coproduction of synergistically acting antibiotics and contingently acting siderophores: two well-known classes of secondary metabolite. We focus, in particular, on the coproduction of β-lactam antibiotics and β-lactamase inhibitors, the coproduction of type A and type B streptogramins, and the coregulated production and independent uptake of structurally distinct siderophores by species of Streptomyces. Possible mechanisms for the evolution of multiple synergistic and contingent metabolite production in Streptomyces species are discussed. It is concluded that the production by Streptomyces species of two or more secondary metabolites that act synergistically or contingently against biological competitors may be far more common than has previously been recognized, and that synergy and contingency may be common driving forces for the evolution of multiple secondary metabolite production by these sessile saprophytes. PMID:12970466
The Teaching of Evolution and Creationism in Minnesota

ERIC Educational Resources Information Center

Moore, Randy; Kraemer, Karen

2005-01-01

The evolution-related attitudes and actions of Minnesota high school biology teachers were studied to estimate the prevalence of creationism among biology teachers. Minnesota's high school biology teachers were questioned about the evolution education in public schools regarding the percentage of biology teachers who teach evolution, class-time…
Similar Ratios of Introns to Intergenic Sequence across Animal Genomes.

PubMed

Francis, Warren R; Wörheide, Gert

2017-06-01

One central goal of genome biology is to understand how the usage of the genome differs between organisms. Our knowledge of genome composition, needed for downstream inferences, is critically dependent on gene annotations, yet problems associated with gene annotation and assembly errors are usually ignored in comparative genomics. Here, we analyze the genomes of 68 species across 12 animal phyla and some single-cell eukaryotes for general trends in genome composition and transcription, taking into account problems of gene annotation. We show that, regardless of genome size, the ratio of introns to intergenic sequence is comparable across essentially all animals, with nearly all deviations dominated by increased intergenic sequence. Genomes of model organisms have ratios much closer to 1:1, suggesting that the majority of published genomes of nonmodel organisms are underannotated and consequently omit substantial numbers of genes, with likely negative impact on evolutionary interpretations. Finally, our results also indicate that most animals transcribe half or more of their genomes arguing against differences in genome usage between animal groups, and also suggesting that the transcribed portion is more dependent on genome size than previously thought. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
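The central quantity in this abstract, the ratio of intronic to intergenic sequence, can be computed directly from gene annotations. The following is a minimal sketch under stated assumptions: annotations are given as half-open `(start, end)` exon coordinates on a single contig, a gene spans from its first exon start to its last exon end, intronic bases are genic bases not covered by any exon, and intergenic bases are those outside every gene span. The function names and the toy coordinates are hypothetical and do not reproduce the authors' pipeline.

```python
def merged_length(intervals):
    """Total bases covered by half-open (start, end) intervals, overlaps counted once."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            if cur_end is not None:
                total += cur_end - cur_start   # close the previous block
            cur_start, cur_end = start, end
        else:
            cur_end = max(cur_end, end)        # extend the current block
    if cur_end is not None:
        total += cur_end - cur_start
    return total

def intron_intergenic_ratio(contig_length, genes):
    """genes: list of genes, each a non-empty list of (start, end) exon pairs.

    Returns (intronic_bases, intergenic_bases, ratio).
    """
    spans = [(min(s for s, _ in exons), max(e for _, e in exons)) for exons in genes]
    exons = [exon for gene in genes for exon in gene]
    genic = merged_length(spans)
    intronic = genic - merged_length(exons)
    intergenic = contig_length - genic
    ratio = intronic / intergenic if intergenic else float("inf")
    return intronic, intergenic, ratio

if __name__ == "__main__":
    toy_genes = [
        [(1000, 1200), (1500, 1800), (2500, 3000)],
        [(6000, 6400), (7000, 7600)],
    ]
    print(intron_intergenic_ratio(10_000, toy_genes))  # (1600, 6400, 0.25)
```

In this toy case the ratio is well below 1:1 because most of the contig is unannotated, which is the kind of deviation the abstract attributes to underannotation in nonmodel genomes.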
Mutation at a distance caused by homopolymeric guanine repeats in Saccharomyces cerevisiae

PubMed Central

McDonald, Michael J.; Yu, Yen-Hsin; Guo, Jheng-Fen; Chong, Shin Yen; Kao, Cheng-Fu; Leu, Jun-Yi

2016-01-01

Mutation provides the raw material from which natural selection shapes adaptations. The rate at which new mutations arise is therefore a key factor that determines the tempo and mode of evolution. However, an accurate assessment of the mutation rate of a given organism is difficult because mutation rate varies on a fine scale within a genome. A central challenge of evolutionary genetics is to determine the underlying causes of this variation. In earlier work, we had shown that repeat sequences not only are prone to a high rate of expansion and contraction but also can cause an increase in mutation rate (on the order of kilobases) of the sequence surrounding the repeat. We perform experiments that show that simple guanine repeats 13 bp (base pairs) in length or longer (G13+) increase the substitution rate 4- to 18-fold in the downstream DNA sequence, and this correlates with DNA replication timing (R = 0.89). We show that G13+ mutagenicity results from the interplay of both error-prone translesion synthesis and homologous recombination repair pathways. The mutagenic repeats that we study have the potential to be exploited for the artificial elevation of mutation rate in systems biology and synthetic biology applications. PMID:27386516
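Locating candidate repeats of the kind studied here (guanine homopolymers of 13 bp or longer, "G13+") in a sequence is a simple pattern search. The sketch below is an illustration only: it reports runs of G on the given strand and runs of C as G runs on the opposite strand, using 0-based half-open coordinates. The function name, the strand convention, and the test sequence are assumptions and are not taken from the paper.

```python
import re

def find_g_runs(seq, min_len=13):
    """Return (start, end, strand) for each G homopolymer of at least min_len bases.

    Runs of C are reported with strand '-' because they are G runs on the
    reverse strand. Coordinates are 0-based and half-open.
    """
    pattern = re.compile(rf"G{{{min_len},}}|C{{{min_len},}}")
    hits = []
    for match in pattern.finditer(seq.upper()):
        strand = "+" if match.group()[0] == "G" else "-"
        hits.append((match.start(), match.end(), strand))
    return hits

if __name__ == "__main__":
    toy = "ACGT" + "G" * 13 + "TTT" + "C" * 15 + "A" + "G" * 12
    print(find_g_runs(toy))  # [(4, 17, '+'), (20, 35, '-')]
```

The final run of 12 guanines in the toy sequence is not reported because it falls below the 13 bp threshold named in the abstract.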

Detection and sequence/structure mapping of biophysical constraints to protein variation in saturated mutational libraries and protein sequence alignments with a dedicated server.

PubMed

Abriata, Luciano A; Bovigny, Christophe; Dal Peraro, Matteo

2016-06-17

Protein variability can now be studied by measuring high-resolution tolerance-to-substitution maps and fitness landscapes in saturated mutational libraries. But these rich and expensive datasets are typically interpreted coarsely, restricting detailed analyses to positions of extremely high or low variability or dubbed important beforehand based on existing knowledge about active sites, interaction surfaces, (de)stabilizing mutations, etc. Our new webserver PsychoProt (freely available without registration at http://psychoprot.epfl.ch or at http://lucianoabriata.altervista.org/psychoprot/index.html ) helps to detect, quantify, and sequence/structure map the biophysical and biochemical traits that shape amino acid preferences throughout a protein as determined by deep-sequencing of saturated mutational libraries or from large alignments of naturally occurring variants. We exemplify how PsychoProt helps to (i) unveil protein structure-function relationships from experiments and from alignments that are consistent with structures according to coevolution analysis, (ii) recall global information about structural and functional features and identify hitherto unknown constraints to variation in alignments, and (iii) point at different sources of variation among related experimental datasets or between experimental and alignment-based data. Remarkably, metabolic costs of the amino acids pose strong constraints to variability at protein surfaces in nature but not in the laboratory. This and other differences call for caution when extrapolating results from in vitro experiments to natural scenarios in, for example, studies of protein evolution. We show through examples how PsychoProt can be a useful tool for the broad communities of structural biology and molecular evolution, particularly for studies about protein modeling, evolution and design.
Comparative Analysis of the Shared Sex-Determination Region (SDR) among Salmonid Fishes.

PubMed

Faber-Hammond, Joshua J; Phillips, Ruth B; Brown, Kim H

2015-06-25

Salmonids present an excellent model for studying evolution of young sex-chromosomes. Within the genus, Oncorhynchus, at least six independent sex-chromosome pairs have evolved, many unique to individual species. This variation results from the movement of the sex-determining gene, sdY, throughout the salmonid genome. While sdY is known to define sexual differentiation in salmonids, the mechanism of its movement throughout the genome has remained elusive due to high frequencies of repetitive elements, rDNA sequences, and transposons surrounding the sex-determining regions (SDR). Despite these difficulties, bacterial artificial chromosome (BAC) library clones from both rainbow trout and Atlantic salmon containing the sdY region have been reported. Here, we report the sequences for these BACs as well as the extended sequence for the known SDR in Chinook gained through genome walking methods. Comparative analysis allowed us to study the overlapping SDRs from three unique salmonid Y chromosomes to define the specific content, size, and variation present between the species. We found approximately 4.1 kb of orthologous sequence common to all three species, which contains the genetic content necessary for masculinization. The regions contain transposable elements that may be responsible for the translocations of the SDR throughout salmonid genomes and we examine potential mechanistic roles of each one. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
The Medicago Genome Provides Insight into the Evolution of Rhizobial Symbioses

PubMed Central

Young, Nevin D.; Debellé, Frédéric; Oldroyd, Giles E. D.; Geurts, Rene; Cannon, Steven B.; Udvardi, Michael K.; Benedito, Vagner A.; Mayer, Klaus F. X.; Gouzy, Jérôme; Schoof, Heiko; Van de Peer, Yves; Proost, Sebastian; Cook, Douglas R.; Meyers, Blake C.; Spannagl, Manuel; Cheung, Foo; De Mita, Stéphane; Krishnakumar, Vivek; Gundlach, Heidrun; Zhou, Shiguo; Mudge, Joann; Bharti, Arvind K.; Murray, Jeremy D.; Naoumkina, Marina A.; Rosen, Benjamin; Silverstein, Kevin A. T.; Tang, Haibao; Rombauts, Stephane; Zhao, Patrick X.; Zhou, Peng; Barbe, Valérie; Bardou, Philippe; Bechner, Michael; Bellec, Arnaud; Berger, Anne; Bergès, Hélène; Bidwell, Shelby; Bisseling, Ton; Choisne, Nathalie; Couloux, Arnaud; Denny, Roxanne; Deshpande, Shweta; Dai, Xinbin; Doyle, Jeff; Dudez, Anne-Marie; Farmer, Andrew D.; Fouteau, Stéphanie; Franken, Carolien; Gibelin, Chrystel; Gish, John; Goldstein, Steven; González, Alvaro J.; Green, Pamela J.; Hallab, Asis; Hartog, Marijke; Hua, Axin; Humphray, Sean; Jeong, Dong-Hoon; Jing, Yi; Jöcker, Anika; Kenton, Steve M.; Kim, Dong-Jin; Klee, Kathrin; Lai, Hongshing; Lang, Chunting; Lin, Shaoping; Macmil, Simone L; Magdelenat, Ghislaine; Matthews, Lucy; McCorrison, Jamison; Monaghan, Erin L.; Mun, Jeong-Hwan; Najar, Fares Z.; Nicholson, Christine; Noirot, Céline; O’Bleness, Majesta; Paule, Charles R.; Poulain, Julie; Prion, Florent; Qin, Baifang; Qu, Chunmei; Retzel, Ernest F.; Riddle, Claire; Sallet, Erika; Samain, Sylvie; Samson, Nicolas; Sanders, Iryna; Saurat, Olivier; Scarpelli, Claude; Schiex, Thomas; Segurens, Béatrice; Severin, Andrew J.; Sherrier, D. Janine; Shi, Ruihua; Sims, Sarah; Singer, Susan R.; Sinharoy, Senjuti; Sterck, Lieven; Viollet, Agnès; Wang, Bing-Bing; Wang, Keqin; Wang, Mingyi; Wang, Xiaohong; Warfsmann, Jens; Weissenbach, Jean; White, Doug D.; White, Jim D.; Wiley, Graham B.; Wincker, Patrick; Xing, Yanbo; Yang, Limei; Yao, Ziyun; Ying, Fu; Zhai, Jixian; Zhou, Liping; Zuber, Antoine; Dénarié, Jean; Dixon, Richard A.; May, Gregory D.; Schwartz, David C.; Rogers, Jane; Quétier, Francis; Town, Christopher D.; Roe, Bruce A.

2011-01-01

Legumes (Fabaceae or Leguminosae) are unique among cultivated plants for their ability to carry out endosymbiotic nitrogen fixation with rhizobial bacteria, a process that takes place in a specialized structure known as the nodule. Legumes belong to one of the two main groups of eurosids, the Fabidae, which includes most species capable of endosymbiotic nitrogen fixation 1. Legumes comprise several evolutionary lineages derived from a common ancestor 60 million years ago (Mya). Papilionoids are the largest clade, dating nearly to the origin of legumes and containing most cultivated species 2. Medicago truncatula (Mt) is a long-established model for the study of legume biology. Here we describe the draft sequence of the Mt euchromatin based on a recently completed BAC-assembly supplemented with Illumina-shotgun sequence, together capturing ~94% of all Mt genes. A whole-genome duplication (WGD) approximately 58 Mya played a major role in shaping the Mt genome and thereby contributed to the evolution of endosymbiotic nitrogen fixation. Subsequent to the WGD, the Mt genome experienced higher levels of rearrangement than two other sequenced legumes, Glycine max (Gm) and Lotus japonicus (Lj). Mt is a close relative of alfalfa (M. sativa), a widely cultivated crop with limited genomics tools and complex autotetraploid genetics. As such, the Mt genome sequence provides significant opportunities to expand alfalfa’s genomic toolbox. PMID:22089132
Endosymbiosis and Eukaryotic Cell Evolution.

PubMed

Archibald, John M

2015-10-05

Understanding the evolution of eukaryotic cellular complexity is one of the grand challenges of modern biology. It has now been firmly established that mitochondria and plastids, the classical membrane-bound organelles of eukaryotic cells, evolved from bacteria by endosymbiosis. In the case of mitochondria, evidence points very clearly to an endosymbiont of α-proteobacterial ancestry. The precise nature of the host cell that partnered with this endosymbiont is, however, very much an open question. And while the host for the cyanobacterial progenitor of the plastid was undoubtedly a fully-fledged eukaryote, how - and how often - plastids moved from one eukaryote to another during algal diversification is vigorously debated. In this article I frame modern views on endosymbiotic theory in a historical context, highlighting the transformative role DNA sequencing played in solving early problems in eukaryotic cell evolution, and posing key unanswered questions emerging from the age of comparative genomics. Copyright © 2015 Elsevier Ltd. All rights reserved.
The genomics of selection in dogs and the parallel evolution between dogs and humans.

PubMed

Wang, Guo-dong; Zhai, Weiwei; Yang, He-chuan; Fan, Ruo-xi; Cao, Xue; Zhong, Li; Wang, Lu; Liu, Fei; Wu, Hong; Cheng, Lu-guang; Poyarkov, Andrei D; Poyarkov, Nikolai A; Tang, Shu-sheng; Zhao, Wen-ming; Gao, Yun; Lv, Xue-mei; Irwin, David M; Savolainen, Peter; Wu, Chung-I; Zhang, Ya-ping

2013-01-01

The genetic bases of demographic changes and artificial selection underlying domestication are of great interest in evolutionary biology. Here we perform whole-genome sequencing of multiple grey wolves, Chinese indigenous dogs and dogs of diverse breeds. Demographic analysis shows that the split between wolves and Chinese indigenous dogs occurred 32,000 years ago and that the subsequent bottlenecks were mild. Therefore, dogs may have been under human selection over a much longer time than previously concluded, based on molecular data, perhaps by initially scavenging with humans. Population genetic analysis identifies a list of genes under positive selection during domestication, which overlaps extensively with the corresponding list of positively selected genes in humans. Parallel evolution is most apparent in genes for digestion and metabolism, neurological processes and cancer. Our study, for the first time, draws together humans and dogs in their recent genomic evolution.
Exploring mitochondrial evolution and metabolism organization principles by comparative analysis of metabolic networks.

PubMed

Chang, Xiao; Wang, Zhuo; Hao, Pei; Li, Yuan-Yuan; Li, Yi-Xue

2010-06-01

The endosymbiotic theory proposes that mitochondrial genomes are derived from an alpha-proteobacterium-like endosymbiont, a conclusion drawn from sequence analysis. We rebuilt the metabolic networks of mitochondria and 22 related species, and studied the evolution of mitochondrial metabolism at the level of enzyme content and network topology. Our phylogenetic results based on network alignment and motif identification supported the endosymbiotic theory from the point of view of systems biology for the first time. We found that the mitochondrial metabolic network was much more compact than those of the related species, probably owing to the higher efficiency of oxidative phosphorylation in the specialized organelle, and that the network is highly clustered around the TCA cycle. Moreover, the mitochondrial metabolic network exhibited high functional specificity to the modules. This work provides insight into mitochondrial evolution and the organization principles of the mitochondrial metabolic network at the network level. Copyright 2010 Elsevier Inc. All rights reserved.
Evolution of cyclohexadienyl dehydratase from an ancestral solute-binding protein.

PubMed

Clifton, Ben E; Kaczmarski, Joe A; Carr, Paul D; Gerth, Monica L; Tokuriki, Nobuhiko; Jackson, Colin J

2018-04-23

The emergence of enzymes through the neofunctionalization of noncatalytic proteins is ultimately responsible for the extraordinary range of biological catalysts observed in nature. Although the evolution of some enzymes from binding proteins can be inferred by homology, we have a limited understanding of the nature of the biochemical and biophysical adaptations along these evolutionary trajectories and the sequence in which they occurred. Here we reconstructed and characterized evolutionary intermediate states linking an ancestral solute-binding protein to the extant enzyme cyclohexadienyl dehydratase. We show how the intrinsic reactivity of a desolvated general acid was harnessed by a series of mutations radiating from the active site, which optimized enzyme-substrate complementarity and transition-state stabilization and minimized sampling of noncatalytic conformations. Our work reveals the molecular evolutionary processes that underlie the emergence of enzymes de novo, which are notably mirrored by recent examples of computational enzyme design and directed evolution.
A close-up view on ITS2 evolution and speciation - a case study in the Ulvophyceae (Chlorophyta, Viridiplantae)

PubMed Central

2011-01-01

Background The second Internal Transcribed Spacer (ITS2) is a fast evolving part of the nuclear-encoded rRNA operon located between the 5.8S and 28S rRNA genes. Based on crossing experiments it has been proposed that even a single Compensatory Base Change (CBC) in helices 2 and 3 of the ITS2 indicates sexual incompatibility and thus separates biological species. Taxa without any CBC in these ITS2 regions were designated as a 'CBC clade'. However, in-depth comparative analyses of ITS2 secondary structures, ITS2 phylogeny, the origin of CBCs, and their relationship to biological species have rarely been performed. To gain 'close-up' insights into ITS2 evolution, (1) 86 sequences of ITS2 including secondary structures have been investigated in the green algal order Ulvales (Chlorophyta, Viridiplantae), (2) after recording all existing substitutions, CBCs and hemi-CBCs (hCBCs) were mapped upon the ITS2 phylogeny, rather than merely comparing ITS2 characters among pairs of taxa, and (3) the relation between CBCs, hCBCs, CBC clades, and the taxonomic level of organisms was investigated in detail. Results High sequence and length conservation allowed the generation of an ITS2 consensus secondary structure, and introduction of a novel numbering system of ITS2 nucleotides and base pairs. Alignments and analyses were based on this structural information, leading to the following results: (1) in the Ulvales, the presence of a CBC is not linked to any particular taxonomic level; (2) most CBC 'clades' sensu Coleman are paraphyletic, and should rather be termed CBC grades; (3) the phenetic approach of pairwise comparison of sequences can be misleading, and thus CBCs/hCBCs must be investigated in their evolutionary context, including homoplasy events; and (4) CBCs and hCBCs in ITS2 helices evolved independently, and we found no evidence for a CBC that originated via a two-fold hCBC substitution. Conclusions Our case study revealed several discrepancies between ITS2 evolution in the Ulvales and generally accepted assumptions underlying ITS2 evolution, such as the CBC clade concept. Therefore, we developed a suite of methods providing a critical 'close-up' view into ITS2 evolution by directly tracing the evolutionary history of individual positions, and we caution against a non-critical use of the ITS2 CBC clade concept for species delimitation. PMID:21933414
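The distinction between a CBC and a hemi-CBC can be made concrete with a small sketch. It assumes the usual definitions (a CBC changes both nucleotides of a paired site while pairing is kept; an hCBC changes only one) and counts G-U wobble pairs as valid. The function names, the toy sequences and the helix positions are hypothetical, and this is the plain pairwise comparison, not the phylogeny-based mapping the authors advocate.

```python
# Canonical Watson-Crick pairs plus G-U wobble pairs (RNA alphabet).
VALID_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def classify_pair(seq_a, seq_b, i, j):
    """Classify one paired site (i, j) shared by two aligned ITS2 sequences."""
    pair_a = (seq_a[i], seq_a[j])
    pair_b = (seq_b[i], seq_b[j])
    if pair_a not in VALID_PAIRS or pair_b not in VALID_PAIRS:
        return "unpaired"
    # Count how many of the two positions differ between the sequences.
    changed = (pair_a[0] != pair_b[0]) + (pair_a[1] != pair_b[1])
    return {0: "conserved", 1: "hCBC", 2: "CBC"}[changed]


# Toy helix (hypothetical): positions 0-8, 1-7 and 2-6 are paired.
seq_a = "GGCAAAGCC"
seq_b = "GAUAAAGUC"
helix = [(0, 8), (1, 7), (2, 6)]
calls = [classify_pair(seq_a, seq_b, i, j) for i, j in helix]
print(calls)  # ['conserved', 'CBC', 'hCBC']
```

Because this compares only two sequences at a time, it cannot tell whether a CBC arose in one step or through homoplasy, which is the limitation result (3) of the abstract warns about.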
Biological pattern and transcriptomic exploration and phylogenetic analysis in the odd floral architecture tree: Helwingia willd.

PubMed

Sun, Cheng; Yu, Guoliang; Bao, Manzhu; Zheng, Bo; Ning, Guogui

2014-06-27

Unusual traits found in only a few plant species often point to potential biological significance in plant evolution. The genus Helwingia Willd., a dioecious medicinal shrub in the order Aquifoliales, has an unusual floral architecture: an epiphyllous inflorescence. The potential significance and possible evolutionary origin of this genus are not well understood because few biological and genetic data are available. In addition, the advent of genomics-based technologies has transformed the study of plant species with unknown genomic information. Morphological and biological patterns were detailed via anatomical and pollination analyses. An RNA sequencing-based transcriptomic analysis was undertaken, and a high-resolution phylogenetic analysis was conducted based on single-copy genes in more than 80 species of seed plants, including H. japonica. It was verified that a potential fusion of the rachis to the leaf midvein facilitates insect pollination. RNA sequencing yielded a total of 111450 unigenes; half of them had significant similarity to proteins in the public database, and 20281 unigenes were mapped to 119 pathways. According to the phylogenetic analysis based on single-copy genes, Helwingia is closer to Euasterids II rather than to Euasterids, congruent with previous reports using plastid sequences. The unusual flower architecture adapts Helwingia to insect pollination by using the leaf to host insects larger than the flower, a character shared by few other insect-pollinated plants. Furthermore, the present transcriptome greatly enriches the genomic information available for Helwingia species, and the nuclear gene-based phylogenetic analysis greatly improves the resolution and robustness of phylogenetic reconstruction for H. japonica.
Genome-Wide Identification of Regulatory Sequences Undergoing Accelerated Evolution in the Human Genome

PubMed Central

Dong, Xinran; Wang, Xiao; Zhang, Feng; Tian, Weidong

2016-01-01

Accelerated evolution of regulatory sequences can alter the expression pattern of target genes and cause phenotypic changes. In this study, we used DNase I hypersensitive sites (DHSs) to annotate putative regulatory sequences in the human genome, and conducted a genome-wide analysis of the effects of accelerated evolution on regulatory sequences. Working under the assumption that local ancient repeat elements of DHSs are under neutral evolution, we discovered that ∼0.44% of DHSs are under accelerated evolution (ace-DHSs). We found that ace-DHSs tend to be more active than background DHSs, and are strongly associated with epigenetic marks of active transcription. The target genes of ace-DHSs are significantly enriched in neuron-related functions, and their expression levels are positively selected in the human brain. Thus, these lines of evidence strongly suggest that accelerated evolution of regulatory sequences plays an important role in the evolution of human-specific phenotypes. PMID:27401230
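The idea of testing a DHS against a local neutral rate can be sketched as a one-sided binomial test. The abstract does not state the study's actual statistical procedure, so the test itself is an assumption, and the function names, counts and significance threshold are all hypothetical.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k, n + 1))


def is_accelerated(subs, sites, neutral_subs, neutral_sites, alpha=0.05):
    """Compare substitutions in a DHS with the rate seen in nearby neutral sequence.

    The neutral rate is estimated from local ancient repeats (assumed neutral);
    returns the one-sided p-value and whether it falls below alpha.
    """
    neutral_rate = neutral_subs / neutral_sites
    p_value = binomial_upper_tail(subs, sites, neutral_rate)
    return p_value, p_value < alpha


# Hypothetical counts: 12 substitutions in a 200-bp DHS versus 100 in 10,000 bp of repeats.
p_fast, fast = is_accelerated(12, 200, 100, 10_000)
p_slow, slow = is_accelerated(2, 200, 100, 10_000)
print(fast, slow)
```

A genome-wide scan would also need multiple-testing correction across all DHSs, which this sketch leaves out.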
The not so universal tree of life or the place of viruses in the living world

PubMed Central

Brüssow, Harald

2009-01-01

Darwin provided a great unifying theory for biology; its visual expression is the universal tree of life. The tree concept is challenged by the occurrence of horizontal gene transfer and—as summarized in this review—by the omission of viruses. Microbial ecologists have demonstrated that viruses are the most numerous biological entities on earth, outnumbering cells by a factor of 10. Viral genomics have revealed an unexpected size and distinctness of the viral DNA sequence space. Comparative genomics has shown elements of vertical evolution in some groups of viruses. Furthermore, structural biology has demonstrated links between viruses infecting the three domains of life pointing to a very ancient origin of viruses. However, presently viruses do not find a place on the universal tree of life, which is thus only a tree of cellular life. In view of the polythetic nature of current life definitions, viruses cannot be dismissed as non-living material. On earth we have therefore at least two large DNA sequence spaces, one represented by capsid-encoding viruses and another by ribosome-encoding cells. Despite their probable distinct evolutionary origin, both spheres were and are connected by intensive two-way gene transfers. PMID:19571246
Genome sequence and comparative analysis of a putative entomopathogenic Serratia isolated from Caenorhabditis briggsae.

PubMed

Abebe-Akele, Feseha; Tisa, Louis S; Cooper, Vaughn S; Hatcher, Philip J; Abebe, Eyualem; Thomas, W Kelley

2015-07-18

Entomopathogenic associations between nematodes in the genera Steinernema and Heterorhabditis with their cognate bacteria from the bacterial genera Xenorhabdus and Photorhabdus, respectively, are extensively studied for their potential as biological control agents against invasive insect species. These two highly coevolved associations are the result of convergent evolution. Given the natural abundance of bacteria, nematodes and insects, it is surprising that only these two associations with no intermediate forms are widely studied in the entomopathogenic context. Discovering analogous systems involving novel bacterial and nematode species would shed light on the evolutionary processes involved in the transition from free living organisms to obligatory partners in entomopathogenicity. We report the complete genome sequence of a new member of the enterobacterial genus Serratia that forms a putative entomopathogenic complex with Caenorhabditis briggsae. Analysis of the 5.04 MB chromosomal genome predicts 4599 protein coding genes, seven sets of ribosomal RNA genes, 84 tRNA genes and a 64.8 KB plasmid encoding 74 genes. Comparative genomic analysis with three of the previously sequenced Serratia species, S. marcescens DB11, S. proteamaculans 568 and Serratia sp. AS12, revealed that these four representatives of the genus share a core set of ~3100 genes and extensive structural conservation. The newly identified species shares a more recent common ancestor with S. marcescens with 99% sequence identity in rDNA sequence and orthology across 85.6% of predicted genes. Of the 39 genes/operons implicated in virulence, symbiosis, recolonization, immune evasion and bioconversion, 21 (53.8%) were present in Serratia while 33 (84.6%) and 35 (89%) were present in Xenorhabdus and Photorhabdus EPN bacteria respectively. The majority of unique sequences in Serratia sp. SCBI (South African Caenorhabditis briggsae Isolate) are found in ~29 genomic islands of 5 to 65 genes and are enriched in putative functions that are biologically relevant to an entomopathogenic lifestyle, including non-ribosomal peptide synthetases, bacteriocins, fimbrial biogenesis, ushering proteins, toxins, secondary metabolite secretion and multiple drug resistance/efflux systems. By revealing the early stages of adaptation to this lifestyle, the Serratia sp. SCBI genome underscores the fact that in EPN formation the composite end result - killing, bioconversion, cadaver protection and recolonization - can be achieved by dissimilar mechanisms. This genome sequence will enable further study of the evolution of entomopathogenic nematode-bacteria complexes.
Herpesviruses that infect fish.

PubMed

Hanson, Larry; Dishon, Arnon; Kotler, Moshe

2011-11-01

Herpesviruses are host-specific pathogens that are widespread among vertebrates. Genome sequence data demonstrate that most herpesviruses of fish and amphibians are grouped together (family Alloherpesviridae) and are distantly related to herpesviruses of reptiles, birds and mammals (family Herpesviridae). Yet, many of the biological processes of members of the order Herpesvirales are similar. Among the conserved characteristics are the virion structure, replication process, the ability to establish long-term latency and the manipulation of the host immune response. Many of the similar processes may be due to convergent evolution. This overview of identified herpesviruses of fish discusses the diseases that alloherpesviruses cause, the biology of these viruses and the host-pathogen interactions. Much of our knowledge on the biology of Alloherpesviridae is derived from research with two species: Ictalurid herpesvirus 1 (channel catfish virus) and Cyprinid herpesvirus 3 (koi herpesvirus).
Herpesviruses that Infect Fish

PubMed Central

Hanson, Larry; Dishon, Arnon; Kotler, Moshe

2011-01-01

Herpesviruses are host-specific pathogens that are widespread among vertebrates. Genome sequence data demonstrate that most herpesviruses of fish and amphibians are grouped together (family Alloherpesviridae) and are distantly related to herpesviruses of reptiles, birds and mammals (family Herpesviridae). Yet, many of the biological processes of members of the order Herpesvirales are similar. Among the conserved characteristics are the virion structure, replication process, the ability to establish long-term latency and the manipulation of the host immune response. Many of the similar processes may be due to convergent evolution. This overview of identified herpesviruses of fish discusses the diseases that alloherpesviruses cause, the biology of these viruses and the host-pathogen interactions. Much of our knowledge on the biology of Alloherpesviridae is derived from research with two species: Ictalurid herpesvirus 1 (channel catfish virus) and Cyprinid herpesvirus 3 (koi herpesvirus). PMID:22163339
Beyond DNA: integrating inclusive inheritance into an extended theory of evolution.

PubMed

Danchin, Étienne; Charmantier, Anne; Champagne, Frances A; Mesoudi, Alex; Pujol, Benoit; Blanchet, Simon

2011-06-17

Many biologists are calling for an 'extended evolutionary synthesis' that would 'modernize the modern synthesis' of evolution. Biological information is typically considered as being transmitted across generations by the DNA sequence alone, but accumulating evidence indicates that both genetic and non-genetic inheritance, and the interactions between them, have important effects on evolutionary outcomes. We review the evidence for such effects of epigenetic, ecological and cultural inheritance and parental effects, and outline methods that quantify the relative contributions of genetic and non-genetic heritability to the transmission of phenotypic variation across generations. These issues have implications for diverse areas, from the question of missing heritability in human complex-trait genetics to the basis of major evolutionary transitions.
Biology Professors' and Teachers' Positions Regarding Biological Evolution and Evolution Education in a Middle Eastern Society

NASA Astrophysics Data System (ADS)

BouJaoude, Saouma; Asghar, Anila; Wiles, Jason R.; Jaber, Lama; Sarieddine, Diana; Alters, Brian

2011-05-01

This study investigated three questions: (1) What are Lebanese secondary school (Grade 9-12) biology teachers' and university biology professors' positions regarding biological evolution?, (2) How do participants' religious affiliations relate to their positions about evolutionary science?, and (3) What are participants' positions regarding evolution education? Participants were 20 secondary school biology teachers and seven university biology professors. Seventy percent of the teachers and 60% of the professors were Muslim. Data came from semi-structured interviews with participants. Results showed that nine (Christian or Muslim Druze) teachers accepted the theory, five (four Muslim) rejected it because it contradicted religious beliefs, and three (Muslim) reinterpreted it because evolution did not include humans. Teachers who rejected or reinterpreted the evolutionary theory said that it should not be taught (three), evolution and creationism should be given equal time (two), or students should be allowed to take their own stand. Two professors indicated that they taught evolution explicitly and five said that they integrated it in other biology content. One Muslim professor said that she stressed 'the role of God in creation during instruction on evolution'. It seems that years of studying and teaching biology have not had a transformative effect on how a number of teachers and professors think about evolution.
Modeling Co-evolution of Speech and Biology.

PubMed

de Boer, Bart

2016-04-01

Two computer simulations are investigated that model interaction of cultural evolution of language and biological evolution of adaptations to language. Both are agent-based models in which a population of agents imitates each other using realistic vowels. The agents evolve under selective pressure for good imitation. In one model, the evolution of the vocal tract is modeled; in the other, a cognitive mechanism for perceiving speech accurately is modeled. In both cases, biological adaptations to using and learning speech evolve, even though the system of speech sounds itself changes at a more rapid time scale than biological evolution. However, the fact that the available acoustic space is used maximally (a self-organized result of cultural evolution) is constant, and therefore biological evolution does have a stable target. This work shows that when cultural and biological traits are continuous, their co-evolution may lead to cognitive adaptations that are strong enough to detect empirically. Copyright © 2016 Cognitive Science Society, Inc.
Academic Preparation in Biology and Advocacy for Teaching Evolution: Biology versus Non-Biology Teachers

ERIC Educational Resources Information Center

Nehm, Ross H.; Kim, Sun Young; Sheppard, Keith

2009-01-01

Despite considerable focus on evolution knowledge-belief relationships, little research has targeted populations with strong content backgrounds, such as undergraduate degrees in biology. This study (1) measured precertified biology and non-biology teachers' (n = 167) knowledge of evolution and the nature of science; (2) quantified teacher…
Evolving Continents

NASA Astrophysics Data System (ADS)

Hamilton, Warren

Brian Windley succeeds very well indeed at the formidable task he sets for himself in this greatly revised second edition of a book that first appeared in 1977. He synthesizes primarily the tectonic and petrologic evolution of the continents and secondarily their economic geologic, stratigraphic, and biologic history. The book is organized in well-balanced time sequence and topical chapters, followed by a fine overview. The author describes examples, generalizes from them, and seeks understanding of variations with time and with depth of the process acting on continents within a plate tectonic framework.
Genome-Based Characterization of Biological Processes That Differentiate Closely Related Bacteria

PubMed Central

Palmer, Marike; Steenkamp, Emma T.; Coetzee, Martin P. A.; Blom, Jochen; Venter, Stephanus N.

2018-01-01

Bacteriologists have strived toward attaining a natural classification system based on evolutionary relationships for nearly 100 years. In the early twentieth century it was accepted that a phylogeny-based system would be the most appropriate, but in the absence of molecular data, this approach proved exceedingly difficult. Subsequent technical advances and the increasing availability of genome sequencing have allowed for the generation of robust phylogenies at all taxonomic levels. In this study, we explored the possibility of linking biological characters to higher-level taxonomic groups in bacteria by making use of whole genome sequence information. For this purpose, we specifically targeted the genus Pantoea and its four main lineages. The shared gene sets were determined for Pantoea, the four lineages within the genus, as well as its sister-genus Tatumella. This was followed by functional characterization of the gene sets using the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. In comparison to Tatumella, various traits involved in nutrient cycling were identified within Pantoea, providing evidence for increased efficacy in recycling of metabolites within the genus. Additionally, a number of traits associated with pathogenicity were identified within species often associated with opportunistic infections, with some support for adaptation toward overcoming host defenses. Some traits were also only conserved within specific lineages, potentially acquired in an ancestor to the lineage and subsequently maintained. It was also observed that the species isolated from the most diverse sources were generally the most versatile in their carbon metabolism. By investigating evolution, based on the more variable genomic regions, it may be possible to detect biologically relevant differences associated with the course of evolution and speciation. PMID:29467735

Species Tree Inference Using a Mixture Model.

PubMed

Ullah, Ikram; Parviainen, Pekka; Lagergren, Jens

2015-09-01

Species tree reconstruction has been a subject of substantial research due to its central role across biology and medicine. A species tree is often reconstructed using a set of gene trees or by directly using sequence data. In either of these cases, one of the main confounding phenomena is the discordance between a species tree and a gene tree due to evolutionary events such as duplications and losses. Probabilistic methods can resolve the discordance by coestimating gene trees and the species tree but this approach poses a scalability problem for larger data sets. We present MixTreEM-DLRS: A two-phase approach for reconstructing a species tree in the presence of gene duplications and losses. In the first phase, MixTreEM, a novel structural expectation maximization algorithm based on a mixture model is used to reconstruct a set of candidate species trees, given sequence data for monocopy gene families from the genomes under study. In the second phase, PrIME-DLRS, a method based on the DLRS model (Åkerborg O, Sennblad B, Arvestad L, Lagergren J. 2009. Simultaneous Bayesian gene tree reconstruction and reconciliation analysis. Proc Natl Acad Sci U S A. 106(14):5714-5719), is used for selecting the best species tree. PrIME-DLRS can handle multicopy gene families since DLRS, apart from modeling sequence evolution, models gene duplication and loss using a gene evolution model (Arvestad L, Lagergren J, Sennblad B. 2009. The gene evolution model and computing its associated probabilities. J ACM. 56(2):1-44). We evaluate MixTreEM-DLRS using synthetic and biological data, and compare its performance with a recent genome-scale species tree reconstruction method PHYLDOG (Boussau B, Szöllősi GJ, Duret L, Gouy M, Tannier E, Daubin V. 2013. Genome-scale coestimation of species and gene trees. Genome Res. 23(2):323-330) as well as with a fast parsimony-based algorithm Duptree (Wehe A, Bansal MS, Burleigh JG, Eulenstein O. 2008. Duptree: a program for large-scale phylogenetic analyses using gene tree parsimony. Bioinformatics 24(13):1540-1541). Our method is competitive with PHYLDOG in accuracy while running significantly faster, and it outperforms Duptree in accuracy. The analysis constituted by MixTreEM without DLRS may also be used for selecting the target species tree, yielding a fast and yet accurate algorithm for larger data sets. MixTreEM is freely available at http://prime.scilifelab.se/mixtreem/. © The Author 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Tapping the Power of Crustacean Transcriptomics to Address Grand Challenges in Comparative Biology: An Introduction to the Symposium.

PubMed

Mykles, Donald L; Burnett, Karen G; Durica, David S; Stillman, Jonathon H

2016-12-01

Crustaceans, and decapods in particular (i.e., crabs, shrimp, and lobsters), are a diverse and ecologically and commercially important group of organisms. Understanding responses to abiotic and biotic factors is critical for developing best practices in aquaculture and assessing the effects of changing environments on the biology of these important animals. A relatively small number of decapod crustacean species have been intensively studied at the molecular level; the availability, experimental tractability, and economic relevance factor into the selection of a particular species as a model. Transcriptomics, using high-throughput next generation sequencing (NGS, coupled with RNA sequencing or RNA-seq) is revolutionizing crustacean biology. The 11 symposium papers in this volume illustrate how RNA-seq is being used to study stress response, molting and limb regeneration, immunity and disease, reproduction and development, neurobiology, and ecology and evolution. This symposium occurred on the 10th anniversary of the symposium, "Genomic and Proteomic Approaches to Crustacean Biology", held at the Society for Integrative and Comparative Biology 2006 meeting. Two participants in the 2006 symposium, the late Paul Gross and David Towle, were recognized as leaders who pioneered the use of molecular techniques that would ultimately foster the transcriptomics research reviewed in this volume. RNA-seq is a powerful tool for hypothesis-driven research, as well as an engine for discovery. It has eclipsed the technologies available in 2006, such as microarrays, expressed sequence tags, and subtractive hybridization screening, as the millions of "reads" from NGS enable researchers to de novo assemble a comprehensive transcriptome without a complete genome sequence. The symposium series concludes with a policy paper that gives an overview of the resources available and makes recommendations for developing better tools for functional annotation and pathway and network analysis in organisms in which the genome is not available or is incomplete. © The Author 2016. Published by Oxford University Press on behalf of the Society for Integrative and Comparative Biology. All rights reserved. For permissions please email: journals.permissions@oup.com.
Repetitive sequences in plant nuclear DNA: types, distribution, evolution and function.

PubMed

Mehrotra, Shweta; Goyal, Vinod

2014-08-01

Repetitive DNA sequences are a major component of eukaryotic genomes and may account for up to 90% of the genome size. They can be divided into minisatellite, microsatellite and satellite sequences. Satellite DNA sequences are considered to be a fast-evolving component of eukaryotic genomes, comprising tandemly-arrayed, highly-repetitive and highly-conserved monomer sequences. The monomer unit of satellite DNA is 150-400 base pairs (bp) in length. Repetitive sequences may be species- or genus-specific, and may be centromeric or subtelomeric in nature. They exhibit cohesive and concerted evolution caused by molecular drive, leading to high sequence homogeneity. Repetitive sequences accumulate variations in sequence and copy number during evolution, hence they are important tools for taxonomic and phylogenetic studies, and are known as "tuning knobs" in evolution. Therefore, knowledge of repetitive sequences assists our understanding of the organization, evolution and behavior of eukaryotic genomes. Repetitive sequences have cytoplasmic, cellular and developmental effects and play a role in chromosomal recombination. In the post-genomics era, with the introduction of next-generation sequencing technology, it is possible to evaluate complex genomes for analyzing repetitive sequences and deciphering the yet unknown functional potential of repetitive sequences. Copyright © 2014 The Authors. Production and hosting by Elsevier Ltd. All rights reserved.
An analysis of factors influencing the teaching of biological evolution in Louisiana public secondary schools

NASA Astrophysics Data System (ADS)

Aguillard, Donald Wayne

Louisiana public school biology teachers were surveyed to investigate their attitudes toward biological evolution. A mixed method investigation was employed using a questionnaire and open-ended interviews. Results obtained from 64 percent of the sample receiving the questionnaire indicate that although teachers endorse the study of evolution as important, instructional time allocated to evolution is disproportionate with its status as a unifying concept of science. Two variables, number of college courses specifically devoted to evolution and number of semester credit hours in biology, produced a significant correlation with emphasis placed on evolution. The data suggest that teachers' knowledge base emerged as the most significant factor in determining degree of classroom emphasis on evolution. The data suggest a need for substantive changes in the training of biology teachers. Thirty-five percent of teachers reported pursuing fewer than 20 semester credit hours in biology and 68 percent reported fewer than three college courses in which evolution was specifically discussed. Fifty percent reported a willingness to undergo additional training about evolution. In spite of the fact that evolution has been identified as a major conceptual theme across all of the sciences, there is strong evidence that Louisiana biology teachers de-emphasize evolutionary theory. Even when biology teachers allocate instructional time to evolutionary theory, many avoid discussion of human evolution. The research data show that only ten percent of teachers reported allocating more than sixty minutes of instructional time to human evolution. Louisiana biology teachers were found to hold extreme views on the subject of creationism as a component of the biology curriculum. Twenty-nine percent indicated that creationism should be taught in high school biology and 25--35 percent allocated instructional time to discussions of creationism. 
Contributing to the de-emphasis of evolutionary theory, as a unifying theme of biology, is the courtesy extended to classroom teachers to determine what topics are emphasized. The inclusion of evolution in curriculum documents is not sufficient to ensure that evolutionary theory is regarded as a unifying theme of biology. School administrators, science supervisors, and local school boards have a clear responsibility to articulate strong support for requiring classroom discussions of evolutionary theory.
Progress in bioinformatics and the importance of being earnest.

PubMed

Attwood, T K; Miller, C J

2002-01-01

In silico biology has gathered momentum as, worldwide, scientists have united in a common quest to sequence, store and analyse complete genomes. This year, a pivotal achievement of this cooperative endeavour was realised in the release of a public draft of the human genome, and with it the promises to improve our understanding of diverse aspects of biology and to yield a healthier future with safe personalized medicines. Key to these goals will be the need to elucidate and characterise the genes and gene products encoded not just in the human genome, but in many genomes. These tasks are underpinned by the concepts and processes of genome and gene/protein evolution, regulation of gene expression, mechanisms of protein folding, the manifestation of protein function, and so on, all of which must be understood in the context of complex, dynamic biological systems. Our use of computers to model such concepts and systems must be placed in the context of the current limits of our understanding of them:- it is important to recognise, for example, that we don't have a common understanding either of what constitutes a gene or a protein function; we can't invariably say that a particular sequence or fold has arisen via divergent or convergent evolution; and we don't fully understand the rules of protein folding. Accepting what we can't do in silico is essential in appreciating what we can do. Without this understanding, it is easy to be misled, as notions of what particular computational approaches can achieve are sometimes rather optimistic. There are valuable lessons to be learned here from the field of Artificial Intelligence, principal among which is the realisation that capturing and representing complex knowledge is time consuming, expensive and hard. 
Thus, we argue here that if bioinformatics is to tackle biological complexity in earnest, it would be wise to absorb the experience distilled from decades of artificial intelligence research, and to approach the road ahead with caution, rigour and pragmatism.
Petunia, Your Next Supermodel?

PubMed Central

Vandenbussche, Michiel; Chambrier, Pierre; Rodrigues Bento, Suzanne; Morel, Patrice

2016-01-01

Plant biology in general, and plant evo–devo in particular, would strongly benefit from a broader range of available model systems. In recent years, technological advances have facilitated the analysis and comparison of individual gene functions in multiple species, now representing a fairly wide taxonomic range of the plant kingdom. Because genes are embedded in gene networks, studying evolution of gene function ultimately should be put in the context of studying the evolution of entire gene networks, since changes in the function of a single gene will normally go together with further changes in its network environment. For this reason, plant comparative biology/evo–devo will require the availability of a defined set of ‘super’ models occupying key taxonomic positions, in which performing gene functional analysis and testing genetic interactions ideally is as straightforward as, e.g., in Arabidopsis. Here we review why petunia has the potential to become one of these future supermodels, as a representative of the Asterid clade. We will first detail its intrinsic qualities as a model system. Next, we highlight how the revolution in sequencing technologies will now finally allow exploitation of the petunia system to its full potential, even though petunia already has a long history as a model in plant molecular biology and genetics. We conclude with a series of arguments in favor of a more diversified multi-model approach in plant biology, and we point out where the petunia model system may further play a role, based on its biological features and molecular toolkit. PMID:26870078
Transmission as a basic process in microbial biology. Lwoff Award Prize Lecture.

PubMed

Baquero, Fernando

2017-11-01

Transmission is a basic process in biology and evolution, as it communicates different biological entities within and across hierarchical levels (from genes to holobionts) both in time and space. Vertical descent, replication, is transmission of information across generations (in the time dimension), and horizontal descent is transmission of information across compartments (in the space dimension). Transmission is essentially a communication process that can be studied by analogy of the classic information theory, based on 'emitters', 'messages' and 'receivers'. The analogy can be easily extended to the triad 'emigration', 'migration' and 'immigration'. A number of causes (forces) determine the emission, and another set of causes (energies) assures the reception. The message in fact is essentially constituted by 'meaningful' biological entities. A DNA sequence, a cell and a population have a semiotic dimension, are 'signs' that are eventually recognized (decoded) and integrated by receiver biological entities. In cis-acting or unenclosed transmission, the emitters and receivers correspond to separated entities of the same hierarchical level; in trans-acting or embedded transmission, the information flows between different, but frequently nested, hierarchical levels. The result (as in introgressive events) is constantly producing innovation and feeding natural selection, influencing also the evolution of transmission processes. This review is based on the concepts presented at the André Lwoff Award Lecture in the FEMS Microbiology Congress in Maastricht in 2015. © FEMS 2017. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Hsp-90 and the biology of nematodes

PubMed Central

Him, Nik AIIN; Gillan, Victoria; Emes, Richard D; Maitland, Kirsty; Devaney, Eileen

2009-01-01

Background Hsp-90 from the free-living nematode Caenorhabditis elegans is unique in that it fails to bind to the specific Hsp-90 inhibitor, geldanamycin (GA). Here we surveyed 24 different free-living or parasitic nematodes with the aim of determining whether C. elegans Hsp-90 was the exception or the norm amongst the nematodes. We combined these data with codon evolution models in an attempt to identify whether hsp-90 from GA-binding and non-binding species has evolved under different evolutionary constraints. Results We show that GA-binding is associated with life history: free-living nematodes and those parasitic species with free-living larval stages failed to bind GA. In contrast, obligate parasites and those worms in which the free-living stage in the environment is enclosed within a resistant egg, possess a GA-binding Hsp-90. We analysed Hsp-90 sequences from fifteen nematode species to determine whether nematode hsp-90s have undergone adaptive evolution that influences GA-binding. Our data provide evidence of rapid diversifying selection in the evolution of the hsp-90 gene along three separate lineages, and identified a number of residues showing significant evidence of adaptive evolution. However, we were unable to prove that the selection observed is correlated with the ability to bind geldanamycin or not. Conclusion Hsp-90 is a multi-functional protein and the rapid evolution of the hsp-90 gene presumably correlates with other key cellular functions. Factors other than primary amino acid sequence may influence the ability of Hsp-90 to bind to geldanamycin. PMID:19849843
Draft genome of the living fossil Ginkgo biloba.

PubMed

Guan, Rui; Zhao, Yunpeng; Zhang, He; Fan, Guangyi; Liu, Xin; Zhou, Wenbin; Shi, Chengcheng; Wang, Jiahao; Liu, Weiqing; Liang, Xinming; Fu, Yuanyuan; Ma, Kailong; Zhao, Lijun; Zhang, Fumin; Lu, Zuhong; Lee, Simon Ming-Yuen; Xu, Xun; Wang, Jian; Yang, Huanming; Fu, Chengxin; Ge, Song; Chen, Wenbin

2016-11-21

Ginkgo biloba L. (Ginkgoaceae) is one of the most distinctive plants. It possesses a suite of fascinating characteristics including a large genome, outstanding resistance/tolerance to abiotic and biotic stresses, and dioecious reproduction, making it an ideal model species for biological studies. However, the lack of a high-quality genome sequence has been an impediment to our understanding of its biology and evolution. The 10.61 Gb genome sequence containing 41,840 annotated genes was assembled in the present study. Repetitive sequences account for 76.58% of the assembled sequence, and long terminal repeat retrotransposons (LTR-RTs) are particularly prevalent. The diversity and abundance of LTR-RTs is due to their gradual accumulation and a remarkable amplification between 16 and 24 million years ago, and they contribute to the long introns and large genome. Whole genome duplication (WGD) may have occurred twice, with an ancient WGD consistent with that shown to occur in other seed plants, and a more recent event specific to ginkgo. Abundant gene clusters from tandem duplication were also evident, and enrichment of expanded gene families indicates a remarkable array of chemical and antibacterial defense pathways. The ginkgo genome consists mainly of LTR-RTs resulting from ancient gradual accumulation and two WGD events. The multiple defense mechanisms underlying the characteristic resilience of ginkgo are fostered by a remarkable enrichment in ancient duplicated and ginkgo-specific gene clusters. The present study sheds light on sequencing large genomes, and opens an avenue for further genetic and evolutionary research.
Origin and Functional Prediction of Pollen Allergens in Plants

PubMed Central

Chen, Miaolin; Xu, Jie; Ren, Kang; Searle, Iain

2016-01-01

Pollen allergies have long been a major pandemic health problem for humans. However, the evolutionary events and biological function of pollen allergens in plants remain largely unknown. Here, we report the genome-wide prediction of pollen allergens and their biological function in the dicotyledonous model plant Arabidopsis (Arabidopsis thaliana) and the monocotyledonous model plant rice (Oryza sativa). In total, 145 and 107 pollen allergens were predicted from rice and Arabidopsis, respectively. These pollen allergens are putatively involved in stress responses and metabolic processes such as cell wall metabolism during pollen development. Interestingly, these putative pollen allergen genes were derived from large gene families and became diversified during evolution. Sequence analysis across 25 plant species from green alga to angiosperms suggests that about 40% of putative pollen allergenic proteins existed in both lower and higher plants, while other allergens emerged during evolution. Although a high proportion of gene duplication has been observed among allergen-coding genes, our data show that these genes might have undergone purifying selection during evolution. We also observed that epitopes of an allergen might have a biological function, as revealed by comprehensive analysis of two known allergens, expansin and profilin. This implies a crucial role of conserved amino acid residues in both in planta biological function and allergenicity. Finally, a model explaining how pollen allergens were generated and maintained in plants is proposed. Prediction and systematic analysis of pollen allergens in model plants suggest that pollen allergens evolved by gene duplication and then functional specification. This study provides insight into the phylogenetic and evolutionary scenario of pollen allergens that will be helpful to future characterization and epitope screening of pollen allergens. PMID:27436829
Origin and Functional Prediction of Pollen Allergens in Plants.

PubMed

Chen, Miaolin; Xu, Jie; Devis, Deborah; Shi, Jianxin; Ren, Kang; Searle, Iain; Zhang, Dabing

2016-09-01

Pollen allergies have long been a major pandemic health problem for humans. However, the evolutionary events and biological function of pollen allergens in plants remain largely unknown. Here, we report the genome-wide prediction of pollen allergens and their biological function in the dicotyledonous model plant Arabidopsis (Arabidopsis thaliana) and the monocotyledonous model plant rice (Oryza sativa). In total, 145 and 107 pollen allergens were predicted from rice and Arabidopsis, respectively. These pollen allergens are putatively involved in stress responses and metabolic processes such as cell wall metabolism during pollen development. Interestingly, these putative pollen allergen genes were derived from large gene families and became diversified during evolution. Sequence analysis across 25 plant species from green alga to angiosperms suggests that about 40% of putative pollen allergenic proteins existed in both lower and higher plants, while other allergens emerged during evolution. Although a high proportion of gene duplication has been observed among allergen-coding genes, our data show that these genes might have undergone purifying selection during evolution. We also observed that epitopes of an allergen might have a biological function, as revealed by comprehensive analysis of two known allergens, expansin and profilin. This implies a crucial role of conserved amino acid residues in both in planta biological function and allergenicity. Finally, a model explaining how pollen allergens were generated and maintained in plants is proposed. Prediction and systematic analysis of pollen allergens in model plants suggest that pollen allergens evolved by gene duplication and then functional specification. This study provides insight into the phylogenetic and evolutionary scenario of pollen allergens that will be helpful to future characterization and epitope screening of pollen allergens. © 2016 American Society of Plant Biologists. All rights reserved.
Towards physical principles of biological evolution

NASA Astrophysics Data System (ADS)

Katsnelson, Mikhail I.; Wolf, Yuri I.; Koonin, Eugene V.

2018-03-01

Biological systems reach organizational complexity that far exceeds the complexity of any known inanimate objects. Biological entities undoubtedly obey the laws of quantum physics and statistical mechanics. However, is modern physics sufficient to adequately describe, model and explain the evolution of biological complexity? Detailed parallels have been drawn between statistical thermodynamics and the population-genetic theory of biological evolution. Based on these parallels, we outline new perspectives on biological innovation and major transitions in evolution, and introduce a biological equivalent of thermodynamic potential that reflects the innovation propensity of an evolving population. Deep analogies have been suggested to also exist between the properties of biological entities and processes, and those of frustrated states in physics, such as glasses. Such systems are characterized by frustration whereby local states with minimal free energy conflict with the global minimum, resulting in ‘emergent phenomena’. We extend such analogies by examining frustration-type phenomena, such as conflicts between different levels of selection, in biological evolution. These frustration effects appear to drive the evolution of biological complexity. We further address evolution in multidimensional fitness landscapes from the point of view of percolation theory and suggest that percolation at a level above the critical threshold dictates the tree-like evolution of complex organisms. Taken together, these multiple connections between fundamental processes in physics and biology imply that construction of a meaningful physical theory of biological evolution might not be a futile effort. However, it is unrealistic to expect that such a theory can be created in one scoop; if it ever comes to being, this can only happen through integration of multiple physical models of evolutionary processes. 
Furthermore, the existing framework of theoretical physics is unlikely to suffice for adequate modeling of the biological level of complexity, and new developments within physics itself are likely to be required.
The Large Mitochondrial Genome of Symbiodinium minutum Reveals Conserved Noncoding Sequences between Dinoflagellates and Apicomplexans.

PubMed

Shoguchi, Eiichi; Shinzato, Chuya; Hisata, Kanako; Satoh, Nori; Mungpakdee, Sutada

2015-07-20

Even though mitochondrial genomes, which characterize eukaryotic cells, were first discovered more than 50 years ago, mitochondrial genomics remains an important topic in molecular biology and genome sciences. The Phylum Alveolata comprises three major groups (ciliates, apicomplexans, and dinoflagellates), the mitochondrial genomes of which have diverged widely. Even though the gene content of dinoflagellate mitochondrial genomes is reportedly comparable to that of apicomplexans, the highly fragmented and rearranged genome structures of dinoflagellates have frustrated whole genomic analysis. Consequently, noncoding sequences and gene arrangements of dinoflagellate mitochondrial genomes have not been well characterized. Here we report that the continuous assembled genome (∼326 kb) of the dinoflagellate, Symbiodinium minutum, is AT-rich (∼64.3%) and that it contains three protein-coding genes. Based upon in silico analysis, the remaining 99% of the genome comprises transcriptomic noncoding sequences. RNA edited sites and unique, possible start and stop codons clarify conserved regions among dinoflagellates. Our massive transcriptome analysis shows that almost all regions of the genome are transcribed, including 27 possible fragmented ribosomal RNA genes and 12 uncharacterized small RNAs that are similar to mitochondrial RNA genes of the malarial parasite, Plasmodium falciparum. Gene map comparisons show that gene order is only slightly conserved between S. minutum and P. falciparum. However, small RNAs and intergenic sequences share sequence similarities with P. falciparum, suggesting that the function of noncoding sequences has been preserved despite development of very different genome structures. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Subclonal diversification of primary breast cancer revealed by multiregion sequencing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yates, Lucy R.; Gerstung, Moritz; Knappskog, Stian

Sequencing cancer genomes may enable tailoring of therapeutics to the underlying biological abnormalities driving a particular patient's tumor. However, sequencing-based strategies rely heavily on representative sampling of tumors. To understand the subclonal structure of primary breast cancer, we applied whole-genome and targeted sequencing to multiple samples from each of 50 patients' tumors (303 samples in total). The extent of subclonal diversification varied among cases and followed spatial patterns. No strict temporal order was evident, with point mutations and rearrangements affecting the most common breast cancer genes, including PIK3CA, TP53, PTEN, BRCA2 and MYC, occurring early in some tumors and late in others. In 13 out of 50 cancers, potentially targetable mutations were subclonal. Landmarks of disease progression, such as resistance to chemotherapy and the acquisition of invasive or metastatic potential, arose within detectable subclones of antecedent lesions. These findings highlight the importance of including analyses of subclonal structure and tumor evolution in clinical trials of primary breast cancer.
Subclonal diversification of primary breast cancer revealed by multiregion sequencing

DOE PAGES

Yates, Lucy R.; Gerstung, Moritz; Knappskog, Stian; ...

2015-06-22

Sequencing cancer genomes may enable tailoring of therapeutics to the underlying biological abnormalities driving a particular patient's tumor. However, sequencing-based strategies rely heavily on representative sampling of tumors. To understand the subclonal structure of primary breast cancer, we applied whole-genome and targeted sequencing to multiple samples from each of 50 patients' tumors (303 samples in total). The extent of subclonal diversification varied among cases and followed spatial patterns. No strict temporal order was evident, with point mutations and rearrangements affecting the most common breast cancer genes, including PIK3CA, TP53, PTEN, BRCA2 and MYC, occurring early in some tumors and late in others. In 13 out of 50 cancers, potentially targetable mutations were subclonal. Landmarks of disease progression, such as resistance to chemotherapy and the acquisition of invasive or metastatic potential, arose within detectable subclones of antecedent lesions. These findings highlight the importance of including analyses of subclonal structure and tumor evolution in clinical trials of primary breast cancer.
[Screening specific recognition motif of RNA-binding proteins by SELEX in combination with next-generation sequencing technique].

PubMed

Zhang, Lu; Xu, Jinhao; Ma, Jinbiao

2016-07-25

RNA-binding proteins exert important biological functions by specifically recognizing RNA motifs. SELEX (systematic evolution of ligands by exponential enrichment), an in vitro selection method, can obtain consensus motifs with high affinity and specificity for many target molecules from DNA or RNA libraries. Here, we combined SELEX with next-generation sequencing to study protein-RNA interactions in vitro. A pool of RNAs with 20 bp random sequences was transcribed from a T7 promoter, and the target protein was inserted into a plasmid containing an SBP-tag, which can be captured by streptavidin beads. After only one cycle, the specific RNA motif could be obtained, which dramatically improved the selection efficiency. Using this method, we found that the human hnRNP A1 RRMs domain (UP1 domain) bound RNA motifs containing AGG and AG sequences. The EMSA experiment indicated that hnRNP A1 RRMs could bind the obtained RNA motif. Taken together, this approach provides a rapid and effective way to study the RNA binding specificity of proteins.
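As a rough illustration of how a motif such as AGG might be picked out of SELEX sequencing reads, the sketch below ranks k-mers by how much more frequent they are in a selected pool than in an unselected background pool. This is not the authors' pipeline: the function names, the pseudocount smoothing, and the toy reads are all assumptions made for the example.

```python
from collections import Counter


def kmer_counts(reads, k):
    """Count every overlapping k-mer across a list of reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def kmer_enrichment(selected, background, k=3, pseudocount=1.0):
    """Rank k-mers by (frequency in selected) / (frequency in background).

    A pseudocount is added to every k-mer so that k-mers absent from one
    pool do not cause a division by zero. Hypothetical helper, not taken
    from the abstract.
    """
    sel = kmer_counts(selected, k)
    bg = kmer_counts(background, k)
    kmers = set(sel) | set(bg)
    sel_total = sum(sel.values()) + pseudocount * len(kmers)
    bg_total = sum(bg.values()) + pseudocount * len(kmers)
    scores = {
        kmer: ((sel[kmer] + pseudocount) / sel_total)
        / ((bg[kmer] + pseudocount) / bg_total)
        for kmer in kmers
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy reads invented for illustration only (not data from the study).
selected_reads = ["UAGGGA", "CAGGUU", "AAGGCA"]
background_reads = ["UCUCUC", "GAUCGA", "ACGUAC"]
ranked = kmer_enrichment(selected_reads, background_reads, k=3)
print(ranked[0])
```

On real data the background would be the sequenced starting library, and a statistical test would normally replace the simple ratio.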
The DNA sequence of the human X chromosome

PubMed Central

Ross, Mark T.; Grafham, Darren V.; Coffey, Alison J.; Scherer, Steven; McLay, Kirsten; Muzny, Donna; Platzer, Matthias; Howell, Gareth R.; Burrows, Christine; Bird, Christine P.; Frankish, Adam; Lovell, Frances L.; Howe, Kevin L.; Ashurst, Jennifer L.; Fulton, Robert S.; Sudbrak, Ralf; Wen, Gaiping; Jones, Matthew C.; Hurles, Matthew E.; Andrews, T. Daniel; Scott, Carol E.; Searle, Stephen; Ramser, Juliane; Whittaker, Adam; Deadman, Rebecca; Carter, Nigel P.; Hunt, Sarah E.; Chen, Rui; Cree, Andrew; Gunaratne, Preethi; Havlak, Paul; Hodgson, Anne; Metzker, Michael L.; Richards, Stephen; Scott, Graham; Steffen, David; Sodergren, Erica; Wheeler, David A.; Worley, Kim C.; Ainscough, Rachael; Ambrose, Kerrie D.; Ansari-Lari, M. Ali; Aradhya, Swaroop; Ashwell, Robert I. S.; Babbage, Anne K.; Bagguley, Claire L.; Ballabio, Andrea; Banerjee, Ruby; Barker, Gary E.; Barlow, Karen F.; Barrett, Ian P.; Bates, Karen N.; Beare, David M.; Beasley, Helen; Beasley, Oliver; Beck, Alfred; Bethel, Graeme; Blechschmidt, Karin; Brady, Nicola; Bray-Allen, Sarah; Bridgeman, Anne M.; Brown, Andrew J.; Brown, Mary J.; Bonnin, David; Bruford, Elspeth A.; Buhay, Christian; Burch, Paula; Burford, Deborah; Burgess, Joanne; Burrill, Wayne; Burton, John; Bye, Jackie M.; Carder, Carol; Carrel, Laura; Chako, Joseph; Chapman, Joanne C.; Chavez, Dean; Chen, Ellson; Chen, Guan; Chen, Yuan; Chen, Zhijian; Chinault, Craig; Ciccodicola, Alfredo; Clark, Sue Y.; Clarke, Graham; Clee, Chris M.; Clegg, Sheila; Clerc-Blankenburg, Kerstin; Clifford, Karen; Cobley, Vicky; Cole, Charlotte G.; Conquer, Jen S.; Corby, Nicole; Connor, Richard E.; David, Robert; Davies, Joy; Davis, Clay; Davis, John; Delgado, Oliver; DeShazo, Denise; Dhami, Pawandeep; Ding, Yan; Dinh, Huyen; Dodsworth, Steve; Draper, Heather; Dugan-Rocha, Shannon; Dunham, Andrew; Dunn, Matthew; Durbin, K. 
James; Dutta, Ireena; Eades, Tamsin; Ellwood, Matthew; Emery-Cohen, Alexandra; Errington, Helen; Evans, Kathryn L.; Faulkner, Louisa; Francis, Fiona; Frankland, John; Fraser, Audrey E.; Galgoczy, Petra; Gilbert, James; Gill, Rachel; Glöckner, Gernot; Gregory, Simon G.; Gribble, Susan; Griffiths, Coline; Grocock, Russell; Gu, Yanghong; Gwilliam, Rhian; Hamilton, Cerissa; Hart, Elizabeth A.; Hawes, Alicia; Heath, Paul D.; Heitmann, Katja; Hennig, Steffen; Hernandez, Judith; Hinzmann, Bernd; Ho, Sarah; Hoffs, Michael; Howden, Phillip J.; Huckle, Elizabeth J.; Hume, Jennifer; Hunt, Paul J.; Hunt, Adrienne R.; Isherwood, Judith; Jacob, Leni; Johnson, David; Jones, Sally; de Jong, Pieter J.; Joseph, Shirin S.; Keenan, Stephen; Kelly, Susan; Kershaw, Joanne K.; Khan, Ziad; Kioschis, Petra; Klages, Sven; Knights, Andrew J.; Kosiura, Anna; Kovar-Smith, Christie; Laird, Gavin K.; Langford, Cordelia; Lawlor, Stephanie; Leversha, Margaret; Lewis, Lora; Liu, Wen; Lloyd, Christine; Lloyd, David M.; Loulseged, Hermela; Loveland, Jane E.; Lovell, Jamieson D.; Lozado, Ryan; Lu, Jing; Lyne, Rachael; Ma, Jie; Maheshwari, Manjula; Matthews, Lucy H.; McDowall, Jennifer; McLaren, Stuart; McMurray, Amanda; Meidl, Patrick; Meitinger, Thomas; Milne, Sarah; Miner, George; Mistry, Shailesh L.; Morgan, Margaret; Morris, Sidney; Müller, Ines; Mullikin, James C.; Nguyen, Ngoc; Nordsiek, Gabriele; Nyakatura, Gerald; O’Dell, Christopher N.; Okwuonu, Geoffery; Palmer, Sophie; Pandian, Richard; Parker, David; Parrish, Julia; Pasternak, Shiran; Patel, Dina; Pearce, Alex V.; Pearson, Danita M.; Pelan, Sarah E.; Perez, Lesette; Porter, Keith M.; Ramsey, Yvonne; Reichwald, Kathrin; Rhodes, Susan; Ridler, Kerry A.; Schlessinger, David; Schueler, Mary G.; Sehra, Harminder K.; Shaw-Smith, Charles; Shen, Hua; Sheridan, Elizabeth M.; Shownkeen, Ratna; Skuce, Carl D.; Smith, Michelle L.; Sotheran, Elizabeth C.; Steingruber, Helen E.; Steward, Charles A.; Storey, Roy; Swann, R. 
Mark; Swarbreck, David; Tabor, Paul E.; Taudien, Stefan; Taylor, Tineace; Teague, Brian; Thomas, Karen; Thorpe, Andrea; Timms, Kirsten; Tracey, Alan; Trevanion, Steve; Tromans, Anthony C.; d’Urso, Michele; Verduzco, Daniel; Villasana, Donna; Waldron, Lenee; Wall, Melanie; Wang, Qiaoyan; Warren, James; Warry, Georgina L.; Wei, Xuehong; West, Anthony; Whitehead, Siobhan L.; Whiteley, Mathew N.; Wilkinson, Jane E.; Willey, David L.; Williams, Gabrielle; Williams, Leanne; Williamson, Angela; Williamson, Helen; Wilming, Laurens; Woodmansey, Rebecca L.; Wray, Paul W.; Yen, Jennifer; Zhang, Jingkun; Zhou, Jianling; Zoghbi, Huda; Zorilla, Sara; Buck, David; Reinhardt, Richard; Poustka, Annemarie; Rosenthal, André; Lehrach, Hans; Meindl, Alfons; Minx, Patrick J.; Hillier, LaDeana W.; Willard, Huntington F.; Wilson, Richard K.; Waterston, Robert H.; Rice, Catherine M.; Vaudin, Mark; Coulson, Alan; Nelson, David L.; Weinstock, George; Sulston, John E.; Durbin, Richard; Hubbard, Tim; Gibbs, Richard A.; Beck, Stephan; Rogers, Jane; Bentley, David R.

2009-01-01

The human X chromosome has a unique biology that was shaped by its evolution as the sex chromosome shared by males and females. We have determined 99.3% of the euchromatic sequence of the X chromosome. Our analysis illustrates the autosomal origin of the mammalian sex chromosomes, the stepwise process that led to the progressive loss of recombination between X and Y, and the extent of subsequent degradation of the Y chromosome. LINE1 repeat elements cover one-third of the X chromosome, with a distribution that is consistent with their proposed role as way stations in the process of X-chromosome inactivation. We found 1,098 genes in the sequence, of which 99 encode proteins expressed in testis and in various tumour types. A disproportionately high number of mendelian diseases are documented for the X chromosome. Of this number, 168 have been explained by mutations in 113 X-linked genes, which in many cases were characterized with the aid of the DNA sequence. PMID:15772651
Human evolution: a tale from ancient genomes

PubMed Central

2017-01-01

The field of human ancient DNA (aDNA) has moved from mitochondrial sequencing that suffered from contamination and provided limited biological insights, to become a fully genomic discipline that is changing our conception of human history. Recent successes include the sequencing of extinct hominins, and true population genomic studies of Bronze Age populations. Among the emerging areas of aDNA research, the analysis of past epigenomes is set to provide more new insights into human adaptation and disease susceptibility through time. Starting as a mere curiosity, ancient human genetics has become a major player in the understanding of our evolutionary history. This article is part of the themed issue ‘Evo-devo in the genomics era, and the origins of morphological diversity’. PMID:27994125
MEGA5: Molecular Evolutionary Genetics Analysis Using Maximum Likelihood, Evolutionary Distance, and Maximum Parsimony Methods

PubMed Central

Tamura, Koichiro; Peterson, Daniel; Peterson, Nicholas; Stecher, Glen; Nei, Masatoshi; Kumar, Sudhir

2011-01-01

Comparative analysis of molecular sequence data is essential for reconstructing the evolutionary histories of species and inferring the nature and extent of selective forces shaping the evolution of genes and species. Here, we announce the release of Molecular Evolutionary Genetics Analysis version 5 (MEGA5), which is user-friendly software for mining online databases, building sequence alignments and phylogenetic trees, and using methods of evolutionary bioinformatics in basic biology, biomedicine, and evolution. The newest addition in MEGA5 is a collection of maximum likelihood (ML) analyses for inferring evolutionary trees, selecting best-fit substitution models (nucleotide or amino acid), inferring ancestral states and sequences (along with probabilities), and estimating evolutionary rates site-by-site. In computer simulation analyses, ML tree inference algorithms in MEGA5 compared favorably with other software packages in terms of computational efficiency and the accuracy of the estimates of phylogenetic trees, substitution parameters, and rate variation among sites. The MEGA user interface has now been enhanced to be activity driven, making it easier to use for both beginners and experienced scientists. This version of MEGA is intended for the Windows platform, and it has been configured for effective use on Mac OS X and Linux desktops. It is available free of charge from http://www.megasoftware.net. PMID:21546353
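As a rough illustration of the "evolutionary distance" family of methods named in the MEGA5 title, the sketch below computes a Jukes-Cantor (JC69) corrected distance between two aligned nucleotide sequences. It is not MEGA5 code; the function names `p_distance` and `jukes_cantor_distance` are hypothetical, and JC69 is only one of many substitution models such software may offer.

```python
import math


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Columns containing a gap ('-') in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable (gap-free) sites")
    return differing / compared


def jukes_cantor_distance(seq_a: str, seq_b: str) -> float:
    """JC69 distance: d = -(3/4) * ln(1 - (4/3) * p).

    Returns infinity when p >= 0.75, where the correction is undefined.
    """
    p = p_distance(seq_a, seq_b)
    if p >= 0.75:
        return math.inf
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)


if __name__ == "__main__":
    # Hypothetical toy sequences, one difference in eight sites.
    print(jukes_cantor_distance("ACGTACGT", "ACGTACGA"))
```

The corrected distance is always at least as large as the raw proportion of differences, because the model accounts for multiple substitutions at the same site.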
A Transcriptome Derived Female-Specific Marker from the Invasive Western Mosquitofish (Gambusia affinis)

PubMed Central

Lamatsch, Dunja K.; Adolfsson, Sofia; Senior, Alistair M.; Christiansen, Guntram; Pichler, Maria; Ozaki, Yuichi; Smeds, Linnea; Schartl, Manfred; Nakagawa, Shinichi

2015-01-01

Sex-specific markers are a prerequisite for understanding reproductive biology, genetic factors involved in sex differences, mechanisms of sex determination, and ultimately the evolution of sex chromosomes. The Western mosquitofish, Gambusia affinis, may be considered a model species for sex-chromosome evolution, as it displays female heterogamety (ZW/ZZ), and is also ecologically interesting as a worldwide invasive species. Here, de novo RNA-sequencing on the gonads of sexually mature G. affinis was used to identify contigs that were highly transcribed in females but not in males (i.e., transcripts with ovary-specific expression). Subsequently, 129 primer pairs spanning 79 contigs were tested by PCR to identify sex-specific transcripts. Of those primer pairs, one female-specific DNA marker was identified, Sanger sequenced and subsequently validated in 115 fish. Sequence analyses revealed a high similarity between the identified sex-specific marker and the 3′ UTR of the aminomethyl transferase (amt) gene of the closely related platyfish (Xiphophorus maculatus). This is the first time that RNA-seq has been used to successfully characterize a sex-specific marker in a fish species in the absence of a genome map. Additionally, the identified sex-specific marker represents one of only a handful of such markers in fishes. PMID:25707007

Dynamic epigenetic states of maize centromeres

PubMed Central

Liu, Yalin; Su, Handong; Zhang, Jing; Liu, Yang; Han, Fangpu; Birchler, James A.

2015-01-01

The centromere is a specialized chromosomal region identified as the major constriction, upon which the kinetochore complex is formed, ensuring accurate chromosome orientation and segregation during cell division. The rapid evolution of centromere DNA sequence and the conserved centromere function are two contradictory aspects of centromere biology. Indeed, the sole presence of genetic sequence is not sufficient for centromere formation. Various dicentric chromosomes with one inactive centromere have been recognized. It has also been found that de novo centromere formation is common on fragments in which centromeric DNA sequences are lost. Epigenetic factors play important roles in centromeric chromatin assembly and maintenance. Non-disjunction of the supernumerary B chromosome centromere is independent of centromere function, but centromere pairing during early prophase of meiosis I requires an active centromere. This review discusses recent studies in maize about genetic and epigenetic elements regulating formation and maintenance of centromere chromatin, as well as centromere behavior in meiosis. PMID:26579154
RiboDB Database: A Comprehensive Resource for Prokaryotic Systematics.

PubMed

Jauffrit, Frédéric; Penel, Simon; Delmotte, Stéphane; Rey, Carine; de Vienne, Damien M; Gouy, Manolo; Charrier, Jean-Philippe; Flandrois, Jean-Pierre; Brochier-Armanet, Céline

2016-08-01

Ribosomal proteins (r-proteins) are increasingly used as an alternative to ribosomal RNA (rRNA) for prokaryotic systematics. However, their routine use is difficult because r-proteins are often missing or wrongly annotated in complete genome sequences, and there is currently no dedicated exhaustive database of r-proteins. RiboDB aims to fill this gap. This comprehensive database, updated weekly, allows the fast and easy retrieval of r-protein sequences from publicly available complete prokaryotic genome sequences. The current version of RiboDB contains 90 r-proteins from 3,750 prokaryotic complete genomes encompassing 38 phyla/major classes and 1,759 different species. RiboDB is accessible at http://ribodb.univ-lyon1.fr and through ACNUC interfaces. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
A comparison of biological and cultural evolution.

PubMed

Portin, Petter

2015-03-01

This review begins with a definition of biological evolution and a description of its general principles. This is followed by a presentation of the biological basis of culture, specifically the concept of social selection. Further, conditions for cultural evolution are proposed, including a suggestion for language being the cultural replicator corresponding to the concept of the gene in biological evolution. Principles of cultural evolution are put forward and compared to the principles of biological evolution. Special emphasis is laid on the principle of selection in cultural evolution, including presentation of the concept of cultural fitness. The importance of language as a necessary condition for cultural evolution is stressed. Subsequently, prime differences between biological and cultural evolution are presented, followed by a discussion on interaction of our genome and our culture. The review aims at contributing to the present discussion concerning the modern development of the general theory of evolution, for example by giving a tentative formulation of the necessary and sufficient conditions for cultural evolution, and proposing that human creativity and mind reading or theory of mind are motors specific for it. The paper ends with the notion of the still ongoing coevolution of genes and culture.
A Topical Trajectory on Survival: an Analysis of Link-Making in a Sequence of Lessons on Evolution

NASA Astrophysics Data System (ADS)

Rocksén, Miranda; Olander, Clas

2017-04-01

This study explores the concept of link-making in relation to communicative strategies applied in the teaching and studying of biological evolution. The analysis focused on video recordings of 11 lessons on biological evolution conducted in a Swedish 9th grade class of students aged 15 years. It reveals how the teacher and students connected classroom conversations, the frequency of references to conversations in whole-class settings, and the development of a theme focusing on species survival and extinction. Detailed examples from the data illustrate how this theme developed from its initiation during the first lesson, through discussion and clarification, to its wrapping up during the last lesson. They further illustrate how students made sense of what the teacher said and wrote, and how the teacher postponed issues, explained and developed topics, provided opportunities for link-making, organised the class, motivated students, and checked their understanding. The study's methodological approach offers a way of including several time dimensions within research. Based on our findings, we conclude that the excerpts examined here did succeed in building 'islands of coherence' in the co-construction of curricular content. Moreover, the topical trajectory in relation to species survival provided opportunities for constructing a 'scientific story' in the classroom.
Genomic evidence for the emergence and evolution of pathogenicity and niche preferences in the genus Campylobacter.

PubMed

Iraola, Gregorio; Pérez, Ruben; Naya, Hugo; Paolicchi, Fernando; Pastor, Eugenia; Valenzuela, Sebastián; Calleros, Lucía; Velilla, Alejandra; Hernández, Martín; Morsella, Claudia

2014-09-04

The genus Campylobacter includes some of the most relevant pathogens for human and animal health; the continuous effort in their characterization has also revealed new species putatively involved in different kinds of infections. The genomic data now available for the genus comprise a wide variety of species with different pathogenic potential and niche preferences. In this work, we expand this available information by presenting the first genome for the species Campylobacter sputorum bv. sputorum, and use this and the already sequenced organisms to analyze the emergence and evolution of pathogenicity and niche preferences among Campylobacter species. We found that campylobacters can be unequivocally divided into established and putative pathogens depending on their repertoire of virulence genes, which have been horizontally acquired from other bacteria since the nonpathogenic Campylobacter ancestor emerged, and subsequently exchanged between some members of the genus. Additionally, we demonstrated the role of both horizontal gene transfers and diversifying evolution in niche preferences, and were able to distinguish genetic features associated with the tropism for oral, genital, and gastrointestinal tissues. In particular, we highlight the role of nonsynonymous evolution of disulphide bond proteins, the invasion antigen B (CiaB), and other secreted proteins in the determination of niche preferences. Our results arise from addressing the previously unmet goal of considering the whole available Campylobacter diversity for genome comparisons, unveiling notable genetic features that could explain particular phenotypes and set the basis for future research in Campylobacter biology. © The Author(s) 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
TaxI: a software tool for DNA barcoding using distance methods

PubMed Central

Steinke, Dirk; Vences, Miguel; Salzburger, Walter; Meyer, Axel

2005-01-01

DNA barcoding is a promising approach to the diagnosis of biological diversity in which DNA sequences serve as the primary key for information retrieval. Most existing software for evolutionary analysis of DNA sequences was designed for phylogenetic analyses and, hence, those algorithms do not offer appropriate solutions for the rapid but precise analyses needed for DNA barcoding, and are also unable to process the often large comparative datasets. We developed a flexible software tool for DNA taxonomy, named TaxI. This program calculates sequence divergences between a query sequence (taxon to be barcoded) and each sequence of a dataset of reference sequences defined by the user. Because the analysis is based on separate pairwise alignments, this software is also able to work with sequences characterized by multiple insertions and deletions that are difficult to align in large sequence sets (i.e. thousands of sequences) by multiple alignment algorithms because of computational restrictions. Here, we demonstrate the utility of this approach with two datasets of fish larvae and juveniles from Lake Constance and juvenile land snails under different models of sequence evolution. Sets of ribosomal 16S rRNA sequences, characterized by multiple indels, performed as well as or better than cox1 sequence sets in assigning sequences to species, demonstrating the suitability of rRNA genes for DNA barcoding. PMID:16214755
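The query-versus-reference idea described in this abstract can be sketched as follows: align the query to each reference separately, compute a divergence from each pairwise alignment, and rank the references. This is a minimal sketch and not the TaxI implementation; the Needleman-Wunsch scoring values, the uncorrected p-distance, and the names `global_align`, `divergence` and `rank_references` are all assumptions chosen for illustration.

```python
def global_align(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -2):
    """Needleman-Wunsch global alignment with a linear gap penalty.

    Returns the two aligned strings, with '-' marking gaps.
    """
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def divergence(query: str, reference: str) -> float:
    """Uncorrected p-distance over the gap-free columns of a pairwise alignment."""
    aligned_q, aligned_r = global_align(query.upper(), reference.upper())
    pairs = [(q, r) for q, r in zip(aligned_q, aligned_r) if q != "-" and r != "-"]
    if not pairs:
        return 1.0  # nothing comparable: treat as maximally divergent
    return sum(q != r for q, r in pairs) / len(pairs)


def rank_references(query: str, references: dict) -> list:
    """Return (name, divergence) pairs sorted from least to most divergent."""
    scored = [(name, divergence(query, seq)) for name, seq in references.items()]
    return sorted(scored, key=lambda item: item[1])


if __name__ == "__main__":
    # Hypothetical toy reference set; real barcodes are hundreds of bases long.
    refs = {"sp_A": "ACGTACGTAC", "sp_B": "ACGTTCGTAC", "sp_C": "TTTTGGGGCC"}
    for name, d in rank_references("ACGTACGTAC", refs):
        print(name, round(d, 3))
```

Because each reference is aligned to the query independently, the cost grows linearly with the number of references, which is the property the abstract contrasts with multiple alignment of thousands of sequences.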
Biology Professors' and Teachers' Positions Regarding Biological Evolution and Evolution Education in a Middle Eastern Society

ERIC Educational Resources Information Center

BouJaoude, Saouma; Asghar, Anila; Wiles, Jason R.; Jaber, Lama; Sarieddine, Diana; Alters, Brian

2011-01-01

This study investigated three questions: (1) What are Lebanese secondary school (Grade 9-12) biology teachers' and university biology professors' positions regarding biological evolution?, (2) How do participants' religious affiliations relate to their positions about evolutionary science?, and (3) What are participants' positions regarding…
The Influence of Religion and High School Biology Courses on Students' Knowledge of Evolution When They Enter College

ERIC Educational Resources Information Center

Moore, Randy; Cotner, Sehoya; Bates, Alex

2009-01-01

Students whose high school biology course included evolution but not creationism knew more about evolution when they entered college than did students whose courses included evolution plus creationism or whose courses included neither evolution nor creationism. Similarly, students who believed that their high school biology classes were the…
BGD: a database of bat genomes.

PubMed

Fang, Jianfei; Wang, Xuan; Mu, Shuo; Zhang, Shuyi; Dong, Dong

2015-01-01

Bats account for ~20% of mammalian species, and are the only mammals with true powered flight. Because of their specialized phenotypic traits, much research has been devoted to examining the evolution of bats. Although several whole genome sequences of bats have now been assembled and annotated, a uniform resource for the annotated bat genomes is still unavailable. To make the extensive data associated with the bat genomes accessible to the general biological community, we established a Bat Genome Database (BGD). BGD is an open-access, web-available portal that integrates available data on bat genomes and genes. It hosts data from six bat species, including two megabats and four microbats. Users can query the gene annotations using an efficient search engine, and browse tracks of the bat genomes. Furthermore, an easy-to-use phylogenetic analysis tool is provided to facilitate online phylogenetic study of genes. To the best of our knowledge, BGD is the first database of bat genomes. It will extend our understanding of bat evolution and benefit the analysis of bat sequences. BGD is freely available at: http://donglab.ecnu.edu.cn/databases/BatGenome/.
Not so bad after all: retroviruses and long terminal repeat retrotransposons as a source of new genes in vertebrates.

PubMed

Naville, M; Warren, I A; Haftek-Terreau, Z; Chalopin, D; Brunet, F; Levin, P; Galiana, D; Volff, J-N

2016-04-01

Viruses and transposable elements, once considered as purely junk and selfish sequences, have repeatedly been used as a source of novel protein-coding genes during the evolution of most eukaryotic lineages, a phenomenon called 'molecular domestication'. This is exemplified perfectly in mammals and other vertebrates, where many genes derived from long terminal repeat (LTR) retroelements (retroviruses and LTR retrotransposons) have been identified through comparative genomics and functional analyses. In particular, genes derived from gag structural protein and envelope (env) genes, as well as from the integrase-coding and protease-coding sequences, have been identified in humans and other vertebrates. Retroelement-derived genes are involved in many important biological processes including placenta formation, cognitive functions in the brain and immunity against retroelements, as well as in cell proliferation, apoptosis and cancer. These observations support an important role of retroelement-derived genes in the evolution and diversification of the vertebrate lineage. Copyright © 2016 European Society of Clinical Microbiology and Infectious Diseases. Published by Elsevier Ltd. All rights reserved.
Evolution-Based Functional Decomposition of Proteins

PubMed Central

Rivoire, Olivier; Reynolds, Kimberly A.; Ranganathan, Rama

2016-01-01

The essential biological properties of proteins—folding, biochemical activities, and the capacity to adapt—arise from the global pattern of interactions between amino acid residues. The statistical coupling analysis (SCA) is an approach to defining this pattern that involves the study of amino acid coevolution in an ensemble of sequences comprising a protein family. This approach indicates a functional architecture within proteins in which the basic units are coupled networks of amino acids termed sectors. This evolution-based decomposition has potential for new understandings of the structural basis for protein function. To facilitate its usage, we present here the principles and practice of the SCA and introduce new methods for sector analysis in a python-based software package (pySCA). We show that the pattern of amino acid interactions within sectors is linked to the divergence of functional lineages in a multiple sequence alignment—a model for how sector properties might be differentially tuned in members of a protein family. This work provides new tools for studying proteins and for generally testing the concept of sectors as the principal units of function and adaptive variation. PMID:27254668
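To make the idea of amino acid coevolution in a multiple sequence alignment concrete, the sketch below scores a pair of alignment columns by their mutual information. This is a deliberately simpler stand-in and not the SCA statistic or the pySCA API: SCA as described in the literature uses conservation-weighted correlations, whereas plain mutual information ignores conservation weighting. The function name `mutual_information` and the toy alignment are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(msa: list, i: int, j: int) -> float:
    """Mutual information (in bits) between columns i and j of an alignment.

    `msa` is a list of equal-length aligned sequences. High values mean that
    knowing the residue at column i helps predict the residue at column j.
    """
    n = len(msa)
    if n == 0:
        raise ValueError("alignment is empty")
    counts_i = Counter(seq[i] for seq in msa)
    counts_j = Counter(seq[j] for seq in msa)
    counts_ij = Counter((seq[i], seq[j]) for seq in msa)
    mi = 0.0
    for (a, b), count in counts_ij.items():
        p_ab = count / n
        p_a = counts_i[a] / n
        p_b = counts_j[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


if __name__ == "__main__":
    # Hypothetical toy alignment: columns 0 and 1 covary, column 2 is invariant.
    toy_msa = ["AKL", "AKL", "GEL", "GEL"]
    print(mutual_information(toy_msa, 0, 1))
    print(mutual_information(toy_msa, 0, 2))
```

In the toy alignment the first two columns change together across sequences, so they share one bit of information, while the invariant third column carries none. A real analysis would also need to correct for phylogenetic relatedness and finite sampling.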
The planetary biology of cytochrome P450 aromatases.

PubMed

Gaucher, Eric A; Graddy, Logan G; Li, Tang; Simmen, Rosalia C M; Simmen, Frank A; Schreiber, David R; Liberles, David A; Janis, Christine M; Benner, Steven A

2004-08-17

Joining a model for the molecular evolution of a protein family to the paleontological and geological records (geobiology), and then to the chemical structures of substrates, products, and protein folds, is emerging as a broad strategy for generating hypotheses concerning function in a post-genomic world. This strategy expands systems biology to a planetary context, necessary for a notion of fitness to underlie (as it must) any discussion of function within a biomolecular system. Here, we report an example of such an expansion, where tools from planetary biology were used to analyze three genes from the pig Sus scrofa that encode cytochrome P450 aromatases, enzymes that convert androgens into estrogens. The evolutionary history of the vertebrate aromatase gene family was reconstructed. Transition redundant exchange silent substitution metrics were used to interpolate dates for the divergence of family members, the paleontological record was consulted to identify changes in physiology that correlated in time with the change in molecular behavior, and new aromatase sequences from peccary were obtained. Metrics that detect changing function in proteins were then applied, including KA/KS values and those that exploit structural biology. These identified specific amino acid replacements that were associated with changing substrate and product specificity during the time of presumed adaptive change. The combined analysis suggests that aromatase paralogs arose in pigs as a result of selection for Suoidea with larger litters than their ancestors, and permitted the Suoidea to survive the global climatic trauma that began in the Eocene. This combination of bioinformatics analysis, molecular evolution, paleontology, cladistics, global climatology, structural biology, and organic chemistry serves as a paradigm in planetary biology. As the geological, paleontological, and genomic records improve, this approach should become widely useful to make systems biology statements about high-level function for biomolecular systems.
The planetary biology of cytochrome P450 aromatases

PubMed Central

Gaucher, Eric A; Graddy, Logan G; Li, Tang; Simmen, Rosalia CM; Simmen, Frank A; Schreiber, David R; Liberles, David A; Janis, Christine M; Benner, Steven A

2004-01-01

Background Joining a model for the molecular evolution of a protein family to the paleontological and geological records (geobiology), and then to the chemical structures of substrates, products, and protein folds, is emerging as a broad strategy for generating hypotheses concerning function in a post-genomic world. This strategy expands systems biology to a planetary context, necessary for a notion of fitness to underlie (as it must) any discussion of function within a biomolecular system. Results Here, we report an example of such an expansion, where tools from planetary biology were used to analyze three genes from the pig Sus scrofa that encode cytochrome P450 aromatases, enzymes that convert androgens into estrogens. The evolutionary history of the vertebrate aromatase gene family was reconstructed. Transition redundant exchange silent substitution metrics were used to interpolate dates for the divergence of family members, the paleontological record was consulted to identify changes in physiology that correlated in time with the change in molecular behavior, and new aromatase sequences from peccary were obtained. Metrics that detect changing function in proteins were then applied, including KA/KS values and those that exploit structural biology. These identified specific amino acid replacements that were associated with changing substrate and product specificity during the time of presumed adaptive change. The combined analysis suggests that aromatase paralogs arose in pigs as a result of selection for Suoidea with larger litters than their ancestors, and permitted the Suoidea to survive the global climatic trauma that began in the Eocene. Conclusions This combination of bioinformatics analysis, molecular evolution, paleontology, cladistics, global climatology, structural biology, and organic chemistry serves as a paradigm in planetary biology. As the geological, paleontological, and genomic records improve, this approach should become widely useful to make systems biology statements about high-level function for biomolecular systems. PMID:15315709
Understanding phylogenetic incongruence: lessons from phyllostomid bats

PubMed Central

Dávalos, Liliana M; Cirranello, Andrea L; Geisler, Jonathan H; Simmons, Nancy B

2012-01-01

All characters and trait systems in an organism share a common evolutionary history that can be estimated using phylogenetic methods. However, differential rates of change and the evolutionary mechanisms driving those rates result in pervasive phylogenetic conflict. These drivers need to be uncovered because mismatches between evolutionary processes and phylogenetic models can lead to high confidence in incorrect hypotheses. Incongruence between phylogenies derived from morphological versus molecular analyses, and between trees based on different subsets of molecular sequences has become pervasive as datasets have expanded rapidly in both characters and species. For more than a decade, evolutionary relationships among members of the New World bat family Phyllostomidae inferred from morphological and molecular data have been in conflict. Here, we develop and apply methods to minimize systematic biases, uncover the biological mechanisms underlying phylogenetic conflict, and outline data requirements for future phylogenomic and morphological data collection. We introduce new morphological data for phyllostomids and outgroups and expand previous molecular analyses to eliminate methodological sources of phylogenetic conflict such as taxonomic sampling, sparse character sampling, or use of different algorithms to estimate the phylogeny. We also evaluate the impact of biological sources of conflict: saturation in morphological changes and molecular substitutions, and other processes that result in incongruent trees, including convergent morphological and molecular evolution. Methodological sources of incongruence play some role in generating phylogenetic conflict, and are relatively easy to eliminate by matching taxa, collecting more characters, and applying the same algorithms to optimize phylogeny. The evolutionary patterns uncovered are consistent with multiple biological sources of conflict, including saturation in morphological and molecular changes, adaptive morphological convergence among nectar-feeding lineages, and incongruent gene trees. Applying methods to account for nucleotide sequence saturation reduces, but does not completely eliminate, phylogenetic conflict. We ruled out paralogy, lateral gene transfer, and poor taxon sampling and outgroup choices among the processes leading to incongruent gene trees in phyllostomid bats. Uncovering and countering the possible effects of introgression and lineage sorting of ancestral polymorphism on gene trees will require great leaps in genomic and allelic sequencing in this species-rich mammalian family. We also found evidence for adaptive molecular evolution leading to convergence in mitochondrial proteins among nectar-feeding lineages. In conclusion, the biological processes that generate phylogenetic conflict are ubiquitous, and overcoming incongruence requires better models and more data than have been collected even in well-studied organisms such as phyllostomid bats. PMID:22891620
Artificial Intelligence, DNA Mimicry, and Human Health.

PubMed

Stefano, George B; Kream, Richard M

2017-08-14

The molecular evolution of genomic DNA across diverse plant and animal phyla involved dynamic registrations of sequence modifications to maintain existential homeostasis to increasingly complex patterns of environmental stressors. As an essential corollary, driver effects of positive evolutionary pressure are hypothesized to effect concerted modifications of genomic DNA sequences to meet expanded platforms of regulatory controls for successful implementation of advanced physiological requirements. It is also clearly apparent that preservation of updated registries of advantageous modifications of genomic DNA sequences requires coordinate expansion of convergent cellular proofreading/error correction mechanisms that are encoded by reciprocally modified genomic DNA. Computational expansion of operationally defined DNA memory extends to coordinate modification of coding and previously under-emphasized noncoding regions that now appear to represent essential reservoirs of untapped genetic information amenable to evolutionary driven recruitment into the realm of biologically active domains. Additionally, expansion of DNA memory potential via chemical modification and activation of noncoding sequences is targeted to vertical augmentation and integration of an expanded cadre of transcriptional and epigenetic regulatory factors affecting linear coding of protein amino acid sequences within open reading frames.
Bioinformatics-based tools in drug discovery: the cartography from single gene to integrative biological networks.

PubMed

Ramharack, Pritika; Soliman, Mahmoud E S

2018-06-01

Originally developed for the analysis of biological sequences, bioinformatics has advanced into one of the most widely recognized domains in the scientific community. Despite this technological evolution, there is still an urgent need for nontoxic and efficient drugs. The onus now falls on the 'omics domain to meet this need by implementing bioinformatics techniques that will allow for the introduction of pioneering approaches in the rational drug design process. Here, we categorize an updated list of informatics tools and explore the capabilities of integrative bioinformatics in disease control. We believe that our review will serve as a comprehensive guide toward bioinformatics-oriented disease and drug discovery research. Copyright © 2018 Elsevier Ltd. All rights reserved.
Genome Studies on Nematophagous and Entomogenous Fungi in China

PubMed Central

Zhang, Weiwei; Cheng, Xiaoli; Liu, Xingzhong; Xiang, Meichun

2016-01-01

The nematophagous and entomogenous fungi are natural enemies of nematodes and insects and have been utilized by humans to control agricultural and forestry pests. Some of these fungi have been or are being developed as biological control agents in China and worldwide. Several important nematophagous and entomogenous fungi, including nematode-trapping fungi (Arthrobotrys oligospora and Drechslerella stenobrocha), a nematode endoparasite (Hirsutella minnesotensis), insect pathogens (Beauveria bassiana and Metarhizium spp.) and Chinese medicinal fungi (Ophiocordyceps sinensis and Cordyceps militaris), have been genome sequenced and extensively analyzed in China. This paper summarizes and reviews the biology, evolution, and pharmaceutical application of these fungi, and their interactions with host nematodes and insects, as revealed by genomes and by comparative genomics coupled with transcriptomes. PMID:29376926
T-lex2: genotyping, frequency estimation and re-annotation of transposable elements using single or pooled next-generation sequencing data.

PubMed

Fiston-Lavier, Anna-Sophie; Barrón, Maite G; Petrov, Dmitri A; González, Josefa

2015-02-27

Transposable elements (TEs) constitute the most active, diverse and ancient component in a broad range of genomes. Complete understanding of genome function and evolution cannot be achieved without a thorough understanding of TE impact and biology. However, in-depth analysis of TEs still represents a challenge due to the repetitive nature of these genomic entities. In this work, we present a broadly applicable and flexible tool: T-lex2. T-lex2 is the only available software that allows routine, automatic and accurate genotyping of individual TE insertions and estimation of their population frequencies using both individual strain and pooled next-generation sequencing data. Furthermore, T-lex2 also assesses the quality of the calls, allowing the identification of misannotated TEs and providing the necessary information to re-annotate them. The flexible and customizable design of T-lex2 allows it to be run on any genome and for any type of TE insertion. Here, we tested the fidelity of T-lex2 using the fly and human genomes. Overall, T-lex2 represents a significant improvement in our ability to analyze the contribution of TEs to genome function and evolution and to learn about the biology of TEs. T-lex2 is freely available online at http://sourceforge.net/projects/tlex. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Genome Evolution and Meiotic Maps by Massively Parallel DNA Sequencing: Spotted Gar, an Outgroup for the Teleost Genome Duplication

PubMed Central

Amores, Angel; Catchen, Julian; Ferrara, Allyse; Fontenot, Quenton; Postlethwait, John H.

2011-01-01

Genomic resources for hundreds of species of evolutionary, agricultural, economic, and medical importance are unavailable due to the expense of well-assembled genome sequences and difficulties with multigenerational studies. Teleost fish provide many models for human disease but possess anciently duplicated genomes that sometimes obfuscate connectivity. Genomic information representing a fish lineage that diverged before the teleost genome duplication (TGD) would provide an outgroup for exploring the mechanisms of evolution after whole-genome duplication. We exploited massively parallel DNA sequencing to develop meiotic maps with thrift and speed by genotyping F1 offspring of a single female and a single male spotted gar (Lepisosteus oculatus) collected directly from nature utilizing only polymorphisms existing in these two wild individuals. Using Stacks, software that automates the calling of genotypes from polymorphisms assayed by Illumina sequencing, we constructed a map containing 8406 markers. RNA-seq on two map-cross larvae provided a reference transcriptome that identified nearly 1000 mapped protein-coding markers and allowed genome-wide analysis of conserved synteny. Results showed that the gar lineage diverged from teleosts before the TGD and its genome is organized more similarly to that of humans than teleosts. Thus, spotted gar provides a critical link between medical models in teleost fish, to which gar is biologically similar, and humans, to which gar is genomically similar. Application of our F1 dense mapping strategy to species with no prior genome information promises to facilitate comparative genomics and provide a scaffold for ordering the numerous contigs arising from next generation genome sequencing. PMID:21828280

Assessment of Biology Majors' Versus Nonmajors' Views on Evolution, Creationism, and Intelligent Design.

PubMed

Paz-Y-Miño C, Guillermo; Espinosa, Avelina

2009-03-01

The controversy around evolution, creationism, and intelligent design resides in a historical struggle between scientific knowledge and popular belief. Four hundred seventy-six students (biology majors n = 237, nonmajors n = 239) at a secular liberal arts private university in the Northeastern United States responded to a five-question survey to assess their views about: (1) evolution, creationism, and intelligent design in the science class; (2) students' attitudes toward evolution; (3) students' position about the teaching of human evolution; (4) evolution in science exams; and (5) students' willingness to discuss evolution openly. Overall, 60.6% of biology majors and 42% of nonmajors supported the exclusive teaching of evolution in the science class, while 45.3% of nonmajors and 32% of majors were willing to learn equally about evolution, creationism, and intelligent design (question 1); 70.5% of biology majors and 55.6% of nonmajors valued the factual explanations evolution provides about the origin of life and its place in the universe (question 2); 78% of the combined responders (majors plus nonmajors) preferred science courses where evolution is discussed comprehensively and humans are part of it (question 3); 69% of the combined responders (majors plus nonmajors) had no problem answering questions concerning evolution in science exams (question 4); 48.1% of biology majors and 26.8% of nonmajors accepted evolution and expressed it openly, but 18.2% of the former and 14.2% of the latter accepted evolution privately; 46% of nonmajors and 29.1% of biology majors were reluctant to comment on this topic (question 5). Combined open plus private acceptance of evolution within biology majors increased with seniority, from freshmen (60.7%) to seniors (81%), presumably due to gradual exposure to upper-division biology courses with evolutionary content. College curricular/pedagogical reform should fortify evolution literacy at all education levels, particularly among nonbiologists.
Assessment of Biology Majors’ Versus Nonmajors’ Views on Evolution, Creationism, and Intelligent Design

PubMed Central

Paz-y-Miño C., Guillermo

2016-01-01

The controversy around evolution, creationism, and intelligent design resides in a historical struggle between scientific knowledge and popular belief. Four hundred seventy-six students (biology majors n = 237, nonmajors n = 239) at a secular liberal arts private university in the Northeastern United States responded to a five-question survey to assess their views about: (1) evolution, creationism, and intelligent design in the science class; (2) students’ attitudes toward evolution; (3) students’ position about the teaching of human evolution; (4) evolution in science exams; and (5) students’ willingness to discuss evolution openly. Overall, 60.6% of biology majors and 42% of nonmajors supported the exclusive teaching of evolution in the science class, while 45.3% of nonmajors and 32% of majors were willing to learn equally about evolution, creationism, and intelligent design (question 1); 70.5% of biology majors and 55.6% of nonmajors valued the factual explanations evolution provides about the origin of life and its place in the universe (question 2); 78% of the combined responders (majors plus nonmajors) preferred science courses where evolution is discussed comprehensively and humans are part of it (question 3); 69% of the combined responders (majors plus nonmajors) had no problem answering questions concerning evolution in science exams (question 4); 48.1% of biology majors and 26.8% of nonmajors accepted evolution and expressed it openly, but 18.2% of the former and 14.2% of the latter accepted evolution privately; 46% of nonmajors and 29.1% of biology majors were reluctant to comment on this topic (question 5). Combined open plus private acceptance of evolution within biology majors increased with seniority, from freshmen (60.7%) to seniors (81%), presumably due to gradual exposure to upper-division biology courses with evolutionary content. College curricular/pedagogical reform should fortify evolution literacy at all education levels, particularly among nonbiologists. PMID:26973732
The pomegranate (Punica granatum L.) genome provides insights into fruit quality and ovule developmental biology.

PubMed

Yuan, Zhaohe; Fang, Yanming; Zhang, Taikui; Fei, Zhangjun; Han, Fengming; Liu, Cuiyu; Liu, Min; Xiao, Wei; Zhang, Wenjing; Wu, Shan; Zhang, Mengwei; Ju, Youhui; Xu, Huili; Dai, He; Liu, Yujun; Chen, Yanhui; Wang, Lili; Zhou, Jianqing; Guan, Dian; Yan, Ming; Xia, Yanhua; Huang, Xianbin; Liu, Dongyuan; Wei, Hongmin; Zheng, Hongkun

2017-12-22

Pomegranate (Punica granatum L.) has an ancient cultivation history and has become an emerging profitable fruit crop due to its attractive features such as the bright red appearance and the high abundance of medicinally valuable ellagitannin-based compounds in its peel and aril. However, the limited genomic resources have restricted further elucidation of genetics and evolution of these interesting traits. Here, we report a 274-Mb high-quality draft pomegranate genome sequence, which covers approximately 81.5% of the estimated 336-Mb genome, consists of 2177 scaffolds with an N50 size of 1.7 Mb and contains 30 903 genes. Phylogenomic analysis supported that pomegranate belongs to the Lythraceae family rather than the monogeneric Punicaceae family, and comparative analyses showed that pomegranate and Eucalyptus grandis share the paleotetraploidy event. Integrated genomic and transcriptomic analyses provided insights into the molecular mechanisms underlying the biosynthesis of ellagitannin-based compounds, the colour formation in both peels and arils during pomegranate fruit development, and the unique ovule development processes that are characteristic of pomegranate. This genome sequence provides an important resource to expand our understanding of some unique biological processes and to facilitate both comparative biology studies and crop breeding. © 2017 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.
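The assembly statistics quoted above (scaffold N50 and the fraction of the estimated genome covered) can be illustrated with a minimal Python sketch. The scaffold lengths below are a hypothetical toy assembly, not the pomegranate scaffolds; only the 274-Mb and 336-Mb figures come from the abstract.

```python
def n50(lengths):
    """Return the N50: the length L such that scaffolds of length >= L
    together contain at least half of the total assembled bases."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical toy assembly (arbitrary units), for illustration only.
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_scaffolds)
print(toy_n50)  # 70

# Genome coverage as reported in the abstract: 274 Mb of an estimated 336 Mb.
assembled_mb, estimated_mb = 274, 336
coverage_pct = 100 * assembled_mb / estimated_mb
print(f"{coverage_pct:.1f}%")  # 81.5%
```

The second calculation simply confirms that 274 Mb out of 336 Mb corresponds to the stated figure of approximately 81.5%.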
Ecotoxicological criteria for final storage quality: Possibilities and limits

NASA Astrophysics Data System (ADS)

Zeyer, Josef; Meyer, Joseph

Landfills are complex chemical and biological reactors whose internal processes are often beyond the immediate control of process engineers. Therefore, the concept of a "Final Storage Landfill" may be deceptive. Furthermore, traditional approaches to establishing discharge criteria and treatment requirements for industrial effluents may not work well for landfill emissions. Factories can often be treated as steady-state processes whose inputs and outputs are predictable; however, landfills are batch reactors whose contents and emissions may be unknown and will vary temporally and spatially. If the contents of a landfill are known, the sequence of chemical reactions can be predicted qualitatively. Even if that sequence is predictable, though, quantitative ecotoxicological criteria will be difficult to establish, and risk assessments based on chemical "laundry lists" will be questionable. The situation is not hopeless, though. New approaches can be developed to monitor and predict landfill emissions. We believe these will include (1) testing (biological and chemical) of internal components of landfills as well as emissions; (2) development of laboratory and/or field methods in which the chemical and biological evolution of landfills can be studied at accelerated rates, thus allowing better prediction of future emissions; and (3) flexible ecotoxicological criteria that are adaptable to the evolving nature of landfill emissions. These criteria should be based on complementary chemical analyses and biological tests that fit into a hierarchical (decision-tree) hazard assessment strategy.
[Biological evolution and ancient DNA].

PubMed

Debruyne, Régis; Barriel, Véronique

2006-05-01

Twenty years after the advent of ancient DNA studies, the discipline seems to have reached the maturity it formerly lacked to fulfil its objectives. In its early development, paleogenetics, as it is now known, had to cope with very limited data because of the technical limitations of molecular biology. This led to phylogenetic hypotheses that were often limited in scope, and sometimes to unfocused or even spurious results that made the scientific community reluctant to accept the field. That period now seems to be over: huge numbers of sequences have become available, overcoming the former limitations and bridging the gap between paleogenetics, genomics and population biology. Recent studies of the charismatic woolly mammoth (independent sequencing of the whole mitochondrial genome and of millions of base pairs of the nuclear genome) exemplify the growing accuracy of ancient DNA studies thanks to new molecular approaches. From the earliest publications up to now, the number of available mammoth nucleotides has been multiplied by 100,000. Likewise, population-level approaches to ice-age taxa provide new historical scenarios for the diversification and extinction of the Pleistocene megafauna on the one hand, and for the domestication of animal and plant species by humans on the other. They also shed light on the differing structure of molecular diversity between short-term population-level research (below 2 My) and long-term (over 2 My) phylogenetic approaches. All these results confirm the growing importance of paleogenetics among the disciplines of evolutionary biology.
PARALLEL RACE FORMATION AND THE EVOLUTION OF MIMICRY IN HELICONIUS BUTTERFLIES: A PHYLOGENETIC HYPOTHESIS FROM MITOCHONDRIAL DNA SEQUENCES.

PubMed

Brower, Andrew V Z

1996-02-01

Mimicry has been a fundamental focus of research since the birth of evolutionary biology yet rarely has been studied from a phylogenetic perspective beyond the simple recognition that mimics are not similar due to common descent. The difficulty of finding characters to discern relationships among closely related and convergent taxa has challenged systematists for more than a century. The phenotypic diversity of wing patterns among mimetic Heliconius adds an additional twist to the problem, because single species contain more than a dozen radically different-looking geographical races even though the mimetic advantage is theoretically highest when all individuals within and between species appear the same. Mitochondrial DNA (mtDNA) offers an independent way to address these issues. In this study, Cytochrome Oxidase I and II sequences from multiple, parallel races of Heliconius erato and Heliconius melpomene are examined, to estimate intraspecific phylogeny and gauge sequence divergence and ages of clades among races within each species. Although phenotypes of sympatric races exhibit remarkable concordance between the two species, the mitochondrial cladograms show that the species have not shared a common evolutionary history. H. erato exhibits a basal split between trans- and cis-Andean groups of races, whereas H. melpomene originates in the Guiana Shield. Diverse races in either species appear to have evolved within the last 200,000 yr, and convergent phenotypes have evolved independently within as well as between species. These results contradict prior theories of the evolution of mimicry based on analysis of wing-pattern genetics. © 1996 The Society for the Study of Evolution.
Darwin and Genetics

PubMed Central

Charlesworth, Brian; Charlesworth, Deborah

2009-01-01

Darwin's theory of natural selection lacked an adequate account of inheritance, making it logically incomplete. We review the interaction between evolution and genetics, showing how, unlike Mendel, Darwin's lack of a model of the mechanism of inheritance left him unable to interpret his own data that showed Mendelian ratios, even though he shared with Mendel a more mathematical and probabilistic outlook than most biologists of his time. Darwin's own “pangenesis” model provided a mechanism for generating ample variability on which selection could act. It involved, however, the inheritance of characters acquired during an organism's life, which Darwin himself knew could not explain some evolutionary situations. Once the particulate basis of genetics was understood, it was seen to allow variation to be passed intact to new generations, and evolution could then be understood as a process of changes in the frequencies of stable variants. Evolutionary genetics subsequently developed as a central part of biology. Darwinian principles now play a greater role in biology than ever before, which we illustrate with some examples of studies of natural selection that use DNA sequence data and with some recent advances in answering questions first asked by Darwin. PMID:19933231
Genetic Differences Between Humans and Great Apes -- Implications for the Evolution of Humans

NASA Astrophysics Data System (ADS)

Varki, Ajit

2004-06-01

At the level of individual protein sequences, humans are 97-100% identical to the great apes, our closest evolutionary relatives. The evolution of humans (and of human intelligence) from a common ancestor with the chimpanzee and bonobo involved many steps, influenced by interactions amongst factors of genetic, developmental, ecological, microbial, climatic, behavioral, cultural and social origin. The genetic factors can be approached by direct comparisons of human and great ape genomes, genes and gene products, and by elucidating biochemical and biological consequences of any differences found. We have discovered multiple genetic and biochemical differences between humans and great apes, particularly with respect to a family of cell surface molecules called sialic acids, as well as in the metabolism of thyroid hormones. The hormone differences have potential consequences for human brain development. The differences in sialic acid biology have multiple implications for the human condition, ranging from susceptibility or resistance to microbial pathogens, effects on endogenous receptors in the immune system, and potential effects on placental signaling, expression of oncofetal antigens in cancers, consequences of dietary intake of animal foods, and development of the mammalian brain.
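A figure such as "97-100% identical" is a pairwise percent identity over aligned protein sequences. The sketch below shows one common way to compute it, assuming the two sequences are already aligned and that gap columns are excluded from the denominator (conventions differ between tools). The sequences are hypothetical toy data, not real human or ape proteins.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip gap columns (an assumed convention)
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100 * matches / compared


# Hypothetical 40-residue toy protein and a variant with one substitution.
seq_a = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQAPILSRV"
seq_b = seq_a[:9] + "K" + seq_a[10:]  # R -> K at position 10

identity = percent_identity(seq_a, seq_b)
print(identity)  # 97.5
```

With one difference in 40 aligned residues the identity is 97.5%, which falls within the range quoted in the abstract.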
MicroScope: a platform for microbial genome annotation and comparative genomics

PubMed Central

Vallenet, D.; Engelen, S.; Mornico, D.; Cruveiller, S.; Fleury, L.; Lajus, A.; Rouy, Z.; Roche, D.; Salvignol, G.; Scarpelli, C.; Médigue, C.

2009-01-01

The initial outcome of genome sequencing is the creation of long text strings written in a four letter alphabet. The role of in silico sequence analysis is to assist biologists in the act of associating biological knowledge with these sequences, allowing investigators to make inferences and predictions that can be tested experimentally. A wide variety of software is available to the scientific community, and can be used to identify genomic objects, before predicting their biological functions. However, only a limited number of biologically interesting features can be revealed from an isolated sequence. Comparative genomics tools, on the other hand, by bringing together the information contained in numerous genomes simultaneously, allow annotators to make inferences based on the idea that evolution and natural selection are central to the definition of all biological processes. We have developed the MicroScope platform in order to offer a web-based framework for the systematic and efficient revision of microbial genome annotation and comparative analysis (http://www.genoscope.cns.fr/agc/microscope). Starting with the description of the flow chart of the annotation processes implemented in the MicroScope pipeline, and the development of traditional and novel microbial annotation and comparative analysis tools, this article emphasizes the essential role of expert annotation as a complement of automatic annotation. Several examples illustrate the use of implemented tools for the review and curation of annotations of both new and publicly available microbial genomes within MicroScope’s rich integrated genome framework. The platform is used as a viewer in order to browse updated annotation information of available microbial genomes (more than 440 organisms to date), and in the context of new annotation projects (117 bacterial genomes). 
The human expertise gathered in the MicroScope database (about 280,000 independent annotations) contributes to improve the quality of microbial genome annotation, especially for genomes initially analyzed by automatic procedures alone. Database URLs: http://www.genoscope.cns.fr/agc/mage and http://www.genoscope.cns.fr/agc/microcyc PMID:20157493
MicroScope: a platform for microbial genome annotation and comparative genomics.

PubMed

Vallenet, D; Engelen, S; Mornico, D; Cruveiller, S; Fleury, L; Lajus, A; Rouy, Z; Roche, D; Salvignol, G; Scarpelli, C; Médigue, C

2009-01-01

The initial outcome of genome sequencing is the creation of long text strings written in a four letter alphabet. The role of in silico sequence analysis is to assist biologists in the act of associating biological knowledge with these sequences, allowing investigators to make inferences and predictions that can be tested experimentally. A wide variety of software is available to the scientific community, and can be used to identify genomic objects, before predicting their biological functions. However, only a limited number of biologically interesting features can be revealed from an isolated sequence. Comparative genomics tools, on the other hand, by bringing together the information contained in numerous genomes simultaneously, allow annotators to make inferences based on the idea that evolution and natural selection are central to the definition of all biological processes. We have developed the MicroScope platform in order to offer a web-based framework for the systematic and efficient revision of microbial genome annotation and comparative analysis (http://www.genoscope.cns.fr/agc/microscope). Starting with the description of the flow chart of the annotation processes implemented in the MicroScope pipeline, and the development of traditional and novel microbial annotation and comparative analysis tools, this article emphasizes the essential role of expert annotation as a complement of automatic annotation. Several examples illustrate the use of implemented tools for the review and curation of annotations of both new and publicly available microbial genomes within MicroScope's rich integrated genome framework. The platform is used as a viewer in order to browse updated annotation information of available microbial genomes (more than 440 organisms to date), and in the context of new annotation projects (117 bacterial genomes). 
The human expertise gathered in the MicroScope database (about 280,000 independent annotations) contributes to improve the quality of microbial genome annotation, especially for genomes initially analyzed by automatic procedures alone. Database URLs: http://www.genoscope.cns.fr/agc/mage and http://www.genoscope.cns.fr/agc/microcyc.
MicroTrout: A comprehensive, genome-wide miRNA target prediction framework for rainbow trout, Oncorhynchus mykiss.

PubMed

Mennigen, Jan A; Zhang, Dapeng

2016-12-01

Rainbow trout represent an important teleost research model and aquaculture species. As such, rainbow trout are employed in diverse areas of biological research, including basic biological disciplines such as comparative physiology, toxicology, and, since rainbow trout have undergone both teleost- and salmonid-specific rounds of genome duplication, molecular evolution. In recent years, microRNAs (miRNAs, small non-protein coding RNAs) have emerged as important posttranscriptional regulators of gene expression in animals. Given the increasingly recognized importance of miRNAs as an additional layer in the regulation of gene expression and hence biological function, recent efforts using RNA- and genome sequencing approaches have resulted in the creation of several resources for the construction of a comprehensive repertoire of rainbow trout miRNAs and isomiRs (variant miRNA sequences that all appear to derive from the same gene but vary in sequence due to post-transcriptional processing). Importantly, through the recent publication of the rainbow trout genome (Berthelot et al., 2014), mRNA 3'UTR information has become available, allowing for the first time the genome-wide prediction of miRNA-target RNA relationships in this species. We here report the creation of the microtrout database, a comprehensive resource for rainbow trout miRNA and annotated 3'UTRs. The comprehensive database was used to implement an algorithm to predict genome-wide rainbow trout-specific miRNA-mRNA target relationships, generating an improved predictive framework over previously published approaches. This work will serve as a useful framework and sequence resource to experimentally address the role of miRNAs in several research areas using the rainbow trout model, examples of which are discussed. Copyright © 2016 Elsevier Inc. All rights reserved.
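The abstract does not describe the microtrout prediction algorithm itself. As a hedged illustration of a step that many miRNA target predictors share, the Python sketch below scans a 3'UTR for sites complementary to the miRNA seed (nucleotides 2-8). The function names and the toy UTR are hypothetical; the miRNA is a let-7a-like sequence used purely as an example.

```python
# Watson-Crick complement table for RNA.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_site(mirna, start=2, end=8):
    """Return the mRNA site (5'->3') complementary to miRNA nucleotides start..end."""
    seed = mirna[start - 1:end]
    return seed.translate(COMPLEMENT)[::-1]  # reverse complement


def find_seed_matches(mirna, utr):
    """Return 0-based start positions of perfect seed matches in the 3'UTR."""
    site = seed_site(mirna)
    positions = []
    index = utr.find(site)
    while index != -1:
        positions.append(index)
        index = utr.find(site, index + 1)
    return positions


# Illustrative inputs only (not taken from the microtrout database).
mirna = "UGAGGUAGUAGGUUGUAUAGUU"
toy_utr = "AAGCUACCUCAUUUGGCUACCUCAA"

print(seed_site(mirna))                   # CUACCUC
print(find_seed_matches(mirna, toy_utr))  # [3, 16]
```

Real predictors typically add further criteria on top of seed matching, such as site context, conservation or pairing energy; none of these are modelled here.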
Detecting gene subnetworks under selection in biological pathways.

PubMed

Gouy, Alexandre; Daub, Joséphine T; Excoffier, Laurent

2017-09-19

Advances in high throughput sequencing technologies have created a gap between data production and functional data analysis. Indeed, phenotypes result from interactions between numerous genes, but traditional methods treat loci independently, missing important knowledge brought by network-level emerging properties. Therefore, detecting selection acting on multiple genes affecting the evolution of complex traits remains challenging. In this context, gene network analysis provides a powerful framework to study the evolution of adaptive traits and facilitates the interpretation of genome-wide data. We developed a method to analyse gene networks that is suitable to evidence polygenic selection. The general idea is to search biological pathways for subnetworks of genes that directly interact with each other and that present unusual evolutionary features. Subnetwork search is a typical combinatorial optimization problem that we solve using a simulated annealing approach. We have applied our methodology to find signals of adaptation to high-altitude in human populations. We show that this adaptation has a clear polygenic basis and is influenced by many genetic components. Our approach, implemented in the R package signet, improves on gene-level classical tests for selection by identifying both new candidate genes and new biological processes involved in adaptation to altitude. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
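The subnetwork search described above can be sketched as a simulated annealing loop. This is a minimal illustration under stated assumptions and not the signet implementation: the subnetwork size is fixed, the objective is the plain sum of per-gene scores, and each move swaps one member gene for a neighbouring gene while keeping the subnetwork connected. The toy pathway, scores and all function names are hypothetical.

```python
import math
import random


def is_connected(nodes, adjacency):
    """Check that `nodes` induce a connected subgraph (depth-first search)."""
    nodes = set(nodes)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        current = stack.pop()
        for neighbour in adjacency[current]:
            if neighbour in nodes and neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen == nodes


def frontier_of(nodes, adjacency):
    """Genes adjacent to the subnetwork but not in it (sorted for determinism)."""
    return sorted({nb for n in nodes for nb in adjacency[n]} - nodes)


def anneal_subnetwork(adjacency, scores, size, steps=2000,
                      t0=1.0, cooling=0.995, seed=0):
    """Search for a high-scoring connected subnetwork of `size` genes."""
    rng = random.Random(seed)
    current = {rng.choice(sorted(adjacency))}
    while len(current) < size:  # grow a random connected starting point
        frontier = frontier_of(current, adjacency)
        if not frontier:
            raise ValueError("connected component smaller than requested size")
        current.add(rng.choice(frontier))

    def total(nodes):
        return sum(scores[n] for n in nodes)

    best, temperature = set(current), t0
    for _ in range(steps):
        frontier = frontier_of(current, adjacency)
        if frontier:
            candidate = set(current)
            candidate.remove(rng.choice(sorted(current)))
            candidate.add(rng.choice(frontier))
            if is_connected(candidate, adjacency):
                delta = total(candidate) - total(current)
                # Metropolis rule: always accept improvements, sometimes accept losses.
                if delta >= 0 or rng.random() < math.exp(delta / temperature):
                    current = candidate
                    if total(current) > total(best):
                        best = set(current)
        temperature *= cooling  # geometric cooling schedule
    return best, total(best)


# Hypothetical toy pathway: undirected gene-gene interactions and per-gene scores.
toy_adjacency = {
    "A": {"B"}, "B": {"A", "C"}, "C": {"B", "D", "E"},
    "D": {"C", "E"}, "E": {"C", "D", "F"}, "F": {"E"},
}
toy_scores = {"A": 0.1, "B": 0.3, "C": 2.0, "D": 1.8, "E": 2.2, "F": 0.2}

best_nodes, best_score = anneal_subnetwork(toy_adjacency, toy_scores, size=3)
print(sorted(best_nodes), round(best_score, 2))
```

Accepting some score-decreasing swaps early on, while the temperature is high, is what lets the search escape local optima; as the temperature falls the procedure approaches a greedy search.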
A Powerful Toolkit for Synthetic Biology: Over 3.8 Billion Years of Evolution

NASA Technical Reports Server (NTRS)

Rothschild, Lynn J.

2010-01-01

The combination of evolutionary with engineering principles will enhance synthetic biology. Conversely, synthetic biology has the potential to enrich evolutionary biology by explaining why some adaptive space is empty, on Earth or elsewhere. Synthetic biology, the design and construction of artificial biological systems, substitutes bio-engineering for evolution, which is seen as an obstacle. But because evolution has produced the complexity and diversity of life, it provides a proven toolkit of genetic materials and principles available to synthetic biology. Evolution operates on the population level, with the populations composed of unique individuals that are historical entities. The source of genetic novelty includes mutation, gene regulation, sex, symbiosis, and interspecies gene transfer. At a phenotypic level, variation derives from regulatory control, replication and diversification of components, compartmentalization, sexual selection and speciation, among others. Variation is limited by physical constraints such as diffusion, and chemical constraints such as reaction rates and membrane fluidity. While some of these tools of evolution are currently in use in synthetic biology, all ought to be examined for utility. A hybrid approach of synthetic biology coupled with fine-tuning through evolution is suggested.
A powerful toolkit for synthetic biology: Over 3.8 billion years of evolution.

PubMed

Rothschild, Lynn J

2010-04-01

The combination of evolutionary with engineering principles will enhance synthetic biology. Conversely, synthetic biology has the potential to enrich evolutionary biology by explaining why some adaptive space is empty, on Earth or elsewhere. Synthetic biology, the design and construction of artificial biological systems, substitutes bio-engineering for evolution, which is seen as an obstacle. But because evolution has produced the complexity and diversity of life, it provides a proven toolkit of genetic materials and principles available to synthetic biology. Evolution operates on the population level, with the populations composed of unique individuals that are historical entities. The source of genetic novelty includes mutation, gene regulation, sex, symbiosis, and interspecies gene transfer. At a phenotypic level, variation derives from regulatory control, replication and diversification of components, compartmentalization, sexual selection and speciation, among others. Variation is limited by physical constraints such as diffusion, and chemical constraints such as reaction rates and membrane fluidity. While some of these tools of evolution are currently in use in synthetic biology, all ought to be examined for utility. A hybrid approach of synthetic biology coupled with fine-tuning through evolution is suggested.
Convergent Evolution of Rumen Microbiomes in High-Altitude Mammals.

PubMed

Zhang, Zhigang; Xu, Dongming; Wang, Li; Hao, Junjun; Wang, Jinfeng; Zhou, Xin; Wang, Weiwei; Qiu, Qiang; Huang, Xiaodan; Zhou, Jianwei; Long, Ruijun; Zhao, Fangqing; Shi, Peng

2016-07-25

Studies of genetic adaptation, a central focus of evolutionary biology, most often focus on the host's genome and only rarely on its co-evolved microbiome. The Qinghai-Tibetan Plateau (QTP) offers one of the most extreme environments for the survival of human and other mammalian species. Yaks (Bos grunniens) and Tibetan sheep (T-sheep) (Ovis aries) have adaptations for living in this harsh high-altitude environment, where nomadic Tibetan people keep them primarily for food and livelihood [1]. Adaptive evolution affects energy-metabolism-related genes in a way that helps these ruminants live at high altitude [2, 3]. Herein, we report convergent evolution of rumen microbiomes for energy harvesting persistence in two typical high-altitude ruminants, yaks and T-sheep. Both ruminants yield significantly lower levels of methane and higher yields of volatile fatty acids (VFAs) than their low-altitude relatives, cattle (Bos taurus) and ordinary sheep (Ovis aries). Ultra-deep metagenomic sequencing reveals significant enrichment in VFA-yielding pathways of rumen microbial genes in high-altitude ruminants, whereas methanogenesis pathways show enrichment in the cattle metagenome. Analyses of RNA transcriptomes reveal significant upregulation in 36 genes associated with VFA transport and absorption in the ruminal epithelium of high-altitude ruminants. Our study provides novel insights into the contributions of microbiomes to adaptive evolution in mammals and sheds light on the biological control of greenhouse gas emissions from livestock enteric fermentation. Copyright © 2016 Elsevier Ltd. All rights reserved.
Cellular and Molecular Biological Approaches to Interpreting Ancient Biomarkers

NASA Astrophysics Data System (ADS)

Newman, Dianne K.; Neubauer, Cajetan; Ricci, Jessica N.; Wu, Chia-Hung; Pearson, Ann

2016-06-01

Our ability to read the molecular fossil record has advanced significantly in the past decade. Improvements in biomarker sampling and quantification methods, expansion of molecular sequence databases, and the application of genetic and cellular biological tools to problems in biomarker research have enabled much of this progress. By way of example, we review how attempts to understand the biological function of 2-methylhopanoids in modern bacteria have changed our interpretation of what their molecular fossils tell us about the early history of life. They were once thought to be biomarkers of cyanobacteria and hence the evolution of oxygenic photosynthesis, but we now believe that 2-methylhopanoid biosynthetic capacity originated in the Alphaproteobacteria, that 2-methylhopanoids are regulated in response to stress, and that hopanoid 2-methylation enhances membrane rigidity. We present a new interpretation of 2-methylhopanes that bridges the gap between studies of the functions of 2-methylhopanoids and their patterns of occurrence in the rock record.
Angiosperm phylogeny inferred from multiple genes as a tool for comparative biology.

PubMed

Soltis, P S; Soltis, D E; Chase, M W

1999-11-25

Comparative biology requires a firm phylogenetic foundation to uncover and understand patterns of diversification and evaluate hypotheses of the processes responsible for these patterns. In the angiosperms, studies of diversification in floral form, stamen organization, reproductive biology, photosynthetic pathway, nitrogen-fixing symbioses and life histories have relied on either explicit or implied phylogenetic trees. Furthermore, to understand the evolution of specific genes and gene families, evaluate the extent of conservation of plant genomes and make proper sense of the huge volume of molecular genetic data available for model organisms such as Arabidopsis, Antirrhinum, maize, rice and wheat, a phylogenetic perspective is necessary. Here we report the results of parsimony analyses of DNA sequences of the plastid genes rbcL and atpB and the nuclear 18S rDNA for 560 species of angiosperms and seven non-flowering seed plants and show a well-resolved and well-supported phylogenetic tree for the angiosperms for use in comparative biology.
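Parsimony analysis scores a candidate tree by the minimum number of character changes it requires. The sketch below uses Fitch's small-parsimony algorithm to score a fixed rooted binary tree; it does not perform the tree search that a 567-taxon analysis would need, and it is not a reconstruction of the authors' workflow. The four taxa and their sequences are hypothetical.

```python
def fitch(tree, states):
    """Fitch's algorithm for one site.

    `tree` is a leaf name (str) or a pair (left, right) of subtrees.
    Returns (possible ancestral states, minimum number of changes).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, states)
    right_set, right_cost = fitch(right, states)
    common = left_set & right_set
    if common:  # children agree on at least one state: no extra change
        return common, left_cost + right_cost
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_length(tree, alignment):
    """Sum the Fitch cost over all alignment columns."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for site in range(n_sites):
        column = {taxon: seq[site] for taxon, seq in alignment.items()}
        total += fitch(tree, column)[1]
    return total


# Hypothetical toy alignment for four taxa.
toy_alignment = {"A": "ACGT", "B": "ACGT", "C": "TCGA", "D": "TCAA"}

tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))

print(parsimony_length(tree_1, toy_alignment))  # 3
print(parsimony_length(tree_2, toy_alignment))  # 5
```

Under the parsimony criterion the first topology is preferred because it explains the toy data with fewer changes (3 versus 5).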
HIV-1 tropism testing in subjects achieving undetectable HIV-1 RNA: diagnostic accuracy, viral evolution and compartmentalization.

PubMed

Pou, Christian; Codoñer, Francisco M; Thielen, Alexander; Bellido, Rocío; Pérez-Álvarez, Susana; Cabrera, Cecilia; Dalmau, Judith; Curriu, Marta; Lie, Yolanda; Noguera-Julian, Marc; Puig, Jordi; Martínez-Picado, Javier; Blanco, Julià; Coakley, Eoin; Däumer, Martin; Clotet, Bonaventura; Paredes, Roger

2013-01-01

Technically, HIV-1 tropism can be evaluated in plasma or peripheral blood mononuclear cells (PBMCs). However, only tropism testing of plasma HIV-1 has been validated as a tool to predict virological response to CCR5 antagonists in clinical trials. The preferable tropism testing strategy in subjects with undetectable HIV-1 viremia, in whom plasma tropism testing is not feasible, remains uncertain. We designed a proof-of-concept study including 30 chronically HIV-1-infected individuals who achieved HIV-1 RNA <50 copies/mL during at least 2 years after first-line ART initiation. First, we determined the diagnostic accuracy of 454 and population sequencing of gp120 V3-loops in plasma and PBMCs, as well as of MT-2 assays before ART initiation. The Enhanced Sensitivity Trofile Assay (ESTA) was used as the technical reference standard. 454 sequencing of plasma viruses provided the highest agreement with ESTA. The accuracy of 454 sequencing decreased in PBMCs due to reduced specificity. Population sequencing in plasma and PBMCs was slightly less accurate than plasma 454 sequencing, being less sensitive but more specific. MT-2 assays had low sensitivity but 100% specificity. Then, we used optimized 454 sequence data to investigate viral evolution in PBMCs during viremia suppression and only found evolution of R5 viruses in one subject. No de novo CXCR4-using HIV-1 production was observed over time. Finally, Slatkin-Maddison tests suggested that plasma and cell-associated V3 forms were sometimes compartmentalized. The absence of tropism shifts during viremia suppression suggests that, when available, testing of stored plasma samples is generally safe and informative, provided that HIV-1 suppression is maintained. Tropism testing in PBMCs may not necessarily produce equivalent biological results to plasma, because the structure of viral populations and the diagnostic performance of tropism assays may sometimes vary between compartments. 
Therefore, proviral DNA tropism testing should be specifically validated in clinical trials before it can be applied to routine clinical decision-making.
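As a rough illustration of how an assay is scored against a reference standard such as ESTA, the sketch below computes sensitivity, specificity and overall agreement from paired calls. The function name and the example calls are hypothetical and are not data from the study; True is assumed to mean that CXCR4-using virus was detected.

```python
def diagnostic_accuracy(reference, test):
    """Compare binary assay calls against a reference standard.

    Both arguments are equal-length lists of booleans, where True is
    assumed to mean "CXCR4-using virus detected".
    """
    if len(reference) != len(test):
        raise ValueError("reference and test must have the same length")
    pairs = list(zip(reference, test))
    tp = sum(1 for r, t in pairs if r and t)
    tn = sum(1 for r, t in pairs if not r and not t)
    fp = sum(1 for r, t in pairs if not r and t)
    fn = sum(1 for r, t in pairs if r and not t)
    return {
        "sensitivity": tp / (tp + fn),        # fraction of reference positives found
        "specificity": tn / (tn + fp),        # fraction of reference negatives confirmed
        "agreement": (tp + tn) / len(pairs),  # overall concordance
    }


# Hypothetical calls for ten subjects (illustration only).
reference_calls = [True, True, True, True, False, False, False, False, False, False]
assay_calls = [True, False, False, False, False, False, False, False, False, False]
result = diagnostic_accuracy(reference_calls, assay_calls)
```

In this made-up example the assay misses three of four reference positives but never calls a false positive, which is the "low sensitivity but 100% specificity" pattern the abstract reports for MT-2 assays.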
The public goods hypothesis for the evolution of life on Earth

PubMed Central

2011-01-01

It is becoming increasingly difficult to reconcile the observed extent of horizontal gene transfers with the central metaphor of a great tree uniting all evolving entities on the planet. In this manuscript we describe the Public Goods Hypothesis and show that it is appropriate in order to describe biological evolution on the planet. According to this hypothesis, nucleotide sequences (genes, promoters, exons, etc.) are simply seen as goods, passed from organism to organism through both vertical and horizontal transfer. Public goods sequences are defined by having the properties of being largely non-excludable (no organism can be effectively prevented from accessing these sequences) and non-rival (while such a sequence is being used by one organism it is also available for use by another organism). The universal nature of genetic systems ensures that such non-excludable sequences exist and non-excludability explains why we see a myriad of genes in different combinations in sequenced genomes. There are three features of the public goods hypothesis. Firstly, segments of DNA are seen as public goods, available for all organisms to integrate into their genomes. Secondly, we expect the evolution of mechanisms for DNA sharing and of defense mechanisms against DNA intrusion in genomes. Thirdly, we expect that we do not see a global tree-like pattern. Instead, we expect local tree-like patterns to emerge from the combination of a commonage of genes and vertical inheritance of genomes by cell division. Indeed, while genes are theoretically public goods, in reality, some genes are excludable, particularly, though not only, when they have variant genetic codes or behave as coalition or club goods, available for all organisms of a coalition to integrate into their genomes, and non-rival within the club. 
We view the Tree of Life hypothesis as a regionalized instance of the Public Goods hypothesis, just like classical mechanics and euclidean geometry are seen as regionalized instances of quantum mechanics and Riemannian geometry respectively. We argue for this change using an axiomatic approach that shows that the Public Goods hypothesis is a better accommodation of the observed data than the Tree of Life hypothesis. PMID:21861918
The Public Goods Hypothesis for the evolution of life on Earth.

PubMed

McInerney, James O; Pisani, Davide; Bapteste, Eric; O'Connell, Mary J

2011-08-23

It is becoming increasingly difficult to reconcile the observed extent of horizontal gene transfers with the central metaphor of a great tree uniting all evolving entities on the planet. In this manuscript we describe the Public Goods Hypothesis and show that it is appropriate in order to describe biological evolution on the planet. According to this hypothesis, nucleotide sequences (genes, promoters, exons, etc.) are simply seen as goods, passed from organism to organism through both vertical and horizontal transfer. Public goods sequences are defined by having the properties of being largely non-excludable (no organism can be effectively prevented from accessing these sequences) and non-rival (while such a sequence is being used by one organism it is also available for use by another organism). The universal nature of genetic systems ensures that such non-excludable sequences exist and non-excludability explains why we see a myriad of genes in different combinations in sequenced genomes. There are three features of the public goods hypothesis. Firstly, segments of DNA are seen as public goods, available for all organisms to integrate into their genomes. Secondly, we expect the evolution of mechanisms for DNA sharing and of defense mechanisms against DNA intrusion in genomes. Thirdly, we expect that we do not see a global tree-like pattern. Instead, we expect local tree-like patterns to emerge from the combination of a commonage of genes and vertical inheritance of genomes by cell division. Indeed, while genes are theoretically public goods, in reality, some genes are excludable, particularly, though not only, when they have variant genetic codes or behave as coalition or club goods, available for all organisms of a coalition to integrate into their genomes, and non-rival within the club. 
We view the Tree of Life hypothesis as a regionalized instance of the Public Goods hypothesis, just like classical mechanics and euclidean geometry are seen as regionalized instances of quantum mechanics and Riemannian geometry respectively. We argue for this change using an axiomatic approach that shows that the Public Goods hypothesis is a better accommodation of the observed data than the Tree of Life hypothesis.

The evolution of transcriptional regulation in eukaryotes

NASA Technical Reports Server (NTRS)

Wray, Gregory A.; Hahn, Matthew W.; Abouheif, Ehab; Balhoff, James P.; Pizer, Margaret; Rockman, Matthew V.; Romano, Laura A.

2003-01-01

Gene expression is central to the genotype-phenotype relationship in all organisms, and it is an important component of the genetic basis for evolutionary change in diverse aspects of phenotype. However, the evolution of transcriptional regulation remains understudied and poorly understood. Here we review the evolutionary dynamics of promoter, or cis-regulatory, sequences and the evolutionary mechanisms that shape them. Existing evidence indicates that populations harbor extensive genetic variation in promoter sequences, that a substantial fraction of this variation has consequences for both biochemical and organismal phenotype, and that some of this functional variation is sorted by selection. As with protein-coding sequences, rates and patterns of promoter sequence evolution differ considerably among loci and among clades for reasons that are not well understood. Studying the evolution of transcriptional regulation poses empirical and conceptual challenges beyond those typically encountered in analyses of coding sequence evolution: promoter organization is much less regular than that of coding sequences, and sequences required for the transcription of each locus reside at multiple other loci in the genome. Because of the strong context-dependence of transcriptional regulation, sequence inspection alone provides limited information about promoter function. Understanding the functional consequences of sequence differences among promoters generally requires biochemical and in vivo functional assays. Despite these challenges, important insights have already been gained into the evolution of transcriptional regulation, and the pace of discovery is accelerating.
Search and Discovery Strategies for Biotechnology: the Paradigm Shift

PubMed Central

Bull, Alan T.; Ward, Alan C.; Goodfellow, Michael

2000-01-01

Profound changes are occurring in the strategies that biotechnology-based industries are deploying in the search for exploitable biology and to discover new products and develop new or improved processes. The advances that have been made in the past decade in areas such as combinatorial chemistry, combinatorial biosynthesis, metabolic pathway engineering, gene shuffling, and directed evolution of proteins have caused some companies to consider withdrawing from natural product screening. In this review we examine the paradigm shift from traditional biology to bioinformatics that is revolutionizing exploitable biology. We conclude that the reinvigorated means of detecting novel organisms, novel chemical structures, and novel biocatalytic activities will ensure that natural products will continue to be a primary resource for biotechnology. The paradigm shift has been driven by a convergence of complementary technologies, exemplified by DNA sequencing and amplification, genome sequencing and annotation, proteome analysis, and phenotypic inventorying, resulting in the establishment of huge databases that can be mined in order to generate useful knowledge such as the identity and characterization of organisms and the identity of biotechnology targets. Concurrently there have been major advances in understanding the extent of microbial diversity, how uncultured organisms might be grown, and how expression of the metabolic potential of microorganisms can be maximized. The integration of information from complementary databases presents a significant challenge. Such integration should facilitate answers to complex questions involving sequence, biochemical, physiological, taxonomic, and ecological information of the sort posed in exploitable biology. 
The paradigm shift which we discuss is not absolute in the sense that it will replace established microbiology; rather, it reinforces our view that innovative microbiology is essential for releasing the potential of microbial diversity for biotechnology penetration throughout industry. Various of these issues are considered with reference to deep-sea microbiology and biotechnology. PMID:10974127
LS³: A Method for Improving Phylogenomic Inferences When Evolutionary Rates Are Heterogeneous among Taxa.

PubMed

Rivera-Rivera, Carlos J; Montoya-Burgos, Juan I

2016-06-01

Phylogenetic inference artifacts can occur when sequence evolution deviates from assumptions made by the models used to analyze them. The combination of strong model assumption violations and highly heterogeneous lineage evolutionary rates can become problematic in phylogenetic inference, and lead to the well-described long-branch attraction (LBA) artifact. Here, we define an objective criterion for assessing lineage evolutionary rate heterogeneity among predefined lineages: the result of a likelihood ratio test between a model in which the lineages evolve at the same rate (homogeneous model) and a model in which different lineage rates are allowed (heterogeneous model). We implement this criterion in the algorithm Locus Specific Sequence Subsampling (LS³), aimed at reducing the effects of LBA in multi-gene datasets. For each gene, LS³ sequentially removes the fastest-evolving taxon of the ingroup and tests for lineage rate homogeneity until all lineages have uniform evolutionary rates. The sequences excluded from the homogeneously evolving taxon subset are flagged as potentially problematic. The software implementation provides the user with the possibility to remove the flagged sequences for generating a new concatenated alignment. We tested LS³ with simulations and two real datasets containing LBA artifacts: a nucleotide dataset regarding the position of Glires within mammals and an amino-acid dataset concerning the position of nematodes within bilaterians. The initially incorrect phylogenies were corrected in all cases upon removing data flagged by LS³. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
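The remove-and-retest loop described in this abstract can be sketched in a few lines. This is a toy stand-in and not the LS³ implementation: it assumes each taxon is summarized by a substitution count over a fixed number of sites, models those counts as Poisson, treats every taxon as part of the ingroup, and uses a chi-square approximation for the likelihood ratio test. All function names and the example counts are hypothetical.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    h = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-h)
    else:
        a, q = 0.5, math.erfc(math.sqrt(h))
    # Recurrence Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1)
    while a < df / 2.0:
        q += h ** a * math.exp(-h) / math.gamma(a + 1.0)
        a += 1.0
    return q


def poisson_loglik(counts, rates, n_sites):
    """Log-likelihood of substitution counts under per-taxon Poisson rates."""
    total = 0.0
    for c, r in zip(counts, rates):
        mean = r * n_sites
        total += (c * math.log(mean) if c > 0 else 0.0) - mean - math.lgamma(c + 1)
    return total


def subsample(counts_by_taxon, n_sites, alpha=0.05):
    """Drop the fastest taxon until the homogeneous model is no longer rejected."""
    kept = dict(counts_by_taxon)
    flagged = []
    while len(kept) >= 2:
        counts = list(kept.values())
        shared = sum(counts) / (len(counts) * n_sites)
        lnl_hom = poisson_loglik(counts, [shared] * len(counts), n_sites)
        lnl_het = poisson_loglik(counts, [c / n_sites for c in counts], n_sites)
        lrt = max(0.0, 2.0 * (lnl_het - lnl_hom))
        if chi2_sf(lrt, len(counts) - 1) >= alpha:
            break  # rates look homogeneous; stop removing taxa
        fastest = max(kept, key=kept.get)
        flagged.append(fastest)
        del kept[fastest]
    return sorted(kept), flagged


# Hypothetical counts for one gene: taxon D evolves much faster than the rest.
kept_taxa, flagged_taxa = subsample({"A": 10, "B": 12, "C": 11, "D": 60}, n_sites=1000)
```

In a real analysis the two likelihoods would come from phylogenetic models fitted to the alignment and tree; the Poisson counts here only keep the example self-contained.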
Interleukin 1 receptor antagonist is a member of the interleukin 1 gene family: evolution of a cytokine control mechanism.

PubMed Central

Eisenberg, S P; Brewer, M T; Verderber, E; Heimdal, P; Brandhuber, B J; Thompson, R C

1991-01-01

Interleukin 1 receptor antagonist (IL-1ra) is a protein that binds to the IL-1 receptor and blocks the binding of both IL-1 alpha and -beta without inducing a signal of its own. Human IL-1ra has some sequence identity to human IL-1 beta, but the evolutionary relationship between these proteins has been unclear. We show that the genes for human, mouse, and rat IL-1ra are similar to the genes for IL-1 alpha and IL-1 beta in intron-exon organization, indicating that gene duplication events were important in the creation of this gene family. Furthermore, an analysis of sequence comparisons and mutation rates for IL-1 alpha, IL-1 beta, and IL-1ra suggests that the duplication giving rise to the IL-1ra gene was an early event in the evolution of the gene family. Comparisons between the mature sequences for IL-1ra, IL-1 alpha, and IL-1 beta suggest that IL-1ra has a beta-stranded structure like that of IL-1 alpha and IL-1 beta, consistent with the three proteins being related. The N-terminal sequences of IL-1ra appear to be derived from a region of the genome different from those of IL-1 alpha and IL-1 beta, thus explaining their different modes of biosynthesis and suggesting an explanation for their different biological activities. PMID:1828896
Introductory Biology Labs... They Just Aren't Sexy Enough!

ERIC Educational Resources Information Center

Cotner, Sehoya; Gallup, Gordon G., Jr.

2011-01-01

The typical introductory biology curriculum includes the nature of science, evolution and genetics. Laboratory activities are designed to engage students in typical subject areas ranging from cell biology and physiology, to ecology and evolution. There are few, if any, laboratory classes exploring the biology and evolution of human sexual…
Biology Teachers' Professional Development Needs for Teaching Evolution

ERIC Educational Resources Information Center

Friedrichsen, Patricia J.; Linke, Nicholas; Barnett, Ellen

2016-01-01

The social controversy surrounding the teaching of evolution puts pressure on secondary biology teachers to deemphasize or omit evolution from their curriculum. Amid this growing pressure, professional development can offer support to biology teachers. In this study, we surveyed secondary biology teachers in Missouri and report the data from…
Comparative analysis of bat genomes provides insight into the evolution of flight and immunity.

PubMed

Zhang, Guojie; Cowled, Christopher; Shi, Zhengli; Huang, Zhiyong; Bishop-Lilly, Kimberly A; Fang, Xiaodong; Wynne, James W; Xiong, Zhiqiang; Baker, Michelle L; Zhao, Wei; Tachedjian, Mary; Zhu, Yabing; Zhou, Peng; Jiang, Xuanting; Ng, Justin; Yang, Lan; Wu, Lijun; Xiao, Jin; Feng, Yue; Chen, Yuanxin; Sun, Xiaoqing; Zhang, Yong; Marsh, Glenn A; Crameri, Gary; Broder, Christopher C; Frey, Kenneth G; Wang, Lin-Fa; Wang, Jun

2013-01-25

Bats are the only mammals capable of sustained flight and are notorious reservoir hosts for some of the world's most highly pathogenic viruses, including Nipah, Hendra, Ebola, and severe acute respiratory syndrome (SARS). To identify genetic changes associated with the development of bat-specific traits, we performed whole-genome sequencing and comparative analyses of two distantly related species, fruit bat Pteropus alecto and insectivorous bat Myotis davidii. We discovered an unexpected concentration of positively selected genes in the DNA damage checkpoint and nuclear factor κB pathways that may be related to the origin of flight, as well as expansion and contraction of important gene families. Comparison of bat genomes with other mammalian species has provided new insights into bat biology and evolution.
Prof. Hayashi's work on the pre-main sequence evolution and brown dwarfs

NASA Astrophysics Data System (ADS)

Nakano, Takenori

2012-09-01

Prof. Hayashi's work on the evolution of stars in the pre-main sequence stage is reviewed. The historical background and the process of finding the Hayashi phase are mentioned. The work on the evolution of low-mass stars is also reviewed including the determination of the bottom of the main sequence and evolution of brown dwarfs, and comparison is made with the other works in the same period.
Reconstruction of DNA sequences using genetic algorithms and cellular automata: towards mutation prediction?

PubMed

Mizas, Ch; Sirakoulis, G Ch; Mardiris, V; Karafyllidis, I; Glykos, N; Sandaltzopoulos, R

2008-04-01

Change of DNA sequence that fuels evolution is, to a certain extent, a deterministic process because mutagenesis does not occur in an absolutely random manner. So far, it has not been possible to decipher the rules that govern DNA sequence evolution due to the extreme complexity of the entire process. In our attempt to approach this issue we focus solely on the mechanisms of mutagenesis and deliberately disregard the role of natural selection. Hence, in this analysis, evolution refers to the accumulation of genetic alterations that originate from mutations and are transmitted through generations without being subjected to natural selection. We have developed a software tool that allows modelling of a DNA sequence as a one-dimensional cellular automaton (CA) with four states per cell which correspond to the four DNA bases, i.e. A, C, T and G. The four states are represented by numbers of the quaternary number system. Moreover, we have developed genetic algorithms (GAs) in order to determine the rules of CA evolution that simulate the DNA evolution process. Linear evolution rules were considered and square matrices were used to represent them. If DNA sequences of different evolution steps are available, our approach allows the determination of the underlying evolution rule(s). Conversely, once the evolution rules are deciphered, our tool may reconstruct the DNA sequence in any previous evolution step for which the exact sequence information was unknown. The developed tool may be used to test various parameters that could influence evolution. We describe a paradigm relying on the assumption that mutagenesis is governed by a near-neighbour-dependent mechanism. Based on the satisfactory performance of our system in the deliberately simplified example, we propose that our approach could offer a starting point for future attempts to understand the mechanisms that govern evolution. The developed software is open-source and has a user-friendly graphical input interface.
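A minimal sketch of the kind of model described above: a one-dimensional, four-state cellular automaton whose linear update rule is written as a square matrix acting on the state vector modulo 4. The base-to-digit mapping (A=0, C=1, T=2, G=3), the periodic boundary and the particular neighbour weights are assumptions made for illustration; they are not taken from the authors' software.

```python
BASES = "ACTG"  # assumed quaternary encoding: A=0, C=1, T=2, G=3


def encode(seq):
    """Turn a DNA string into a list of quaternary digits."""
    return [BASES.index(b) for b in seq]


def decode(states):
    """Turn a list of quaternary digits back into a DNA string."""
    return "".join(BASES[s] for s in states)


def rule_matrix(n, left, centre, right):
    """Square matrix for a near-neighbour linear rule with periodic boundary."""
    m = [[0] * n for _ in range(n)]
    for i in range(n):
        m[i][(i - 1) % n] += left
        m[i][i] += centre
        m[i][(i + 1) % n] += right
    return m


def step(matrix, states):
    """One evolution step: matrix-vector product reduced modulo 4."""
    return [sum(w * s for w, s in zip(row, states)) % 4 for row in matrix]


# Each cell becomes (left neighbour + itself) mod 4 under this hypothetical rule.
rule = rule_matrix(4, left=1, centre=1, right=0)
next_seq = decode(step(rule, encode("AACG")))
```

A genetic algorithm of the sort the abstract mentions would search over the matrix entries so that repeated application of `step` reproduces the observed sequences; that search is not shown here.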
"Evo in the News:" Understanding Evolution and Students' Attitudes toward the Relevance of Evolutionary Biology

ERIC Educational Resources Information Center

Infanti, Lynn M.; Wiles, Jason R.

2014-01-01

This investigation evaluated the effects of exposure to the "Evo in the News" section of the "Understanding Evolution" website on students' attitudes toward biological evolution in undergraduates in a mixed-majors introductory biology course at Syracuse University. Students' attitudes toward evolution and changes therein were…
Investigating Human Evolution Using Digital Imaging & Craniometry

ERIC Educational Resources Information Center

Robertson, John C.

2007-01-01

Human evolution is an important and intriguing area of biology. The significance of evolution as a component of biology curricula, at all levels, cannot be overstated; the need to make the most of opportunities to effectively educate students in evolution as a central and unifying realm of biology is paramount. Developing engaging laboratory or…
On-chip dynamic stress control for cancer cell evolution study

NASA Astrophysics Data System (ADS)

Liu, Liyu; Austin, Robert

2010-03-01

The growth and spreading of cancer in host organisms is an evolutionary process. Cells accumulate mutations that help them adapt to changing environments and to obtain survival fitness. However, not all cancer-promoting mutations occur at once. Cancer cells face selective environmental pressures that drive their evolution in stages. In traditional cancer studies, environmental stress is usually homogeneous in space and difficult to change in time. Here, we propose a microfluidic chip employing embedded dynamic traps to generate dynamic heterogeneous microenvironments for cancer cells in evolution studies. Based on polydimethylsiloxane (PDMS) flexible diaphragms, these traps are able to enclose and shield cancer cells or expose them to external environmental stress. Digital controls for each trap determine the nutrition, antibiotics, CO2/O2 conditions, and temperatures to which trapped cells are subjected. Thus, the stress applied to cells can be varied in intensity and duration in each trap independently. The chip can also output cells from specific traps for sequencing and other biological analysis. Hence our design simultaneously monitors and analyzes cell evolution behaviors under dynamic stresses.
The origin and evolution of the sexes: Novel insights from a distant eukaryotic linage.

PubMed

Mignerot, Laure; Coelho, Susana M

2016-01-01

Sexual reproduction is an extraordinarily widespread phenomenon that assures the production of new genetic combinations in nearly all eukaryotic lineages. Although the core features of sexual reproduction (meiosis and syngamy) are highly conserved, the control mechanisms that determine whether an individual is male or female are remarkably labile across eukaryotes. In genetically controlled sexual systems, gender is determined by sex chromosomes, which have emerged independently and repeatedly during evolution. Sex chromosomes have been studied in only a handful of classical model organisms, and empirical knowledge on the origin and evolution of the sexes is still surprisingly incomplete. With the advent of new generation sequencing, the taxonomic breadth of model systems has been rapidly expanding, bringing new ideas and fresh views on this fundamental aspect of biology. This mini-review provides a quick state of the art of how the remarkable richness of the sexual characteristics of the brown algae is helping to increase our knowledge about the evolution of sex determination. Copyright © 2016 Académie des sciences. Published by Elsevier SAS. All rights reserved.
Influenza virus sequence feature variant type analysis: evidence of a role for NS1 in influenza virus host range restriction.

PubMed

Noronha, Jyothi M; Liu, Mengya; Squires, R Burke; Pickett, Brett E; Hale, Benjamin G; Air, Gillian M; Galloway, Summer E; Takimoto, Toru; Schmolke, Mirco; Hunt, Victoria; Klem, Edward; García-Sastre, Adolfo; McGee, Monnie; Scheuermann, Richard H

2012-05-01

Genetic drift of influenza virus genomic sequences occurs through the combined effects of sequence alterations introduced by a low-fidelity polymerase and the varying selective pressures experienced as the virus migrates through different host environments. While traditional phylogenetic analysis is useful in tracking the evolutionary heritage of these viruses, the specific genetic determinants that dictate important phenotypic characteristics are often difficult to discern within the complex genetic background arising through evolution. Here we describe a novel influenza virus sequence feature variant type (Flu-SFVT) approach, made available through the public Influenza Research Database resource (www.fludb.org), in which variant types (VTs) identified in defined influenza virus protein sequence features (SFs) are used for genotype-phenotype association studies. Since SFs have been defined for all influenza virus proteins based on known structural, functional, and immune epitope recognition properties, the Flu-SFVT approach allows the rapid identification of the molecular genetic determinants of important influenza virus characteristics and their connection to underlying biological functions. We demonstrate the use of the SFVT approach to obtain statistical evidence for effects of NS1 protein sequence variations in dictating influenza virus host range restriction.
Evolution and function of CAG/polyglutamine repeats in protein–protein interaction networks

PubMed Central

Schaefer, Martin H.; Wanker, Erich E.; Andrade-Navarro, Miguel A.

2012-01-01

Expanded runs of consecutive trinucleotide CAG repeats encoding polyglutamine (polyQ) stretches are observed in the genes of a large number of patients with different genetic diseases such as Huntington's and several ataxias. Protein aggregation, which is a key feature of most of these diseases, is thought to be triggered by these expanded polyQ sequences in disease-related proteins. However, polyQ tracts are a normal feature of many human proteins, suggesting that they have an important cellular function. To clarify the potential function of polyQ repeats in biological systems, we systematically analyzed available information stored in sequence and protein interaction databases. By integrating genomic, phylogenetic, protein interaction network and functional information, we obtained evidence that polyQ tracts in proteins stabilize protein interactions. This happens most likely through structural changes whereby the polyQ sequence extends a neighboring coiled-coil region to facilitate its interaction with a coiled-coil region in another protein. Alteration of this important biological function due to polyQ expansion results in gain of abnormal interactions, leading to pathological effects like protein aggregation. Our analyses suggest that research on polyQ proteins should shift focus from expanded polyQ proteins to the characterization of the influence of the wild-type polyQ on protein interactions. PMID:22287626
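A first step in any database survey of this kind is locating polyQ tracts in protein sequences. The sketch below is one simple way to do that with a regular expression; the function name, the minimum tract length of five residues and the example peptide are assumptions for illustration and do not reflect the criteria used in the study.

```python
import re


def polyq_tracts(protein, min_len=5):
    """Return (start, length) for each run of at least min_len glutamines (Q)."""
    pattern = re.compile("Q{%d,}" % min_len)
    return [(m.start(), len(m.group())) for m in pattern.finditer(protein)]


# Hypothetical peptide: one run of seven Q residues and one run of two.
tracts = polyq_tracts("MAQQQQQQQPLQQK")
```

Only the seven-residue run is reported here because the shorter run falls below the assumed threshold; positions are zero-based.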
Genomic Sequence around Butterfly Wing Development Genes: Annotation and Comparative Analysis

PubMed Central

Conceição, Inês C.; Long, Anthony D.; Gruber, Jonathan D.; Beldade, Patrícia

2011-01-01

Background Analysis of genomic sequence allows characterization of genome content and organization, and access beyond gene-coding regions for identification of functional elements. BAC libraries, where relatively large genomic regions are made readily available, are especially useful for species without a fully sequenced genome and can increase genomic coverage of phylogenetic and biological diversity. For example, no butterfly genome is yet available despite the unique genetic and biological properties of this group, such as diversified wing color patterns. The evolution and development of these patterns is being studied in a few target species, including Bicyclus anynana, where a whole-genome BAC library allows targeted access to large genomic regions. Methodology/Principal Findings We characterize ∼1.3 Mb of genomic sequence around 11 selected genes expressed in B. anynana developing wings. Extensive manual curation of in silico predictions, also making use of a large dataset of expressed genes for this species, identified repetitive elements and protein coding sequence, and highlighted an expansion of Alcohol dehydrogenase genes. Comparative analysis with orthologous regions of the lepidopteran reference genome allowed assessment of conservation of fine-scale synteny (with detection of new inversions and translocations) and of DNA sequence (with detection of high levels of conservation of non-coding regions around some, but not all, developmental genes). Conclusions The general properties and organization of the available B. anynana genomic sequence are similar to the lepidopteran reference, despite the more than 140 MY divergence. Our results lay the groundwork for further studies of new interesting findings in relation to both coding and non-coding sequence: 1) the Alcohol dehydrogenase expansion with higher similarity between the five tandemly-repeated B. anynana paralogs than with the corresponding B. mori orthologs, and 2) the high conservation of non-coding sequence around the genes wingless and Ecdysone receptor, both involved in multiple developmental processes including wing pattern formation. PMID:21909358
Diversification of transcription factor-DNA interactions and the evolution of gene regulatory networks.

PubMed

Rogers, Julia M; Bulyk, Martha L

2018-04-25

Sequence-specific transcription factors (TFs) bind short DNA sequences in the genome to regulate the expression of target genes. In the last decade, numerous technical advances have enabled the determination of the DNA-binding specificities of many of these factors. Large-scale screens of many TFs enabled the creation of databases of TF DNA-binding specificities, typically represented as position weight matrices (PWMs). Although great progress has been made in determining and predicting binding specificities systematically, there are still many surprises to be found when studying a particular TF's interactions with DNA in detail. Paralogous TFs' binding specificities can differ in subtle ways, in a manner that is not immediately apparent from looking at their PWMs. These differences affect gene regulatory outputs and enable TFs to rewire transcriptional networks over evolutionary time. This review discusses recent observations made in the study of TF-DNA interactions that highlight the importance of continued in-depth analysis of TF-DNA interactions and their inherent complexity. This article is categorized under: Biological Mechanisms > Regulatory Biology. © 2018 Wiley Periodicals, Inc.
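Since the abstract refers to position weight matrices, here is a small sketch of how a log-odds PWM can be built from aligned binding sites and used to scan a sequence. The binding sites, the pseudocount of 1 and the uniform background of 0.25 are illustrative assumptions and are not drawn from any TF database.

```python
import math


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Log2-odds PWM from equal-length aligned sites (one dict per position)."""
    total = len(sites) + 4 * pseudocount
    pwm = []
    for i in range(len(sites[0])):
        column = [site[i] for site in sites]
        pwm.append({
            base: math.log2((column.count(base) + pseudocount) / total / background)
            for base in "ACGT"
        })
    return pwm


def best_match(pwm, sequence):
    """Return (start, score) of the highest-scoring window in the sequence."""
    width = len(pwm)
    scored = [
        (sum(pwm[i][sequence[start + i]] for i in range(width)), start)
        for start in range(len(sequence) - width + 1)
    ]
    score, start = max(scored)
    return start, score


# Hypothetical aligned binding sites for an imaginary factor.
pwm = build_pwm(["TATA", "TATA", "TACA", "TATT"])
start, score = best_match(pwm, "GGTATAGG")
```

Because a PWM scores each position independently, two paralogs with near-identical matrices can still differ in ways this representation cannot capture, which is the limitation the review highlights.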
Dissecting the relationship between protein structure and sequence variation

NASA Astrophysics Data System (ADS)

Shahmoradi, Amir; Wilke, Claus; Wilke Lab Team

2015-03-01

Over the past decade several independent works have shown that some structural properties of proteins are capable of predicting protein evolution. The strength and significance of these structure-sequence relations, however, appear to vary widely among different proteins, with absolute correlation strengths ranging from 0.1 to 0.8. Here we present the results from a comprehensive search for the potential biophysical and structural determinants of protein evolution by studying more than 200 structural and evolutionary properties in a dataset of 209 monomeric enzymes. We discuss the main protein characteristics responsible for the general patterns of protein evolution, and identify sequence divergence as the main determinant of the strengths of virtually all structure-evolution relationships, explaining ~10-30% of observed variation in sequence-structure relations. In addition to sequence divergence, we identify several protein structural properties that are moderately but significantly coupled with the strength of sequence-structure relations. In particular, proteins with more homogeneous backbone hydrogen bond energies, large fractions of helical secondary structures and low fraction of beta sheets tend to have the strongest sequence-structure relation. BEACON-NSF center for the study of evolution in action.
Comparative Genomics Identifies Epidermal Proteins Associated with the Evolution of the Turtle Shell.

PubMed

Holthaus, Karin Brigit; Strasser, Bettina; Sipos, Wolfgang; Schmidt, Heiko A; Mlitz, Veronika; Sukseree, Supawadee; Weissenbacher, Anton; Tschachler, Erwin; Alibardi, Lorenzo; Eckhart, Leopold

2016-03-01

The evolution of reptiles, birds, and mammals was associated with the origin of unique integumentary structures. Studies on lizards, chicken, and humans have suggested that the evolution of major structural proteins of the outermost, cornified layers of the epidermis was driven by the diversification of a gene cluster called Epidermal Differentiation Complex (EDC). Turtles have evolved unique defense mechanisms that depend on mechanically resilient modifications of the epidermis. To investigate whether the evolution of the integument in these reptiles was associated with specific adaptations of the sequences and expression patterns of EDC-related genes, we utilized newly available genome sequences to determine the epidermal differentiation gene complement of turtles. The EDC of the western painted turtle (Chrysemys picta bellii) comprises more than 100 genes, including at least 48 genes that encode proteins referred to as beta-keratins or corneous beta-proteins. Several EDC proteins have evolved cysteine/proline contents beyond 50% of total amino acid residues. Comparative genomics suggests that distinct subfamilies of EDC genes have been expanded and partly translocated to loci outside of the EDC in turtles. Gene expression analysis in the European pond turtle (Emys orbicularis) showed that EDC genes are differentially expressed in the skin of the various body sites and that a subset of beta-keratin genes within the EDC as well as those located outside of the EDC are expressed predominantly in the shell. Our findings give strong support to the hypothesis that the evolutionary innovation of the turtle shell involved specific molecular adaptations of epidermal differentiation. © The Author 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Genetic and evolutionary characterization of RABVs from China using the phosphoprotein gene.

PubMed

Wang, Lihua; Wu, Hui; Tao, Xiaoyan; Li, Hao; Rayner, Simon; Liang, Guodong; Tang, Qing

2013-01-07

While the function of the phosphoprotein (P) gene of the rabies virus (RABV) has been well studied in laboratory-adapted RABVs, the genetic diversity and evolutionary characteristics of the P gene of street RABVs remain unclear. The objective of the present study was to investigate the mutation and evolution of P genes in Chinese street RABVs. The P genes of 77 RABVs from brain samples of dogs and wild animals collected in eight Chinese provinces from 2003 to 2008 were sequenced. The open reading frame (ORF) of the P genes was 894 nucleotides (nt) in length, with 85-99% amino acid identity and 80-89% nucleotide identity compared with the laboratory RABVs and vaccine strains. Phylogenetic analysis based on the P gene revealed that Chinese RABV strains could be divided into two distinct clades, and several RABV variants were found to co-circulate in the same province. Two conserved (CD1, 2) and two variable (VD1, 2) domains were identified by comparing the deduced primary sequences of the encoded P proteins. Two sequence motifs, one believed to confer binding to the cytoplasmic dynein light chain LC8 and the other a lysine-rich sequence, were conserved throughout the Chinese RABVs. In contrast, the isolates exhibited lower conservation of one phosphate acceptor and one internal translation initiation site identified in the P protein of the rabies challenge virus standard (CVS) strain. Bayesian coalescent analysis showed that the P gene in Chinese RABVs has a substitution rate (3.305 × 10^-4 substitutions per site per year) and evolutionary history (592 years ago) similar to the values reported previously for the glycoprotein (G) and nucleoprotein (N) genes. Several substitutions were found in the P gene of Chinese RABV strains compared with the laboratory-adapted and vaccine strains; whether these variations could affect the biological characteristics of Chinese RABVs needs to be further investigated.
The substitution rate and evolutionary history of the P gene are similar to those of the G and N genes, and the topology of the phylogenetic tree based on the P gene is similar to that of the G and N gene trees. Together, these results indicate that the P, G and N genes are equally valid for examining the phylogenetics of RABVs.

Insights into the phylogeny and coding potential of microbial dark matter

NASA Astrophysics Data System (ADS)

Rinke, Christian; Schwientek, Patrick; Sczyrba, Alexander; Ivanova, Natalia N.; Anderson, Iain J.; Cheng, Jan-Fang; Darling, Aaron; Malfatti, Stephanie; Swan, Brandon K.; Gies, Esther A.; Dodsworth, Jeremy A.; Hedlund, Brian P.; Tsiamis, George; Sievert, Stefan M.; Liu, Wen-Tso; Eisen, Jonathan A.; Hallam, Steven J.; Kyrpides, Nikos C.; Stepanauskas, Ramunas; Rubin, Edward M.; Hugenholtz, Philip; Woyke, Tanja

2013-07-01

Genome sequencing enhances our understanding of the biological world by providing blueprints for the evolutionary and functional diversity that shapes the biosphere. However, microbial genomes that are currently available are of limited phylogenetic breadth, owing to our historical inability to cultivate most microorganisms in the laboratory. We apply single-cell genomics to target and sequence 201 uncultivated archaeal and bacterial cells from nine diverse habitats belonging to 29 major mostly uncharted branches of the tree of life, so-called 'microbial dark matter'. With this additional genomic information, we are able to resolve many intra- and inter-phylum-level relationships and to propose two new superphyla. We uncover unexpected metabolic features that extend our understanding of biology and challenge established boundaries between the three domains of life. These include a novel amino acid use for the opal stop codon, an archaeal-type purine synthesis in Bacteria and complete sigma factors in Archaea similar to those in Bacteria. The single-cell genomes also served to phylogenetically anchor up to 20% of metagenomic reads in some habitats, facilitating organism-level interpretation of ecosystem function. This study greatly expands the genomic representation of the tree of life and provides a systematic step towards a better understanding of biological evolution on our planet.
Mutational robustness accelerates the origin of novel RNA phenotypes through phenotypic plasticity.

PubMed

Wagner, Andreas

2014-02-18

Novel phenotypes can originate either through mutations in existing genotypes or through phenotypic plasticity, the ability of one genotype to form multiple phenotypes. From molecules to organisms, plasticity is a ubiquitous feature of life, and a potential source of exaptations, adaptive traits that originated for nonadaptive reasons. Another ubiquitous feature is robustness to mutations, although it is unknown whether such robustness helps or hinders the origin of new phenotypes through plasticity. RNA is ideal to address this question, because it shows extensive plasticity in its secondary structure phenotypes, a consequence of their continual folding and unfolding, and these phenotypes have important biological functions. Moreover, RNA is to some extent robust to mutations. This robustness structures RNA genotype space into myriad connected networks of genotypes with the same phenotype, and it influences the dynamics of evolving populations on a genotype network. In this study I show that both effects help accelerate the exploration of novel phenotypes through plasticity. My observations are based on many RNA molecules sampled at random from RNA sequence space, and on 30 biological RNA molecules. They are thus not only a generic feature of RNA sequence space but are relevant for the molecular evolution of biological RNA. Copyright © 2014 Biophysical Society. Published by Elsevier Inc. All rights reserved.
Insights into the phylogeny and coding potential of microbial dark matter

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rinke, Christian; Schwientek, Patrick; Sczyrba, Alexander

Genome sequencing enhances our understanding of the biological world by providing blueprints for the evolutionary and functional diversity that shapes the biosphere. However, microbial genomes that are currently available are of limited phylogenetic breadth, owing to our historical inability to cultivate most microorganisms in the laboratory. We apply single-cell genomics to target and sequence 201 uncultivated archaeal and bacterial cells from nine diverse habitats belonging to 29 major mostly uncharted branches of the tree of life, so-called microbial dark matter. With this additional genomic information, we are able to resolve many intra- and inter-phylum-level relationships and to propose two new superphyla. We uncover unexpected metabolic features that extend our understanding of biology and challenge established boundaries between the three domains of life. These include a novel amino acid use for the opal stop codon, an archaeal-type purine synthesis in Bacteria and complete sigma factors in Archaea similar to those in Bacteria. The single-cell genomes also served to phylogenetically anchor up to 20 percent of metagenomic reads in some habitats, facilitating organism-level interpretation of ecosystem function. This study greatly expands the genomic representation of the tree of life and provides a systematic step towards a better understanding of biological evolution on our planet.
A Model of Substitution Trajectories in Sequence Space and Long-Term Protein Evolution

PubMed Central

Usmanova, Dinara R.; Ferretti, Luca; Povolotskaya, Inna S.; Vlasov, Peter K.; Kondrashov, Fyodor A.

2015-01-01

The nature of factors governing the tempo and mode of protein evolution is a fundamental issue in evolutionary biology. Specifically, whether or not interactions between different sites, or epistasis, are important in directing the course of evolution has become one of the central questions. Several recent reports have scrutinized patterns of long-term protein evolution, claiming them to be compatible only with an epistatic fitness landscape. However, these claims have not yet been substantiated with a formal model of protein evolution. Here, we formulate a simple covarion-like model of protein evolution focusing on the rate at which the fitness impact of amino acids at a site changes with time. We then apply the model to the data on convergent and divergent protein evolution to test whether or not the incorporation of epistatic interactions is necessary to explain the data. We find that convergent evolution cannot be explained without the incorporation of epistasis and that the rate at which an amino acid state switches from being acceptable at a site to being deleterious is faster than the rate of amino acid substitution. Specifically, for proteins that have persisted in modern prokaryotic organisms since the last universal common ancestor, for each amino acid substitution approximately ten amino acid states switch from being accessible to being deleterious, or vice versa. Thus, molecular evolution can only be perceived in the context of rapid turnover of which amino acids are available for evolution. PMID:25415964
Ohio High School Biology Teachers' Views of State Standard for Evolution: Impacts on Practice

ERIC Educational Resources Information Center

Borgerding, Lisa A.

2012-01-01

High school biology teachers face many challenges as they teach evolution. State standards for evolution may provide support for sound evolution instruction. This study attempts to build upon previous work by investigating teachers' views of evolution standards and their evolution practices in a state where evolution standards have been…
A Review of Research Instruments Assessing Levels of Student Acceptance of Evolution

ERIC Educational Resources Information Center

Yasri, Pratchayapong

2014-01-01

Darwin's theory of evolution by means of natural selection, called evolution for short, is perceived as a unifying theme in biology, forming a major part of all biology syllabuses. It is reported that student acceptance of evolution associates with conceptual understandings of biological contents, nature of science, as well as motivations to…
Student Teachers' Approaches to Teaching Biological Evolution

ERIC Educational Resources Information Center

Borgerding, Lisa A.; Klein, Vanessa A.; Ghosh, Rajlakshmi; Eibel, Albert

2015-01-01

Evolution is fundamental to biology and scientific literacy, but teaching high school evolution is often difficult. Evolution teachers face several challenges including limited content knowledge, personal conflicts with evolution, expectations of resistance, concerns about students' conflicts with religion, and curricular constraints. Evolution…
Rapid evolution of the cerebellum in humans and other great apes.

PubMed

Barton, Robert A; Venditti, Chris

2014-10-20

Humans' unique cognitive abilities are usually attributed to a greatly expanded neocortex, which has been described as "the crowning achievement of evolution and the biological substrate of human mental prowess". The human cerebellum, however, contains four times more neurons than the neocortex and is attracting increasing attention for its wide range of cognitive functions. Using a method for detecting evolutionary rate changes along the branches of phylogenetic trees, we show that the cerebellum underwent rapid size increase throughout the evolution of apes, including humans, expanding significantly faster than predicted by the change in neocortex size. As a result, humans and other apes deviated significantly from the general evolutionary trend for neocortex and cerebellum to change in tandem, having significantly larger cerebella relative to neocortex size than other anthropoid primates. These results suggest that cerebellar specialization was a far more important component of human brain evolution than hitherto recognized and that technical intelligence was likely to have been at least as important as social intelligence in human cognitive evolution. Given the role of the cerebellum in sensory-motor control and in learning complex action sequences, cerebellar specialization is likely to have underpinned the evolution of humans' advanced technological capacities, which in turn may have been a preadaptation for language. Copyright © 2014 Elsevier Ltd. All rights reserved.
Evolution Analysis of Simple Sequence Repeats in Plant Genome.

PubMed

Qin, Zhen; Wang, Yanping; Wang, Qingmei; Li, Aixian; Hou, Fuyun; Zhang, Liming

2015-01-01

Simple sequence repeats (SSRs) are widespread units in genome sequences and play many important roles in plants. To reveal the evolution of plant genomes, we investigated the evolutionary regularities of SSRs during the evolution of plant species and the plant kingdom by analysing twelve sequenced plant genomes. First, in the twelve studied plant genomes, the main SSRs were those with repeat units of 1-3 nucleotides. Second, in mononucleotide SSRs, the A/T percentage gradually increased along with the evolution of plants (except for P. patens). As the SSR repeat number increased, the percentage of A/T in C. reinhardtii showed no significant change, while the percentage of A/T in terrestrial plant species gradually declined. Third, in dinucleotide SSRs, the percentage of AT/TA increased along with the evolution of the plant kingdom and as the repeat number increased in terrestrial plant species. This trend was more obvious in dicotyledons than in monocotyledons. The percentage of CG/GC showed the opposite pattern to that of AT/TA. Fourth, in trinucleotide SSRs, the percentages of combinations including two or three A/T showed a rising trend along with the evolution of the plant kingdom; meanwhile, as the SSR repeat number increased, different species had different combinations as their dominant SSRs. SSRs in C. reinhardtii, P. patens, Z. mays and A. thaliana showed specific patterns related to evolutionary position or to specific changes in genome sequences. The results showed that SSRs not only followed a general pattern in the evolution of the plant kingdom but were also associated with the evolution of the specific genome sequence. The study of the evolutionary regularities of SSRs provides new insights for the analysis of plant genome evolution.
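The kind of survey described in this abstract can be sketched as a search for perfect tandem repeats with motifs of 1-3 nucleotides, followed by a tally of the A/T share among mononucleotide repeats. The minimum repeat thresholds, function names and test sequence below are assumptions made for illustration; the abstract does not state which detection tool or thresholds the authors used.

```python
import re


def find_ssrs(sequence, min_repeats=None):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs.

    Motif lengths of 1-3 nt are searched. The default thresholds are
    assumed values for this sketch, not those of the cited study.
    """
    if min_repeats is None:
        min_repeats = {1: 10, 2: 6, 3: 5}
    sequence = sequence.upper()
    hits = []
    for unit_len, min_rep in min_repeats.items():
        # A motif followed by at least (min_rep - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            # Skip homopolymer "motifs" such as AA or TTT; they are
            # mononucleotide repeats and are handled by unit_len == 1.
            if unit_len > 1 and len(set(motif)) == 1:
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)


def at_fraction_mono(hits):
    """Fraction of mononucleotide SSRs whose motif is A or T (None if absent)."""
    mono = [motif for _, motif, _ in hits if len(motif) == 1]
    if not mono:
        return None
    return sum(motif in "AT" for motif in mono) / len(mono)


if __name__ == "__main__":
    # Hypothetical test sequence containing one SSR of each motif length.
    seq = "GG" + "A" * 12 + "CG" + "AT" * 6 + "T" + "GGC" * 5 + "A"
    for start, motif, count in find_ssrs(seq):
        print(start, motif, count)
    print("A/T share of mononucleotide SSRs:", at_fraction_mono(find_ssrs(seq)))
```

Only perfect repeats are reported; imperfect or compound SSRs would need a more tolerant matcher.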
Microbial evolution of sulphate reduction when lateral gene transfer is geographically restricted.

PubMed

Chi Fru, E

2011-07-01

Lateral gene transfer (LGT) is an important mechanism by which micro-organisms acquire new functions. This process has been suggested to be central to prokaryotic evolution in various environments. However, the influence of geographical constraints on the evolution of laterally acquired genes in microbial metabolic evolution is not yet well understood. In this study, the influence of geographical isolation on the evolution of laterally acquired dissimilatory sulphite reductase (dsr) gene sequences in the sulphate-reducing micro-organisms (SRM) was investigated. Sequences on four continental blocks related to SRM known to have received dsr by LGT were analysed using standard phylogenetic and multidimensional statistical methods. Sequences related to lineages with large genetic diversity correlated positively with habitat divergence. Those affiliated to Thermodesulfobacterium indicated strong biogeographical delineation; hydrothermal-vent sequences clustered independently from hot-spring sequences. Some of the hydrothermal-vent and hot-spring sequences suggested to have been acquired from a common ancestral source may have diverged upon isolation within distinct habitats. In contrast, analysis of some Desulfotomaculum sequences indicated they could have been transferred from different ancestral sources but converged upon isolation within the same niche. These results hint that, after lateral acquisition of dsr genes, barriers to gene flow probably play a strong role in their subsequent evolution.
A Generative Angular Model of Protein Structure Evolution

PubMed Central

Golden, Michael; García-Portugués, Eduardo; Sørensen, Michael; Mardia, Kanti V.; Hamelryck, Thomas; Hein, Jotun

2017-01-01

Recently described stochastic models of protein evolution have demonstrated that the inclusion of structural information in addition to amino acid sequences leads to a more reliable estimation of evolutionary parameters. We present a generative, evolutionary model of protein structure and sequence that is valid on a local length scale. The model concerns the local dependencies between sequence and structure evolution in a pair of homologous proteins. The evolutionary trajectory between the two structures in the protein pair is treated as a random walk in dihedral angle space, which is modeled using a novel angular diffusion process on the two-dimensional torus. Coupling sequence and structure evolution in our model allows for modeling both “smooth” conformational changes and “catastrophic” conformational jumps, conditioned on the amino acid changes. The model has interpretable parameters and is comparatively more realistic than previous stochastic models, providing new insights into the relationship between sequence and structure evolution. For example, using the trained model we were able to identify an apparent sequence–structure evolutionary motif present in a large number of homologous protein pairs. The generative nature of our model enables us to evaluate its validity and its ability to simulate aspects of protein evolution conditioned on an amino acid sequence, a related amino acid sequence, a related structure or any combination thereof. PMID:28453724
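To make the idea of a random walk in dihedral angle space concrete, the sketch below simulates a simple wrapped Gaussian walk for a single (phi, psi) pair on the two-dimensional torus. This is only a stand-in chosen for illustration: it is not the angular diffusion process introduced by the authors, it has no drift or coupling to sequence, and the step size, seed and function names are assumptions.

```python
import math
import random


def wrap(angle):
    """Map an angle in radians into [-pi, pi), up to floating-point rounding."""
    return (angle + math.pi) % (2 * math.pi) - math.pi


def torus_walk(phi0, psi0, steps, sigma=0.1, seed=0):
    """Simulate an unbiased Gaussian random walk for one (phi, psi) pair.

    Each angle receives an independent N(0, sigma^2) increment per step and
    is wrapped back onto the circle, so the pair stays on the torus.
    """
    rng = random.Random(seed)
    path = [(wrap(phi0), wrap(psi0))]
    for _ in range(steps):
        phi, psi = path[-1]
        path.append(
            (wrap(phi + rng.gauss(0.0, sigma)), wrap(psi + rng.gauss(0.0, sigma)))
        )
    return path


if __name__ == "__main__":
    # Start near the wrap-around boundary to show that the angles stay bounded.
    trajectory = torus_walk(3.0, -3.0, steps=5, sigma=0.5)
    for phi, psi in trajectory:
        print(f"phi={phi:+.3f}  psi={psi:+.3f}")
```

Wrapping after every step is what distinguishes this from an ordinary walk in the plane: a trajectory that crosses +pi reappears near -pi instead of drifting to arbitrarily large angles.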
Selective Constraints on Coding Sequences of Nervous System Genes Are a Major Determinant of Duplicate Gene Retention in Vertebrates.

PubMed

Roux, Julien; Liu, Jialin; Robinson-Rechavi, Marc

2017-11-01

The evolutionary history of vertebrates is marked by three ancient whole-genome duplications: two successive rounds in the ancestor of vertebrates, and a third one specific to teleost fishes. Biased loss of most duplicates enriched the genome for specific genes, such as slow evolving genes, but this selective retention process is not well understood. To understand what drives the long-term preservation of duplicate genes, we characterized duplicated genes in terms of their expression patterns. We used a new method of expression enrichment analysis, TopAnat, applied to in situ hybridization data from thousands of genes from zebrafish and mouse. We showed that the presence of expression in the nervous system is a good predictor of a higher rate of retention of duplicate genes after whole-genome duplication. Further analyses suggest that purifying selection against the toxic effects of misfolded or misinteracting proteins, which is particularly strong in nonrenewing neural tissues, likely constrains the evolution of coding sequences of nervous system genes, leading indirectly to the preservation of duplicate genes after whole-genome duplication. Whole-genome duplications thus greatly contributed to the expansion of the toolkit of genes available for the evolution of profound novelties of the nervous system at the base of the vertebrate radiation. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
The molecular biology and evolution of feline immunodeficiency viruses of cougars

PubMed Central

Poss, Mary; Ross, Howard; Rodrigo, Allen; Terwee, Julie; VandeWoude, Sue; Biek, Roman

2008-01-01

Feline immunodeficiency virus (FIV) is a lentivirus that has been identified in many members of the family Felidae but domestic cats are the only FIV host in which infection results in disease. We studied FIVpco infection of cougars (Puma concolor) as a model for asymptomatic lentivirus infections to understand the mechanisms of host-virus coexistence. Several natural cougar populations were evaluated to determine if there are any consequences of FIVpco infection on cougar fecundity, survival, or susceptibility to other infections. We have sequenced full length viral genomes and conducted a detailed analysis of viral molecular evolution on these sequences and on genome fragments of serially sampled animals to determine the evolutionary forces experienced by this virus in cougars. In addition, we have evaluated the molecular genetics of FIVpco in a new host, domestic cats, to determine the evolutionary consequences to a host-adapted virus associated with cross-species infection. Our results indicate that there are no significant differences in survival, fecundity or susceptibility to other infections between FIVpco-infected and uninfected cougars. The molecular evolution of FIVpco is characterized by a slower evolutionary rate and an absence of positive selection, but also by proviral and plasma viral loads comparable to those of epidemic lentiviruses such as HIV-1 or FIVfca. Evolutionary and recombination rates and selection profiles change significantly when FIVpco replicates in a new host. PMID:18295904
Complete mitogenome of Asiatic lion resolves phylogenetic status within Panthera.

PubMed

Bagatharia, Snehal B; Joshi, Madhvi N; Pandya, Rohan V; Pandit, Aanal S; Patel, Riddhi P; Desai, Shivangi M; Sharma, Anu; Panchal, Omkar; Jasmani, Falguni P; Saxena, Akshay K

2013-08-23

The origin, evolution and speciation of the lion have been a subject of interest, debate and study. The surviving lions of the genus Panthera comprise eight subspecies, including the Asiatic lion, Panthera leo persica, of India's Gir forest. Except for the Asiatic lion, the other seven subspecies are found in different parts of Africa. There have been different opinions regarding the phylogenetic status of Panthera leo, as well as the classification of lions from different geographic regions into subspecies and races. In the present study, the mitogenome sequence of P. leo persica was deduced using the Ion Torrent PGM to assess phylogeny and evolution, which may play an increasingly important role in conservation biology. The mtDNA sequence of P. leo persica is 17,057 bp in length with 40.8% GC content. Annotation of the mitogenome revealed a total of 37 genes, including 13 protein-coding, 2 rRNA and 22 tRNA genes. Phylogenetic analysis based on the whole mitogenome suggests Panthera pardus as a neighbouring species to P. leo, with species divergence at ~2.96 mya. This work presents the first report of the complete mitogenome of Panthera leo persica. It sheds light on the phylogenetic and evolutionary status within and across Felidae members. The result, compared and evaluated against earlier reports on Felidae, shows alteration of phylogenetic status and species evolution. This study may provide information on genetic diversity and population stability.
Complete mitogenome of asiatic lion resolves phylogenetic status within Panthera

PubMed Central

2013-01-01

Background The origin, evolution and speciation of the lion have been a subject of interest, debate and study. The surviving lions of the genus Panthera comprise eight subspecies, including the Asiatic lion, Panthera leo persica, of India's Gir forest. Except for the Asiatic lion, the other seven subspecies are found in different parts of Africa. There have been different opinions regarding the phylogenetic status of Panthera leo, as well as the classification of lions from different geographic regions into subspecies and races. In the present study, the mitogenome sequence of P. leo persica was deduced using the Ion Torrent PGM to assess phylogeny and evolution, which may play an increasingly important role in conservation biology. Results The mtDNA sequence of P. leo persica is 17,057 bp in length with 40.8% GC content. Annotation of the mitogenome revealed a total of 37 genes, including 13 protein-coding, 2 rRNA and 22 tRNA genes. Phylogenetic analysis based on the whole mitogenome suggests Panthera pardus as a neighbouring species to P. leo, with species divergence at ~2.96 mya. Conclusion This work presents the first report of the complete mitogenome of Panthera leo persica. It sheds light on the phylogenetic and evolutionary status within and across Felidae members. The result, compared and evaluated against earlier reports on Felidae, shows alteration of phylogenetic status and species evolution. This study may provide information on genetic diversity and population stability. PMID:23968279
Functional Diversity of Haloacid Dehalogenase Superfamily Phosphatases from Saccharomyces cerevisiae: BIOCHEMICAL, STRUCTURAL, AND EVOLUTIONARY INSIGHTS.

PubMed

Kuznetsova, Ekaterina; Nocek, Boguslaw; Brown, Greg; Makarova, Kira S; Flick, Robert; Wolf, Yuri I; Khusnutdinova, Anna; Evdokimova, Elena; Jin, Ke; Tan, Kemin; Hanson, Andrew D; Hasnain, Ghulam; Zallot, Rémi; de Crécy-Lagard, Valérie; Babu, Mohan; Savchenko, Alexei; Joachimiak, Andrzej; Edwards, Aled M; Koonin, Eugene V; Yakunin, Alexander F

2015-07-24

The haloacid dehalogenase (HAD)-like enzymes comprise a large superfamily of phosphohydrolases present in all organisms. The Saccharomyces cerevisiae genome encodes at least 19 soluble HADs, including 10 uncharacterized proteins. Here, we biochemically characterized 13 yeast phosphatases from the HAD superfamily, which includes both specific and promiscuous enzymes active against various phosphorylated metabolites and peptides with several HADs implicated in detoxification of phosphorylated compounds and pseudouridine. The crystal structures of four yeast HADs provided insight into their active sites, whereas the structure of the YKR070W dimer in complex with substrate revealed a composite substrate-binding site. Although the S. cerevisiae and Escherichia coli HADs share low sequence similarities, the comparison of their substrate profiles revealed seven phosphatases with common preferred substrates. The cluster of secondary substrates supporting significant activity of both S. cerevisiae and E. coli HADs includes 28 common metabolites that appear to represent the pool of potential activities for the evolution of novel HAD phosphatases. Evolution of novel substrate specificities of HAD phosphatases shows no strict correlation with sequence divergence. Thus, evolution of the HAD superfamily combines the conservation of the overall substrate pool and the substrate profiles of some enzymes with remarkable biochemical and structural flexibility of other superfamily members. © 2015 by The American Society for Biochemistry and Molecular Biology, Inc.
Evolution of Synonymous Codon Usage in Neurospora tetrasperma and Neurospora discreta

PubMed Central

Whittle, C. A.; Sun, Y.; Johannesson, H.

2011-01-01

Neurospora comprises a primary model system for the study of fungal genetics and biology. In spite of this, little is known about genome evolution in Neurospora. For example, the evolution of synonymous codon usage is largely unknown in this genus. In the present investigation, we conducted a comprehensive analysis of synonymous codon usage and its relationship to gene expression and gene length (GL) in Neurospora tetrasperma and Neurospora discreta. For our analysis, we examined codon usage among 2,079 genes per organism and assessed gene expression using large-scale expressed sequence tag (EST) data sets (279,323 and 453,559 ESTs for N. tetrasperma and N. discreta, respectively). Data on relative synonymous codon usage revealed 24 codons (and two putative codons) that are more frequently used in genes with high than with low expression and thus were defined as optimal codons. Although codon-usage bias was highly correlated with gene expression, it was independent of selectively neutral base composition (introns), thus demonstrating that translational selection drives synonymous codon usage in these genomes. We also report that GL (coding sequences [CDS]) was inversely associated with optimal codon usage at each gene expression level, with highly expressed short genes having the greatest frequency of optimal codons. Optimal codon frequency was moderately higher in N. tetrasperma than in N. discreta, which might be due to variation in selective pressures and/or mating systems. PMID:21402862
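Relative synonymous codon usage (RSCU), the statistic named in this abstract, is conventionally the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below computes it with the standard genetic code and then takes the difference between two gene sets as a rough pointer to candidate optimal codons. The function names and toy sequences are hypothetical, and the abstract does not describe the statistical test the authors applied, so the difference shown here is only an illustrative proxy.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds_list):
    """Compute RSCU for every codon of amino acids with synonymous codons.

    Stop codons and single-codon amino acids (Met, Trp) are excluded, as are
    amino acids that never occur in the input.
    """
    counts = Counter()
    for cds in cds_list:
        cds = cds.upper()
        for i in range(0, len(cds) - len(cds) % 3, 3):
            codon = cds[i:i + 3]
            if codon in CODON_TABLE:
                counts[codon] += 1

    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    values = {}
    for aa, codons in families.items():
        if aa == "*" or len(codons) == 1:
            continue
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue
        expected = total / len(codons)  # equal usage of all synonyms
        for codon in codons:
            values[codon] = counts[codon] / expected
    return values


def delta_rscu(high_expression_cds, low_expression_cds):
    """RSCU in the high set minus RSCU in the low set, for shared codons."""
    high, low = rscu(high_expression_cds), rscu(low_expression_cds)
    return {codon: high[codon] - low[codon] for codon in high if codon in low}


if __name__ == "__main__":
    # Toy alanine-only sequences standing in for two expression classes.
    high = ["GCTGCTGCTGCC"]
    low = ["GCTGCCGCAGCG"]
    for codon, diff in sorted(delta_rscu(high, low).items()):
        print(codon, f"{diff:+.2f}")
```

A positive difference marks a codon that is relatively more common in the highly expressed set; deciding which differences are significant would require a proper statistical comparison across many genes.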
Intra and Interspecific Variations of Gene Expression Levels in Yeast Are Largely Neutral: (Nei Lecture, SMBE 2016, Gold Coast).

PubMed

Yang, Jian-Rong; Maclean, Calum J; Park, Chungoo; Zhao, Huabin; Zhang, Jianzhi

2017-09-01

It is commonly, although not universally, accepted that most intra and interspecific genome sequence variations are more or less neutral, whereas a large fraction of organism-level phenotypic variations are adaptive. Gene expression levels are molecular phenotypes that bridge the gap between genotypes and corresponding organism-level phenotypes. Yet, it is unknown whether natural variations in gene expression levels are mostly neutral or adaptive. Here we address this fundamental question by genome-wide profiling and comparison of gene expression levels in nine yeast strains belonging to three closely related Saccharomyces species and originating from five different ecological environments. We find that the transcriptome-based clustering of the nine strains approximates the genome sequence-based phylogeny irrespective of their ecological environments. Remarkably, only ∼0.5% of genes exhibit similar expression levels among strains from a common ecological environment, no greater than that among strains with comparable phylogenetic relationships but different environments. These and other observations strongly suggest that most intra and interspecific variations in yeast gene expression levels result from the accumulation of random mutations rather than environmental adaptations. This finding has profound implications for understanding the driving force of gene expression evolution, genetic basis of phenotypic adaptation, and general role of stochasticity in evolution. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Evolution beyond neo-Darwinism: a new conceptual framework.

PubMed

Noble, Denis

2015-01-01

Experimental results in epigenetics and related fields of biological research show that the Modern Synthesis (neo-Darwinist) theory of evolution requires either extension or replacement. This article examines the conceptual framework of neo-Darwinism, including the concepts of 'gene', 'selfish', 'code', 'program', 'blueprint', 'book of life', 'replicator' and 'vehicle'. This form of representation is a barrier to extending or replacing existing theory as it confuses conceptual and empirical matters. These need to be clearly distinguished. In the case of the central concept of 'gene', the definition has moved all the way from describing a necessary cause (defined in terms of the inheritable phenotype itself) to an empirically testable hypothesis (in terms of causation by DNA sequences). Neo-Darwinism also privileges 'genes' in causation, whereas in multi-way networks of interactions there can be no privileged cause. An alternative conceptual framework is proposed that avoids these problems, and which is more favourable to an integrated systems view of evolution. © 2015. Published by The Company of Biologists Ltd.
Enzyme Recruitment and Its Role in Metabolic Expansion

PubMed Central

2015-01-01

Although more than 10^9 years have passed since the existence of the last universal common ancestor, proteins have yet to reach the limits of divergence. As a result, metabolic complexity is ever expanding. Identifying and understanding the mechanisms that drive and limit the divergence of protein sequence space impact not only evolutionary biologists investigating molecular evolution but also synthetic biologists seeking to design useful catalysts and engineer novel metabolic pathways. Investigations over the past 50 years indicate that the recruitment of enzymes for new functions is a key event in the acquisition of new metabolic capacity. In this review, we outline the genetic mechanisms that enable recruitment and summarize the present state of knowledge regarding the functional characteristics of extant catalysts that facilitate recruitment. We also highlight recent examples of enzyme recruitment, both from the historical record provided by phylogenetics and from enzyme evolution experiments. We conclude with a look to the future, which promises fruitful consequences from the convergence of molecular evolutionary theory, laboratory-directed evolution, and synthetic biology. PMID:24483367

Niche construction, sources of selection and trait coevolution.

PubMed

Laland, Kevin; Odling-Smee, John; Endler, John

2017-10-06

Organisms modify and choose components of their local environments. This 'niche construction' can alter ecological processes, modify natural selection and contribute to inheritance through ecological legacies. Here, we propose that niche construction initiates and modifies the selection directly affecting the constructor, and on other species, in an orderly, directed and sustained manner. By dependably generating specific environmental states, niche construction co-directs adaptive evolution by imposing a consistent statistical bias on selection. We illustrate how niche construction can generate this evolutionary bias by comparing it with artificial selection. We suggest that it occupies the middle ground between artificial and natural selection. We show how the perspective leads to testable predictions related to: (i) reduced variance in measures of responses to natural selection in the wild; (ii) multiple trait coevolution, including the evolution of sequences of traits and patterns of parallel evolution; and (iii) a positive association between niche construction and biodiversity. More generally, we submit that evolutionary biology would benefit from greater attention to the diverse properties of all sources of selection.
Analysis of horizontal genetic transfer in red algae in the post-genomics age

PubMed Central

Chan, Cheong Xin; Bhattacharya, Debashish

2013-01-01

The recently published genome of the unicellular red alga Porphyridium purpureum revealed a gene-rich, intron-poor species, which is surprising for a free-living mesophile. Of the 8,355 predicted protein-coding regions, up to 773 (9.3%) were implicated in horizontal genetic transfer (HGT) events involving other prokaryote and eukaryote lineages. A much smaller number, up to 174 (2.1%) showed unambiguous evidence of vertical inheritance. Together with other red algal genomes, nearly all published in 2013, these data provide an excellent platform for studying diverse aspects of algal biology and evolution. This novel information will help investigators test existing hypotheses about the impact of endosymbiosis and HGT on algal evolution and enable comparative analysis within a more-refined, hypothesis-driven framework that extends beyond HGT. Here we explore the impacts of this infusion of red algal genome data on addressing questions regarding the complex nature of algal evolution and highlight the need for scalable phylogenomic approaches to handle the forthcoming deluge of sequence information. PMID:24475368
Using the Developmental Gene Bicoid to Identify Species of Forensically Important Blowflies (Diptera: Calliphoridae)

PubMed Central

Park, Seong Hwan; Park, Chung Hyun; Zhang, Yong; Piao, Huguo; Chung, Ukhee; Kim, Seong Yoon; Ko, Kwang Soo; Yi, Cheong-Ho; Jo, Tae-Ho; Hwang, Juck-Joon

2013-01-01

Identifying species of insects used to estimate postmortem interval (PMI) is a major subject in forensic entomology. Because forensic insect specimens are morphologically uniform and are obtained at various developmental stages, DNA markers are greatly needed. To develop new autosomal DNA markers to identify species, partial genomic sequences of the bicoid (bcd) genes, containing the homeobox and its flanking sequences, from 12 blowfly species (Aldrichina grahami, Calliphora vicina, Calliphora lata, Triceratopyga calliphoroides, Chrysomya megacephala, Chrysomya pinguis, Phormia regina, Lucilia ampullacea, Lucilia caesar, Lucilia illustris, Hemipyrellia ligurriens and Lucilia sericata; Calliphoridae: Diptera) were determined and analyzed. This study first sequenced the ten blowfly species other than C. vicina and L. sericata. Based on the bcd sequences of these 12 blowfly species, a phylogenetic tree was constructed that discriminates the subfamilies of Calliphoridae (Luciliinae, Chrysomyinae, and Calliphorinae) and most blowfly species. Even partial genomic sequences of about 500 bp can distinguish most blowfly species. The short intron 2 and coding sequences downstream of the bcd homeobox in exon 3 could be utilized to develop DNA markers for forensic applications. These gene sequences are important in the evolution of insect developmental biology and are potentially useful for identifying insect species in forensic science. PMID:23586044
Landscape of somatic mutations and clonal evolution in mantle cell lymphoma.

PubMed

Beà, Sílvia; Valdés-Mas, Rafael; Navarro, Alba; Salaverria, Itziar; Martín-Garcia, David; Jares, Pedro; Giné, Eva; Pinyol, Magda; Royo, Cristina; Nadeu, Ferran; Conde, Laura; Juan, Manel; Clot, Guillem; Vizán, Pedro; Di Croce, Luciano; Puente, Diana A; López-Guerra, Mónica; Moros, Alexandra; Roue, Gael; Aymerich, Marta; Villamor, Neus; Colomo, Lluís; Martínez, Antonio; Valera, Alexandra; Martín-Subero, José I; Amador, Virginia; Hernández, Luis; Rozman, Maria; Enjuanes, Anna; Forcada, Pilar; Muntañola, Ana; Hartmann, Elena M; Calasanz, María J; Rosenwald, Andreas; Ott, German; Hernández-Rivas, Jesús M; Klapper, Wolfram; Siebert, Reiner; Wiestner, Adrian; Wilson, Wyndham H; Colomer, Dolors; López-Guillermo, Armando; López-Otín, Carlos; Puente, Xose S; Campo, Elías

2013-11-05

Mantle cell lymphoma (MCL) is an aggressive tumor, but a subset of patients may follow an indolent clinical course. To understand the mechanisms underlying this biological heterogeneity, we performed whole-genome and/or whole-exome sequencing on 29 MCL cases and their respective matched normal DNA, as well as 6 MCL cell lines. Recurrently mutated genes were investigated by targeted sequencing in an independent cohort of 172 MCL patients. We identified 25 significantly mutated genes, including known drivers such as ataxia-telangiectasia mutated (ATM), cyclin D1 (CCND1), and the tumor suppressor TP53; mutated genes encoding the anti-apoptotic protein BIRC3 and Toll-like receptor 2 (TLR2); and the chromatin modifiers WHSC1, MLL2, and MEF2B. We also found NOTCH2 mutations as an alternative phenomenon to NOTCH1 mutations in aggressive tumors with a dismal prognosis. Analysis of two simultaneous or subsequent MCL samples by whole-genome/whole-exome (n = 8) or targeted (n = 19) sequencing revealed subclonal heterogeneity at diagnosis in samples from different topographic sites and modulation of the initial mutational profile at the progression of the disease. Some mutations were predominantly clonal or subclonal, indicating an early or late event in tumor evolution, respectively. Our study identifies molecular mechanisms contributing to MCL pathogenesis and offers potential targets for therapeutic intervention.
Extensive Mobilome-Driven Genome Diversification in Mouse Gut-Associated Bacteroides vulgatus mpk.

PubMed

Lange, Anna; Beier, Sina; Steimle, Alex; Autenrieth, Ingo B; Huson, Daniel H; Frick, Julia-Stefanie

2016-04-25

Like many other Bacteroides species, Bacteroides vulgatus strain mpk, a mouse fecal isolate which was shown to promote intestinal homeostasis, utilizes a variety of mobile elements for genome evolution. Based on sequences collected by Pacific Biosciences SMRT sequencing technology, we discuss the challenges of assembling and studying a bacterial genome of high plasticity. Additionally, we conducted comparative genomics comparing this commensal strain with the B. vulgatus type strain ATCC 8482 as well as multiple other Bacteroides and Parabacteroides strains to reveal the most important differences and identify the unique features of B. vulgatus mpk. The genome of B. vulgatus mpk harbors a large and diverse set of mobile element proteins compared with other sequenced Bacteroides strains. We found evidence of a number of different horizontal gene transfer events and a genome landscape that has been extensively altered by different mobilization events. A CRISPR/Cas system could be identified that provides a possible mechanism for preventing the integration of invading external DNA. We propose that the high genome plasticity and the introduced genome instabilities of B. vulgatus mpk arising from the various mobilization events might play an important role not only in its adaptation to the challenging intestinal environment in general, but also in its ability to interact with the gut microbiota. © The Author(s) 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
De Novo ORFs in Drosophila Are Important to Organismal Fitness and Evolved Rapidly from Previously Non-coding Sequences

PubMed Central

Reinhardt, Josephine A.; Wanjiru, Betty M.; Brant, Alicia T.; Saelao, Perot; Begun, David J.; Jones, Corbin D.

2013-01-01

How non-coding DNA gives rise to new protein-coding genes (de novo genes) is not well understood. Recent work has revealed the origins and functions of a few de novo genes, but common principles governing the evolution or biological roles of these genes are unknown. To better define these principles, we performed a parallel analysis of the evolution and function of six putatively protein-coding de novo genes described in Drosophila melanogaster. Reconstruction of the transcriptional history of de novo genes shows that two de novo genes emerged from novel long non-coding RNAs that arose at least 5 MY prior to evolution of an open reading frame. In contrast, four other de novo genes evolved a translated open reading frame and transcription within the same evolutionary interval suggesting that nascent open reading frames (proto-ORFs), while not required, can contribute to the emergence of a new de novo gene. However, none of the genes arose from proto-ORFs that existed long before expression evolved. Sequence and structural evolution of de novo genes was rapid compared to nearby genes and the structural complexity of de novo genes steadily increases over evolutionary time. Despite the fact that these genes are transcribed at a higher level in males than females, and are most strongly expressed in testes, RNAi experiments show that most of these genes are essential in both sexes during metamorphosis. This lethality suggests that protein coding de novo genes in Drosophila quickly become functionally important. PMID:24146629
Factors which influence Texas biology teachers' decisions to emphasize fundamental concepts of evolution

NASA Astrophysics Data System (ADS)

Bilica, Kimberly Lynn

The teaching of biological evolution in public science classrooms has been mitigated by a lingering and historic climate of controversy (Skoog, 1984; Skoog, 1979). This controversy has successfully stalled attempts to bring authentic science literacy to the American public (Bybee, 1997). The first encouraging signs of the abatement of this controversy occurred during the early 1990s when several prominent science organizations promoted evolution to its appropriate status as a central and unifying concept in biology (National Science Teachers Association, 1992; National Research Council, 1996; American Association for the Advancement of Science, 1990, 1993). The organizations acknowledged that not only should biological evolution be taught, evolution should stand as one of a select group of essential concepts upon which biology curricula should be built. Bandura's Social Learning theory (Bandura, 1997; Lumpe, Haney, & Czerniak, 2000) and Helms' Model of Identity (Helms, 1998) provide the theoretical basis for this study. Both Bandura and Helms explain the actions of teachers by examining the beliefs and values that influence their decisions. The models distinguish between two types of belief systems: capacity beliefs and context beliefs (Lumpe, et al, 2000; Helms, 1998). Both belief types influence and are influenced by individual actions. In this study, the action to be described is the decision that teachers make about the degree of emphasis on evolution in the classroom. The capacity beliefs that will be examined are teachers' beliefs about their capability to teach evolution. The contextual beliefs in this study are perceptions about students' capabilities to learn evolution, the status of evolution in science, the place of evolution in the biology classroom, the influence of textbooks, time, and community/school values. 
This study contributes to and extends the knowledge base established by studies of evolution education by exploring the relative amount of emphasis that Texas biology teachers currently place, and would prefer to place, on fundamental evolution concepts in relationship to specific belief factors which influence biology teachers' curricular decisions.
Comparative Genomics in Drosophila.

PubMed

Oti, Martin; Pane, Attilio; Sammeth, Michael

2018-01-01

Since the pioneering studies of Thomas Hunt Morgan and coworkers at the dawn of the twentieth century, Drosophila melanogaster and its sister species have contributed tremendously to unveiling the rules underlying animal genetics, development, behavior, evolution, and human disease. Recent advances in DNA sequencing technologies launched Drosophila into the post-genomic era and paved the way for unprecedented comparative genomics investigations. The complete sequencing and systematic comparison of the genomes from 12 Drosophila species represents a milestone achievement in modern biology, which allowed a plethora of different studies ranging from the annotation of known and novel genomic features to the evolution of chromosomes and, ultimately, of entire genomes. Despite the efforts of countless laboratories worldwide, the vast amount of data that were produced over the past 15 years is far from being fully explored. In this chapter, we will review some of the bioinformatic approaches that were developed to interrogate the genomes of the 12 Drosophila species. Setting off from alignments of the entire genomic sequences, the degree of conservation can be separately evaluated for every region of the genome, providing first hints about elements that are under purifying selection and therefore likely functional. Furthermore, the careful analysis of repeated sequences sheds light on the evolutionary dynamics of transposons, an enigmatic and fascinating class of mobile elements housed in the genomes of animals and plants. Comparative genomics also aids in the computational identification of the transcriptionally active part of the genome, first and foremost of protein-coding loci, but also of transcribed yet apparently noncoding regions, which were once considered "junk" DNA.
Eventually, the synergy between functional and comparative genomics also facilitates in silico and in vivo studies on cis-acting regulatory elements, like transcription factor binding sites, that due to the high degree of sequence variability usually impose increased challenges for bioinformatics approaches.
A NGS approach to the encrusting Mediterranean sponge Crella elegans (Porifera, Demospongiae, Poecilosclerida): transcriptome sequencing, characterization and overview of the gene expression along three life cycle stages.

PubMed

Pérez-Porro, A R; Navarro-Gómez, D; Uriz, M J; Giribet, G

2013-05-01

Sponges can be dominant organisms in many marine and freshwater habitats where they play essential ecological roles. They also represent a key group to address important questions in early metazoan evolution. Recent approaches for improving knowledge on sponge biological and ecological functions as well as on animal evolution have focused on the genetic toolkits involved in ecological responses to environmental changes (biotic and abiotic), development and reproduction. These approaches are possible thanks to newly available, massive sequencing technologies, such as the Illumina platform, which facilitate genome and transcriptome sequencing in a cost-effective manner. Here we present the first NGS (next-generation sequencing) approach to understanding the life cycle of an encrusting marine sponge. For this we sequenced libraries of three different life cycle stages of the Mediterranean sponge Crella elegans and generated de novo transcriptome assemblies. Three assemblies were based on sponge tissue of a particular life cycle stage, including non-reproductive tissue, tissue with sperm cysts and tissue with larvae. The fourth assembly pooled the data from all three stages. By aggregating data from all the different life cycle stages we obtained a higher total number of contigs, contigs with blast hit and annotated contigs than from one stage-based assemblies. In that multi-stage assembly we obtained a larger number of the developmental regulatory genes known for metazoans than in any other assembly. We also advance the differential expression of selected genes in the three life cycle stages to explore the potential of RNA-seq for improving knowledge on functional processes along the sponge life cycle. © 2013 Blackwell Publishing Ltd.
Population genomics of eusocial insects: the costs of a vertebrate-like effective population size.

PubMed

Romiguier, J; Lourenco, J; Gayral, P; Faivre, N; Weinert, L A; Ravel, S; Ballenghien, M; Cahais, V; Bernard, A; Loire, E; Keller, L; Galtier, N

2014-03-01

The evolution of reproductive division of labour and social life in social insects has led to the emergence of several life-history traits and adaptations typical of larger organisms: social insect colonies can reach masses of several kilograms, they start reproducing only when they are several years old, and can live for decades. These features and the monopolization of reproduction by only one or few individuals in a colony should affect molecular evolution by reducing the effective population size. We tested this prediction by analysing genome-wide patterns of coding sequence polymorphism and divergence in eusocial vs. noneusocial insects based on newly generated RNA-seq data. We report very low amounts of genetic polymorphism and an elevated ratio of nonsynonymous to synonymous changes – a marker of the effective population size – in four distinct species of eusocial insects, which were more similar to vertebrates than to solitary insects regarding molecular evolutionary processes. Moreover, the ratio of nonsynonymous to synonymous substitutions was positively correlated with the level of social complexity across ant species. These results are fully consistent with the hypothesis of a reduced effective population size and an increased genetic load in eusocial insects, indicating that the evolution of social life has important consequences at both the genomic and population levels. © 2014 The Authors. Journal of Evolutionary Biology © 2014 European Society For Evolutionary Biology.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Vassilevska, Tanya

This is the first code, designed to run on a desktop, which models the intracellular replication and the cell-to-cell infection and demonstrates virus evolution at the molecular level. This code simulates the infection of a population of "idealized biological cells" (represented as objects that do not divide or have metabolism) with "virus" (represented by its genetic sequence), and the replication and simultaneous mutation of the virus, which leads to the evolution of a population of genetically diverse viruses. The code is built to simulate single-stranded RNA viruses. The inputs to the code are: 1. the number of biological cells in the culture, 2. the initial composition of the virus population, 3. the reference genome of the RNA virus, 4. the coordinates of the genome regions and their significance, and 5. parameters determining the dynamics of virus replication, such as the mutation rate. The simulation ends when all cells have been infected or when no more infections occur after a given number of attempts. The code has the ability to simulate the evolution of the virus in serial passage of cell "cultures", i.e. after the end of a simulation, a new one is immediately scheduled with a new culture of infected cells. The code outputs characteristics of the resulting virus population dynamics and genetic composition of the virus population, such as the top dominant genomes and the percentage of a genome with specific characteristics.
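The simulation loop described in the record above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the OSTI code itself. Everything named here (`mutate`, `simulate_passage`, `burst_size`, `max_failed_attempts`) is hypothetical, cells are reduced to an infected/uninfected flag, each infection releases a fixed number of mutated progeny, and the genome-region coordinates listed among the real inputs are omitted.

```python
import random
from collections import Counter

RNA_BASES = "ACGU"


def mutate(genome, mutation_rate, rng):
    """Copy a genome, substituting each base with probability mutation_rate."""
    copied = []
    for base in genome:
        if rng.random() < mutation_rate:
            copied.append(rng.choice([b for b in RNA_BASES if b != base]))
        else:
            copied.append(base)
    return "".join(copied)


def simulate_passage(n_cells, initial_viruses, mutation_rate,
                     burst_size=5, max_failed_attempts=100, seed=0):
    """Infect cells until all are infected or too many attempts in a row fail.

    Returns the final virus population and the number of infected cells.
    """
    rng = random.Random(seed)
    infected = [False] * n_cells
    viruses = list(initial_viruses)
    failed = 0
    while viruses and not all(infected) and failed < max_failed_attempts:
        cell = rng.randrange(n_cells)
        if infected[cell]:
            failed += 1  # attempt hit an already infected cell
            continue
        failed = 0
        infected[cell] = True
        parent = rng.choice(viruses)
        # Replication and mutation happen together: each progeny genome
        # is an independently mutated copy of the infecting genome.
        viruses.extend(mutate(parent, mutation_rate, rng)
                       for _ in range(burst_size))
    return viruses, sum(infected)


def dominant_genomes(viruses, top=3):
    """Most frequent genomes with their share of the population."""
    total = len(viruses)
    return [(g, n / total) for g, n in Counter(viruses).most_common(top)]
```

Serial passage could be approximated by feeding the returned population (or a sample of it) into a new call of `simulate_passage`, which mirrors the scheduling of a new culture described in the record.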
A Phylogenomic Approach Based on PCR Target Enrichment and High Throughput Sequencing: Resolving the Diversity within the South American Species of Bartsia L. (Orobanchaceae)

PubMed Central

Tank, David C.

2016-01-01

Advances in high-throughput sequencing (HTS) have allowed researchers to obtain large amounts of biological sequence information at speeds and costs unimaginable only a decade ago. Phylogenetics, and the study of evolution in general, is quickly migrating towards using HTS to generate larger and more complex molecular datasets. In this paper, we present a method that utilizes microfluidic PCR and HTS to generate large amounts of sequence data suitable for phylogenetic analyses. The approach uses the Fluidigm Access Array System (Fluidigm, San Francisco, CA, USA) and two sets of PCR primers to simultaneously amplify 48 target regions across 48 samples, incorporating sample-specific barcodes and HTS adapters (2,304 unique amplicons per Access Array). The final product is a pooled set of amplicons ready to be sequenced, and thus, there is no need to construct separate, costly genomic libraries for each sample. Further, we present a bioinformatics pipeline to process the raw HTS reads to either generate consensus sequences (with or without ambiguities) for every locus in every sample or—more importantly—recover the separate alleles from heterozygous target regions in each sample. This is important because it adds allelic information that is well suited for coalescent-based phylogenetic analyses that are becoming very common in conservation and evolutionary biology. To test our approach and bioinformatics pipeline, we sequenced 576 samples across 96 target regions belonging to the South American clade of the genus Bartsia L. in the plant family Orobanchaceae. After sequencing cleanup and alignment, the experiment resulted in ~25,300bp across 486 samples for a set of 48 primer pairs targeting the plastome, and ~13,500bp for 363 samples for a set of primers targeting regions in the nuclear genome. Finally, we constructed a combined concatenated matrix from all 96 primer combinations, resulting in a combined aligned length of ~40,500bp for 349 samples. PMID:26828929
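The consensus step of the pipeline described in the record above ("with or without ambiguities") can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes reads for one locus in one sample are already aligned to equal length, uses the standard IUPAC nucleotide ambiguity codes, and treats `min_fraction` (the share of reads a base needs to be reported) as a hypothetical parameter. Recovering separate alleles from heterozygous regions, which the record highlights as the more important output, is not attempted here.

```python
from collections import Counter

# Standard IUPAC nucleotide ambiguity codes, keyed by the set of bases.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def consensus(aligned_reads, min_fraction=0.25, ambiguities=True):
    """Column-wise consensus of equal-length aligned reads."""
    out = []
    for column in zip(*aligned_reads):
        # Count only unambiguous bases; gaps and other symbols are ignored.
        counts = Counter(b for b in (x.upper() for x in column) if b in "ACGT")
        if not counts:
            out.append("N")  # no usable base at this position
            continue
        total = sum(counts.values())
        kept = frozenset(b for b, n in counts.items()
                         if n / total >= min_fraction)
        if ambiguities and kept:
            out.append(IUPAC[kept])
        else:
            out.append(counts.most_common(1)[0][0])  # majority base
    return "".join(out)
```

For example, four reads that carry C, C, C and T at one site yield `Y` at that site when ambiguities are enabled and `C` when they are disabled.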
Biological and Clinical Implications of Clonal Heterogeneity and Clonal Evolution in Multiple Myeloma.

PubMed

Bianchi, Giada; Ghobrial, Irene M

Clonal heterogeneity and clonal evolution have emerged as critical concepts in the field of oncology over the past four decades, largely thanks to the implementation of novel technologies such as comparative genomic hybridization, whole genome/exome sequencing and epigenetic analysis. Along with the identification of cancer stem cells in the majority of neoplasia, the recognition of intertumor and intratumor variability has provided a novel perspective to understand the mechanisms behind tumor evolution and its implication in terms of treatment failure and cancer relapse or recurrence. First hypothesized over two decades ago, clonal heterogeneity and clonal evolution have been confirmed in multiple myeloma (MM), an incurable cancer of plasma cells, almost universally preceded by a pre-malignant condition named monoclonal gammopathy of undetermined significance (MGUS). The genetic events and molecular mechanisms underlying such evolution have been difficult to dissect. Moreover, while a role for the bone marrow microenvironment in supporting MM cell survival, proliferation and drug-resistance has been well established, whether it is directly involved in driving evolution from MGUS to MM is at present unclear. We present in this review a historical excursus on the concepts of clonal heterogeneity and clonal evolution in MM with a special emphasis on their role in the progression from MGUS to MM; the contribution of the microenvironment; and the clinical implications in terms of resistance to treatment and disease relapse/recurrence.
Biological and Clinical Implications of Clonal Heterogeneity and Clonal Evolution in Multiple Myeloma

PubMed Central

Bianchi, Giada; Ghobrial, Irene M.

2015-01-01

Clonal heterogeneity and clonal evolution have emerged as critical concepts in the field of oncology over the past four decades, largely thanks to the implementation of novel technologies such as comparative genomic hybridization, whole genome/exome sequencing and epigenetic analysis. Along with the identification of cancer stem cells in the majority of neoplasia, the recognition of intertumor and intratumor variability has provided a novel perspective to understand the mechanisms behind tumor evolution and its implication in terms of treatment failure and cancer relapse or recurrence. First hypothesized over two decades ago, clonal heterogeneity and clonal evolution have been confirmed in multiple myeloma (MM), an incurable cancer of plasma cells, almost universally preceded by a pre-malignant condition named monoclonal gammopathy of undetermined significance (MGUS). The genetic events and molecular mechanisms underlying such evolution have been difficult to dissect. Moreover, while a role for the bone marrow microenvironment in supporting MM cell survival, proliferation and drug-resistance has been well established, whether it is directly involved in driving evolution from MGUS to MM is at present unclear. We present in this review a historical excursus on the concepts of clonal heterogeneity and clonal evolution in MM with a special emphasis on their role in the progression from MGUS to MM; the contribution of the microenvironment; and the clinical implications in terms of resistance to treatment and disease relapse/recurrence. PMID:25705146
A New Tephrochronology for Early Diverse Stone Tool Technologies and Long-Distance Raw Material Transport in the Middle-Late Pleistocene Kapthurin Formation, East Africa.

NASA Astrophysics Data System (ADS)

Blegen, N.; Jicha, B.

2017-12-01

The Middle to Late Pleistocene (780-10 ka) of East Africa records significant behavioral change, the earliest fossils of Homo sapiens and the dispersals of our species across and out of Africa. Studying human evolution in the Middle to Late Pleistocene thus requires an extensive and precise chronology relating the appearances of various behaviors preserved in archaeological sequences to aspects of hominin biology and evidence of past environments preserved in the fossils and geological sequences. Tephrochronology provides the chronostratigraphic resolution to achieve this through correlation and dating of volcanic ashes. The tephrochronology of the Kapthurin Formation presented here, based on tephra correlations and 40Ar/39Ar dates, provides new ages between 396.3 ± 3.4 ka and 465.3 ± 1.0 ka for nine sites showing some of the earliest evidence of diverse blade and Levallois methods of core reduction. These are >110 kyr older than previously known in East Africa. New 40Ar/39Ar dates provide a refined age of 222.5 ± 0.6 ka for early evidence of long-distance obsidian transport at the Sibilo School Road Site. Long-distance tephra correlation between the Baringo and Lake Victoria basins also provides a new date of 100 ka for the Middle Stone Age site of Keraswanin. By providing new or older dates for 11 sites containing several important aspects of hominin behavior and extending the chronology of the Kapthurin Formation forward by 130,000 years, the tephrochronology presented here contributes one of the longest and most refined chronostratigraphic frameworks relevant to modern human evolution. In conjunction with recent archaeological and paleoenvironmental data, this tephrochronology provides the foundation to understand the process of modern human behavioral evolution through the East African Middle and Late Pleistocene as it relates to biological and paleoenvironmental circumstances.
Plastid Phylogenomic Analyses Resolve Tofieldiaceae as the Root of the Early Diverging Monocot Order Alismatales.

PubMed

Luo, Yang; Ma, Peng-Fei; Li, Hong-Tao; Yang, Jun-Bo; Wang, Hong; Li, De-Zhu

2016-04-06

The predominantly aquatic order Alismatales, which includes approximately 4,500 species within Araceae, Tofieldiaceae, and the core alismatid families, is a key group in investigating the origin and early diversification of monocots. Despite their importance, phylogenetic ambiguity regarding the root of the Alismatales tree precludes answering questions about the early evolution of the order. Here, we sequenced the first complete plastid genomes from three key families in this order: Potamogeton perfoliatus (Potamogetonaceae), Sagittaria lichuanensis (Alismataceae), and Tofieldia thibetica (Tofieldiaceae). Each family possesses the typical quadripartite structure, with plastid genome sizes of 156,226, 179,007, and 155,512 bp, respectively. Among them, the plastid genome of S. lichuanensis is the largest in monocots and the second largest in angiosperms. Like other sequenced Alismatales plastid genomes, all three families generally encode the same 113 genes with similar structure and arrangement. However, we detected 2.4 and 6 kb inversions in the plastid genomes of Sagittaria and Potamogeton, respectively. Further, we assembled a 79 plastid protein-coding gene sequence data matrix of 22 taxa that included the three newly generated plastid genomes plus 19 previously reported ones, which together represent all primary lineages of monocots and outgroups. In plastid phylogenomic analyses using maximum likelihood and Bayesian inference, we show both strong support for Acorales as sister to the remaining monocots and monophyly of Alismatales. More importantly, Tofieldiaceae was resolved as the most basal lineage within Alismatales. These results provide new insights into the evolution of Alismatales as well as the early-diverging monocots as a whole. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
The Genomes of Two Bat Species with Long Constant Frequency Echolocation Calls.

PubMed

Dong, Dong; Lei, Ming; Hua, Panyu; Pan, Yi-Hsuan; Mu, Shuo; Zheng, Guantao; Pang, Erli; Lin, Kui; Zhang, Shuyi

2017-01-01

Bats can perceive the world by using a wide range of sensory systems, and some of the systems have become highly specialized, such as auditory sensory perception. Among bat species, the Old World leaf-nosed bats and horseshoe bats (rhinolophoid bats) possess the most sophisticated echolocation systems. Here, we reported the whole-genome sequencing and de novo assemblies of two rhinolophoid bats: the great leaf-nosed bat (Hipposideros armiger) and the Chinese rufous horseshoe bat (Rhinolophus sinicus). Comparative genomic analyses revealed the adaptation of auditory sensory perception in the rhinolophoid bat lineages, probably resulting from the extreme selectivity used in the auditory processing by these bats. Pseudogenization of some vision-related genes in rhinolophoid bats was observed, suggesting that these genes have undergone relaxed natural selection. An extensive contraction of olfactory receptor gene repertoires was observed in the lineage leading to the common ancestor of bats. Further extensive gene contractions can be observed in the branch leading to the rhinolophoid bats. Such concordance suggested that molecular changes at one sensory gene might have direct consequences for genes controlling for other sensory modalities. To characterize the population genetic structure and patterns of evolution, we re-sequenced the genome of 20 great leaf-nosed bats from four different geographical locations of China. The result showed similar sequence diversity values and little differentiation among populations. Moreover, evidence of genetic adaptations to high altitudes in the great leaf-nosed bats was observed. Taken together, our work provided a useful resource for future research on the evolution of bats. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved.
Comparative Genomics of the Dual-Obligate Symbionts from the Treehopper, Entylia carinata (Hemiptera: Membracidae), Provide Insight into the Origins and Evolution of an Ancient Symbiosis.

PubMed

Mao, Meng; Yang, Xiushuai; Poff, Kirsten; Bennett, Gordon

2017-06-01

Insect species in the Auchenorrhyncha suborder (Hemiptera) maintain ancient obligate symbioses with bacteria that provide essential amino acids (EAAs) deficient in their plant-sap diets. Molecular studies have revealed that two complementary symbiont lineages, "Candidatus Sulcia muelleri" and a betaproteobacterium ("Ca. Zinderia insecticola" in spittlebugs [Cercopoidea] and "Ca. Nasuia deltocephalinicola" in leafhoppers [Cicadellidae]) may have persisted in the suborder since its origin ∼300 Ma. However, investigation of how this pair has co-evolved on a genomic level is limited to only a few host lineages. We sequenced the complete genomes of Sulcia and a betaproteobacterium from the treehopper, Entylia carinata (Membracidae: ENCA), as the first representative from this species-rich group. It also offers the opportunity to compare symbiont evolution across a major insect group, the Membracoidea (leafhoppers + treehoppers). Genomic analyses show that the betaproteobacterium in ENCA is a member of the Nasuia lineage. Both symbionts have larger genomes (Sulcia = 218 kb and Nasuia = 144 kb) than related lineages in Deltocephalinae leafhoppers, retaining genes involved in basic cellular functions and information processing. Nasuia-ENCA further exhibits few unique gene losses, suggesting that its parent lineage in the common ancestor to the Membracoidea was already highly reduced. Sulcia-ENCA has lost the abilities to synthesize menaquinone cofactor and to complete the synthesis of the branched-chain EAAs. Both capabilities are conserved in other Sulcia lineages sequenced from across the Auchenorrhyncha. Finally, metagenomic sequencing recovered the partial genome of an Arsenophonus symbiont, although it infects only 20% of individuals, indicating a facultative role. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Evolution of Vertebrate Phototransduction: Cascade Activation.

PubMed

Lamb, Trevor D; Patel, Hardip; Chuah, Aaron; Natoli, Riccardo C; Davies, Wayne I L; Hart, Nathan S; Collin, Shaun P; Hunt, David M

2016-08-01

We applied high-throughput sequencing to eye tissue from several species of basal vertebrates (a hagfish, two species of lamprey, and five species of gnathostome fish), and we analyzed the mRNA sequences for the proteins underlying activation of the phototransduction cascade. The molecular phylogenies that we constructed from these sequences are consistent with the 2R WGD model of two rounds of whole genome duplication. Our analysis suggests that agnathans retain an additional representative (that has been lost in gnathostomes) in each of the gene families we studied; the evidence is strong for the G-protein α subunit (GNAT) and the cGMP phosphodiesterase (PDE6), and indicative for the cyclic nucleotide-gated channels (CNGA and CNGB). Two of the species (the hagfish Eptatretus cirrhatus and the lamprey Mordacia mordax) possess only a single class of photoreceptor, simplifying deductions about the composition of cascade protein isoforms utilized in their photoreceptors. For the other lamprey, Geotria australis, analysis of the ratios of transcript levels in downstream and upstream migrant animals permits tentative conclusions to be drawn about the isoforms used in four of the five spectral classes of photoreceptor. Overall, our results suggest that agnathan rod-like photoreceptors utilize the same GNAT1 as gnathostomes, together with a homodimeric PDE6 that may be agnathan-specific, whereas agnathan cone-like photoreceptors utilize a GNAT that may be agnathan-specific, together with the same PDE6C as gnathostomes. These findings help elucidate the evolution of the vertebrate phototransduction cascade from an ancestral chordate phototransduction cascade that existed prior to the vertebrate radiation. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Elephant Transcriptome Provides Insights into the Evolution of Eutherian Placentation

PubMed Central

Hou, Zhuo-Cheng; Sterner, Kirstin N.; Romero, Roberto; Than, Nandor Gabor; Gonzalez, Juan M.; Weckle, Amy; Xing, Jun; Benirschke, Kurt; Goodman, Morris; Wildman, Derek E.

2012-01-01

The chorioallantoic placenta connects mother and fetus in eutherian pregnancies. In order to understand the evolution of the placenta and provide further understanding of placenta biology, we sequenced the transcriptome of a term placenta of an African elephant (Loxodonta africana) and compared these data with RNA sequence and microarray data from other eutherian placentas including human, mouse, and cow. We characterized the composition of 55,910 expressed sequence tag (i.e., cDNA) contigs using our custom annotation pipeline. A Markov algorithm was used to cluster orthologs of human, mouse, cow, and elephant placenta transcripts. We found 2,963 genes are commonly expressed in the placentas of these eutherian mammals. Gene ontology categories previously suggested to be important for placenta function (e.g., estrogen receptor signaling pathway, cell motion and migration, and adherens junctions) were significantly enriched in these eutherian placenta–expressed genes. Genes duplicated in different lineages and also specifically expressed in the placenta contribute to the great diversity observed in mammalian placenta anatomy. We identified 1,365 human lineage–specific, 1,235 mouse lineage–specific, 436 cow lineage–specific, and 904 elephant-specific placenta-expressed (PE) genes. The most enriched clusters of human-specific PE genes are signal/glycoprotein and immunoglobulin, and humans possess a deeply invasive human hemochorial placenta that comes into direct contact with maternal immune cells. Inference of phylogenetically conserved and derived transcripts demonstrates the power of comparative transcriptomics to trace placenta evolution and variation across mammals and identified candidate genes that may be important in the normal function of the human placenta, and their dysfunction may be related to human pregnancy complications. PMID:22546564

Principles of long noncoding RNA evolution derived from direct comparison of transcriptomes in 17 species.

PubMed

Hezroni, Hadas; Koppstein, David; Schwartz, Matthew G; Avrutin, Alexandra; Bartel, David P; Ulitsky, Igor

2015-05-19

The inability to predict long noncoding RNAs from genomic sequence has impeded the use of comparative genomics for studying their biology. Here, we develop methods that use RNA sequencing (RNA-seq) data to annotate the transcriptomes of 16 vertebrates and the echinoid sea urchin, uncovering thousands of previously unannotated genes, most of which produce long intervening noncoding RNAs (lincRNAs). Although in each species, >70% of lincRNAs cannot be traced to homologs in species that diverged >50 million years ago, thousands of human lincRNAs have homologs with similar expression patterns in other species. These homologs share short, 5'-biased patches of sequence conservation nested in exonic architectures that have been extensively rewired, in part by transposable element exonization. Thus, over a thousand human lincRNAs are likely to have conserved functions in mammals, and hundreds beyond mammals, but those functions require only short patches of specific sequences and can tolerate major changes in gene architecture. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
A database of annotated tentative orthologs from crop abiotic stress transcripts.

PubMed

Balaji, Jayashree; Crouch, Jonathan H; Petite, Prasad V N S; Hoisington, David A

2006-10-07

A minimal requirement to initiate a comparative genomics study on plant responses to abiotic stresses is a dataset of orthologous sequences. The availability of a large amount of sequence information, including that derived from stress cDNA libraries, allows for the identification of stress-related genes and orthologs associated with the stress response. Orthologous sequences serve as tools to explore genes and their relationships across species. For this purpose, ESTs from stress cDNA libraries across 16 crop species including 6 important cereal crops and 10 dicots were systematically collated and subjected to bioinformatics analysis such as clustering, grouping of tentative orthologous sets, identification of protein motifs/patterns in the predicted protein sequence, and annotation with stress conditions, tissue/library source and putative function. All data are available to the scientific community at http://intranet.icrisat.org/gt1/tog/homepage.htm. We believe that the availability of annotated plant abiotic stress ortholog sets will be a valuable resource for researchers studying the biology of environmental stresses in plant systems, molecular evolution and genomics.
Multidimensional Time-Resolved Spectroscopy of Vibrational Coherence in Biopolyenes

NASA Astrophysics Data System (ADS)

Buckup, Tiago; Motzkus, Marcus

2014-04-01

Multidimensional femtosecond time-resolved vibrational coherence spectroscopy allows one to investigate the evolution of vibrational coherence in electronic excited states. Methods such as pump-degenerate four-wave mixing and pump-impulsive vibrational spectroscopy combine an initial ultrashort laser pulse with a nonlinear probing sequence to reinduce vibrational coherence exclusively in the excited states. By carefully exploiting specific electronic resonances, one can detect vibrational coherence from 0 cm⁻¹ to over 2,000 cm⁻¹ and map its evolution. This review focuses on the observation and mapping of high-frequency vibrational coherence for all-trans biological polyenes such as β-carotene, lycopene, retinal, and retinal Schiff base. We discuss the role of molecular symmetry in vibrational coherence activity in the S1 electronic state and the interplay of coupling between electronic states and vibrational coherence.
Practices and Perspectives of College Instructors on Addressing Religious Beliefs When Teaching Evolution

ERIC Educational Resources Information Center

Barnes, M. Elizabeth; Brownell, Sara E.

2016-01-01

Evolution is a core concept of biology, and yet many college biology students do not accept evolution because of their religious beliefs. However, we do not currently know how instructors perceive their role in helping students accept evolution or how they address the perceived conflict between religion and evolution when they teach evolution.…
Evolution across the Curriculum: Microbiology

PubMed Central

Burmeister, Alita R.; Smith, James J.

2016-01-01

An integrated understanding of microbiology and evolutionary biology is essential for students pursuing careers in microbiology and healthcare fields. In this Perspective, we discuss the usefulness of evolutionary concepts and an overall evolutionary framework for students enrolled in microbiology courses. Further, we propose a set of learning goals for students studying microbial evolution concepts. We then describe some barriers to microbial evolution teaching and learning and encourage the continued incorporation of evidence-based teaching practices into microbiology courses at all levels. Next, we review the current status of microbial evolution assessment tools and describe some education resources available for teaching microbial evolution. Successful microbial evolution education will require that evolution be taught across the undergraduate biology curriculum, with a continued focus on applications and applied careers, while aligning with national biology education reform initiatives. Journal of Microbiology & Biology Education PMID:27158306
Computation of repetitions and regularities of biologically weighted sequences.

PubMed

Christodoulakis, M; Iliopoulos, C; Mouchard, L; Perdikuri, K; Tsakalidis, A; Tsichlas, K

2006-01-01

Biological weighted sequences are used extensively in molecular biology as profiles for protein families, in the representation of binding sites and often for the representation of sequences produced by a shotgun sequencing strategy. In this paper, we address three fundamental problems in the area of biologically weighted sequences: (i) computation of repetitions, (ii) pattern matching, and (iii) computation of regularities. Our algorithms can be used as basic building blocks for more sophisticated algorithms applied on weighted sequences.
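As a rough illustration of the pattern-matching problem named in the abstract above, the following sketch treats a weighted sequence as a list of per-position symbol probabilities and reports where a pattern occurs with probability at or above a threshold. It is a naive quadratic scan, not the authors' algorithm; the example sequence, the function names, and the threshold convention are all assumptions made for illustration.

```python
from math import prod

# Hypothetical weighted sequence: one dict of symbol probabilities per position.
weighted_seq = [
    {"A": 1.0},
    {"C": 0.5, "G": 0.5},
    {"A": 1.0},
    {"C": 0.5, "G": 0.5},
    {"T": 1.0},
]

def occurrence_probability(seq, pattern, start):
    """Probability that `pattern` occurs at `start`, assuming independent positions."""
    return prod(seq[start + j].get(ch, 0.0) for j, ch in enumerate(pattern))

def match_positions(seq, pattern, threshold):
    """Naive scan: all start positions where the pattern's probability >= threshold."""
    last_start = len(seq) - len(pattern)
    return [
        i for i in range(last_start + 1)
        if occurrence_probability(seq, pattern, i) >= threshold
    ]
```

With this toy input, the pattern "AC" has probability 0.5 at positions 0 and 2, so a threshold of 0.25 would report both; a repetition in this setting is a pattern that clears the threshold at consecutive offsets.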
Practices and Perspectives of College Instructors on Addressing Religious Beliefs When Teaching Evolution.

PubMed

Barnes, M Elizabeth; Brownell, Sara E

2016-01-01

Evolution is a core concept of biology, and yet many college biology students do not accept evolution because of their religious beliefs. However, we do not currently know how instructors perceive their role in helping students accept evolution or how they address the perceived conflict between religion and evolution when they teach evolution. This study explores instructor practices and beliefs related to mitigating students' perceived conflict between religion and evolution. Interviews with 32 instructors revealed that many instructors do not believe it is their goal to help students accept evolution and that most instructors do not address the perceived conflict between religion and evolution. Instructors cited many barriers to discussing religion in the context of evolution in their classes, most notably the instructors' own personal beliefs that religion and evolution may be incompatible. These data are exploratory and are intended to stimulate a series of questions about how we as college biology instructors teach evolution. © 2016 M. E. Barnes and S. E. Brownell. CBE—Life Sciences Education © 2016 The American Society for Cell Biology. This article is distributed by The American Society for Cell Biology under license from the author(s). It is available to the public under an Attribution–Noncommercial–Share Alike 3.0 Unported Creative Commons License (http://creativecommons.org/licenses/by-nc-sa/3.0).
Ludwig von Bertalanffy's organismic view on the theory of evolution.

PubMed

Drack, Manfred

2015-03-01

Ludwig von Bertalanffy was a key figure in the advancement of theoretical biology. His early considerations already led him to recognize the necessity of considering the organism as a system, as an organization of parts and processes. He termed the resulting research program organismic biology, which he extended to all basic questions of biology and almost all areas of biology, hence also to the theory of evolution. This article begins by outlining the rather unknown (because often written in German) research of Bertalanffy in the field of theoretical biology. The basics of the organismic approach are then described. This is followed by Bertalanffy's considerations on the theory of evolution, in which he used methods from theoretical biology and then introduced his own, organismic, view on evolution, leading to the demand for finding laws of evolution. Finally, his view on the concept of homology is presented. © 2015 Wiley Periodicals, Inc.
CFGP: a web-based, comparative fungal genomics platform.

PubMed

Park, Jongsun; Park, Bongsoo; Jung, Kyongyong; Jang, Suwang; Yu, Kwangyul; Choi, Jaeyoung; Kong, Sunghyung; Park, Jaejin; Kim, Seryun; Kim, Hyojeong; Kim, Soonok; Kim, Jihyun F; Blair, Jaime E; Lee, Kwangwon; Kang, Seogchan; Lee, Yong-Hwan

2008-01-01

Since the completion of the Saccharomyces cerevisiae genome sequencing project in 1996, the genomes of over 80 fungal species have been sequenced or are currently being sequenced. Resulting data provide opportunities for studying and comparing fungal biology and evolution at the genome level. To support such studies, the Comparative Fungal Genomics Platform (CFGP; http://cfgp.snu.ac.kr), a web-based multifunctional informatics workbench, was developed. The CFGP comprises three layers, including the basal layer, middleware and the user interface. The data warehouse in the basal layer contains standardized genome sequences of 65 fungal species. The middleware processes queries via six analysis tools, including BLAST, ClustalW, InterProScan, SignalP 3.0, PSORT II and a newly developed tool named BLASTMatrix. The BLASTMatrix permits the identification and visualization of genes homologous to a query across multiple species. The Data-driven User Interface (DUI) of the CFGP was built on a new concept of pre-collecting data and post-executing analysis instead of the 'fill-in-the-form-and-press-SUBMIT' user interfaces utilized by most bioinformatics sites. A tool termed Favorite, which supports the management of encapsulated sequence data and provides a personalized data repository to users, is another novel feature in the DUI.
Genome-wide selection components analysis in a fish with male pregnancy.

PubMed

Flanagan, Sarah P; Jones, Adam G

2017-04-01

A major goal of evolutionary biology is to identify the genome-level targets of natural and sexual selection. With the advent of next-generation sequencing, whole-genome selection components analysis provides a promising avenue in the search for loci affected by selection in nature. Here, we implement a genome-wide selection components analysis in the sex role reversed Gulf pipefish, Syngnathus scovelli. Our approach involves a double-digest restriction-site associated DNA sequencing (ddRAD-seq) technique, applied to adult females, nonpregnant males, pregnant males, and their offspring. An FST comparison of allele frequencies among these groups reveals 47 genomic regions putatively experiencing sexual selection, as well as 468 regions showing a signature of differential viability selection between males and females. A complementary likelihood ratio test identifies similar patterns in the data as the FST analysis. Sexual selection and viability selection both tend to favor the rare alleles in the population. Ultimately, we conclude that genome-wide selection components analysis can be a useful tool to complement other approaches in the effort to pinpoint genome-level targets of selection in the wild. © 2017 The Author(s). Evolution © 2017 The Society for the Study of Evolution.
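To make the allele-frequency comparison in the abstract above concrete, here is a minimal sketch of the textbook heterozygosity-based FST for one biallelic locus and two equally weighted groups (for example, adult females versus nonpregnant males). The abstract does not state which estimator the authors used, so the formula choice, the function name, and the equal-weighting assumption are illustrative only.

```python
def fst_two_groups(p1: float, p2: float) -> float:
    """Wright's FST = (H_T - H_S) / H_T for one biallelic locus.

    p1, p2: frequency of the same allele in each group (equal weights assumed).
    """
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)          # expected heterozygosity, pooled
    if h_total == 0:
        return 0.0                             # locus is fixed in both groups
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-group
    return (h_total - h_within) / h_total
```

Identical frequencies give 0, and groups fixed for different alleles give 1; loci with unusually high values between, say, parents and offspring would be the candidates flagged by a selection components scan.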
Deep Investigation of Arabidopsis thaliana Junk DNA Reveals a Continuum between Repetitive Elements and Genomic Dark Matter

PubMed Central

Maumus, Florian; Quesneville, Hadi

2014-01-01

Eukaryotic genomes contain highly variable amounts of DNA with no apparent function. This so-called junk DNA is composed of two components: repeated and repeat-derived sequences (together referred to as the repeatome), and non-annotated sequences also known as genomic dark matter. Because of their high duplication rates as compared to other genomic features, transposable elements are predominant contributors to the repeatome, and the products of their decay are thought to be a major source of genomic dark matter. Determining the origin and composition of junk DNA is thus important to help understand genome evolution as well as host biology. In this study, we have used a combination of tools to show that the repeatome from the small and reducing A. thaliana genome is significantly larger than previously thought. Furthermore, we present the concepts and results from a series of innovative approaches suggesting that a significant amount of the A. thaliana dark matter is of repetitive origin. As a tentative standard for the community, we propose a deep compendium annotation of the A. thaliana repeatome that may help to further address genome evolution as well as transcriptional and epigenetic regulation in this model plant. PMID:24709859
Evolutionary versatility of eukaryotic protein domains revealed by their bigram networks

PubMed Central

2011-01-01

Background Protein domains are globular structures of independently folded polypeptides that exert catalytic or binding activities. Their sequences are recognized as evolutionary units that, through genome recombination, constitute protein repertoires of linkage patterns. Via mutations, domains acquire modified functions that contribute to the fitness of cells and organisms. Recent studies have addressed the evolutionary selection that may have shaped the functions of individual domains and the emergence of particular domain combinations, which led to new cellular functions in multi-cellular animals. This study focuses on modeling domain linkage globally and investigates evolutionary implications that may be revealed by novel computational analysis. Results A survey of 77 completely sequenced eukaryotic genomes implies a potential hierarchical and modular organization of biological functions in most living organisms. Domains in a genome or multiple genomes are modeled as a network of hetero-duplex covalent linkages, termed bigrams. A novel computational technique is introduced to decompose such networks, whereby the notion of domain "networking versatility" is derived and measured. The most and least "versatile" domains (termed "core domains" and "peripheral domains" respectively) are examined both computationally via sequence conservation measures and experimentally using selected domains. Our study suggests that such a versatility measure extracted from the bigram networks correlates with the adaptivity of domains during evolution, where the network core domains are highly adaptive, significantly contrasting the network peripheral domains. Conclusions Domain recombination has played a major part in the evolution of eukaryotes, contributing to genome complexity. From a system point of view, as the results of selection and constant refinement, networks of domain linkage are structured in a hierarchical modular fashion. Domains with high degree of networking versatility appear to be evolutionarily adaptive, potentially through functional innovations. Domain bigram networks are informative as a model of biological functions. The networking versatility indices extracted from such networks for individual domains reflect the strength of evolutionary selection that the domains have experienced. PMID:21849086
Evolutionary versatility of eukaryotic protein domains revealed by their bigram networks.

PubMed

Xie, Xueying; Jin, Jing; Mao, Yongyi

2011-08-18

Protein domains are globular structures of independently folded polypeptides that exert catalytic or binding activities. Their sequences are recognized as evolutionary units that, through genome recombination, constitute protein repertoires of linkage patterns. Via mutations, domains acquire modified functions that contribute to the fitness of cells and organisms. Recent studies have addressed the evolutionary selection that may have shaped the functions of individual domains and the emergence of particular domain combinations, which led to new cellular functions in multi-cellular animals. This study focuses on modeling domain linkage globally and investigates evolutionary implications that may be revealed by novel computational analysis. A survey of 77 completely sequenced eukaryotic genomes implies a potential hierarchical and modular organization of biological functions in most living organisms. Domains in a genome or multiple genomes are modeled as a network of hetero-duplex covalent linkages, termed bigrams. A novel computational technique is introduced to decompose such networks, whereby the notion of domain "networking versatility" is derived and measured. The most and least "versatile" domains (termed "core domains" and "peripheral domains" respectively) are examined both computationally via sequence conservation measures and experimentally using selected domains. Our study suggests that such a versatility measure extracted from the bigram networks correlates with the adaptivity of domains during evolution, where the network core domains are highly adaptive, significantly contrasting the network peripheral domains. Domain recombination has played a major part in the evolution of eukaryotes, contributing to genome complexity. From a system point of view, as the results of selection and constant refinement, networks of domain linkage are structured in a hierarchical modular fashion. Domains with high degree of networking versatility appear to be evolutionarily adaptive, potentially through functional innovations. Domain bigram networks are informative as a model of biological functions. The networking versatility indices extracted from such networks for individual domains reflect the strength of evolutionary selection that the domains have experienced.
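The bigram network described in the abstract above can be pictured with a small sketch. It assumes each protein is given as an ordered list of domain names and reads "hetero-duplex" as a pair of adjacent, different domains; both the reading and the toy proteins are assumptions. The neighbor count computed at the end is only a plain graph degree, not the paper's networking versatility index, whose decomposition technique the abstract does not specify.

```python
from collections import defaultdict

# Hypothetical domain architectures (N- to C-terminal order).
proteins = {
    "protA": ["SH3", "SH2", "Kinase"],
    "protB": ["SH2", "Kinase"],
    "protC": ["PH", "SH2", "SH2"],
}

def domain_bigrams(architecture):
    """Adjacent pairs of different domains in one protein."""
    return [(a, b) for a, b in zip(architecture, architecture[1:]) if a != b]

def build_network(proteins):
    """Undirected network: domain -> set of domains it is directly linked to."""
    neighbors = defaultdict(set)
    for architecture in proteins.values():
        for a, b in domain_bigrams(architecture):
            neighbors[a].add(b)
            neighbors[b].add(a)
    return neighbors

network = build_network(proteins)
degree = {domain: len(linked) for domain, linked in network.items()}
```

In this toy network SH2 is linked to three distinct partners while the others have one each, which is the kind of contrast between well-connected and peripheral domains that a versatility measure would aim to quantify.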
Horizontal Transfer of Non-LTR Retrotransposons from Arthropods to Flowering Plants.

PubMed

Gao, Dongying; Chu, Ye; Xia, Han; Xu, Chunming; Heyduk, Karolina; Abernathy, Brian; Ozias-Akins, Peggy; Leebens-Mack, James H; Jackson, Scott A

2018-02-01

Even though lateral movements of transposons across families and even phyla within multicellular eukaryotic kingdoms have been found, little is known about transposon transfer between the kingdoms Animalia and Plantae. We discovered a novel non-LTR retrotransposon, AdLINE3, in a wild peanut species. Sequence comparisons and phylogenetic analyses indicated that AdLINE3 is a member of the RTE clade, originally identified in a nematode and rarely reported in plants. We identified RTE elements in 82 plants, spanning angiosperms to algae, including recently active elements in some flowering plants. RTE elements in flowering plants were likely derived from a single family we refer to as An-RTE. Interestingly, An-RTEs show significant DNA sequence identity with non-LTR retroelements from 42 animals belonging to four phyla. Moreover, the sequence identity of RTEs between two arthropods and two plants was higher than that of homologous genes. Phylogenetic and evolutionary analyses of RTEs from both animals and plants suggest that the An-RTE family was likely transferred horizontally into angiosperms from an ancient aphid(s) or ancestral arthropod(s). Notably, some An-RTEs were recruited as coding sequences of functional genes participating in metabolic or other biochemical processes in plants. This is the first potential example of horizontal transfer of transposons between animals and flowering plants. Our findings help to understand exchanges of genetic material between the kingdom Animalia and Plantae and suggest arthropods likely impacted on plant genome evolution. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Sequencing of the sea lamprey (Petromyzon marinus) genome provides insights into vertebrate evolution

PubMed Central

Smith, Jeramiah J; Kuraku, Shigehiro; Holt, Carson; Sauka-Spengler, Tatjana; Jiang, Ning; Campbell, Michael S; Yandell, Mark D; Manousaki, Tereza; Meyer, Axel; Bloom, Ona E; Morgan, Jennifer R; Buxbaum, Joseph D; Sachidanandam, Ravi; Sims, Carrie; Garruss, Alexander S; Cook, Malcolm; Krumlauf, Robb; Wiedemann, Leanne M; Sower, Stacia A; Decatur, Wayne A; Hall, Jeffrey A; Amemiya, Chris T; Saha, Nil R; Buckley, Katherine M; Rast, Jonathan P; Das, Sabyasachi; Hirano, Masayuki; McCurley, Nathanael; Guo, Peng; Rohner, Nicolas; Tabin, Clifford J; Piccinelli, Paul; Elgar, Greg; Ruffier, Magali; Aken, Bronwen L; Searle, Stephen MJ; Muffato, Matthieu; Pignatelli, Miguel; Herrero, Javier; Jones, Matthew; Brown, C Titus; Chung-Davidson, Yu-Wen; Nanlohy, Kaben G; Libants, Scot V; Yeh, Chu-Yin; McCauley, David W; Langeland, James A; Pancer, Zeev; Fritzsch, Bernd; de Jong, Pieter J; Zhu, Baoli; Fulton, Lucinda L; Theising, Brenda; Flicek, Paul; Bronner, Marianne E; Warren, Wesley C; Clifton, Sandra W; Wilson, Richard K; Li, Weiming

2013-01-01

Lampreys are representatives of an ancient vertebrate lineage that diverged from our own ~500 million years ago. By virtue of this deeply shared ancestry, the sea lamprey (P. marinus) genome is uniquely poised to provide insight into the ancestry of vertebrate genomes and the underlying principles of vertebrate biology. Here, we present the first lamprey whole-genome sequence and assembly. We note challenges faced owing to its high content of repetitive elements and GC bases, as well as the absence of broad-scale sequence information from closely related species. Analyses of the assembly indicate that two whole-genome duplications likely occurred before the divergence of ancestral lamprey and gnathostome lineages. Moreover, the results help define key evolutionary events within vertebrate lineages, including the origin of myelin-associated proteins and the development of appendages. The lamprey genome provides an important resource for reconstructing vertebrate origins and the evolutionary events that have shaped the genomes of extant organisms. PMID:23435085
Exploring bacterial epigenomics in the next-generation sequencing era: a new approach for an emerging frontier.

PubMed

Chen, Poyin; Jeannotte, Richard; Weimer, Bart C

2014-05-01

Epigenetics has an important role for the success of foodborne pathogen persistence in diverse host niches. Substantial challenges exist in determining DNA methylation to situation-specific phenotypic traits. DNA modification, mediated by restriction-modification systems, functions as an immune response against antagonistic external DNA, and bacteriophage-acquired methyltransferases (MTase) and orphan MTases - those lacking the cognate restriction endonuclease - facilitate evolution of new phenotypes via gene expression modulation via DNA and RNA modifications, including methylation and phosphorothioation. Recent establishment of large-scale genome sequencing projects will result in a significant increase in genome availability that will lead to new demands for data analysis including new predictive bioinformatics approaches that can be verified with traditional scientific rigor. Sequencing technologies that detect modification coupled with mass spectrometry to discover new adducts is a powerful tactic to study bacterial epigenetics, which is poised to make novel and far-reaching discoveries that link biological significance and the bacterial epigenome. Copyright © 2014 Elsevier Ltd. All rights reserved.
Quantitative analysis of RNA-protein interactions on a massively parallel array for mapping biophysical and evolutionary landscapes

PubMed Central

Buenrostro, Jason D.; Chircus, Lauren M.; Araya, Carlos L.; Layton, Curtis J.; Chang, Howard Y.; Snyder, Michael P.; Greenleaf, William J.

2015-01-01

RNA-protein interactions drive fundamental biological processes and are targets for molecular engineering, yet quantitative and comprehensive understanding of the sequence determinants of affinity remains limited. Here we repurpose a high-throughput sequencing instrument to quantitatively measure binding and dissociation of MS2 coat protein to >10^7 RNA targets generated on a flow-cell surface by in situ transcription and inter-molecular tethering of RNA to DNA. We decompose the binding energy contributions from primary and secondary RNA structure, finding that differences in affinity are often driven by sequence-specific changes in association rates. By analyzing the biophysical constraints and modeling mutational paths describing the molecular evolution of MS2 from low- to high-affinity hairpins, we quantify widespread molecular epistasis, and a long-hypothesized structure-dependent preference for G:U base pairs over C:A intermediates in evolutionary trajectories. Our results suggest that quantitative analysis of RNA on a massively parallel array (RNAMaP) relationships across molecular variants. PMID:24727714
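The link between rates and affinity mentioned in this abstract can be illustrated with the standard two-state binding relations, K_d = k_off / k_on and ΔΔG = RT ln(K_d,variant / K_d,reference). The sketch below is only an illustration of those textbook relations, not the authors' analysis; the function names and all rate values are hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def dissociation_constant(k_on, k_off):
    """Equilibrium dissociation constant (M) from k_on (1/(M*s)) and k_off (1/s)."""
    return k_off / k_on


def ddg_kcal(kd_variant, kd_reference, temperature_k=298.15):
    """Binding free-energy change relative to a reference, RT * ln(Kd_var / Kd_ref)."""
    return R_KCAL * temperature_k * math.log(kd_variant / kd_reference)


# Hypothetical rates: the variant differs from the reference only in k_on,
# so the entire affinity difference comes from the association rate.
kd_ref = dissociation_constant(k_on=1e6, k_off=1e-3)
kd_var = dissociation_constant(k_on=1e5, k_off=1e-3)
print(f"Kd reference = {kd_ref:.2e} M, Kd variant = {kd_var:.2e} M")
print(f"ddG = {ddg_kcal(kd_var, kd_ref):.2f} kcal/mol")
```

A tenfold drop in the association rate at constant dissociation rate raises K_d tenfold, which is the kind of affinity change the abstract attributes to association rates.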
Chromosome-scale assembly of the Monopterus genome.

PubMed

Zhao, Xueya; Luo, Majing; Li, Zhigang; Zhong, Pei; Cheng, Yibin; Lai, Fengling; Wang, Xin; Min, Jiumeng; Bai, Mingzhou; Yang, Yulan; Cheng, Hanhua; Zhou, Rongjia

2018-05-01

The teleost fish Monopterus albus is emerging as a new model for biological studies due to its natural sex transition and small genome, in addition to its enormous economic and potential medical value. However, no genomic information for the Monopterus is currently available. Here, we sequenced and de novo assembled the genome of M. albus and report the de novo chromosome assembly by FISH walking assisted by conserved synteny (Cafs). Using Cafs, 328 scaffolds were assembled into 12 chromosomes, which covered genomic sequences of 555 Mb, accounting for 81.3% of the sequences assembled in scaffolds (∼689 Mb). A total of 18,660 genes were mapped on the chromosomes and showed a nonrandom distribution along chromosomes. We report the first reference genome of the Monopterus and provide an efficient Cafs strategy for a de novo chromosome-level assembly of the Monopterus genome, which provides a valuable resource, not only for further studies in genetics, evolution, and development, particularly sex determination, but also for breed improvement of the species.
The genotype-phenotype map of an evolving digital organism.

PubMed

Fortuna, Miguel A; Zaman, Luis; Ofria, Charles; Wagner, Andreas

2017-02-01

To understand how evolving systems bring forth novel and useful phenotypes, it is essential to understand the relationship between genotypic and phenotypic change. Artificial evolving systems can help us understand whether the genotype-phenotype maps of natural evolving systems are highly unusual, and it may help create evolvable artificial systems. Here we characterize the genotype-phenotype map of digital organisms in Avida, a platform for digital evolution. We consider digital organisms from a vast space of 10^141 genotypes (instruction sequences), which can form 512 different phenotypes. These phenotypes are distinguished by different Boolean logic functions they can compute, as well as by the complexity of these functions. We observe several properties with parallels in natural systems, such as connected genotype networks and asymmetric phenotypic transitions. The likely common cause is robustness to genotypic change. We describe an intriguing tension between phenotypic complexity and evolvability that may have implications for biological evolution. On the one hand, genotypic change is more likely to yield novel phenotypes in more complex organisms. On the other hand, the total number of novel phenotypes reachable through genotypic change is highest for organisms with simple phenotypes. Artificial evolving systems can help us study aspects of biological evolvability that are not accessible in vastly more complex natural systems. They can also help identify properties, such as robustness, that are required for both human-designed artificial systems and synthetic biological systems to be evolvable.
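The abstract gives the sizes of the genotype and phenotype spaces without saying how they arise. One reading, assumed here and not stated in the abstract, is that genotypes are sequences of 100 instructions drawn from a 26-instruction set and that a phenotype is the subset of nine Boolean logic functions an organism computes. Under those assumed parameters the arithmetic can be checked as follows.

```python
import math

# All three values are assumptions for illustration; the abstract reports only
# the resulting orders of magnitude, not these parameters.
ALPHABET_SIZE = 26   # assumed number of distinct instructions
GENOME_LENGTH = 100  # assumed number of instructions per genotype
N_FUNCTIONS = 9      # assumed number of Boolean logic functions scored

# Every position can hold any instruction independently.
n_genotypes = ALPHABET_SIZE ** GENOME_LENGTH
# Each function is either computed or not, so phenotypes are subsets.
n_phenotypes = 2 ** N_FUNCTIONS

print(f"genotype space is about 10^{math.log10(n_genotypes):.1f}")
print(f"phenotype space has {n_phenotypes} members")
```

If these assumptions hold, the genotype count lands between 10^141 and 10^142 and the phenotype count is exactly 512, matching the figures quoted above.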
The genotype-phenotype map of an evolving digital organism

PubMed Central

Zaman, Luis; Wagner, Andreas

2017-01-01

To understand how evolving systems bring forth novel and useful phenotypes, it is essential to understand the relationship between genotypic and phenotypic change. Artificial evolving systems can help us understand whether the genotype-phenotype maps of natural evolving systems are highly unusual, and it may help create evolvable artificial systems. Here we characterize the genotype-phenotype map of digital organisms in Avida, a platform for digital evolution. We consider digital organisms from a vast space of 10^141 genotypes (instruction sequences), which can form 512 different phenotypes. These phenotypes are distinguished by different Boolean logic functions they can compute, as well as by the complexity of these functions. We observe several properties with parallels in natural systems, such as connected genotype networks and asymmetric phenotypic transitions. The likely common cause is robustness to genotypic change. We describe an intriguing tension between phenotypic complexity and evolvability that may have implications for biological evolution. On the one hand, genotypic change is more likely to yield novel phenotypes in more complex organisms. On the other hand, the total number of novel phenotypes reachable through genotypic change is highest for organisms with simple phenotypes. Artificial evolving systems can help us study aspects of biological evolvability that are not accessible in vastly more complex natural systems. They can also help identify properties, such as robustness, that are required for both human-designed artificial systems and synthetic biological systems to be evolvable. PMID:28241039

Genomic Insights into the Origin of Parasitism in the Emerging Plant Pathogen Bursaphelenchus xylophilus

PubMed Central

Kikuchi, Taisei; Cotton, James A.; Dalzell, Jonathan J.; Hasegawa, Koichi; Kanzaki, Natsumi; McVeigh, Paul; Takanashi, Takuma; Tsai, Isheng J.; Assefa, Samuel A.; Cock, Peter J. A.; Otto, Thomas Dan; Hunt, Martin; Reid, Adam J.; Sanchez-Flores, Alejandro; Tsuchihara, Kazuko; Yokoi, Toshiro; Larsson, Mattias C.; Miwa, Johji; Maule, Aaron G.; Sahashi, Norio; Jones, John T.; Berriman, Matthew

2011-01-01

Bursaphelenchus xylophilus is the nematode responsible for a devastating epidemic of pine wilt disease in Asia and Europe, and represents a recent, independent origin of plant parasitism in nematodes, ecologically and taxonomically distinct from other nematodes for which genomic data is available. As well as being an important pathogen, the B. xylophilus genome thus provides a unique opportunity to study the evolution and mechanism of plant parasitism. Here, we present a high-quality draft genome sequence from an inbred line of B. xylophilus, and use this to investigate the biological basis of its complex ecology which combines fungal feeding, plant parasitic and insect-associated stages. We focus particularly on putative parasitism genes as well as those linked to other key biological processes and demonstrate that B. xylophilus is well endowed with RNA interference effectors, peptidergic neurotransmitters (including the first description of ins genes in a parasite), stress response and developmental genes and has a contracted set of chemosensory receptors. B. xylophilus has the largest number of digestive proteases known for any nematode and displays expanded families of lysosome pathway genes, ABC transporters and cytochrome P450 pathway genes. This expansion in digestive and detoxification proteins may reflect the unusual diversity in foods it exploits and environments it encounters during its life cycle. In addition, B. xylophilus possesses a unique complement of plant cell wall modifying proteins acquired by horizontal gene transfer, underscoring the impact of this process on the evolution of plant parasitism by nematodes. Together with the lack of proteins homologous to effectors from other plant parasitic nematodes, this confirms the distinctive molecular basis of plant parasitism in the Bursaphelenchus lineage. The genome sequence of B. xylophilus adds to the diversity of genomic data for nematodes, and will be an important resource in understanding the biology of this unusual parasite. PMID:21909270
Genomic insights into the origin of parasitism in the emerging plant pathogen Bursaphelenchus xylophilus.

PubMed

Kikuchi, Taisei; Cotton, James A; Dalzell, Jonathan J; Hasegawa, Koichi; Kanzaki, Natsumi; McVeigh, Paul; Takanashi, Takuma; Tsai, Isheng J; Assefa, Samuel A; Cock, Peter J A; Otto, Thomas Dan; Hunt, Martin; Reid, Adam J; Sanchez-Flores, Alejandro; Tsuchihara, Kazuko; Yokoi, Toshiro; Larsson, Mattias C; Miwa, Johji; Maule, Aaron G; Sahashi, Norio; Jones, John T; Berriman, Matthew

2011-09-01

Bursaphelenchus xylophilus is the nematode responsible for a devastating epidemic of pine wilt disease in Asia and Europe, and represents a recent, independent origin of plant parasitism in nematodes, ecologically and taxonomically distinct from other nematodes for which genomic data is available. As well as being an important pathogen, the B. xylophilus genome thus provides a unique opportunity to study the evolution and mechanism of plant parasitism. Here, we present a high-quality draft genome sequence from an inbred line of B. xylophilus, and use this to investigate the biological basis of its complex ecology which combines fungal feeding, plant parasitic and insect-associated stages. We focus particularly on putative parasitism genes as well as those linked to other key biological processes and demonstrate that B. xylophilus is well endowed with RNA interference effectors, peptidergic neurotransmitters (including the first description of ins genes in a parasite), stress response and developmental genes and has a contracted set of chemosensory receptors. B. xylophilus has the largest number of digestive proteases known for any nematode and displays expanded families of lysosome pathway genes, ABC transporters and cytochrome P450 pathway genes. This expansion in digestive and detoxification proteins may reflect the unusual diversity in foods it exploits and environments it encounters during its life cycle. In addition, B. xylophilus possesses a unique complement of plant cell wall modifying proteins acquired by horizontal gene transfer, underscoring the impact of this process on the evolution of plant parasitism by nematodes. Together with the lack of proteins homologous to effectors from other plant parasitic nematodes, this confirms the distinctive molecular basis of plant parasitism in the Bursaphelenchus lineage. The genome sequence of B. xylophilus adds to the diversity of genomic data for nematodes, and will be an important resource in understanding the biology of this unusual parasite.
The nearly neutral and selection theories of molecular evolution under the Fisher geometrical framework: substitution rate, population size, and complexity.

PubMed

Razeto-Barry, Pablo; Díaz, Javier; Vásquez, Rodrigo A

2012-06-01

The general theories of molecular evolution depend on relatively arbitrary assumptions about the relative distribution and rate of advantageous, deleterious, neutral, and nearly neutral mutations. The Fisher geometrical model (FGM) has been used to make distributions of mutations biologically interpretable. We explored an FGM-based molecular model to represent molecular evolutionary processes typically studied by nearly neutral and selection models, but in which distributions and relative rates of mutations with different selection coefficients are a consequence of biologically interpretable parameters, such as the average size of the phenotypic effect of mutations and the number of traits (complexity) of organisms. A variant of the FGM-based model that we called the static regime (SR) represents evolution as a nearly neutral process in which substitution rates are determined by a dynamic substitution process in which the population's phenotype remains around a suboptimum equilibrium fitness produced by a balance between slightly deleterious and slightly advantageous compensatory substitutions. As in previous nearly neutral models, the SR predicts a negative relationship between molecular evolutionary rate and population size; however, SR does not have the unrealistic properties of previous nearly neutral models such as the narrow window of selection strengths in which they work. In addition, the SR suggests that compensatory mutations cannot explain the high rate of fixations driven by positive selection currently found in DNA sequences, contrary to what has been previously suggested. We also developed a generalization of SR in which the optimum phenotype can change stochastically due to environmental or physiological shifts, which we called the variable regime (VR). VR models evolution as an interplay between adaptive processes and nearly neutral steady-state processes. When strong environmental fluctuations are incorporated, the process becomes a selection model in which evolutionary rate does not depend on population size, but is critically dependent on the complexity of organisms and mutation size. For SR as well as VR we found that key parameters of molecular evolution are linked by biological factors, and we showed that they cannot be fixed independently by arbitrary criteria, as has usually been assumed in previous molecular evolutionary models.
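The dependence on complexity and mutation size described in this abstract rests on the basic geometry of the Fisher geometrical model, which can be sketched in a few lines. The code below is not the authors' SR or VR model: it has no population size, fixation process, or moving optimum. It assumes only that fitness decreases with distance from an optimum at the origin, so a mutation is beneficial when it moves the phenotype closer to the origin. All function names and parameter values are hypothetical.

```python
import math
import random


def random_mutation(n_traits, size, rng):
    """Return a vector of length `size` pointing in a uniformly random direction."""
    v = [rng.gauss(0.0, 1.0) for _ in range(n_traits)]
    norm = math.sqrt(sum(x * x for x in v))
    return [size * x / norm for x in v]


def fraction_beneficial(n_traits, size, distance, trials=20000, seed=0):
    """Estimate the share of random mutations that bring the phenotype closer
    to an optimum at the origin, starting `distance` away from it."""
    rng = random.Random(seed)
    start = [distance] + [0.0] * (n_traits - 1)
    closer = 0
    for _ in range(trials):
        step = random_mutation(n_traits, size, rng)
        new = [a + b for a, b in zip(start, step)]
        if math.sqrt(sum(x * x for x in new)) < distance:
            closer += 1
    return closer / trials


# Same mutation size and distance to the optimum, different complexity.
for n in (2, 20):
    print(n, fraction_beneficial(n_traits=n, size=0.5, distance=1.0))
```

With mutation size and distance held fixed, the estimated share of beneficial mutations is lower for the larger number of traits, which is the geometric intuition behind treating complexity as a biologically interpretable parameter.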
The Nearly Neutral and Selection Theories of Molecular Evolution Under the Fisher Geometrical Framework: Substitution Rate, Population Size, and Complexity

PubMed Central

Razeto-Barry, Pablo; Díaz, Javier; Vásquez, Rodrigo A.

2012-01-01

The general theories of molecular evolution depend on relatively arbitrary assumptions about the relative distribution and rate of advantageous, deleterious, neutral, and nearly neutral mutations. The Fisher geometrical model (FGM) has been used to make distributions of mutations biologically interpretable. We explored an FGM-based molecular model to represent molecular evolutionary processes typically studied by nearly neutral and selection models, but in which distributions and relative rates of mutations with different selection coefficients are a consequence of biologically interpretable parameters, such as the average size of the phenotypic effect of mutations and the number of traits (complexity) of organisms. A variant of the FGM-based model that we called the static regime (SR) represents evolution as a nearly neutral process in which substitution rates are determined by a dynamic substitution process in which the population’s phenotype remains around a suboptimum equilibrium fitness produced by a balance between slightly deleterious and slightly advantageous compensatory substitutions. As in previous nearly neutral models, the SR predicts a negative relationship between molecular evolutionary rate and population size; however, SR does not have the unrealistic properties of previous nearly neutral models such as the narrow window of selection strengths in which they work. In addition, the SR suggests that compensatory mutations cannot explain the high rate of fixations driven by positive selection currently found in DNA sequences, contrary to what has been previously suggested. We also developed a generalization of SR in which the optimum phenotype can change stochastically due to environmental or physiological shifts, which we called the variable regime (VR). VR models evolution as an interplay between adaptive processes and nearly neutral steady-state processes. When strong environmental fluctuations are incorporated, the process becomes a selection model in which evolutionary rate does not depend on population size, but is critically dependent on the complexity of organisms and mutation size. For SR as well as VR we found that key parameters of molecular evolution are linked by biological factors, and we showed that they cannot be fixed independently by arbitrary criteria, as has usually been assumed in previous molecular evolutionary models. PMID:22426879
MitoRes: a resource of nuclear-encoded mitochondrial genes and their products in Metazoa.

PubMed

Catalano, Domenico; Licciulli, Flavio; Turi, Antonio; Grillo, Giorgio; Saccone, Cecilia; D'Elia, Domenica

2006-01-24

Mitochondria are sub-cellular organelles that have a central role in energy production and in other metabolic pathways of all eukaryotic respiring cells. In the last few years, with more and more genomes being sequenced, a huge amount of data has been generated providing an unprecedented opportunity to use the comparative analysis approach in studies of evolution and functional genomics with the aim of shedding light on molecular mechanisms regulating mitochondrial biogenesis and metabolism. In this context, the problem of the optimal extraction of representative datasets of genomic and proteomic data assumes a crucial importance. Specialised resources for nuclear-encoded mitochondria-related proteins already exist; however, no mitochondrial database is currently available with the same features as MitoRes, which is an update of the MitoNuc database extensively modified in its structure, data sources and graphical interface. It contains data on nuclear-encoded mitochondria-related products for any metazoan species for which this type of data is available and also provides comprehensive sequence datasets (gene, transcript and protein) as well as useful tools for their extraction and export. MitoRes (http://www2.ba.itb.cnr.it/MitoRes/) consolidates information from publicly external sources and automatically annotates them into a relational database. It also clusters proteins on the basis of their sequence similarity and interconnects them with genomic data. The search engine and sequence management tools allow the query/retrieval of the database content and the extraction and export of sequences (gene, transcript, protein) and related sub-sequences (intron, exon, UTR, CDS, signal peptide and gene flanking regions) ready to be used for in silico analysis. The tool we describe here has been developed to support lab scientists and bioinformaticians alike in the characterization of molecular features and evolution of mitochondrial targeting sequences. The way it provides for the retrieval and extraction of sequences allows the user to overcome the obstacles encountered in the integrative use of different bioinformatic resources, and the completeness of the sequence collection allows intra- and interspecies comparison at different biological levels (gene, transcript and protein).
Extremely Low Genomic Diversity of Rickettsia japonica Distributed in Japan.

PubMed

Akter, Arzuba; Ooka, Tadasuke; Gotoh, Yasuhiro; Yamamoto, Seigo; Fujita, Hiromi; Terasoma, Fumio; Kida, Kouji; Taira, Masakatsu; Nakadouzono, Fumiko; Gokuden, Mutsuyo; Hirano, Manabu; Miyashiro, Mamoru; Inari, Kouichi; Shimazu, Yukie; Tabara, Kenji; Toyoda, Atsushi; Yoshimura, Dai; Itoh, Takehiko; Kitano, Tomokazu; Sato, Mitsuhiko P; Katsura, Keisuke; Mondal, Shakhinur Islam; Ogura, Yoshitoshi; Ando, Shuji; Hayashi, Tetsuya

2017-01-01

Rickettsiae are obligate intracellular bacteria that have small genomes as a result of reductive evolution. Many Rickettsia species of the spotted fever group (SFG) cause tick-borne diseases known as "spotted fevers". The life cycle of SFG rickettsiae is closely associated with that of the tick, which is generally thought to act as a bacterial vector and reservoir that maintains the bacterium through transstadial and transovarial transmission. Each SFG member is thought to have adapted to a specific tick species, thus restricting the bacterial distribution to a relatively limited geographic region. These unique features of SFG rickettsiae allow investigation of how the genomes of such biologically and ecologically specialized bacteria evolve after genome reduction and the types of population structures that are generated. Here, we performed a nationwide, high-resolution phylogenetic analysis of Rickettsia japonica, an etiological agent of Japanese spotted fever that is distributed in Japan and Korea. The comparison of complete or nearly complete sequences obtained from 31 R. japonica strains isolated from various sources in Japan over the past 30 years demonstrated an extremely low level of genomic diversity. In particular, only 34 single nucleotide polymorphisms were identified among the 27 strains of the major lineage containing all clinical isolates and tick isolates from the three tick species. Our data provide novel insights into the biology and genome evolution of R. japonica, including the possibilities of recent clonal expansion and a long generation time in nature due to the long dormant phase associated with tick life cycles. © The Author(s) 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Evolutionary origin and functional divergence of totipotent cell homeobox genes in eutherian mammals.

PubMed

Maeso, Ignacio; Dunwell, Thomas L; Wyatt, Chris D R; Marlétaz, Ferdinand; Vető, Borbála; Bernal, Juan A; Quah, Shan; Irimia, Manuel; Holland, Peter W H

2016-06-13

A central goal of evolutionary biology is to link genomic change to phenotypic evolution. The origin of new transcription factors is a special case of genomic evolution since it brings opportunities for novel regulatory interactions and potentially the emergence of new biological properties. We demonstrate that a group of four homeobox gene families (Argfx, Leutx, Dprx, Tprx), plus a gene newly described here (Pargfx), arose by tandem gene duplication from the retinal-expressed Crx gene, followed by asymmetric sequence evolution. We show these genes arose as part of repeated gene gain and loss events on a dynamic chromosomal region in the stem lineage of placental mammals, on the forerunner of human chromosome 19. The human orthologues of these genes are expressed specifically in early embryo totipotent cells, peaking from 8-cell to morula, prior to cell fate restrictions; cow orthologues have similar expression. To examine biological roles, we used ectopic gene expression in cultured human cells followed by high-throughput RNA-seq and uncovered extensive transcriptional remodelling driven by three of the genes. Comparison to transcriptional profiles of early human embryos suggest roles in activating and repressing a set of developmentally-important genes that spike at 8-cell to morula, rather than a general role in genome activation. We conclude that a dynamic chromosome region spawned a set of evolutionarily new homeobox genes, the ETCHbox genes, specifically in eutherian mammals. After these genes diverged from the parental Crx gene, we argue they were recruited for roles in the preimplantation embryo including activation of genes at the 8-cell stage and repression after morula. We propose these new homeobox gene roles permitted fine-tuning of cell fate decisions necessary for specification and function of embryonic and extra-embryonic tissues utilised in mammalian development and pregnancy.
Acceptance of Evolution Increases with Student Academic Level: A Comparison Between a Secular and a Religious College

PubMed Central

Paz-y-Miño C., Guillermo

2012-01-01

Acceptance of evolution among the general public, high schools, teachers, and scientists has been documented in the USA; little is known about college students’ views on evolution; this population is relevant since it transits from a high-school/parent-protective environment to an independent role in societal decisions. Here we compare perspectives about evolution, creationism, and intelligent design (ID) between a secular (S) and a religious (R) college in the Northeastern USA. Interinstitutional comparisons showed that 64% (mean S + R) biology majors vs. 42/62% (S/R) nonmajors supported the exclusive teaching of evolution in science classes; 24/29% (S/R) biology majors vs. 26/38% (S/R) nonmajors perceived ID as both alternative to evolution and/or scientific theory about the origin of life; 76% (mean S + R) biology majors and nonmajors accepted evolutionary explanations about the origin of life; 86% (mean S + R) biology majors vs. 79% (mean S + R) nonmajors preferred science courses where human evolution is discussed; 76% (mean S + R) biology majors vs. 79% (mean S + R) nonmajors welcomed questions about evolution in exams and/or thought that such questions should always be in exams; and 66% (mean S + R) biology majors vs. 46% (mean S + R) nonmajors admitted they accept evolution openly and/or privately. Intrainstitutional comparisons showed that overall acceptance of evolution among biologists (S or R) increased gradually from the freshman to the senior year, due to exposure to upper-division courses with evolutionary content. College curricular/pedagogical reform should fortify evolution literacy at all education levels, particularly among nonbiologists. PMID:22957109
Computational Study of the Genomic and Epigenomic Phenomena

NASA Astrophysics Data System (ADS)

Yang, Wenjing

Biological systems are perhaps the ultimate complex systems, uniquely capable of processing and communicating information, reproducing in their lifetimes, and adapting in evolutionary time scales. My dissertation research focuses on using computational approaches to understand the biocomplexity manifested in the multitude of length scales and time scales. At the molecular and cellular level, central to the complex behavior of a biological system is the regulatory network. My research study focused on epigenetics, which is essential for multicellular organisms to establish cellular identity during development or in response to intracellular and environmental stimuli. My computational study of epigenomics is greatly facilitated by recent advances in high-throughput sequencing technology, which enables high-resolution snapshots of epigenomes and transcriptomes. Using human CD4+ T cells as a model system, the dynamical changes in epigenome and transcriptome pertinent to T cell activation were investigated at the genome scale. Going beyond the traditional focus on transcriptional regulation, I provided evidence that post-transcriptional regulation may serve as a major component of the regulatory network. In addition, I explored alternative polyadenylation, another novel aspect of gene regulation, and how it cross-talks with the local chromatin structure. As the renowned theoretical biologist Theodosius Dobzhansky said eloquently, "Nothing in biology makes sense except in the light of evolution". To better understand this ubiquitous driving force in the biological world, I went beyond molecular events in a single organism, and investigated the dynamical changes of population structure along the evolutionary time scale. To this end, we used HIV population dynamics in the host immune system as a model system. The evolution of the HIV viral population, with its exceptionally high mutation rate, plays a key role in AIDS immunopathogenesis. However, the theoretical studies of the effect of recombination have been rather limited. Given the phylogenetic and experimental evidence for the high recombination rate and its important role in HIV evolution and epidemics, I established a mathematical model to study the effect of recombination, and explored the complex behavior of this dynamical system.
X-MATE: a flexible system for mapping short read data

PubMed Central

Pearson, John V.; Cloonan, Nicole; Grimmond, Sean M.

2011-01-01

Summary: Accurate and complete mapping of short-read sequencing to a reference genome greatly enhances the discovery of biological results and improves statistical predictions. We recently presented RNA-MATE, a pipeline for the recursive mapping of RNA-Seq datasets. With the rapid increase in genome re-sequencing projects, progression of available mapping software and the evolution of file formats, we now present X-MATE, an updated version of RNA-MATE, capable of mapping both RNA-Seq and DNA datasets and with improved performance, output file formats, configuration files, and flexibility in core mapping software. Availability: Executables, source code, junction libraries, test data and results and the user manual are available from http://grimmond.imb.uq.edu.au/X-MATE/. Contact: n.cloonan@uq.edu.au; s.grimmond@uq.edu.au Supplementary information: Supplementary data are available at Bioinformatics Online. PMID:21216778
A Study of the Comparative Effectiveness of Zoology Prerequisites at Slippery Rock State College.

ERIC Educational Resources Information Center

Morrison, William Sechler

This study compared the effectiveness of three sequences of prerequisite courses required before taking zoology. Sequence 1 prerequisite courses consisted of general biology and human biology; Sequence 2 consisted of general biology; and Sequence 3 required cell biology. Zoology students in the spring of 1972 were given a pretest and a posttest. The mean…
Thoughts on the cultural evolution of man. Developmental imprinting and transgenerational effect.

PubMed

Csaba, György

2007-01-01

The biological evolution of man stopped since it has been conveyed to the objects, created by man. This paper introduces the concept of "conveyed evolution". Being part of the cultural evolution, the conveyed evolution is a continuation of the biological one. There are several similarities between the laws of biological and conveyed evolution, albeit the differences are important as well. Some laws of the conveyed evolution are described here. The conveyed evolution has man-made repair mechanisms (medicine, protection of environment) which defend man from harm. Man's fragility limits the progress of conveyed evolution. However, artificial compounds or environmental pollutants which are provoked by the conveyed evolution induce chemical (hormonal) imprinting in the developmental critical periods, which is transmitted to the progeny generations (transgenerational effect). This could cause evolutionary alterations without mutation.
Ludwig von Bertalanffy's Organismic View on the Theory of Evolution

PubMed Central

Drack, Manfred

2015-01-01

Ludwig von Bertalanffy was a key figure in the advancement of theoretical biology. His early considerations already led him to recognize the necessity of considering the organism as a system, as an organization of parts and processes. He termed the resulting research program organismic biology, which he extended to all basic questions of biology and almost all areas of biology, hence also to the theory of evolution. This article begins by outlining the rather unknown (because often written in German) research of Bertalanffy in the field of theoretical biology. The basics of the organismic approach are then described. This is followed by Bertalanffy's considerations on the theory of evolution, in which he used methods from theoretical biology and then introduced his own, organismic, view on evolution, leading to the demand for finding laws of evolution. Finally, his view on the concept of homology is presented. J. Exp. Zool. (Mol. Dev. Evol.) 324B: 77–90, 2015. © 2015 The Authors. Journal of Experimental Zoology Part B: Molecular and Developmental Evolution published by Wiley Periodicals, Inc. PMID:25727202
Characterization of the first complete genome sequence of an Impatiens necrotic spot orthotospovirus isolate from the United States and worldwide phylogenetic analyses of INSV isolates.

PubMed

Zhao, Kaixi; Margaria, Paolo; Rosa, Cristina

2018-05-10

Impatiens necrotic spot orthotospovirus (INSV) can impact economically important ornamental plants and vegetables worldwide. Characterization studies on INSV are limited. For most INSV isolates, there are no complete genome sequences available. This lack of genomic information hampers the understanding of INSV genetic diversity and evolution. Here we report the first complete nucleotide sequence of a US INSV isolate. INSV-UP01 was isolated from an impatiens in Pennsylvania, US. RT-PCR was used to clone its full-length genome and Vector NTI to assemble overlapping sequences. Phylogenetic trees were constructed using MEGA7 software to show the phylogenetic relationships with other available INSV sequences worldwide. This US isolate has genome and biological features typical of INSV species and clusters in the Western Hemisphere clade, but its origin appears to be recent. Furthermore, INSV-UP01 might have been involved in a recombination event with an Italian isolate belonging to the Asian clade. Our analyses support that INSV isolates infect a broad plant-host range, group by geographic origin rather than by host, and are subject to frequent recombination events. These results justify the need to generate and analyze complete genome sequences of orthotospoviruses in general and INSV in particular.
AntiClustAl: Multiple Sequence Alignment by antipole clustering and linear approximate 1-median computation.

PubMed

Di Pietro, C; Di Pietro, V; Emmanuele, G; Ferro, A; Maugeri, T; Modica, E; Pigola, G; Pulvirenti, A; Purrello, M; Ragusa, M; Scalia, M; Shasha, D; Travali, S; Zimmitti, V

2003-01-01

In this paper we present a new Multiple Sequence Alignment (MSA) algorithm called AntiClustAl. The method makes use of the commonly used idea of aligning homologous sequences belonging to classes generated by some clustering algorithm, and then continuing the alignment process in a bottom-up way along a suitable tree structure. The final result is then read at the root of the tree. Multiple sequence alignment in each cluster makes use of progressive alignment with the 1-median (center) of the cluster. The 1-median of a set S of sequences is the element of S which minimizes the average distance from any other sequence in S. Its exact computation requires quadratic time. The basic idea of our proposed algorithm is to make use of a simple and natural algorithmic technique based on randomized tournaments, which has been successfully applied to large-size search problems in general metric spaces. In particular, a clustering algorithm called Antipole tree and an approximate linear 1-median computation are used. Compared with Clustal W, a widely used tool for MSA, our algorithm shows better running time with fully comparable alignment quality. A successful biological application showing high amino acid conservation during evolution of Xenopus laevis SOD2 is also cited.
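The 1-median definition and the tournament idea in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the Hamming distance, the group size of 3, and the function names are assumptions chosen for illustration, and a real aligner would use an alignment-based distance. The exact version compares every pair of sequences (quadratic time); the tournament version repeatedly splits the pool into small random groups, keeps each group's local 1-median, and so needs only a linear number of distance computations.

```python
import random

def hamming(a: str, b: str) -> int:
    """Toy distance (an assumption): mismatches between equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def exact_one_median(seqs, dist=hamming):
    """Element of seqs with the smallest summed (hence average) distance
    to the others. Uses O(n^2) distance computations."""
    return min(seqs, key=lambda s: sum(dist(s, t) for t in seqs))

def tournament_one_median(seqs, dist=hamming, group_size=3, rng=None):
    """Approximate 1-median by randomized tournament (simplified sketch).

    Each round shuffles the pool, splits it into groups of at most
    group_size, and promotes the exact 1-median of every group.
    The pool shrinks geometrically, so total work is linear in len(seqs).
    """
    rng = rng or random.Random(0)
    pool = list(seqs)
    while len(pool) > group_size:
        rng.shuffle(pool)
        pool = [
            exact_one_median(pool[i:i + group_size], dist)
            for i in range(0, len(pool), group_size)
        ]
    return exact_one_median(pool, dist)

seqs = ["AAAA", "AAAT", "AATA", "ATAA"]
print(exact_one_median(seqs))       # the sequence closest to all others
print(tournament_one_median(seqs))  # an element of seqs, possibly a different one
```

The tournament result is always a member of the input set but is only an approximation: a good candidate can be eliminated early if it lands in an unlucky group.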
Population structure in relation to host-plant ecology and Wolbachia infestation in the comma butterfly.

PubMed

Kodandaramaiah, U; Weingartner, E; Janz, N; Dalén, L; Nylin, S

2011-10-01

Experimental work on Polygonia c-album, a temperate polyphagous butterfly species, has shown that Swedish, Belgian, Norwegian and Estonian females are generalists with respect to host-plant preference, whereas females from UK and Spain are specialized on Urticaceae. Female preference is known to have a strong genetic component. We test whether the specialist and generalist populations form respective genetic clusters using data from mitochondrial sequences and 10 microsatellite loci. Results do not support this hypothesis, suggesting that the specialist and generalist traits have evolved more than once independently. Mitochondrial DNA variation suggests a rapid expansion scenario, with a single widespread haplotype occurring in high frequency, whereas microsatellite data indicate strong differentiation of the Moroccan population. Based on a comparison of polymorphism in the mitochondrial data and sequences from a nuclear gene, we show that the diversity in the former is significantly less than that expected under neutral evolution. Furthermore, we found that almost all butterfly samples were infected with a single strain of Wolbachia, a maternally inherited bacterium. We reason that indirect selection on the mitochondrial genome mediated by a recent sweep of Wolbachia infection has depleted variability in the mitochondrial sequences. We also surmise that P. c-album could have expanded out of a single glacial refugium and colonized Morocco recently. © 2011 The Authors. Journal of Evolutionary Biology © 2011 European Society For Evolutionary Biology.
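The abstract above compares polymorphism in mitochondrial and nuclear sequences but does not say which statistic or test was used. As a hedged illustration only, the sketch below computes nucleotide diversity (the average number of pairwise differences per site), one standard summary of polymorphism. The function name and the toy sequences are hypothetical, and a formal test of neutrality would require more than this single number.

```python
from itertools import combinations

def nucleotide_diversity(aligned):
    """Average pairwise differences per site for aligned, equal-length sequences."""
    n = len(aligned)
    if n < 2:
        raise ValueError("need at least two sequences")
    length = len(aligned[0])
    if length == 0 or any(len(s) != length for s in aligned):
        raise ValueError("sequences must be non-empty and of equal length")
    # Count mismatched sites over every unordered pair of sequences.
    diffs = sum(
        sum(a != b for a, b in zip(s, t))
        for s, t in combinations(aligned, 2)
    )
    n_pairs = n * (n - 1) // 2
    return diffs / (n_pairs * length)

# Toy data (made up): one haplotype dominates, as in a low-diversity sample.
toy_sample = ["ACGTACGT"] * 4 + ["ACGTACGA"]
print(nucleotide_diversity(toy_sample))
```

A sample dominated by a single widespread haplotype, as the abstract describes for the mitochondrial data, yields a value close to zero.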
Universal Sequence Replication, Reversible Polymerization and Early Functional Biopolymers: A Model for the Initiation of Prebiotic Sequence Evolution

PubMed Central

Walker, Sara Imari; Grover, Martha A.; Hud, Nicholas V.

2012-01-01

Many models for the origin of life have focused on understanding how evolution can drive the refinement of a preexisting enzyme, such as the evolution of efficient replicase activity. Here we present a model for what was, arguably, an even earlier stage of chemical evolution, when polymer sequence diversity was generated and sustained before, and during, the onset of functional selection. The model includes regular environmental cycles (e.g. hydration-dehydration cycles) that drive polymers between times of replication and functional activity, which coincide with times of different monomer and polymer diffusivity. Template-directed replication of informational polymers, which takes place during the dehydration stage of each cycle, is considered to be sequence-independent. New sequences are generated by spontaneous polymer formation, and all sequences compete for a finite monomer resource that is recycled via reversible polymerization. Kinetic Monte Carlo simulations demonstrate that this proposed prebiotic scenario provides a robust mechanism for the exploration of sequence space. Introduction of a polymer sequence with monomer synthetase activity illustrates that functional sequences can become established in a preexisting pool of otherwise non-functional sequences. Functional selection does not dominate system dynamics and sequence diversity remains high, permitting the emergence and spread of more than one functional sequence. It is also observed that polymers spontaneously form clusters in simulations where polymers diffuse more slowly than monomers, a feature that is reminiscent of a previous proposal that the earliest stages of life could have been defined by the collective evolution of a system-wide cooperation of polymer aggregates. 
Overall, the results presented demonstrate the merits of considering plausible prebiotic polymer chemistries and environments that would have allowed for the rapid turnover of monomer resources and for regularly varying monomer/polymer diffusivities. PMID:22493682
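To make the phrase "kinetic Monte Carlo" in the abstract above concrete, here is a heavily simplified sketch of a Gillespie-style simulation of reversible polymerization drawing on a finite, recycled monomer pool. It is not the authors' model: it tracks only polymer lengths, and omits sequences, environmental cycles, template-directed replication, and diffusion. All rate constants, the three event types, and the function name are assumptions made for illustration.

```python
import random

def kmc_polymerization(n_monomers=200, k_form=1e-4, k_grow=1e-2,
                       k_hyd=0.5, t_end=50.0, seed=0):
    """Gillespie direct method for a toy reversible-polymerization system.

    Returns (free_monomers, polymer_lengths) at time t_end.
    Rate constants are hypothetical placeholders, not fitted values.
    """
    rng = random.Random(seed)
    free = n_monomers   # free monomers in the shared pool
    polymers = []       # length of each polymer (always >= 2)
    t = 0.0
    while True:
        # Propensities of the three event classes.
        a_form = k_form * free * (free - 1) / 2   # two monomers -> dimer
        a_grow = k_grow * free * len(polymers)    # polymer gains a monomer
        a_hyd = k_hyd * len(polymers)             # polymer sheds an end monomer
        a_total = a_form + a_grow + a_hyd
        if a_total == 0:
            break
        t += rng.expovariate(a_total)             # waiting time to next event
        if t > t_end:
            break
        r = rng.random() * a_total                # pick an event by its weight
        if r < a_form:
            free -= 2
            polymers.append(2)
        elif r < a_form + a_grow:
            polymers[rng.randrange(len(polymers))] += 1
            free -= 1
        else:
            i = rng.randrange(len(polymers))
            if polymers[i] == 2:                  # a dimer falls apart completely
                polymers.pop(i)
                free += 2
            else:
                polymers[i] -= 1
                free += 1
    return free, polymers

free, polymers = kmc_polymerization()
# Monomers are conserved: every unit is either free or inside a polymer.
print(free + sum(polymers))
```

Because hydrolysis returns monomers to the pool, the total number of monomer units stays constant, which is the sense in which the resource is "recycled" in this sketch.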
Evolution of natural agents: preservation, advance, and emergence of functional information.

PubMed

Sharov, Alexei A

2016-04-01

Biological evolution is often viewed narrowly as a change of morphology or allele frequency in a sequence of generations. Here I pursue an alternative informational concept of evolution, as preservation, advance, and emergence of functional information in natural agents. Functional information is a network of signs (e.g., memory, transient messengers, and external signs) that are used by agents to preserve and regulate their functions. Functional information is preserved in evolution via complex interplay of copying and construction processes: the digital components are copied, whereas interpreting subagents together with scaffolds, tools, and resources, are constructed. Some of these processes are simple and invariant, whereas others are complex and contextual. Advance of functional information includes improvement and modification of already existing functions. Although the genome information may change passively and randomly, the interpretation is active and guided by the logic of agent behavior and embryonic development. Emergence of new functions is based on the reinterpretation of already existing information, when old tools, resources, and control algorithms are adopted for novel functions. Evolution of functional information progressed from protosemiosis, where signs correspond directly to actions, to eusemiosis, where agents associate signs with objects. Language is the most advanced form of eusemiosis, where the knowledge of objects and models is communicated between agents.
Evolution of natural agents: preservation, advance, and emergence of functional information

PubMed Central

Sharov, Alexei A.

2016-01-01

Biological evolution is often viewed narrowly as a change of morphology or allele frequency in a sequence of generations. Here I pursue an alternative informational concept of evolution, as preservation, advance, and emergence of functional information in natural agents. Functional information is a network of signs (e.g., memory, transient messengers, and external signs) that are used by agents to preserve and regulate their functions. Functional information is preserved in evolution via complex interplay of copying and construction processes: the digital components are copied, whereas interpreting subagents together with scaffolds, tools, and resources, are constructed. Some of these processes are simple and invariant, whereas others are complex and contextual. Advance of functional information includes improvement and modification of already existing functions. Although the genome information may change passively and randomly, the interpretation is active and guided by the logic of agent behavior and embryonic development. Emergence of new functions is based on the reinterpretation of already existing information, when old tools, resources, and control algorithms are adopted for novel functions. Evolution of functional information progressed from protosemiosis, where signs correspond directly to actions, to eusemiosis, where agents associate signs with objects. Language is the most advanced form of eusemiosis, where the knowledge of objects and models is communicated between agents. PMID:27525048
Evolution of Protein Domain Repeats in Metazoa

PubMed Central

Schüler, Andreas; Bornberg-Bauer, Erich

2016-01-01

Repeats are ubiquitous elements of proteins and they play important roles in cellular function and during evolution. Repeats are, however, also notoriously difficult to capture computationally, and large-scale studies have so far had difficulty linking genetic causes, structural properties and evolutionary trajectories of protein repeats. Here we apply recently developed methods for repeat detection and analysis to a large dataset comprising over a hundred metazoan genomes. We find that repeats in larger protein families generally experience very few insertions or deletions (indels) of repeat units, but there is also a significant fraction of noteworthy volatile outliers with very high indel rates. Analysis of structural data indicates that repeats with an open structure and independently folding units are more volatile and more likely to be intrinsically disordered. Such disordered repeats are also significantly enriched in sites with a high functional potential such as linear motifs. Furthermore, the most volatile repeats have a high sequence similarity between their units. Since many volatile repeats also show signs of recombination, we conclude they are often shaped by concerted evolution. Intriguingly, many of these conserved yet volatile repeats are involved in host-pathogen interactions where they might foster fast but subtle adaptation in biological arms races. Key Words: protein evolution, domain rearrangements, protein repeats, concerted evolution. PMID:27671125
