sequence evolutionary phase: Topics by Science.gov

Sample records for sequence evolutionary phase

Evolution of high-mass star-forming regions

NASA Astrophysics Data System (ADS)

Giannetti, A.; Leurini, S.; Wyrowski, F.; Urquhart, J.; König, C.; Csengeri, T.; Güsten, R.; Menten, K. M.

Observational identification of a coherent evolutionary sequence for high-mass star-forming regions is still missing. We use the progressive heating of the gas caused by the feedback of high-mass young stellar objects to prove the statistical validity of the most common schemes used to observationally define an evolutionary sequence for high-mass clumps, and identify which physical process dominates in the different phases. From the spectroscopic follow-ups carried out towards the TOP100 sample between 84 and 365 GHz, we selected several multiplets of CH3CN, CH3CCH, and CH3OH lines to derive the physical properties of the gas in the clumps along the evolutionary sequence. We demonstrate that the evolutionary sequence is statistically valid, and we define intervals in L/M separating the compression, collapse and accretion, and disruption phases. The first hot cores and ZAMS stars appear at L/M ≈ 10 L⊙ M⊙^-1
Evidence of the evolved nature of the B[e] star MWC 137

DOE Office of Scientific and Technical Information (OSTI.GOV)

Muratore, M. F.; Arias, M. L.; Cidale, L.

2015-01-01

The evolutionary phase of B[e] stars is difficult to establish due to the uncertainties in their fundamental parameters. For instance, possible classifications for the Galactic B[e] star MWC 137 include pre-main-sequence and post-main-sequence phases, with a large range in luminosity. Our goal is to clarify the evolutionary stage of this peculiar object, and to study the CO molecular component of its circumstellar medium. To this purpose, we modeled the CO molecular bands using high-resolution K-band spectra. We find that MWC 137 is surrounded by a detached cool (T = 1900 ± 100 K) and dense (N = (3 ± 1) × 10^21 cm^-2) ring of CO gas orbiting the star with a rotational velocity, projected to the line of sight, of 84 ± 2 km s^-1. We also find that the molecular gas is enriched in the isotope ^13C, excluding the classification of the star as a Herbig Be. The observed isotopic abundance ratio (^12C/^13C = 25 ± 2) derived from our modeling is compatible with a proto-planetary nebula, main-sequence, or supergiant evolutionary phase. However, based on some observable characteristics of MWC 137, we propose that the supergiant scenario seems to be the most plausible. Hence, we suggest that MWC 137 could be in an extremely short-lived phase, evolving from a B[e] supergiant to a blue supergiant with a bipolar ring nebula.
The development of the red giant branch. I - Theoretical evolutionary sequences

NASA Technical Reports Server (NTRS)

Sweigart, Allen V.; Greggio, Laura; Renzini, Alvio

1989-01-01

A grid of 100 evolutionary sequences extending from the zero-age main sequence to the onset of helium burning has been computed for stellar masses between 1.4 and 3.4 solar masses, helium abundances of 0.20 and 0.30, and heavy-element abundances of 0.004, 0.01, and 0.04. Using these computations the transition in the morphology of the red giant branch (RGB) between low-mass stars, which have an extended and luminous first RGB phase prior to helium ignition, and intermediate-mass stars, which do not, is investigated. Extensive tabulations of the numerical results are provided to aid in applying these sequences. The effects of the first dredge-up on the surface helium and CNO abundances of the sequences are discussed.
Evolution of X-ray activity of 1-3 Msun late-type stars in early post-main-sequence phases

NASA Astrophysics Data System (ADS)

Pizzolato, N.; Maggio, A.; Sciortino, S.

2000-09-01

We have investigated the variation of coronal X-ray emission during early post-main-sequence phases for a sample of 120 late-type stars within 100 pc, and with estimated masses in the range 1-3 Msun, based on Hipparcos parallaxes and recent evolutionary models. These stars were observed with the ROSAT/PSPC, and the data processed with the Palermo-CfA pipeline, including detection and evaluation of X-ray fluxes (or upper limits) by means of a wavelet transform algorithm. We have studied the evolutionary history of X-ray luminosity and surface flux for stars in selected mass ranges, including stars with inactive A-type progenitors on the main sequence and lower mass solar-type stars. Our stellar sample suggests a trend of increasing X-ray emission level with age for stars with masses M > 1.5 Msun, and a decline for lower-mass stars. A similar behavior holds for the average coronal temperature, which follows a power-law correlation with the X-ray luminosity, independently of their mass and evolutionary state. We have also studied the relationship between X-ray luminosity and surface rotation rate for stars in the same mass ranges, and how this relationship departs from the Lx ~ vrot^2 law followed by main-sequence stars. Our results are interpreted in terms of a magnetic dynamo whose efficiency depends on the stellar evolutionary state through the mass-dependent changes of the stellar internal structure, including the properties of envelope convection and the internal rotation profile.
The Spectral Energy Distribution of the Earliest Phases of Massive Star Formation from the Spitzer and Herschel Archives

NASA Astrophysics Data System (ADS)

Klein, Randolf; Looney, Leslie; Henning, Thomas; Chakrabarti, Sukanya; Shenoy, Sachin

2015-08-01

Infrared Dark Clouds (IRDCs) are very good candidates for the earliest phases of massive star formation, but can only be found in regions with high infrared background. We have searched for early phases among cold and massive (M>100M⊙) cloud cores by selecting cores from millimeter continuum surveys (Faundez et al. 2004, Sridharan et al. 2005, Klein et al. 2005, Beltran et al. 2006) without associations at short wavelengths. We compared the millimeter continuum peak positions with IR and radio catalogs (2MASS, MSX, IRAS, and NVSS) and excluded cores that had sources associated with the cores' peaks. We compiled a list of 173 cores in over 117 regions that are candidates for very early phases of Massive Star Formation (MSF). Now with the Spitzer and Herschel archives, these cores can be characterized further. The GLIMPSE and MIPSGAL programs alone covered 86 of these regions. The Herschel Archive adds even longer wavelengths. We are compiling this data set to construct the complete spectral energy distribution (SED) in the mid- and far-infrared with good spatial resolution and broad spectral coverage. This allows us to disentangle the complex regions and model the SED of the deeply embedded protostars/clusters. We will be presenting the IR properties of all cores and their embedded sources, attempt a characterization, and order the cores in an evolutionary sequence. The resulting properties can be compared to e.g. IRDCs, a class of objects suggested to be the earliest stages of MSF. With the relatively large number of cores, we can try to answer questions like: How homogeneous or diverse are our regions in terms of their evolutionary stage? Where do our embedded sources fit in the evolutionary sequence of IRDCs, hot molecular cores, ultra-compact HII regions, etc? How is the MSF shaping the environment and vice versa? Can we extrapolate to the initial conditions of MSF using our evolutionary sequence?
The evolutionary history of protein fold families and proteomes confirms that the archaeal ancestor is more ancient than the ancestors of other superkingdoms

PubMed Central

2012-01-01

Background The entire evolutionary history of life can be studied using myriad sequences generated by genomic research. This includes the appearance of the first cells and of superkingdoms Archaea, Bacteria, and Eukarya. However, the use of molecular sequence information for deep phylogenetic analyses is limited by mutational saturation, differential evolutionary rates, lack of sequence site independence, and other biological and technical constraints. In contrast, protein structures are evolutionary modules that are highly conserved and diverse enough to enable deep historical exploration. Results Here we build phylogenies that describe the evolution of proteins and proteomes. These phylogenetic trees are derived from a genomic census of protein domains defined at the fold family (FF) level of structural classification. Phylogenomic trees of FF structures were reconstructed from genomic abundance levels of 2,397 FFs in 420 proteomes of free-living organisms. These trees defined timelines of domain appearance, with time spanning from the origin of proteins to the present. Timelines are divided into five different evolutionary phases according to patterns of sharing of FFs among superkingdoms: (1) a primordial protein world, (2) reductive evolution and the rise of Archaea, (3) the rise of Bacteria from the common ancestor of Bacteria and Eukarya and early development of the three superkingdoms, (4) the rise of Eukarya and widespread organismal diversification, and (5) eukaryal diversification. The relative ancestry of the FFs shows that reductive evolution by domain loss is dominant in the first three phases and is responsible for both the diversification of life from a universal cellular ancestor and the appearance of superkingdoms. On the other hand, domain gains are predominant in the last two phases and are responsible for organismal diversification, especially in Bacteria and Eukarya. 
Conclusions The evolution of functions that are associated with corresponding FFs along the timeline reveals that primordial metabolic domains evolved earlier than informational domains involved in translation and transcription, supporting the metabolism-first hypothesis rather than the RNA world scenario. In addition, phylogenomic trees of proteomes reconstructed from FFs appearing in each of the five phases of the protein world show that trees reconstructed from ancient domain structures were consistently rooted in archaeal lineages, supporting the proposal that the archaeal ancestor is more ancient than the ancestors of other superkingdoms. PMID:22284070
Voltage-Gated Sodium Channels: Evolutionary History and Distinctive Sequence Features.

PubMed

Kasimova, M A; Granata, D; Carnevale, V

2016-01-01

Voltage-gated sodium channels (Nav) are responsible for the rising phase of the action potential. Their role in electrical signal transmission is so important that their emergence is believed to be one of the crucial factors enabling development of the nervous system. The presence of voltage-gated sodium-selective channels in bacteria (BacNav) has raised questions concerning the evolutionary history of the ones in animals. Here we review some of the milestones in the field of Nav phylogenetic analysis and discuss some of the most important sequence features that distinguish these channels from voltage-gated potassium channels and transient receptor potential channels. Copyright © 2016 Elsevier Inc. All rights reserved.
Reconstructing evolutionary trees in parallel for massive sequences.

PubMed

Zou, Quan; Wan, Shixiang; Zeng, Xiangxiang; Ma, Zhanshan Sam

2017-12-14

Building evolutionary trees for massive unaligned DNA sequences is both crucial and challenging, and reconstruction becomes especially hard for ultra-large sequence sets. Massive multiple sequence alignment is also challenging and costly in time and memory. Hadoop and Spark, both developed recently, offer new hope for these classical computational biology problems. In this paper, we tried to solve multiple sequence alignment and evolutionary reconstruction in parallel. HPTree, which is developed in this paper, can deal with big DNA sequence files quickly. It works well on files larger than 1 GB and performs better than other evolutionary reconstruction tools. Users can use HPTree to reconstruct evolutionary trees on computer clusters or cloud platforms (e.g. Amazon Cloud). HPTree could help with population evolution research and metagenomics analysis. In this paper, we employ the Hadoop and Spark platforms and design an evolutionary tree reconstruction software tool for unaligned massive DNA sequences. Clustering and multiple sequence alignment are done in parallel. The neighbour-joining method was employed for building the evolutionary tree. The software and its source code are available at http://lab.malab.cn/soft/HPtree/ .
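The neighbour-joining step named in this abstract can be illustrated with a minimal Python sketch. This is not HPTree's code: the function name `nj_first_join` and the toy distance matrix are assumptions made for illustration, and only the first join of the standard neighbour-joining algorithm is shown (the Q-matrix criterion and the two branch lengths).

```python
def nj_first_join(dist):
    """Return (i, j, len_i, len_j) for the first neighbour-joining step.

    dist is a symmetric matrix (list of lists) of pairwise distances.
    i and j are the taxa joined first; len_i and len_j are the branch
    lengths from each of them to the new internal node.
    """
    n = len(dist)
    if n < 3:
        raise ValueError("neighbour joining needs at least three taxa")
    row_sums = [sum(row) for row in dist]
    best = None
    for i in range(n):
        for j in range(i + 1, n):
            # Q criterion: pairwise distance corrected by how far
            # each taxon is from all the others.
            q = (n - 2) * dist[i][j] - row_sums[i] - row_sums[j]
            if best is None or q < best[0]:
                best = (q, i, j)
    _, i, j = best
    len_i = dist[i][j] / 2 + (row_sums[i] - row_sums[j]) / (2 * (n - 2))
    len_j = dist[i][j] - len_i
    return i, j, len_i, len_j


# Hypothetical five-taxon distance matrix, for illustration only.
toy = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
print(nj_first_join(toy))  # (0, 1, 2.0, 3.0)
```

A full tree would repeat this step, replacing the joined pair with the new node and recomputing distances, until three nodes remain. A parallel tool would presumably distribute the alignment and distance computation rather than this loop, but that is a guess, not something the abstract states.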
EvoDB: a database of evolutionary rate profiles, associated protein domains and phylogenetic trees for PFAM-A

PubMed Central

Ndhlovu, Andrew; Durand, Pierre M.; Hazelhurst, Scott

2015-01-01

The evolutionary rate at codon sites across protein-coding nucleotide sequences represents a valuable tier of information for aligning sequences, inferring homology and constructing phylogenetic profiles. However, a comprehensive resource for cataloguing the evolutionary rate at codon sites and their corresponding nucleotide and protein domain sequence alignments has not been developed. To address this gap in knowledge, EvoDB (an Evolutionary rates DataBase) was compiled. Nucleotide sequences and their corresponding protein domain data including the associated seed alignments from the PFAM-A (protein family) database were used to estimate evolutionary rate (ω = dN/dS) profiles at codon sites for each entry. EvoDB contains 98.83% of the gapped nucleotide sequence alignments and 97.1% of the evolutionary rate profiles for the corresponding information in PFAM-A. As the identification of codon sites under positive selection and their position in a sequence profile is usually the most sought after information for molecular evolutionary biologists, evolutionary rate profiles were determined under the M2a model using the CODEML algorithm in the PAML (Phylogenetic Analysis by Maximum Likelihood) suite of software. Validation of nucleotide sequences against amino acid data was implemented to ensure high data quality. EvoDB is a catalogue of the evolutionary rate profiles and provides the corresponding phylogenetic trees, PFAM-A alignments and annotated accession identifier data. In addition, the database can be explored and queried using known evolutionary rate profiles to identify domains under similar evolutionary constraints and pressures. EvoDB is a resource for evolutionary, phylogenetic studies and presents a tier of information untapped by current databases. Database URL: http://www.bioinf.wits.ac.za/software/fire/evodb PMID:26140928
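The ratio ω = dN/dS used above can be read with a short Python sketch. The function name `classify_omega` and the tolerance `tol` are hypothetical; EvoDB itself estimates ω under the M2a model with CODEML, which is a likelihood-based procedure and not a simple threshold as shown here.

```python
def classify_omega(dn, ds, tol=0.05):
    """Return (omega, label) for one codon site.

    dn: nonsynonymous substitution rate, ds: synonymous substitution rate.
    tol is an arbitrary illustrative band around omega = 1.
    """
    if ds <= 0:
        raise ValueError("dS must be positive for omega to be defined")
    omega = dn / ds
    if omega > 1 + tol:
        label = "positive selection"
    elif omega < 1 - tol:
        label = "purifying selection"
    else:
        label = "nearly neutral"
    return omega, label


print(classify_omega(0.5, 0.25))  # (2.0, 'positive selection')
```

Sites with ω above 1 are the ones the abstract describes as most sought after, because an excess of nonsynonymous change suggests positive selection.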
Computationally mapping sequence space to understand evolutionary protein engineering.

PubMed

Armstrong, Kathryn A; Tidor, Bruce

2008-01-01

Evolutionary protein engineering has been dramatically successful, producing a wide variety of new proteins with altered stability, binding affinity, and enzymatic activity. However, the success of such procedures is often unreliable, and the impact of the choice of protein, engineering goal, and evolutionary procedure is not well understood. We have created a framework for understanding aspects of the protein engineering process by computationally mapping regions of feasible sequence space for three small proteins using structure-based design protocols. We then tested the ability of different evolutionary search strategies to explore these sequence spaces. The results point to a non-intuitive relationship between the error-prone PCR mutation rate and the number of rounds of replication. The evolutionary relationships among feasible sequences reveal hub-like sequences that serve as particularly fruitful starting sequences for evolutionary search. Moreover, genetic recombination procedures were examined, and tradeoffs relating sequence diversity and search efficiency were identified. This framework allows us to consider the impact of protein structure on the allowed sequence space and therefore on the challenges that each protein presents to error-prone PCR and genetic recombination procedures.
Evolutionary profiles derived from the QR factorization of multiple structural alignments gives an economy of information.

PubMed

O'Donoghue, Patrick; Luthey-Schulten, Zaida

2005-02-25

We present a new algorithm, based on the multidimensional QR factorization, to remove redundancy from a multiple structural alignment by choosing representative protein structures that best preserve the phylogenetic tree topology of the homologous group. The classical QR factorization with pivoting, developed as a fast numerical solution to eigenvalue and linear least-squares problems of the form Ax=b, was designed to re-order the columns of A by increasing linear dependence. Removing the most linearly dependent columns from A leads to the formation of a minimal basis set that spans the phase space of the problem at hand well. By recasting the problem of redundancy in multiple structural alignments into this framework, in which the matrix A now describes the multiple alignment, we adapted the QR factorization to produce a minimal basis set of protein structures which best spans the evolutionary (phase) space. The non-redundant and representative profiles obtained from this procedure, termed evolutionary profiles, are shown in initial results to outperform well-tested profiles in homology detection searches over a large sequence database. A measure of structural similarity between homologous proteins, Q(H), is presented. By properly accounting for the effect and presence of gaps, a phylogenetic tree computed using this metric is shown to be congruent with the maximum-likelihood sequence-based phylogeny. The results indicate that evolutionary information is indeed recoverable from the comparative analysis of protein structure alone. Applications of the QR ordering and this structural similarity metric to analyze the evolution of structure among key, universally distributed proteins involved in translation, and to the selection of representatives from an ensemble of NMR structures are also discussed.
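The column re-ordering that the classical pivoted QR factorization performs can be sketched in plain Python. This is only the ordinary two-dimensional case, written as a greedy Gram-Schmidt procedure; the multidimensional variant of the paper is not reproduced, and the name `pivot_order` and the toy columns are assumptions.

```python
import math


def pivot_order(columns, tol=1e-12):
    """Order column indices the way QR with column pivoting would.

    Columns listed first are the most linearly independent; columns
    listed last are the most redundant given those before them.
    """
    resid = [[float(x) for x in col] for col in columns]
    remaining = list(range(len(resid)))
    order = []
    while remaining:
        # Pivot on the column with the largest remaining (residual) norm.
        k = max(remaining, key=lambda idx: sum(x * x for x in resid[idx]))
        norm = math.sqrt(sum(x * x for x in resid[k]))
        order.append(k)
        remaining.remove(k)
        if norm <= tol:
            # Everything left is numerically dependent on earlier picks.
            order.extend(remaining)
            break
        q = [x / norm for x in resid[k]]
        # Remove the chosen direction from every remaining column.
        for idx in remaining:
            proj = sum(a * b for a, b in zip(q, resid[idx]))
            resid[idx] = [a - proj * b for a, b in zip(resid[idx], q)]
    return order


# Columns 0 and 1 are near-duplicates, so one of them ends up last.
cols = [[1, 0, 0], [1, 0.01, 0], [0, 1, 0], [0, 0, 2]]
print(pivot_order(cols))  # [3, 1, 2, 0]
```

Truncating the returned order after the first few entries gives a small representative set, which is the role the abstract assigns to the non-redundant structures.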
Exploring Evolutionary Patterns in Genetic Sequence: A Computer Exercise

ERIC Educational Resources Information Center

Shumate, Alice M.; Windsor, Aaron J.

2010-01-01

The increase in publications presenting molecular evolutionary analyses and the availability of comparative sequence data through resources such as NCBI's GenBank underscore the necessity of providing undergraduates with hands-on sequence analysis skills in an evolutionary context. This need is particularly acute given that students have been…
HI Absorption in Merger Remnants

NASA Technical Reports Server (NTRS)

Teng, Stacy H.; Veilleux, Sylvain; Baker, Andrew J.

2012-01-01

It has been proposed that ultraluminous infrared galaxies (ULIRGs) pass through a luminous starburst phase, followed by a dust-enshrouded AGN phase, and finally evolve into optically bright "naked" quasars once they shed their gas/dust reservoirs through powerful wind events. We present the results of our recent 21-cm HI survey of 21 merger remnants with the Green Bank Telescope. These remnants were selected from the QUEST (Quasar/ULIRG Evolution Study) sample of ULIRGs and PG quasars; our targets are all bolometrically dominated by AGN and sample all phases of the proposed ULIRG -> IR-excess quasar -> optical quasar sequence. We explore whether there is an evolutionary connection between ULIRGs and quasars by looking for the occurrence of HI absorption tracing neutral gas outflows; our results will allow us to identify where along the sequence the majority of a merger's gas reservoir is expelled.
A theoretical and observational study of the Red Giant Branch phase transition in Magellanic Cloud clusters - A progress report

NASA Technical Reports Server (NTRS)

Buonanno, R.; Corsi, C. E.; Fusi Pecci, F.; Greggio, L.; Renzini, A.; Sweigart, A. V.

1986-01-01

Preliminary results are reported for an investigation comparing theoretical models of the sudden appearance of an extended RGB (and its effects on the spectral energy distributions of stellar populations) with data from ESO CCD observations of clusters in the LMC and SMC. Isochrones for the entire RGB are being constructed on the basis of 100 new evolutionary sequences (calculated using the evolution code of Sweigart and Gross, 1976 and 1978) to permit determination of synthetic colors and spectral energy distributions. The observations so far indicate a main sequence about 0.1 mag redder than that predicted by the present models or by the isochrones of VandenBerg and Bell (1985), and fail to show a B-V color difference at the RGB phase transition.
Toward a method for tracking virus evolutionary trajectory applied to the pandemic H1N1 2009 influenza virus.

PubMed

Squires, R Burke; Pickett, Brett E; Das, Sajal; Scheuermann, Richard H

2014-12-01

In 2009 a novel pandemic H1N1 influenza virus (H1N1pdm09) emerged as the first official influenza pandemic of the 21st century. Early genomic sequence analysis pointed to the swine origin of the virus. Here we report a novel computational approach to determine the evolutionary trajectory of viral sequences that uses data-driven estimations of nucleotide substitution rates to track the gradual accumulation of observed sequence alterations over time. Phylogenetic analysis and multiple sequence alignments show that sequences belonging to the resulting evolutionary trajectory of the H1N1pdm09 lineage exhibit a gradual accumulation of sequence variations and tight temporal correlations in the topological structure of the phylogenetic trees. These results suggest that our evolutionary trajectory analysis (ETA) can more effectively pinpoint the evolutionary history of viruses, including the host and geographical location traversed by each segment, when compared against either BLAST or traditional phylogenetic analysis alone. Copyright © 2014 Elsevier B.V. All rights reserved.
Bacterial Genome Instability

PubMed Central

Darmon, Elise

2014-01-01

SUMMARY Bacterial genomes are remarkably stable from one generation to the next but are plastic on an evolutionary time scale, substantially shaped by horizontal gene transfer, genome rearrangement, and the activities of mobile DNA elements. This implies the existence of a delicate balance between the maintenance of genome stability and the tolerance of genome instability. In this review, we describe the specialized genetic elements and the endogenous processes that contribute to genome instability. We then discuss the consequences of genome instability at the physiological level, where cells have harnessed instability to mediate phase and antigenic variation, and at the evolutionary level, where horizontal gene transfer has played an important role. Indeed, this ability to share DNA sequences has played a major part in the evolution of life on Earth. The evolutionary plasticity of bacterial genomes, coupled with the vast numbers of bacteria on the planet, substantially limits our ability to control disease. PMID:24600039
Neutral tumor evolution in myeloma is associated with poor prognosis.

PubMed

Johnson, David C; Lenive, Oleg; Mitchell, Jonathan; Jackson, Graham; Owen, Roger; Drayson, Mark; Cook, Gordon; Jones, John R; Pawlyn, Charlotte; Davies, Faith E; Walker, Brian A; Wardell, Christopher; Gregory, Walter M; Cairns, David; Morgan, Gareth J; Houlston, Richard S; Kaiser, Martin F

2017-10-05

Recent studies suggest that the evolutionary history of a cancer is important in forecasting clinical outlook. To gain insight into the clonal dynamics of multiple myeloma (MM) and its possible influence on patient outcomes, we analyzed whole exome sequencing tumor data for 333 patients from Myeloma XI, a UK phase 3 trial, and 434 patients from the CoMMpass study, all of whom had received immunomodulatory drug (IMiD) therapy. By analyzing mutant allele frequency distributions in tumors, we found that 17% to 20% of MM is under neutral evolutionary dynamics. These tumors are associated with poorer patient survival in nonintensively treated patients, which is consistent with the reduced therapeutic efficacy of microenvironment-modulating IMiDs. Our findings provide evidence that knowledge of the evolutionary history of MM has relevance for predicting patient outcomes and personalizing therapy. © 2017 by The American Society of Hematology.
Ginkgo biloba's footprint of dynamic Pleistocene history dates back only 390,000 years ago.

PubMed

Hohmann, Nora; Wolf, Eva M; Rigault, Philippe; Zhou, Wenbin; Kiefer, Markus; Zhao, Yunpeng; Fu, Cheng-Xin; Koch, Marcus A

2018-04-27

At the end of the Pliocene and the beginning of the Pleistocene glaciation and deglaciation cycles, Ginkgo biloba went extinct all over the world, and only a few populations remained in China in relict areas serving as sanctuaries for Tertiary relict trees. Yet the status of these regions as refuge areas with naturally existing populations was established only about a decade ago. Herein we elaborated the hypothesis that during the Pleistocene cooling periods G. biloba expanded its distribution range in China repeatedly. Whole plastid genomes were sequenced, assembled and annotated, and sequence data were analyzed in a phylogenetic framework of the entire gymnosperms to establish a robust spatio-temporal framework for gymnosperms and in particular for the Pleistocene evolutionary history of G. biloba. Using a phylogenetic approach, we identified that the Ginkgoatae stem group age is about 325 million years, whereas crown group radiation of extant Ginkgo started not earlier than 390,000 years ago. During repeated warming phases, Ginkgo populations were separated and isolated by contraction of distribution range and retreated into mountainous regions serving as refuge for warm-temperate deciduous forests. Diversification and phylogenetic splits correlate with the onset of cooling phases when Ginkgo expanded its distribution range and gene pools merged. Analysis of whole plastid genome sequence data representing the entire spatio-temporal genetic variation of wild extant Ginkgo populations revealed the deepest temporal footprint dating back to approximately 390,000 years ago. Present-day directional West-East admixture of genetic diversity is shown to be the result of pronounced effects of the last cooling period. Our evolutionary framework will serve as a conceptual roadmap for forthcoming genomic sequence data, which can then provide deep insights into the demographic history of Ginkgo.
The genome sequence of the outbreeding globe artichoke constructed de novo incorporating a phase-aware low-pass sequencing strategy of F1 progeny

PubMed Central

Scaglione, Davide; Reyes-Chin-Wo, Sebastian; Acquadro, Alberto; Froenicke, Lutz; Portis, Ezio; Beitel, Christopher; Tirone, Matteo; Mauro, Rosario; Lo Monaco, Antonino; Mauromicale, Giovanni; Faccioli, Primetta; Cattivelli, Luigi; Rieseberg, Loren; Michelmore, Richard; Lanteri, Sergio

2016-01-01

Globe artichoke (Cynara cardunculus var. scolymus) is an out-crossing, perennial, multi-use crop species that is grown worldwide and belongs to the Compositae, one of the most successful Angiosperm families. We describe the first genome sequence of globe artichoke. The assembly, comprising 13,588 scaffolds covering 725 of the 1,084 Mb genome, was generated using ~133-fold Illumina sequencing data and encodes 26,889 predicted genes. Re-sequencing (30×) of globe artichoke and cultivated cardoon (C. cardunculus var. altilis) parental genotypes and low-coverage (0.5 to 1×) genotyping-by-sequencing of 163 F1 individuals resulted in 73% of the assembled genome being anchored in 2,178 genetic bins ordered along 17 chromosomal pseudomolecules. This was achieved using a novel pipeline, SOILoCo (Scaffold Ordering by Imputation with Low Coverage), to detect heterozygous regions and assign parental haplotypes with low sequencing read depth and of unknown phase. SOILoCo provides a powerful tool for de novo genome analysis of outcrossing species. Our data will enable genome-scale analyses of evolutionary processes among crops, weeds, and wild species within and beyond the Compositae, and will facilitate the identification of economically important genes from related species. PMID:26786968

TESTING CONVECTIVE-CORE OVERSHOOTING USING PERIOD SPACINGS OF DIPOLE MODES IN RED GIANTS

DOE Office of Scientific and Technical Information (OSTI.GOV)

Montalban, J.; Noels, A.; Dupret, M.-A.

2013-04-01

Uncertainties on central mixing in main-sequence (MS) and core He-burning (He-B) phases affect key predictions of stellar evolution such as late evolutionary phases, chemical enrichment, ages, etc. We propose a test of the extension of extra-mixing in two relevant evolutionary phases based on period spacing (ΔP) of solar-like oscillating giants. From stellar models and their corresponding adiabatic frequencies (respectively, computed with ATON and LOSC codes), we provide the first predictions of the observable ΔP for stars in the red giant branch and in the red clump (RC). We find (1) a clear correlation between ΔP and the mass of the helium core (M_He); the latter in intermediate-mass stars depends on the MS overshooting, and hence it can be used to set constraints on extra-mixing during MS when coupled with chemical composition; and (2) a linear dependence of the average value of the asymptotic period spacing (⟨ΔP⟩_a) on the size of the convective core during the He-B phase. A first comparison with the inferred asymptotic period spacing for Kepler RC stars also suggests the need for extra-mixing during this phase, as evinced from other observational facts.
PSSRdb: a relational database of polymorphic simple sequence repeats extracted from prokaryotic genomes.

PubMed

Kumar, Pankaj; Chaitanya, Pasumarthy S; Nagarajaram, Hampapathalu A

2011-01-01

PSSRdb (Polymorphic Simple Sequence Repeats database) (http://www.cdfd.org.in/PSSRdb/) is a relational database of polymorphic simple sequence repeats (PSSRs) extracted from 85 different species of prokaryotes. Simple sequence repeats (SSRs) are tandem repeats of nucleotide motifs of sizes 1-6 bp and are highly polymorphic. SSR mutations in and around coding regions affect transcription and translation of genes. Such changes underpin phase variations and antigenic variations seen in some bacteria. Although SSR-mediated phase variation and antigenic variation have been well studied in some bacteria, many other prokaryotic species have yet to be investigated for SSR-mediated adaptive and other evolutionary advantages. As part of our ongoing studies on SSR polymorphism in prokaryotes, we compared the genome sequences of various strains and isolates available for 85 different species of prokaryotes, extracted a number of SSRs showing length variations, and created a relational database called PSSRdb. This database gives useful information such as location of PSSRs in genomes, length variation across genomes, the regions harboring PSSRs, etc. The information provided in this database is very useful for further research and analysis of SSRs in prokaryotes.
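
The SSR definition above (tandem repeats of 1-6 bp motifs) can be illustrated with a minimal sketch. This is not the PSSRdb extraction pipeline, whose details the abstract does not give; the function names, the `min_repeats` threshold, and the restriction to perfect (uninterrupted) repeats are assumptions made for illustration.

```python
import re

def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'ATAT' is not)."""
    n = len(motif)
    for d in range(1, n):
        if n % d == 0 and motif[:d] * (n // d) == motif:
            return False
    return True

def find_ssrs(seq, min_repeats=3, max_motif=6):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    Assumption: only exact, uninterrupted repeats of motifs 1..max_motif bp
    are reported; real SSR finders often tolerate mismatches.
    """
    seq = seq.upper()
    hits = []
    for k in range(1, max_motif + 1):
        # A k-bp motif followed by at least (min_repeats - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):
                hits.append((m.start(), motif, len(m.group(0)) // k))
    return hits

print(find_ssrs("GGATATATATCC"))
```

Running the same function on the corresponding region of several strains and comparing the repeat counts is one simple way to flag a length-polymorphic SSR.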
Rapidly rotating polytropes in general relativity

NASA Technical Reports Server (NTRS)

Cook, Gregory B.; Shapiro, Stuart L.; Teukolsky, Saul A.

1994-01-01

We construct an extensive set of equilibrium sequences of rotating polytropes in general relativity. We determine a number of important physical parameters of such stars, including maximum mass and maximum spin rate. The stability of the configurations against quasi-radial perturbations is diagnosed. Two classes of evolutionary sequences of fixed rest mass and entropy are explored: normal sequences which behave very much like Newtonian evolutionary sequences, and supramassive sequences which exist solely because of relativistic effects. Dissipation leading to loss of angular momentum causes a star to evolve in a quasi-stationary fashion along an evolutionary sequence. Supramassive sequences evolve towards eventual catastrophic collapse to a black hole. Prior to collapse, the star must spin up as it loses angular momentum, an effect which may provide an observational precursor to gravitational collapse to a black hole.
Hidden long evolutionary memory in a model biochemical network

NASA Astrophysics Data System (ADS)

Ali, Md. Zulfikar; Wingreen, Ned S.; Mukhopadhyay, Ranjan

2018-04-01

We introduce a minimal model for the evolution of functional protein-interaction networks using a sequence-based mutational algorithm, and apply the model to study neutral drift in networks that yield oscillatory dynamics. Starting with a functional core module, random evolutionary drift increases network complexity even in the absence of specific selective pressures. Surprisingly, we uncover a hidden order in sequence space that gives rise to long-term evolutionary memory, implying strong constraints on network evolution due to the topology of accessible sequence space.
From protostellar to pre-main-sequence evolution

NASA Astrophysics Data System (ADS)

D'Antona, F.

I summarize the status of pre-main-sequence evolutionary tracks, starting from the first steps dating back to the concept of the Hayashi track. Understanding of the dynamical protostellar phase in the vision of Palla & Stahler, who introduced the concepts of the deuterium-burning thermostat and of the stellar birthline, provided for a long time a link between the dynamical and hydrostatic evolution. Disk accretion, however, changed this view considerably, while re-introducing some ambiguities that must still be resolved. The limitations and uncertainties in the mass and age determination from models for young stellar objects are summarized, but the burning of light elements is still a powerful observational signature.
Extraction of High Molecular Weight DNA from Fungal Rust Spores for Long Read Sequencing.

PubMed

Schwessinger, Benjamin; Rathjen, John P

2017-01-01

Wheat rust fungi are complex organisms with a complete life cycle that involves two different host plants and five different spore types. During the asexual infection cycle on wheat, rusts produce massive amounts of dikaryotic urediniospores. These spores are dikaryotic (two nuclei) with each nucleus containing one haploid genome. This dikaryotic state is likely to contribute to their evolutionary success, making them some of the major wheat pathogens globally. Despite this, most published wheat rust genomes are highly fragmented and contain very little haplotype-specific sequence information. Current long-read sequencing technologies hold great promise to provide more contiguous and haplotype-phased genome assemblies. Long reads are able to span repetitive regions and phase structural differences between the haplomes. This increased genome resolution enables the identification of complex loci and the study of genome evolution beyond simple nucleotide polymorphisms. Long-read technologies require pure high molecular weight DNA as an input for sequencing. Here, we describe a DNA extraction protocol for rust spores that yields pure double-stranded DNA molecules with molecular weight of >50 kilo-base pairs (kbp). The isolated DNA is of sufficient purity for PacBio long-read sequencing, but may require additional purification for other sequencing technologies such as Nanopore and 10× Genomics.
Transcriptome analysis of the desert locust central nervous system: production and annotation of a Schistocerca gregaria EST database.

PubMed

Badisco, Liesbeth; Huybrechts, Jurgen; Simonet, Gert; Verlinden, Heleen; Marchal, Elisabeth; Huybrechts, Roger; Schoofs, Liliane; De Loof, Arnold; Vanden Broeck, Jozef

2011-03-21

The desert locust (Schistocerca gregaria) displays a fascinating type of phenotypic plasticity, designated as 'phase polyphenism'. Depending on environmental conditions, one genome can be translated into two highly divergent phenotypes, termed the solitarious and gregarious (swarming) phase. Although many of the underlying molecular events remain elusive, the central nervous system (CNS) is expected to play a crucial role in the phase transition process. Locusts have also proven to be interesting model organisms in a physiological and neurobiological research context. However, molecular studies in locusts are hampered by the fact that genome/transcriptome sequence information available for this branch of insects is still limited. We have generated 34,672 raw expressed sequence tags (EST) from the CNS of desert locusts in both phases. These ESTs were assembled in 12,709 unique transcript sequences and nearly 4,000 sequences were functionally annotated. Moreover, the obtained S. gregaria EST information is highly complementary to the existing orthopteran transcriptomic data. Since many novel transcripts encode neuronal signaling and signal transduction components, this paper includes an overview of these sequences. Furthermore, several transcripts being differentially represented in solitarious and gregarious locusts were retrieved from this EST database. The findings highlight the involvement of the CNS in the phase transition process and indicate that this novel annotated database may also add to the emerging knowledge of concomitant neuronal signaling and neuroplasticity events. In summary, we met the need for novel sequence data from desert locust CNS. To our knowledge, we hereby also present the first insect EST database that is derived from the complete CNS. The obtained S. gregaria EST data constitute an important new source of information that will be instrumental in further unraveling the molecular principles of phase polyphenism, in further establishing locusts as valuable research model organisms and in molecular evolutionary and comparative entomology.
Discovery radiomics via evolutionary deep radiomic sequencer discovery for pathologically proven lung cancer detection.

PubMed

Shafiee, Mohammad Javad; Chung, Audrey G; Khalvati, Farzad; Haider, Masoom A; Wong, Alexander

2017-10-01

While lung cancer is the second most diagnosed form of cancer in men and women, a sufficiently early diagnosis can be pivotal in patient survival rates. Imaging-based, or radiomics-driven, detection methods have been developed to aid diagnosticians, but largely rely on hand-crafted features that may not fully encapsulate the differences between cancerous and healthy tissue. Recently, the concept of discovery radiomics was introduced, where custom abstract features are discovered from readily available imaging data. We propose an evolutionary deep radiomic sequencer discovery approach based on evolutionary deep intelligence. Motivated by patient privacy concerns and the idea of operational artificial intelligence, the evolutionary deep radiomic sequencer discovery approach organically evolves increasingly more efficient deep radiomic sequencers that produce significantly more compact yet similarly descriptive radiomic sequences over multiple generations. As a result, this framework improves operational efficiency and enables diagnosis to be run locally at the radiologist's computer while maintaining detection accuracy. We evaluated the evolved deep radiomic sequencer (EDRS) discovered via the proposed evolutionary deep radiomic sequencer discovery framework against state-of-the-art radiomics-driven and discovery radiomics methods using clinical lung CT data with pathologically proven diagnostic data from the LIDC-IDRI dataset. The EDRS shows improved sensitivity (93.42%), specificity (82.39%), and diagnostic accuracy (88.78%) relative to previous radiomics approaches.
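
The three figures reported above (sensitivity, specificity, diagnostic accuracy) follow from the standard confusion-matrix definitions. The sketch below shows those definitions only; the counts in the usage line are hypothetical and are not taken from the LIDC-IDRI evaluation.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Standard definitions from confusion-matrix counts.

    tp/fn: diseased cases detected/missed; tn/fp: healthy cases cleared/flagged.
    """
    sensitivity = tp / (tp + fn)                # true positive rate
    specificity = tn / (tn + fp)                # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)  # fraction of all cases correct
    return sensitivity, specificity, accuracy

# Hypothetical counts, chosen only to illustrate the arithmetic.
print(diagnostic_metrics(tp=90, fn=10, tn=80, fp=20))
```

Accuracy is a prevalence-weighted mix of the other two, which is why it falls between sensitivity and specificity in the abstract's reported values.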
Mhc class II B gene evolution in East African cichlid fishes.

PubMed

Figueroa, F; Mayer, W E; Sültmann, H; O'hUigin, C; Tichy, H; Satta, Y; Takezaki, N; Takahata, N; Klein, J

2000-06-01

A distinctive feature of essential major histocompatibility complex (Mhc) loci is their polymorphism characterized by large genetic distances between alleles and long persistence times of allelic lineages. Since the lineages often span several successive speciations, we investigated the behavior of the Mhc alleles during or close to the speciation phase. We sequenced exon 2 of the class II B locus 4 from 232 East African cichlid fishes representing 32 related species. The divergence times of the (sub)species ranged from 6,000 to 8.4 million years. Two types of evolutionary analysis were used to elucidate the pattern of exon 2 sequence divergence. First, phylogenetic methods were applied to reconstruct the most likely evolutionary pathways leading from the last common ancestor of the set to the extant sequences, and to assess the probable mechanisms involved in allelic diversification. Second, pairwise comparisons of sequences were carried out to detect differences seemingly incompatible with origin by nonparallel point mutations. The analysis revealed point mutations to be the most important mechanism behind allelic divergences, with recombination playing only an auxiliary part. Comparison of sequences from related species revealed evidence of random allelic (lineage) losses apparently associated with speciation. Sharing of identical alleles could be demonstrated between species that diverged 2 million years ago. The phylogeny of the exon was incongruent with that of the flanking introns, indicating either a high degree of convergent evolution at the peptide-binding region-encoding sites, or intron homogenization.
Evolutionary status of isolated B[e] stars

NASA Astrophysics Data System (ADS)

Lee, Chien-De; Chen, Wen-Ping; Liu, Sheng-Yuan

2016-08-01

Aims: We study a sample of eight B[e] stars with uncertain evolutionary status to shed light on the origin of their circumstellar dust. Methods: We performed a diagnostic analysis on the spectral energy distribution beyond infrared wavelengths, and conducted a census of the neighboring region of each target to ascertain its evolutionary status. Results: In comparison to pre-main sequence Herbig stars, these B[e] stars show equally substantial excess emission in the near-infrared, indicative of the existence of warm dust, but much reduced excess at longer wavelengths, so the dusty envelopes should be compact in size. Isolation from star-forming regions excludes the possibility of their pre-main sequence status. Six of our targets, including HD 50138, HD 45677, CD-24 5721, CD-49 3441, MWC 623, and HD 85567, have been previously considered as FS CMa stars, whereas HD 181615/6 and HD 98922 are added to the sample by this work. We argue that the circumstellar grains of these isolated B[e] stars, already evolved beyond the pre-main sequence phase, should be formed in situ. This is in contrast to Herbig stars, which inherit large grains from parental molecular clouds. It has been thought that HD 98922, in particular, is a Herbig star because of its large infrared excess, but we propose that it is in a more evolved stage. Because dust condenses out of stellar mass loss in an inside-out manner, the dusty envelope is spatially confined, and anisotropic mass flows, or anomalous optical properties of tiny grains, lead to the generally low line-of-sight extinction toward these stars.
Molecular selection in a unified evolutionary sequence

NASA Technical Reports Server (NTRS)

Fox, S. W.

1986-01-01

With guidance from experiments and observations that indicate internally limited phenomena, an outline of unified evolutionary sequence is inferred. Such unification is not visible for a context of random matrix and random mutation. The sequence proceeds from Big Bang through prebiotic matter, protocells, through the evolving cell via molecular and natural selection, to mind, behavior, and society.
EGenBio: A Data Management System for Evolutionary Genomics and Biodiversity

PubMed Central

Nahum, Laila A; Reynolds, Matthew T; Wang, Zhengyuan O; Faith, Jeremiah J; Jonna, Rahul; Jiang, Zhi J; Meyer, Thomas J; Pollock, David D

2006-01-01

Background: Evolutionary genomics requires management and filtering of large numbers of diverse genomic sequences for accurate analysis and inference on evolutionary processes of genomic and functional change. We developed Evolutionary Genomics and Biodiversity (EGenBio) to begin to address this. Description: EGenBio is a system for manipulation and filtering of large numbers of sequences, integrating curated sequence alignments and phylogenetic trees, managing evolutionary analyses, and visualizing their output. EGenBio is organized into three conceptual divisions, Evolution, Genomics, and Biodiversity. The Genomics division includes tools for selecting pre-aligned sequences from different genes and species, and for modifying and filtering these alignments for further analysis. Species searches are handled through queries that can be modified based on a tree-based navigation system and saved. The Biodiversity division contains tools for analyzing individual sequences or sequence alignments, whereas the Evolution division contains tools involving phylogenetic trees. Alignments are annotated with analytical results and modification history using our PRAED format. A miscellaneous Tools section and Help framework are also available. EGenBio was developed around our comparative genomic research and a prototype database of mtDNA genomes. It utilizes MySQL-relational databases and dynamic page generation, and calls numerous custom programs. Conclusion: EGenBio was designed to serve as a platform for tools and resources to ease combined analysis in evolution, genomics, and biodiversity. PMID:17118150
TARGETED CAPTURE IN EVOLUTIONARY AND ECOLOGICAL GENOMICS

PubMed Central

Jones, Matthew R.; Good, Jeffrey M.

2016-01-01

The rapid expansion of next-generation sequencing has yielded a powerful array of tools to address fundamental biological questions at a scale that was inconceivable just a few years ago. Various genome partitioning strategies to sequence select subsets of the genome have emerged as powerful alternatives to whole genome sequencing in ecological and evolutionary genomic studies. High throughput targeted capture is one such strategy that involves the parallel enrichment of pre-selected genomic regions of interest. The growing use of targeted capture demonstrates its potential power to address a range of research questions, yet these approaches have yet to expand broadly across labs focused on evolutionary and ecological genomics. In part, the use of targeted capture has been hindered by the logistics of capture design and implementation in species without established reference genomes. Here we aim to 1) increase the accessibility of targeted capture to researchers working in non-model taxa by discussing capture methods that circumvent the need of a reference genome, 2) highlight the evolutionary and ecological applications where this approach is emerging as a powerful sequencing strategy, and 3) discuss the future of targeted capture and other genome partitioning approaches in light of the increasing accessibility of whole genome sequencing. Given the practical advantages and increasing feasibility of high-throughput targeted capture, we anticipate an ongoing expansion of capture-based approaches in evolutionary and ecological research, synergistic with an expansion of whole genome sequencing. PMID:26137993
Analyses of Evolutionary Characteristics of the Hemagglutinin-Esterase Gene of Influenza C Virus during a Period of 68 Years Reveals Evolutionary Patterns Different from Influenza A and B Viruses

PubMed Central

Furuse, Yuki; Matsuzaki, Yoko; Nishimura, Hidekazu; Oshitani, Hitoshi

2016-01-01

Infections with the influenza C virus causing respiratory symptoms are common, particularly among children. Since isolation and detection of the virus are rarely performed, compared with influenza A and B viruses, the small number of available sequences of the virus makes it difficult to analyze its evolutionary dynamics. Recently, we reported the full genome sequence of 102 strains of the virus. Here, we exploited the data to elucidate the evolutionary characteristics and phylodynamics of the virus compared with influenza A and B viruses. Along with our data, we obtained public sequence data of the hemagglutinin-esterase gene of the virus; the dataset consists of 218 unique sequences of the virus collected from 14 countries between 1947 and 2014. Informatics analyses revealed that (1) multiple lineages have been circulating globally; (2) there have been weak and infrequent selective bottlenecks; (3) the evolutionary rate is low because of weak positive selection and a low capability to induce mutations; and (4) there is no significant positive selection although a few mutations affecting its antigenicity have been induced. The unique evolutionary dynamics of the influenza C virus must be shaped by multiple factors, including virological, immunological, and epidemiological characteristics. PMID:27898037
ConSurf 2016: an improved methodology to estimate and visualize evolutionary conservation in macromolecules

PubMed Central

Ashkenazy, Haim; Abadi, Shiran; Martz, Eric; Chay, Ofer; Mayrose, Itay; Pupko, Tal; Ben-Tal, Nir

2016-01-01

The degree of evolutionary conservation of an amino acid in a protein or a nucleic acid in DNA/RNA reflects a balance between its natural tendency to mutate and the overall need to retain the structural integrity and function of the macromolecule. The ConSurf web server (http://consurf.tau.ac.il), established over 15 years ago, analyses the evolutionary pattern of the amino/nucleic acids of the macromolecule to reveal regions that are important for structure and/or function. Starting from a query sequence or structure, the server automatically collects homologues, infers their multiple sequence alignment and reconstructs a phylogenetic tree that reflects their evolutionary relations. These data are then used, within a probabilistic framework, to estimate the evolutionary rates of each sequence position. Here we introduce several new features into ConSurf, including automatic selection of the best evolutionary model used to infer the rates, the ability to homology-model query proteins, prediction of the secondary structure of query RNA molecules from sequence, the ability to view the biological assembly of a query (in addition to the single chain), mapping of the conservation grades onto 2D RNA models and an advanced view of the phylogenetic tree that enables interactively rerunning ConSurf with the taxa of a sub-tree. PMID:27166375
Evolution of Enzyme Superfamilies: Comprehensive Exploration of Sequence-Function Relationships.

PubMed

Baier, F; Copp, J N; Tokuriki, N

2016-11-22

The sequence and functional diversity of enzyme superfamilies have expanded through billions of years of evolution from a common ancestor. Understanding how protein sequence and functional "space" have expanded, at both the evolutionary and molecular level, is central to biochemistry, molecular biology, and evolutionary biology. Integrative approaches that examine protein sequence, structure, and function have begun to provide comprehensive views of the functional diversity and evolutionary relationships within enzyme superfamilies. In this review, we outline the recent advances in our understanding of enzyme evolution and superfamily functional diversity. We describe the tools that have been used to comprehensively analyze sequence relationships and to characterize sequence and function relationships. We also highlight recent large-scale experimental approaches that systematically determine the activity profiles across enzyme superfamilies. We identify several intriguing insights from this recent body of work. First, promiscuous activities are prevalent among extant enzymes. Second, many divergent proteins retain "function connectivity" via enzyme promiscuity, which can be used to probe the evolutionary potential and history of enzyme superfamilies. Finally, we discuss open questions regarding the intricacies of enzyme divergence, as well as potential research directions that will deepen our understanding of enzyme superfamily evolution.
Evolutionary dynamics of retrotransposons assessed by high-throughput sequencing in wild relatives of wheat.

PubMed

Senerchia, Natacha; Wicker, Thomas; Felber, François; Parisod, Christian

2013-01-01

Transposable elements (TEs) represent a major fraction of plant genomes and drive their evolution. An improved understanding of genome evolution requires the dynamics of a large number of TE families to be considered. We put forward an approach bypassing the required step of a complete reference genome to assess the evolutionary trajectories of high copy number TE families from genome snapshot with high-throughput sequencing. Low coverage sequencing of the complex genomes of Aegilops cylindrica and Ae. geniculata using 454 identified more than 70% of the sequences as known TEs, mainly long terminal repeat (LTR) retrotransposons. Comparing the abundance of reads as well as patterns of sequence diversity and divergence within and among genomes assessed the dynamics of 44 major LTR retrotransposon families of the 165 identified. In particular, molecular population genetics on individual TE copies distinguished recently active from quiescent families and highlighted different evolutionary trajectories of retrotransposons among related species. This work presents a suite of tools suitable for current sequencing data, making it possible to address the genome-wide evolutionary dynamics of TEs at the family level and advancing our understanding of the evolution of nonmodel genomes.
The evolution of high-metallicity horizontal-branch stars and the origin of the ultraviolet light in elliptical galaxies

NASA Technical Reports Server (NTRS)

Horch, E.; Demarque, P.; Pinsonneault, M.

1992-01-01

Evolutionary calculations of high-metallicity horizontal-branch stars show that for the relevant masses and helium abundances, post-HB evolution in the HR diagram does not proceed toward and along the AGB, but rather toward a 'slow blue phase' in the vicinity of the helium-burning main sequence, following the extinction of the hydrogen shell energy source. For solar and twice solar metallicity, the blue phase begins during the helium shell-burning phase (in agreement with the work of Brocato and Castellani and Tornambe); for 3 times solar metallicity, it begins earlier, during the helium core-burning phase. This behavior differs from what takes place at lower metallicities. The implications for high-metallicity old stellar populations in the Galactic bulge and for the integrated colors of elliptical galaxies are discussed.
Open Reading Frame Phylogenetic Analysis on the Cloud

PubMed Central

2013-01-01

Phylogenetic analysis has become essential in researching the evolutionary relationships between viruses. These relationships are depicted on phylogenetic trees, in which viruses are grouped based on sequence similarity. Viral evolutionary relationships are identified from open reading frames rather than from complete sequences. Recently, cloud computing has become popular for developing internet-based bioinformatics tools. Biocloud is an efficient, scalable, and robust bioinformatics computing service. In this paper, we propose a cloud-based open reading frame phylogenetic analysis service. The proposed service integrates the Hadoop framework, virtualization technology, and phylogenetic analysis methods to provide a high-availability, large-scale bioservice. In a case study, we analyze the phylogenetic relationships among Norovirus. Evolutionary relationships are elucidated by aligning different open reading frame sequences. The proposed platform correctly identifies the evolutionary relationships between members of Norovirus. PMID:23671843
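
Since the service described above builds its phylogenies from open reading frames, a minimal sketch of ORF extraction may help. It is not the Hadoop-based pipeline of the paper; the function name, the forward-strand-only scan, the ATG start codon, and the `min_codons` cutoff are all assumptions made for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_codons=2):
    """Return (start, end, orf_sequence) for ATG...stop ORFs on the forward strand.

    Assumption: only the three forward reading frames are scanned; a full
    implementation would also scan the reverse complement.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # open a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, seq[start:i + 3]))
                start = None                   # close it and keep scanning
    return orfs

print(find_orfs("CCATGAAATGACC"))
```

The extracted ORF sequences, rather than whole genomes, would then be aligned and passed to a tree-building method.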

A Generative Angular Model of Protein Structure Evolution

PubMed Central

Golden, Michael; García-Portugués, Eduardo; Sørensen, Michael; Mardia, Kanti V.; Hamelryck, Thomas; Hein, Jotun

2017-01-01

Recently described stochastic models of protein evolution have demonstrated that the inclusion of structural information in addition to amino acid sequences leads to a more reliable estimation of evolutionary parameters. We present a generative, evolutionary model of protein structure and sequence that is valid on a local length scale. The model concerns the local dependencies between sequence and structure evolution in a pair of homologous proteins. The evolutionary trajectory between the two structures in the protein pair is treated as a random walk in dihedral angle space, which is modeled using a novel angular diffusion process on the two-dimensional torus. Coupling sequence and structure evolution in our model allows for modeling both “smooth” conformational changes and “catastrophic” conformational jumps, conditioned on the amino acid changes. The model has interpretable parameters and is comparatively more realistic than previous stochastic models, providing new insights into the relationship between sequence and structure evolution. For example, using the trained model we were able to identify an apparent sequence–structure evolutionary motif present in a large number of homologous protein pairs. The generative nature of our model enables us to evaluate its validity and its ability to simulate aspects of protein evolution conditioned on an amino acid sequence, a related amino acid sequence, a related structure or any combination thereof. PMID:28453724
Evolutionary distances in the twilight zone--a rational kernel approach.

PubMed

Schwarz, Roland F; Fletcher, William; Förster, Frank; Merget, Benjamin; Wolf, Matthias; Schultz, Jörg; Markowetz, Florian

2010-12-31

Phylogenetic tree reconstruction is traditionally based on multiple sequence alignments (MSAs) and heavily depends on the validity of this information bottleneck. With increasing sequence divergence, the quality of MSAs decays quickly. Alignment-free methods, on the other hand, are based on abstract string comparisons and avoid potential alignment problems. However, in general they are not biologically motivated and ignore our knowledge about the evolution of sequences. Thus, it is still a major open question how to define an evolutionary distance metric between divergent sequences that makes use of indel information and known substitution models without the need for a multiple alignment. Here we propose a new evolutionary distance metric to close this gap. It uses finite-state transducers to create a biologically motivated similarity score which models substitutions and indels, and does not depend on a multiple sequence alignment. The sequence similarity score is defined in analogy to pairwise alignments and additionally has the positive semi-definite property. We describe its derivation and show in simulation studies and real-world examples that it is more accurate in reconstructing phylogenies than competing methods. The result is a new and accurate way of determining evolutionary distances in and beyond the twilight zone of sequence alignments that is suitable for large datasets.
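
The link between a positive semi-definite similarity score and a distance can be sketched with a much simpler kernel than the one proposed above. The example swaps in a k-mer spectrum kernel (a plain alignment-free string kernel) in place of the paper's finite-state-transducer kernel, which models substitutions and indels; the function names and the choice k=2 are assumptions. For any positive semi-definite kernel K, the quantity K(x,x) + K(y,y) - 2K(x,y) is a squared distance in the kernel's feature space.

```python
from collections import Counter
from math import sqrt

def kmer_counts(seq, k):
    """Count every overlapping k-mer in the sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def spectrum_kernel(x, y, k=2):
    """Inner product of k-mer count vectors (a positive semi-definite kernel)."""
    cx, cy = kmer_counts(x, k), kmer_counts(y, k)
    return sum(cx[m] * cy[m] for m in cx if m in cy)

def kernel_distance(x, y, k=2):
    """Distance induced by the kernel: sqrt(K(x,x) + K(y,y) - 2 K(x,y))."""
    sq = spectrum_kernel(x, x, k) + spectrum_kernel(y, y, k) - 2 * spectrum_kernel(x, y, k)
    return sqrt(max(sq, 0))  # guard against tiny negative rounding artifacts

print(kernel_distance("ACGTACGT", "ACGTTCGT"))
```

A matrix of such pairwise distances could be fed to a distance-based tree method; unlike the spectrum kernel used here, the kernel in the paper is designed to reflect known substitution and indel models.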
Detecting and Analyzing Genetic Recombination Using RDP4.

PubMed

Martin, Darren P; Murrell, Ben; Khoosal, Arjun; Muhire, Brejnev

2017-01-01

Recombination between nucleotide sequences is a major process influencing the evolution of most species on Earth. The evolutionary value of recombination has been widely debated and so too has its influence on evolutionary analysis methods that assume nucleotide sequences replicate without recombining. When nucleic acids recombine, the evolution of the daughter or recombinant molecule cannot be accurately described by a single phylogeny. This simple fact can seriously undermine the accuracy of any phylogenetics-based analytical approach which assumes that the evolutionary history of a set of recombining sequences can be adequately described by a single phylogenetic tree. There are presently a large number of available methods and associated computer programs for analyzing and characterizing recombination in various classes of nucleotide sequence datasets. Here we examine the use of some of these methods to derive and test recombination hypotheses using multiple sequence alignments.
ATLASGAL-selected massive clumps in the inner Galaxy. V. Temperature structure and evolution

NASA Astrophysics Data System (ADS)

Giannetti, A.; Leurini, S.; Wyrowski, F.; Urquhart, J.; Csengeri, T.; Menten, K. M.; König, C.; Güsten, R.

2017-07-01

Context. Observational identification of a solid evolutionary sequence for high-mass star-forming regions is still missing. Spectroscopic observations give the opportunity to test possible schemes and connect the phases identified to physical processes. Aims: We aim to use the progressive heating of the gas caused by the feedback of high-mass young stellar objects to prove the statistical validity of the most common schemes used to observationally define an evolutionary sequence for high-mass clumps, and characterise the sensitivity of different tracers to this process. Methods: From the spectroscopic follow-ups carried out towards submillimeter continuum (dust) emission-selected massive clumps (the ATLASGAL TOP100 sample) with the IRAM 30 m, Mopra, and APEX telescopes between 84 GHz and 365 GHz, we selected several multiplets of CH3CN, CH3CCH, and CH3OH emission lines to derive and compare the physical properties of the gas in the clumps along the evolutionary sequence, fitting simultaneously the large number of lines that these molecules have in the observed band. Our findings are compared with results obtained from optically thin CO isotopologues, dust, and ammonia from previous studies on the same sample. Results: The chemical properties of each species have a major role on the measured physical properties. Low temperatures are traced by ammonia, methanol, and CO (in the early phases), the warm and dense envelope can be probed with CH3CN, CH3CCH, and, in evolved sources where CO is abundant in the gas phase, via its optically thin isotopologues. CH3OH and CH3CN are also abundant in the hot cores, and we suggest that their high-excitation transitions are good tools to study the kinematics in the hot gas associated with the inner envelope surrounding the young stellar objects that these clumps are hosting. All tracers show, to different degrees according to their properties, progressive warming with evolution. 
The relation between gas temperature and the luminosity-to-mass (L/M) ratio is reproduced by a simple toy model of a spherical, internally heated clump. Conclusions: The evolutionary sequence defined for the clumps is statistically valid and we could identify the physical processes dominating in different intervals of L/M. For L/M ≲ 2 L⊙ M⊙⁻¹ a large quantity of the gas is still accumulated and compressed at the bottom of the potential well. For 2 L⊙ M⊙⁻¹ ≲ L/M ≲ 40 L⊙ M⊙⁻¹ the young stellar objects gain mass and increase in luminosity; the first hot cores hosting intermediate- or high-mass ZAMS stars appear around L/M ≈ 10 L⊙ M⊙⁻¹. Finally, for L/M ≳ 40 L⊙ M⊙⁻¹ H II regions become common, showing that dissipation of the parental clump dominates. Tables A.1 to A.8 are only available at the CDS via anonymous ftp to http://cdsarc.u-strasbg.fr (http://130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/qcat?J/A+A/603/A33
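The L/M intervals quoted in this abstract can be read as a simple lookup. The sketch below is illustrative only: the function name and the short phase labels are assumptions made here, and the abstract gives the boundaries as approximate (≲, ≳), so treating 2 and 40 as sharp cuts is a simplification.

```python
def classify_clump_phase(l_over_m: float) -> str:
    """Map a luminosity-to-mass ratio (in L_sun / M_sun) to the dominant process.

    The thresholds 2 and 40 are the approximate interval boundaries quoted in
    the abstract; the labels are shorthand chosen for this sketch.
    """
    if l_over_m <= 0:
        raise ValueError("L/M must be positive")
    if l_over_m < 2.0:
        return "accumulation and compression"
    if l_over_m <= 40.0:
        return "accretion"
    return "clump dissipation"


if __name__ == "__main__":
    for value in (0.5, 10.0, 100.0):
        print(value, classify_clump_phase(value))
```

A clump with L/M near 10 falls in the middle interval, which is where the abstract places the first hot cores.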
The Evolution of Bony Vertebrate Enhancers at Odds with Their Coding Sequence Landscape.

PubMed

Yousaf, Aisha; Sohail Raza, Muhammad; Ali Abbasi, Amir

2015-08-06

Enhancers lie at the heart of transcriptional and developmental gene regulation. Therefore, changes in enhancer sequences usually disrupt the target gene expression and result in disease phenotypes. Despite the well-established role of enhancers in development and disease, evolutionary sequence studies are lacking. The current study attempts to unravel the puzzle of bony vertebrates' conserved noncoding elements (CNE) enhancer evolution. Bayesian phylogenetics of enhancer sequences spotlights promising interordinal relationships among placental mammals, proposing a closer relationship between humans and laurasiatherians while placing rodents at the basal position. Clock-based estimates of enhancer evolution provided a dynamic picture of interspecific rate changes across the bony vertebrate lineage. Moreover, coelacanth in the study augmented our appreciation of the vertebrate cis-regulatory evolution during water-land transition. Intriguingly, we observed a pronounced upsurge in enhancer evolution in land-dwelling vertebrates. These novel findings triggered us to further investigate the evolutionary trend of coding as well as CNE nonenhancer repertoires, to highlight the relative evolutionary dynamics of diverse genomic landscapes. Surprisingly, the evolutionary rates of enhancer sequences were clearly at odds with those of the coding and the CNE nonenhancer sequences during vertebrate adaptation to land, with land vertebrates exhibiting significantly reduced rates of coding sequence evolution in comparison to their fast evolving regulatory landscape. The observed variation in tetrapod cis-regulatory elements caused the fine-tuning of associated gene regulatory networks. Therefore, the increased evolutionary rate of tetrapods' enhancer sequences might be responsible for the variation in developmental regulatory circuits during the process of vertebrate adaptation to land. © The Author(s) 2015. 
Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Evolutionary tree reconstruction

NASA Technical Reports Server (NTRS)

Cheeseman, Peter; Kanefsky, Bob

1990-01-01

This report describes how Minimum Description Length (MDL) can be applied to the problem of DNA and protein evolutionary tree reconstruction. If there is a set of mutations that transform a common ancestor into a set of the known sequences, and this description is shorter than the information needed to encode the known sequences directly, then strong evidence for an evolutionary relationship has been found. A heuristic algorithm that searches for the simplest tree (smallest MDL) is described; it finds close-to-optimal trees on the test data. Various ways of extending the MDL theory to more complex evolutionary relationships are discussed.
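The comparison at the heart of this abstract (ancestor plus mutations versus direct encoding) can be sketched in a few lines. This is a toy under stated assumptions, not the report's algorithm: it uses a single common ancestor (a star-shaped tree), equal-length DNA sequences, substitutions only, 2 bits per nucleotide, and a made-up mutation code of log2(length) bits for the position plus log2(3) bits for the new base. All function names are hypothetical.

```python
import math


def direct_cost_bits(sequences):
    """Bits needed to write every DNA sequence out directly (2 bits per base)."""
    return sum(2 * len(seq) for seq in sequences)


def ancestor_cost_bits(ancestor, sequences):
    """Bits needed to write one ancestor plus the substitutions leading to each sequence.

    Assumed code: each descendant states its mutation count (log2(n + 1) bits),
    then each substitution costs log2(n) bits for the site and log2(3) bits
    for the replacement base.
    """
    n = len(ancestor)
    per_mutation = math.log2(n) + math.log2(3)
    cost = 2.0 * n  # the ancestor itself
    for seq in sequences:
        if len(seq) != n:
            raise ValueError("this sketch handles equal-length sequences only")
        mismatches = sum(a != b for a, b in zip(ancestor, seq))
        cost += math.log2(n + 1) + mismatches * per_mutation
    return cost


if __name__ == "__main__":
    ancestor = "ACGTACGTACGTACGT"
    related = ["ACGTACGTACGTACGA", "ACGTACGTACGTACCT", "ACGAACGTACGTACGT"]
    print(direct_cost_bits(related), ancestor_cost_bits(ancestor, related))
```

When the shared-ancestor description is shorter than the direct one, the sequences are better explained as related; for sequences with no similarity the mutation list costs more than writing them out.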
3D RNA and functional interactions from evolutionary couplings

PubMed Central

Weinreb, Caleb; Riesselman, Adam; Ingraham, John B.; Gross, Torsten; Sander, Chris; Marks, Debora S.

2016-01-01

Summary Non-coding RNAs are ubiquitous, but the discovery of new RNA gene sequences far outpaces research on their structure and functional interactions. We mine the evolutionary sequence record to derive precise information about function and structure of RNAs and RNA-protein complexes. As in protein structure prediction, we use maximum entropy global probability models of sequence co-variation to infer evolutionarily constrained nucleotide-nucleotide interactions within RNA molecules, and nucleotide-amino acid interactions in RNA-protein complexes. The predicted contacts allow all-atom blinded 3D structure prediction at good accuracy for several known RNA structures and RNA-protein complexes. For unknown structures, we predict contacts in 160 non-coding RNA families. Beyond 3D structure prediction, evolutionary couplings help identify important functional interactions, e.g., at switch points in riboswitches and at a complex nucleation site in HIV. Aided by accelerating sequence accumulation, evolutionary coupling analysis can accelerate the discovery of functional interactions and 3D structures involving RNA. PMID:27087444
Tests of two convection theories for red giant and red supergiant envelopes

NASA Technical Reports Server (NTRS)

Stothers, Richard B.; Chin, Chao-Wen

1995-01-01

Two theories of stellar envelope convection are considered here in the context of red giants and red supergiants of intermediate to high mass: Boehm-Vitense's standard mixing-length theory (MLT) and Canuto & Mazzitelli's new theory incorporating the full spectrum of turbulence (FST). Both theories assume incompressible convection. Two formulations of the convective mixing length are also evaluated: l proportional to the local pressure scale height (H_P) and l proportional to the distance from the upper boundary of the convection zone (z). Applications to test both theories are made by calculating stellar evolutionary sequences into the red phase of core helium burning. Since the theoretically predicted effective temperatures for cool stars are known to be sensitive to the assigned value of the mixing length, this quantity has been individually calibrated for each evolutionary sequence. The calibration is done in a composite Hertzsprung-Russell diagram for the red giant and red supergiant members of well-observed Galactic open clusters. The MLT model requires the constant of proportionality for the convective mixing length to vary by a small but statistically significant amount with stellar mass, whereas the FST model succeeds in all cases with the mixing length simply set equal to z. The structure of the deep stellar interior, however, remains very nearly unaffected by the choices of convection theory and mixing length. Inside the convective envelope itself, a density inversion always occurs, but is somewhat smaller for the convectively more efficient MLT model. On physical grounds the FST model is preferable, and seems to alleviate the problem of finding the proper mixing length.
Targeted sequencing for high-resolution evolutionary analyses following genome duplication in salmonid fish: Proof of concept for key components of the insulin-like growth factor axis.

PubMed

Lappin, Fiona M; Shaw, Rebecca L; Macqueen, Daniel J

2016-12-01

High-throughput sequencing has revolutionised comparative and evolutionary genome biology. It has now become relatively commonplace to generate multiple genomes and/or transcriptomes to characterize the evolution of large taxonomic groups of interest. Nevertheless, such efforts may be unsuited to some research questions or remain beyond the scope of some research groups. Here we show that targeted high-throughput sequencing offers a viable alternative to study genome evolution across a vertebrate family of great scientific interest. Specifically, we exploited sequence capture and Illumina sequencing to characterize the evolution of key components from the insulin-like growth factor (IGF) signalling axis of salmonid fish at unprecedented phylogenetic resolution. The IGF axis represents a central governor of vertebrate growth and its core components were expanded by whole genome duplication in the salmonid ancestor ~95 Ma. Using RNA baits synthesised to genes encoding the complete family of IGF binding proteins (IGFBP) and an IGF hormone (IGF2), we captured, sequenced and assembled orthologous and paralogous exons from species representing all ten salmonid genera. This approach generated 299 novel sequences, most as complete or near-complete protein-coding sequences. Phylogenetic analyses confirmed congruent evolutionary histories for all nineteen recognized salmonid IGFBP family members and identified novel salmonid-specific IGF2 paralogues. Moreover, we reconstructed the evolution of duplicated IGF axis paralogues across a replete salmonid phylogeny, revealing complex historic selection regimes - both ancestral to salmonids and lineage-restricted - that frequently involved asymmetric paralogue divergence under positive and/or relaxed purifying selection. Our findings add to an emerging literature highlighting diverse applications for targeted sequencing in comparative-evolutionary genomics. 
We also set out a viable approach to obtain large sets of nuclear genes for any member of the salmonid family, which should enable insights into the evolutionary role of whole genome duplication before additional nuclear genome sequences become available. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.
Protein 3D Structure Computed from Evolutionary Sequence Variation

PubMed Central

Sheridan, Robert; Hopf, Thomas A.; Pagnani, Andrea; Zecchina, Riccardo; Sander, Chris

2011-01-01

The evolutionary trajectory of a protein through sequence space is constrained by its function. Collections of sequence homologs record the outcomes of millions of evolutionary experiments in which the protein evolves according to these constraints. Deciphering the evolutionary record held in these sequences and exploiting it for predictive and engineering purposes presents a formidable challenge. The potential benefit of solving this challenge is amplified by the advent of inexpensive high-throughput genomic sequencing. In this paper we ask whether we can infer evolutionary constraints from a set of sequence homologs of a protein. The challenge is to distinguish true co-evolution couplings from the noisy set of observed correlations. We address this challenge using a maximum entropy model of the protein sequence, constrained by the statistics of the multiple sequence alignment, to infer residue pair couplings. Surprisingly, we find that the strength of these inferred couplings is an excellent predictor of residue-residue proximity in folded structures. Indeed, the top-scoring residue couplings are sufficiently accurate and well-distributed to define the 3D protein fold with remarkable accuracy. We quantify this observation by computing, from sequence alone, all-atom 3D structures of fifteen test proteins from different fold classes, ranging in size from 50 to 260 residues, including a G-protein coupled receptor. These blinded inferences are de novo, i.e., they do not use homology modeling or sequence-similar fragments from known structures. The co-evolution signals provide sufficient information to determine accurate 3D protein structure to 2.7–4.8 Å Cα-RMSD error relative to the observed structure, over at least two-thirds of the protein (method called EVfold, details at http://EVfold.org). 
This discovery provides insight into essential interactions constraining protein evolution and will facilitate a comprehensive survey of the universe of protein structures, new strategies in protein and drug design, and the identification of functional genetic variants in normal and disease genomes. PMID:22163331
Discovery of magnetic A supergiants: the descendants of magnetic main-sequence B stars

NASA Astrophysics Data System (ADS)

Neiner, Coralie; Oksala, Mary E.; Georgy, Cyril; Przybilla, Norbert; Mathis, Stéphane; Wade, Gregg; Kondrak, Matthias; Fossati, Luca; Blazère, Aurore; Buysschaert, Bram; Grunhut, Jason

2017-10-01

In the context of the high resolution, high signal-to-noise ratio, high sensitivity, spectropolarimetric survey BritePol, which complements observations by the BRITE constellation of nanosatellites for asteroseismology, we are looking for and measuring the magnetic field of all stars brighter than V = 4. In this paper, we present circularly polarized spectra obtained with HarpsPol at ESO in La Silla (Chile) and ESPaDOnS at CFHT (Hawaii) for three hot evolved stars: ι Car, HR 3890 and ε CMa. We detected a magnetic field in all three stars. Each star has been observed several times to confirm the magnetic detections and check for variability. The stellar parameters of the three objects were determined and their evolutionary status was ascertained employing evolution models computed with the Geneva code. ε CMa was already known and is confirmed to be magnetic, but our modelling indicates that it is located near the end of the main sequence, i.e. it is still in a core hydrogen burning phase. ι Car and HR 3890 are the first discoveries of magnetic hot supergiants located well after the end of the main sequence on the Hertzsprung-Russell diagram. These stars are probably the descendants of main-sequence magnetic massive stars. Their current field strength (a few G) is compatible with magnetic flux conservation during stellar evolution. These results provide observational constraints for the development of future evolutionary models of hot stars including a fossil magnetic field.
Abundance patterns of evolved stars with Hipparcos parallaxes and ages based on the APOGEE data base

NASA Astrophysics Data System (ADS)

Jia, Y. P.; Chen, Y. Q.; Zhao, G.; Bari, M. A.; Zhao, J. K.; Tan, K. F.

2018-01-01

We investigate the abundance patterns for four groups of stars at evolutionary phases from sub-giant to red clump (RC) and trace the chemical evolution of the disc by taking 21 individual elemental abundances from APOGEE and ages from evolutionary models with the aid of Hipparcos distances. We find that the abundances of six elements (Si, S, K, Ca, Mn and Ni) are similar from the sub-giant phase to the RC phase. In particular, we find that a group of stars with low [C/N] ratios, mainly from the second sequence of RC stars, show that there is a difference in the transfer efficiency of the C-N-O cycle between the main and the secondary RC sequences. We also compare the abundance patterns of C-N, Mg-Al and Na-O with giant stars in globular clusters from APOGEE and find that field stars follow patterns similar to those of M107, a metal-rich globular cluster with [M/H] ∼ -1.0, which shows that the self-enrichment mechanism represented by strong C-N, Mg-Al and Na-O anti-correlations may not be important as the metallicity reaches [M/H] > -1.0 dex. Based on the abundances of the above-mentioned six elements and [Fe/H], we investigate age versus abundance relations and find some old super-metal-rich stars in our sample. Their properties of old age and being rich in metal are evidence for stellar migration. The age versus metallicity relations in low-[α/M] bins show unexpectedly positive slopes. We propose that the fresh metal-poor gas infalling on to the Galactic disc may be the precursor for this unexpected finding.
MEGA5: Molecular Evolutionary Genetics Analysis Using Maximum Likelihood, Evolutionary Distance, and Maximum Parsimony Methods

PubMed Central

Tamura, Koichiro; Peterson, Daniel; Peterson, Nicholas; Stecher, Glen; Nei, Masatoshi; Kumar, Sudhir

2011-01-01

Comparative analysis of molecular sequence data is essential for reconstructing the evolutionary histories of species and inferring the nature and extent of selective forces shaping the evolution of genes and species. Here, we announce the release of Molecular Evolutionary Genetics Analysis version 5 (MEGA5), which is a user-friendly software for mining online databases, building sequence alignments and phylogenetic trees, and using methods of evolutionary bioinformatics in basic biology, biomedicine, and evolution. The newest addition in MEGA5 is a collection of maximum likelihood (ML) analyses for inferring evolutionary trees, selecting best-fit substitution models (nucleotide or amino acid), inferring ancestral states and sequences (along with probabilities), and estimating evolutionary rates site-by-site. In computer simulation analyses, ML tree inference algorithms in MEGA5 compared favorably with other software packages in terms of computational efficiency and the accuracy of the estimates of phylogenetic trees, substitution parameters, and rate variation among sites. The MEGA user interface has now been enhanced to be activity driven to make it easier for the use of both beginners and experienced scientists. This version of MEGA is intended for the Windows platform, and it has been configured for effective use on Mac OS X and Linux desktops. It is available free of charge from http://www.megasoftware.net. PMID:21546353
Rapid Multi-Locus Sequence Typing Using Microfluidic Biochips

DTIC Science & Technology

2010-05-12

Sequence Types. The evolutionary history of all the B. cereus MLST concatenated Sequence Types (545 taxa, 2,394 nucleotide positions) was inferred using...the Neighbor-Joining method [28]. The bootstrap consensus tree inferred from 100 replicates was taken to represent the evolutionary history of the... Chlamydia (manuscript in preparation) and performed pilot studies on Staphylococcus aureus and Streptococcus pneumoniae (Data S4 and Text S2). Another potential
SCARF: maximizing next-generation EST assemblies for evolutionary and population genomic analyses.

PubMed

Barker, Michael S; Dlugosch, Katrina M; Reddy, A Chaitanya C; Amyotte, Sarah N; Rieseberg, Loren H

2009-02-15

Scaffolded and Corrected Assembly of Roche 454 (SCARF) is a next-generation sequence assembly tool for evolutionary genomics that is designed especially for assembling 454 EST sequences against high-quality reference sequences from related species. The program was created to knit together 454 contigs that do not assemble during traditional de novo assembly, using a reference sequence library to orient the 454 sequences. SCARF is freely available at http://msbarker.com/software.htm, and is released under the open source GPLv3 license (http://www.opensource.org/licenses/gpl-3.0.html).
A Large Stellar Evolution Database for Population Synthesis Studies. I. Scaled Solar Models and Isochrones

NASA Astrophysics Data System (ADS)

Pietrinferni, Adriano; Cassisi, Santi; Salaris, Maurizio; Castelli, Fiorella

2004-09-01

We present a large and updated stellar evolution database for low-, intermediate-, and high-mass stars in a wide metallicity range, suitable for studying Galactic and extragalactic simple and composite stellar populations using population synthesis techniques. The stellar mass range is between ~0.5 and 10 M⊙ with a fine mass spacing. The metallicity [Fe/H] comprises 10 values ranging from -2.27 to 0.40, with a scaled solar metal distribution. The initial He mass fraction ranges from Y=0.245, for the more metal-poor composition, up to 0.303 for the more metal-rich one, with ΔY/ΔZ~1.4. For each adopted chemical composition, the evolutionary models have been computed without (canonical models) and with overshooting from the Schwarzschild boundary of the convective cores during the central H-burning phase. Semiconvection is included in the treatment of core convection during the He-burning phase. The whole set of evolutionary models can be used to compute isochrones in a wide age range, from ~30 Myr to ~15 Gyr. Both evolutionary models and isochrones are available in several observational planes, employing an updated set of bolometric corrections and color-Teff relations computed for this project. The number of points along the models and the resulting isochrones is selected in such a way that interpolation for intermediate metallicities not contained in the grid is straightforward; a simple quadratic interpolation produces results of sufficient accuracy for population synthesis applications. We compare our isochrones with results from a series of widely used stellar evolution databases and perform some empirical tests for the reliability of our models. 
Since this work is devoted to scaled solar chemical compositions, we focus our attention on the Galactic disk stellar populations, employing multicolor photometry of unevolved field main-sequence stars with precise Hipparcos parallaxes, well-studied open clusters, and one eclipsing binary system with precise measurements of masses, radii, and [Fe/H] of both components. We find that the predicted metallicity dependence of the location of the lower, unevolved main sequence in the color magnitude diagram (CMD) appears in satisfactory agreement with empirical data. When comparing our models with CMDs of selected, well-studied, open clusters, once again we were able to properly match the whole observed evolutionary sequences by assuming cluster distance and reddening estimates in satisfactory agreement with empirical evaluations of these quantities. In general, models including overshooting during the H-burning phase provide a better match to the observations, at least for ages below ~4 Gyr. At [Fe/H] around solar and higher ages (i.e., smaller convective cores) before the onset of radiative cores, the selected efficiency of core overshooting may be too high in our model, as well as in various other models in the literature. Since we also provide canonical models, the reader is strongly encouraged to always compare the results from both sets in this critical age range.
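The abstract notes that a simple quadratic interpolation between grid metallicities is accurate enough for population synthesis. A minimal sketch of what that could look like is given below, using a three-point Lagrange polynomial in [Fe/H]; the function name, the choice of Lagrange form, and the sample values in the usage lines are assumptions for illustration and are not taken from the database.

```python
def quadratic_interpolate(x, xs, ys):
    """Evaluate at x the parabola through three grid points (xs[i], ys[i]).

    xs could be three neighbouring [Fe/H] grid values and ys the same
    isochrone quantity (for example a magnitude at a fixed point along the
    isochrone) at each of them.
    """
    x0, x1, x2 = xs
    y0, y1, y2 = ys
    if len({x0, x1, x2}) < 3:
        raise ValueError("the three grid abscissae must be distinct")
    # Lagrange basis polynomials
    l0 = (x - x1) * (x - x2) / ((x0 - x1) * (x0 - x2))
    l1 = (x - x0) * (x - x2) / ((x1 - x0) * (x1 - x2))
    l2 = (x - x0) * (x - x1) / ((x2 - x0) * (x2 - x1))
    return y0 * l0 + y1 * l1 + y2 * l2


if __name__ == "__main__":
    # Hypothetical grid metallicities and magnitudes, for illustration only.
    feh_grid = (-0.5, 0.0, 0.4)
    magnitude = (4.10, 4.35, 4.60)
    print(quadratic_interpolate(0.2, feh_grid, magnitude))
```

Interpolating point by point like this relies on the property stated in the abstract, namely that the models and isochrones share a common number of points so that corresponding points can be matched across metallicities.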
DOE Office of Scientific and Technical Information (OSTI.GOV)

Ramirez, Ramses M.; Kaltenegger, Lisa

We calculate the pre-main-sequence habitable zone (HZ) for stars of spectral classes F-M. The spatial distribution of liquid water and its change during the pre-main-sequence phase of protoplanetary systems is important for understanding how planets become habitable. Such worlds are interesting targets for future missions because the coolest stars could provide habitable conditions for up to 2.5 billion years post-accretion. Moreover, for a given star type, planetary systems are more easily resolved because of higher pre-main-sequence stellar luminosities, resulting in larger planet-star separation for cool stars than is the case for the traditional main-sequence (MS) HZ. We use one-dimensional radiative-convective climate and stellar evolutionary models to calculate pre-main-sequence HZ distances for F1-M8 stellar types. We also show that accreting planets that are later located in the traditional MS HZ orbiting stars cooler than a K5 (including the full range of M stars) receive stellar fluxes that exceed the runaway greenhouse threshold, and thus may lose substantial amounts of water initially delivered to them. We predict that M-star planets need to initially accrete more water than Earth did, or, alternatively, have additional water delivered later during the long pre-MS phase to remain habitable. Our findings are also consistent with recent claims that Venus lost its water during accretion.
Phylodynamic reconstruction of O CATHAY topotype foot-and-mouth disease virus epidemics in the Philippines.

PubMed

Di Nardo, Antonello; Knowles, Nick J; Wadsworth, Jemma; Haydon, Daniel T; King, Donald P

2014-08-24

Reconstructing the evolutionary history, demographic signal and dispersal processes from viral genome sequences contributes to our understanding of the epidemiological dynamics underlying epizootic events. In this study, a Bayesian phylogenetic framework was used to explore the phylodynamics and spatio-temporal dispersion of the O CATHAY topotype of foot-and-mouth disease virus (FMDV) that caused epidemics in the Philippines between 1994 and 2005. Sequences of the FMDV genome encoding the VP1 showed that the O CATHAY FMD epizootic in the Philippines resulted from a single introduction and was characterised by three main transmission hubs in Rizal, Bulacan and Manila Provinces. From a wider regional perspective, phylogenetic reconstruction of all available O CATHAY VP1 nucleotide sequences identified three distinct sub-lineages associated with country-based clusters originating in Hong Kong Special Administrative Region (SAR), the Philippines and Taiwan. The root of this phylogenetic tree was located in Hong Kong SAR, representing the most likely source for the introduction of this lineage into the Philippines and Taiwan. The reconstructed O CATHAY phylodynamics revealed three chronologically distinct evolutionary phases, culminating in a reduction in viral diversity over the final 10 years. The analysis suggests that viruses from the O CATHAY topotype have been continually maintained within swine industries close to Hong Kong SAR, following the extinction of virus lineages from the Philippines and the reduced number of FMD cases in Taiwan.
Protein interface classification by evolutionary analysis

PubMed Central

2012-01-01

Background Distinguishing biologically relevant interfaces from lattice contacts in protein crystals is a fundamental problem in structural biology. Despite efforts towards the computational prediction of interface character, many issues are still unresolved. Results We present here a protein-protein interface classifier that relies on evolutionary data to detect the biological character of interfaces. The classifier uses a simple geometric measure, number of core residues, and two evolutionary indicators based on the sequence entropy of homolog sequences. Both aim at detecting differential selection pressure between interface core and rim or rest of surface. The core residues, defined as fully buried residues (>95% burial), appear to be fundamental determinants of biological interfaces: their number is in itself a powerful discriminator of interface character and together with the evolutionary measures it is able to clearly distinguish evolved biological contacts from crystal ones. We demonstrate that this definition of core residues leads to distinctively better results than earlier definitions from the literature. The stringent selection and quality filtering of structural and sequence data was key to the success of the method. Most importantly we demonstrate that a more conservative selection of homolog sequences - with relatively high sequence identities to the query - is able to produce a clearer signal than previous attempts. Conclusions An evolutionary approach like the one presented here is key to the advancement of the field, which so far was missing an effective method exploiting the evolutionary character of protein interfaces. Its coverage and performance will only improve over time thanks to the incessant growth of sequence databases. Currently our method reaches an accuracy of 89% in classifying interfaces of the Ponstingl 2003 datasets and it lends itself to a variety of useful applications in structural biology and bioinformatics. 
We made the corresponding software implementation available to the community as an easy-to-use graphical web interface at http://www.eppic-web.org. PMID:23259833
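The evolutionary indicators in this abstract rest on the sequence entropy of homolog alignments, compared between interface core and rim. The sketch below shows one way such a comparison could be computed; it is not the classifier's implementation. The function names, the use of Shannon entropy in bits, and the core-to-rim ratio as the summary statistic are all assumptions made for illustration.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (bits) of the residues observed in one alignment column."""
    total = len(column)
    counts = Counter(column)
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


def core_to_rim_entropy_ratio(alignment, core_positions, rim_positions):
    """Mean column entropy over core positions divided by that over rim positions.

    A ratio well below 1 would mean the core is more conserved than the rim,
    which is the kind of differential selection pressure the abstract refers to.
    """
    def mean_entropy(positions):
        values = [column_entropy([seq[i] for seq in alignment]) for i in positions]
        return sum(values) / len(values)

    rim = mean_entropy(rim_positions)
    if rim == 0:
        raise ValueError("rim positions show no variation; ratio is undefined")
    return mean_entropy(core_positions) / rim


if __name__ == "__main__":
    # Toy alignment of four homologs; columns 0-1 play the core, 2-3 the rim.
    homologs = ["ACDK", "ACEK", "ACDR", "ACER"]
    print(core_to_rim_entropy_ratio(homologs, [0, 1], [2, 3]))
```

In the toy alignment the core columns are invariant and the rim columns vary, so the ratio is zero; real alignments would give intermediate values that need a calibrated threshold.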
Aromatic residues engineered into the beta-turn nucleation site of ubiquitin lead to a complex folding landscape, non-native side-chain interactions, and kinetic traps.

PubMed

Rea, Anita M; Simpson, Emma R; Meldrum, Jill K; Williams, Huw E L; Searle, Mark S

2008-12-02

The fast folding of small proteins is likely to be the product of evolutionary pressures that balance the search for native-like contacts in the transition state with the minimum number of stable non-native interactions that could lead to partially folded states prone to aggregation and amyloid formation. We have investigated the effects of non-native interactions on the folding landscape of yeast ubiquitin by introducing aromatic substitutions into the beta-turn region of the N-terminal beta-hairpin, using both the native G-bulged type I turn sequence (TXTGK) as well as an engineered 2:2 XNGK type I' turn sequence. The N-terminal beta-hairpin is a recognized folding nucleation site in ubiquitin. The folding kinetics for wt-Ub (TLTGK) and the type I' turn mutant (TNGK) reveal only a weakly populated intermediate; however, substitution with X = Phe or Trp in either context results in a high propensity to form a stable compact intermediate where the initial U-->I collapse is visible as a distinct kinetic phase. The introduction of Trp into either of the two host turn sequences results in either complex multiphase kinetics with the possibility of parallel folding pathways, or formation of a highly compact I-state stabilized by non-native interactions that must unfold before refolding. Sequence substitutions with aromatic residues within a localized beta-turn capable of forming non-native hydrophobic contacts in both the native state and partially folded states have the undesirable consequence that folding is frustrated by the formation of stable compact intermediates that evolutionary pressures at the sequence level may have largely eliminated.

Mitochondrial genome sequencing helps show the evolutionary mechanism of mitochondrial genome formation in Brassica

PubMed Central

2011-01-01

Background Angiosperm mitochondrial genomes are more complex than those of other organisms. Analyses of the mitochondrial genome sequences of at least 11 angiosperm species have shown several common properties; these cannot easily explain, however, how the diverse mitotypes evolved within each genus or species. We analyzed the evolutionary relationships of Brassica mitotypes by sequencing. Results We sequenced the mitotypes of cam (Brassica rapa), ole (B. oleracea), jun (B. juncea), and car (B. carinata) and analyzed them together with two previously sequenced mitotypes of B. napus (pol and nap). The sizes of whole single circular genomes of cam, jun, ole, and car are 219,747 bp, 219,766 bp, 360,271 bp, and 232,241 bp, respectively. The mitochondrial genome of ole is the largest as a result of the duplication of a 141.8 kb segment. The jun mitotype is the result of an inherited cam mitotype, and pol is also derived from the cam mitotype with evolutionary modifications. Genes with known functions are conserved in all mitotypes, but clear variation in open reading frames (ORFs) with unknown functions among the six mitotypes was observed. Sequence relationship analysis showed that there has been genome compaction and inheritance in the course of Brassica mitotype evolution. Conclusions We have sequenced four Brassica mitotypes, compared six Brassica mitotypes and suggested a mechanism for mitochondrial genome formation in Brassica, including evolutionary events such as inheritance, duplication, rearrangement, genome compaction, and mutation. PMID:21988783
Beyond Reasonable Doubt: Evolution from DNA Sequences

PubMed Central

Penny, David

2013-01-01

We demonstrate quantitatively that, as predicted by evolutionary theory, sequences of homologous proteins from different species converge as we go further and further back in time. The converse, a non-evolutionary model, can be expressed as probabilities, and the test works for chloroplast, nuclear and mitochondrial sequences, as well as for sequences that diverged at different time depths. Even on our conservative test, the probability that chance could produce the observed levels of ancestral convergence for just one of the eight datasets of 51 proteins is ≈1×10^−19 and combined over 8 datasets is ≈1×10^−132. By comparison, there are about 10^80 protons in the universe, hence the probability that the sequences could have been produced by a process involving unrelated ancestral sequences is about 10^50 times lower than picking, among all protons, the same proton at random twice in a row. A non-evolutionary control model shows no convergence, and only a small number of parameters are required to account for the observations. It is time that researchers insisted that doubters put up testable alternatives to evolution. PMID:23950906
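The proton comparison in this abstract is easiest to follow on a log scale. The sketch below works with base-10 exponents only; it assumes that "picking the same proton twice in a row" means the second uniform pick matches the first, which has probability 1/N for N protons. The function names are hypothetical.

```python
# Base-10 exponents of the figures quoted in the abstract.
LOG10_P_COMBINED = -132   # combined probability over the 8 datasets
LOG10_PROTONS = 80        # approximate number of protons in the universe


def log10_same_proton_twice(log10_n):
    """log10 of P(second uniform pick equals the first) = 1/N (assumed reading)."""
    return -log10_n


def orders_of_magnitude_lower(log10_p, log10_reference):
    """How many powers of ten p lies below the reference probability."""
    return log10_reference - log10_p


gap = orders_of_magnitude_lower(
    LOG10_P_COMBINED, log10_same_proton_twice(LOG10_PROTONS)
)
print(gap)
```

Under that reading the combined probability is 52 orders of magnitude below the proton comparison, in line with the abstract's "about 10^50".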
High pressure behaviour of uranium dicarbide (UC{sub 2}): Ab-initio study

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sahoo, B. D., E-mail: bdsahoo@barc.gov.in; Mukherjee, D.; Joshi, K. D.

2016-08-28

The structural stability of uranium dicarbide has been examined under hydrostatic compression employing the evolutionary structure search algorithm implemented in the Universal Structure Predictor: Evolutionary Xtallography (USPEX) code in conjunction with an ab-initio electronic band structure calculation method. The ab-initio total energy calculations involved for this purpose have been carried out within both the generalized gradient approximation (GGA) and the GGA + U approximation. Our calculations under the GGA predict the high pressure structural sequence of tetragonal → monoclinic → orthorhombic for this material with transition pressures of ∼8 GPa and 42 GPa, respectively. The same transition sequence is also predicted by calculations within GGA + U, with transition pressures placed at ∼24 GPa and ∼50 GPa, respectively. Further, on the basis of comparison of zero pressure equilibrium volume and equation of state with available experimental data, we find that the GGA + U approximation with U = 2.5 eV describes this material better than the simple GGA approximation. The theoretically predicted high pressure structural phase transitions are in disagreement with the only high-pressure experimental study by Dancausse et al. [J. Alloys. Compd. 191, 309 (1993)] on this compound, which reports a tetragonal to hexagonal phase transition at a pressure of ∼17.6 GPa. Interestingly, during the lowest enthalpy structure search using USPEX, we do not see any hexagonal phase to be closer to the predicted monoclinic phase even within 0.2 eV per formula unit. More experiments with varying carbon contents in the UC{sub 2} sample are required to resolve this discrepancy. The existence of these high pressure phases predicted by static lattice calculations has been further substantiated by analyzing the elastic and lattice dynamic stability of these structures in the pressure regimes of their structural stability. Additionally, various thermo-physical quantities such as equilibrium volume, bulk modulus, Debye temperature, thermal expansion coefficient, Gruneisen parameter, and heat capacity at ambient conditions have been determined from these calculations and compared with the available experimental data.
Determination of evolutionary relationships of outbreak-associated Listeria monocytogenes strains of serotypes 1/2a and 1/2b by whole-genome sequencing

USDA-ARS's Scientific Manuscript database

We used whole-genome sequencing to determine evolutionary relationships among 20 outbreak-associated clinical isolates of Listeria monocytogenes serotypes 1/2a and 1/2b. Isolates from 6 of 11 outbreaks fell outside the clonal groups or “epidemic clones” that have been previously associated with outb...
A Case-by-Case Evolutionary Analysis of Four Imprinted Retrogenes

PubMed Central

McCole, Ruth B; Loughran, Noeleen B; Chahal, Mandeep; Fernandes, Luis P; Roberts, Roland G; Fraternali, Franca; O'Connell, Mary J; Oakey, Rebecca J

2011-01-01

Retroposition is a widespread phenomenon resulting in the generation of new genes that are initially related to a parent gene via very high coding sequence similarity. We examine the evolutionary fate of four retrogenes generated by such an event: mouse Inpp5f_v2, Mcts2, Nap1l5, and U2af1-rs1. These genes are all subject to the epigenetic phenomenon of parental imprinting. We first provide new data on the age of these retrogene insertions. Using codon-based models of sequence evolution, we show these retrogenes have diverse evolutionary trajectories, including divergence from the parent coding sequence under positive selection pressure, purifying selection pressure maintaining parent-retrogene similarity, and neutral evolution. Examination of the expression pattern of retrogenes shows an atypical, broad pattern across multiple tissues. Protein 3D structure modeling reveals that a positively selected residue in U2af1-rs1, not shared by its parent, may influence protein conformation. Our case-by-case analysis of the evolution of four imprinted retrogenes reveals that this interesting class of imprinted genes, while similar in regulation and sequence characteristics, follows very varied evolutionary paths. PMID:21166792
ECOD: An Evolutionary Classification of Protein Domains

PubMed Central

Kinch, Lisa N.; Pei, Jimin; Shi, Shuoyong; Kim, Bong-Hyun; Grishin, Nick V.

2014-01-01

Understanding the evolution of a protein, including both close and distant relationships, often reveals insight into its structure and function. Fast and easy access to such up-to-date information facilitates research. We have developed a hierarchical evolutionary classification of all proteins with experimentally determined spatial structures, and presented it as an interactive and updatable online database. ECOD (Evolutionary Classification of protein Domains) is distinct from other structural classifications in that it groups domains primarily by evolutionary relationships (homology), rather than topology (or “fold”). This distinction highlights cases of homology between domains of differing topology to aid in understanding of protein structure evolution. ECOD uniquely emphasizes distantly related homologs that are difficult to detect, and thus catalogs the largest number of evolutionary links among structural domain classifications. Placing distant homologs together underscores the ancestral similarities of these proteins and draws attention to the most important regions of sequence and structure, as well as conserved functional sites. ECOD also recognizes closer sequence-based relationships between protein domains. Currently, approximately 100,000 protein structures are classified in ECOD into 9,000 sequence families clustered into close to 2,000 evolutionary groups. The classification is assisted by an automated pipeline that quickly and consistently classifies weekly releases of PDB structures and allows for continual updates. This synchronization with PDB uniquely distinguishes ECOD among all protein classifications. Finally, we present several case studies of homologous proteins not recorded in other classifications, illustrating the potential of how ECOD can be used to further biological and evolutionary studies. PMID:25474468
ECOD: an evolutionary classification of protein domains.

PubMed

Cheng, Hua; Schaeffer, R Dustin; Liao, Yuxing; Kinch, Lisa N; Pei, Jimin; Shi, Shuoyong; Kim, Bong-Hyun; Grishin, Nick V

2014-12-01

Understanding the evolution of a protein, including both close and distant relationships, often reveals insight into its structure and function. Fast and easy access to such up-to-date information facilitates research. We have developed a hierarchical evolutionary classification of all proteins with experimentally determined spatial structures, and presented it as an interactive and updatable online database. ECOD (Evolutionary Classification of protein Domains) is distinct from other structural classifications in that it groups domains primarily by evolutionary relationships (homology), rather than topology (or "fold"). This distinction highlights cases of homology between domains of differing topology to aid in understanding of protein structure evolution. ECOD uniquely emphasizes distantly related homologs that are difficult to detect, and thus catalogs the largest number of evolutionary links among structural domain classifications. Placing distant homologs together underscores the ancestral similarities of these proteins and draws attention to the most important regions of sequence and structure, as well as conserved functional sites. ECOD also recognizes closer sequence-based relationships between protein domains. Currently, approximately 100,000 protein structures are classified in ECOD into 9,000 sequence families clustered into close to 2,000 evolutionary groups. The classification is assisted by an automated pipeline that quickly and consistently classifies weekly releases of PDB structures and allows for continual updates. This synchronization with PDB uniquely distinguishes ECOD among all protein classifications. Finally, we present several case studies of homologous proteins not recorded in other classifications, illustrating the potential of how ECOD can be used to further biological and evolutionary studies.
Using single cell sequencing data to model the evolutionary history of a tumor.

PubMed

Kim, Kyung In; Simon, Richard

2014-01-24

The introduction of next-generation sequencing (NGS) technology has made it possible to detect genomic alterations within tumor cells on a large scale. However, most applications of NGS show the genetic content of mixtures of cells. Recently developed single cell sequencing technology can identify variation within a single cell. Characterization of multiple samples from a tumor using single cell sequencing can potentially provide information on the evolutionary history of that tumor. This may facilitate understanding how key mutations accumulate and evolve in lineages to form a heterogeneous tumor. We provide a computational method to infer an evolutionary mutation tree based on single cell sequencing data. Our approach differs from traditional phylogenetic tree approaches in that our mutation tree directly describes temporal order relationships among mutation sites. Our method also accommodates sequencing errors. Furthermore, we provide a method for estimating the proportion of time from the earliest mutation event of the sample to the most recent common ancestor of the sample of cells. Finally, we discuss current limitations on modeling with single cell sequencing data and possible improvements under those limitations. Inferring the temporal ordering of mutational sites using current single cell sequencing data is a challenge. Our proposed method may help elucidate relationships among key mutations and their role in tumor progression.
Rapid evolution of the env gene leader sequence in cats naturally infected with feline immunodeficiency virus

PubMed Central

Hughes, Joseph; Biek, Roman; Litster, Annette; Willett, Brian J.; Hosie, Margaret J.

2015-01-01

Analysing the evolution of feline immunodeficiency virus (FIV) at the intra-host level is important in order to address whether the diversity and composition of viral quasispecies affect disease progression. We examined the intra-host diversity and the evolutionary rates of the entire env and structural fragments of the env sequences obtained from sequential blood samples in 43 naturally infected domestic cats that displayed different clinical outcomes. We observed in the majority of cats that FIV env showed very low levels of intra-host diversity. We estimated that env evolved at a rate of 1.16×10^−3 substitutions per site per year and demonstrated that recombinant sequences evolved faster than non-recombinant sequences. It was evident that the V3–V5 fragment of FIV env displayed higher evolutionary rates in healthy cats than in those with terminal illness. Our study provided the first evidence that the leader sequence of env, rather than the V3–V5 sequence, had the highest intra-host diversity and the highest evolutionary rate of all env fragments, consistent with this region being under a strong selective pressure for genetic variation. Overall, FIV env displayed relatively low intra-host diversity and evolved slowly in naturally infected cats. The maximum evolutionary rate was observed in the leader sequence of env. Although genetic stability is not necessarily a prerequisite for clinical stability, the higher genetic stability of FIV compared with human immunodeficiency virus might explain why many naturally infected cats do not progress rapidly to AIDS. PMID:25535323
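To give a feel for the quoted rate, the sketch below converts substitutions per site per year into an expected count over a stretch of sequence. Only the rate comes from the abstract; the fragment length and sampling interval are hypothetical illustration values, and the function name is an assumption.

```python
# Rate quoted in the abstract: substitutions per site per year for FIV env.
RATE_PER_SITE_PER_YEAR = 1.16e-3


def expected_substitutions(rate, n_sites, years):
    """Expected number of substitutions across n_sites over the given time."""
    return rate * n_sites * years


# Hypothetical illustration values (not from the study).
fragment_length_sites = 1000
sampling_interval_years = 2

expected = expected_substitutions(
    RATE_PER_SITE_PER_YEAR, fragment_length_sites, sampling_interval_years
)
print(round(expected, 2))
```

For these illustrative inputs the expectation is a little over two substitutions, which matches the abstract's description of slow evolution and low intra-host diversity.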
Sequence similarities and evolutionary relationships of microbial, plant and animal alpha-amylases.

PubMed

Janecek, S

1994-09-01

Amino acid sequence comparison of 37 alpha-amylases from microbial, plant and animal sources was performed to identify their mutual sequence similarities in addition to the five already described conserved regions. These sequence regions were examined from structure/function and evolutionary perspectives. An unrooted evolutionary tree of alpha-amylases was constructed on a subset of 55 residues from the alignment of sequence similarities along with conserved regions. The most important new information extracted from the tree was as follows: (a) the close evolutionary relationship of Alteromonas haloplanctis alpha-amylase (thermolabile enzyme from an antarctic psychrotroph) with the already known group of homologous alpha-amylases from streptomycetes, Thermomonospora curvata, insects and mammals, and (b) the remarkable 40.1% identity between starch-saccharifying Bacillus subtilis alpha-amylase and the enzyme from the ruminal bacterium Butyrivibrio fibrisolvens, an alpha-amylase with an unusually large polypeptide chain (943 residues in the mature enzyme). Due to a very high degree of similarity, the whole amino acid sequences of three groups of alpha-amylases, namely (a) fungi and yeasts, (b) plants, and (c) A. haloplanctis, streptomycetes, T. curvata, insects and mammals, were aligned independently and their unrooted distance trees were calculated using these alignments. Possible rooting of the trees was also discussed. Based on the knowledge of the location of the five disulfide bonds in the structure of pig pancreatic alpha-amylase, the possible disulfide bridges were established for each of these groups of homologous alpha-amylases.
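Percent identity figures such as the 40.1% quoted above come from counting matching residues in an alignment. The sketch below shows one common definition, identical residues over aligned non-gap columns; the abstract does not say which denominator was used, so this choice is an assumption, and the two short sequences are made-up toy strings.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip gap columns
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, not real alpha-amylase sequences).
identity = percent_identity("MKT-LVA", "MRTALVA")
print(round(identity, 1))
```

Other conventions divide by the length of the shorter sequence or by the full alignment length, so reported identities are only comparable when the definition is the same.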
Mechanism for DNA transposons to generate introns on genomic scales

PubMed Central

Huff, Jason T.; Zilberman, Daniel; Roy, Scott W.

2017-01-01

Discovered four decades ago, the existence of introns was one of the most unexpected findings in molecular biology [1]. Introns are sequences interrupting genes that must be removed as part of mRNA production. Genome sequencing projects have documented that most eukaryotic genes contain at least one and frequently many introns [2,3]. Comparison of these genomes reveals a history of long evolutionary periods with little intron gain punctuated by episodes of rapid, extensive gain [2,3]. However, no detailed mechanism for such episodic intron generation has been empirically supported on a sufficient scale, despite several proposals [4–8]. Here we show how short non-autonomous DNA transposons independently generated hundreds to thousands of introns in the prasinophyte Micromonas pusilla and the pelagophyte Aureococcus anophagefferens. Each transposon carries one splice site. The other splice site is co-opted from gene sequence duplicated upon transposon insertion, allowing perfect splicing out of RNA. The distributions of sequences that can be co-opted are biased with respect to codons, and phasing of transposon-generated introns is similarly biased. These transposons insert between preexisting nucleosomes, so that multiple nearby insertions generate nucleosome-sized intervening segments. Thus, transposon insertion and sequence co-option may explain the intron phase biases [2] and prevalence of nucleosome-sized exons [9] observed in eukaryotes. Overall, the two independent examples of proliferating elements illustrate a general DNA transposon mechanism plausibly accounting for episodes of rapid, extensive intron gain during eukaryotic evolution [2,3]. PMID:27760113
Evolutionary genetics of insect innate immunity.

PubMed

Viljakainen, Lumi

2015-11-01

Patterns of evolution in immune defense genes help to understand the evolutionary dynamics between hosts and pathogens. Multiple insect genomes have been sequenced, with many of them having annotated immune genes, which paves the way for a comparative genomic analysis of insect immunity. In this review, I summarize the current state of comparative and evolutionary genomics of insect innate immune defense. The focus is on the conserved and divergent components of immunity with an emphasis on gene family evolution and evolution at the sequence level; both population genetics and molecular evolution frameworks are considered. © The Author 2015. Published by Oxford University Press.
Langley's CSI evolutionary model: Phase 2

NASA Technical Reports Server (NTRS)

Horta, Lucas G.; Reaves, Mercedes C.; Elliott, Kenny B.; Belvin, W. Keith; Teter, John E.

1995-01-01

The Phase 2 testbed is part of a sequence of laboratory models, developed at NASA Langley Research Center, to enhance our understanding of how to model, control, and design structures for space applications. A key problem with structures that must perform in space is the appearance of unwanted vibrations during operations. Instruments, designed independently by different scientists, must share the same vehicle, causing them to interact with each other. Once in space, these problems are difficult to correct and therefore, prediction via analysis, design, and experiments is very important. The Phase 2 laboratory model and its predecessors are designed to fill a gap between theory and practice and to aid in understanding important aspects in modeling, sensor and actuator technology, ground testing techniques, and control design issues. This document provides detailed information on the truss structure and its main components, control computer architecture, and structural models generated along with corresponding experimental results.
Testing Models of Stellar Structure and Evolution I. Comparison with Detached Eclipsing Binaries

NASA Astrophysics Data System (ADS)

del Burgo, C.; Allende Prieto, C.

2018-05-01

We present the results of an analysis aimed at testing the accuracy and precision of the PARSEC v1.2S library of stellar evolution models, combined with a Bayesian approach, to infer stellar parameters. We mainly employ the online DEBCat catalogue by Southworth, a compilation of detached eclipsing binary systems with published measurements of masses and radii to ~2 per cent precision. We select a sample of 318 binary components, with masses between 0.10 and 14.5 solar units, and distances between 1.3 pc and ~8 kpc for Galactic objects and ~44-68 kpc for the extragalactic ones. The Bayesian analysis applied takes as input effective temperature, radius, and [Fe/H], and their uncertainties, returning theoretical predictions for other stellar parameters. From the comparison with dynamical masses, we conclude that inferred masses are precisely derived for stars on the main-sequence and in the core-helium-burning phase, with respective uncertainties of 4 per cent and 7 per cent, on average. Subgiant and red giant masses are predicted within 14 per cent, and early asymptotic giant branch stars within 24 per cent. These results are helpful to further improve the models, in particular for advanced evolutionary stages for which our understanding is limited. We obtain distances and ages for the binary systems and compare them, whenever possible, with precise literature estimates, finding excellent agreement. We discuss evolutionary effects and the challenges associated with the inference of stellar ages from evolutionary models. We also provide useful polynomial fittings to theoretical zero-age main-sequence relations.
The evolutionary sequence of post-starburst galaxies

NASA Astrophysics Data System (ADS)

Wilkinson, C. L.; Pimbblet, K. A.; Stott, J. P.

2017-12-01

There are multiple ways in which to select post-starburst galaxies in the literature. In this work, we present a study into how two well-used selection techniques have consequences on observable post-starburst galaxy parameters, such as colour, morphology and environment, and how this affects interpretations of their role in the galaxy duty cycle. We identify a master sample of Hδ strong (EW_Hδ > 3 Å) post-starburst galaxies from the value-added catalogue in the seventh data release of the Sloan Digital Sky Survey (SDSS DR7) over a redshift range 0.01 < z < 0.1. From this sample we select two E+A subsets, both having very little [O II] emission (EW_[O II] > -2.5 Å) but one having an additional cut on EW_Hα (> -3 Å). We examine the differences in observables and AGN fractions to see what effect the Hα cut has on the properties of post-starburst galaxies and what these differing samples can tell us about the duty cycle of post-starburst galaxies. We find that Hδ strong galaxies peak in the 'blue cloud', E+As in the 'green valley' and pure E+As in the 'red sequence'. We also find that pure E+As have a more early-type morphology and a higher fraction in denser environments compared with the Hδ strong and E+A galaxies. These results suggest that there is an evolutionary sequence in the post-starburst phase from blue discy galaxies with residual star formation to passive red early-types.
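The nested selection described in this abstract can be written as a short sequence of threshold tests. The sketch below uses the three equivalent-width cuts quoted above and assumes the usual sign convention in which emission lines have negative equivalent widths, as the quoted cuts imply; the function name and label strings are hypothetical, and each galaxy is given the most restrictive label it satisfies.

```python
# Equivalent-width thresholds in angstroms, as quoted in the abstract.
HDELTA_MIN = 3.0    # H-delta strong: EW(H-delta) > 3
OII_MIN = -2.5      # E+A: EW([O II]) > -2.5 (little emission)
HALPHA_MIN = -3.0   # pure E+A: additionally EW(H-alpha) > -3


def classify_post_starburst(ew_hdelta, ew_oii, ew_halpha):
    """Return the most restrictive sample a galaxy belongs to.

    Assumes emission is negative EW. The samples are nested:
    pure E+A is a subset of E+A, which is a subset of H-delta strong.
    """
    if ew_hdelta <= HDELTA_MIN:
        return "not H-delta strong"
    if ew_oii <= OII_MIN:
        return "H-delta strong"
    if ew_halpha <= HALPHA_MIN:
        return "E+A"
    return "pure E+A"


# Hypothetical example measurements (not from the SDSS catalogue).
labels = [
    classify_post_starburst(4.0, -10.0, -20.0),
    classify_post_starburst(4.0, -1.0, -5.0),
    classify_post_starburst(4.0, -1.0, -1.0),
]
print(labels)
```

The redshift cut 0.01 < z < 0.1 would be applied before these tests; it is left out here to keep the sketch short.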
The evolution of transcriptional regulation in eukaryotes

NASA Technical Reports Server (NTRS)

Wray, Gregory A.; Hahn, Matthew W.; Abouheif, Ehab; Balhoff, James P.; Pizer, Margaret; Rockman, Matthew V.; Romano, Laura A.

2003-01-01

Gene expression is central to the genotype-phenotype relationship in all organisms, and it is an important component of the genetic basis for evolutionary change in diverse aspects of phenotype. However, the evolution of transcriptional regulation remains understudied and poorly understood. Here we review the evolutionary dynamics of promoter, or cis-regulatory, sequences and the evolutionary mechanisms that shape them. Existing evidence indicates that populations harbor extensive genetic variation in promoter sequences, that a substantial fraction of this variation has consequences for both biochemical and organismal phenotype, and that some of this functional variation is sorted by selection. As with protein-coding sequences, rates and patterns of promoter sequence evolution differ considerably among loci and among clades for reasons that are not well understood. Studying the evolution of transcriptional regulation poses empirical and conceptual challenges beyond those typically encountered in analyses of coding sequence evolution: promoter organization is much less regular than that of coding sequences, and sequences required for the transcription of each locus reside at multiple other loci in the genome. Because of the strong context-dependence of transcriptional regulation, sequence inspection alone provides limited information about promoter function. Understanding the functional consequences of sequence differences among promoters generally requires biochemical and in vivo functional assays. Despite these challenges, important insights have already been gained into the evolution of transcriptional regulation, and the pace of discovery is accelerating.
Evolutionary profiles from the QR factorization of multiple sequence alignments

PubMed Central

Sethi, Anurag; O'Donoghue, Patrick; Luthey-Schulten, Zaida

2005-01-01

We present an algorithm to generate complete evolutionary profiles that represent the topology of the molecular phylogenetic tree of the homologous group. The method, based on the multidimensional QR factorization of numerically encoded multiple sequence alignments, removes redundancy from the alignments and orders the protein sequences by increasing linear dependence, resulting in the identification of a minimal basis set of sequences that spans the evolutionary space of the homologous group of proteins. We observe a general trend that these smaller, more evolutionarily balanced profiles have comparable and, in many cases, better performance in database searches than conventional profiles containing hundreds of sequences, constructed in an iterative and computationally intensive procedure. For more diverse families or superfamilies, with sequence identity <30%, structural alignments, based purely on the geometry of the protein structures, provide better alignments than pure sequence-based methods. Merging the structure and sequence information allows the construction of accurate profiles for distantly related groups. These structure-based profiles outperformed other sequence-based methods for finding distant homologs and were used to identify a putative class II cysteinyl-tRNA synthetase (CysRS) in several archaea that eluded previous annotation studies. Phylogenetic analysis showed the putative class II CysRSs to be a monophyletic group and homology modeling revealed a constellation of active site residues similar to that in the known class I CysRS. PMID:15741270
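The idea of ordering sequences by increasing linear dependence can be illustrated with a column-pivoted Gram-Schmidt pass, which yields the same ordering as a QR factorization with column pivoting. This is a simplified sketch and not the authors' multidimensional QR: the one-hot encoding, the function names, and the toy aligned sequences are all assumptions made for illustration.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def encode(aligned_seq):
    """One-hot encode an aligned sequence; gaps or unknowns become all-zero blocks."""
    vector = []
    for residue in aligned_seq:
        vector.extend(1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS)
    return vector


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def pivoted_order(aligned_seqs, tol=1e-9):
    """Order sequence indices by column-pivoted Gram-Schmidt.

    At each step the sequence with the largest residual norm is chosen,
    then its direction is projected out of the remaining sequences, so
    redundant (linearly dependent) sequences end up last.
    """
    residual = [encode(seq) for seq in aligned_seqs]
    remaining = list(range(len(aligned_seqs)))
    order = []
    while remaining:
        best = max(remaining, key=lambda i: dot(residual[i], residual[i]))
        norm_sq = dot(residual[best], residual[best])
        order.append(best)
        remaining.remove(best)
        if norm_sq <= tol:
            continue  # nothing left to project out
        q = [x / norm_sq ** 0.5 for x in residual[best]]
        for i in remaining:
            projection = dot(residual[i], q)
            residual[i] = [x - projection * y for x, y in zip(residual[i], q)]
    return order


# Toy alignment (hypothetical): index 1 duplicates index 0, index 3 is a near copy.
toy_alignment = ["ACDE", "ACDE", "FGHI", "ACDF"]
order = pivoted_order(toy_alignment)
print(order)
```

Truncating the returned order after the first few indices gives a small, non-redundant subset in the spirit of the minimal basis set the abstract describes. With NumPy and SciPy available, `scipy.linalg.qr(..., pivoting=True)` provides a pivoted factorization directly.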
Tempo and mode of genomic mutations unveil human evolutionary history.

PubMed

Hara, Yuichiro

2015-01-01

Mutations that have occurred in human genomes provide insight into various aspects of evolutionary history such as speciation events and degrees of natural selection. Comparing genome sequences between humans and great apes or among humans is a feasible approach for inferring human evolutionary history. Recent advances in high-throughput or so-called 'next-generation' DNA sequencing technologies have enabled the sequencing of thousands of individual human genomes, as well as a variety of reference genomes of hominids, many of which are publicly available. These sequence data can help to unveil the detailed demographic history of the lineage leading to humans as well as the explosion of modern human population size in the last several thousand years. In addition, high-throughput sequencing illustrates the tempo and mode of de novo mutations, which are producing human genetic variation at this moment. Pedigree-based human genome sequencing has shown that mutation rates vary significantly across the human genome. These studies have also provided an improved timescale of human evolution, because the mutation rate estimated from pedigree analysis is half that estimated from traditional analyses based on molecular phylogeny. Because of the dramatic reduction in sequencing cost, sequencing on-demand samples designed for specific studies is now also becoming popular. To produce data of sufficient quality to meet the requirements of the study, it is necessary to set an explicit sequencing plan that includes the choice of sample collection methods, sequencing platforms, and number of sequence reads.
Interchromosomal Duplications on the Bactrocera oleae Y Chromosome Imply a Distinct Evolutionary Origin of the Sex Chromosomes Compared to Drosophila

PubMed Central

Gabrieli, Paolo; Gomulski, Ludvik M.; Bonomi, Angelica; Siciliano, Paolo; Scolari, Francesca; Franz, Gerald; Jessup, Andrew; Malacrida, Anna R.; Gasperi, Giuliano

2011-01-01

Background Diptera have an extraordinary variety of sex determination mechanisms, and Drosophila melanogaster is the paradigm for this group. However, the Drosophila sex determination pathway is only partially conserved and the family Tephritidae affords an interesting example. The tephritid Y chromosome is postulated to be necessary to determine male development. Characterization of Y sequences, apart from elucidating the nature of the male determining factor, is also important to understand the evolutionary history of sex chromosomes within the Tephritidae. We studied the Y sequences from the olive fly, Bactrocera oleae. Its Y chromosome is minute and highly heterochromatic, and displays high heteromorphism with the X chromosome. Methodology/Principal Findings A combined Representational Difference Analysis (RDA) and fluorescence in-situ hybridization (FISH) approach was used to investigate the Y chromosome to derive information on its sequence content. The Y chromosome is strewn with repetitive DNA sequences, the majority of which are also interdispersed in the pericentromeric regions of the autosomes. The Y chromosome appears to have accumulated small and large repetitive interchromosomal duplications. The large interchromosomal duplications harbour an importin-4-like gene fragment. Apart from these importin-4-like sequences, the other Y repetitive sequences are not shared with the X chromosome, suggesting molecular differentiation of these two chromosomes. Moreover, as the identified Y sequences were not detected on the Y chromosomes of closely related tephritids, we can infer divergence in the repetitive nature of their sequence contents. Conclusions/Significance The identification of Y-linked sequences may tell us much about the repetitive nature, the origin and the evolution of Y chromosomes. We hypothesize how these repetitive sequences accumulated and were maintained on the Y chromosome during its evolutionary history. 
Our data reinforce the idea that the sex chromosomes of the Tephritidae may have distinct evolutionary origins with respect to those of the Drosophilidae and other Dipteran families. PMID:21408187
Phylogenetic Quantification of Intra-tumour Heterogeneity

PubMed Central

Schwarz, Roland F.; Trinh, Anne; Sipos, Botond; Brenton, James D.; Goldman, Nick; Markowetz, Florian

2014-01-01

Intra-tumour genetic heterogeneity is the result of ongoing evolutionary change within each cancer. The expansion of genetically distinct sub-clonal populations may explain the emergence of drug resistance, and if so, would have prognostic and predictive utility. However, methods for objectively quantifying tumour heterogeneity have been missing and are particularly difficult to establish in cancers where predominant copy number variation prevents accurate phylogenetic reconstruction owing to horizontal dependencies caused by long and cascading genomic rearrangements. To address these challenges, we present MEDICC, a method for phylogenetic reconstruction and heterogeneity quantification based on a Minimum Event Distance for Intra-tumour Copy-number Comparisons. Using a transducer-based pairwise comparison function, we determine optimal phasing of major and minor alleles, as well as evolutionary distances between samples, and are able to reconstruct ancestral genomes. Rigorous simulations and an extensive clinical study show the power of our method, which outperforms state-of-the-art competitors in reconstruction accuracy, and additionally allows unbiased numerical quantification of tumour heterogeneity. Accurate quantification and evolutionary inference are essential to understand the functional consequences of tumour heterogeneity. The MEDICC algorithms are independent of the experimental techniques used and are applicable to both next-generation sequencing and array CGH data. PMID:24743184

Functionally essential, invariant glutamate near the C-terminus of strand beta 5 in various (alpha/beta)8-barrel enzymes as a possible indicator of their evolutionary relatedness.

PubMed

Janecek, S; Baláz, S

1995-08-01

Twelve different (alpha/beta)8-barrel enzymes belonging to three structurally distinct families were found to contain, near the C-terminus of their strand beta 5, a conserved invariant glutamic acid residue that plays an important functional role in each of these enzymes. The search was based on the idea that a conserved sequence region of an (alpha/beta)8-barrel enzyme should be more or less conserved also in the equivalent part of the structure of the other enzymes with this folding motif owing to their mutual evolutionary relatedness. For this purpose, the sequence region around the well conserved fifth beta-strand of alpha-amylase containing catalytic glutamate (Glu230, Aspergillus oryzae alpha-amylase numbering), was used as the sequence-structural template. The isolated sequence stretches of the 12 (alpha/beta)8-barrels are discussed from both the sequence-structural and the evolutionary point of view, the invariant glutamate residue being proposed to be a joining feature of the studied group of enzymes remaining from their ancestral (alpha/beta)8-barrel.
OncoNEM: inferring tumor evolution from single-cell sequencing data.

PubMed

Ross, Edith M; Markowetz, Florian

2016-04-15

Single-cell sequencing promises a high-resolution view of genetic heterogeneity and clonal evolution in cancer. However, methods to infer tumor evolution from single-cell sequencing data lag behind methods developed for bulk-sequencing data. Here, we present OncoNEM, a probabilistic method for inferring intra-tumor evolutionary lineage trees from somatic single nucleotide variants of single cells. OncoNEM identifies homogeneous cellular subpopulations and infers their genotypes as well as a tree describing their evolutionary relationships. In simulation studies, we assess OncoNEM's robustness and benchmark its performance against competing methods. Finally, we show its applicability in case studies of muscle-invasive bladder cancer and essential thrombocythemia.
Organization and evolution of highly repeated satellite DNA sequences in plant chromosomes.

PubMed

Sharma, S; Raina, S N

2005-01-01

A major component of the plant nuclear genome is constituted by different classes of repetitive DNA sequences. The structural, functional and evolutionary aspects of the satellite repetitive DNA families, and their organization in the chromosomes, are reviewed. The tandem satellite DNA sequences exhibit characteristic chromosomal locations, usually at subtelomeric and centromeric regions. The repetitive DNA family(ies) may be widely distributed in a taxonomic family or a genus, or may be specific for a species, genome or even a chromosome. They may acquire large-scale variations in their sequence and copy number over an evolutionary time-scale. These features have formed the basis of extensive utilization of repetitive sequences for taxonomic and phylogenetic studies. Hybrid polyploids have especially proven to be excellent models for studying the evolution of repetitive DNA sequences. Recent studies explicitly show that some repetitive DNA families localized at the telomeres and centromeres have acquired important structural and functional significance. The repetitive elements are under different evolutionary constraints as compared to the genes. Satellite DNA families are thought to arise de novo as a consequence of molecular mechanisms such as unequal crossing over, rolling circle amplification, replication slippage and mutation that constitute "molecular drive". Copyright 2005 S. Karger AG, Basel.
Huntingtin gene evolution in Chordata and its peculiar features in the ascidian Ciona genus

PubMed Central

Gissi, Carmela; Pesole, Graziano; Cattaneo, Elena; Tartari, Marzia

2006-01-01

Background: To gain insight into the evolutionary features of the huntingtin (htt) gene in Chordata, we have sequenced and characterized the full-length htt mRNA in the ascidian Ciona intestinalis, a basal chordate emerging as a new invertebrate model organism. Moreover, taking advantage of the availability of genomic and EST sequences, the htt gene structure of a number of chordate species, including the congeneric ascidian Ciona savignyi, and the vertebrates Xenopus and Gallus was reconstructed. Results: The C. intestinalis htt transcript exhibits some peculiar features, such as spliced leader trans-splicing in the 98 nt-long 5' untranslated region (UTR), an alternative splicing in the coding region, eight alternative polyadenylation sites, and no similarity of either the 5' or the 3' UTR to the homologs of the congeneric C. savignyi. The predicted protein is 2946 amino acids long, shorter than its vertebrate homologs, and lacks the polyQ and the polyP stretches found in the N-terminal regions of mammalian homologs. The exon-intron organization of the htt gene is almost identical among vertebrates, and significantly conserved between Ciona and vertebrates, allowing us to hypothesize an ancestral chordate gene consisting of at least 40 coding exons. Conclusion: During chordate diversification, events of gain/loss, sliding, phase changes, and expansion of introns occurred in both vertebrate and ascidian lineages predominantly in the 5'-half of the htt gene, where there is also evidence of lineage-specific evolutionary dynamics in vertebrates. On the contrary, the 3'-half of the gene is highly conserved in all chordates at the level of both gene structure and protein sequence. Between the two Ciona species, a fast evolutionary rate and/or an early divergence time is suggested by the absence of significant similarity between UTRs, protein divergence comparable to that observed between mammals and fishes, and different distribution of repetitive elements. PMID:17092333
Chemical characterization of the early evolutionary phases of high-mass star-forming regions

NASA Astrophysics Data System (ADS)

Gerner, Thomas

2014-10-01

The formation of high-mass stars is a very complex process, and to date no comprehensive theory about it exists. This thesis studies the early stages of high-mass star-forming regions and employs astrochemistry as a tool to probe their different physical conditions. We split the evolutionary sequence into four observationally motivated stages that are based on a classification proposed in the literature. The sequence is characterized by an increase of the temperatures and densities that strongly influences the chemistry in the different stages. We observed a sample of 59 high-mass star-forming regions that cover the whole sequence and statistically characterized the chemical compositions of the different stages. We determined average column densities of 18 different molecular species and found generally increasing abundances with stage. We fitted them for each stage with a 1D model, such that the result of the best fit to the previous stage was used as new input for the following. This is a unique approach and allowed us to infer physical properties like the temperature and density structure and yielded a typical chemical lifetime for the high-mass star-formation process of 1e5 years. The 18 analyzed molecular species also included four deuterated molecules whose chemistry is particularly sensitive to thermal history and thus is a promising tool to infer chemical ages. We found decreasing trends of the D/H ratios with evolutionary stage for 3 of the 4 molecular species and that the D/H ratio depends more on the fraction of warm and cold gas than on the total amount of gas. That indicates different chemical pathways for the different molecules and confirms the potential use of deuterated species as chemical age indicators. In addition, we mapped a low-mass star forming region in order to study the cosmic ray ionization rate, which is an important parameter in chemical models. While in chemical models it is commonly fixed, we found that it strongly varies with environment.
Sequence data - Magnitude and implications of some ambiguities.

NASA Technical Reports Server (NTRS)

Holmquist, R.; Jukes, T. H.

1972-01-01

A stochastic model is applied to the divergence of the horse-pig lineage from a common ancestor in terms of the alpha and beta chains of hemoglobin and fibrinopeptides. The results are compared with those based on the minimum mutation distance model of Fitch (1972). Buckwheat and cauliflower cytochrome c sequences are analyzed to demonstrate their ambiguities. A comparative analysis of evolutionary rates for various proteins of horses and pigs shows that errors of considerable magnitude are introduced by Glx and Asx ambiguities into evolutionary conclusions drawn from sequences of incompletely analyzed proteins.
The evolutionary sequence: origin and emergences.

PubMed

Fox, S W

1986-03-01

The evolutionary sequence is being reexamined experimentally from a "Big Bang" origin to the protocell and from the emergence of protocell and variety of species to Darwin's mental power (mind) and society (The Descent of Man). A most fundamentally revisionary consequence of experiments is an emphasis on endogenous ordering. This principle, seen vividly in ordered copolymerization of amino acids, has had new impact on the theory of Darwinian evolution and has been found to apply to the entire sequence. Herein, I will discuss some problems of dealing with teaching controversial subjects.
The evolutionary sequence: origin and emergences

NASA Technical Reports Server (NTRS)

Fox, S. W.

1986-01-01

The evolutionary sequence is being reexamined experimentally from a "Big Bang" origin to the protocell and from the emergence of protocell and variety of species to Darwin's mental power (mind) and society (The Descent of Man). A most fundamentally revisionary consequence of experiments is an emphasis on endogenous ordering. This principle, seen vividly in ordered copolymerization of amino acids, has had new impact on the theory of Darwinian evolution and has been found to apply to the entire sequence. Herein, I will discuss some problems of dealing with teaching controversial subjects.
Novel Insights on Hantavirus Evolution: The Dichotomy in Evolutionary Pressures Acting on Different Hantavirus Segments.

PubMed

Sankar, Sathish; Upadhyay, Mohita; Ramamurthy, Mageshbabu; Vadivel, Kumaran; Sagadevan, Kalaiselvan; Nandagopal, Balaji; Vivekanandan, Perumal; Sridharan, Gopalan

2015-01-01

Hantaviruses are important emerging zoonotic pathogens. The current understanding of hantavirus evolution is complicated by the lack of consensus on co-divergence of hantaviruses with their animal hosts. In addition, hantaviruses have long-term associations with their reservoir hosts. Analyzing the relative abundance of dinucleotides may shed new light on hantavirus evolution. We studied the relative abundance of dinucleotides and the evolutionary pressures shaping different hantavirus segments. A total of 118 sequences were analyzed; this includes 51 sequences of the S segment, 43 sequences of the M segment and 23 sequences of the L segment. The relative abundance of dinucleotides, effective codon number (ENC), and codon usage biases were analyzed. Standard methods were used to investigate the relative roles of mutational pressure and translational selection on the three hantavirus segments. All three segments of hantaviruses are CpG depleted. Mutational pressure is the predominant evolutionary force leading to CpG depletion among hantaviruses. Interestingly, the S segment of hantaviruses is GpU depleted and, in contrast to CpG depletion, the depletion of GpU dinucleotides from the S segment is driven by translational selection. Our findings also suggest that mutational pressure is the primary evolutionary pressure acting on the S and the M segments of hantaviruses, while translational selection plays a key role in shaping the evolution of the L segment. Our findings highlight how different evolutionary pressures may contribute disproportionately to the evolution of the three hantavirus segments. These findings provide new insights on the current understanding of hantavirus evolution. There is a dichotomy among evolutionary pressures shaping (a) the relative abundance of different dinucleotides in hantavirus genomes and (b) the evolution of the three hantavirus segments.
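The relative abundance of a dinucleotide is commonly expressed as an observed/expected odds ratio, rho_xy = f_xy / (f_x * f_y), where values well below 1 suggest depletion. The abstract does not state which formula the authors used, so the following is only a minimal sketch of that common definition; the function name `dinucleotide_odds_ratio` is hypothetical, and the input sequences in the usage note are toy strings rather than hantavirus data.

```python
from collections import Counter


def dinucleotide_odds_ratio(seq: str, dinuc: str = "CG") -> float:
    """Observed/expected ratio rho_xy = f_xy / (f_x * f_y) for one dinucleotide.

    Hypothetical helper for illustration; RNA input (U) is mapped to DNA (T).
    """
    seq = seq.upper().replace("U", "T")
    dinuc = dinuc.upper().replace("U", "T")
    n = len(seq)
    if n < 2 or len(dinuc) != 2:
        raise ValueError("need a sequence of length >= 2 and a 2-letter dinucleotide")

    base_counts = Counter(seq)
    # Count overlapping occurrences of the dinucleotide along the sequence.
    pair_count = sum(1 for i in range(n - 1) if seq[i:i + 2] == dinuc)

    f_x = base_counts[dinuc[0]] / n
    f_y = base_counts[dinuc[1]] / n
    if f_x == 0 or f_y == 0:
        return float("nan")  # ratio is undefined if either base is absent
    f_xy = pair_count / (n - 1)
    return f_xy / (f_x * f_y)
```

For example, `dinucleotide_odds_ratio(segment, "CG")` and `dinucleotide_odds_ratio(segment, "GU")` could be computed per segment and compared across the S, M and L groups.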
AGN radiative feedback in dusty quasar populations

NASA Astrophysics Data System (ADS)

Ishibashi, W.; Banerji, M.; Fabian, A. C.

2017-08-01

New populations of hyper-luminous, dust-obscured quasars have been recently discovered around the peak epoch of galaxy formation (z ~ 2-3), in addition to similar sources found at lower redshifts. Such dusty quasars are often interpreted as sources 'in transition', from dust-enshrouded starbursts to unobscured luminous quasars, along the evolutionary sequence. Here we consider the role of the active galactic nucleus (AGN) radiative feedback, driven by radiation pressure on dust, in high-luminosity, dust-obscured sources. We analyse how the radiation pressure-driven dusty shell models, with different shell mass configurations, may be applied to the different populations of dusty quasars reported in recent observations. We find that expanding shells, sweeping up matter from the surrounding environment, may account for prolonged obscuration in dusty quasars, e.g. for a central luminosity of L ~ 10^47 erg s^-1, a typical obscured phase (with extinction in the range A_V ~ 1-10 mag) may last a few ~10^6 yr. On the other hand, fixed-mass shells, coupled with high dust-to-gas ratios, may explain the extreme outflows recently discovered in red quasars at high redshifts. We discuss how the interaction between AGN radiative feedback and the ambient medium at different temporal stages in the evolutionary sequence may contribute to shape the observational appearance of dusty quasar populations.
Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project.

PubMed

Birney, Ewan; Stamatoyannopoulos, John A; Dutta, Anindya; Guigó, Roderic; Gingeras, Thomas R; Margulies, Elliott H; Weng, Zhiping; Snyder, Michael; Dermitzakis, Emmanouil T; Thurman, Robert E; Kuehn, Michael S; Taylor, Christopher M; Neph, Shane; Koch, Christoph M; Asthana, Saurabh; Malhotra, Ankit; Adzhubei, Ivan; Greenbaum, Jason A; Andrews, Robert M; Flicek, Paul; Boyle, Patrick J; Cao, Hua; Carter, Nigel P; Clelland, Gayle K; Davis, Sean; Day, Nathan; Dhami, Pawandeep; Dillon, Shane C; Dorschner, Michael O; Fiegler, Heike; Giresi, Paul G; Goldy, Jeff; Hawrylycz, Michael; Haydock, Andrew; Humbert, Richard; James, Keith D; Johnson, Brett E; Johnson, Ericka M; Frum, Tristan T; Rosenzweig, Elizabeth R; Karnani, Neerja; Lee, Kirsten; Lefebvre, Gregory C; Navas, Patrick A; Neri, Fidencio; Parker, Stephen C J; Sabo, Peter J; Sandstrom, Richard; Shafer, Anthony; Vetrie, David; Weaver, Molly; Wilcox, Sarah; Yu, Man; Collins, Francis S; Dekker, Job; Lieb, Jason D; Tullius, Thomas D; Crawford, Gregory E; Sunyaev, Shamil; Noble, William S; Dunham, Ian; Denoeud, France; Reymond, Alexandre; Kapranov, Philipp; Rozowsky, Joel; Zheng, Deyou; Castelo, Robert; Frankish, Adam; Harrow, Jennifer; Ghosh, Srinka; Sandelin, Albin; Hofacker, Ivo L; Baertsch, Robert; Keefe, Damian; Dike, Sujit; Cheng, Jill; Hirsch, Heather A; Sekinger, Edward A; Lagarde, Julien; Abril, Josep F; Shahab, Atif; Flamm, Christoph; Fried, Claudia; Hackermüller, Jörg; Hertel, Jana; Lindemeyer, Manja; Missal, Kristin; Tanzer, Andrea; Washietl, Stefan; Korbel, Jan; Emanuelsson, Olof; Pedersen, Jakob S; Holroyd, Nancy; Taylor, Ruth; Swarbreck, David; Matthews, Nicholas; Dickson, Mark C; Thomas, Daryl J; Weirauch, Matthew T; Gilbert, James; Drenkow, Jorg; Bell, Ian; Zhao, XiaoDong; Srinivasan, K G; Sung, Wing-Kin; Ooi, Hong Sain; Chiu, Kuo Ping; Foissac, Sylvain; Alioto, Tyler; Brent, Michael; Pachter, Lior; Tress, Michael L; Valencia, Alfonso; Choo, Siew Woh; Choo, Chiou Yu; Ucla, Catherine; Manzano, Caroline; Wyss, Carine; Cheung, Evelyn; Clark, Taane G; Brown, James B; Ganesh, Madhavan; Patel, Sandeep; Tammana, Hari; Chrast, Jacqueline; Henrichsen, Charlotte N; Kai, Chikatoshi; Kawai, Jun; Nagalakshmi, Ugrappa; Wu, Jiaqian; Lian, Zheng; Lian, Jin; Newburger, Peter; Zhang, Xueqing; Bickel, Peter; Mattick, John S; Carninci, Piero; Hayashizaki, Yoshihide; Weissman, Sherman; Hubbard, Tim; Myers, Richard M; Rogers, Jane; Stadler, Peter F; Lowe, Todd M; Wei, Chia-Lin; Ruan, Yijun; Struhl, Kevin; Gerstein, Mark; Antonarakis, Stylianos E; Fu, Yutao; Green, Eric D; Karaöz, Ulaş; Siepel, Adam; Taylor, James; Liefer, Laura A; Wetterstrand, Kris A; Good, Peter J; Feingold, Elise A; Guyer, Mark S; Cooper, Gregory M; Asimenos, George; Dewey, Colin N; Hou, Minmei; Nikolaev, Sergey; Montoya-Burgos, Juan I; Löytynoja, Ari; Whelan, Simon; Pardi, Fabio; Massingham, Tim; Huang, Haiyan; Zhang, Nancy R; Holmes, Ian; Mullikin, James C; Ureta-Vidal, Abel; Paten, Benedict; Seringhaus, Michael; Church, Deanna; Rosenbloom, Kate; Kent, W James; Stone, Eric A; Batzoglou, Serafim; Goldman, Nick; Hardison, Ross C; Haussler, David; Miller, Webb; Sidow, Arend; Trinklein, Nathan D; Zhang, Zhengdong D; Barrera, Leah; Stuart, Rhona; King, David C; Ameur, Adam; Enroth, Stefan; Bieda, Mark C; Kim, Jonghwan; Bhinge, Akshay A; Jiang, Nan; Liu, Jun; Yao, Fei; Vega, Vinsensius B; Lee, Charlie W H; Ng, Patrick; Shahab, Atif; Yang, Annie; Moqtaderi, Zarmik; Zhu, Zhou; Xu, Xiaoqin; Squazzo, Sharon; Oberley, Matthew J; Inman, David; Singer, Michael A; Richmond, Todd A; Munn, Kyle J; Rada-Iglesias, Alvaro; Wallerman, Ola; Komorowski, Jan; Fowler, Joanna C; Couttet, Phillippe; Bruce, Alexander W; Dovey, Oliver M; Ellis, Peter D; Langford, Cordelia F; Nix, David A; Euskirchen, Ghia; Hartman, Stephen; Urban, Alexander E; Kraus, Peter; Van Calcar, Sara; Heintzman, Nate; Kim, Tae Hoon; Wang, Kun; Qu, Chunxu; Hon, Gary; Luna, Rosa; Glass, Christopher K; Rosenfeld, M Geoff; Aldred, Shelley Force; Cooper, Sara J; Halees, Anason; Lin, Jane M; Shulha, Hennady P; Zhang, Xiaoling; Xu, Mousheng; Haidar, Jaafar N S; Yu, Yong; Ruan, Yijun; Iyer, Vishwanath R; Green, Roland D; Wadelius, Claes; Farnham, Peggy J; Ren, Bing; Harte, Rachel A; Hinrichs, Angie S; Trumbower, Heather; Clawson, Hiram; Hillman-Jackson, Jennifer; Zweig, Ann S; Smith, Kayla; Thakkapallayil, Archana; Barber, Galt; Kuhn, Robert M; Karolchik, Donna; Armengol, Lluis; Bird, Christine P; de Bakker, Paul I W; Kern, Andrew D; Lopez-Bigas, Nuria; Martin, Joel D; Stranger, Barbara E; Woodroffe, Abigail; Davydov, Eugene; Dimas, Antigone; Eyras, Eduardo; Hallgrímsdóttir, Ingileif B; Huppert, Julian; Zody, Michael C; Abecasis, Gonçalo R; Estivill, Xavier; Bouffard, Gerard G; Guan, Xiaobin; Hansen, Nancy F; Idol, Jacquelyn R; Maduro, Valerie V B; Maskeri, Baishali; McDowell, Jennifer C; Park, Morgan; Thomas, Pamela J; Young, Alice C; Blakesley, Robert W; Muzny, Donna M; Sodergren, Erica; Wheeler, David A; Worley, Kim C; Jiang, Huaiyang; Weinstock, George M; Gibbs, Richard A; Graves, Tina; Fulton, Robert; Mardis, Elaine R; Wilson, Richard K; Clamp, Michele; Cuff, James; Gnerre, Sante; Jaffe, David B; Chang, Jean L; Lindblad-Toh, Kerstin; Lander, Eric S; Koriabine, Maxim; Nefedov, Mikhail; Osoegawa, Kazutoyo; Yoshinaga, Yuko; Zhu, Baoli; de Jong, Pieter J

2007-06-14

We report the generation and analysis of functional data from multiple, diverse experiments performed on a targeted 1% of the human genome as part of the pilot phase of the ENCODE Project. These data have been further integrated and augmented by a number of evolutionary and computational analyses. Together, our results advance the collective knowledge about human genome function in several major areas. First, our studies provide convincing evidence that the genome is pervasively transcribed, such that the majority of its bases can be found in primary transcripts, including non-protein-coding transcripts, and those that extensively overlap one another. Second, systematic examination of transcriptional regulation has yielded new understanding about transcription start sites, including their relationship to specific regulatory sequences and features of chromatin accessibility and histone modification. Third, a more sophisticated view of chromatin structure has emerged, including its inter-relationship with DNA replication and transcriptional regulation. Finally, integration of these new sources of information, in particular with respect to mammalian evolution based on inter- and intra-species sequence comparisons, has yielded new mechanistic and evolutionary insights concerning the functional landscape of the human genome. Together, these studies are defining a path for pursuit of a more comprehensive characterization of human genome function.
Sequence co-evolution gives 3D contacts and structures of protein complexes

PubMed Central

Hopf, Thomas A; Schärfe, Charlotta P I; Rodrigues, João P G L M; Green, Anna G; Kohlbacher, Oliver; Sander, Chris; Bonvin, Alexandre M J J; Marks, Debora S

2014-01-01

Protein–protein interactions are fundamental to many biological processes. Experimental screens have identified tens of thousands of interactions, and structural biology has provided detailed functional insight for select 3D protein complexes. An alternative rich source of information about protein interactions is the evolutionary sequence record. Building on earlier work, we show that analysis of correlated evolutionary sequence changes across proteins identifies residues that are close in space with sufficient accuracy to determine the three-dimensional structure of the protein complexes. We evaluate prediction performance in blinded tests on 76 complexes of known 3D structure, predict protein–protein contacts in 32 complexes of unknown structure, and demonstrate how evolutionary couplings can be used to distinguish between interacting and non-interacting protein pairs in a large complex. With the current growth of sequences, we expect that the method can be generalized to genome-wide elucidation of protein–protein interaction networks and used for interaction predictions at residue resolution. DOI: http://dx.doi.org/10.7554/eLife.03430.001 PMID:25255213
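To make the idea of "correlated evolutionary sequence changes" concrete, the sketch below scores co-variation between two alignment columns using mutual information. This is deliberately a much simpler, local measure than the approach described in the abstract and is not the authors' method; the function name `column_mutual_information` and the toy columns are assumptions made for illustration only.

```python
import math
from collections import Counter


def column_mutual_information(col_a, col_b):
    """Mutual information (in nats) between two equally long alignment columns.

    Each column is a sequence of residue letters, one per aligned sequence.
    High values mean the residues at the two positions tend to change together.
    """
    if len(col_a) != len(col_b) or len(col_a) == 0:
        raise ValueError("columns must be non-empty and of equal length")
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))

    mi = 0.0
    for (a, b), c in count_ab.items():
        p_ab = c / n
        p_a = count_a[a] / n
        p_b = count_b[b] / n
        mi += p_ab * math.log(p_ab / (p_a * p_b))
    return mi


# Toy columns: position 1 in protein A and position 2 in protein B, paired
# across eight species. Perfect co-variation gives MI = ln 2.
perfectly_coupled = column_mutual_information("AAAACCCC", "DDDDEEEE")
independent = column_mutual_information("AAAACCCC", "DEDEDEDE")
```

Raw mutual information is known to be confounded by phylogeny and by indirect (transitive) correlations, which is one reason global statistical models are generally preferred for contact prediction.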
Active site of tripeptidyl peptidase II from human erythrocytes is of the subtilisin type.

PubMed Central

Tomkinson, B; Wernstedt, C; Hellman, U; Zetterqvist, O

1987-01-01

The present report presents evidence that the amino acid sequence around the serine of the active site of human tripeptidyl peptidase II is of the subtilisin type. The enzyme from human erythrocytes was covalently labeled at its active site with [3H]diisopropyl fluorophosphate, and the protein was subsequently reduced, alkylated, and digested with trypsin. The labeled tryptic peptides were purified by gel filtration and repeated reversed-phase HPLC, and their amino-terminal sequences were determined. Residue 9 contained the radioactive label and was, therefore, considered to be the active serine residue. The primary structure of the part of the active site (residues 1-10) containing this residue was concluded to be Xaa-Thr-Gln-Leu-Met-Asx-Gly-Thr-Ser-Met. This amino acid sequence is homologous to the sequence surrounding the active serine of the microbial peptidases subtilisin and thermitase. These data demonstrate that human tripeptidyl peptidase II represents a potentially distinct class of human peptidases and raise the question of an evolutionary relationship between the active site of a mammalian peptidase and that of the subtilisin family of serine peptidases. PMID:3313395
Langley's CSI evolutionary model: Phase O

NASA Technical Reports Server (NTRS)

Belvin, W. Keith; Elliott, Kenny B.; Horta, Lucas G.; Bailey, Jim P.; Bruner, Anne M.; Sulla, Jeffrey L.; Won, John; Ugoletti, Roberto M.

1991-01-01

A testbed for the development of Controls Structures Interaction (CSI) technology to improve space science platform pointing is described. The evolutionary nature of the testbed will permit the study of global line-of-sight pointing in phases 0 and 1, whereas multipayload pointing systems will be studied beginning with phase 2. The design, capabilities, and typical dynamic behavior of the phase 0 version of the CSI evolutionary model (CEM) are documented for investigators both internal and external to NASA. The model description includes line-of-sight pointing measurement, testbed structure, actuators, sensors, and real time computers, as well as finite element and state space models of major components.
A single determinant dominates the rate of yeast protein evolution.

PubMed

Drummond, D Allan; Raval, Alpan; Wilke, Claus O

2006-02-01

A gene's rate of sequence evolution is among the most fundamental evolutionary quantities in common use, but what determines evolutionary rates has remained unclear. Here, we carry out the first combined analysis of seven predictors (gene expression level, dispensability, protein abundance, codon adaptation index, gene length, number of protein-protein interactions, and the gene's centrality in the interaction network) previously reported to have independent influences on protein evolutionary rates. Strikingly, our analysis reveals a single dominant variable linked to the number of translation events which explains 40-fold more variation in evolutionary rate than any other, suggesting that protein evolutionary rate has a single major determinant among the seven predictors. The dominant variable explains nearly half the variation in the rate of synonymous and protein evolution. We show that the two most commonly used methods to disentangle the determinants of evolutionary rate, partial correlation analysis and ordinary multivariate regression, produce misleading or spurious results when applied to noisy biological data. We overcome these difficulties by employing principal component regression, a multivariate regression of evolutionary rate against the principal components of the predictor variables. Our results support the hypothesis that translational selection governs the rate of synonymous and protein sequence evolution in yeast.
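Principal component regression, as described above, regresses the response on the principal components of the predictors rather than on the correlated predictors themselves. The following is a minimal sketch of that idea using NumPy; the function name `principal_component_regression`, the standardization step, and the synthetic data in the test are assumptions, not details taken from the study.

```python
import numpy as np


def principal_component_regression(X, y, n_components):
    """Regress y on the leading principal components of standardized X.

    Returns (coefs, r2, axes): the per-component regression coefficients,
    the fraction of variance in y explained by each component, and the
    principal axes (rows) expressed in terms of the original predictors.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    # Standardize predictors so that no variable dominates through its units.
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    yc = y - y.mean()

    # Rows of Vt are the principal axes of the standardized predictors.
    _, _, Vt = np.linalg.svd(Xs, full_matrices=False)
    axes = Vt[:n_components]
    scores = Xs @ axes.T  # component scores, mutually orthogonal columns

    # Orthogonal scores allow each coefficient to be fitted independently.
    score_ss = (scores ** 2).sum(axis=0)
    coefs = (scores.T @ yc) / score_ss
    r2 = coefs ** 2 * score_ss / (yc ** 2).sum()
    return coefs, r2, axes
```

Because the component scores are orthogonal, the per-component `r2` values add up to the total variance explained, which avoids the ambiguity that arises when correlated predictors compete in an ordinary multivariate regression.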
Integrated pipeline for inferring the evolutionary history of a gene family embedded in the species tree: a case study on the STIMATE gene family.

PubMed

Song, Jia; Zheng, Sisi; Nguyen, Nhung; Wang, Youjun; Zhou, Yubin; Lin, Kui

2017-10-03

Because phylogenetic inference is an important basis for answering many evolutionary problems, a large number of algorithms have been developed. Some of these algorithms have been improved by integrating gene evolution models with the expectation of accommodating the hierarchy of evolutionary processes. To the best of our knowledge, however, there still is no single unifying model or algorithm that can take all evolutionary processes into account through a stepwise or simultaneous method. On the basis of three existing phylogenetic inference algorithms, we built an integrated pipeline for inferring the evolutionary history of a given gene family; this pipeline can model gene sequence evolution, gene duplication-loss, gene transfer and multispecies coalescent processes. As a case study, we applied this pipeline to the STIMATE (TMEM110) gene family, which has recently been reported to play an important role in store-operated Ca2+ entry (SOCE) mediated by ORAI and STIM proteins. We inferred their phylogenetic trees in 69 sequenced chordate genomes. By integrating three tree reconstruction algorithms with diverse evolutionary models, a pipeline for inferring the evolutionary history of a gene family was developed, and its application was demonstrated.
Massive star evolution and SN 1987A

NASA Technical Reports Server (NTRS)

Arnett, David

1991-01-01

The evolution of massive stars through hydrogen and helium burning is addressed. A set of stellar evolutionary sequences for mass/solar mass of 15, 20, and 25, and metallicity of 0.002, 0.005, 0.007, 0.010, and 0.20 is presented; semiconvection is restricted to operating slower than the local thermal time scale. Using these sequences, simple models of the massive star content of the LMC are found to agree moderately well with the new observational data of Fitzpatrick and Garmany (1990). LMC supergiants were detected only in their post-main-sequence phases, so that 5-10 times more massive stars are there but not identified as such. It is argued that SN 1987A exhibits the normal evolution of a single star of about 20 solar masses having LMC abundances. Despite the variety of envelope behavior, the structure of the core at collapse is rather similar for the stars of a given mass. Variations due to different rates of mass loss are likely to be larger than those due to composition.
Using Evolutionary Data in Developing Phylogenetic Trees: A Scaffolded Approach with Authentic Data

ERIC Educational Resources Information Center

Davenport, K. D.; Milks, Kirstin Jane; Van Tassell, Rebecca

2015-01-01

Analyzing evolutionary relationships requires that students have a thorough understanding of evidence and of how scientists use evidence to develop these relationships. In this lesson sequence, students work in groups to process many different lines of evidence of evolutionary relationships between ungulates, then construct a scientific argument…
Evolution of sparsity and modularity in a model of protein allostery

NASA Astrophysics Data System (ADS)

Hemery, Mathieu; Rivoire, Olivier

2015-04-01

The sequence of a protein is not only constrained by its physical and biochemical properties under current selection, but also by features of its past evolutionary history. Understanding the extent and the form that these evolutionary constraints may take is important to interpret the information in protein sequences. To study this problem, we introduce a simple but physical model of protein evolution where selection targets allostery, the functional coupling of distal sites on protein surfaces. This model shows how the geometrical organization of couplings between amino acids within a protein structure can depend crucially on its evolutionary history. In particular, two scenarios are found to generate a spatial concentration of functional constraints: high mutation rates and fluctuating selective pressures. This second scenario offers a plausible explanation for the high tolerance of natural proteins to mutations and for the spatial organization of their least tolerant amino acids, as revealed by sequence analysis and mutagenesis experiments. It also implies a faculty to adapt to new selective pressures that is consistent with observations. The model illustrates how several independent functional modules may emerge within the same protein structure, depending on the nature of past environmental fluctuations. Our model thus relates the evolutionary history of proteins to the geometry of their functional constraints, with implications for decoding and engineering protein sequences.
Rooting the archaebacterial tree: the pivotal role of Thermococcus celer in archaebacterial evolution

NASA Technical Reports Server (NTRS)

Achenbach-Richter, L.; Gupta, R.; Zillig, W.; Woese, C. R.

1988-01-01

The sequence of the 16S ribosomal RNA gene from the archaebacterium Thermococcus celer shows the organism to be related to the methanogenic archaebacteria rather than to its phenotypic counterparts, the extremely thermophilic archaebacteria. This conclusion turns on the position of the root of the archaebacterial phylogenetic tree, however. The problems encountered in rooting this tree are analyzed in detail. Under conditions that suppress evolutionary noise both the parsimony and evolutionary distance methods yield a root location (using a number of eubacterial or eukaryotic outgroup sequences) that is consistent with that determined by an "internal rooting" method, based upon an (approximate) determination of relative evolutionary rates.

Inferring the mode of origin of polyploid species from next-generation sequence data.

PubMed

Roux, Camille; Pannell, John R

2015-03-01

Many eukaryote organisms are polyploid. However, despite their importance, evolutionary inference of polyploid origins and modes of inheritance has been limited by a need for analyses of allele segregation at multiple loci using crosses. The increasing availability of sequence data for nonmodel species now allows the application of established approaches for the analysis of genomic data in polyploids. Here, we ask whether approximate Bayesian computation (ABC), applied to realistic traditional and next-generation sequence data, allows correct inference of the evolutionary and demographic history of polyploids. Using simulations, we evaluate the robustness of evolutionary inference by ABC for tetraploid species as a function of the number of individuals and loci sampled, and the presence or absence of an outgroup. We find that ABC adequately retrieves the recent evolutionary history of polyploid species on the basis of both old and new sequencing technologies. The application of ABC to sequence data from diploid and polyploid species of the plant genus Capsella confirms its utility. Our analysis strongly supports an allopolyploid origin of C. bursa-pastoris about 80 000 years ago. This conclusion runs contrary to previous findings based on the same data set but using an alternative approach and is in agreement with recent findings based on whole-genome sequencing. Our results indicate that ABC is a promising and powerful method for revealing the evolution of polyploid species, without the need to attribute alleles to a homeologous chromosome pair. The approach can readily be extended to more complex scenarios involving higher ploidy levels. © 2015 John Wiley & Sons Ltd.
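Approximate Bayesian computation in its simplest (rejection) form draws parameters from a prior, simulates data under each draw, and keeps the draws whose simulated summary lies close to the observed one. The sketch below shows that generic loop on a toy binomial problem; it is not the polyploid demographic model of the study, and every name in it (`abc_rejection`, `simulate_derived_count`, the uniform prior, the tolerance) is a hypothetical choice made for illustration.

```python
import random


def abc_rejection(observed, simulate, prior_sample, distance,
                  tolerance, n_draws, seed=0):
    """Generic ABC rejection sampler; returns the accepted parameter draws."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample(rng)          # 1. draw a parameter from the prior
        simulated = simulate(theta, rng)   # 2. simulate a summary statistic
        if distance(simulated, observed) <= tolerance:
            accepted.append(theta)         # 3. keep draws close to the data
    return accepted


# Toy model (assumption): the summary statistic is the number of loci, out of
# 100, that carry a derived allele, each independently with probability theta.
def simulate_derived_count(theta, rng, n_loci=100):
    return sum(rng.random() < theta for _ in range(n_loci))


def uniform_prior(rng):
    return rng.random()


def abs_distance(a, b):
    return abs(a - b)


posterior = abc_rejection(observed=30,
                          simulate=simulate_derived_count,
                          prior_sample=uniform_prior,
                          distance=abs_distance,
                          tolerance=2,
                          n_draws=20000)
posterior_mean = sum(posterior) / len(posterior)
```

The accepted draws approximate the posterior distribution of the parameter; tightening `tolerance` improves the approximation at the cost of a lower acceptance rate.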
A Systematic Bayesian Integration of Epidemiological and Genetic Data

PubMed Central

Lau, Max S. Y.; Marion, Glenn; Streftaris, George; Gibson, Gavin

2015-01-01

Genetic sequence data on pathogens have great potential to inform inference of their transmission dynamics ultimately leading to better disease control. Where genetic change and disease transmission occur on comparable timescales additional information can be inferred via the joint analysis of such genetic sequence data and epidemiological observations based on clinical symptoms and diagnostic tests. Although recently introduced approaches represent substantial progress, for computational reasons they approximate genuine joint inference of disease dynamics and genetic change in the pathogen population, capturing partially the joint epidemiological-evolutionary dynamics. Improved methods are needed to fully integrate such genetic data with epidemiological observations, for achieving a more robust inference of the transmission tree and other key epidemiological parameters such as latent periods. Here, building on current literature, a novel Bayesian framework is proposed that infers simultaneously and explicitly the transmission tree and unobserved transmitted pathogen sequences. Our framework facilitates the use of realistic likelihood functions and enables systematic and genuine joint inference of the epidemiological-evolutionary process from partially observed outbreaks. Using simulated data it is shown that this approach is able to infer accurately joint epidemiological-evolutionary dynamics, even when pathogen sequences and epidemiological data are incomplete, and when sequences are available for only a fraction of exposures. These results also characterise and quantify the value of incomplete and partial sequence data, which has important implications for sampling design, and demonstrate the abilities of the introduced method to identify multiple clusters within an outbreak. The framework is used to analyse an outbreak of foot-and-mouth disease in the UK, enhancing current understanding of its transmission dynamics and evolutionary process. PMID:26599399
Evolutionary growth process of highly conserved sequences in vertebrate genomes.

PubMed

Ishibashi, Minaka; Noda, Akiko Ogura; Sakate, Ryuichi; Imanishi, Tadashi

2012-08-01

Genome sequence comparison between evolutionarily distant species revealed ultraconserved elements (UCEs) among mammals under strong purifying selection. Most of them were also conserved among vertebrates. Because they tend to be located in the flanking regions of developmental genes, they would have fundamental roles in creating vertebrate body plans. However, the evolutionary origin and selection mechanism of these UCEs remain unclear. Here we report that UCEs arose in primitive vertebrates, and gradually grew in vertebrate evolution. We searched for UCEs in two teleost fishes, Tetraodon nigroviridis and Oryzias latipes, and found 554 UCEs with 100% identity over 100 bps. Comparison of teleost and mammalian UCEs revealed 43 pairs of common, jawed-vertebrate UCEs (jUCE) with high sequence identities, ranging from 83.1% to 99.2%. Ten of them retain lower similarities to the Petromyzon marinus genome, and the substitution rates of four non-exonic jUCEs were reduced after the teleost-mammal divergence, suggesting that robust conservation had been acquired in the jawed vertebrate lineage. Our results indicate that prototypical UCEs originated before the divergence of jawed and jawless vertebrates and have been frozen as perfect conserved sequences in the jawed vertebrate lineage. In addition, our comparative sequence analyses of UCEs and neighboring regions resulted in a discovery of lineage-specific conserved sequences. They were added progressively to prototypical UCEs, suggesting step-wise acquisition of novel regulatory roles. Our results indicate that conserved non-coding elements (CNEs) consist of blocks with distinct evolutionary history, each having been frozen since different evolutionary era along the vertebrate lineage. Copyright © 2012 Elsevier B.V. All rights reserved.
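The search criterion mentioned above (100% identity over a minimum length) can be sketched as a scan for runs of identical columns in a pairwise alignment. The function name and the half-open interval convention are illustrative assumptions; the abstract does not describe the authors' actual search tool.

```python
def identical_runs(seq_a, seq_b, min_len):
    """Return half-open (start, end) intervals where two aligned sequences
    are identical (gap characters excluded) for at least `min_len` columns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    runs = []
    start = None
    for i, (a, b) in enumerate(zip(seq_a, seq_b)):
        same = a == b and a != "-"
        if same and start is None:
            start = i  # a new identical run begins here
        elif not same and start is not None:
            if i - start >= min_len:
                runs.append((start, i))
            start = None
    # Close a run that extends to the end of the alignment.
    if start is not None and len(seq_a) - start >= min_len:
        runs.append((start, len(seq_a)))
    return runs


print(identical_runs("ACGTACGTAA", "ACGTTCGTAA", 4))
```

For a UCE-style search, `min_len` would be set to 100 and the inputs would be genome-scale alignments rather than the short toy strings used here.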
Exploring Pandora's Box: Potential and Pitfalls of Low Coverage Genome Surveys for Evolutionary Biology

PubMed Central

Leese, Florian; Mayer, Christoph; Agrawal, Shobhit; Dambach, Johannes; Dietz, Lars; Doemel, Jana S.; Goodall-Copstake, William P.; Held, Christoph; Jackson, Jennifer A.; Lampert, Kathrin P.; Linse, Katrin; Macher, Jan N.; Nolzen, Jennifer; Raupach, Michael J.; Rivera, Nicole T.; Schubart, Christoph D.; Striewski, Sebastian; Tollrian, Ralph; Sands, Chester J.

2012-01-01

High throughput sequencing technologies are revolutionizing genetic research. With this “rise of the machines”, genomic sequences can be obtained even for unknown genomes within a short time and for reasonable costs. This has enabled evolutionary biologists studying genetically unexplored species to identify molecular markers or genomic regions of interest (e.g. micro- and minisatellites, mitochondrial and nuclear genes) by sequencing only a fraction of the genome. However, when using such datasets from non-model species, it is possible that DNA from non-target contaminant species such as bacteria, viruses, fungi, or other eukaryotic organisms may complicate the interpretation of the results. In this study we analysed 14 genomic pyrosequencing libraries of aquatic non-model taxa from four major evolutionary lineages. We quantified the amount of suitable micro- and minisatellites, mitochondrial genomes, known nuclear genes and transposable elements and searched for contamination from various sources using bioinformatic approaches. Our results show that in all sequence libraries with estimated coverage of about 0.02–25%, many appropriate micro- and minisatellites, mitochondrial gene sequences and nuclear genes from different KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways could be identified and characterized. These can serve as markers for phylogenetic and population genetic analyses. A central finding of our study is that several genomic libraries suffered from different biases owing to non-target DNA or mobile elements. In particular, viruses, bacteria or eukaryote endosymbionts contributed significantly (up to 10%) to some of the libraries analysed. If not identified as such, genetic markers developed from high-throughput sequencing data for non-model organisms may bias evolutionary studies or fail completely in experimental tests. 
In conclusion, our study demonstrates the enormous potential of low-coverage genome survey sequences and suggests bioinformatic analysis workflows. The results also advise a more sophisticated filtering for problematic sequences and non-target genome sequences prior to developing markers. PMID:23185309
Evolutionary characterization of the West Nile Virus complete genome.

PubMed

Gray, R R; Veras, N M C; Santos, L A; Salemi, M

2010-07-01

The spatial dynamics of the West Nile Virus epidemic in North America are largely unknown. Previous studies that investigated the evolutionary history of the virus used sequence data from the structural genes (prM and E); however, these regions may lack phylogenetic information and obscure true evolutionary relationships. This study systematically evaluated the evolutionary patterns in the eleven genes of the WNV genome in order to determine which region(s) were most phylogenetically informative. We found that while the E region lacks resolution and can potentially result in misleading conclusions, the full NS3 or NS5 regions have strong phylogenetic signal. Furthermore, we show that geographic structure of WNV infection within the US is more pronounced than previously reported in studies that used the structural genes. We conclude that future evolutionary studies should focus on NS3 and NS5 in order to maximize the available sequences while retaining maximal interpretative power to infer temporal and geographic trends among WNV strains. Copyright 2010 Elsevier Inc. All rights reserved.
The impact of age, biogenesis, and genomic clustering on Drosophila microRNA evolution

PubMed Central

Mohammed, Jaaved; Flynt, Alex S.; Siepel, Adam; Lai, Eric C.

2013-01-01

The molecular evolutionary signatures of miRNAs inform our understanding of their emergence, biogenesis, and function. The known signatures of miRNA evolution have derived mostly from the analysis of deeply conserved, canonical loci. In this study, we examine the impact of age, biogenesis pathway, and genomic arrangement on the evolutionary properties of Drosophila miRNAs. Crucial to the accuracy of our results was our curation of high-quality miRNA alignments, which included nearly 150 corrections to ortholog calls and nucleotide sequences of the global 12-way Drosophilid alignments currently available. Using these data, we studied primary sequence conservation, normalized free-energy values, and types of structure-preserving substitutions. We expand upon common miRNA evolutionary patterns that reflect fundamental features of miRNAs that are under functional selection. We observe that melanogaster-subgroup-specific miRNAs, although recently emerged and rapidly evolving, nonetheless exhibit evolutionary signatures that are similar to well-conserved miRNAs and distinct from other structured noncoding RNAs and bulk conserved non-miRNA hairpins. This provides evidence that even young miRNAs may be selected for regulatory activities. More strikingly, we observe that mirtrons and clustered miRNAs both exhibit distinct evolutionary properties relative to solo, well-conserved miRNAs, even after controlling for sequence depth. These studies highlight the previously unappreciated impact of biogenesis strategy and genomic location on the evolutionary dynamics of miRNAs, and affirm that miRNAs do not evolve as a unitary class. PMID:23882112
Taxator-tk: precise taxonomic assignment of metagenomes by fast approximation of evolutionary neighborhoods

PubMed Central

Dröge, J.; Gregor, I.; McHardy, A. C.

2015-01-01

Motivation: Metagenomics characterizes microbial communities by random shotgun sequencing of DNA isolated directly from an environment of interest. An essential step in computational metagenome analysis is taxonomic sequence assignment, which allows identifying the sequenced community members and reconstructing taxonomic bins with sequence data for the individual taxa. For the massive datasets generated by next-generation sequencing technologies, this cannot be performed with de-novo phylogenetic inference methods. We describe an algorithm and the accompanying software, taxator-tk, which performs taxonomic sequence assignment by fast approximate determination of evolutionary neighbors from sequence similarities. Results: Taxator-tk was precise in its taxonomic assignment across all ranks and taxa for a range of evolutionary distances and for short as well as for long sequences. In addition to the taxonomic binning of metagenomes, it is well suited for profiling microbial communities from metagenome samples because it identifies bacterial, archaeal and eukaryotic community members without being affected by varying primer binding strengths, as in marker gene amplification, or copy number variations of marker genes across different taxa. Taxator-tk has an efficient, parallelized implementation that allows the assignment of 6 Gb of sequence data per day on a standard multiprocessor system with 10 CPU cores and microbial RefSeq as the genomic reference data. Availability and implementation: Taxator-tk source and binary program files are publicly available at http://algbio.cs.uni-duesseldorf.de/software/. Contact: Alice.McHardy@uni-duesseldorf.de Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25388150
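As a rough illustration of assigning a sequence from its similar reference sequences, the sketch below takes the lineages of a query's nearest references and reports the deepest taxon they share (a generic lowest-common-ancestor rule). This is not taxator-tk's algorithm, which the abstract describes only as a fast approximation of evolutionary neighbors; the function name and example lineages are assumptions.

```python
def lowest_common_taxon(lineages):
    """Return the shared prefix of several root-to-leaf lineages."""
    if not lineages:
        return []
    common = []
    for ranks in zip(*lineages):
        if all(rank == ranks[0] for rank in ranks):
            common.append(ranks[0])
        else:
            break  # lineages diverge at this rank
    return common


# Hypothetical lineages of the references most similar to one query read.
neighbors = [
    ["Bacteria", "Proteobacteria", "Gammaproteobacteria", "Escherichia"],
    ["Bacteria", "Proteobacteria", "Gammaproteobacteria", "Salmonella"],
    ["Bacteria", "Proteobacteria", "Betaproteobacteria", "Neisseria"],
]
print(lowest_common_taxon(neighbors))
```

The more divergent the neighbors, the shallower the assigned rank, which is the conservative behavior one generally wants when the query has no close reference.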
Differentiated evolutionary relationships among chordates from comparative alignments of multiple sequences of MyoD and MyoG myogenic regulatory factors.

PubMed

Oliani, L C; Lidani, K C F; Gabriel, J E

2015-10-16

MyoD and MyoG are transcription factors that have essential roles in myogenic lineage determination and muscle differentiation. The purpose of this study was to compare multiple amino acid sequences of myogenic regulatory proteins to infer evolutionary relationships among chordates. Protein sequences from Mus musculus (P10085 and P12979), human Homo sapiens (P15172 and P15173), bovine Bos taurus (Q7YS82 and Q7YS81), wild pig Sus scrofa (P49811 and P49812), quail Coturnix coturnix (P21572 and P34060), chicken Gallus gallus (P16075 and P17920), rat Rattus norvegicus (Q02346 and P20428), domestic water buffalo Bubalus bubalis (D2SP11 and A7L034), and sheep Ovis aries (Q90477 and D3YKV7) were searched from a non-redundant protein sequence database UniProtKB/Swiss-Prot, and subsequently analyzed using the Mega6.0 software. MyoD evolutionary analyses revealed the presence of three main clusters with all mammals branched in one cluster, members of the order Rodentia (mouse and rat) in a second branch linked to the first, and birds of the order Galliformes (chicken and quail) remaining isolated in a third. MyoG evolutionary analyses aligned sequences in two main clusters, all mammalian specimens grouped in different sub-branches, and birds clustered in a second branch. These analyses suggest that the evolution of MyoD and MyoG was driven by different pathways.
MultiSeq: unifying sequence and structure data for evolutionary analysis

PubMed Central

Roberts, Elijah; Eargle, John; Wright, Dan; Luthey-Schulten, Zaida

2006-01-01

Background Since the publication of the first draft of the human genome in 2000, bioinformatic data have been accumulating at an overwhelming pace. Currently, more than 3 million sequences and 35 thousand structures of proteins and nucleic acids are available in public databases. Finding correlations in and between these data to answer critical research questions is extremely challenging. This problem needs to be approached from several directions: information science to organize and search the data; information visualization to assist in recognizing correlations; mathematics to formulate statistical inferences; and biology to analyze chemical and physical properties in terms of sequence and structure changes. Results Here we present MultiSeq, a unified bioinformatics analysis environment that allows one to organize, display, align and analyze both sequence and structure data for proteins and nucleic acids. While special emphasis is placed on analyzing the data within the framework of evolutionary biology, the environment is also flexible enough to accommodate other usage patterns. The evolutionary approach is supported by the use of predefined metadata, adherence to standard ontological mappings, and the ability for the user to adjust these classifications using an electronic notebook. MultiSeq contains a new algorithm to generate complete evolutionary profiles that represent the topology of the molecular phylogenetic tree of a homologous group of distantly related proteins. The method, based on the multidimensional QR factorization of multiple sequence and structure alignments, removes redundancy from the alignments and orders the protein sequences by increasing linear dependence, resulting in the identification of a minimal basis set of sequences that spans the evolutionary space of the homologous group of proteins. 
Conclusion MultiSeq is a major extension of the Multiple Alignment tool that is provided as part of VMD, a structural visualization program for analyzing molecular dynamics simulations. Both are freely distributed by the NIH Resource for Macromolecular Modeling and Bioinformatics and MultiSeq is included with VMD starting with version 1.8.5. The MultiSeq website has details on how to download and use the software: PMID:16914055
Ancient DNA sequence revealed by error-correcting codes

PubMed Central

Brandão, Marcelo M.; Spoladore, Larissa; Faria, Luzinete C. B.; Rocha, Andréa S. L.; Silva-Filho, Marcio C.; Palazzo, Reginaldo

2015-01-01

A previously described DNA sequence generator algorithm (DNA-SGA) using error-correcting codes has been employed as a computational tool to address the evolutionary pathway of the genetic code. The code-generated sequence alignment demonstrated that a residue mutation revealed by the code can be found in the same position in sequences of distantly related taxa. Furthermore, the code-generated sequences do not promote amino acid changes in the deviant genomes through codon reassignment. A Bayesian evolutionary analysis of both code-generated and homologous sequences of the Arabidopsis thaliana malate dehydrogenase gene indicates an approximately 1 MYA divergence time from the MDH code-generated sequence node to its paralogous sequences. The DNA-SGA helps to determine the plesiomorphic state of DNA sequences because a single nucleotide alteration often occurs in distantly related taxa and can be found in the alternative codon patterns of noncanonical genetic codes. As a consequence, the algorithm may reveal an earlier stage of the evolution of the standard code. PMID:26159228
Genomic investigations of evolutionary dynamics and epistasis in microbial evolution experiments.

PubMed

Jerison, Elizabeth R; Desai, Michael M

2015-12-01

Microbial evolution experiments enable us to watch adaptation in real time, and to quantify the repeatability and predictability of evolution by comparing identical replicate populations. Further, we can resurrect ancestral types to examine changes over evolutionary time. Until recently, experimental evolution has been limited to measuring phenotypic changes, or to tracking a few genetic markers over time. However, recent advances in sequencing technology now make it possible to extensively sequence clones or whole-population samples from microbial evolution experiments. Here, we review recent work exploiting these techniques to understand the genomic basis of evolutionary change in experimental systems. We first focus on studies that analyze the dynamics of genome evolution in microbial systems. We then survey work that uses observations of sequence evolution to infer aspects of the underlying fitness landscape, concentrating on the epistatic interactions between mutations and the constraints these interactions impose on adaptation. Copyright © 2015 Elsevier Ltd. All rights reserved.
Disentangling the intragroup HI in Compact Groups of galaxies by means of X3D visualization

NASA Astrophysics Data System (ADS)

Verdes-Montenegro, Lourdes; Vogt, Frederic; Aubery, Claire; Duret, Laetitie; Garrido, Julián; Sánchez, Susana; Yun, Min S.; Borthakur, Sanchayeeta; Hess, Kelley; Cluver, Michelle; Del Olmo, Ascensión; Perea, Jaime

2017-03-01

As an extreme kind of environment, Hickson Compact Groups (HCGs) have been shown to be very complex systems. HI-VLA observations revealed an intricate network of HI tails and bridges, tracing pre-processing through extreme tidal interactions. We found HCGs to show a large HI deficiency, supporting an evolutionary sequence in which gas-rich groups transform via tidal interactions and ISM (interstellar medium) stripping into gas-poor systems. We also detected a diffuse HI component in the groups, increasing with evolutionary phase, although with uncertain distribution. The complex net of detected HI as observed with the VLA hence seems as puzzling as the missing one. In this talk we revisit the existing VLA information on the HI distribution and kinematics of HCGs by means of X3D visualization. X3D constitutes a powerful tool to extract the most from HI data cubes and a means of simplifying and easing access to data visualization and publication via three-dimensional (3-D) diagrams.
Evolutionary Engineering Improves Tolerance for Replacement Jet Fuels in Saccharomyces cerevisiae

PubMed Central

Brennan, Timothy C. R.; Williams, Thomas C.; Schulz, Benjamin L.; Palfreyman, Robin W.; Nielsen, Lars K.

2015-01-01

Monoterpenes are liquid hydrocarbons with applications ranging from flavor and fragrance to replacement jet fuel. Their toxicity, however, presents a major challenge for microbial synthesis. Here we evolved limonene-tolerant Saccharomyces cerevisiae strains and sequenced six strains across the 200-generation evolutionary time course. Mutations were found in the tricalbin proteins Tcb2p and Tcb3p. Genomic reconstruction in the parent strain showed that truncation of a single protein (tTcb3p1-989), but not its complete deletion, was sufficient to recover the evolved phenotype improving limonene fitness 9-fold. tTcb3p1-989 increased tolerance toward two other monoterpenes (β-pinene and myrcene) 11- and 8-fold, respectively, and tolerance toward the biojet fuel blend AMJ-700t (10% cymene, 50% limonene, 40% farnesene) 4-fold. tTcb3p1-989 is the first example of successful engineering of phase tolerance and creates opportunities for production of the highly toxic C10 alkenes in yeast. PMID:25746998
Experimental investigation of an RNA sequence space

NASA Technical Reports Server (NTRS)

Lee, Youn-Hyung; Dsouza, Lisa; Fox, George E.

1993-01-01

Modern rRNAs are the historic consequence of an ongoing evolutionary exploration of a sequence space. These extant sequences belong to a special subset of the sequence space that is comprised only of those primary sequences that can validly perform the biological function(s) required of the particular RNA. If it were possible to readily identify all such valid sequences, stochastic predictions could be made about the relative likelihood of various evolutionary pathways available to an RNA. Herein an experimental system which can assess whether a particular sequence is likely to have validity as a eubacterial 5S rRNA is described. A total of ten naturally occurring, and hence known to be valid, sequences and two point mutants of unknown validity were used to test the usefulness of the approach. Nine of the ten valid sequences tested positive whereas both mutants tested as clearly defective. The tenth valid sequence gave results that would be interpreted as reflecting a borderline status were the answer not known. These results demonstrate that it is possible to experimentally determine which sequences in local regions of the sequence space are potentially valid 5S rRNAs.
Metabolic engineering to guide evolution - Creating a novel mode for L-valine production with Corynebacterium glutamicum.

PubMed

Schwentner, Andreas; Feith, André; Münch, Eugenia; Busche, Tobias; Rückert, Christian; Kalinowski, Jörn; Takors, Ralf; Blombach, Bastian

2018-03-06

Evolutionary approaches are often undirected and mutagen-based yielding numerous mutations, which need elaborate screenings to identify relevant targets. We here apply Metabolic engineering to Guide Evolution (MGE), an evolutionary approach evolving and identifying new targets to improve microbial producer strains. MGE is based on the idea to impair the cell's metabolism by metabolic engineering, thereby generating guided evolutionary pressure. It consists of three distinct phases: (i) metabolic engineering to create the evolutionary pressure on the applied strain followed by (ii) a cultivation phase with growth as straightforward screening indicator for the evolutionary event, and (iii) comparative whole genome sequencing (WGS), to identify mutations in the evolved strains, which are eventually re-engineered for verification. Applying MGE, we evolved the PEP and pyruvate carboxylase-deficient strain C. glutamicum Δppc Δpyc to grow on glucose as substrate with rates up to 0.31 ± 0.02 h^-1, which corresponds to 80% of the growth rate of the wildtype strain. The intersection of the mutations identified by WGS revealed isocitrate dehydrogenase (ICD) as consistent target in three independently evolved mutants. Upon re-engineering in C. glutamicum Δppc Δpyc, the identified mutations led to diminished ICD activities and activated the glyoxylate shunt replenishing oxaloacetate required for growth. Intracellular relative quantitative metabolome analysis showed that the pools of citrate, isocitrate, cis-aconitate, and L-valine were significantly higher compared to the WT control. As an alternative to existing L-valine producer strains based on inactivated or attenuated pyruvate dehydrogenase complex, we finally engineered the PEP and pyruvate carboxylase-deficient C. glutamicum strains with identified ICD mutations for L-valine production by overexpression of the L-valine biosynthesis genes. Among them, C. glutamicum Δppc Δpyc ICD G407S (pJC4ilvBNCE) produced up to 8.9 ± 0.4 g L-valine L^-1, with a product yield of 0.22 ± 0.01 g L-valine per g glucose. Copyright © 2018 International Metabolic Engineering Society. Published by Elsevier Inc. All rights reserved.
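The third MGE phase, intersecting the mutations found by WGS across independently evolved strains, amounts to a set intersection. The sketch below is illustrative only: apart from `icd`, the strain labels and gene names are hypothetical placeholders, not data from the study.

```python
# Hypothetical per-strain sets of mutated genes called from WGS data.
# Only "icd" is taken from the abstract; all other names are placeholders.
mutated_genes = {
    "evolved_1": {"icd", "gene_a", "gene_b"},
    "evolved_2": {"icd", "gene_c"},
    "evolved_3": {"icd", "gene_a", "gene_d"},
}

# Genes hit in every independently evolved strain are candidate causal targets.
shared_targets = set.intersection(*mutated_genes.values())
print(sorted(shared_targets))
```

Genes that recur in every replicate are then the natural first choice for re-engineering in the unevolved background, which is the verification step the abstract describes.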
Pyvolve: A Flexible Python Module for Simulating Sequences along Phylogenies.

PubMed

Spielman, Stephanie J; Wilke, Claus O

2015-01-01

We introduce Pyvolve, a flexible Python module for simulating genetic data along a phylogeny using continuous-time Markov models of sequence evolution. Easily incorporated into Python bioinformatics pipelines, Pyvolve can simulate sequences according to most standard models of nucleotide, amino-acid, and codon sequence evolution. All model parameters are fully customizable. Users can additionally specify custom evolutionary models, with custom rate matrices and/or states to evolve. This flexibility makes Pyvolve a convenient framework not only for simulating sequences under a wide variety of conditions, but also for developing and testing new evolutionary models. Pyvolve is an open-source project under a FreeBSD license, and it is available for download, along with a detailed user-manual and example scripts, from http://github.com/sjspielman/pyvolve.
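To make the underlying technique concrete, here is a standard-library sketch of simulating nucleotide sequences down a tree under the Jukes-Cantor (JC69) continuous-time Markov model. It deliberately does not use Pyvolve's API; the tree encoding as `(name, branch_length, children)` tuples, the function names, and the example tree are assumptions made for illustration.

```python
import math
import random

NUCLEOTIDES = "ACGT"


def evolve_site(base, branch_length, rng):
    """Evolve one site along a branch under JC69.

    `branch_length` is in expected substitutions per site.
    """
    p_same = 0.25 + 0.75 * math.exp(-4.0 * branch_length / 3.0)
    if rng.random() < p_same:
        return base
    # Under JC69 the three alternative bases are equally likely.
    return rng.choice([n for n in NUCLEOTIDES if n != base])


def simulate(tree, parent_seq, rng, leaves):
    """Recursively evolve `parent_seq` down `tree`, collecting leaf sequences."""
    name, branch_length, children = tree
    seq = "".join(evolve_site(b, branch_length, rng) for b in parent_seq)
    if not children:
        leaves[name] = seq
    for child in children:
        simulate(child, seq, rng, leaves)
    return leaves


rng = random.Random(42)
root_seq = "".join(rng.choice(NUCLEOTIDES) for _ in range(60))
tree = ("root", 0.0, [
    ("A", 0.1, []),
    ("anc", 0.05, [("B", 0.1, []), ("C", 0.2, [])]),
])
alignment = simulate(tree, root_seq, rng, {})
for name, seq in sorted(alignment.items()):
    print(name, seq)
```

Richer models (unequal base frequencies, codon models, rate heterogeneity) replace the closed-form JC69 probability with a matrix exponential of the rate matrix, which is the kind of generality a dedicated package provides.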
Advances in understanding tumour evolution through single-cell sequencing.

PubMed

Kuipers, Jack; Jahn, Katharina; Beerenwinkel, Niko

2017-04-01

The mutational heterogeneity observed within tumours poses additional challenges to the development of effective cancer treatments. A thorough understanding of a tumour's subclonal composition and its mutational history is essential to open up the design of treatments tailored to individual patients. Comparative studies on a large number of tumours permit the identification of mutational patterns which may refine forecasts of cancer progression, response to treatment and metastatic potential. The composition of tumours is shaped by evolutionary processes. Recent advances in next-generation sequencing offer the possibility to analyse the evolutionary history and accompanying heterogeneity of tumours at an unprecedented resolution, by sequencing single cells. New computational challenges arise when moving from bulk to single-cell sequencing data, leading to the development of novel modelling frameworks. In this review, we present the state of the art methods for understanding the phylogeny encoded in bulk or single-cell sequencing data, and highlight future directions for developing more comprehensive and informative pictures of tumour evolution. This article is part of a Special Issue entitled: Evolutionary principles - heterogeneity in cancer?, edited by Dr. Robert A. Gatenby. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.
An Ambystoma mexicanum EST sequencing project: analysis of 17,352 expressed sequence tags from embryonic and regenerating blastema cDNA libraries

PubMed Central

Habermann, Bianca; Bebin, Anne-Gaelle; Herklotz, Stephan; Volkmer, Michael; Eckelt, Kay; Pehlke, Kerstin; Epperlein, Hans Henning; Schackert, Hans Konrad; Wiebe, Glenis; Tanaka, Elly M

2004-01-01

Background The ambystomatid salamander, Ambystoma mexicanum (axolotl), is an important model organism in evolutionary and regeneration research but relatively little sequence information has so far been available. This is a major limitation for molecular studies on caudate development, regeneration and evolution. To address this lack of sequence information we have generated an expressed sequence tag (EST) database for A. mexicanum. Results Two cDNA libraries, one made from stage 18-22 embryos and the other from day-6 regenerating tail blastemas, generated 17,352 sequences. From the sequenced ESTs, 6,377 contigs were assembled that probably represent 25% of the expressed genes in this organism. Sequence comparison revealed significant homology to entries in the NCBI non-redundant database. Further examination of this gene set revealed the presence of genes involved in important cell and developmental processes, including cell proliferation, cell differentiation and cell-cell communication. On the basis of these data, we have performed phylogenetic analysis of key cell-cycle regulators. Interestingly, while cell-cycle proteins such as the cyclin B family display expected evolutionary relationships, the cyclin-dependent kinase inhibitor 1 gene family shows an unusual evolutionary behavior among the amphibians. Conclusions Our analysis reveals the importance of a comprehensive sequence set from a representative of the Caudata and illustrates that the EST sequence database is a rich source of molecular, developmental and regeneration studies. To aid in data mining, the ESTs have been organized into an easily searchable database that is freely available online. PMID:15345051
Genome-Wide Search Identifies 1.9 Mb from the Polar Bear Y Chromosome for Evolutionary Analyses

PubMed Central

Bidon, Tobias; Schreck, Nancy; Hailer, Frank; Nilsson, Maria A.; Janke, Axel

2015-01-01

The male-inherited Y chromosome is the major haploid fraction of the mammalian genome, rendering Y-linked sequences an indispensable resource for evolutionary research. However, despite recent large-scale genome sequencing approaches, only a handful of Y chromosome sequences have been characterized to date, mainly in model organisms. Using polar bear (Ursus maritimus) genomes, we compare two different in silico approaches to identify Y-linked sequences: 1) Similarity to known Y-linked genes and 2) difference in the average read depth of autosomal versus sex chromosomal scaffolds. Specifically, we mapped available genomic sequencing short reads from a male and a female polar bear against the reference genome and identify 112 Y-chromosomal scaffolds with a combined length of 1.9 Mb. We verified the in silico findings for the longer polar bear scaffolds by male-specific in vitro amplification, demonstrating the reliability of the average read depth approach. The obtained Y chromosome sequences contain protein-coding sequences, single nucleotide polymorphisms, microsatellites, and transposable elements that are useful for evolutionary studies. A high-resolution phylogeny of the polar bear patriline shows two highly divergent Y chromosome lineages, obtained from analysis of the identified Y scaffolds in 12 previously published male polar bear genomes. Moreover, we find evidence of gene conversion among ZFX and ZFY sequences in the giant panda lineage and in the ancestor of ursine and tremarctine bears. Thus, the identification of Y-linked scaffold sequences from unordered genome sequences yields valuable data to infer phylogenomic and population-genomic patterns in bears. PMID:26019166
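The read-depth approach can be sketched as a comparison of female and male coverage per scaffold: after normalizing each sample by its autosomal depth, Y-linked scaffolds should show male coverage but almost no female coverage. The thresholds, labels, and function name below are illustrative assumptions, not the values used in the study.

```python
def classify_scaffold(male_depth, female_depth,
                      min_male_depth=1.0, y_max_ratio=0.3, x_min_ratio=1.5):
    """Classify a scaffold from normalized mean read depths.

    Expected female/male depth ratios: about 0 for Y-linked scaffolds,
    about 1 for autosomal ones, and about 2 for X-linked ones.
    All thresholds are hypothetical and would need tuning on real data.
    """
    if male_depth < min_male_depth:
        return "low-coverage"
    ratio = female_depth / male_depth
    if ratio <= y_max_ratio:
        return "Y-like"
    if ratio >= x_min_ratio:
        return "X-like"
    return "autosomal-like"


for male, female in [(10.0, 0.2), (20.0, 20.5), (10.0, 20.0), (0.1, 0.0)]:
    print(male, female, classify_scaffold(male, female))
```

Candidates flagged this way would still need confirmation, for example by the male-specific in vitro amplification the abstract reports for the longer scaffolds.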

Evolution of microbes and viruses: a paradigm shift in evolutionary biology?

PubMed Central

Koonin, Eugene V.; Wolf, Yuri I.

2012-01-01

When Charles Darwin formulated the central principles of evolutionary biology in the Origin of Species in 1859 and the architects of the Modern Synthesis integrated these principles with population genetics almost a century later, the principal if not the sole objects of evolutionary biology were multicellular eukaryotes, primarily animals and plants. Before the advent of efficient gene sequencing, all attempts to extend evolutionary studies to bacteria were futile. Sequencing of the rRNA genes in thousands of microbes allowed the construction of the three-domain “ribosomal Tree of Life” that was widely thought to have resolved the evolutionary relationships between the cellular life forms. However, subsequent massive sequencing of numerous, complete microbial genomes revealed novel evolutionary phenomena, the most fundamental of these being: (1) pervasive horizontal gene transfer (HGT), in large part mediated by viruses and plasmids, that shapes the genomes of archaea and bacteria and calls for a radical revision (if not abandonment) of the Tree of Life concept, (2) Lamarckian-type inheritance that appears to be critical for antivirus defense and other forms of adaptation in prokaryotes, and (3) evolution of evolvability, i.e., dedicated mechanisms for evolution such as vehicles for HGT and stress-induced mutagenesis systems. In the non-cellular part of the microbial world, phylogenomics and metagenomics of viruses and related selfish genetic elements revealed enormous genetic and molecular diversity and extremely high abundance of viruses that come across as the dominant biological entities on earth. Furthermore, the perennial arms race between viruses and their hosts is one of the defining factors of evolution. Thus, microbial phylogenomics adds new dimensions to the fundamental picture of evolution even as the principle of descent with modification discovered by Darwin and the laws of population genetics remain at the core of evolutionary biology.
PMID:22993722
Emerging Concepts of Data Integration in Pathogen Phylodynamics

PubMed Central

Baele, Guy; Suchard, Marc A.; Rambaut, Andrew; Lemey, Philippe

2017-01-01

Phylodynamics has become an increasingly popular statistical framework to extract evolutionary and epidemiological information from pathogen genomes. By harnessing such information, epidemiologists aim to shed light on the spatio-temporal patterns of spread and to test hypotheses about the underlying interaction of evolutionary and ecological dynamics in pathogen populations. Although the field has witnessed a rich development of statistical inference tools with increasing levels of sophistication, these tools initially focused on sequences as their sole primary data source. Integrating various sources of information, however, promises to deliver more precise insights in infectious diseases and to increase opportunities for statistical hypothesis testing. Here, we review how the emerging concept of data integration is stimulating new advances in Bayesian evolutionary inference methodology which formalize a marriage of statistical thinking and evolutionary biology. These approaches include connecting sequence to trait evolution, such as for host, phenotypic and geographic sampling information, but also the incorporation of covariates of evolutionary and epidemic processes in the reconstruction procedures. We highlight how a full Bayesian approach to covariate modeling and testing can generate further insights into sequence evolution, trait evolution, and population dynamics in pathogen populations. Specific examples demonstrate how such approaches can be used to test the impact of host on rabies and HIV evolutionary rates, to identify the drivers of influenza dispersal as well as the determinants of rabies cross-species transmissions, and to quantify the evolutionary dynamics of influenza antigenicity. Finally, we briefly discuss how data integration is now also permeating through the inference of transmission dynamics, leading to novel insights into tree-generative processes and detailed reconstructions of transmission trees. 
[Bayesian inference; birth–death models; coalescent models; continuous trait evolution; covariates; data integration; discrete trait evolution; pathogen phylodynamics.] PMID:28173504
Phylogenomic evidence for a recent and rapid radiation of lizards in the Patagonian Liolaemus fitzingerii species group.

PubMed

Grummer, Jared A; Morando, Mariana M; Avila, Luciano J; Sites, Jack W; Leaché, Adam D

2018-08-01

Rapid evolutionary radiations are difficult to resolve because divergence events are nearly synchronous and gene flow among nascent species can be high, resulting in a phylogenetic "bush". Large datasets composed of sequence loci from across the genome can potentially help resolve some of these difficult phylogenetic problems. A suitable test case is the Liolaemus fitzingerii species group of lizards, which includes twelve species that are broadly distributed in Argentinean Patagonia. The species in the group have had a complex evolutionary history that has led to high morphological variation and unstable taxonomy. We generated a sequence capture dataset for 28 ingroup individuals of 580 nuclear loci, alongside a mitogenomic dataset, to infer phylogenetic relationships among species in this group. Relationships among species were generally weakly supported with the nuclear data, and along with an inferred age of ∼2.6 million years old, indicate either rapid evolution, hybridization, incomplete lineage sorting, non-informative data, or a combination thereof. We inferred a signal of mito-nuclear discordance, indicating potential hybridization between L. melanops and L. martorii, and phylogenetic network analyses provided support for 5 reticulation events among species. Phasing the nuclear loci did not provide additional insight into relationships or suspected patterns of hybridization. Only one clade, composed of L. camarones, L. fitzingerii, and L. xanthoviridis was recovered across all analyses. Genomic datasets provide molecular systematists with new opportunities to resolve difficult phylogenetic problems, yet the lack of phylogenetic resolution in Patagonian Liolaemus is biologically meaningful and indicative of a recent and rapid evolutionary radiation. The phylogenetic relationships of the Liolaemus fitzingerii group may be best modeled as a reticulated network instead of a bifurcating phylogeny. Copyright © 2018 Elsevier Inc. All rights reserved.
Evolution of the Phosphoenolpyruvate Carboxylase Protein Kinase Family in C3 and C4 Flaveria spp.

PubMed Central

Aldous, Sophia H.; Weise, Sean E.; Sharkey, Thomas D.; Waldera-Lupa, Daniel M.; Stühler, Kai; Mallmann, Julia; Groth, Georg; Gowik, Udo; Westhoff, Peter; Arsova, Borjana

2014-01-01

The key enzyme for C4 photosynthesis, Phosphoenolpyruvate Carboxylase (PEPC), evolved from nonphotosynthetic PEPC found in C3 ancestors. In all plants, PEPC is phosphorylated by Phosphoenolpyruvate Carboxylase Protein Kinase (PPCK). However, differences in the phosphorylation pattern exist among plants with these photosynthetic types, and it is still not clear if they are due to interspecies differences or depend on photosynthetic type. The genus Flaveria contains closely related C3, C3-C4 intermediate, and C4 species, which are evolutionarily young and thus well suited for comparative analysis. To characterize the evolutionary differences in PPCK between plants with C3 and C4 photosynthesis, transcriptome libraries from nine Flaveria spp. were used, and a two-member PPCK family (PPCKA and PPCKB) was identified. Sequence analysis identified a number of C3- and C4-specific residues with various occurrences in the intermediates. Quantitative analysis of transcriptome data revealed that PPCKA and PPCKB exhibit inverse diel expression patterns and that C3 and C4 Flaveria spp. differ in the expression levels of these genes. PPCKA has maximal expression levels during the day, whereas PPCKB has maximal expression during the night. Phosphorylation patterns of PEPC varied among C3 and C4 Flaveria spp. too, with PEPC from the C4 species being predominantly phosphorylated throughout the day, while in the C3 species the phosphorylation level was maintained during the entire 24 h. Since C4 Flaveria spp. evolved from C3 ancestors, this work links the evolutionary changes in sequence, PPCK expression, and phosphorylation pattern to an evolutionary phase shift of kinase activity from a C3 to a C4 mode. PMID:24850859
Low Frequency Variants, Collapsed Based on Biological Knowledge, Uncover Complexity of Population Stratification in 1000 Genomes Project Data

PubMed Central

Moore, Carrie B.; Wallace, John R.; Wolfe, Daniel J.; Frase, Alex T.; Pendergrass, Sarah A.; Weiss, Kenneth M.; Ritchie, Marylyn D.

2013-01-01

Analyses investigating low frequency variants have the potential for explaining additional genetic heritability of many complex human traits. However, the natural frequencies of rare variation between human populations strongly confound genetic analyses. We have applied a novel collapsing method to identify biological features with low frequency variant burden differences in thirteen populations sequenced by the 1000 Genomes Project. Our flexible collapsing tool utilizes expert biological knowledge from multiple publicly available database sources to direct feature selection. Variants were collapsed according to genetically driven features, such as evolutionarily conserved regions, regulatory regions, genes, and pathways. We have conducted an extensive comparison of low frequency variant burden differences (MAF<0.03) between populations from 1000 Genomes Project Phase I data. We found that on average 26.87% of gene bins, 35.47% of intergenic bins, 42.85% of pathway bins, 14.86% of ORegAnno regulatory bins, and 5.97% of evolutionarily conserved regions show statistically significant differences in low frequency variant burden across populations from the 1000 Genomes Project. The proportion of bins with significant differences in low frequency burden depends on the ancestral similarity of the two populations compared and types of features tested. Even closely related populations had notable differences in low frequency burden, but fewer differences than populations from different continents. Furthermore, conserved or functionally relevant regions had fewer significant differences in low frequency burden than regions under less evolutionary constraint. This degree of low frequency variant differentiation across diverse populations and feature elements highlights the critical importance of considering population stratification in the new era of DNA sequencing and low frequency variant genomic analyses. PMID:24385916
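As a rough illustration of what collapsing variants into feature bins means, the sketch below counts low frequency variants (MAF<0.03, the cutoff quoted in this record) that fall inside named genomic intervals. It is not the authors' tool: the function name, the tuple layout of variants and bins, and the toy coordinates are all assumptions, and the statistical comparison of burden between populations is left out.

```python
from collections import defaultdict


def collapse_low_frequency(variants, bins, maf_threshold=0.03):
    """Count low frequency variants per feature bin.

    variants: iterable of (chrom, position, minor_allele_frequency)
    bins:     dict mapping a bin name to (chrom, start, end), inclusive
    Returns a dict mapping each bin name to its low frequency variant count.
    """
    burden = defaultdict(int)
    for chrom, pos, maf in variants:
        # Keep only variants that are present (MAF > 0) but rare.
        if not 0 < maf < maf_threshold:
            continue
        for name, (b_chrom, start, end) in bins.items():
            if chrom == b_chrom and start <= pos <= end:
                burden[name] += 1
    # Report every bin, including those with zero burden.
    return {name: burden[name] for name in bins}


# Invented toy data: three bins and four variants.
bins = {
    "geneA": ("1", 50, 200),
    "geneB": ("1", 800, 1000),
    "pathwayP": ("2", 1, 100),
}
variants = [
    ("1", 100, 0.01),   # rare, inside geneA
    ("1", 150, 0.20),   # common, ignored
    ("1", 900, 0.02),   # rare, inside geneB
    ("2", 50, 0.005),   # rare, inside pathwayP
]
print(collapse_low_frequency(variants, bins))
```

Running the same counting per population would give the per-bin burdens whose differences the record describes testing for significance.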
Comparative modeling without implicit sequence alignments.

PubMed

Kolinski, Andrzej; Gront, Dominik

2007-10-01

The number of known protein sequences is about a thousand times larger than the number of experimentally solved 3D structures. For more than half of the protein sequences a close or distant structural analog could be identified. The key starting point in classical comparative modeling is to generate the best possible sequence alignment with a template or templates. With decreasing sequence similarity, the number of errors in the alignments increases and these errors are the main causes of the decreasing accuracy of the molecular models generated. Here we propose a new approach to comparative modeling, which does not require the implicit alignment - the model building phase explores geometric, evolutionary and physical properties of a template (or templates). The proposed method requires prior identification of a template, although the initial sequence alignment is ignored. The model is built using a very efficient reduced representation search engine CABS to find the best possible superposition of the query protein onto the template represented as a 3D multi-featured scaffold. The criteria used include: sequence similarity, predicted secondary structure consistency, local geometric features and hydrophobicity profile. For more difficult cases, the new method qualitatively outperforms existing schemes of comparative modeling. The algorithm unifies de novo modeling, 3D threading and sequence-based methods. The main idea is general and could be easily combined with other efficient modeling tools such as Rosetta, UNRES and others.
The evolution of massive stars and their spectra. I. A non-rotating 60 M⊙ star from the zero-age main sequence to the pre-supernova stage

NASA Astrophysics Data System (ADS)

Groh, Jose H.; Meynet, Georges; Ekström, Sylvia; Georgy, Cyril

2014-04-01

For the first time, the interior and spectroscopic evolution of a massive star is analyzed from the zero-age main sequence (ZAMS) to the pre-supernova (SN) stage. For this purpose, we combined stellar evolution models using the Geneva code and stellar atmospheric/wind models using CMFGEN. With our approach, we were able to produce observables, such as a synthetic high-resolution spectrum and photometry, thereby aiding the comparison between evolution models and observed data. Here we analyze the evolution of a non-rotating 60 M⊙ star and its spectrum throughout its lifetime. Interestingly, the star has a supergiant appearance (luminosity class I) even at the ZAMS. We find the following evolutionary sequence of spectral types: O3 I (at the ZAMS), O4 I (middle of the H-core burning phase), B supergiant (BSG), B hypergiant (BHG), hot luminous blue variable (LBV; end of H-core burning), cool LBV (H-shell burning through the beginning of the He-core burning phase), rapid evolution through late WN and early WN, early WC (middle of He-core burning), and WO (end of He-core burning until core collapse). We find the following spectroscopic phase lifetimes: 3.22 × 10^6 yr for the O-type, 0.34 × 10^5 yr (BSG), 0.79 × 10^5 yr (BHG), 2.35 × 10^5 yr (LBV), 1.05 × 10^5 yr (WN), 2.57 × 10^5 yr (WC), and 3.80 × 10^4 yr (WO). Compared to previous studies, we find a much longer (shorter) duration for the early WN (late WN) phase, as well as a long-lived LBV phase. We show that LBVs arise naturally in single-star evolution models at the end of the MS when the mass-loss rate increases as a consequence of crossing the bistability limit. We discuss the evolution of the spectra, magnitudes, colors, and ionizing flux across the star's lifetime, and the way they are related to the evolution of the interior.
We find that the absolute magnitude of the star typically changes by ~6 mag in optical filters across the evolution, with the star becoming significantly fainter in optical filters at the end of the evolution, when it becomes a WO just a few 10^4 years before the SN explosion. We also discuss the origin of the different spectroscopic phases (i.e., O-type, LBV, WR) and how they are related to evolutionary phases (H-core burning, H-shell burning, He-core burning). Tables 1, 4 and 5 are available in electronic form at http://www.aanda.org. Synthetic spectra are only available at the CDS via anonymous ftp to http://cdsarc.u-strasbg.fr (ftp://130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/qcat?J/A+A/564/A30
The Ancient Evolutionary History of Polyomaviruses

PubMed Central

Buck, Christopher B.; Van Doorslaer, Koenraad; Peretti, Alberto; Geoghegan, Eileen M.; Tisza, Michael J.; An, Ping; Katz, Joshua P.; Pipas, James M.; McBride, Alison A.; Camus, Alvin C.; McDermott, Alexa J.; Dill, Jennifer A.; Delwart, Eric; Ng, Terry F. F.; Farkas, Kata; Austin, Charlotte; Kraberger, Simona; Davison, William; Pastrana, Diana V.; Varsani, Arvind

2016-01-01

Polyomaviruses are a family of DNA tumor viruses that are known to infect mammals and birds. To investigate the deeper evolutionary history of the family, we used a combination of viral metagenomics, bioinformatics, and structural modeling approaches to identify and characterize polyomavirus sequences associated with fish and arthropods. Analyses drawing upon the divergent new sequences indicate that polyomaviruses have been gradually co-evolving with their animal hosts for at least half a billion years. Phylogenetic analyses of individual polyomavirus genes suggest that some modern polyomavirus species arose after ancient recombination events involving distantly related polyomavirus lineages. The improved evolutionary model provides a useful platform for developing a more accurate taxonomic classification system for the viral family Polyomaviridae. PMID:27093155
Low-mass X-ray binary evolution and the origin of millisecond pulsars

NASA Technical Reports Server (NTRS)

Frank, Juhan; King, Andrew R.; Lasota, Jean-Pierre

1992-01-01

The evolution of low-mass X-ray binaries (LMXBs) is considered. It is shown that X-ray irradiation of the companion stars causes these systems to undergo episodes of rapid mass transfer followed by detached phases. The systems are visible as bright X-ray binaries only for a short part of each cycle, so that their space density must be considerably larger than previously estimated. This removes the difficulty in regarding LMXBs as the progenitors of low-mass binary pulsars. The low-accretion-rate phase of the cycle with the soft X-ray transients is identified. It is shown that 3 hr is likely to be the minimum orbital period for LMXBs with main-sequence companions and it is suggested that the evolutionary endpoint for many LMXBs may be systems which are the sites of gamma-ray bursts.
Assessing fluctuating evolutionary pressure in yeast and mammal evolutionary rate covariation using bioinformatics of meiotic protein genetic sequences

NASA Astrophysics Data System (ADS)

Dehipawala, Sunil; Nguyen, A.; Tremberger, G.; Cheung, E.; Holden, T.; Lieberman, D.; Cheung, T.

2013-09-01

The evolutionary rate co-variation in meiotic proteins has been reported for yeast and mammals using phylogenetic branch lengths which assess retention, duplication and mutation. The bioinformatics of the corresponding DNA sequences could be classified as a diagram of fractal dimension and Shannon entropy. Results from biomedical gene research provide examples on the diagram methodology. The identification of adaptive selection using an entropy marker and functional-structural diversity using fractal dimension would support a regression analysis where the coefficient of determination would serve as an evolutionary pathway marker for DNA sequences and be an important component in the astrobiology community. Comparisons between biomedical genes such as EEF2 (elongation factor 2 human, mouse, etc), WDR85 in epigenetics, HAR1 in human specificity, clinical trial targeted cancer gene CD47, SIRT6 in spermatogenesis, and HLA-C in mosquito bite immunology demonstrate the diagram classification methodology. Comparisons to the SEPT4-XIAP pair in stem cell apoptosis, testes-expressed taste genes TAS1R3-GNAT3 pair, and amyloid beta APLP1-APLP2 pair with the yeast-mammal DNA sequences for meiotic proteins RAD50-MRE11 pair and NCAPD2-ICK pair have accounted for the observed fluctuating evolutionary pressure systematically. Regression with high R-sq values or a triangular-like cluster pattern for concordant pairs in co-variation among the studied species could serve as evidence for the possible location of common ancestors in the entropy-fractal dimension diagram, consistent with an example of the human-chimp common ancestor study using the FOXP2 regulated genes reported in human fetal brain study. The Deinococcus radiodurans R1 Rad-A could be viewed as an outlier in the RAD50 diagram and also in the free energy versus fractal dimension regression Cook's distance, consistent with a non-Earth source for this radiation resistant bacterium.
Convergent and divergent fluctuating evolutionary pressure could be studied with extension to genetic sequences in organisms in possible astrobiology conditions, with the assumption that the continuation of a book of life would require meiotic proteins everywhere in the universe.
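The record above uses Shannon entropy as one axis of its classification diagram but does not say how the entropy was computed. A minimal sketch, assuming entropy is taken over the frequencies of k-mers (single nucleotides when k=1) in a DNA sequence, might look like this; the function name and the k-mer choice are assumptions, and the fractal dimension axis is not covered.

```python
import math
from collections import Counter


def shannon_entropy(sequence, k=1):
    """Shannon entropy (in bits) of the k-mer frequencies of a sequence.

    With k=1 and the four nucleotides, the value ranges from 0 (a single
    repeated base) to 2 bits (all four bases equally frequent).
    """
    sequence = sequence.upper()
    kmers = [sequence[i:i + k] for i in range(len(sequence) - k + 1)]
    total = len(kmers)
    counts = Counter(kmers)
    # Sum -p * log2(p) over observed k-mers; unobserved ones contribute 0.
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


print(shannon_entropy("ACGT"))   # four equally frequent bases
print(shannon_entropy("AACC"))   # two equally frequent bases
```

Entropy values computed this way for each gene could then be paired with a second per-sequence measure to place sequences on a two-axis diagram of the kind the record describes.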
Partial sequence homogenization in the 5S multigene families may generate sequence chimeras and spurious results in phylogenetic reconstructions.

PubMed

Galián, José A; Rosato, Marcela; Rosselló, Josep A

2014-03-01

Multigene families have provided opportunities for evolutionary biologists to assess molecular evolution processes and phylogenetic reconstructions at deep and shallow systematic levels. However, the use of these markers is not free of technical and analytical challenges. Many evolutionary studies that used the nuclear 5S rDNA gene family rarely used contiguous 5S coding sequences due to the routine use of head-to-tail polymerase chain reaction primers that are anchored to the coding region. Moreover, the 5S coding sequences have been concatenated with independent, adjacent gene units in many studies, creating simulated chimeric genes as the raw data for evolutionary analysis. This practice is based on the tacitly assumed, but rarely tested, hypothesis that strict intra-locus concerted evolution processes are operating in 5S rDNA genes, without any empirical evidence as to whether it holds for the recovered data. The potential pitfalls of analysing the patterns of molecular evolution and reconstructing phylogenies based on these chimeric genes have not been assessed to date. Here, we compared the sequence integrity and phylogenetic behavior of entire versus concatenated 5S coding regions from a real data set obtained from closely related plant species (Medicago, Fabaceae). Our results suggest that within arrays sequence homogenization is partially operating in the 5S coding region, which is traditionally assumed to be highly conserved. Consequently, concatenating 5S genes increases haplotype diversity, generating novel chimeric genotypes that most likely do not exist within the genome. In addition, the patterns of gene evolution are distorted, leading to incorrect haplotype relationships in some evolutionary reconstructions.
Evolutionary genomics and HIV restriction factors.

PubMed

Pyndiah, Nitisha; Telenti, Amalio; Rausell, Antonio

2015-03-01

To provide updated insights into innate antiviral immunity and highlight prototypical evolutionary features of well characterized HIV restriction factors. Recently, a new HIV restriction factor, Myxovirus resistance 2, has been discovered and the region/residue responsible for its activity identified using an evolutionary approach. Furthermore, IFI16, an innate immunity protein known to sense several viruses, has been shown to contribute to the defense to HIV-1 by causing cell death upon sensing HIV-1 DNA. Restriction factors against HIV show characteristic signatures of positive selection. Different patterns of accelerated sequence evolution can distinguish antiviral strategies--offense or defence--as well as the level of specificity of the antiviral properties. Sequence analysis of primate orthologs of restriction factors serves to localize functional domains and sites responsible for antiviral action. We use recent discoveries to illustrate how evolutionary genomic analyses help identify new antiviral genes and their mechanisms of action.
Toxin structures as evolutionary tools: Using conserved 3D folds to study the evolution of rapidly evolving peptides.

PubMed

Undheim, Eivind A B; Mobli, Mehdi; King, Glenn F

2016-06-01

Three-dimensional (3D) structures have been used to explore the evolution of proteins for decades, yet they have rarely been utilized to study the molecular evolution of peptides. Here, we highlight areas in which 3D structures can be particularly useful for studying the molecular evolution of peptide toxins. Although we focus our discussion on animal toxins, including one of the most widespread disulfide-rich peptide folds known, the inhibitor cystine knot, our conclusions should be widely applicable to studies of the evolution of disulfide-constrained peptides. We show that conserved 3D folds can be used to identify evolutionary links and test hypotheses regarding the evolutionary origin of peptides with extremely low sequence identity; construct accurate multiple sequence alignments; and better understand the evolutionary forces that drive the molecular evolution of peptides. © 2016 WILEY Periodicals, Inc.
Evolutionary sequences and isochrones for low- and intermediate-mass stars

NASA Astrophysics Data System (ADS)

Panei, J.; Baume, G.

2016-08-01

We present theoretical evolutionary sequences for low- and intermediate-mass stars. The masses calculated range from 1.7 to 10 M⊙. The initial chemical composition is . In addition, we have taken into account a nuclear network with 17 isotopes and 34 nuclear reactions. With respect to mixing, we considered overshooting with a parameter . The evolutionary calculations were initialized from the Hayashi instability region, so that pre-main-sequence isochrones could also be calculated.
Characterization of irritans mariner-like elements in the olive fruit fly Bactrocera oleae (Diptera: Tephritidae): evolutionary implications.

PubMed

Ben Lazhar-Ajroud, Wafa; Caruso, Aurore; Mezghani, Maha; Bouallegue, Maryem; Tastard, Emmanuelle; Denis, Françoise; Rouault, Jacques-Deric; Makni, Hanem; Capy, Pierre; Chénais, Benoît; Makni, Mohamed; Casse, Nathalie

2016-08-01

Genomic variation among species is commonly driven by transposable element (TE) invasion; thus, the pattern of TEs in a genome allows drawing an evolutionary history of the studied species. This paper reports in vitro and in silico detection and characterization of irritans mariner-like elements (MLEs) in the genome and transcriptome of Bactrocera oleae (Rossi) (Diptera: Tephritidae). Eleven irritans MLE sequences have been isolated in vitro using terminal inverted repeats (TIRs) as primers, and 215 have been extracted in silico from the sequenced genome of B. oleae. Additionally, the sequenced genomes of Bactrocera tryoni (Froggatt) and Bactrocera cucurbitae (Diptera: Tephritidae) have been explored to identify irritans MLEs. A total of 129 sequences from B. tryoni have been extracted, while the genome of B. cucurbitae appears probably devoid of irritans MLEs. All detected irritans MLEs are defective due to several mutations and are clustered together in a monophyletic group suggesting a common ancestor. The evolutionary history and dynamics of these TEs are discussed in relation with the phylogenetic distribution of their hosts. The knowledge on the structure, distribution, dynamic, and evolution of irritans MLEs in Bactrocera species contributes to the understanding of both their evolutionary history and the invasion history of their hosts. This could also be the basis for genetic control strategies using transposable elements.
LS³: A Method for Improving Phylogenomic Inferences When Evolutionary Rates Are Heterogeneous among Taxa

PubMed Central

Rivera-Rivera, Carlos J.; Montoya-Burgos, Juan I.

2016-01-01

Phylogenetic inference artifacts can occur when sequence evolution deviates from assumptions made by the models used to analyze them. The combination of strong model assumption violations and highly heterogeneous lineage evolutionary rates can become problematic in phylogenetic inference, and lead to the well-described long-branch attraction (LBA) artifact. Here, we define an objective criterion for assessing lineage evolutionary rate heterogeneity among predefined lineages: the result of a likelihood ratio test between a model in which the lineages evolve at the same rate (homogeneous model) and a model in which different lineage rates are allowed (heterogeneous model). We implement this criterion in the algorithm Locus Specific Sequence Subsampling (LS³), aimed at reducing the effects of LBA in multi-gene datasets. For each gene, LS³ sequentially removes the fastest-evolving taxon of the ingroup and tests for lineage rate homogeneity until all lineages have uniform evolutionary rates. The sequences excluded from the homogeneously evolving taxon subset are flagged as potentially problematic. The software implementation provides the user with the possibility to remove the flagged sequences for generating a new concatenated alignment. We tested LS³ with simulations and two real datasets containing LBA artifacts: a nucleotide dataset regarding the position of Glires within mammals and an amino-acid dataset concerning the position of nematodes within bilaterians. The initially incorrect phylogenies were corrected in all cases upon removing data flagged by LS³. PMID:26912812
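The per-gene loop described for LS³ (test for rate homogeneity, drop the fastest-evolving ingroup taxon, repeat) can be sketched as follows. This is not the LS³ software: the function and parameter names are hypothetical, the two log-likelihood callables stand in for whatever phylogenetic software fits the homogeneous and heterogeneous rate models, and the default critical value of 3.84 assumes a chi-square test at the 5% level with one degree of freedom, which would need adjusting to the actual difference in free parameters.

```python
def ls3_style_subsample(taxa, rate_of, loglik_homogeneous,
                        loglik_heterogeneous, critical_value=3.84,
                        min_taxa=3):
    """Remove fast-evolving taxa until a likelihood ratio test accepts
    rate homogeneity.

    taxa:                 ingroup taxon names for one gene
    rate_of:              callable giving a taxon's evolutionary rate
    loglik_homogeneous:   callable(taxa) -> log-likelihood, single rate
    loglik_heterogeneous: callable(taxa) -> log-likelihood, free rates
    Returns (kept, flagged, homogeneous_reached).
    """
    kept = list(taxa)
    flagged = []
    while len(kept) >= min_taxa:
        # Likelihood ratio statistic between the nested models.
        lrt = 2.0 * (loglik_heterogeneous(kept) - loglik_homogeneous(kept))
        if lrt <= critical_value:
            return kept, flagged, True
        # Homogeneity rejected: flag the fastest-evolving taxon.
        fastest = max(kept, key=rate_of)
        kept.remove(fastest)
        flagged.append(fastest)
    return kept, flagged, False


# Toy stand-ins (invented): the homogeneous model is penalized by the
# spread of the rates, the heterogeneous model fits them perfectly.
rates = {"a": 1.0, "b": 1.1, "c": 0.9, "d": 5.0}


def toy_homogeneous(names):
    mean = sum(rates[n] for n in names) / len(names)
    return -10.0 * sum((rates[n] - mean) ** 2 for n in names)


def toy_heterogeneous(names):
    return 0.0


print(ls3_style_subsample(list(rates), rates.get,
                          toy_homogeneous, toy_heterogeneous))
```

In this toy run only taxon "d" is flagged; as in the record, flagged sequences are candidates for removal before building a new concatenated alignment, not automatically discarded.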
Genomic V exons from whole genome shotgun data in reptiles.

PubMed

Olivieri, D N; von Haeften, B; Sánchez-Espinel, C; Faro, J; Gambón-Deza, F

2014-08-01

Reptiles and mammals diverged over 300 million years ago, creating two parallel evolutionary lineages amongst terrestrial vertebrates. In reptiles, two main evolutionary lines emerged: one gave rise to Squamata, while the other gave rise to Testudines, Crocodylia, and Aves. In this study, we determined the genomic variable (V) exons from whole genome shotgun sequencing (WGS) data in reptiles corresponding to the three main immunoglobulin (IG) loci and the four main T cell receptor (TR) loci. We show that Squamata lack the TRG and TRD genes, and snakes lack the IGKV genes. In representative species of Testudines and Crocodylia, the seven major IG and TR loci are maintained. As in mammals, genes of the IG loci can be grouped into well-defined IMGT clans through a multi-species phylogenetic analysis. We show that the reptilian IGHV and IGLV genes are distributed amongst the established mammalian clans, while their IGKV genes are found within a single clan, nearly exclusive from the mammalian sequences. The reptilian and mammalian TRAV genes cluster into six common evolutionary clades (since IMGT clans have not been defined for TR). In contrast, the reptilian TRBV genes cluster into three clades, which have few mammalian members. In this locus, the V exon sequences from mammals appear to have undergone different evolutionary diversification processes that occurred outside these shared reptilian clans. These sequences can be obtained in a freely available public repository (http://vgenerepertoire.org).
Grand challenges in evolutionary and population genetics: The importance of integrating epigenetics, genomics, modeling, and experimentation

Treesearch

Samuel A. Cushman

2014-01-01

This is a time of explosive growth in the fields of evolutionary and population genetics, with whole genome sequencing and bioinformatics driving a transformative paradigm shift (Morozova and Marra, 2008). At the same time, advances in epigenetics are thoroughly transforming our understanding of evolutionary processes and their implications for populations, species and...
Atomic diffusion and mixing in old stars. V. A deeper look into the globular cluster NGC 6752

NASA Astrophysics Data System (ADS)

Gruyters, Pieter; Nordlander, Thomas; Korn, Andreas J.

2014-07-01

Context. Abundance trends in heavier elements with evolutionary phase have been shown to exist in the globular cluster NGC 6752 ([Fe / H] = -1.6). These trends are a result of atomic diffusion and additional (non-convective) mixing. Studying such trends can provide us with important constraints on the extent to which diffusion modifies the internal structure and surface abundances of solar-type, metal-poor stars. Aims: Taking advantage of a larger data sample, we investigate the reality and the size of these abundance trends and address questions and potential biases associated with the various stellar populations that make up NGC 6752. Methods: We perform an abundance analysis by combining photometric and spectroscopic data of 194 stars located between the turnoff point and the base of the red giant branch. Stellar parameters are derived from uvby Strömgren photometry. Using the quantitative-spectroscopy package SME, stellar surface abundances for light elements such as Li, Na, Mg, Al, and Si as well as heavier elements such as Ca, Ti, and Fe are derived in an automated way by fitting synthetic spectra to individual lines in the stellar spectra, obtained with the VLT/FLAMES-GIRAFFE spectrograph. Results: Based on uvby Strömgren photometry, we are able to separate three stellar populations in NGC 6752 along the evolutionary sequence from the base of the red giant branch down to the turnoff point. We find weak systematic abundance trends with evolutionary phase for Ca, Ti, and Fe which are best explained by stellar-structure models including atomic diffusion with efficient additional mixing. We derive a new value for the initial lithium abundance of NGC 6752 after correcting for the effect of atomic diffusion and additional mixing which falls slightly below the predicted standard BBN value. Conclusions: We find three stellar populations by combining photometric and spectroscopic data of 194 stars in the globular cluster NGC 6752. 
Abundance trends for groups of elements, differently affected by atomic diffusion and additional mixing, are identified. Although the statistical significance of the individual trends is weak, they all support the notion that atomic diffusion is operational along the evolutionary sequence of NGC 6752. Based on data collected at the ESO telescopes under programs 079.D-0645(A) and 081.D-0253(A). Full Tables 2 and 8 are only available at the CDS via anonymous ftp to http://cdsarc.u-strasbg.fr (ftp://130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/qcat?J/A+A/567/A72

Biophysics of protein evolution and evolutionary protein biophysics

PubMed Central

Sikosek, Tobias; Chan, Hue Sun

2014-01-01

The study of molecular evolution at the level of protein-coding genes often entails comparing large datasets of sequences to infer their evolutionary relationships. Despite the importance of a protein's structure and conformational dynamics to its function and thus its fitness, common phylogenetic methods embody minimal biophysical knowledge of proteins. To underscore the biophysical constraints on natural selection, we survey effects of protein mutations, highlighting the physical basis for marginal stability of natural globular proteins and how requirement for kinetic stability and avoidance of misfolding and misinteractions might have affected protein evolution. The biophysical underpinnings of these effects have been addressed by models with an explicit coarse-grained spatial representation of the polypeptide chain. Sequence–structure mappings based on such models are powerful conceptual tools that rationalize mutational robustness, evolvability, epistasis, promiscuous function performed by ‘hidden’ conformational states, resolution of adaptive conflicts and conformational switches in the evolution from one protein fold to another. Recently, protein biophysics has been applied to derive more accurate evolutionary accounts of sequence data. Methods have also been developed to exploit sequence-based evolutionary information to predict biophysical behaviours of proteins. The success of these approaches demonstrates a deep synergy between the fields of protein biophysics and protein evolution. PMID:25165599
Ignoring heterozygous sites biases phylogenomic estimates of divergence times: implications for the evolutionary history of Microtus voles.

PubMed

Lischer, Heidi E L; Excoffier, Laurent; Heckel, Gerald

2014-04-01

Phylogenetic reconstruction of the evolutionary history of closely related organisms may be difficult because of the presence of unsorted lineages and of a relatively high proportion of heterozygous sites that are usually not handled well by phylogenetic programs. Genomic data may provide enough fixed polymorphisms to resolve phylogenetic trees, but the diploid nature of sequence data remains analytically challenging. Here, we performed a phylogenomic reconstruction of the evolutionary history of the common vole (Microtus arvalis) with a focus on the influence of heterozygosity on the estimation of intraspecific divergence times. We used genome-wide sequence information from 15 voles distributed across the European range. We provide a novel approach to integrate heterozygous information in existing phylogenetic programs by repeated random haplotype sampling from sequences with multiple unphased heterozygous sites. We evaluated the impact of the use of full, partial, or no heterozygous information for tree reconstructions on divergence time estimates. All results consistently showed four deep and strongly supported evolutionary lineages in the vole data. These lineages undergoing divergence processes split only at the end or after the last glacial maximum based on calibration with radiocarbon-dated paleontological material. However, the incorporation of information from heterozygous sites had a significant impact on absolute and relative branch length estimations. Ignoring heterozygous information led to an overestimation of divergence times between the evolutionary lineages of M. arvalis. We conclude that the exclusion of heterozygous sites from evolutionary analyses may cause biased and misleading divergence time estimates in closely related taxa.
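The repeated random haplotype sampling mentioned in this abstract might look roughly like the sketch below. It assumes, purely for illustration, that unphased heterozygous sites are encoded with two-base IUPAC ambiguity codes; the abstract does not state the encoding, and the function names are hypothetical.

```python
import random

# Two-base IUPAC ambiguity codes, assumed here to mark heterozygous sites.
IUPAC_HET = {"R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC"}


def sample_haplotype(genotype, rng):
    """Resolve each heterozygous site to one of its two alleles at random."""
    return "".join(
        rng.choice(IUPAC_HET[base]) if base in IUPAC_HET else base
        for base in genotype
    )


def sample_replicates(genotypes, n_replicates, seed=0):
    """Build n_replicates haplotype datasets from unphased diploid genotypes.

    genotypes -- dict mapping individual name to its genotype string
    Each replicate can be passed to a standard phylogenetic program.
    """
    rng = random.Random(seed)
    return [
        {name: sample_haplotype(seq, rng) for name, seq in genotypes.items()}
        for _ in range(n_replicates)
    ]
```

Because sites are resolved independently, the sampled haplotypes are not phased; running the downstream analysis on many replicates gives a spread of estimates rather than a single tree.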
Conservation of Endo16 expression in sea urchins despite evolutionary divergence in both cis and trans-acting components of transcriptional regulation

NASA Technical Reports Server (NTRS)

Romano, Laura A.; Wray, Gregory A.

2003-01-01

Evolutionary changes in transcriptional regulation undoubtedly play an important role in creating morphological diversity. However, there is little information about the evolutionary dynamics of cis-regulatory sequences. This study examines the functional consequence of evolutionary changes in the Endo16 promoter of sea urchins. The Endo16 gene encodes a large extracellular protein that is expressed in the endoderm and may play a role in cell adhesion. Its promoter has been characterized in exceptional detail in the purple sea urchin, Strongylocentrotus purpuratus. We have characterized the structure and function of the Endo16 promoter from a second sea urchin species, Lytechinus variegatus. The Endo16 promoter sequences have evolved in a strongly mosaic manner since these species diverged approximately 35 million years ago: the most proximal region (module A) is conserved, but the remaining modules (B-G) are unalignable. Despite extensive divergence in promoter sequences, the pattern of Endo16 transcription is largely conserved during embryonic and larval development. Transient expression assays demonstrate that 2.2 kb of upstream sequence in either species is sufficient to drive GFP reporter expression that correctly mimics this pattern of Endo16 transcription. Reciprocal cross-species transient expression assays imply that changes have also evolved in the set of transcription factors that interact with the Endo16 promoter. Taken together, these results suggest that stabilizing selection on the transcriptional output may have operated to maintain a similar pattern of Endo16 expression in S. purpuratus and L. variegatus, despite dramatic divergence in promoter sequence and mechanisms of transcriptional regulation.
Asteroseismology of ZZ Ceti stars with full evolutionary white dwarf models. II. The impact of AGB thermal pulses on the asteroseismic inferences of ZZ Ceti stars

NASA Astrophysics Data System (ADS)

De Gerónimo, F. C.; Althaus, L. G.; Córsico, A. H.; Romero, A. D.; Kepler, S. O.

2018-05-01

Context. The thermally pulsing phase on the asymptotic giant branch (TP-AGB) is the last nuclear burning phase experienced by most low- and intermediate-mass stars. During this phase, the outer chemical stratification above the C/O core of the emerging white dwarf (WD) is built up. The chemical structure resulting from progenitor evolution strongly impacts the whole pulsation spectrum exhibited by ZZ Ceti stars, which are pulsating C/O core white dwarfs located on a narrow instability strip at Teff ~ 12 000 K. Several physical processes occurring during progenitor evolution strongly affect the chemical structure of these stars; those found during the TP-AGB phase are the most relevant for the pulsational properties of ZZ Ceti stars. Aims: We present a study of the impact of the chemical structure built up during the TP-AGB evolution on the stellar parameters inferred from asteroseismological fits of ZZ Ceti stars. Methods: Our analysis is based on a set of carbon-oxygen core white dwarf models with masses from 0.534 to 0.6463 M⊙ derived from full evolutionary computations from the ZAMS to the ZZ Ceti domain. We computed evolutionary sequences that experience different numbers of thermal pulses (TP). Results: We find that the occurrence or not of thermal pulses during AGB evolution implies an average deviation in the asteroseismological effective temperature of ZZ Ceti stars of at most 8% and on the order of ≲5% in the stellar mass. For the mass of the hydrogen envelope, however, we find deviations up to 2 orders of magnitude in the case of cool ZZ Ceti stars. Hot and intermediate temperature ZZ Ceti stars show no differences in the hydrogen envelope mass in most cases. Conclusions: Our results show that, in general, the impact of the occurrence or not of thermal pulses in the progenitor stars is not negligible and must be taken into account in asteroseismological studies of ZZ Ceti stars.
Deciphering evolutionary strata on plant sex chromosomes and fungal mating-type chromosomes through compositional segmentation.

PubMed

Pandey, Ravi S; Azad, Rajeev K

2016-03-01

Sex chromosomes have evolved from a pair of homologous autosomes which differentiated into sex determination systems, such as the XY or ZW system, as a consequence of successive recombination suppression between the gametologous chromosomes. Identifying the regions of recombination suppression, namely, the "evolutionary strata", is central to understanding the history and dynamics of sex chromosome evolution. Evolution of sex chromosomes as a consequence of serial recombination suppressions is well-studied for mammals and birds, but not for plants, although 48 dioecious plants have already been reported. Only two plants, Silene latifolia and papaya, have been studied until now for the presence of evolutionary strata on their X chromosomes, made possible by the sequencing of sex-linked genes on both the X and Y chromosomes, which is a requirement of all current methods that determine stratum structure based on the comparison of gametologous sex chromosomes. To circumvent this limitation and detect strata even if only the sequence of the sex chromosome in the homogametic sex (i.e. X or Z chromosome) is available, we have developed an integrated segmentation and clustering method. In application to gene sequences on the papaya X chromosome and protein-coding sequences on the S. latifolia X chromosome, our method could decipher all known evolutionary strata, as reported by previous studies. Our method, after validation on known strata on the papaya and S. latifolia X chromosomes, was applied to chromosome 19 of Populus trichocarpa, an incipient sex chromosome, deciphering two, yet unknown, evolutionary strata. In addition, we applied this approach to the recently sequenced sex chromosome V of the brown alga Ectocarpus sp. 
that has a haploid sex determination system (UV system) recovering the sex determining and pseudoautosomal regions, and then to the mating-type chromosomes of an anther-smut fungus Microbotryum lychnidis-dioicae predicting five strata in the non-recombining region of both the chromosomes.
JCoDA: a tool for detecting evolutionary selection.

PubMed

Steinway, Steven N; Dannenfelser, Ruth; Laucius, Christopher D; Hayes, James E; Nayak, Sudhir

2010-05-27

The incorporation of annotated sequence information from multiple related species in commonly used databases (Ensembl, Flybase, Saccharomyces Genome Database, Wormbase, etc.) has increased dramatically over the last few years. This influx of information has provided a considerable amount of raw material for evaluation of evolutionary relationships. To aid in the process, we have developed JCoDA (Java Codon Delimited Alignment) as a simple-to-use visualization tool for the detection of site specific and regional positive/negative evolutionary selection amongst homologous coding sequences. JCoDA accepts user-inputted unaligned or pre-aligned coding sequences, performs a codon-delimited alignment using ClustalW, and determines the dN/dS calculations using PAML (Phylogenetic Analysis Using Maximum Likelihood, yn00 and codeml) in order to identify regions and sites under evolutionary selection. The JCoDA package includes a graphical interface for Phylip (Phylogeny Inference Package) to generate phylogenetic trees, manages formatting of all required file types, and streamlines passage of information between underlying programs. The raw data are output to user configurable graphs with sliding window options for straightforward visualization of pairwise or gene family comparisons. Additionally, codon-delimited alignments are output in a variety of common formats and all dN/dS calculations can be output in comma-separated value (CSV) format for downstream analysis. To illustrate the types of analyses that are facilitated by JCoDA, we have taken advantage of the well studied sex determination pathway in nematodes as well as the extensive sequence information available to identify genes under positive selection, examples of regional positive selection, and differences in selection based on the role of genes in the sex determination pathway. 
JCoDA is a configurable, open source, user-friendly visualization tool for performing evolutionary analysis on homologous coding sequences. JCoDA can be used to rapidly screen for genes and regions of genes under selection using PAML. It can be freely downloaded at http://www.tcnj.edu/~nayaklab/jcoda.
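The sliding-window idea described in this abstract can be illustrated with a small sketch that cuts an aligned coding sequence into overlapping windows on codon boundaries. It is not JCoDA code: the function name and parameters are hypothetical, and the sketch only prepares the windows; each window would still have to be handed to a dN/dS estimator such as those in PAML.

```python
def codon_windows(aligned_cds, window_codons, step_codons=1):
    """Split an aligned coding sequence into codon-delimited sliding windows.

    aligned_cds   -- nucleotide string whose length is a multiple of 3
    window_codons -- window width, in codons
    step_codons   -- shift between consecutive windows, in codons
    Returns a list of (start_codon_index, window_sequence) tuples.
    """
    if len(aligned_cds) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    n_codons = len(aligned_cds) // 3
    windows = []
    for start in range(0, n_codons - window_codons + 1, step_codons):
        # Slice on codon boundaries so no window splits a codon.
        windows.append((start, aligned_cds[3 * start:3 * (start + window_codons)]))
    return windows
```

Applying the same window coordinates to every sequence in the alignment keeps the windows comparable across a pairwise or gene-family comparison.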
JCoDA: a tool for detecting evolutionary selection

PubMed Central

2010-01-01

Background: The incorporation of annotated sequence information from multiple related species in commonly used databases (Ensembl, Flybase, Saccharomyces Genome Database, Wormbase, etc.) has increased dramatically over the last few years. This influx of information has provided a considerable amount of raw material for evaluation of evolutionary relationships. To aid in the process, we have developed JCoDA (Java Codon Delimited Alignment) as a simple-to-use visualization tool for the detection of site specific and regional positive/negative evolutionary selection amongst homologous coding sequences. Results: JCoDA accepts user-inputted unaligned or pre-aligned coding sequences, performs a codon-delimited alignment using ClustalW, and determines the dN/dS calculations using PAML (Phylogenetic Analysis Using Maximum Likelihood, yn00 and codeml) in order to identify regions and sites under evolutionary selection. The JCoDA package includes a graphical interface for Phylip (Phylogeny Inference Package) to generate phylogenetic trees, manages formatting of all required file types, and streamlines passage of information between underlying programs. The raw data are output to user configurable graphs with sliding window options for straightforward visualization of pairwise or gene family comparisons. Additionally, codon-delimited alignments are output in a variety of common formats and all dN/dS calculations can be output in comma-separated value (CSV) format for downstream analysis. To illustrate the types of analyses that are facilitated by JCoDA, we have taken advantage of the well studied sex determination pathway in nematodes as well as the extensive sequence information available to identify genes under positive selection, examples of regional positive selection, and differences in selection based on the role of genes in the sex determination pathway. 
Conclusions: JCoDA is a configurable, open source, user-friendly visualization tool for performing evolutionary analysis on homologous coding sequences. JCoDA can be used to rapidly screen for genes and regions of genes under selection using PAML. It can be freely downloaded at http://www.tcnj.edu/~nayaklab/jcoda. PMID:20507581
Insights into the evolution of enzyme substrate promiscuity after the discovery of (βα)₈ isomerase evolutionary intermediates from a diverse metagenome.

PubMed

Noda-García, Lianet; Juárez-Vázquez, Ana L; Ávila-Arcos, María C; Verduzco-Castro, Ernesto A; Montero-Morán, Gabriela; Gaytán, Paul; Carrillo-Tripp, Mauricio; Barona-Gómez, Francisco

2015-06-10

Current sequence-based approaches to identify enzyme functional shifts, such as enzyme promiscuity, have proven to be highly dependent on a priori functional knowledge, hampering our ability to reconstruct the evolutionary history behind these mechanisms. Hidden Markov Model (HMM) profiles, broadly used to classify enzyme families, can be useful to distinguish between closely related enzyme families with different specificities. The (βα)8-isomerase HisA/PriA enzyme family, involved in L-histidine (HisA, mono-substrate) biosynthesis in most bacteria and plants, but also in L-tryptophan (HisA/TrpF or PriA, dual-substrate) biosynthesis in most Actinobacteria, has been used as a model system to explore evolutionary hypotheses and therefore has a considerable amount of evolutionary, functional and structural knowledge available. We searched for functional evolutionary intermediates between the HisA and PriA enzyme families in order to understand the functional divergence between these families. We constructed an HMM profile that correctly classifies sequences of unknown function into the HisA and PriA enzyme sub-families. Using this HMM profile, we mined a large metagenome to identify plausible evolutionary intermediate sequences between HisA and PriA. These sequences were used to perform phylogenetic reconstructions and to identify functionally conserved amino acids. Biochemical characterization of one selected enzyme (CAM1) with a mutation within the functionally essential N-terminus phosphate-binding site, namely, an alanine instead of a glycine in HisA or a serine in PriA, showed that this evolutionary intermediate has dual-substrate specificity. Moreover, site-directed mutagenesis of this alanine residue, either backwards into a glycine or forward into a serine, revealed the robustness of this enzyme. None of these mutations, presumably upon functionally essential amino acids, significantly abolished its enzyme activities. 
A truncated version of this enzyme (CAM2) predicted to adopt a (βα)6-fold, and thus entirely lacking a C-terminus phosphate-binding site, was identified and shown to have HisA activity. As expected, reconstruction of the evolution of PriA from HisA with HMM profiles suggests that functional shifts involve mutations in evolutionarily intermediate enzymes of otherwise functionally essential residues or motifs. These results are in agreement with a link between promiscuous enzymes and intragenic epistasis. HMM provides a convenient approach for gaining insights into these evolutionary processes.
The development of the red giant branch. II - Astrophysical properties

NASA Technical Reports Server (NTRS)

Sweigart, Allen V.; Greggio, Laura; Renzini, Alvio

1990-01-01

Evolutionary sequences developed in another paper are used here to investigate the properties of the red giant branch (RGB) phase transition. Results are found for compositions in the range Y(MS) between 0.20 and 0.30 and Z between 0.004 and 0.04. The transition mass M(HeF) increases as either Y(MS) decreases or Z increases. The stellar population transition age t(HeF) is virtually independent of composition and close to 0.6 Gyr. The RGB phase transition occurs almost abruptly over a mass range of only a few tenths of a solar mass or, equivalently, over a time interval of about 0.2 Gyr in the life of a stellar population. During the RGB phase transition the core mass Mc at helium ignition increases very rapidly by about 0.15 solar mass, while the luminosity at the tip of the RGB increases by about one order of magnitude. Absolute minima are found for the values of Mc and the RGB tip luminosity.
An Evolutionary Machine Learning Framework for Big Data Sequence Mining

ERIC Educational Resources Information Center

Kamath, Uday Krishna

2014-01-01

Sequence classification is an important problem in many real-world applications. Unlike other machine learning data, there are no "explicit" features or signals in sequence data that can help traditional machine learning algorithms learn and predict from the data. Sequence data exhibits inter-relationships in the elements that are…
Adaptive evolutionary walks require neutral intermediates in RNA fitness landscapes.

PubMed

Rendel, Mark D

2011-01-01

In RNA fitness landscapes with interconnected networks of neutral mutations, neutral precursor mutations can play an important role in facilitating the accessibility of epistatic adaptive mutant combinations. I use an exhaustively surveyed fitness landscape model based on short sequence RNA genotypes (and their secondary structure phenotypes) to calculate the minimum rate at which mutants initially appearing as neutral are incorporated into an adaptive evolutionary walk. I show first, that incorporating neutral mutations significantly increases the number of point mutations in a given evolutionary walk when compared to estimates from previous adaptive walk models. Second, that incorporating neutral mutants into such a walk significantly increases the final fitness encountered on that walk - indeed evolutionary walks including neutral steps often reach the global optimum in this model. Third, and perhaps most importantly, evolutionary paths of this kind are often extremely winding in their nature and have the potential to undergo multiple mutations at a given sequence position within a single walk; the potential of these winding paths to mislead phylogenetic reconstruction is briefly considered. Copyright © 2010 Elsevier Inc. All rights reserved.
Resolving the Origin of Rabbit Hemorrhagic Disease Virus: Insights from an Investigation of the Viral Stocks Released in Australia

PubMed Central

Eden, John-Sebastian; Read, Andrew J.; Duckworth, Janine A.; Strive, Tanja

2015-01-01

To resolve the evolutionary history of rabbit hemorrhagic disease virus (RHDV), we performed a genomic analysis of the viral stocks imported and released as a biocontrol measure in Australia, as well as a global phylogenetic analysis. Importantly, conflicts were identified between the sequences determined here and those previously published that may have affected evolutionary rate estimates. By removing likely erroneous sequences, we show that RHDV emerged only shortly before its initial description in China. PMID:26378178
Genome-Wide Search Identifies 1.9 Mb from the Polar Bear Y Chromosome for Evolutionary Analyses.

PubMed

Bidon, Tobias; Schreck, Nancy; Hailer, Frank; Nilsson, Maria A; Janke, Axel

2015-05-27

The male-inherited Y chromosome is the major haploid fraction of the mammalian genome, rendering Y-linked sequences an indispensable resource for evolutionary research. However, despite recent large-scale genome sequencing approaches, only a handful of Y chromosome sequences have been characterized to date, mainly in model organisms. Using polar bear (Ursus maritimus) genomes, we compare two different in silico approaches to identify Y-linked sequences: 1) Similarity to known Y-linked genes and 2) difference in the average read depth of autosomal versus sex chromosomal scaffolds. Specifically, we mapped available genomic sequencing short reads from a male and a female polar bear against the reference genome and identify 112 Y-chromosomal scaffolds with a combined length of 1.9 Mb. We verified the in silico findings for the longer polar bear scaffolds by male-specific in vitro amplification, demonstrating the reliability of the average read depth approach. The obtained Y chromosome sequences contain protein-coding sequences, single nucleotide polymorphisms, microsatellites, and transposable elements that are useful for evolutionary studies. A high-resolution phylogeny of the polar bear patriline shows two highly divergent Y chromosome lineages, obtained from analysis of the identified Y scaffolds in 12 previously published male polar bear genomes. Moreover, we find evidence of gene conversion among ZFX and ZFY sequences in the giant panda lineage and in the ancestor of ursine and tremarctine bears. Thus, the identification of Y-linked scaffold sequences from unordered genome sequences yields valuable data to infer phylogenomic and population-genomic patterns in bears. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
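The read-depth approach in this abstract can be illustrated with a small classifier that compares male and female average depth on each scaffold against the autosomal average. This is only a sketch under stated assumptions: the abstract does not give the criteria used, so the thresholds `absent` and `tol`, the category labels, and the assumption that both sexes were sequenced to the same autosomal depth are all hypothetical.

```python
def classify_scaffold(male_depth, female_depth, autosomal_depth,
                      absent=0.1, tol=0.25):
    """Classify a scaffold from average read depth in a male and a female.

    Depths are divided by the autosomal average, so a single-copy autosomal
    scaffold is expected near 1.0 in both sexes (assumes equal sequencing
    depth for the two individuals). Thresholds are illustrative only.
    """
    if autosomal_depth <= 0:
        raise ValueError("autosomal_depth must be positive")
    male = male_depth / autosomal_depth
    female = female_depth / autosomal_depth

    def near(value, target):
        return abs(value - target) <= tol

    if female <= absent and near(male, 0.5):
        return "Y-linked candidate"   # half depth in male, none in female
    if near(male, 0.5) and near(female, 1.0):
        return "X-linked candidate"   # one copy in male, two in female
    if near(male, 1.0) and near(female, 1.0):
        return "autosomal"
    return "unclassified"
```

Candidates found this way would still need confirmation, for example by male-specific amplification as the abstract describes.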
Diversity and evolutionary patterns of immune genes in free-ranging Namibian leopards (Panthera pardus pardus).

PubMed

Castro-Prieto, Aines; Wachter, Bettina; Melzheimer, Joerg; Thalwitzer, Susanne; Sommer, Simone

2011-01-01

The genes of the major histocompatibility complex (MHC) are a key component of the mammalian immune system and have become important molecular markers for fitness-related genetic variation in wildlife populations. Currently, no information about the MHC sequence variation and constitution in African leopards exists. In this study, we isolated and characterized genetic variation at the adaptively most important region of MHC class I and MHC class II-DRB genes in 25 free-ranging African leopards from Namibia and investigated the mechanisms that generate and maintain MHC polymorphism in the species. Using single-stranded conformation polymorphism analysis and direct sequencing, we detected 6 MHC class I and 6 MHC class II-DRB sequences, which likely correspond to at least 3 MHC class I and 3 MHC class II-DRB loci. Amino acid sequence variation in both MHC classes was higher or similar in comparison to other reported felids. We found signatures of positive selection shaping the diversity of MHC class I and MHC class II-DRB loci during the evolutionary history of the species. A comparison of MHC class I and MHC class II-DRB sequences of the leopard to those of other felids revealed a trans-species mode of evolution. In addition, the evolutionary relationships of MHC class II-DRB sequences between African and Asian leopard subspecies are discussed.
Exotic behavior and crystal structures of calcium under pressure

PubMed Central

Oganov, Artem R.; Ma, Yanming; Xu, Ying; Errea, Ion; Bergara, Aitor; Lyakhov, Andriy O.

2010-01-01

Experimental studies established that calcium undergoes several counterintuitive transitions under pressure: fcc → bcc → simple cubic → Ca-IV → Ca-V, and becomes a good superconductor in the simple cubic and higher-pressure phases. Here, using ab initio evolutionary simulations, we explore the behavior of Ca under pressure and find a number of new phases. Our structural sequence differs from the traditional picture for Ca, but is similar to that for Sr. The β-tin (I41/amd) structure, rather than simple cubic, is predicted to be the theoretical ground state at 0 K and 33–71 GPa. This structure can be represented as a large distortion of the simple cubic structure, just as the higher-pressure phases stable between 71 and 134 GPa. The structure of Ca-V, stable above 134 GPa, is a complex host-guest structure. According to our calculations, the predicted phases are superconductors with Tc increasing under pressure and reaching approximately 20 K at 120 GPa, in good agreement with experiment. PMID:20382865
MOCASSIN-prot: A multi-objective clustering approach for protein similarity networks

USDA-ARS's Scientific Manuscript database

Motivation: Proteins often include multiple conserved domains. Various evolutionary events including duplication and loss of domains, domain shuffling, as well as sequence divergence contribute to generating complexities in protein structures, and consequently, in their functions. The evolutionary h...
Magnetic fields in single late-type giants in the Solar vicinity: How common is magnetic activity on the giant branches?

NASA Astrophysics Data System (ADS)

Konstantinova-Antova, Renada; Aurière, Michel; Charbonnel, Corinne; Drake, Natalia; Wade, Gregg; Tsvetkova, Svetla; Petit, Pascal; Schröder, Klaus-Peter; Lèbre, Agnes

2014-08-01

We present our first results on a new sample containing all single G, K and M giants down to V = 4 mag in the Solar vicinity, suitable for spectropolarimetric (Stokes V) observations with Narval at TBL, France. For detection and measurement of the magnetic field (MF), the Least Squares Deconvolution (LSD) method was applied (Donati et al. 1997) that in the present case enables detection of large-scale MFs even weaker than the solar one (the typical precision of our longitudinal MF measurements is 0.1-0.2 G). The evolutionary status of the stars is determined on the basis of the evolutionary models with rotation (Lagarde et al. 2012; Charbonnel et al., in prep.) and fundamental parameters given by Massarotti et al. (1998). The stars appear to be in the mass range 1-4 M ⊙, situated at different evolutionary stages after the Main Sequence (MS), up to the Asymptotic Giant Branch (AGB). The sample contains 45 stars. Up to now, 29 stars are observed (that is about 64% of the sample), each observed at least twice. For 2 stars in the Hertzsprung gap, one is definitely Zeeman detected. Only 5 G and K giants, situated mainly at the base of the Red Giant Branch (RGB) and in the He-burning phase are detected. Surprisingly, a lot of stars ascending towards the RGB tip and in early AGB phase are detected (8 of 13 observed stars). For all Zeeman detected stars v sin i is redetermined and appears in the interval 2-3 km/s, but few giants with MF possess larger v sin i.
δ-exceedance records and random adaptive walks

NASA Astrophysics Data System (ADS)

Park, Su-Chan; Krug, Joachim

2016-08-01

We study a modified record process where the kth record in a series of independent and identically distributed random variables is defined recursively through the condition $Y_k > Y_{k-1} - \delta_{k-1}$ with a deterministic sequence $\delta_k > 0$ called the handicap. For constant $\delta_k \equiv \delta$ and exponentially distributed random variables it has been shown in previous work that the process displays a phase transition as a function of $\delta$ between a normal phase where the mean record value increases indefinitely and a stationary phase where the mean record value remains bounded and a finite fraction of all entries are records (Park et al 2015 Phys. Rev. E 91 042707). Here we explore the behavior for general probability distributions and decreasing and increasing sequences $\delta_k$, focusing in particular on the case when $\delta_k$ matches the typical spacing between subsequent records in the underlying simple record process without handicap. We find that a continuous phase transition occurs only in the exponential case, but a novel kind of first order transition emerges when $\delta_k$ is increasing. The problem is partly motivated by the dynamics of evolutionary adaptation in biological fitness landscapes, where $\delta_k$ corresponds to the change of the deterministic fitness component after k mutational steps. The results for the record process are used to compute the mean number of steps that a population performs in such a landscape before being trapped at a local fitness maximum.
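The recursive record condition in this abstract can be made concrete with a short sketch. It assumes that the first entry of the series counts as the first record and that the handicap is supplied as a function of the current record index; the function name is hypothetical.

```python
def delta_exceedance_records(values, delta):
    """Extract handicapped records from a series.

    values -- iterable of numbers (the underlying random variables)
    delta  -- callable: record index k (1-based) -> handicap delta_k
    An entry becomes record number k + 1 if it exceeds Y_k - delta_k,
    where Y_k is the current (kth) record.
    """
    records = []
    for x in values:
        if not records:
            records.append(x)  # assumption: the first entry is a record
            continue
        k = len(records)       # index of the current record
        if x > records[-1] - delta(k):
            records.append(x)
    return records
```

With `delta = lambda k: 0` the function reduces to the ordinary record process; a positive handicap lets entries slightly below the current record count as records, so the record value can decrease.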
Quadrupedal locomotor simulation: producing more realistic gaits using dual-objective optimization

PubMed Central

Hirasaki, Eishi

2018-01-01

In evolutionary biomechanics it is often considered that gaits should evolve to minimize the energetic cost of travelling a given distance. In gait simulation this goal often leads to convincing gait generation. However, as the musculoskeletal models used get increasingly sophisticated, it becomes apparent that such a single goal can lead to extremely unrealistic gait patterns. In this paper, we explore the effects of requiring adequate lateral stability and show how this increases both energetic cost and the realism of the generated walking gait in a high biofidelity chimpanzee musculoskeletal model. We also explore the effects of changing the footfall sequences in the simulation so it mimics both the diagonal sequence walking gaits that primates typically use and also the lateral sequence walking gaits that are much more widespread among mammals. It is apparent that adding a lateral stability criterion has an important effect on the footfall phase relationship, suggesting that lateral stability may be one of the key drivers behind the observed footfall sequences in quadrupedal gaits. The observation that single optimization goals are no longer adequate for generating gait in current models has important implications for the use of biomimetic virtual robots to predict the locomotor patterns in fossil animals. PMID:29657790
Active site of tripeptidyl peptidase II from human erythrocytes is of the subtilisin type

DOE Office of Scientific and Technical Information (OSTI.GOV)

Tomkinson, B.; Wernstedt, C.; Hellman, U.

1987-11-01

The present report presents evidence that the amino acid sequence around the serine of the active site of human tripeptidyl peptidase II is of the subtilisin type. The enzyme from human erythrocytes was covalently labeled at its active site with [³H]diisopropyl fluorophosphate, and the protein was subsequently reduced, alkylated, and digested with trypsin. The labeled tryptic peptides were purified by gel filtration and repeated reversed-phase HPLC, and their amino-terminal sequences were determined. Residue 9 contained the radioactive label and was, therefore, considered to be the active serine residue. The primary structure of the part of the active site (residues 1-10) containing this residue was concluded to be Xaa-Thr-Gln-Leu-Met-Asx-Gly-Thr-Ser-Met. This amino acid sequence is homologous to the sequence surrounding the active serine of the microbial peptidases subtilisin and thermitase. These data demonstrate that human tripeptidyl peptidase II represents a potentially distinct class of human peptidases and raise the question of an evolutionary relationship between the active site of a mammalian peptidase and that of the subtilisin family of serine peptidases.

Evolutionary Influenced Interaction Pattern as Indicator for the Investigation of Natural Variants Causing Nephrogenic Diabetes Insipidus

PubMed Central

Labudde, Dirk

2015-01-01

The importance of short membrane sequence motifs has been shown in many works and emphasizes the related sequence motif analysis. Together with specific transmembrane helix-helix interactions, the analysis of interacting sequence parts is helpful for understanding the process during membrane protein folding and in retaining the three-dimensional fold. Here we present a simple high-throughput analysis method for deriving mutational information of interacting sequence parts. Applied on aquaporin water channel proteins, our approach supports the analysis of mutational variants within different interacting subsequences and finally the investigation of natural variants which cause diseases like, for example, nephrogenic diabetes insipidus. In this work we demonstrate a simple method for massive membrane protein data analysis. As shown, the presented in silico analyses provide information about interacting sequence parts which are constrained by protein evolution. We present a simple graphical visualization medium for the representation of evolutionary influenced interaction pattern pairs (EIPPs) adapted to mutagen investigations of aquaporin-2, a protein whose mutants are involved in the rare endocrine disorder known as nephrogenic diabetes insipidus, and membrane proteins in general. Furthermore, we present a new method to derive new evolutionary variations within EIPPs which can be used for further mutagen laboratory investigations. PMID:26180540
Evolutionary Influenced Interaction Pattern as Indicator for the Investigation of Natural Variants Causing Nephrogenic Diabetes Insipidus.

PubMed

Grunert, Steffen; Labudde, Dirk

2015-01-01

The importance of short membrane sequence motifs has been shown in many works and emphasizes the related sequence motif analysis. Together with specific transmembrane helix-helix interactions, the analysis of interacting sequence parts is helpful for understanding the process during membrane protein folding and in retaining the three-dimensional fold. Here we present a simple high-throughput analysis method for deriving mutational information of interacting sequence parts. Applied on aquaporin water channel proteins, our approach supports the analysis of mutational variants within different interacting subsequences and finally the investigation of natural variants which cause diseases like, for example, nephrogenic diabetes insipidus. In this work we demonstrate a simple method for massive membrane protein data analysis. As shown, the presented in silico analyses provide information about interacting sequence parts which are constrained by protein evolution. We present a simple graphical visualization medium for the representation of evolutionary influenced interaction pattern pairs (EIPPs) adapted to mutagen investigations of aquaporin-2, a protein whose mutants are involved in the rare endocrine disorder known as nephrogenic diabetes insipidus, and membrane proteins in general. Furthermore, we present a new method to derive new evolutionary variations within EIPPs which can be used for further mutagen laboratory investigations.
Microsporidia, amitochondrial protists, possess a 70-kDa heat shock protein gene of mitochondrial evolutionary origin.

PubMed

Peyretaillade, E; Broussolle, V; Peyret, P; Méténier, G; Gouy, M; Vivarès, C P

1998-06-01

An intronless gene encoding a protein of 592 amino acid residues with similarity to 70-kDa heat shock proteins (HSP70s) has been cloned and sequenced from the amitochondrial protist Encephalitozoon cuniculi (phylum Microsporidia). Southern blot analyses show the presence of a single gene copy located on chromosome XI. The encoded protein exhibits an N-terminal hydrophobic leader sequence and two motifs shared by proteobacterial and mitochondrially expressed HSP70 homologs. Phylogenetic analysis using maximum likelihood and evolutionary distances place the E. cuniculi sequence in the cluster of mitochondrially expressed HSP70s, with a higher evolutionary rate than those of homologous sequences. Similar results were obtained after cloning a fragment of the homologous gene in the closely related species E. hellem. The presence of a nuclear targeting signal-like sequence supports a role of the Encephalitozoon HSP70 as a molecular chaperone of nuclear proteins. No evidence for cytosolic or endoplasmic reticulum forms of HSP70 was obtained through PCR amplification. These data suggest that Encephalitozoon species have evolved from an ancestor bearing mitochondria, which is in disagreement with the postulated presymbiotic origin of Microsporidia. The specific role and intracellular localization of the mitochondrial HSP70-like protein remain to be elucidated.
Simple versus complex models of trait evolution and stasis as a response to environmental change

NASA Astrophysics Data System (ADS)

Hunt, Gene; Hopkins, Melanie J.; Lidgard, Scott

2015-04-01

Previous analyses of evolutionary patterns, or modes, in fossil lineages have focused overwhelmingly on three simple models: stasis, random walks, and directional evolution. Here we use likelihood methods to fit an expanded set of evolutionary models to a large compilation of ancestor-descendant series of populations from the fossil record. In addition to the standard three models, we assess more complex models with punctuations and shifts from one evolutionary mode to another. As in previous studies, we find that stasis is common in the fossil record, as is a strict version of stasis that entails no real evolutionary changes. Incidence of directional evolution is relatively low (13%), but higher than in previous studies because our analytical approach can more sensitively detect noisy trends. Complex evolutionary models are often favored, overwhelmingly so for sequences comprising many samples. This finding is consistent with evolutionary dynamics that are, in reality, more complex than any of the models we consider. Furthermore, the timing of shifts in evolutionary dynamics varies among traits measured from the same series. Finally, we use our empirical collection of evolutionary sequences and a long and highly resolved proxy for global climate to inform simulations in which traits adaptively track temperature changes over time. When realistically calibrated, we find that this simple model can reproduce important aspects of our paleontological results. We conclude that observed paleontological patterns, including the prevalence of stasis, need not be inconsistent with adaptive evolution, even in the face of unstable physical environments.
Phylogenetics and Gene Structure Dynamics of Polygalacturonase Genes in Aspergillus and Neurospora crassa

PubMed Central

Hong, Jin-Sung; Ryu, Ki-Hyun; Kwon, Soon-Jae; Kim, Jin-Won; Kim, Kwang-Soo; Park, Kyong-Cheul

2013-01-01

Polygalacturonase (PG) gene is a typical gene family present in eukaryotes. Forty-nine PGs were mined from the genomes of Neurospora crassa and five Aspergillus species. The PGs were classified into 3 clades such as clade 1 for rhamno-PGs, clade 2 for exo-PGs and clade 3 for exo- and endo-PGs, which were further grouped into 13 sub-clades based on the polypeptide sequence similarity. In gene structure analysis, a total of 124 introns were present in 44 genes and five genes lacked introns to give an average of 2.5 introns per gene. Intron phase distribution was 64.5% for phase 0, 21.8% for phase 1, and 13.7% for phase 2, respectively. The introns varied in their sequences and their lengths ranged from 20 bp to 424 bp with an average of 65.9 bp, which is approximately half the size of introns in other fungal genes. There were 29 homologous intron blocks and 26 of those were sub-clade specific. Intron losses were counted in 18 introns in which no obvious phase preference for intron loss was observed. Eighteen introns were placed at novel positions, which is considerably higher than those of plant PGs. In an evolutionary sense both intron loss and gain must have taken place for shaping the current PGs in these fungi. Together with the small intron size, low conservation of homologous intron blocks and higher number of novel introns, PGs of fungal species seem to have recently undergone highly dynamic evolution. PMID:25288950
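The intron statistics in this abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: it uses the standard definition of intron phase (the number of coding bases upstream of the intron, modulo 3), and the per-phase counts 80, 27 and 17 are back-calculated here from the reported percentages and the total of 124 introns; they are an assumption, not figures given in the abstract. The function names are hypothetical.

```python
from collections import Counter


def intron_phases(exon_cds_lengths):
    """Phases of the introns separating consecutive coding exons.

    exon_cds_lengths : coding lengths (bp) of the exons, in 5'->3' order.
    Phase 0 = intron between codons, 1 = after the first codon base,
    2 = after the second codon base.
    """
    phases = []
    upstream = 0
    for length in exon_cds_lengths[:-1]:  # no intron follows the last exon
        upstream += length
        phases.append(upstream % 3)
    return phases


def phase_percentages(phases):
    """Percentage of introns in each phase, rounded to one decimal."""
    counts = Counter(phases)
    total = len(phases)
    return {p: round(100 * counts.get(p, 0) / total, 1) for p in (0, 1, 2)}


if __name__ == "__main__":
    # Assumed counts, back-calculated from 64.5%, 21.8%, 13.7% of 124 introns.
    assumed = [0] * 80 + [1] * 27 + [2] * 17
    print(phase_percentages(assumed))
    # 124 introns over all 49 genes (44 with introns plus 5 without).
    print(round(124 / 49, 1))
```

Under these assumed counts the percentages and the average of about 2.5 introns per gene agree with the values quoted in the abstract.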
Novel variable number of tandem repeats of gibbon MAOA gene and its evolutionary significance.

PubMed

Choi, Yuri; Jung, Yi-Deun; Ayarpadikannan, Selvam; Koga, Akihiko; Imai, Hiroo; Hirai, Hirohisa; Roos, Christian; Kim, Heui-Soo

2014-08-01

Variable number of tandem repeats (VNTRs) are scattered throughout the primate genome, and genetic variation in these VNTRs has accumulated during primate radiation. Here, we analyzed VNTRs upstream of the monoamine oxidase A (MAOA) gene in 11 different gibbon species. An abundance of truncated VNTR sequences and copy number differences were observed compared to those of human VNTR sequences. To better understand the biological role of these VNTRs, a luciferase activity assay was conducted, and the results indicated that selected VNTR sequences of the MAOA gene from human and three different gibbon species (Hylobates klossii, Hylobates lar, and Nomascus concolor) showed silencing ability. Together, these data could be useful for understanding the evolutionary history and functional significance of MAOA VNTR sequences in gibbon species.
Peregrine and saker falcon genome sequences provide insights into evolution of a predatory lifestyle.

PubMed

Zhan, Xiangjiang; Pan, Shengkai; Wang, Junyi; Dixon, Andrew; He, Jing; Muller, Margit G; Ni, Peixiang; Hu, Li; Liu, Yuan; Hou, Haolong; Chen, Yuanping; Xia, Jinquan; Luo, Qiong; Xu, Pengwei; Chen, Ying; Liao, Shengguang; Cao, Changchang; Gao, Shukun; Wang, Zhaobao; Yue, Zhen; Li, Guoqing; Yin, Ye; Fox, Nick C; Wang, Jun; Bruford, Michael W

2013-05-01

As top predators, falcons possess unique morphological, physiological and behavioral adaptations that allow them to be successful hunters: for example, the peregrine is renowned as the world's fastest animal. To examine the evolutionary basis of predatory adaptations, we sequenced the genomes of both the peregrine (Falco peregrinus) and saker falcon (Falco cherrug), and we present parallel, genome-wide evidence for evolutionary innovation and selection for a predatory lifestyle. The genomes, assembled using Illumina deep sequencing with greater than 100-fold coverage, are both approximately 1.2 Gb in length, with transcriptome-assisted prediction of approximately 16,200 genes for both species. Analysis of 8,424 orthologs in both falcons, chicken, zebra finch and turkey identified consistent evidence for genome-wide rapid evolution in these raptors. SNP-based inference showed contrasting recent demographic trajectories for the two falcons, and gene-based analysis highlighted falcon-specific evolutionary novelties for beak development and olfaction and specifically for homeostasis-related genes in the arid environment-adapted saker.
The different origins of magnetic fields and activity in the Hertzsprung gap stars, OU Andromedae and 31 Comae

NASA Astrophysics Data System (ADS)

Borisova, A.; Aurière, M.; Petit, P.; Konstantinova-Antova, R.; Charbonnel, C.; Drake, N. A.

2016-06-01

Context. When crossing the Hertzsprung gap, intermediate-mass stars develop a convective envelope. Fast rotators on the main sequence, or Ap star descendants, are expected to become magnetic active subgiants during this evolutionary phase. Aims: We compare the surface magnetic fields and activity indicators of two active, fast rotating red giants with similar masses and spectral class but different rotation rates - OU And (P_rot = 24.2 d) and 31 Com (P_rot = 6.8 d) - to address the question of the origin of their magnetism and high activity. Methods: Observations were carried out with the Narval spectropolarimeter in 2008 and 2013. We used the least-squares deconvolution (LSD) technique to extract Stokes V and I profiles with high signal-to-noise ratio to detect Zeeman signatures of the magnetic field of the stars. We then provide Zeeman-Doppler imaging (ZDI), activity indicators monitoring, and a precise estimation of stellar parameters. We use state-of-the-art stellar evolutionary models, including rotation, to infer the evolutionary status of our giants, as well as their initial rotation velocity on the main sequence, and we interpret our observational results in the light of the theoretical Rossby numbers. Results: The detected magnetic field of OU Andromedae (OU And) is a strong one. Its longitudinal component B_l reaches 40 G and presents an approximately sinusoidal variation with reversal of the polarity. The magnetic topology of OU And is dominated by large-scale elements and is mainly poloidal with an important dipole component, as well as a significant toroidal component. The detected magnetic field of 31 Comae (31 Com) is weaker, with a magnetic map showing a more complex field geometry, and poloidal and toroidal components of equal contributions. The evolutionary models show that the progenitors of OU And and 31 Com must have been rotating at velocities that correspond to 30 and 53%, respectively, of their critical rotation velocity on the zero age main sequence.
Both OU And and 31 Com have very similar masses (2.7 and 2.85 M⊙, respectively), and they both lie in the Hertzsprung gap. Conclusions: OU And appears to be the probable descendant of a magnetic Ap star, and 31 Com the descendant of a relatively fast rotator on the main sequence. Because of the relatively fast rotation in the Hertzsprung gap and the onset of the development of a convective envelope, OU And also has a dynamo in operation. Based on observations obtained at the telescope Bernard Lyot (TBL) at Observatoire du Pic du Midi, CNRS/INSU and Université de Toulouse, France.
VCFtoTree: a user-friendly tool to construct locus-specific alignments and phylogenies from thousands of anthropologically relevant genome sequences.

PubMed

Xu, Duo; Jaber, Yousef; Pavlidis, Pavlos; Gokcumen, Omer

2017-09-26

Constructing alignments and phylogenies for a given locus from large genome sequencing studies with relevant outgroups allow novel evolutionary and anthropological insights. However, no user-friendly tool has been developed to integrate thousands of recently available and anthropologically relevant genome sequences to construct complete sequence alignments and phylogenies. Here, we provide VCFtoTree, a user friendly tool with a graphical user interface that directly accesses online databases to download, parse and analyze genome variation data for regions of interest. Our pipeline combines popular sequence datasets and tree building algorithms with custom data parsing to generate accurate alignments and phylogenies using all the individuals from the 1000 Genomes Project, Neanderthal and Denisovan genomes, as well as reference genomes of Chimpanzee and Rhesus Macaque. It can also be applied to other phased human genomes, as well as genomes from other species. The output of our pipeline includes an alignment in FASTA format and a tree file in newick format. VCFtoTree fulfills the increasing demand for constructing alignments and phylogenies for a given loci from thousands of available genomes. Our software provides a user friendly interface for a wider audience without prerequisite knowledge in programming. VCFtoTree can be accessed from https://github.com/duoduoo/VCFtoTree_3.0.0 .
Genome-wide comparative analysis of four Indian Drosophila species.

PubMed

Mohanty, Sujata; Khanna, Radhika

2017-12-01

Comparative analysis of multiple genomes of closely or distantly related Drosophila species undoubtedly creates excitement among evolutionary biologists in exploring the genomic changes with an ecology and evolutionary perspective. We present herewith the de novo assembled whole genome sequences of four Drosophila species, D. bipectinata, D. takahashii, D. biarmipes and D. nasuta of Indian origin using Next Generation Sequencing technology on an Illumina platform along with their detailed assembly statistics. The comparative genomics analysis, e.g. gene predictions and annotations, functional and orthogroup analysis of coding sequences and genome wide SNP distribution were performed. The whole genome of Zaprionus indianus of Indian origin published earlier by us and the genome sequences of previously sequenced 12 Drosophila species available in the NCBI database were included in the analysis. The present work is a part of our ongoing genomics project of Indian Drosophila species.
The Evolution of Ion Pumps.

ERIC Educational Resources Information Center

Maloney, Peter C.; Wilson, T. Hastings

1985-01-01

Constructs an evolutionary sequence to account for the diversity of ion pumps found today. Explanations include primary ion pumps in bacteria, features and distribution of ATP-driven pumps, preference for cation transport, and proton pump reversal. The integrated evolutionary hypothesis should encourage new experimental approaches. (DH)
LS³: A Method for Improving Phylogenomic Inferences When Evolutionary Rates Are Heterogeneous among Taxa.

PubMed

Rivera-Rivera, Carlos J; Montoya-Burgos, Juan I

2016-06-01

Phylogenetic inference artifacts can occur when sequence evolution deviates from assumptions made by the models used to analyze them. The combination of strong model assumption violations and highly heterogeneous lineage evolutionary rates can become problematic in phylogenetic inference, and lead to the well-described long-branch attraction (LBA) artifact. Here, we define an objective criterion for assessing lineage evolutionary rate heterogeneity among predefined lineages: the result of a likelihood ratio test between a model in which the lineages evolve at the same rate (homogeneous model) and a model in which different lineage rates are allowed (heterogeneous model). We implement this criterion in the algorithm Locus Specific Sequence Subsampling (LS³), aimed at reducing the effects of LBA in multi-gene datasets. For each gene, LS³ sequentially removes the fastest-evolving taxon of the ingroup and tests for lineage rate homogeneity until all lineages have uniform evolutionary rates. The sequences excluded from the homogeneously evolving taxon subset are flagged as potentially problematic. The software implementation provides the user with the possibility to remove the flagged sequences for generating a new concatenated alignment. We tested LS³ with simulations and two real datasets containing LBA artifacts: a nucleotide dataset regarding the position of Glires within mammals and an amino-acid dataset concerning the position of nematodes within bilaterians. The initially incorrect phylogenies were corrected in all cases upon removing data flagged by LS³. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
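The iterative subsampling loop described in this abstract can be sketched in a few lines. This is a toy illustration under loud assumptions and not the LS³ software: a simple Poisson model of per-lineage substitution counts stands in for the phylogenetic likelihood models used by the authors, the 5% chi-square critical values are hard-coded for up to five degrees of freedom, and all names (`rates_homogeneous`, `ls3_subsample`, the example counts) are hypothetical.

```python
import math

# 5% critical values of the chi-square distribution for 1..5 degrees of freedom.
CHI2_CRIT_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}


def poisson_loglik(count, expected):
    """Poisson log-likelihood without the log(count!) term (it cancels in the ratio)."""
    if expected == 0:
        return 0.0 if count == 0 else -math.inf
    return count * math.log(expected) - expected


def rates_homogeneous(subs):
    """Toy likelihood ratio test: one shared rate vs. one rate per lineage.

    subs : dict {lineage_name: substitution_count} (assumed toy data).
    """
    n = len(subs)
    if n < 2:
        return True
    if n - 1 not in CHI2_CRIT_05:
        raise ValueError("toy table only covers up to 6 lineages")
    mean = sum(subs.values()) / n
    ll_hom = sum(poisson_loglik(k, mean) for k in subs.values())
    ll_het = sum(poisson_loglik(k, k) for k in subs.values())
    statistic = 2 * (ll_het - ll_hom)
    return statistic < CHI2_CRIT_05[n - 1]


def ls3_subsample(subs):
    """Remove the fastest lineage until the homogeneous model is not rejected."""
    kept = dict(subs)
    flagged = []
    while len(kept) > 1 and not rates_homogeneous(kept):
        fastest = max(kept, key=kept.get)
        flagged.append(fastest)  # flagged as potentially problematic
        del kept[fastest]
    return kept, flagged


if __name__ == "__main__":
    print(ls3_subsample({"A": 20, "B": 22, "C": 19, "D": 60}))
```

In a real analysis the test statistic would come from fitted phylogenetic models per gene; the loop structure (test, drop the fastest taxon, retest) is the only part this sketch is meant to convey.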
De novo sequencing and characterization of floral transcriptome in two species of buckwheat (Fagopyrum)

PubMed Central

2011-01-01

Background Transcriptome sequencing data has become an integral component of modern genetics, genomics and evolutionary biology. However, despite advances in the technologies of DNA sequencing, such data are lacking for many groups of living organisms, in particular, many plant taxa. We present here the results of transcriptome sequencing for two closely related plant species. These species, Fagopyrum esculentum and F. tataricum, belong to the order Caryophyllales - a large group of flowering plants with uncertain evolutionary relationships. F. esculentum (common buckwheat) is also an important food crop. Despite these practical and evolutionary considerations Fagopyrum species have not been the subject of large-scale sequencing projects. Results Normalized cDNA corresponding to genes expressed in flowers and inflorescences of F. esculentum and F. tataricum was sequenced using the 454 pyrosequencing technology. This resulted in 267 (for F. esculentum) and 229 (F. tataricum) thousands of reads with average length of 341-349 nucleotides. De novo assembly of the reads produced about 25 thousands of contigs for each species, with 7.5-8.2× coverage. Comparative analysis of two transcriptomes demonstrated their overall similarity but also revealed genes that are presumably differentially expressed. Among them are retrotransposon genes and genes involved in sugar biosynthesis and metabolism. Thirteen single-copy genes were used for phylogenetic analysis; the resulting trees are largely consistent with those inferred from multigenic plastid datasets. The sister relationships of the Caryophyllales and asterids now gained high support from nuclear gene sequences. Conclusions 454 transcriptome sequencing and de novo assembly was performed for two congeneric flowering plant species, F. esculentum and F. tataricum. As a result, a large set of cDNA sequences that represent orthologs of known plant genes as well as potential new genes was generated. PMID:21232141
Multiplex sequencing of plant chloroplast genomes using Solexa sequencing-by-synthesis technology

Treesearch

Richard Cronn; Aaron Liston; Matthew Parks; David S. Gernandt; Rongkun Shen; Todd Mockler

2008-01-01

Organellar DNA sequences are widely used in evolutionary and population genetic studies; however, the conservative nature of chloroplast gene and genome evolution often limits phylogenetic resolution and statistical power. To gain maximal access to the historical record contained within chloroplast genomes, we have adapted multiplex sequencing-by-synthesis (MSBS) to...
The dynamic evolutionary history of the bananaquit (Coereba flaveola) in the Caribbean revealed by a multigene analysis

PubMed Central

2008-01-01

Background The bananaquit (Coereba flaveola) is a small nectivorous and frugivorous emberizine bird (order Passeriformes) that is an abundant resident throughout the Caribbean region. We used multi-gene analyses to investigate the evolutionary history of this species throughout its distribution in the West Indies and in South and Middle America. We sequenced six mitochondrial genes (3744 base pairs) and three nuclear genes (2049 base pairs) for forty-four bananaquits and three outgroup species. We infer the ancestral area of the present-day bananaquit populations, report on the species' phylogenetic, biogeographic and evolutionary history, and propose scenarios for its diversification and range expansion. Results Phylogenetic concordance between mitochondrial and nuclear genes at the base of the bananaquit phylogeny supported a West Indian origin for continental populations. Multi-gene analysis showing genetic remnants of successive colonization events in the Lesser Antilles reinforced earlier research demonstrating that bananaquits alternate periods of invasiveness and colonization with biogeographic quiescence. Although nuclear genes provided insufficient information at the tips of the tree to further evaluate relationships of closely allied but strongly supported mitochondrial DNA clades, the discrepancy between mitochondrial and nuclear data in the population of Dominican Republic suggested that the mitochondrial genome was recently acquired by introgression from Jamaica. Conclusion This study represents one of the most complete phylogeographic analyses of its kind and reveals three patterns that are not commonly appreciated in birds: (1) island to mainland colonization, (2) multiple expansion phases, and (3) mitochondrial genome replacement. The detail revealed by this analysis will guide evolutionary analyses of populations in archipelagos such as the West Indies, which include islands varying in size, age, and geological history. 
Our results suggest that multi-gene phylogenies will permit improved comparative analysis of the evolutionary histories of different lineages in the same geographical setting, which provide replicated "natural experiments" for testing evolutionary hypotheses. PMID:18718030
The Rise and Fall of an Evolutionary Innovation: Contrasting Strategies of Venom Evolution in Ancient and Young Animals

PubMed Central

Sunagar, Kartik; Moran, Yehu

2015-01-01

Animal venoms are theorized to evolve under the significant influence of positive Darwinian selection in a chemical arms race scenario, where the evolution of venom resistance in prey and the invention of potent venom in the secreting animal exert reciprocal selection pressures. Venom research to date has mainly focused on evolutionarily younger lineages, such as snakes and cone snails, while mostly neglecting ancient clades (e.g., cnidarians, coleoids, spiders and centipedes). By examining genome, venom-gland transcriptome and sequences from the public repositories, we report the molecular evolutionary regimes of several centipede and spider toxin families, which surprisingly accumulated low-levels of sequence variations, despite their long evolutionary histories. Molecular evolutionary assessment of over 3500 nucleotide sequences from 85 toxin families spanning the breadth of the animal kingdom has unraveled a contrasting evolutionary strategy employed by ancient and evolutionarily young clades. We show that the venoms of ancient lineages remarkably evolve under the heavy constraints of negative selection, while toxin families in lineages that originated relatively recently rapidly diversify under the influence of positive selection. We propose that animal venoms mostly employ a ‘two-speed’ mode of evolution, where the major influence of diversifying selection accompanies the earlier stages of ecological specialization (e.g., diet and range expansion) in the evolutionary history of the species–the period of expansion, resulting in the rapid diversification of the venom arsenal, followed by longer periods of purifying selection that preserve the potent toxin pharmacopeia–the period of purification and fixation. However, species in the period of purification may re-enter the period of expansion upon experiencing a major shift in ecology or environment. 
Thus, we highlight for the first time the significant roles of purifying and episodic selections in shaping animal venoms. PMID:26492532
The Liverwort Contains a Lectin That Is Structurally and Evolutionary Related to the Monocot Mannose-Binding Lectins

PubMed Central

Peumans, Willy J.; Barre, Annick; Bras, Julien; Rougé, Pierre; Proost, Paul; Van Damme, Els J.M.

2002-01-01

A mannose (Man)-binding lectin has been isolated and characterized from the thallus of the liverwort Marchantia polymorpha. N-terminal sequencing indicated that the M. polymorpha agglutinin (Marpola) shares sequence similarity with the superfamily of monocot Man-binding lectins. Searches in the databases yielded expressed sequence tags encoding Marpola. Sequence analysis, molecular modeling, and docking experiments revealed striking structural similarities between Marpola and the monocot Man-binding lectins. Activity and specificity studies further indicated that Marpola is a much stronger agglutinin than the Galanthus nivalis agglutinin and exhibits a preference for methylated Man and glucose, which is unprecedented within the family of monocot Man-binding lectins. The discovery of Marpola allows us, for the first time, to corroborate the evolutionary relationship between a lectin from a lower plant and a well-established lectin family from flowering plants. In addition, the identification of Marpola sheds a new light on the molecular evolution of the superfamily of monocot Man-binding lectins. Beside evolutionary considerations, the occurrence of a G. nivalis agglutinin homolog in a lower plant necessitates the rethinking of the physiological role of the whole family of monocot Man-binding lectins. PMID:12114560
Studying the evolutionary relationships and phylogenetic trees of 21 groups of tRNA sequences based on complex networks.

PubMed

Wei, Fangping; Chen, Bowen

2012-03-01

To find out the evolutionary relationships among different tRNA sequences of 21 amino acids, 22 networks are constructed. One is constructed from whole tRNAs, and the other 21 networks are constructed from the tRNAs which carry the same amino acids. A new method is proposed such that the alignment scores of any two amino acid groups are determined by the average degree and the average clustering coefficient of their networks. The anticodon feature of isolated tRNA and the phylogenetic trees of 21 group networks are discussed. We find that some isolated tRNA sequences in 21 networks still connect with other tRNAs outside their group, which reflects the fact that those tRNAs might evolve by intercrossing among these 21 groups. We also find that most anticodons within the same cluster differ by only one base at the same site when S ≥ 70, and they stay in the same rank in the ladder of evolutionary relationships. These observations are consistent with the idea that some tRNAs might have arisen from the same ancestral sequences through point mutations.
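The two network statistics named in this abstract, average degree and average clustering coefficient, can be computed as follows. This is a minimal sketch that assumes S is a pairwise similarity score and that two tRNAs are linked when their score reaches a threshold; the abstract does not spell out the construction, so the function names, the threshold semantics, and the example scores are all hypothetical.

```python
from itertools import combinations


def build_network(scores, threshold):
    """Build an undirected graph: an edge joins a and b if score >= threshold.

    scores : dict {(a, b): similarity} with hypothetical pairwise scores.
    Returns an adjacency dict {node: set_of_neighbours}.
    """
    adj = {}
    for (a, b), s in scores.items():
        adj.setdefault(a, set())
        adj.setdefault(b, set())
        if s >= threshold:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def average_degree(adj):
    """Mean number of neighbours per node."""
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)


def average_clustering(adj):
    """Mean local clustering coefficient (nodes with degree < 2 count as 0)."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        # Count links among the neighbours of this node.
        links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)


if __name__ == "__main__":
    toy_scores = {  # made-up similarity scores between four tRNAs
        ("t1", "t2"): 80, ("t1", "t3"): 75, ("t2", "t3"): 90,
        ("t3", "t4"): 72, ("t1", "t4"): 40, ("t2", "t4"): 10,
    }
    net = build_network(toy_scores, threshold=70)
    print(average_degree(net), average_clustering(net))
```

Raising the threshold isolates more nodes, which is one way to see which sequences stay connected only at weaker similarity.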
Determinants of the rate of protein sequence evolution

PubMed Central

Zhang, Jianzhi; Yang, Jian-Rong

2015-01-01

The rate and mechanism of protein sequence evolution have been central questions in evolutionary biology since the 1960s. Although the rate of protein sequence evolution depends primarily on the level of functional constraint, exactly what constitutes functional constraint has remained unclear. The increasing availability of genomic data has allowed for much needed empirical examinations of the nature of functional constraint. These studies found that the evolutionary rate of a protein is predominantly influenced by its expression level rather than functional importance. A combination of theoretical and empirical analyses has identified multiple mechanisms behind these observations and demonstrated a prominent role that selection against errors in molecular and cellular processes plays in protein evolution. PMID:26055156
Distinct retroelement classes define evolutionary breakpoints demarcating sites of evolutionary novelty

PubMed Central

Longo, Mark S; Carone, Dawn M; Green, Eric D; O'Neill, Michael J; O'Neill, Rachel J

2009-01-01

Background Large-scale genome rearrangements brought about by chromosome breaks underlie numerous inherited diseases, initiate or promote many cancers and are also associated with karyotype diversification during species evolution. Recent research has shown that these breakpoints are nonrandomly distributed throughout the mammalian genome and many, termed "evolutionary breakpoints" (EB), are specific genomic locations that are "reused" during karyotypic evolution. When the phylogenetic trajectory of orthologous chromosome segments is considered, many of these EB are coincident with ancient centromere activity as well as new centromere formation. While EB have been characterized as repeat-rich regions, it has not been determined whether specific sequences have been retained during evolution that would indicate previous centromere activity or a propensity for new centromere formation. Likewise, the conservation of specific sequence motifs or classes at EBs among divergent mammalian taxa has not been determined. Results To define conserved sequence features of EBs associated with centromere evolution, we performed comparative sequence analysis of more than 4.8 Mb within the tammar wallaby, Macropus eugenii, derived from centromeric regions (CEN), euchromatic regions (EU), and an evolutionary breakpoint (EB) that has undergone convergent breakpoint reuse and past centromere activity in marsupials. We found a dramatic enrichment for long interspersed nucleotide elements (LINE1s) and endogenous retroviruses (ERVs) and a depletion of short interspersed nucleotide elements (SINEs) shared between CEN and EBs. We analyzed the orthologous human EB (14q32.33), known to be associated with translocations in many cancers including multiple myelomas and plasma cell leukemias, and found a conserved distribution of similar repetitive elements. 
Conclusion Our data indicate that EBs tracked within the class Mammalia harbor sequence features retained since the divergence of marsupials and eutherians that may have predisposed these genomic regions to large-scale chromosomal instability. PMID:19630942

Post-Starburst Galaxies At The End of The E+A Phase

NASA Astrophysics Data System (ADS)

Liu, Charles; Marinelli, Mariarosa; Chang, Madeleine; Lyczko, Camilla; Vega Orozco, Cecilia; SDSS-IV Collaboration

2018-06-01

Post-starburst galaxies, once thought to be rare curiosities, are now recognized to represent a key phase in galaxy evolution. The post-starburst, or E+A phase, should however not be considered as a single, short-lived phenomenon; rather, it is an extended evolutionary process that occurs as a galaxy transitions from an actively star-forming system into a quiescent one. We present a study of nearby galaxies at or near the end of the E+A phase, wherein all star formation has been quenched, the fossilized stellar population of the most recent starburst is highly localized, and the remainder of the galaxy's stellar population is old and quiescent. The luminosity and stellar age distribution of these "end-phase E+As" can provide insights into the evolution of galaxies onto and within the red sequence, from active to passive systems. This work is supported by National Science Foundation grants to CUNY College of Staten Island and the American Museum of Natural History; the College of Staten Island Office of Academic Affairs; the Sherman Fairchild Science Pathways Scholars Program (SP^2) at Barnard College; and the Alfred P. Sloan Foundation.
Evolutionary Dynamics of the Gametologous CTNNB1 Gene on the Z and W Chromosomes of Snakes.

PubMed

Laopichienpong, Nararat; Muangmai, Narongrit; Chanhome, Lawan; Suntrarachun, Sunutcha; Twilprawat, Panupon; Peyachoknagul, Surin; Srikulnath, Kornsorn

2017-03-01

Snakes exhibit genotypic sex determination with female heterogamety (ZZ males and ZW females), and the state of sex chromosome differentiation also varies among lineages. To investigate the evolutionary history of homologous genes located in the nonrecombining region of differentiated sex chromosomes in snakes, partial sequences of the gametologous CTNNB1 gene were analyzed for 12 species belonging to henophid (Cylindrophiidae, Xenopeltidae, and Pythonidae) and caenophid snakes (Viperidae, Elapidae, and Colubridae). Nonsynonymous/synonymous substitution ratios (Ka/Ks) in coding sequences were low (Ka/Ks < 1) between CTNNB1Z and CTNNB1W, suggesting that these 2 genes may have similar functional properties. However, frequencies of intron sequence substitutions and insertion–deletions were higher in CTNNB1Z than CTNNB1W, suggesting that Z-linked sequences evolved faster than W-linked sequences. Molecular phylogeny based on both intron and exon sequences showed the presence of 2 major clades: 1) Z-linked sequences of Caenophidia and 2) W-linked sequences of Caenophidia clustered with Z-linked sequences of Henophidia, which suggests that the sequence divergence between CTNNB1Z and CTNNB1W in Caenophidia may have occurred by the cessation of recombination after the split from Henophidia.
Genomic sequencing of Pleistocene cave bears

DOE Office of Scientific and Technical Information (OSTI.GOV)

Noonan, James P.; Hofreiter, Michael; Smith, Doug

2005-04-01

Despite the information content of genomic DNA, ancient DNA studies to date have largely been limited to amplification of mitochondrial DNA due to technical hurdles such as contamination and degradation of ancient DNAs. In this study, we describe two metagenomic libraries constructed using unamplified DNA extracted from the bones of two 40,000-year-old extinct cave bears. Analysis of ~1 Mb of sequence from each library showed that, despite significant microbial contamination, 5.8 percent and 1.1 percent of clones in the libraries contain cave bear inserts, yielding 26,861 bp of cave bear genome sequence. Alignment of this sequence to the dog genome, the closest sequenced genome to cave bear in terms of evolutionary distance, revealed roughly the expected ratio of cave bear exons, repeats and conserved noncoding sequences. Only 0.04 percent of all clones sequenced were derived from contamination with modern human DNA. Comparison of cave bear with orthologous sequences from several modern bear species revealed the evolutionary relationship of these lineages. Using the metagenomic approach described here, we have recovered substantial quantities of mammalian genomic sequence more than twice as old as any previously reported, establishing the feasibility of ancient DNA genomic sequencing programs.
Sequence and comparative analysis of the chicken genome provide unique perspectives on vertebrate evolution.

PubMed

2004-12-09

We present here a draft genome sequence of the red jungle fowl, Gallus gallus. Because the chicken is a modern descendant of the dinosaurs and the first non-mammalian amniote to have its genome sequenced, the draft sequence of its genome--composed of approximately one billion base pairs of sequence and an estimated 20,000-23,000 genes--provides a new perspective on vertebrate genome evolution, while also improving the annotation of mammalian genomes. For example, the evolutionary distance between chicken and human provides high specificity in detecting functional elements, both non-coding and coding. Notably, many conserved non-coding sequences are far from genes and cannot be assigned to defined functional classes. In coding regions the evolutionary dynamics of protein domains and orthologous groups illustrate processes that distinguish the lineages leading to birds and mammals. The distinctive properties of avian microchromosomes, together with the inferred patterns of conserved synteny, provide additional insights into vertebrate chromosome architecture.
Novel antigenic shift in HA sequences of H1N1 viruses detected by big data analysis.

PubMed

Zhang, Ruiying; Xu, Chongfeng; Duan, Ziyuan

2017-07-01

The influenza virus H1N1 has been prevalent all over the world for nearly a century. Many studies on its evolutionary history, substitution rate and antigenicity-associated sites have been done with small datasets. To have a complete view, we analysed 3171 full-length HA sequences from human H1N1 viruses sampled from 1918 to 2016, and discovered that a new clade has formed with sequences isolated in Iran. Based on genetic distance calculations, we revealed an uneven evolutionary rate among sequences isolated in different years. We also found that the HA1 fragment of the new clade is like that of viruses that existed in the 1930s, while the HA2 fragment is closely associated with strains isolated after the 2009 pandemic. This new, "mixed" HA sequence indicates that a cryptic antigenic shift event occurred, and it should draw more attention to the new clade identified from sequences from Iran.
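The abstract does not say which genetic distance was used. As a minimal sketch, assuming the simplest choice (the uncorrected p-distance, i.e. the fraction of compared aligned sites that differ), a pairwise calculation could look like the following; the two short sequences are hypothetical.

```python
def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of differing sites between two aligned sequences, skipping gapped sites."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == gap or y == gap:
            continue  # a gap in either sequence makes the site uninformative here
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared

# Hypothetical aligned fragments: 9 comparable sites, 2 of them different
print(p_distance("ATGGCA-TTA", "ATGACAGTTC"))
```

Dividing such a distance by the time separating two isolates gives a crude per-year rate, which is one way an uneven rate across sampling years could be examined; model-corrected distances would be preferable for divergent sequences.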
Evolutionary relationships of flying foxes (genus Pteropus) in the Philippines inferred from DNA sequences of cytochrome b gene.

PubMed

Bastian, S T; Tanaka, K; Anunciado, R V P; Natural, N G; Sumalde, A C; Namikawa, T

2002-04-01

Six flying fox species, genus Pteropus (four from the Philippines), were investigated using complete cytochrome b gene sequences (1140 bp) to infer their evolutionary relationships. The DNA sequences generated via polymerase chain reaction were analyzed using the neighbor-joining, parsimony, and maximum likelihood methods. We estimated that the first evolutionary event among these Pteropus species occurred approximately 13.90 +/- 1.49 MYA. Within this short period of evolutionary time we further hypothesized that the ancestors of the flying foxes found in the Philippines experienced a subsequent diversification forming two clusters in the topology. The first cluster is composed of P. pumilus (Philippine endemic) and P. speciosus (restricted to western Mindanao) with P. scapulatus, while the second one comprises P. vampyrus and P. dasymallus, based on the analysis of first and second codon positions. Consistently, all phylogenetic analyses revealed a close association of P. dasymallus with P. vampyrus, contradicting the previous report categorizing P. dasymallus under the subniger species group with P. pumilus, P. speciosus, and P. hypomelanus. The Philippine endemic species (P. pumilus) is closely linked with P. speciosus. The representative samples of P. vampyrus showed a large genetic distance of 1.87%. The large genetic distance between P. dasymallus and P. hypomelanus, P. pumilus and P. speciosus denotes a distinct species group.
Promoter Motifs in NCLDVs: An Evolutionary Perspective

PubMed Central

Oliveira, Graziele Pereira; Andrade, Ana Cláudia dos Santos Pereira; Rodrigues, Rodrigo Araújo Lima; Arantes, Thalita Souza; Boratto, Paulo Victor Miranda; Silva, Ludmila Karen dos Santos; Dornas, Fábio Pio; Trindade, Giliane de Souza; Drumond, Betânia Paiva; La Scola, Bernard; Kroon, Erna Geessien; Abrahão, Jônatas Santos

2017-01-01

For many years, gene expression in the three cellular domains has been studied in an attempt to discover sequences associated with the regulation of the transcription process. Some specific transcriptional features were described in viruses, although few studies have been devoted to understanding the evolutionary aspects related to the spread of promoter motifs through related viral families. The discovery of giant viruses and the proposition of the new viral order Megavirales, which comprises a monophyletic group named nucleo-cytoplasmic large DNA viruses (NCLDV), raised new questions in the field. Some putative promoter sequences have already been described for some NCLDV members, bringing new insights into the evolutionary history of these complex microorganisms. In this review, we summarize the main aspects of the transcription regulation process in the three domains of life, followed by a systematic description of what is currently known about promoter regions in several NCLDVs. We also discuss how the analysis of the promoter sequences could bring new ideas about the giant viruses’ evolution. Finally, considering a possible common ancestor for the NCLDV group, we discuss possible evolutionary scenarios for promoters and propose the term “MEGA-box” to designate an ancestral promoter motif (‘TATATAAAATTGA’) that could have evolved gradually through nucleotide gain and loss and point mutations. PMID:28117683
Evolution and Vaccination of Influenza Virus.

PubMed

Lam, Ham Ching; Bi, Xuan; Sreevatsan, Srinand; Boley, Daniel

2017-08-01

In this study, we present an application paradigm in which an unsupervised machine learning approach is applied to high-dimensional influenza genetic sequences to investigate whether vaccination is a driving force in the evolution of influenza virus. We first used a visualization approach to visualize the evolutionary paths of vaccine-controlled and non-vaccine-controlled influenza viruses in a low-dimensional space. We then quantified the differences between their evolutionary trajectories by computing within- and between-scatter matrices, to provide statistical confidence in support of the visualization results. We used the influenza surface Hemagglutinin (HA) gene for this study as the HA gene is the major target of the immune system. The visualization is achieved without using any clustering methods or prior information about the influenza sequences. Our results clearly showed that the evolutionary trajectories of vaccine-controlled and non-vaccine-controlled influenza viruses are different, and that vaccine as an evolutionary driving force cannot be completely ruled out.
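For readers unfamiliar with scatter matrices, the following is a minimal sketch of the standard definitions (within-class scatter S_W and between-class scatter S_B, as used in discriminant analysis). It is not the authors' pipeline; the group names and the two-dimensional points are hypothetical stand-ins for embedded sequences.

```python
def mean_vector(points):
    """Component-wise mean of a list of equal-length points."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]

def outer(u, v):
    """Outer product of two vectors as a nested list."""
    return [[a * b for b in v] for a in u]

def add_into(acc, m, scale=1.0):
    """Accumulate scale * m into acc in place."""
    for i, row in enumerate(m):
        for j, val in enumerate(row):
            acc[i][j] += scale * val

def scatter_matrices(groups):
    """Return (S_W, S_B) for a dict mapping group label -> list of points."""
    all_points = [p for pts in groups.values() for p in pts]
    dim = len(all_points[0])
    grand = mean_vector(all_points)
    s_w = [[0.0] * dim for _ in range(dim)]
    s_b = [[0.0] * dim for _ in range(dim)]
    for pts in groups.values():
        mu = mean_vector(pts)
        for p in pts:
            d = [x - m for x, m in zip(p, mu)]
            add_into(s_w, outer(d, d))  # spread of points around their own group mean
        d = [m - g for m, g in zip(mu, grand)]
        add_into(s_b, outer(d, d), scale=len(pts))  # spread of group means around the grand mean
    return s_w, s_b

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

# Hypothetical 2-D embeddings of sequences from two groups
groups = {
    "vaccine_controlled": [(0, 0), (2, 0)],
    "non_vaccine_controlled": [(0, 4), (2, 4)],
}
s_w, s_b = scatter_matrices(groups)
print(trace(s_w), trace(s_b))
```

A large trace(S_B) relative to trace(S_W) indicates that the groups are well separated compared with their internal spread, which is the kind of summary such matrices are typically used for.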
The ConSurf-DB: pre-calculated evolutionary conservation profiles of protein structures

PubMed Central

Goldenberg, Ofir; Erez, Elana; Nimrod, Guy; Ben-Tal, Nir

2009-01-01

ConSurf-DB is a repository for evolutionary conservation analysis of the proteins of known structures in the Protein Data Bank (PDB). Sequence homologues of each of the PDB entries were collected and aligned using standard methods. The evolutionary conservation of each amino acid position in the alignment was calculated using the Rate4Site algorithm, implemented in the ConSurf web server. The algorithm takes into account the phylogenetic relations between the aligned proteins and the stochastic nature of the evolutionary process explicitly. Rate4Site assigns a conservation level for each position in the multiple sequence alignment using an empirical Bayesian inference. Visual inspection of the conservation patterns on the 3D structure often enables the identification of key residues that comprise the functionally important regions of the protein. The repository is updated with the latest PDB entries on a monthly basis and will be rebuilt annually. ConSurf-DB is available online at http://consurfdb.tau.ac.il/ PMID:18971256
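To illustrate what a per-position conservation profile is, here is a deliberately simplified sketch. It is not Rate4Site: it ignores phylogeny and uses plain Shannon entropy per alignment column (lower entropy means more conserved). The toy alignment is hypothetical.

```python
import math
from collections import Counter

def column_entropy(column, gap="-"):
    """Shannon entropy (bits) of the residues in one alignment column, ignoring gaps."""
    residues = [c for c in column if c != gap]
    if not residues:
        return None  # a column made only of gaps has no defined score
    counts = Counter(residues)
    n = len(residues)
    return -sum((k / n) * math.log2(k / n) for k in counts.values())

def conservation_profile(alignment):
    """Entropy for every column of a list of equal-length aligned sequences."""
    if len({len(s) for s in alignment}) != 1:
        raise ValueError("all sequences must have the same aligned length")
    return [column_entropy(col) for col in zip(*alignment)]

# Hypothetical four-sequence protein alignment
alignment = ["MKVL", "MKIL", "MRVL", "MKV-"]
profile = conservation_profile(alignment)
print(profile)
```

Unlike this entropy score, a phylogeny-aware method does not count closely related sequences as independent observations, which is why the abstract stresses the tree and the stochastic model.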
Evolutionary Dynamics and Diversity in Microbial Populations

NASA Astrophysics Data System (ADS)

Thompson, Joel; Fisher, Daniel

2013-03-01

Diseases such as flu and cancer adapt at an astonishing rate. In large part, viruses and cancers are so difficult to prevent because they are continually evolving. Controlling such "evolutionary diseases" requires a better understanding of the underlying evolutionary dynamics. It is conventionally assumed that adaptive mutations are rare and therefore will occur and sweep through the population in succession. Recent experiments using modern sequencing technologies have illuminated the many ways in which real population sequence data does not conform to the predictions of conventional theory. We consider a very simple model of asexual evolution and perform simulations in a range of parameters thought to be relevant for microbes and cancer. Simulation results reveal complex evolutionary dynamics typified by competition between lineages with different sets of adaptive mutations. This dynamical process leads to a distribution of mutant gene frequencies different than expected under the conventional assumption that adaptive mutations are rare. Simulated gene frequencies share several conspicuous features with data collected from laboratory-evolved yeast and the worldwide population of influenza.
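The abstract does not specify the model. As a rough sketch of what a "very simple model of asexual evolution" might look like, the code below assumes a Wright-Fisher population of fixed size in which every beneficial mutation multiplies fitness by (1 + s); it tracks only the number of mutations per individual, not distinct lineages, and all parameter values are hypothetical.

```python
import random
from collections import Counter

def simulate(pop_size=1000, s=0.05, mu=1e-3, generations=200, seed=0):
    """Return, per generation, a dict {number of beneficial mutations: count of individuals}."""
    rng = random.Random(seed)
    pop = Counter({0: pop_size})
    history = []
    for _ in range(generations):
        classes = list(pop)
        # Selection: parents are drawn in proportion to class size times fitness
        weights = [pop[k] * (1 + s) ** k for k in classes]
        parents = rng.choices(classes, weights=weights, k=pop_size)
        offspring = Counter()
        for k in parents:
            if rng.random() < mu:  # Mutation: gain one more beneficial mutation
                k += 1
            offspring[k] += 1
        pop = offspring
        history.append(dict(pop))
    return history

history = simulate()
print(history[-1])
```

When pop_size * mu is large, several mutation classes coexist and compete at once instead of sweeping one after another, which is the regime the abstract contrasts with the conventional rare-mutation assumption.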
The Evolutionary Status of Be Stars: Results from a Photometric Study of Southern Open Clusters

NASA Astrophysics Data System (ADS)

McSwain, M. Virginia; Gies, Douglas R.

2005-11-01

Be stars are a class of rapidly rotating B stars with circumstellar disks that cause Balmer and other line emission. There are three possible reasons for the rapid rotation of Be stars: they may have been born as rapid rotators, spun up by binary mass transfer, or spun up during the main-sequence (MS) evolution of B stars. To test the various formation scenarios, we have conducted a photometric survey of 55 open clusters in the southern sky. Of these, five clusters are probably not physically associated groups and our results for two other clusters are not reliable, but we identify 52 definite Be stars and an additional 129 Be candidates in the remaining clusters. We use our results to examine the age and evolutionary dependence of the Be phenomenon. We find an overall increase in the fraction of Be stars with age until 100 Myr, and Be stars are most common among the brightest, most massive B-type stars above the zero-age main sequence (ZAMS). We show that a spin-up phase at the terminal-age main sequence (TAMS) cannot produce the observed distribution of Be stars, but up to 73% of the Be stars detected may have been spun-up by binary mass transfer. Most of the remaining Be stars were likely rapid rotators at birth. Previous studies have suggested that low metallicity and high cluster density may also favor Be star formation. Our results indicate a possible increase in the fraction of Be stars with increasing cluster distance from the Galactic center (in environments of decreasing metallicity). However, the trend is not significant and could be ruled out due to the intrinsic scatter in our data. We also find no relationship between the fraction of Be stars and cluster density.
Microsatellite loci discovery from next-generation sequencing data and loci characterization in the epizoic barnacle Chelonibia testudinaria (Linnaeus, 1758)

PubMed Central

Zardus, John D.; Wares, John P.

2016-01-01

Microsatellite markers remain an important tool for ecological and evolutionary research, but are unavailable for many non-model organisms. One such organism with rare ecological and evolutionary features is the epizoic barnacle Chelonibia testudinaria (Linnaeus, 1758). Chelonibia testudinaria appears to be a host generalist, and has an unusual sexual system, androdioecy. Genetic studies on host specificity and mating behavior are impeded by the lack of fine-scale, highly variable markers, such as microsatellite markers. In the present study, we discovered thousands of new microsatellite loci from next-generation sequencing data, and characterized 12 loci thoroughly. We conclude that 11 of these loci will be useful markers in future ecological and evolutionary studies on C. testudinaria. PMID:27231653
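The abstract does not describe the discovery software used. As a minimal sketch of how tandem repeats can be located in sequence data, the following uses a regular expression with a back-reference; the motif-length and repeat-count thresholds and the example sequence are hypothetical.

```python
import re

def find_microsatellites(seq, min_motif=2, max_motif=6, min_repeats=5):
    """Return (start, motif, repeat_count) for perfect tandem repeats in a DNA string."""
    # The lazy quantifier tries the shortest motif first; \1 demands exact tandem copies.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq.upper()):
        motif = m.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAAAA
        hits.append((m.start(), motif, len(m.group(0)) // len(motif)))
    return hits

# Hypothetical read containing an (AC)7 and a (GAT)5 repeat
read = "GGT" + "AC" * 7 + "TTG" + "GAT" * 5 + "CC"
print(find_microsatellites(read))
```

Candidate loci found this way would still need flanking sequence for primer design and testing for variability, which is the characterization step the abstract reports for 12 loci.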
Evolutionary Dynamics of Microsatellite Distribution in Plants: Insight from the Comparison of Sequenced Brassica, Arabidopsis and Other Angiosperm Species

PubMed Central

Shi, Jiaqin; Huang, Shunmou; Fu, Donghui; Yu, Jinyin; Wang, Xinfa; Hua, Wei; Liu, Shengyi; Liu, Guihua; Wang, Hanzhong

2013-01-01

Despite their ubiquity and functional importance, microsatellites have been largely ignored in comparative genomics, mostly due to the lack of genomic information. In the current study, microsatellite distribution was characterized and compared in the whole genomes and both the coding and non-coding DNA sequences of the sequenced Brassica, Arabidopsis and other angiosperm species to investigate their evolutionary dynamics in plants. The variation in the microsatellite frequencies of these angiosperm species was much smaller than those for their microsatellite numbers and genome sizes, suggesting that microsatellite frequency may be relatively stable in plants. The microsatellite frequencies of these angiosperm species were significantly negatively correlated with both their genome sizes and transposable elements contents. The pattern of microsatellite distribution may differ according to the different genomic regions (such as coding and non-coding sequences). The observed differences in many important microsatellite characteristics (especially the distribution with respect to motif length, type and repeat number) of these angiosperm species were generally accordant with their phylogenetic distance, which suggested that the evolutionary dynamics of microsatellite distribution may be generally consistent with plant divergence/evolution. Importantly, by comparing these microsatellite characteristics (especially the distribution with respect to motif type) the angiosperm species (aside from a few species) all clustered into two obviously different groups that were largely represented by monocots and dicots, suggesting a complex and generally dichotomous evolutionary pattern of microsatellite distribution in angiosperms. 
Polyploidy may lead to a slight increase in microsatellite frequency in the coding sequences and a significant decrease in microsatellite frequency in the whole genome/non-coding sequences, but have little effect on the microsatellite distribution with respect to motif length, type and repeat number. Interestingly, several microsatellite characteristics seemed to be constant in plant evolution, which can be well explained by the general biological rules. PMID:23555856
Origin and early evolution of photosynthesis

NASA Technical Reports Server (NTRS)

Blankenship, R. E.

1992-01-01

Photosynthesis was well-established on the earth at least 3.5 thousand million years ago, and it is widely believed that these ancient organisms had similar metabolic capabilities to modern cyanobacteria. This requires that development of two photosystems and the oxygen evolution capability occurred very early in the earth's history, and that a presumed phase of evolution involving non-oxygen evolving photosynthetic organisms took place even earlier. The evolutionary relationships of the reaction center complexes found in all the classes of currently existing organisms have been analyzed using sequence analysis and biophysical measurements. The results indicate that all reaction centers fall into two basic groups, those with pheophytin and a pair of quinones as early acceptors, and those with iron sulfur clusters as early acceptors. No simple linear branching evolutionary scheme can account for the distribution patterns of reaction centers in existing photosynthetic organisms, and lateral transfer of genetic information is considered as a likely possibility. Possible scenarios for the development of primitive reaction centers into the heterodimeric protein structures found in existing reaction centers and for the development of organisms with two linked photosystems are presented.
Physical properties of high-mass star-forming clumps in different evolutionary stages from the Bolocam Galactic Plane Survey

NASA Astrophysics Data System (ADS)

Svoboda, Brian; Shirley, Yancy; Rosolowsky, Erik; Dunham, Miranda; Ellsworth-Bowers, Timothy; Ginsburg, Adam

2013-07-01

High mass stars play a key role in the physical and chemical evolution of the interstellar medium, yet the evolutionary sequence for high mass star forming regions is poorly understood. Recent Galactic plane surveys are providing the first systematic view of high-mass star-forming regions in all evolutionary phases across the Milky Way. We present observations of the 22.23 GHz H2O maser transition J(Ka,Kc) = 6(1,6)→5(2,3) toward 1398 clumps identified in the Bolocam Galactic Plane Survey using the 100m Green Bank Telescope (GBT). We detect 392 H2O masers, 279 (71%) newly discovered. We show that H2O masers can identify the presence of protostars which were not previously identified by Spitzer/MSX Galactic plane IR surveys: 25% of IR-dark clumps have an H2O maser. We compare the physical properties of the clumps in the Bolocam Galactic Plane Survey (BGPS) with observations of diagnostics of star formation activity: 8 and 24 μm YSO candidates, H2O and CH3OH masers, shocked H2, EGOs, and UCHII regions. We identify a sub-sample of 400 clumps with no star formation indicators representing the largest and most robust sample of pre-protocluster candidates from an unbiased survey to date. The different evolutionary stages show strong separations in HCO+ linewidth and integrated intensity, surface mass density, and kinetic temperature. Monte Carlo techniques are applied to distance probability distribution functions (DPDFs) in order to marginalize over the kinematic distance ambiguity and calculate the distribution of derived quantities for clumps in different evolutionary stages. Surface area and dust mass show weak separations above 2 pc^2 and 3x10^3 solar masses. An observed breakdown occurs in the size-linewidth relationship with no differentiation by evolutionary stage. Future work includes adding evolutionary indicators (MIPSGAL, HiGal, MMB) and expanding DPDF priors (HI self-absorption, Galactic structure) to better resolve kinematic distance ambiguities (KDAs).
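The Monte Carlo marginalization mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the survey's code: the DPDF is a hypothetical two-peaked function on a distance grid (near and far kinematic distances), and the derived quantity is a dust mass that scales with distance squared, normalized by a hypothetical mass at 1 kpc.

```python
import math
import random

def gaussian(x, mu, sigma):
    """Unnormalized Gaussian profile."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2)

def sample_masses(grid_kpc, dpdf, mass_at_1kpc, n=10000, seed=0):
    """Draw distances from the DPDF and propagate each draw to a mass (mass scales as d^2)."""
    rng = random.Random(seed)
    distances = rng.choices(grid_kpc, weights=dpdf, k=n)
    return sorted(mass_at_1kpc * d ** 2 for d in distances)

def percentile(sorted_values, q):
    """Value at quantile q (0..1) of an already sorted list."""
    idx = min(len(sorted_values) - 1, int(q * len(sorted_values)))
    return sorted_values[idx]

# Hypothetical DPDF: 70% weight near 4 kpc, 30% near 11 kpc
grid_kpc = [0.1 * i for i in range(1, 201)]
dpdf = [0.7 * gaussian(d, 4.0, 0.5) + 0.3 * gaussian(d, 11.0, 0.5) for d in grid_kpc]
masses = sample_masses(grid_kpc, dpdf, mass_at_1kpc=50.0)
print(percentile(masses, 0.16), percentile(masses, 0.5), percentile(masses, 0.84))
```

Because every draw carries the full distance uncertainty, the resulting mass distribution is bimodal when the near/far ambiguity is unresolved, rather than being forced onto a single assumed distance.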
Evolutionary Distance of Amino Acid Sequence Orthologs across Macaque Subspecies: Identifying Candidate Genes for SIV Resistance in Chinese Rhesus Macaques

PubMed Central

Ross, Cody T.; Roodgar, Morteza; Smith, David Glenn

2015-01-01

We use the Reciprocal Smallest Distance (RSD) algorithm to identify amino acid sequence orthologs in the Chinese and Indian rhesus macaque draft sequences and estimate the evolutionary distance between such orthologs. We then use GOanna to map gene function annotations and human gene identifiers to the rhesus macaque amino acid sequences. We conclude methodologically by cross-tabulating a list of amino acid orthologs with large divergence scores with a list of genes known to be involved in SIV or HIV pathogenesis. We find that many of the amino acid sequences with large evolutionary divergence scores, as calculated by the RSD algorithm, have been shown to be related to HIV pathogenesis in previous laboratory studies. Four of the strongest candidate genes for SIVmac resistance in Chinese rhesus macaques identified in this study are CDK9, CXCL12, TRIM21, and TRIM32. Additionally, ANKRD30A, CTSZ, GORASP2, GTF2H1, IL13RA1, MUC16, NMDAR1, Notch1, NT5M, PDCD5, RAD50, and TM9SF2 were identified as possible candidates, among others. We failed to find many laboratory experiments contrasting the effects of Indian and Chinese orthologs at these sites on SIVmac pathogenesis, but future comparative studies might hold fertile ground for research into the biological mechanisms underlying innate resistance to SIVmac in Chinese rhesus macaques. PMID:25884674
Incorporating evolution of transcription factor binding sites into annotated alignments.

PubMed

Bais, Abha S; Grossmann, Stefen; Vingron, Martin

2007-08-01

Identifying transcription factor binding sites (TFBSs) is essential to elucidate putative regulatory mechanisms. A common strategy is to combine cross-species conservation with single sequence TFBS annotation to yield "conserved TFBSs". Most current methods in this field adopt a multi-step approach that segregates the two aspects. Again, it is widely accepted that the evolutionary dynamics of binding sites differ from those of the surrounding sequence. Hence, it is desirable to have an approach that explicitly takes this factor into account. Although a plethora of approaches have been proposed for the prediction of conserved TFBSs, very few explicitly model TFBS evolutionary properties, while additionally being multi-step. Recently, we introduced a novel approach to simultaneously align and annotate conserved TFBSs in a pair of sequences. Building upon the standard Smith-Waterman algorithm for local alignments, SimAnn introduces additional states for profiles to output extended alignments or annotated alignments. That is, alignments with parts annotated as gaplessly aligned TFBSs (pair-profile hits) are generated. Moreover, the pair-profile related parameters are derived in a sound statistical framework. In this article, we extend this approach to explicitly incorporate evolution of binding sites in the SimAnn framework. We demonstrate the extension in the theoretical derivations through two position-specific evolutionary models, previously used for modelling TFBS evolution. In a simulated setting, we provide a proof of concept that the approach works given the underlying assumptions, as compared to the original work. Finally, using a real dataset of experimentally verified binding sites in human-mouse sequence pairs, we compare the new approach (eSimAnn) to an existing multi-step tool that also considers TFBS evolution.
Although it is widely accepted that binding sites evolve differently from the surrounding sequences, most comparative TFBS identification methods do not explicitly consider this. Additionally, prediction of conserved binding sites is carried out in a multi-step approach that segregates alignment from TFBS annotation. In this paper, we demonstrate how the simultaneous alignment and annotation approach of SimAnn can be further extended to incorporate TFBS evolutionary relationships. We study how alignments and binding site predictions interplay at varying evolutionary distances and for various profile qualities.
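For orientation, the standard Smith-Waterman local alignment that SimAnn is said to build upon can be sketched as below. This is the plain textbook algorithm with a linear gap penalty, without the additional profile states of SimAnn or eSimAnn; the scoring values and the two sequences are hypothetical.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best score, aligned part of a, aligned part of b) for a local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    # Fill the score matrix; a cell never drops below zero, which makes the alignment local.
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    # Trace back from the best cell until a zero-score cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))

print(smith_waterman("TTTACGTAAA", "GGACGTCC"))
```

An annotated-alignment method adds further states to this recursion so that a stretch can be scored as a gapless binding-site hit rather than as ordinary aligned sequence.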
The genomic landscape of rapid, repeated evolutionary rescue from toxic pollution in wild fish

USDA-ARS's Scientific Manuscript database

Here we describe evolutionary rescue from intense pollution via multiple modes of selection in killifish populations from 4 urban estuaries of the US eastern seaboard. Comparative transcriptomics and analysis of 384 whole genome sequences show that the functioning of a receptor-based signaling pathw...
Analysis of evolutionary patterns of genes in Campylobacter jejuni and C. coli

USDA-ARS's Scientific Manuscript database

Background: In order to investigate the population genetics structure of thermophilic Campylobacter spp., we extracted a set of 1029 core gene families (CGF) from 25 sequenced genomes of C. jejuni, C. coli and C. lari. Based on these CGFs we employed different approaches to reveal the evolutionary ...

Application of resequencing to rice genomics, functional genomics and evolutionary analysis

PubMed Central

2014-01-01

Rice is a model system used for crop genomics studies. The completion of the rice genome draft sequences in 2002 not only accelerated functional genome studies, but also initiated a new era of resequencing rice genomes. Based on the reference genome in rice, next-generation sequencing (NGS) using the high-throughput sequencing system can efficiently accomplish whole genome resequencing of various genetic populations and diverse germplasm resources. Resequencing technology has been effectively utilized in evolutionary analysis, rice genomics and functional genomics studies. This technique is beneficial for both bridging the knowledge gap between genotype and phenotype and facilitating molecular breeding via gene design in rice. Here, we also discuss the limitation, application and future prospects of rice resequencing. PMID:25006357
Datamonkey 2.0: a modern web application for characterizing selective and other evolutionary processes.

PubMed

Weaver, Steven; Shank, Stephen D; Spielman, Stephanie J; Li, Michael; Muse, Spencer V; Kosakovsky Pond, Sergei L

2018-01-02

Inference of how evolutionary forces have shaped extant genetic diversity is a cornerstone of modern comparative sequence analysis. Advances in sequence generation and increased statistical sophistication of relevant methods now allow researchers to extract ever more evolutionary signal from the data, albeit at an increased computational cost. Here, we announce the release of Datamonkey 2.0, a completely re-engineered version of the Datamonkey web-server for analyzing evolutionary signatures in sequence data. For this endeavor, we leveraged recent developments in open-source libraries that facilitate interactive, robust, and scalable web application development. Datamonkey 2.0 provides a carefully curated collection of methods for interrogating coding-sequence alignments for imprints of natural selection, packaged as a responsive (i.e. can be viewed on tablet and mobile devices), fully interactive, and API-enabled web application. To complement Datamonkey 2.0, we additionally release HyPhy Vision, an accompanying JavaScript application for visualizing analysis results. HyPhy Vision can also be used separately from Datamonkey 2.0 to visualize locally-executed HyPhy analyses. Together, Datamonkey 2.0 and HyPhy Vision showcase how scientific software development can benefit from general-purpose open-source frameworks. Datamonkey 2.0 is freely and publicly available at http://www.datamonkey.org, and the underlying codebase is available from https://github.com/veg/datamonkey-js. © The Author 2018. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Revealing Less Derived Nature of Cartilaginous Fish Genomes with Their Evolutionary Time Scale Inferred with Nuclear Genes

PubMed Central

Renz, Adina J.; Meyer, Axel; Kuraku, Shigehiro

2013-01-01

Cartilaginous fishes, divided into Holocephali (chimaeras) and Elasmobranchii (sharks, rays and skates), occupy a key phylogenetic position among extant vertebrates in reconstructing their evolutionary processes. Their accurate evolutionary time scale is indispensable for better understanding of the relationship between phenotypic and molecular evolution of cartilaginous fishes. However, our current knowledge on the time scale of cartilaginous fish evolution largely relies on estimates using mitochondrial DNA sequences. In this study, making the best use of the still partial, but large-scale sequencing data of cartilaginous fish species, we estimate the divergence times between the major cartilaginous fish lineages employing nuclear genes. By rigorous orthology assessment based on available genomic and transcriptomic sequence resources for cartilaginous fishes, we selected 20 protein-coding genes in the nuclear genome, spanning 2973 amino acid residues. Our analysis based on the Bayesian inference resulted in the mean divergence time of 421 Ma, the late Silurian, for the Holocephali-Elasmobranchii split, and 306 Ma, the late Carboniferous, for the split between sharks and rays/skates. By applying these results and other documented divergence times, we measured the relative evolutionary rate of the Hox A cluster sequences in the cartilaginous fish lineages, which resulted in a lower substitution rate with a factor of at least 2.4 in comparison to tetrapod lineages. The obtained time scale enables mapping phenotypic and molecular changes in a quantitative framework. It is of great interest to corroborate the less derived nature of cartilaginous fish at the molecular level as a genome-wide phenomenon. PMID:23825540
Revealing less derived nature of cartilaginous fish genomes with their evolutionary time scale inferred with nuclear genes.

PubMed

Renz, Adina J; Meyer, Axel; Kuraku, Shigehiro

2013-01-01

Cartilaginous fishes, divided into Holocephali (chimaeras) and Elasmobranchii (sharks, rays and skates), occupy a key phylogenetic position among extant vertebrates in reconstructing their evolutionary processes. Their accurate evolutionary time scale is indispensable for better understanding of the relationship between phenotypic and molecular evolution of cartilaginous fishes. However, our current knowledge on the time scale of cartilaginous fish evolution largely relies on estimates using mitochondrial DNA sequences. In this study, making the best use of the still partial, but large-scale sequencing data of cartilaginous fish species, we estimate the divergence times between the major cartilaginous fish lineages employing nuclear genes. By rigorous orthology assessment based on available genomic and transcriptomic sequence resources for cartilaginous fishes, we selected 20 protein-coding genes in the nuclear genome, spanning 2973 amino acid residues. Our analysis based on the Bayesian inference resulted in the mean divergence time of 421 Ma, the late Silurian, for the Holocephali-Elasmobranchii split, and 306 Ma, the late Carboniferous, for the split between sharks and rays/skates. By applying these results and other documented divergence times, we measured the relative evolutionary rate of the Hox A cluster sequences in the cartilaginous fish lineages, which resulted in a lower substitution rate with a factor of at least 2.4 in comparison to tetrapod lineages. The obtained time scale enables mapping phenotypic and molecular changes in a quantitative framework. It is of great interest to corroborate the less derived nature of cartilaginous fish at the molecular level as a genome-wide phenomenon.
OrthoMaM v8: a database of orthologous exons and coding sequences for comparative genomics in mammals.

PubMed

Douzery, Emmanuel J P; Scornavacca, Celine; Romiguier, Jonathan; Belkhir, Khalid; Galtier, Nicolas; Delsuc, Frédéric; Ranwez, Vincent

2014-07-01

Comparative genomic studies extensively rely on alignments of orthologous sequences. Yet, selecting, gathering, and aligning orthologous exons and protein-coding sequences (CDS) that are relevant for a given evolutionary analysis can be a difficult and time-consuming task. In this context, we developed OrthoMaM, a database of ORTHOlogous MAmmalian Markers describing the evolutionary dynamics of orthologous genes in mammalian genomes using a phylogenetic framework. Since its first release in 2007, OrthoMaM has regularly evolved, not only to include newly available genomes but also to incorporate up-to-date software in its analytic pipeline. This eighth release integrates the 40 complete mammalian genomes available in Ensembl v73 and provides alignments, phylogenies, evolutionary descriptor information, and functional annotations for 13,404 single-copy orthologous CDS and 6,953 long exons. The graphical interface allows users to easily explore OrthoMaM to identify markers with specific characteristics (e.g., taxa availability, alignment size, %G+C, evolutionary rate, chromosome location). It hence provides an efficient solution to sample preprocessed markers adapted to user-specific needs. OrthoMaM has proven to be a valuable resource for researchers interested in mammalian phylogenomics and evolutionary genomics, and has served as a source of benchmark empirical data sets in several methodological studies. OrthoMaM is available for browsing, query and complete or filtered downloads at http://www.orthomam.univ-montp2.fr/. © The Author 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Analytical model for minority games with evolutionary learning

NASA Astrophysics Data System (ADS)

Campos, Daniel; Méndez, Vicenç; Llebot, Josep E.; Hernández, Germán A.

2010-06-01

In a recent work [D. Campos, J.E. Llebot, V. Méndez, Theor. Popul. Biol. 74 (2009) 16] we have introduced a biological version of the Evolutionary Minority Game that tries to reproduce the intraspecific competition for limited resources in an ecosystem. In comparison with the complex decision-making mechanisms used in standard Minority Games, only two extremely simple strategies (juveniles and adults) are accessible to the agents. Complexity is introduced instead through an evolutionary learning rule that allows younger agents to learn to make better decisions. We find that this game shows many of the typical properties found for Evolutionary Minority Games, like self-segregation behavior or the existence of an oscillation phase for a certain range of the parameter values. However, an analytical treatment becomes much easier in our case, taking advantage of the simple strategies considered. Using a model consisting of a simple dynamical system, the phase diagram of the game (which differentiates three phases: adults crowd, juveniles crowd and oscillations) is reproduced.
Analysis of 41 plant genomes supports a wave of successful genome duplications in association with the Cretaceous–Paleogene boundary

PubMed Central

Vanneste, Kevin; Baele, Guy; Maere, Steven; Van de Peer, Yves

2014-01-01

Ancient whole-genome duplications (WGDs), also referred to as paleopolyploidizations, have been reported in most evolutionary lineages. Their attributed role remains a major topic of discussion, ranging from an evolutionary dead end to a road toward evolutionary success, with evidence supporting both fates. Previously, based on dating WGDs in a limited number of plant species, we found a clustering of angiosperm paleopolyploidizations around the Cretaceous–Paleogene (K–Pg) extinction event about 66 million years ago. Here we revisit this finding, which has proven controversial, by combining genome sequence information for many more plant lineages and using more sophisticated analyses. We include 38 full genome sequences and three transcriptome assemblies in a Bayesian evolutionary analysis framework that incorporates uncorrelated relaxed clock methods and fossil uncertainty. In accordance with earlier findings, we demonstrate a strongly nonrandom pattern of genome duplications over time with many WGDs clustering around the K–Pg boundary. We interpret these results in the context of recent studies on invasive polyploid plant species, and suggest that polyploid establishment is promoted during times of environmental stress. We argue that considering the evolutionary potential of polyploids in light of the environmental and ecological conditions present around the time of polyploidization could mitigate the stark contrast in the proposed evolutionary fates of polyploids. PMID:24835588
Historian: accurate reconstruction of ancestral sequences and evolutionary rates.

PubMed

Holmes, Ian H

2017-04-15

Reconstruction of ancestral sequence histories, and estimation of parameters like indel rates, are improved by using explicit evolutionary models and summing over uncertain alignments. The previous best tool for this purpose (according to simulation benchmarks) was ProtPal, but this tool was too slow for practical use. Historian combines an efficient reimplementation of the ProtPal algorithm with performance-improving heuristics from other alignment tools. Simulation results on fidelity of rate estimation via ancestral reconstruction, along with evaluations on the structurally informed alignment dataset BAliBase 3.0, recommend Historian over other alignment tools for evolutionary applications. Historian is available at https://github.com/evoldoers/historian under the Creative Commons Attribution 3.0 US license. ihholmes+historian@gmail.com. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
rbcL gene sequences provide evidence for the evolutionary lineages of leptosporangiate ferns.

PubMed

Hasebe, M; Omori, T; Nakazawa, M; Sano, T; Kato, M; Iwatsuki, K

1994-06-07

Pteridophytes have a longer evolutionary history than any other vascular land plant and, therefore, have endured greater loss of phylogenetically informative information. This factor has resulted in substantial disagreements in evaluating characters and, thus, controversy in establishing a stable classification. To compare competing classifications, we obtained DNA sequences of a chloroplast gene. The sequence of 1206 nt of the large subunit of the ribulose-bisphosphate carboxylase gene (rbcL) was determined from 58 species, representing almost all families of leptosporangiate ferns. Phylogenetic trees were inferred by the neighbor-joining and the parsimony methods. The two methods produced almost identical phylogenetic trees that provided insights concerning major general evolutionary trends in the leptosporangiate ferns. Interesting findings were as follows: (i) two morphologically distinct heterosporous water ferns, Marsilea and Salvinia, are sister genera; (ii) the tree ferns (Cyatheaceae, Dicksoniaceae, and Metaxyaceae) are monophyletic; and (iii) polypodioids are distantly related to the gleichenioids in spite of the similarity of their exindusiate soral morphology and are close to the higher indusiate ferns. In addition, the affinities of several "problematic genera" were assessed.
Evolutionary origin and phylogeny of the modern holocephalans (Chondrichthyes: Chimaeriformes): a mitogenomic perspective.

PubMed

Inoue, Jun G; Miya, Masaki; Lam, Kevin; Tay, Boon-Hui; Danks, Janine A; Bell, Justin; Walker, Terrence I; Venkatesh, Byrappa

2010-11-01

With our increasing ability for generating whole-genome sequences, comparative analysis of whole genomes has become a powerful tool for understanding the structure, function, and evolutionary history of human and other vertebrate genomes. By virtue of their position basal to bony vertebrates, cartilaginous fishes (class Chondrichthyes) are a valuable outgroup in comparative studies of vertebrates. Recently, a holocephalan cartilaginous fish, the elephant shark, Callorhinchus milii (Subclass Holocephali: Order Chimaeriformes), has been proposed as a model genome, and low-coverage sequence of its genome has been generated. Despite such an increasing interest, the evolutionary history of the modern holocephalans (a previously successful and diverse group but represented by only 39 extant species) and their relationship with elasmobranchs and other jawed vertebrates has been poorly documented largely owing to a lack of well-preserved fossil materials after the end-Permian about 250 Ma. In this study, we assembled the whole mitogenome sequences for eight representatives from all the three families of the modern holocephalans and investigated their phylogenetic relationships and evolutionary history. Unambiguously aligned sequences from these holocephalans together with 17 other vertebrates (9,409 nt positions excluding entire third codon positions) were subjected to partitioned maximum likelihood analysis. The resulting tree strongly supported a single origin of the modern holocephalans and their sister-group relationship with elasmobranchs. The mitogenomic tree recovered the most basal callorhinchids within the chimaeriforms, which is sister to a clade comprising the remaining two families (rhinochimaerids and chimaerids).
The timetree derived from a relaxed molecular clock Bayesian method suggests that the holocephalans originated in the Silurian about 420 Ma, having survived from the end-Permian (250 Ma) mass extinction and undergoing familial diversifications during the late Jurassic to early Cretaceous (170-120 Ma). This postulated evolutionary scenario agrees well with that based on the paleontological observations.
The physics of evolution

NASA Astrophysics Data System (ADS)

Eigen, Manfred

1988-12-01

The Darwinian concept of evolution through natural selection has been revised and put on a solid physical basis, in a form which applies to self-replicable macromolecules. Two new concepts are introduced: sequence space and quasi-species. Evolutionary change in the DNA- or RNA-sequence of a gene can be mapped as a trajectory in a sequence space of dimension ν, where ν corresponds to the number of changeable positions in the genomic sequence. Emphasis, however, is shifted from the single surviving wild-type, a single point in the sequence space, to the complex structure of the mutant distribution that constitutes the quasi-species. Selection is equivalent to an establishment of the quasi-species in a localized region of sequence space, subject to threshold conditions for the error rate and sequence length. Arrival of a new mutant may violate the local threshold condition and thereby lead to a displacement of the quasi-species into a different region of sequence space. This transformation is similar to a phase transition; the dynamical equations that describe the quasi-species have been shown to be analogous to those of the two-dimensional Ising model of ferromagnetism. The occurrence of a selectively advantageous mutant is biased by the particulars of the quasi-species distribution, whose mutants are populated according to their fitness relative to that of the wild-type. Inasmuch as fitness regions are connected (like mountain ridges) the evolutionary trajectory is guided to regions of optimal fitness. Evolution experiments in test tubes confirm this modification of the simple chance and law nature of the Darwinian concept. The results of the theory can also be applied to the construction of a machine that provides optimal conditions for a rapid evolution of functionally active macromolecules.
An introduction to the physics of molecular evolution by the author has appeared recently (Ref. 1). Detailed studies of the kinetics and mechanisms of replication of RNA, the most likely candidate for early evolution (Refs. 2 and 3), and of the implications on natural selection have been given in Refs. 4 and 5. The quasi-species model has been constructed in Refs. 6 and 7 using the concept of sequence space. Subsequently various methods have been invented to elucidate this concept and to relate it to the theory of critical phenomena (Refs. 8-19). The instability of the quasi-species at the error threshold is discussed in Ref. 10. Evolution experiments with RNA strands in test tubes are described in Refs. 21 and 22.
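The threshold condition linking error rate and sequence length can be illustrated with the textbook single-peak approximation, in which a master sequence of length ν is copied with per-site fidelity q and out-competes its mutants by a superiority factor σ, so that it persists only while σ·q^ν > 1. The abstract itself gives no formula; the symbols q and σ, the function names, and the numbers below are assumptions introduced for this sketch.

```python
import math


def master_persists(length, per_site_fidelity, superiority):
    """True if the master sequence is maintained in the single-peak approximation.

    Condition: superiority * per_site_fidelity**length > 1
    (all three parameters are illustrative assumptions).
    """
    return superiority * per_site_fidelity ** length > 1.0


def max_length(per_site_fidelity, superiority):
    """Largest length for which the condition above still holds.

    Solves superiority * q**n = 1 for n, i.e. n = ln(superiority) / -ln(q).
    """
    return math.log(superiority) / -math.log(per_site_fidelity)


if __name__ == "__main__":
    # With 1% error per site and a tenfold advantage, short sequences persist
    # and long ones do not.
    print(master_persists(100, 0.99, 10.0))
    print(master_persists(1000, 0.99, 10.0))
    print(round(max_length(0.99, 10.0), 1))
```

For small error rates, -ln(q) is close to 1 - q, which gives the commonly quoted form of the limit: the maximum length is roughly ln(σ) divided by the per-site error rate.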
Transmissible cancers in an evolutionary context.

PubMed

Ujvari, Beata; Papenfuss, Anthony T; Belov, Katherine

2016-07-01

Cancer is an evolutionary and ecological process in which complex interactions between tumour cells and their environment share many similarities with organismal evolution. Tumour cells with highest adaptive potential have a selective advantage over less fit cells. Naturally occurring transmissible cancers provide an ideal model system for investigating the evolutionary arms race between cancer cells and their surrounding micro-environment and macro-environment. However, the evolutionary landscapes in which contagious cancers reside have not been subjected to comprehensive investigation. Here, we provide a multifocal analysis of transmissible tumour progression and discuss the selection forces that shape it. We demonstrate that transmissible cancers adapt to both their micro-environment and macro-environment, and evolutionary theories applied to organisms are also relevant to these unique diseases. The three naturally occurring transmissible cancers, canine transmissible venereal tumour (CTVT), Tasmanian devil facial tumour disease (DFTD) and the recently discovered clam leukaemia, exhibit different evolutionary phases: (i) CTVT, the oldest naturally occurring cell line, is remarkably stable; (ii) DFTD exhibits the signs of stepwise cancer evolution; and (iii) clam leukaemia shows genetic instability. While all three contagious cancers carry the signature of ongoing and fairly recent adaptations to selective forces, CTVT appears to have reached an evolutionary stalemate with its host, while DFTD and the clam leukaemia appear to be still at a more dynamic phase of their evolution. Parallel investigation of contagious cancer genomes and transcriptomes and of their micro-environment and macro-environment could shed light on the selective forces shaping tumour development at different time points: during the progressive phase and at the endpoint.
A greater understanding of transmissible cancers from an evolutionary ecology perspective will provide novel avenues for the prevention and treatment of both contagious and non-communicable cancers. © 2016 The Authors. BioEssays published by WILEY Periodicals, Inc.
Sequence Search and Comparative Genomic Analysis of SUMO-Activating Enzymes Using CoGe.

PubMed

Carretero-Paulet, Lorenzo; Albert, Victor A

2016-01-01

The growing number of genome sequences completed during the last few years has made necessary the development of bioinformatics tools for the easy access and retrieval of sequence data, as well as for downstream comparative genomic analyses. Some of these are implemented as online platforms that integrate genomic data produced by different genome sequencing initiatives with data mining tools as well as various comparative genomic and evolutionary analysis possibilities. Here, we use the online comparative genomics platform CoGe (http://www.genomevolution.org/coge/) (Lyons and Freeling. Plant J 53:661-673, 2008; Tang and Lyons. Front Plant Sci 3:172, 2012) (1) to retrieve the entire complement of orthologous and paralogous genes belonging to the SUMO-Activating Enzymes 1 (SAE1) gene family from a set of species representative of the Brassicaceae plant eudicot family with genomes fully sequenced, and (2) to investigate the history, timing, and molecular mechanisms of the gene duplications driving the evolutionary expansion and functional diversification of the SAE1 family in Brassicaceae.
Covariant Evolutionary Event Analysis for Base Interaction Prediction Using a Relational Database Management System for RNA.

PubMed

Xu, Weijia; Ozer, Stuart; Gutell, Robin R

2009-01-01

With an increasingly large amount of sequences properly aligned, comparative sequence analysis can accurately identify not only common structures formed by standard base pairing but also new types of structural elements and constraints. However, traditional methods are too computationally expensive to perform well on large scale alignment and less effective with the sequences from diversified phylogenetic classifications. We propose a new approach that utilizes coevolutional rates among pairs of nucleotide positions using phylogenetic and evolutionary relationships of the organisms of aligned sequences. With a novel data schema to manage relevant information within a relational database, our method, implemented with a Microsoft SQL Server 2005, showed 90% sensitivity in identifying base pair interactions among 16S ribosomal RNA sequences from Bacteria, at a scale 40 times bigger and 50% better sensitivity than a previous study. The results also indicated covariation signals for a few sets of cross-strand base stacking pairs in secondary structure helices, and other subtle constraints in the RNA structure.
Covariant Evolutionary Event Analysis for Base Interaction Prediction Using a Relational Database Management System for RNA

PubMed Central

Xu, Weijia; Ozer, Stuart; Gutell, Robin R.

2010-01-01

With an increasingly large amount of sequences properly aligned, comparative sequence analysis can accurately identify not only common structures formed by standard base pairing but also new types of structural elements and constraints. However, traditional methods are too computationally expensive to perform well on large scale alignment and less effective with the sequences from diversified phylogenetic classifications. We propose a new approach that utilizes coevolutional rates among pairs of nucleotide positions using phylogenetic and evolutionary relationships of the organisms of aligned sequences. With a novel data schema to manage relevant information within a relational database, our method, implemented with a Microsoft SQL Server 2005, showed 90% sensitivity in identifying base pair interactions among 16S ribosomal RNA sequences from Bacteria, at a scale 40 times bigger and 50% better sensitivity than a previous study. The results also indicated covariation signals for a few sets of cross-strand base stacking pairs in secondary structure helices, and other subtle constraints in the RNA structure. PMID:20502534
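As a point of comparison for the covariation signals mentioned above, a simpler and widely used measure is the mutual information between two alignment columns. This is not the phylogeny-aware event-counting method or the relational database schema described in the abstract; it is a swapped-in textbook technique, and the function name and toy alignment are assumptions made for the sketch.

```python
import math
from collections import Counter


def column_mutual_information(alignment, i, j):
    """Mutual information (in bits) between columns i and j of an alignment.

    `alignment` is a list of equal-length sequences; gaps are treated as
    ordinary symbols in this simplified sketch.
    """
    n = len(alignment)
    count_i = Counter(seq[i] for seq in alignment)
    count_j = Counter(seq[j] for seq in alignment)
    count_ij = Counter((seq[i], seq[j]) for seq in alignment)
    mi = 0.0
    for (x, y), c in count_ij.items():
        p_xy = c / n
        p_x = count_i[x] / n
        p_y = count_j[y] / n
        mi += p_xy * math.log2(p_xy / (p_x * p_y))
    return mi


if __name__ == "__main__":
    # Toy alignment: columns 0 and 2 always form a complementary pair,
    # while column 1 is invariant.
    toy = ["GAC", "CAG", "AAU", "UAA"]
    print(column_mutual_information(toy, 0, 2))
    print(column_mutual_information(toy, 0, 1))
```

Mutual information ignores the phylogeny, so closely related sequences are counted as independent observations; accounting for evolutionary relationships, as the abstract describes, is one way to address that weakness.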
Origin and Reticulate Evolutionary Process of Wheatgrass Elymus trachycaulus (Triticeae: Poaceae)

PubMed Central

Zuo, Hongwei; Wu, Panpan; Wu, Dexiang; Sun, Genlou

2015-01-01

To study the origin and evolutionary dynamics of tetraploid Elymus trachycaulus, which has been cytologically defined as containing StH genomes, thirteen accessions of E. trachycaulus were analyzed using two low-copy nuclear genes, Pepc (phosphoenolpyruvate carboxylase) and Rpb2 (the second largest subunit of RNA polymerase II), and one chloroplast region trnL–trnF (spacer between the tRNA-Leu (UAA) gene and the tRNA-Phe (GAA) gene). Our chloroplast data indicated that Pseudoroegneria (St genome) was the maternal donor of E. trachycaulus. Rpb2 data indicated that the St genome in E. trachycaulus originated from either P. strigosa, P. stipifolia, P. spicata or P. geniculate. The Hordeum (H genome)-like sequences of E. trachycaulus are polyphyletic in the Pepc tree, suggesting that the H genome in E. trachycaulus was contributed by multiple sources, whether due to multiple origins or introgression resulting from subsequent hybridization. Failure to recover the St copy of the Pepc sequence in most accessions of E. trachycaulus might be caused by genome convergent evolution in allopolyploids. Multiple copies of H-like Pepc sequence from each accession with relatively large deletions and insertions might be caused by either instability of the Pepc sequence in the H genome or incomplete concerted evolution. Our results highlighted the complex evolutionary history of E. trachycaulus. PMID:25946188
Evolution of thermotolerance in hot spring cyanobacteria of the genus Synechococcus

NASA Technical Reports Server (NTRS)

Miller, S. R.; Castenholz, R. W.

2000-01-01

The extension of ecological tolerance limits may be an important mechanism by which microorganisms adapt to novel environments, but it may come at the evolutionary cost of reduced performance under ancestral conditions. We combined a comparative physiological approach with phylogenetic analyses to study the evolution of thermotolerance in hot spring cyanobacteria of the genus Synechococcus. Among the 20 laboratory clones of Synechococcus isolated from collections made along an Oregon hot spring thermal gradient, four different 16S rRNA gene sequences were identified. Phylogenies constructed by using the sequence data indicated that the clones were polyphyletic but that three of the four sequence groups formed a clade. Differences in thermotolerance were observed for clones with different 16S rRNA gene sequences, and comparison of these physiological differences within a phylogenetic framework provided evidence that more thermotolerant lineages of Synechococcus evolved from less thermotolerant ancestors. The extension of the thermal limit in these bacteria was correlated with a reduction in the breadth of the temperature range for growth, which provides evidence that enhanced thermotolerance has come at the evolutionary cost of increased thermal specialization. This study illustrates the utility of using phylogenetic comparative methods to investigate how evolutionary processes have shaped historical patterns of ecological diversification in microorganisms.
Dynamics, morphogenesis and convergence of evolutionary quantum Prisoner's Dilemma games on networks

PubMed Central

Yong, Xi

2016-01-01

The authors proposed a quantum Prisoner's Dilemma (PD) game as a natural extension of the classic PD game to resolve the dilemma. Here, we establish a new Nash equilibrium principle of the game, propose the notion of convergence and discover the convergence and phase-transition phenomena of the evolutionary games on networks. We investigate the many-body extension of the game or evolutionary games in networks. For homogeneous networks, we show that entanglement guarantees a quick convergence of super cooperation, that there is a phase transition from the convergence of defection to the convergence of super cooperation, and that the threshold for the phase transitions is principally determined by the Nash equilibrium principle of the game, with an accompanying perturbation by the variations of structures of networks. For heterogeneous networks, we show that the equilibrium frequencies of super-cooperators are divergent, that entanglement guarantees emergence of super-cooperation and that there is a phase transition of the emergence with the threshold determined by the Nash equilibrium principle, accompanied by a perturbation by the variations of structures of networks. Our results explore systematically, for the first time, the dynamics, morphogenesis and convergence of evolutionary games in interacting and competing systems. PMID:27118882
Co-evolutionary data mining for fuzzy rules: automatic fitness function creation, phase space, and experiments

NASA Astrophysics Data System (ADS)

Smith, James F., III; Blank, Joseph A.

2003-03-01

An approach is being explored that involves embedding a fuzzy logic based resource manager in an electronic game environment. Game agents can function under their own autonomous logic or human control. This approach automates the data mining problem. The game automatically creates a cleansed database reflecting the domain expert's knowledge; it calls a data mining function, a genetic algorithm, for data mining of the database as required and allows easy evaluation of the information extracted. The co-evolutionary fitness functions, chromosomes and stopping criteria for ending the game are discussed. Genetic algorithm and genetic program based data mining procedures are discussed that automatically discover new fuzzy rules and strategies. The strategy tree concept and its relationship to co-evolutionary data mining are examined as well as the associated phase space representation of fuzzy concepts. The overlap of fuzzy concepts in phase space reduces the effective strategies available to adversaries. Co-evolutionary data mining alters the geometric properties of the overlap region, known as the admissible region of phase space, significantly enhancing the performance of the resource manager. Procedures for validation of the information data mined are discussed and significant experimental results provided.
Alignment-free microbial phylogenomics under scenarios of sequence divergence, genome rearrangement and lateral genetic transfer.

PubMed

Bernard, Guillaume; Chan, Cheong Xin; Ragan, Mark A

2016-07-01

Alignment-free (AF) approaches have recently been highlighted as alternatives to methods based on multiple sequence alignment in phylogenetic inference. However, the sensitivity of AF methods to genome-scale evolutionary scenarios is little known. Here, using simulated microbial genome data we systematically assess the sensitivity of nine AF methods to three important evolutionary scenarios: sequence divergence, lateral genetic transfer (LGT) and genome rearrangement. Among these, AF methods are most sensitive to the extent of sequence divergence, less sensitive to low and moderate frequencies of LGT, and most robust against genome rearrangement. We describe the application of AF methods to three well-studied empirical genome datasets, and introduce a new application of the jackknife to assess node support. Our results demonstrate that AF phylogenomics is computationally scalable to multi-genome data and can generate biologically meaningful phylogenies and insights into microbial evolution.
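
As a rough illustration of what an alignment-free comparison can look like, the sketch below computes a Euclidean distance between k-mer frequency profiles of two sequences. It is a generic example and is not claimed to be one of the nine AF methods assessed in the abstract; the function names `kmer_profile` and `kmer_distance` and the default `k=3` are assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def kmer_profile(seq, k):
    """Return relative frequencies of all overlapping k-mers in seq."""
    seq = seq.upper()
    if len(seq) < k:
        raise ValueError("sequence is shorter than k")
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


def kmer_distance(seq_a, seq_b, k=3):
    """Euclidean distance between two k-mer frequency profiles.

    Hypothetical helper for illustration only; no alignment is computed.
    """
    pa, pb = kmer_profile(seq_a, k), kmer_profile(seq_b, k)
    keys = set(pa) | set(pb)
    return sqrt(sum((pa.get(x, 0.0) - pb.get(x, 0.0)) ** 2 for x in keys))
```

Because a profile of this kind ignores where each k-mer occurs, moving a large block of sequence only changes the k-mers that span the breakpoints, which is at least consistent with the robustness to genome rearrangement reported above.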

Beyond Linear Sequence Comparisons: The use of genome-level characters for phylogenetic reconstruction

DOE Office of Scientific and Technical Information (OSTI.GOV)

Boore, Jeffrey L.

2004-11-27

Although the phylogenetic relationships of many organisms have been convincingly resolved by the comparisons of nucleotide or amino acid sequences, others have remained equivocal despite great effort. Now that large-scale genome sequencing projects are sampling many lineages, it is becoming feasible to compare large data sets of genome-level features and to develop this as a tool for phylogenetic reconstruction that has advantages over conventional sequence comparisons. Although it is unlikely that these will address a large number of evolutionary branch points across the broad tree of life due to the infeasibility of such sampling, they have great potential for convincingly resolving many critical, contested relationships for which no other data seems promising. However, it is important that we recognize potential pitfalls, establish reasonable standards for acceptance, and employ rigorous methodology to guard against a return to earlier days of scenario-driven evolutionary reconstructions.
Prediction of RNA secondary structures: from theory to models and real molecules

NASA Astrophysics Data System (ADS)

Schuster, Peter

2006-05-01

RNA secondary structures are derived from RNA sequences, which are strings built from the natural four letter nucleotide alphabet, {AUGC}. These coarse-grained structures, in turn, are tantamount to constrained strings over a three letter alphabet. Hence, the secondary structures are discrete objects and the number of sequences always exceeds the number of structures. The sequences built from two letter alphabets form perfect structures when the nucleotides can form a base pair, as is the case with {GC} or {AU}, but the relation between the sequences and structures differs strongly from the four letter alphabet. A comprehensive theory of RNA structure is presented, which is based on the concepts of sequence space and shape space, being a space of structures. It sets the stage for modelling processes in ensembles of RNA molecules like evolutionary optimization or kinetic folding as dynamical phenomena guided by mappings between the two spaces. The number of minimum free energy (mfe) structures is always smaller than the number of sequences, even for two letter alphabets. Folding of RNA molecules into mfe structures constitutes a non-invertible mapping from sequence space onto shape space. The preimage of a structure in sequence space is defined as its neutral network. Similarly the set of suboptimal structures is the preimage of a sequence in shape space. This set represents the conformation space of a given sequence. The evolutionary optimization of structures in populations is a process taking place in sequence space, whereas kinetic folding occurs in molecular ensembles that optimize free energy in conformation space. Efficient folding algorithms based on dynamic programming are available for the prediction of secondary structures for given sequences. The inverse problem, the computation of sequences for predefined structures, is an important tool for the design of RNA molecules with tailored properties.
Simultaneous folding or cofolding of two or more RNA molecules can be modelled readily at the secondary structure level and allows prediction of the most stable (mfe) conformations of complexes together with suboptimal states. Cofolding algorithms are important tools for efficient and highly specific primer design in the polymerase chain reaction (PCR) and help to explain the mechanisms of small interference RNA (si-RNA) molecules in gene regulation. The evolutionary optimization of RNA structures is illustrated by the search for a target structure and mimics aptamer selection in evolutionary biotechnology. It occurs typically in steps consisting of short adaptive phases interrupted by long epochs of little or no obvious progress in optimization. During these quasi-stationary epochs the populations are essentially confined to neutral networks where they search for sequences that allow a continuation of the adaptive process. Modelling RNA evolution as a simultaneous process in sequence and shape space provides answers to questions of the optimal population size and mutation rates. Kinetic folding is a stochastic process in conformation space. Exact solutions are derived by direct simulation in the form of trajectory sampling or by solving the master equation. The exact solutions can be approximated straightforwardly by Arrhenius kinetics on barrier trees, which represent simplified versions of conformational energy landscapes. The existence of at least one sequence forming any arbitrarily chosen pair of structures is granted by the intersection theorem. Folding kinetics is the key to understanding and designing multistable RNA molecules or RNA switches. These RNAs form two or more long lived conformations, and conformational changes occur either spontaneously or are induced through binding of small molecules or other biopolymers. RNA switches are found in nature where they act as elements in genetic and metabolic regulation. 
The reliability of RNA secondary structure prediction is limited by the accuracy with which the empirical parameters can be determined and by principal deficiencies, for example by the lack of energy contributions resulting from tertiary interactions. In addition, native structures may be determined by folding kinetics rather than by thermodynamics. We address the first problem by considering base pair probabilities or base pairing entropies, which are derived from the partition function of conformations. A high base pair probability corresponding to a low pairing entropy is taken as an indicator of a high reliability of prediction. Pseudoknots are discussed as an example of a tertiary interaction that is highly important for RNA function. Moreover, pseudoknot formation is readily incorporated into structure prediction algorithms. Some examples of experimental data on RNA secondary structures that are readily explained using the landscape concept are presented. They deal with (i) properties of RNA molecules with random sequences, (ii) RNA molecules from restricted alphabets, (iii) existence of neutral networks, (iv) shape space covering, (v) riboswitches and (vi) evolution of non-coding RNAs as an example of evolution restricted to neutral networks.
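
The abstract refers to dynamic programming algorithms for secondary structure prediction. A minimal sketch of the idea follows, using a Nussinov-style recursion that maximizes the number of nested base pairs. This is a deliberately simplified stand-in: it counts pairs instead of minimizing free energy with empirical parameters, so it is not the mfe folding discussed above. The names `can_pair` and `max_base_pairs` and the `min_loop=3` hairpin constraint are assumptions for illustration.

```python
def can_pair(a, b):
    """Allow Watson-Crick pairs (AU, GC) and the GU wobble pair."""
    return {a, b} in ({"A", "U"}, {"G", "C"}, {"G", "U"})


def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs (Nussinov-style recursion).

    best[i][j] holds the optimum for the subsequence seq[i..j].
    min_loop is the minimum number of unpaired bases enclosed by a pair.
    """
    n = len(seq)
    if n == 0:
        return 0
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            # Case 1: position j stays unpaired.
            score = best[i][j - 1]
            # Case 2: position j pairs with some k, splitting the interval.
            for k in range(i, j - min_loop):
                if can_pair(seq[k], seq[j]):
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]
```

The table is filled in order of increasing subsequence length, so the recursion runs in cubic time and quadratic space. Pseudoknots are excluded by construction because every pair splits the interval into two independent parts.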
HABITABLE ZONES OF POST-MAIN SEQUENCE STARS

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ramirez, Ramses M.; Kaltenegger, Lisa

Once a star leaves the main sequence and becomes a red giant, its Habitable Zone (HZ) moves outward, promoting detectable habitable conditions at larger orbital distances. We use a one-dimensional radiative-convective climate model and stellar evolutionary models to calculate post-MS HZ distances for a grid of stars from 3700 to 10,000 K (∼M1 to A5 stellar types) for different stellar metallicities. The post-MS HZ limits are comparable to the distances of known directly imaged planets. We model the stellar as well as planetary atmospheric mass loss during the Red Giant Branch (RGB) and Asymptotic Giant Branch (AGB) phases for super-Moons to super-Earths. A planet can stay from 200 million years up to 9 Gyr in the post-MS HZ for our hottest and coldest grid stars, respectively, assuming solar metallicity. These numbers increase for increased stellar metallicity. Total atmospheric erosion only occurs for planets in close-in orbits. The post-MS HZ orbital distances are within detection capabilities of direct imaging techniques.
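
The outward migration of the HZ follows from the inverse-square law: a boundary defined by a fixed effective stellar flux moves outward as the square root of the stellar luminosity. The sketch below shows only this scaling. It does not reproduce the climate model used in the abstract, and the function name `hz_distance_au` and the flux value passed to it are assumptions for illustration.

```python
from math import sqrt


def hz_distance_au(luminosity_lsun, s_eff):
    """Orbital distance (AU) at which a planet receives flux s_eff.

    luminosity_lsun: stellar luminosity in solar units.
    s_eff: effective flux at the HZ boundary, in units of the flux
           Earth receives from the present-day Sun (an assumed input;
           real boundaries come from climate modelling).
    """
    if luminosity_lsun <= 0 or s_eff <= 0:
        raise ValueError("luminosity and flux must be positive")
    return sqrt(luminosity_lsun / s_eff)


# A hundredfold increase in luminosity moves a fixed-flux boundary
# outward by a factor of ten.
inner_now = hz_distance_au(1.0, 1.0)
inner_giant = hz_distance_au(100.0, 1.0)
```

In practice the boundary flux also depends on stellar effective temperature, so the real limits shift somewhat differently for cool and hot stars.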
Evolutionary and Ecological Characterization of Mayaro Virus Strains Isolated during an Outbreak, Venezuela, 2010

PubMed Central

Auguste, Albert J.; Liria, Jonathan; Forrester, Naomi L.; Giambalvo, Dileyvic; Moncada, Maria; Long, Kanya C.; Morón, Dulce; de Manzione, Nuris; Tesh, Robert B.; Halsey, Eric S.; Kochel, Tadeusz J.; Hernandez, Rosa; Navarro, Juan-Carlos

2015-01-01

In 2010, an outbreak of febrile illness with arthralgic manifestations was detected at La Estación village, Portuguesa State, Venezuela. The etiologic agent was determined to be Mayaro virus (MAYV), a reemerging South American alphavirus. A total of 77 cases was reported and 19 were confirmed as seropositive. MAYV was isolated from acute-phase serum samples from 6 symptomatic patients. We sequenced 27 complete genomes representing the full spectrum of MAYV genetic diversity, which facilitated detection of a new genotype, designated N. Phylogenetic analysis of genomic sequences indicated that etiologic strains from Venezuela belong to genotype D. Results indicate that MAYV is highly conserved genetically, showing ≈17% nucleotide divergence across all 3 genotypes and 4% among genotype D strains in the most variable genes. Coalescent analyses suggested genotypes D and L diverged ≈150 years ago and genotype N diverged ≈250 years ago. This virus commonly infects persons residing near enzootic transmission foci because of anthropogenic incursions. PMID:26401714
Evolutionary and Ecological Characterization of Mayaro Virus Strains Isolated during an Outbreak, Venezuela, 2010.

PubMed

Auguste, Albert J; Liria, Jonathan; Forrester, Naomi L; Giambalvo, Dileyvic; Moncada, Maria; Long, Kanya C; Morón, Dulce; de Manzione, Nuris; Tesh, Robert B; Halsey, Eric S; Kochel, Tadeusz J; Hernandez, Rosa; Navarro, Juan-Carlos; Weaver, Scott C

2015-10-01

In 2010, an outbreak of febrile illness with arthralgic manifestations was detected at La Estación village, Portuguesa State, Venezuela. The etiologic agent was determined to be Mayaro virus (MAYV), a reemerging South American alphavirus. A total of 77 cases was reported and 19 were confirmed as seropositive. MAYV was isolated from acute-phase serum samples from 6 symptomatic patients. We sequenced 27 complete genomes representing the full spectrum of MAYV genetic diversity, which facilitated detection of a new genotype, designated N. Phylogenetic analysis of genomic sequences indicated that etiologic strains from Venezuela belong to genotype D. Results indicate that MAYV is highly conserved genetically, showing ≈17% nucleotide divergence across all 3 genotypes and 4% among genotype D strains in the most variable genes. Coalescent analyses suggested genotypes D and L diverged ≈150 years ago and genotype N diverged ≈250 years ago. This virus commonly infects persons residing near enzootic transmission foci because of anthropogenic incursions.
DOE Office of Scientific and Technical Information (OSTI.GOV)

MacDonald, James; Mullan, D. J.

KIC 7177553 is a quadruple system containing two binaries of orbital periods 16.5 and 18 days. All components have comparable masses and are slowly rotating with spectral types of ∼G2V. The longer period binary is eclipsing with component masses and radii M_1 = 1.043 ± 0.014 M_⊙, R_1 = 0.940 ± 0.005 R_⊙ and M_2 = 0.986 ± 0.015 M_⊙, R_2 = 0.941 ± 0.005 R_⊙. The essentially equal radii measurements are inconsistent with the two stars being on the main sequence at the same age using standard nonmagnetic stellar evolution models. Instead a consistent scenario is found if the stars are in their pre-main-sequence phase of evolution and have an age of 32–36 Myr. We have also computed evolutionary models of magnetic stars, but we find that our nonmagnetic models fit the empirical radii and effective temperatures better than the magnetic models.
Stellar evolution of high mass based on the Ledoux criterion for convection

NASA Technical Reports Server (NTRS)

Stothers, R.; Chin, C.

1972-01-01

Theoretical evolutionary sequences of models for stars of 15 and 30 solar masses were computed from the zero-age main sequence to the end of core helium burning. During the earliest stages of core helium depletion, the envelope rapidly expands into the red-supergiant configuration. At 15 solar masses, a blue loop on the H-R diagram ensues if the initial metals abundance, initial helium abundance, or C-12 + alpha particle reaction rate is sufficiently large, or if the 3-alpha reaction rate is sufficiently small. These quantities affect the opacity of the base of the outer convection zone, the mass of the core, and the thermal properties of the core. The blue loop occurs abruptly and fully developed when the critical value of any of these quantities is exceeded, and the effective temperature range and fraction of the lifetime of core helium burning during the slow phase of the blue loop vary surprisingly little. At 30 solar masses no blue loop occurs for any reasonable set of input parameters.
Coevolution of CRISPR bacteria and phage in 2 dimensions

NASA Astrophysics Data System (ADS)

Han, Pu; Deem, Michael

2014-03-01

CRISPR (clustered regularly interspaced short palindromic repeats) is a newly discovered adaptive, heritable immune system of prokaryotes. It can prevent infection of prokaryotes by phage. Most bacteria and almost all archaea have CRISPR. The CRISPR system incorporates short nucleotide sequences from viruses. These incorporated sequences provide a historical record of the host and predator coevolution. We simulate the coevolution of bacteria and phage in 2 dimensions. Each phage has multiple proto-spacers that the bacteria can incorporate. Each bacterium can store multiple spacers in its CRISPR. Phages can escape recognition by the CRISPR system via point mutation or recombination. We will discuss the different evolutionary consequences of point mutation or recombination on the coevolution of bacteria and phage. We will also discuss an intriguing "dynamic phase transition" in the number of phage as a function of time and mutation rate. We will show that due to the arms race between phages and bacteria, the frequency of spacers and proto-spacers in a population can oscillate quite rapidly.
Reproduction, symbiosis, and the eukaryotic cell

PubMed Central

Godfrey-Smith, Peter

2015-01-01

This paper develops a conceptual framework for addressing questions about reproduction, individuality, and the units of selection in symbiotic associations, with special attention to the origin of the eukaryotic cell. Three kinds of reproduction are distinguished, and a possible evolutionary sequence giving rise to a mitochondrion-containing eukaryotic cell from an endosymbiotic partnership is analyzed as a series of transitions between each of the three forms of reproduction. The sequence of changes seen in this “egalitarian” evolutionary transition is compared with those that apply in “fraternal” transitions, such as the evolution of multicellularity in animals. PMID:26286983
Evolution Analysis of Simple Sequence Repeats in Plant Genome.

PubMed

Qin, Zhen; Wang, Yanping; Wang, Qingmei; Li, Aixian; Hou, Fuyun; Zhang, Liming

2015-01-01

Simple sequence repeats (SSRs) are widespread units in genome sequences, and play many important roles in plants. In order to reveal the evolution of plant genomes, we investigated the evolutionary regularities of SSRs during the evolution of plant species and the plant kingdom by analysis of twelve sequenced plant genomes. First, in the twelve studied plant genomes, the main SSRs were those that contain repeats of 1-3 nucleotide combinations. Second, in mononucleotide SSRs, the A/T percentage gradually increased along with the evolution of plants (except for P. patens). With the increase of SSR repeat number, the percentage of A/T in C. reinhardtii had no significant change, while the percentage of A/T in terrestrial plant species gradually declined. Third, in dinucleotide SSRs, the percentage of AT/TA increased along with the evolution of the plant kingdom and as the repeat number increased in terrestrial plant species. This trend was more obvious in dicotyledons than in monocotyledons. The percentage of CG/GC showed the opposite pattern to the AT/TA. Fourth, in trinucleotide SSRs, the percentages of combinations including two or three A/T were in a rising trend along with the evolution of the plant kingdom; meanwhile, with the increase of SSR repeat number in plant species, different species chose different combinations as dominant SSRs. SSRs in C. reinhardtii, P. patens, Z. mays and A. thaliana showed their specific patterns related to evolutionary position or specific changes of genome sequences. The results showed that SSRs not only had a general pattern in the evolution of the plant kingdom, but also were associated with the evolution of the specific genome sequence. The study of the evolutionary regularities of SSRs provided new insights for the analysis of plant genome evolution.
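
To make the notion of mono-, di- and trinucleotide SSRs concrete, the sketch below locates tandem repeats of 1-3 nucleotide units with a regular expression. It is an illustrative example only, not the pipeline used in the study; the function name `find_ssrs` and the thresholds `max_unit=3` and `min_repeats=4` are assumptions.

```python
import re


def find_ssrs(seq, max_unit=3, min_repeats=4):
    """Return (start, unit, repeat_count) for each tandem repeat found.

    max_unit: longest repeat unit considered (assumed cutoff).
    min_repeats: minimum number of consecutive unit copies (assumed
                 cutoff, must be at least 2).
    """
    if min_repeats < 2:
        raise ValueError("min_repeats must be at least 2")
    seq = seq.upper()
    # Lazy unit match so that the shortest repeating unit is reported,
    # followed by a backreference requiring the remaining copies.
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq):
        unit = m.group(1)
        hits.append((m.start(), unit, len(m.group(0)) // len(unit)))
    return hits


hits = find_ssrs("GGCATATATATGGAAAAAC")
```

From a list of hits like this, the A/T share of mononucleotide SSRs or the AT/TA share of dinucleotide SSRs can be tallied by filtering on the `unit` field.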
Bioinformatics analysis and genetic diversity of the poliovirus.

PubMed

Liu, Yanhan; Ma, Tengfei; Liu, Jianzhu; Zhao, Xiaona; Cheng, Ziqiang; Guo, Huijun; Wang, Shujing; Xu, Ruixue

2014-12-01

Poliomyelitis, a disease which can manifest as muscle paralysis, is caused by the poliovirus, which is a human enterovirus and member of the family Picornaviridae that usually transmits by the faecal-oral route. The viruses of the OPV (oral poliovirus attenuated-live vaccine) strains can mutate in the human intestine during replication and some of these mutations can lead to the recovery of serious neurovirulence. Informatics research of the poliovirus genome can be used to explain further the characteristics of this virus. In this study, sequences from 100 poliovirus isolates were acquired from GenBank. To determine the evolutionary relationship between the strains, we compared and analysed the sequences of the complete poliovirus genome and the VP1 region. The reconstructed phylogenetic trees for the complete sequences and the VP1 sequences were both divided into two branches, indicating that the genetic relationships of the whole poliovirus genome and the VP1 sequences are very similar. This branching indicates that the virulence and pathogenicity of poliomyelitis may be associated with the VP1 region. Sequence alignment of the VP1 region revealed numerous mutation sites in which mutation rates of >30 % were detected. In a group of strains recorded in the USA, mutation sites and mutation types were the same and this may be associated with their distribution in the evolutionary tree and their genetic relationship. In conclusion, the genetic evolutionary relationships of poliovirus isolate sequences are determined to a great extent by the VP1 protein, and poliovirus strains located on the same branch of the phylogenetic tree contain the same mutation spots and mutation types. Hence, the genetic characteristics of the VP1 region in the poliovirus genome should be analysed to identify the transmission route of poliovirus and provide the basis of viral immunity development. © 2014 The Authors.
Genomic Diversity and Evolution of the Lyssaviruses

PubMed Central

Delmas, Olivier; Holmes, Edward C.; Talbi, Chiraz; Larrous, Florence; Dacheux, Laurent; Bouchier, Christiane; Bourhy, Hervé

2008-01-01

Lyssaviruses are RNA viruses with single-strand, negative-sense genomes responsible for rabies-like diseases in mammals. To date, genomic and evolutionary studies have most often utilized partial genome sequences, particularly of the nucleoprotein and glycoprotein genes, with little consideration of genome-scale evolution. Herein, we report the first genomic and evolutionary analysis using complete genome sequences of all recognised lyssavirus genotypes, including 14 new complete genomes of field isolates from 6 genotypes and one genotype that is completely sequenced for the first time. In doing so we significantly increase the extent of genome sequence data available for these important viruses. Our analysis of these genome sequence data reveals that all lyssaviruses have the same genomic organization. A phylogenetic analysis reveals strong geographical structuring, with the greatest genetic diversity in Africa, and an independent origin for the two known genotypes that infect European bats. We also suggest that multiple genotypes may exist within the diversity of viruses currently classified as ‘Lagos Bat’. In sum, we show that rigorous phylogenetic techniques based on full length genome sequence provide the best discriminatory power for genotype classification within the lyssaviruses. PMID:18446239
Transition from positive to neutral in mutation fixation along with continuing rising fitness in thermal adaptive evolution.

PubMed

Kishimoto, Toshihiko; Iijima, Leo; Tatsumi, Makoto; Ono, Naoaki; Oyake, Ayana; Hashimoto, Tomomi; Matsuo, Moe; Okubo, Masato; Suzuki, Shingo; Mori, Kotaro; Kashiwagi, Akiko; Furusawa, Chikara; Ying, Bei-Wen; Yomo, Tetsuya

2010-10-21

It remains to be determined experimentally whether increasing fitness is related to positive selection, while stationary fitness is related to neutral evolution. Long-term laboratory evolution in Escherichia coli was performed under thermal stress in defined laboratory conditions. The complete cell growth data showed common continuous fitness recovery to every 2°C or 4°C stepwise temperature upshift, finally resulting in an evolved E. coli strain with an improved upper temperature limit as high as 45.9°C after 523 days of serial transfer, equivalent to 7,560 generations, in minimal medium. Two-phase fitness dynamics, a rapid growth recovery phase followed by a gradual increasing growth phase, was clearly observed at diverse temperatures throughout the entire evolutionary process. Whole-genome sequence analysis revealed the transition from positive to neutral in mutation fixation, accompanied by a considerable escalation of spontaneous substitution rate in the late fitness recovery phase. This suggested that continually increasing fitness did not always result in the reduction of genetic diversity due to the sequential takeovers by fit mutants, but caused the accumulation of a considerable number of mutations that facilitated the neutral evolution.
Agency, Values, and Well-Being: A Human Development Model

ERIC Educational Resources Information Center

Welzel, Christian; Inglehart, Ronald

2010-01-01

This paper argues that feelings of agency are linked to human well-being through a sequence of adaptive mechanisms that promote human development, once existential conditions become permissive. In the first part, we elaborate on the evolutionary logic of this model and outline why an evolutionary perspective is helpful to understand changes in…
Combining Physicochemical and Evolutionary Information for Protein Contact Prediction

PubMed Central

Schneider, Michael; Brock, Oliver

2014-01-01

We introduce a novel contact prediction method that achieves high prediction accuracy by combining evolutionary and physicochemical information about native contacts. We obtain evolutionary information from multiple-sequence alignments and physicochemical information from predicted ab initio protein structures. These structures represent low-energy states in an energy landscape and thus capture the physicochemical information encoded in the energy function. Such low-energy structures are likely to contain native contacts, even if their overall fold is not native. To differentiate native from non-native contacts in those structures, we develop a graph-based representation of the structural context of contacts. We then use this representation to train a support vector machine classifier to identify the most likely native contacts in otherwise non-native structures. The resulting contact predictions are highly accurate. As a result of combining two sources of information (evolutionary and physicochemical), we maintain prediction accuracy even when only few sequence homologs are present. We show that the predicted contacts help to improve ab initio structure prediction. A web service is available at http://compbio.robotics.tu-berlin.de/epc-map/. PMID:25338092
Cocoa/Cotton Comparative Genomics

USDA-ARS's Scientific Manuscript database

With genome sequence from two members of the Malvaceae family recently made available, we are exploring syntenic relationships, gene content, and evolutionary trajectories between the cacao and cotton genomes. An assembly of cacao (Theobroma cacao) using Illumina and 454 sequence technology yielded ...
The Plasmodium gaboni genome illuminates allelic dimorphism of immunologically important surface antigens in P. falciparum.

PubMed

Roy, Scott William

2015-12-01

In the deadly human malaria parasite Plasmodium falciparum, several major merozoite surface proteins (MSPs) show a striking pattern of allelic diversity called allelic dimorphism (AD). In AD, the vast majority of observed alleles fall into two highly divergent allelic classes, with recombinant alleles being rare or not observed, presumably due to repression by natural selection (recombination suppression, or RS). The three AD loci, merozoite surface proteins (MSPs) 1, 2, and 6, along with MSP3, which also exhibits RS among four allelic classes, can be collectively called AD/RS. The causes of AD/RS and the evolutionary history of allelic diversity at these loci remain mysterious. The few available sequences from a single closely related chimpanzee parasite, P. reichenowi, have suggested that for 3/4 loci, AD/RS is an ancient state that has been retained in P. falciparum since well before the P. falciparum-P. reichenowi ancestor. On the other hand, based on comparative sequence analysis, we recently suggested that (i) AD/RS P. falciparum loci have undergone interallelic recombination over longer evolutionary times (on the timescale of recent speciation events), and thus (ii) AD/RS may be a recent phenomenon. The recent publication of genomic sequencing efforts for P. gaboni, an outgroup to P. falciparum and P. reichenowi, allows for improved reconstruction of the evolutionary history of these loci. In this work, I report genic sequence for P. gaboni for all four AD/RS P. falciparum loci (MSP1, 2, 3, and 6). Comparison of these sequences with available P. falciparum and P. reichenowi data strengthens the evidence for interallelic recombination over the evolutionary history of these species and also strengthens the case that AD/RS at these loci is ancient. Combined with previous results, these data provide evidence that AD/RS at different loci has evolved at several different times in the evolutionary history of P. falciparum: (i) before the P. gaboni-P. falciparum divergence, for much of MSP1 and MSP3; (ii) between the P. gaboni-P. falciparum and P. reichenowi-P. falciparum divergences, for the 5' end of the AD region of MSP6 and block 3 of MSP1; (iii) near the P. reichenowi-P. falciparum divergence, for the 3' end of the AD region of MSP6; and (iv) after the P. reichenowi-P. falciparum divergence, for MSP2. Based on these results, I suggest a new hypothesis for long-term evolutionary maintenance of AD/RS by recombination within allelic groups. Copyright © 2015 Elsevier B.V. All rights reserved.
Transcriptome sequencing reveals genome-wide variation in molecular evolutionary rate among ferns.

PubMed

Grusz, Amanda L; Rothfels, Carl J; Schuettpelz, Eric

2016-08-30

Transcriptomics in non-model plant systems has recently reached a point where the examination of nuclear genome-wide patterns in understudied groups is an achievable reality. This progress is especially notable in evolutionary studies of ferns, for which molecular resources to date have been derived primarily from the plastid genome. Here, we utilize transcriptome data in the first genome-wide comparative study of molecular evolutionary rate in ferns. We focus on the ecologically diverse family Pteridaceae, which comprises about 10 % of fern diversity and includes the enigmatic vittarioid ferns, an epiphytic, tropical lineage known for dramatically reduced morphologies and radically elongated phylogenetic branch lengths. Using expressed sequence data for 2091 loci, we perform pairwise comparisons of molecular evolutionary rate among 12 species spanning the three largest clades in the family and ask whether previously documented heterogeneity in plastid substitution rates is reflected in their nuclear genomes. We then inquire whether variation in evolutionary rate is being shaped by genes belonging to specific functional categories and test for differential patterns of selection. We find significant, genome-wide differences in evolutionary rate for vittarioid ferns relative to all other lineages within the Pteridaceae, but we recover few significant correlations between faster/slower vittarioid loci and known functional gene categories. We demonstrate that the faster rates characteristic of the vittarioid ferns are likely not driven by positive selection, nor are they unique to any particular type of nucleotide substitution. Our results reinforce recently reviewed mechanisms hypothesized to shape molecular evolutionary rates in vittarioid ferns and provide novel insight into substitution rate variation both within and among fern nuclear genomes.
An unbiased adaptive sampling algorithm for the exploration of RNA mutational landscapes under evolutionary pressure.

PubMed

Waldispühl, Jérôme; Ponty, Yann

2011-11-01

The analysis of the relationship between sequences and structures (i.e., how mutations affect structures and reciprocally how structures influence mutations) is essential to decipher the principles driving molecular evolution, to infer the origins of genetic diseases, and to develop bioengineering applications such as the design of artificial molecules. Because their structures can be predicted from the sequence data only, RNA molecules provide a good framework to study this sequence-structure relationship. We recently introduced a suite of algorithms called RNAmutants which allows a complete exploration of RNA sequence-structure maps in polynomial time and space. Formally, RNAmutants takes an input sequence (or seed) to compute the Boltzmann-weighted ensembles of mutants with exactly k mutations, and sample mutations from these ensembles. However, this approach suffers from major limitations. Indeed, since the Boltzmann probabilities of the mutations depend on the free energy of the structures, RNAmutants has difficulty sampling mutant sequences with low G+C-contents. In this article, we introduce an unbiased adaptive sampling algorithm that enables RNAmutants to sample regions of the mutational landscape poorly covered by classical algorithms. We applied these methods to sample mutations with low G+C-contents. These adaptive sampling techniques can be easily adapted to explore other regions of the sequence and structural landscapes which are difficult to sample. Importantly, these algorithms come at a minimal computational cost. We demonstrate the insights offered by these techniques on studies of complete RNA sequence-structure maps of sizes up to 40 nucleotides. Our results indicate that the G+C-content has a strong influence on the size and shape of the evolutionary accessible sequence and structural spaces. In particular, we show that low G+C-contents favor the appearance of internal loops and thus possibly the synthesis of tertiary structure motifs. 
On the other hand, high G+C-contents significantly reduce the size of the evolutionary accessible mutational landscapes.
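The sampling bias described in this abstract can be illustrated with a minimal sketch. It is not the RNAmutants algorithm: the mutant sequences and free energies below are hypothetical values chosen for illustration, and the helper names `gc_content` and `boltzmann_probabilities` are assumptions of this example. It only shows that Boltzmann weighting concentrates probability on the lowest-energy (here, G+C-rich) candidates.

```python
import math

# Gas constant (kcal/(mol*K)) times temperature (37 C in kelvin).
RT = 0.0019872 * 310.15


def gc_content(seq):
    """Return the fraction of G and C nucleotides in an RNA sequence."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def boltzmann_probabilities(energies):
    """Return normalized Boltzmann weights exp(-E/RT) for energies in kcal/mol."""
    weights = [math.exp(-energy / RT) for energy in energies]
    total = sum(weights)
    return [weight / total for weight in weights]


# Hypothetical mutants and free energies (kcal/mol), for illustration only.
mutants = {"GGGCGCCC": -6.0, "GGAUAUCC": -2.0, "AUAUAUAU": -0.5}
probabilities = boltzmann_probabilities(list(mutants.values()))
for (seq, energy), prob in zip(mutants.items(), probabilities):
    print(f"{seq}  GC={gc_content(seq):.2f}  dG={energy:5.1f}  p={prob:.4f}")
```

Under these assumed energies, the low G+C candidate receives a very small share of the probability mass, which is the kind of region an adaptive sampler would need to reweight.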
Simultaneously estimating evolutionary history and repeated traits phylogenetic signal: applications to viral and host phenotypic evolution

PubMed Central

Vrancken, Bram; Lemey, Philippe; Rambaut, Andrew; Bedford, Trevor; Longdon, Ben; Günthard, Huldrych F.; Suchard, Marc A.

2014-01-01

Phylogenetic signal quantifies the degree to which resemblance in continuously-valued traits reflects phylogenetic relatedness. Measures of phylogenetic signal are widely used in ecological and evolutionary research, and have recently gained traction in viral evolutionary studies. Standard estimators of phylogenetic signal frequently condition on data summary statistics of the repeated trait observations and fixed phylogenetic trees, resulting in information loss and potential bias. To incorporate the observation process and phylogenetic uncertainty in a model-based approach, we develop a novel Bayesian inference method to simultaneously estimate the evolutionary history and phylogenetic signal from molecular sequence data and repeated multivariate traits. Our approach builds upon a phylogenetic diffusion framework that models continuous trait evolution as a Brownian motion process and incorporates Pagel’s λ transformation parameter to estimate dependence among traits. We provide a computationally efficient inference implementation in the BEAST software package. We evaluate the synthetic performance of the Bayesian estimator of phylogenetic signal against standard estimators, and demonstrate the use of our coherent framework to address several virus-host evolutionary questions, including virulence heritability for HIV, antigenic evolution in influenza and HIV, and Drosophila sensitivity to sigma virus infection. Finally, we discuss model extensions that will make useful contributions to our flexible framework for simultaneously studying sequence and trait evolution. PMID:25780554
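As a rough illustration of the λ transformation mentioned above, the sketch below applies its textbook form: off-diagonal entries of a Brownian-motion covariance matrix are multiplied by λ while the diagonal is left unchanged. This is not the BEAST implementation; the function name, the restriction of λ to [0, 1], and the three-taxon covariance matrix are assumptions made for this example.

```python
def pagel_lambda_transform(cov, lam):
    """Scale off-diagonal entries of a phylogenetic covariance matrix by lam.

    lam = 1 leaves the Brownian-motion covariance unchanged;
    lam = 0 removes all covariance between tips (no phylogenetic signal).
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("this sketch assumes lambda lies in [0, 1]")
    size = len(cov)
    return [
        [cov[i][j] if i == j else lam * cov[i][j] for j in range(size)]
        for i in range(size)
    ]


# Hypothetical covariance for the toy tree ((A:1,B:1):1,C:2):
# each entry is the branch length shared by two tips from the root.
toy_cov = [
    [2.0, 1.0, 0.0],
    [1.0, 2.0, 0.0],
    [0.0, 0.0, 2.0],
]

for lam in (0.0, 0.5, 1.0):
    print(lam, pagel_lambda_transform(toy_cov, lam))
```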

Genes with stable DNA methylation levels show higher evolutionary conservation than genes with fluctuant DNA methylation levels.

PubMed

Zhang, Ruijie; Lv, Wenhua; Luan, Meiwei; Zheng, Jiajia; Shi, Miao; Zhu, Hongjie; Li, Jin; Lv, Hongchao; Zhang, Mingming; Shang, Zhenwei; Duan, Lian; Jiang, Yongshuai

2015-11-24

Different human genes often exhibit different degrees of stability in their DNA methylation levels between tissues, samples or cell types. This may be related to the evolution of the human genome. Thus, we compared the evolutionary conservation between two types of genes: genes with stable DNA methylation levels (SM genes) and genes with fluctuant DNA methylation levels (FM genes). For long-term evolutionary characteristics between species, we compared the percentage of orthologous genes, the evolutionary rate dN/dS and protein sequence identity. We found that the SM genes had greater percentages of orthologous genes, lower dN/dS, and higher protein sequence identities in all 21 species. These results indicated that the SM genes were more evolutionarily conserved than the FM genes. For short-term evolutionary characteristics among human populations, we compared the single nucleotide polymorphism (SNP) density and the degree of linkage disequilibrium (LD) in HapMap populations and 1000 Genomes Project populations. We observed that the SM genes had lower SNP densities and higher degrees of LD in all 11 HapMap populations and all 13 1000 Genomes Project populations. These results mean that the SM genes had more stable chromosome genetic structures, and were more conserved than the FM genes.
The sequence, and its evolutionary implications, of a Thermococcus celer protein associated with transcription

NASA Technical Reports Server (NTRS)

Kaine, B. P.; Mehr, I. J.; Woese, C. R.

1994-01-01

Through random search, a gene from Thermococcus celer has been identified and sequenced that appears to encode a transcription-associated protein (110 amino acid residues). The sequence has clear homology to approximately the last half of an open reading frame reported previously for Sulfolobus acidocaldarius [Langer, D. & Zillig, W. (1993) Nucleic Acids Res. 21, 2251]. The protein translations of these two archaeal genes in turn are homologs of a small subunit found in eukaryotic RNA polymerase I (A12.2) and the counterpart of this from RNA polymerase II (B12.6). Homology is also seen with the eukaryotic transcription factor TFIIS, but it involves only the terminal 45 amino acids of the archaeal proteins. Evolutionary implications of these homologies are discussed.
Algorithm to find distant repeats in a single protein sequence

PubMed Central

Banerjee, Nirjhar; Sarani, Rangarajan; Ranjani, Chellamuthu Vasuki; Sowmiya, Govindaraj; Michael, Daliah; Balakrishnan, Narayanasamy; Sekar, Kanagaraj

2008-01-01

Distant repeats in protein sequences play an important role in various aspects of protein analysis. A careful analysis of distant repeats would make it possible to establish a firm relation between the repeats and their function and three-dimensional structure during the evolutionary process. Further, it sheds light on the diversity of duplication during evolution. To this end, an algorithm has been developed to find all distant repeats in a protein sequence. Scores from the Point Accepted Mutation (PAM) matrix have been used to identify amino acid substitutions while detecting the distant repeats. Due to the biological importance of distant repeats, the proposed algorithm will be of importance to structural biologists, molecular biologists, biochemists and researchers involved in phylogenetic and evolutionary studies. PMID:19052663
Understanding sequence similarity and framework analysis between centromere proteins using computational biology.

PubMed

Doss, C George Priya; Chakrabarty, Chiranjib; Debajyoti, C; Debottam, S

2014-11-01

Open questions about the recruitment pathways, cell cycle regulation mechanisms, spindle checkpoint assembly, and chromosome segregation roles of centromere proteins are a focus of attention in cancer research. With established databases now available, a range of computational platforms makes it possible to examine almost all the physiological and biochemical evidence in disease-associated phenotypes. Using existing computational methods, we have utilized the amino acid residues to understand the similarity within the evolutionary variance of different associated centromere proteins. This study of sequence similarity, protein-protein networking, co-expression analysis, and evolutionary trajectory of centromere proteins will speed up the understanding of centromere biology and will create a road map for upcoming researchers who are initiating clinical sequencing work using centromere proteins.
Complete nucleotide sequence of pig (Sus scrofa) mitochondrial genome and dating evolutionary divergence within Artiodactyla.

PubMed

Lin, C S; Sun, Y L; Liu, C Y; Yang, P C; Chang, L C; Cheng, I C; Mao, S J; Huang, M C

1999-08-05

The complete nucleotide sequence of the pig (Sus scrofa) mitochondrial genome, containing 16,613 bp, is presented in this report. The genome does not have a fixed length because of the presence of variable numbers of the tandem repeat 5'-CGTGCGTACA in the displacement loop (D-loop). Genes responsible for 12S and 16S rRNAs, 22 tRNAs, and 13 protein-coding regions are found. The genome carries very few intergenic nucleotides, with several instances of overlap between protein-coding or tRNA genes, except in the D-loop region. To evaluate the possible evolutionary relationships between Artiodactyla and Cetacea, the nucleotide substitutions and amino acid sequences of 13 protein-coding genes were aligned by pairwise comparisons of the pig, cow, and fin whale. By comparing these sequences, we suggest that there is a closer relationship between the pig and cow than between either of these species and the fin whale. In addition, the accumulation of transversions and gaps in pig 12S and 16S rRNA genes was compared with that in other eutherian species, including cow, fin whale, human, horse, and harbor seal. The results also reveal a close phylogenetic relationship between pig and cow, as compared to fin whale and others. Thus, according to the sequence differences of mitochondrial rRNA genes in eutherian species, the evolutionary separation of pig and cow occurred about 53-60 million years ago.
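The comparison of transversions and gaps between aligned rRNA genes can be sketched as a simple pairwise count. The example below is only an illustration of how such differences are classified, not the authors' procedure. The function name and the toy aligned strings are hypothetical, and positions containing anything other than A, C, G, T or a gap character are ignored by assumption.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
NUCLEOTIDES = PURINES | PYRIMIDINES


def classify_differences(seq_a, seq_b):
    """Count transitions, transversions and gap positions in a pairwise alignment."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    counts = {"transitions": 0, "transversions": 0, "gaps": 0}
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == b:
            continue  # identical column (including gap against gap)
        if a == "-" or b == "-":
            counts["gaps"] += 1
        elif a in NUCLEOTIDES and b in NUCLEOTIDES:
            # Purine<->purine or pyrimidine<->pyrimidine is a transition.
            same_class = {a, b} <= PURINES or {a, b} <= PYRIMIDINES
            counts["transitions" if same_class else "transversions"] += 1
        # Ambiguity codes such as N are skipped in this sketch.
    return counts


# Hypothetical aligned fragments, for illustration only.
print(classify_differences("ACGT-A", "GCTTAA"))
```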
Evolutionary versatility of eukaryotic protein domains revealed by their bigram networks

PubMed Central

2011-01-01

Background Protein domains are globular structures of independently folded polypeptides that exert catalytic or binding activities. Their sequences are recognized as evolutionary units that, through genome recombination, constitute protein repertoires of linkage patterns. Via mutations, domains acquire modified functions that contribute to the fitness of cells and organisms. Recent studies have addressed the evolutionary selection that may have shaped the functions of individual domains and the emergence of particular domain combinations, which led to new cellular functions in multi-cellular animals. This study focuses on modeling domain linkage globally and investigates evolutionary implications that may be revealed by novel computational analysis. Results A survey of 77 completely sequenced eukaryotic genomes implies a potential hierarchical and modular organization of biological functions in most living organisms. Domains in a genome or multiple genomes are modeled as a network of hetero-duplex covalent linkages, termed bigrams. A novel computational technique is introduced to decompose such networks, whereby the notion of domain "networking versatility" is derived and measured. The most and least "versatile" domains (termed "core domains" and "peripheral domains" respectively) are examined both computationally via sequence conservation measures and experimentally using selected domains. Our study suggests that such a versatility measure extracted from the bigram networks correlates with the adaptivity of domains during evolution, where the network core domains are highly adaptive, significantly contrasting the network peripheral domains. Conclusions Domain recombination has played a major part in the evolution of eukaryotes attributing to genome complexity. From a system point of view, as the results of selection and constant refinement, networks of domain linkage are structured in a hierarchical modular fashion. 
Domains with a high degree of networking versatility appear to be evolutionarily adaptive, potentially through functional innovations. Domain bigram networks are informative as a model of biological functions. The networking versatility indices extracted from such networks for individual domains reflect the strength of evolutionary selection that the domains have experienced. PMID:21849086
Evolutionary versatility of eukaryotic protein domains revealed by their bigram networks.

PubMed

Xie, Xueying; Jin, Jing; Mao, Yongyi

2011-08-18

Protein domains are globular structures of independently folded polypeptides that exert catalytic or binding activities. Their sequences are recognized as evolutionary units that, through genome recombination, constitute protein repertoires of linkage patterns. Via mutations, domains acquire modified functions that contribute to the fitness of cells and organisms. Recent studies have addressed the evolutionary selection that may have shaped the functions of individual domains and the emergence of particular domain combinations, which led to new cellular functions in multi-cellular animals. This study focuses on modeling domain linkage globally and investigates evolutionary implications that may be revealed by novel computational analysis. A survey of 77 completely sequenced eukaryotic genomes implies a potential hierarchical and modular organization of biological functions in most living organisms. Domains in a genome or multiple genomes are modeled as a network of hetero-duplex covalent linkages, termed bigrams. A novel computational technique is introduced to decompose such networks, whereby the notion of domain "networking versatility" is derived and measured. The most and least "versatile" domains (termed "core domains" and "peripheral domains" respectively) are examined both computationally via sequence conservation measures and experimentally using selected domains. Our study suggests that such a versatility measure extracted from the bigram networks correlates with the adaptivity of domains during evolution, where the network core domains are highly adaptive, significantly contrasting the network peripheral domains. Domain recombination has played a major part in the evolution of eukaryotes attributing to genome complexity. From a system point of view, as the results of selection and constant refinement, networks of domain linkage are structured in a hierarchical modular fashion. 
Domains with a high degree of networking versatility appear to be evolutionarily adaptive, potentially through functional innovations. Domain bigram networks are informative as a model of biological functions. The networking versatility indices extracted from such networks for individual domains reflect the strength of evolutionary selection that the domains have experienced.
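To make the notion of a domain bigram network concrete, here is a minimal sketch that builds one from toy domain architectures. It assumes that a bigram is an unordered pair of distinct adjacent domains, and it uses the number of distinct neighbors as a crude stand-in for versatility. The abstract's actual versatility index comes from a network decomposition that is not reproduced here, and the architectures below are hypothetical.

```python
from collections import defaultdict


def domain_bigrams(architectures):
    """Collect unordered pairs of distinct adjacent domains (hetero bigrams)."""
    bigrams = set()
    for domains in architectures:
        for left, right in zip(domains, domains[1:]):
            if left != right:  # skip tandem repeats of the same domain
                bigrams.add(frozenset((left, right)))
    return bigrams


def neighbor_counts(bigrams):
    """Return, for each domain, the number of distinct partner domains."""
    neighbors = defaultdict(set)
    for pair in bigrams:
        first, second = tuple(pair)
        neighbors[first].add(second)
        neighbors[second].add(first)
    return {domain: len(partners) for domain, partners in neighbors.items()}


# Hypothetical proteins written as ordered lists of domain names.
toy_architectures = [
    ["SH3", "SH2", "Pkinase"],
    ["SH2", "Pkinase"],
    ["PH", "SH2"],
    ["SH3", "SH3"],
]
print(neighbor_counts(domain_bigrams(toy_architectures)))
```

In this toy network the domain with the most distinct partners would be the candidate "core" domain, while single-partner domains sit at the periphery.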
Determination of Fundamental Properties of an M31 Globular Cluster from Main-Sequence Photometry

NASA Astrophysics Data System (ADS)

Ma, Jun; Wu, Zhenyu; Wang, Song; Fan, Zhou; Zhou, Xu; Wu, Jianghua; Jiang, Zhaoji; Chen, Jiansheng

2010-10-01

M31 globular cluster B379 is the first extragalactic cluster whose age was determined by main-sequence photometry. In the main-sequence photometric method, the age of a cluster is obtained by fitting its color-magnitude diagram (CMD) with stellar evolutionary models. However, different stellar evolutionary models adopt different inputs for stellar evolution, such as the range of stellar masses, opacities, equations of state, and other recipes. It is therefore interesting to check whether different stellar evolutionary models can give consistent results for the same cluster. Brown et al. constrained the age of B379 by comparing its CMD with isochrones of the 2006 VandenBerg models. Using SSP models of Bruzual & Charlot and its multiphotometry, Ma et al. independently determined the age of B379, which is in good agreement with the determination of Brown et al. The models of Bruzual & Charlot are calculated based on the Padova evolutionary tracks. It is necessary to check whether the age of B379 as determined based on the Padova evolutionary tracks is in agreement with the determination of Brown et al. In this article, we redetermine the age of B379 using isochrones of the Padova stellar evolutionary models. In addition, the metal abundance, the distance modulus, and the reddening value for B379 are reported. The results obtained are consistent with the previous determinations, which include the age obtained by Brown et al. This article thus confirms the consistency of the age scale of B379 between the Padova isochrones and the 2006 VandenBerg isochrones; i.e., the comparison between the results of Brown et al. and Ma et al. is meaningful. The values found for B379 in this article are: metallicity [M/H] = log(Z/Z⊙) = -0.325, age τ = 11.0 ± 1.5 Gyr, reddening E(B - V) = 0.08, and distance modulus (m - M)_0 = 24.44 ± 0.10.
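For readers who want to translate the reported distance modulus of 24.44 ± 0.10 into a distance, the standard relation is d = 10^((μ + 5)/5) parsecs. The short sketch below applies it. The function name is an assumption of this example, and because the quoted modulus is the reddening-corrected value, no extinction term is included.

```python
def distance_parsecs(distance_modulus):
    """Convert a true distance modulus (m - M)_0 into a distance in parsecs."""
    return 10 ** ((distance_modulus + 5.0) / 5.0)


mu, sigma = 24.44, 0.10  # values quoted in the abstract above
for label, value in (("lower", mu - sigma), ("central", mu), ("upper", mu + sigma)):
    print(f"{label:7s} mu={value:.2f}  d={distance_parsecs(value) / 1e3:.0f} kpc")
```

The central value corresponds to a distance on the order of 770 kpc, in line with commonly quoted distances to M31.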
Exploring Connectivity in Sequence Space of Functional RNA

NASA Technical Reports Server (NTRS)

Wei, Chenyu; Pohorille, Andrzej; Popovic, Milena; Ditzler, Mark

2017-01-01

Emergence of replicable genetic molecules was one of the marking points in the origin of life, evolution of which can be conceptualized as a walk through the space of all possible sequences. A theoretical concept of fitness landscape helps to understand evolutionary processes through assigning a value of fitness to each genotype. Then, evolution of a phenotype is viewed as a series of consecutive, single-point mutations. Natural selection biases evolution toward peaks of high fitness and away from valleys of low fitness, whereas neutral drift occurs in the sequence space without direction as mutations are introduced at random. Large networks of neutral or near-neutral mutations on a fitness landscape, especially for sufficiently long genomes, are possible or even inevitable. Their detection in experiments, however, has been elusive. Although a few near-neutral evolutionary pathways have been found, recent experimental evidence indicates that landscapes consist of largely isolated islands. The generality of these results, however, is not clear, as the genome length or the fraction of functional molecules in the genotypic space might have been insufficient for the emergence of large, neutral networks. Thorough investigation of the structure of the fitness landscape is essential to understand the mechanisms of evolution of early genomes. RNA molecules are commonly assumed to play the pivotal role in the origin of genetic systems. They are widely believed to be early, if not the earliest, genetic and catalytic molecules, with abundant biochemical activities as aptamers and ribozymes, i.e. RNA molecules capable, respectively, of binding small molecules or catalyzing chemical reactions. Here, we present results of our recent studies on the structure of the sequence space of RNA ligase ribozymes selected through in vitro evolution. Several hundred thousand sequences, active to different degrees, were obtained by way of deep sequencing.
Analysis of these sequences revealed several large clusters defined such that every sequence in a cluster can be reached from any other sequence in the same cluster through a series of single point mutations. Sequences in a single cluster appear to adopt more than one secondary structure. The mechanism of refolding within a single cluster was examined. To shed light on possible evolutionary paths in the space of ribozymes, the connectivity between clusters was investigated. The effect of length of RNA molecules on the structure of the fitness landscape and possible evolutionary paths was examined by way of comparing functional sequences of 20 and 80 nucleobases in length. It was found that sequences of different lengths shared secondary structure motifs that were presumed responsible for catalytic activity, with increasing complexity and global structural rearrangements emerging in longer molecules.
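The cluster definition used above (every sequence reachable from any other through a series of single point mutations) corresponds to connected components of a graph whose edges join sequences at Hamming distance 1. The sketch below computes such components by breadth-first search. It assumes point mutations are substitutions between equal-length sequences, and the toy sequences are hypothetical. The quadratic neighbor search is only suitable for small sets, not for hundreds of thousands of reads.

```python
from collections import deque


def hamming(a, b):
    """Number of mismatched positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


def mutation_clusters(sequences):
    """Group sequences into clusters connected by single-substitution steps."""
    ordered = sorted(set(sequences))
    unvisited = set(ordered)
    clusters = []
    for start in ordered:
        if start not in unvisited:
            continue
        unvisited.discard(start)
        cluster, queue = [start], deque([start])
        while queue:
            current = queue.popleft()
            # Find all not-yet-visited sequences one substitution away.
            neighbors = [
                s for s in unvisited
                if len(s) == len(current) and hamming(s, current) == 1
            ]
            for s in neighbors:
                unvisited.discard(s)
                cluster.append(s)
                queue.append(s)
        clusters.append(sorted(cluster))
    return clusters


# Hypothetical sequences, for illustration only.
toy = ["AAAA", "AAAC", "AACC", "GGGG", "GGGU", "UUUU"]
print(mutation_clusters(toy))
```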
An Automated Pipeline for Engineering Many-Enzyme Pathways: Computational Sequence Design, Pathway Expression-Flux Mapping, and Scalable Pathway Optimization.

PubMed

Halper, Sean M; Cetnar, Daniel P; Salis, Howard M

2018-01-01

Engineering many-enzyme metabolic pathways suffers from the design curse of dimensionality. There are an astronomical number of synonymous DNA sequence choices, though relatively few will express an evolutionarily robust, maximally productive pathway without metabolic bottlenecks. To solve this challenge, we have developed an integrated, automated computational-experimental pipeline that identifies a pathway's optimal DNA sequence without high-throughput screening or many cycles of design-build-test. The first step applies our Operon Calculator algorithm to design a host-specific, evolutionarily robust bacterial operon sequence with maximally tunable enzyme expression levels. The second step applies our RBS Library Calculator algorithm to systematically vary enzyme expression levels with the smallest-sized library. After characterizing a small number of constructed pathway variants, measurements are supplied to our Pathway Map Calculator algorithm, which then parameterizes a kinetic metabolic model that ultimately predicts the pathway's optimal enzyme expression levels and DNA sequences. Altogether, our algorithms provide the ability to efficiently map the pathway's sequence-expression-activity space and predict DNA sequences with desired metabolic fluxes. Here, we provide a step-by-step guide to applying the Pathway Optimization Pipeline on a desired multi-enzyme pathway in a bacterial host.
Rapidly rotating neutron stars in general relativity: Realistic equations of state

NASA Technical Reports Server (NTRS)

Cook, Gregory B.; Shapiro, Stuart L.; Teukolsky, Saul A.

1994-01-01

We construct equilibrium sequences of rotating neutron stars in general relativity. We compare results for 14 nuclear matter equations of state. We determine a number of important physical parameters for such stars, including the maximum mass and maximum spin rate. The stability of the configurations to quasi-radial perturbations is assessed. We employ a numerical scheme particularly well suited to handle rapid rotation and large departures from spherical symmetry. We provide an extensive tabulation of models for future reference. Two classes of evolutionary sequences of fixed baryon rest mass and entropy are explored: normal sequences, which behave very much like Newtonian sequences, and supramassive sequences, which exist for neutron stars solely because of general relativistic effects. Adiabatic dissipation of energy and angular momentum causes a star to evolve in quasi-stationary fashion along an evolutionary sequence. Supramassive sequences have masses exceeding the maximum mass of a nonrotating neutron star. A supramassive star evolves toward eventual catastrophic collapse to a black hole. Prior to collapse, the star actually spins up as it loses angular momentum, an effect that may provide an observable precursor to gravitational collapse to a black hole.
An experimental and computational evolution-based method to study a mode of co-evolution of overlapping open reading frames in the AAV2 viral genome.

PubMed

Kawano, Yasuhiro; Neeley, Shane; Adachi, Kei; Nakai, Hiroyuki

2013-01-01

Overlapping open reading frames (ORFs) in viral genomes undergo co-evolution; however, how individual amino acids coded by overlapping ORFs are structurally, functionally, and co-evolutionarily constrained remains difficult to address by conventional homologous sequence alignment approaches. We report here a new experimental and computational evolution-based methodology to address this question and report its preliminary application to elucidating a mode of co-evolution of the frame-shifted overlapping ORFs in the adeno-associated virus (AAV) serotype 2 viral genome. These ORFs encode both capsid VP protein and non-structural assembly-activating protein (AAP). To show proof of principle of the new method, we focused on the evolutionarily conserved QVKEVTQ and KSKRSRR motifs, a pair of overlapping heptapeptides in VP and AAP, respectively. In the new method, we first identified a large number of capsid-forming VP3 mutants and functionally competent AAP mutants of these motifs from mutant libraries by experimental directed evolution under no co-evolutionary constraints. We used Illumina sequencing to obtain a large dataset and then statistically assessed the viability of VP and AAP heptapeptide mutants. The obtained heptapeptide information was then integrated into an evolutionary algorithm, with which VP and AAP were co-evolved from random or native nucleotide sequences in silico. As a result, we demonstrate that these two heptapeptide motifs could exhibit high degeneracy if coded by separate nucleotide sequences, and elucidate how overlap-evoked co-evolutionary constraints play a role in making the VP and AAP heptapeptide sequences into the present shape. 
Specifically, we demonstrate that two valine (V) residues and β-strand propensity in QVKEVTQ are structurally important, the strongly basic and hydrophilic nature of KSKRSRR is functionally important, and overlap-evoked co-evolution imposes strong constraints on serine (S) residues in KSKRSRR, despite high degeneracy of the motifs in the absence of co-evolutionary constraints.
Using hidden Markov models and observed evolution to annotate viral genomes.

PubMed

McCauley, Stephen; Hein, Jotun

2006-06-01

Single-stranded RNA (ssRNA) viral genomes are generally constrained in length and utilize overlapping reading frames to maximally exploit the coding potential within the genome length restrictions. This overlapping coding phenomenon leads to complex evolutionary constraints operating on the genome. In regions which code for more than one protein, silent mutations in one reading frame generally have a protein coding effect in another. To maximize coding flexibility in all reading frames, overlapping regions are often compositionally biased towards amino acids which are 6-fold degenerate with respect to the 64-codon alphabet. Previous methodologies have used this fact in an ad hoc manner to look for overlapping genes by motif matching. In this paper, differentiated nucleotide compositional patterns in overlapping regions are incorporated into a probabilistic hidden Markov model (HMM) framework which is used to annotate ssRNA viral genomes. This work focuses on single sequence annotation and applies an HMM framework to ssRNA viral annotation. A description of how the HMM is parameterized, whilst annotating within a missing data framework, is given. A Phylogenetic HMM (Phylo-HMM) extension, as applied to 14 aligned HIV2 sequences, is also presented. This evolutionary extension serves as an illustration of the potential of the Phylo-HMM framework for ssRNA viral genomic annotation. The single sequence annotation procedure (SSA) is applied to 14 different strains of the HIV2 virus. Further results on alternative ssRNA viral genomes are presented to illustrate more generally the performance of the method. The results of the SSA method are encouraging; however, there is still room for improvement, and since there is overwhelming evidence to indicate that comparative methods can improve coding sequence (CDS) annotation, the SSA method is extended to a Phylo-HMM to incorporate evolutionary information.
The Phylo-HMM extension is applied to the same set of 14 HIV2 sequences which are pre-aligned. The performance improvement that results from including the evolutionary information in the analysis is illustrated.
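The abstract's reference to amino acids that are 6-fold degenerate in the 64-codon alphabet can be checked directly against the standard genetic code. The sketch below builds the standard codon table from its usual compact encoding (bases in TCAG order) and lists the amino acids encoded by six codons. The variable and function names are assumptions of this example, and it does not reproduce any part of the HMM itself.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated with bases in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codon_degeneracy():
    """Count how many codons encode each amino acid (stop codons excluded)."""
    return Counter(aa for aa in CODON_TABLE.values() if aa != "*")


def six_fold_amino_acids():
    """Return the one-letter codes of amino acids encoded by six codons."""
    return sorted(aa for aa, n in codon_degeneracy().items() if n == 6)


print(six_fold_amino_acids())  # ['L', 'R', 'S']
```

The three 6-fold degenerate amino acids in the standard code are leucine, arginine and serine.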
Integrating protein structural dynamics and evolutionary analysis with Bio3D.

PubMed

Skjærven, Lars; Yao, Xin-Qiu; Scarabelli, Guido; Grant, Barry J

2014-12-10

Popular bioinformatics approaches for studying protein functional dynamics include comparisons of crystallographic structures, molecular dynamics simulations and normal mode analysis. However, determining how observed displacements and predicted motions from these traditionally separate analyses relate to each other, as well as to the evolution of sequence, structure and function within large protein families, remains a considerable challenge. This is in part due to the general lack of tools that integrate information of molecular structure, dynamics and evolution. Here, we describe the integration of new methodologies for evolutionary sequence, structure and simulation analysis into the Bio3D package. This major update includes unique high-throughput normal mode analysis for examining and contrasting the dynamics of related proteins with non-identical sequences and structures, as well as new methods for quantifying dynamical couplings and their residue-wise dissection from correlation network analysis. These new methodologies are integrated with major biomolecular databases as well as established methods for evolutionary sequence and comparative structural analysis. New functionality for directly comparing results derived from normal modes, molecular dynamics and principal component analysis of heterogeneous experimental structure distributions is also included. We demonstrate these integrated capabilities with example applications to dihydrofolate reductase and heterotrimeric G-protein families along with a discussion of the mechanistic insight provided in each case. The integration of structural dynamics and evolutionary analysis in Bio3D enables researchers to go beyond a prediction of single protein dynamics to investigate dynamical features across large protein families. The Bio3D package is distributed with full source code and extensive documentation as a platform independent R package under a GPL2 license from http://thegrantlab.org/bio3d/ .
Identifying the pattern of molecular evolution for Zaire ebolavirus in the 2014 outbreak in West Africa.

PubMed

Liu, Si-Qing; Deng, Cheng-Lin; Yuan, Zhi-Ming; Rayner, Simon; Zhang, Bo

2015-06-01

The current Ebola virus disease (EVD) epidemic has killed more people than all previous Ebola outbreaks combined and, even as efforts appear to be bringing the outbreak under control, the threat of reemergence remains. The availability of new whole-genome sequences from the 2014 West Africa outbreak, together with those from the earlier outbreaks, provides an opportunity to investigate the genetic characteristics, the epidemiological dynamics and the evolutionary history of Zaire ebolavirus (ZEBOV). To investigate the evolutionary properties of ZEBOV in this outbreak, we examined amino acid mutations, positive selection, and evolutionary rates on the basis of 123 ZEBOV genome sequences. The estimated phylogenetic relationships within ZEBOV revealed that viral sequences from the same period or location formed a distinct cluster. The West Africa viruses probably derived from Middle Africa, consistent with results from previous studies. Analysis of the seven protein regions of ZEBOV revealed evidence of positive selection acting on the GP and L genes. Interestingly, all putatively positively selected sites identified in the GP are located within the mucin-like domain of the solved structure of the protein, suggesting a possible role in the immune evasion properties of ZEBOV. Compared with earlier outbreaks, the evolutionary rate of the GP gene was estimated to have accelerated significantly in the 2014 outbreak, suggesting that more ZEBOV variants are generated for human-to-human transmission during this sweeping epidemic. However, a more balanced sample set and next generation sequencing datasets would help achieve a clearer understanding at the genetic level of how the virus is evolving and adapting to new conditions. Copyright © 2015 Elsevier B.V. All rights reserved.
Effects of Mitochondrial DNA Rate Variation on Reconstruction of Pleistocene Demographic History in a Social Avian Species, Pomatostomus superciliosus

PubMed Central

Norman, Janette A.; Blackmore, Caroline J.; Rourke, Meaghan; Christidis, Les

2014-01-01

Mitochondrial sequence data is often used to reconstruct the demographic history of Pleistocene populations in an effort to understand how species have responded to past climate change events. However, departures from neutral equilibrium conditions can confound evolutionary inference in species with structured populations or those that have experienced periods of population expansion or decline. Selection can affect patterns of mitochondrial DNA variation and variable mutation rates among mitochondrial genes can compromise inferences drawn from single markers. We investigated the contribution of these factors to patterns of mitochondrial variation and estimates of time to most recent common ancestor (TMRCA) for two clades in a co-operatively breeding avian species, the white-browed babbler Pomatostomus superciliosus. Both the protein-coding ND3 gene and hypervariable domain I control region sequences showed departures from neutral expectations within the superciliosus clade, and a two-fold difference in TMRCA estimates. Bayesian phylogenetic analysis provided evidence of departure from a strict clock model of molecular evolution in domain I, leading to an over-estimation of TMRCA for the superciliosus clade at this marker. Our results suggest mitochondrial studies that attempt to reconstruct Pleistocene demographic histories should rigorously evaluate data for departures from neutral equilibrium expectations, including variation in evolutionary rates across multiple markers. Failure to do so can lead to serious errors in the estimation of evolutionary parameters and subsequent demographic inferences concerning the role of climate as a driver of evolutionary change. These effects may be especially pronounced in species with complex social structures occupying heterogeneous environments. 
We propose that environmentally driven differences in social structure may explain observed differences in evolutionary rate of domain I sequences, resulting from longer than expected retention times for matriarchal lineages in the superciliosus clade. PMID:25181547
Visual system evolution and the nature of the ancestral snake.

PubMed

Simões, B F; Sampaio, F L; Jared, C; Antoniazzi, M M; Loew, E R; Bowmaker, J K; Rodriguez, A; Hart, N S; Hunt, D M; Partridge, J C; Gower, D J

2015-07-01

The dominant hypothesis for the evolutionary origin of snakes from 'lizards' (non-snake squamates) is that stem snakes acquired many snake features while passing through a profound burrowing (fossorial) phase. To investigate this, we examined the visual pigments and their encoding opsin genes in a range of squamate reptiles, focusing on fossorial lizards and snakes. We sequenced opsin transcripts isolated from retinal cDNA and used microspectrophotometry to measure directly the spectral absorbance of the photoreceptor visual pigments in a subset of samples. In snakes, but not lizards, dedicated fossoriality (as in Scolecophidia and the alethinophidian Anilius scytale) corresponds with loss of all visual opsins other than RH1 (λmax 490-497 nm); all other snakes (including less dedicated burrowers) also have functional sws1 and lws opsin genes. In contrast, the retinas of all lizards sampled, even highly fossorial amphisbaenians with reduced eyes, express functional lws, sws1, sws2 and rh1 genes, and most also express rh2 (i.e. they express all five of the visual opsin genes present in the ancestral vertebrate). Our evidence of visual pigment complements suggests that the visual system of stem snakes was partly reduced, with two (RH2 and SWS2) of the ancestral vertebrate visual pigments being eliminated, but that this did not extend to the extreme additional loss of SWS1 and LWS that subsequently occurred (probably independently) in highly fossorial extant scolecophidians and A. scytale. We therefore consider it unlikely that the ancestral snake was as fossorial as extant scolecophidians, whether or not the latter are para- or monophyletic. Journal of Evolutionary Biology © 2015 European Society For Evolutionary Biology.
EVOLUTION OF INTERMEDIATE-MASS X-RAY BINARIES DRIVEN BY THE MAGNETIC BRAKING OF AP/BP STARS. I. ULTRACOMPACT X-RAY BINARIES

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chen, Wen-Cong; Podsiadlowski, Philipp, E-mail: chenwc@pku.edu.cn

2016-10-20

It is generally believed that ultracompact X-ray binaries (UCXBs) evolved from binaries consisting of a neutron star accreting from a low-mass white dwarf (WD) or helium star where mass transfer is driven by gravitational radiation. However, the standard WD evolutionary channel cannot produce the relatively long-period (40–60 minutes) UCXBs with a high time-averaged mass-transfer rate. In this work, we explore an alternative evolutionary route toward UCXBs, where the companions evolve from intermediate-mass Ap/Bp stars with an anomalously strong magnetic field (100–10,000 G). Including the magnetic braking caused by the coupling between the magnetic field and an irradiation-driven wind induced by the X-ray flux from the accreting component, we show that intermediate-mass X-ray binaries (IMXBs) can evolve into UCXBs. Using the MESA code, we have calculated evolutionary sequences for a large number of IMXBs. The simulated results indicate that, for a small wind-driving efficiency f = 10^−5, the anomalous magnetic braking can drive IMXBs to an ultra-short period of 11 minutes. Comparing our simulated results with the observed parameters of 15 identified UCXBs, the anomalous magnetic braking evolutionary channel can account for the formation of seven and eight sources with f = 10^−3 and 10^−5, respectively. In particular, a relatively large value of f can fit three of the long-period, persistent sources with a high mass-transfer rate. Though the proportion of Ap/Bp stars in intermediate-mass stars is only 5%, the lifetime of the UCXB phase is ≳2 Gyr, producing a relatively high number of observable systems, making this an alternative evolutionary channel for the formation of UCXBs.
Finding a (pine) needle in a haystack: chloroplast genome sequence divergence in rare and widespread pines

Treesearch

J.B. Whittall; J. Syring; M. Parks; J. Buenrostro; C. Dick; A. Liston; R. Cronn

2010-01-01

Critical to conservation efforts and other investigations at low taxonomic levels, DNA sequence data offer important insights into the distinctiveness, biogeographic partitioning, and evolutionary histories of species. The resolving power of DNA sequences is often limited by insufficient variability at the intraspecific level. This is particularly true of studies...
Towards a physics of evolution: Critical diversity dynamics at the edges of collapse and bursts of diversification

NASA Astrophysics Data System (ADS)

Hanel, Rudolf; Kauffman, Stuart A.; Thurner, Stefan

2007-09-01

Systems governed by the standard mechanisms of biological or technological evolution are often described by catalytic evolution equations. We study the structure of these equations and find an analogy with classical thermodynamic systems. In particular, we can demonstrate the existence of several distinct phases of evolutionary dynamics: a phase of fast growing diversity, one of stationary, finite diversity, and one of rapidly decaying diversity. While the first two phases have been subject to previous work, here we focus on the destructive aspects—in particular the phase diagram—of evolutionary dynamics. The main message is that within a critical region, massive loss of diversity can be triggered by very small external fluctuations. We further propose a dynamical model of diversity which captures spontaneous creation and destruction processes fully respecting the phase diagrams of evolutionary systems. The emergent time series show rich diversity dynamics, including power laws as observed in actual economical data, e.g., firm bankruptcy data. We believe the present model presents a possibility to cast the famous qualitative picture of Schumpeterian economic evolution, into a quantifiable and testable framework.

Invariant glycines and prolines flanking in loops the strand beta 2 of various (alpha/beta)8-barrel enzymes: a hidden homology?

PubMed Central

Janecek, S.

1996-01-01

The question of parallel (alpha/beta)8-barrel fold evolution remains unclear, owing mainly to the lack of sequence homology throughout the amino acid sequences of (alpha/beta)8-barrel enzymes. The "classical" approaches used in the search for homologies among (alpha/beta)8-barrels (e.g., production of structurally based alignments) have yielded alignments perfect from the structural point of view, but the approaches have been unable to reveal the homologies. These are proposed to be "hidden" in (alpha/beta)8-barrel enzymes. The term "hidden homology" means that the alignment of sequence stretches proposed to be homologous need not be structurally fully satisfactory. This is due to the very long evolutionary history of all (alpha/beta)8-barrels. This work identifies so-called hidden homology around the strand beta 2 that is flanked by loops containing invariant glycines and prolines in 17 different (alpha/beta)8-barrel enzymes, i.e., roughly in half of all currently known (alpha/beta)8-barrel proteins. The search was based on the idea that a conserved sequence region of an (alpha/beta)8-barrel enzyme should be more or less conserved also in the equivalent part of the structure of the other enzymes with this folding motif, given their mutual evolutionary relatedness. For this purpose, the sequence region around the well-conserved second beta-strand of alpha-amylase flanked by the invariant glycine and proline (56_GFTAIWITP, Aspergillus oryzae alpha-amylase numbering), was used as the sequence-structural template. The proposal that the second beta-strand of (alpha/beta)8-barrel fold is important from the evolutionary point of view is strongly supported by the increasing trend of the observed beta 2-strand structural similarity for the pairs of (alpha/beta)8-barrel enzymes: alpha-amylase and the alpha-subunit of tryptophan synthase, alpha-amylase and mandelate racemase, and alpha-amylase and cyclodextrin glycosyltransferase. 
This trend is also in agreement with the existing evolutionary division of the entire family of (alpha/beta)8-barrel proteins. PMID:8762144
Invariant glycines and prolines flanking in loops the strand beta 2 of various (alpha/beta)8-barrel enzymes: a hidden homology?

PubMed

Janecek, S

1996-06-01

The question of parallel (alpha/beta)8-barrel fold evolution remains unclear, owing mainly to the lack of sequence homology throughout the amino acid sequences of (alpha/beta)8-barrel enzymes. The "classical" approaches used in the search for homologies among (alpha/beta)8-barrels (e.g., production of structurally based alignments) have yielded alignments perfect from the structural point of view, but the approaches have been unable to reveal the homologies. These are proposed to be "hidden" in (alpha/beta)8-barrel enzymes. The term "hidden homology" means that the alignment of sequence stretches proposed to be homologous need not be structurally fully satisfactory. This is due to the very long evolutionary history of all (alpha/beta)8-barrels. This work identifies so-called hidden homology around the strand beta 2 that is flanked by loops containing invariant glycines and prolines in 17 different (alpha/beta)8-barrel enzymes, i.e., roughly in half of all currently known (alpha/beta)8-barrel proteins. The search was based on the idea that a conserved sequence region of an (alpha/beta)8-barrel enzyme should be more or less conserved also in the equivalent part of the structure of the other enzymes with this folding motif, given their mutual evolutionary relatedness. For this purpose, the sequence region around the well-conserved second beta-strand of alpha-amylase flanked by the invariant glycine and proline (56_GFTAIWITP, Aspergillus oryzae alpha-amylase numbering), was used as the sequence-structural template. The proposal that the second beta-strand of (alpha/beta)8-barrel fold is important from the evolutionary point of view is strongly supported by the increasing trend of the observed beta 2-strand structural similarity for the pairs of (alpha/beta)8-barrel enzymes: alpha-amylase and the alpha-subunit of tryptophan synthase, alpha-amylase and mandelate racemase, and alpha-amylase and cyclodextrin glycosyltransferase. 
This trend is also in agreement with the existing evolutionary division of the entire family of (alpha/beta)8-barrel proteins.
OF TRYPANOSOMATIDS. ENDOTRANSFORMATIONS AND ABERRATIONS].

PubMed

Frolov, A O; Malysheva, M N; Kostygov, A Yu

2016-01-01

Endotransformations and aberrations of the life cycle in the evolutionary history of trypanosomatids (Kinetoplastea: Trypanosomatidae) are analyzed. We treat the term "endotransformations" as evolutionarily fixed changes of phases and/or developmental stages of parasites. By contrast, we treat aberrations as evolutionary unstable, periodically arising deformations of developmental phases of trypanosomatids, never leading to life cycle changes. Various examples of life cycle endotransformations and aberrations in representatives of the family Trypanosomatidae are discussed.
Maximizing ecological and evolutionary insight in bisulfite sequencing data sets

PubMed Central

Lea, Amanda J.; Vilgalys, Tauras P.; Durst, Paul A.P.; Tung, Jenny

2017-01-01

Preface: Genome-scale bisulfite sequencing approaches have opened the door to ecological and evolutionary studies of DNA methylation in many organisms. These approaches can be powerful. However, they introduce new methodological and statistical considerations, some of which are particularly relevant to non-model systems. Here, we highlight how these considerations influence a study’s power to link methylation variation with a predictor variable of interest. Relative to current practice, we argue that sample sizes will need to increase to provide robust insights. We also provide recommendations for overcoming common challenges and an R Shiny app to aid in study design. PMID:29046582
Time Clustered Sampling Can Inflate the Inferred Substitution Rate in Foot-And-Mouth Disease Virus Analyses.

PubMed

Pedersen, Casper-Emil T; Frandsen, Peter; Wekesa, Sabenzia N; Heller, Rasmus; Sangula, Abraham K; Wadsworth, Jemma; Knowles, Nick J; Muwanika, Vincent B; Siegismund, Hans R

2015-01-01

With the emergence of analytical software for the inference of viral evolution, a number of studies have focused on estimating important parameters such as the substitution rate and the time to the most recent common ancestor (tMRCA) for rapidly evolving viruses. Coupled with an increasing abundance of sequence data sampled under widely different schemes, an effort to keep results consistent and comparable is needed. This study emphasizes commonly disregarded problems in the inference of evolutionary rates in viral sequence data when sampling is unevenly distributed on a temporal scale, through a study of the foot-and-mouth disease (FMD) virus serotypes SAT 1 and SAT 2. Our study shows that clustered temporal sampling in phylogenetic analyses of FMD viruses will strongly bias the inferences of substitution rates and tMRCA because the inferred rates in such data sets reflect a rate closer to the mutation rate than to the substitution rate. Estimating evolutionary parameters from viral sequences should be performed with due consideration of the differences in short-term and longer-term evolutionary processes occurring within sets of temporally sampled viruses, and studies should carefully consider how samples are combined.
Novel non-parametric models to estimate evolutionary rates and divergence times from heterochronous sequence data.

PubMed

Fourment, Mathieu; Holmes, Edward C

2014-07-24

Early methods for estimating divergence times from gene sequence data relied on the assumption of a molecular clock. More sophisticated methods were created to model rate variation and used auto-correlation of rates, local clocks, or the so-called "uncorrelated relaxed clock" where substitution rates are assumed to be drawn from a parametric distribution. In the case of Bayesian inference methods, the impact of the prior on branching times is not clearly understood, and if the amount of data is limited the posterior could be strongly influenced by the prior. We develop a maximum likelihood method--Physher--that uses local or discrete clocks to estimate evolutionary rates and divergence times from heterochronous sequence data. Using two empirical data sets we show that our discrete clock estimates are similar to those obtained by other methods, and that Physher outperformed some methods in the estimation of the root age of an influenza virus data set. A simulation analysis suggests that Physher can outperform a Bayesian method when the real topology contains two long branches below the root node, even when evolution is strongly clock-like. These results suggest it is advisable to use a variety of methods to estimate evolutionary rates and divergence times from heterochronous sequence data. Physher and the associated data sets used here are available online at http://code.google.com/p/physher/.
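As a rough illustration of the strict molecular clock assumption mentioned at the start of this abstract, the sketch below fits a least-squares line of root-to-tip genetic distance against sampling date. This is a generic textbook-style regression, not the Physher method; the function name and the toy input values are hypothetical.

```python
def root_to_tip_regression(dates, distances):
    """Least-squares fit of root-to-tip distance against sampling date.

    Assumes a strict clock: distance = rate * (date - root_date).
    Returns (rate, root_date). Not the Physher algorithm.
    """
    n = len(dates)
    mean_t = sum(dates) / n
    mean_d = sum(distances) / n
    sxx = sum((t - mean_t) ** 2 for t in dates)
    sxy = sum((t - mean_t) * (d - mean_d) for t, d in zip(dates, distances))
    rate = sxy / sxx                    # substitutions per site per year
    intercept = mean_d - rate * mean_t  # distance extrapolated to date 0
    root_date = -intercept / rate       # date at which the distance is zero
    return rate, root_date


# Hypothetical sampling years and root-to-tip distances (subs/site).
rate, root_date = root_to_tip_regression([2000, 2005, 2010], [0.02, 0.03, 0.04])
```

With heterochronous samples, the slope estimates the evolutionary rate and the x-intercept estimates the age of the root; rate variation among lineages, which the methods in the abstract model explicitly, is ignored here.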
A strategy with novel evolutionary features for the iterated prisoner's dilemma.

PubMed

Li, Jiawei; Kendall, Graham

2009-01-01

In recent iterated prisoner's dilemma tournaments, the most successful strategies were those that had identification mechanisms. By playing a predetermined sequence of moves and learning from their opponents' responses, these strategies managed to identify their opponents. We believe that these identification mechanisms may be very useful in evolutionary games. In this paper one such strategy, which we call collective strategy, is analyzed. Collective strategies apply a simple but efficient identification mechanism (that just distinguishes themselves from other strategies), and this mechanism allows them to only cooperate with their group members and defect against any others. In this way, collective strategies are able to maintain a stable population in evolutionary iterated prisoner's dilemma. By means of an invasion barrier, this strategy is compared with other strategies in evolutionary dynamics in order to demonstrate its evolutionary features. We also find that this collective behavior assists the evolution of cooperation in specific evolutionary environments.
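The identification mechanism described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the three-move handshake, the payoff values, and all class and function names are hypothetical and are not taken from the paper.

```python
C, D = "C", "D"
HANDSHAKE = [C, D, C]  # hypothetical identification sequence (assumption)
# Conventional prisoner's dilemma payoffs (assumed): (row player, column player)
PAYOFF = {(C, C): (3, 3), (C, D): (0, 5), (D, C): (5, 0), (D, D): (1, 1)}


class Collective:
    """Plays a fixed handshake, then cooperates only with players that echoed it."""

    def __init__(self):
        self.opponent_moves = []

    def move(self):
        turn = len(self.opponent_moves)
        if turn < len(HANDSHAKE):
            return HANDSHAKE[turn]
        # The opponent counts as a group member only if its opening matched.
        is_member = self.opponent_moves[:len(HANDSHAKE)] == HANDSHAKE
        return C if is_member else D

    def observe(self, opponent_move):
        self.opponent_moves.append(opponent_move)


class AlwaysDefect:
    """Baseline outsider that defects on every turn."""

    def move(self):
        return D

    def observe(self, opponent_move):
        pass


def play(a, b, rounds=10):
    """Play a fixed number of rounds and return the two total scores."""
    score_a = score_b = 0
    for _ in range(rounds):
        move_a, move_b = a.move(), b.move()
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        a.observe(move_b)
        b.observe(move_a)
    return score_a, score_b
```

Under these assumptions, `play(Collective(), Collective())` returns `(28, 28)`, while `play(Collective(), AlwaysDefect())` returns `(8, 18)`: group members settle into mutual cooperation after the handshake, and outsiders are met with defection once the handshake fails.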
The Solute Carrier Families Have a Remarkably Long Evolutionary History with the Majority of the Human Families Present before Divergence of Bilaterian Species

PubMed Central

Höglund, Pär J.; Nordström, Karl J.V.; Schiöth, Helgi B.; Fredriksson, Robert

2011-01-01

The Solute Carriers (SLCs) are membrane proteins that regulate transport of many types of substances across the cell membrane. The SLCs are found in at least 46 gene families in the human genome. Here, we performed the first evolutionary analysis of the entire SLC family based on whole genome sequences. We systematically mined and analyzed the genomes of 17 species to identify SLC genes. In all, we identified 4,813 SLC sequences in these genomes, and we delineated the evolutionary history of each of the subgroups. Moreover, we also identified ten new human sequences not previously classified as SLCs, which most likely belong to the SLC family. We found that 43 of the 46 SLC families found in Homo sapiens were also found in Caenorhabditis elegans, whereas 42 of them were also found in insects. Mammals have a higher number of SLC genes in most families, perhaps reflecting important roles for these in central nervous system functions. This study provides a systematic analysis of the evolutionary history of the SLC families in Eukaryotes showing that the SLC superfamily is ancient with multiple branches that were present before early divergence of Bilateria. The results provide a foundation for overall classification of SLC genes and are valuable for annotation and prediction of substrates for the many SLCs that have not been tested in experimental transport assays. PMID:21186191
Evolutionary distance from human homologs reflects allergenicity of animal food proteins.

PubMed

Jenkins, John A; Breiteneder, Heimo; Mills, E N Clare

2007-12-01

In silico analysis of allergens can identify putative relationships among protein sequence, structure, and allergenic properties. Such systematic analysis reveals that most plant food allergens belong to a restricted number of protein superfamilies, with pollen allergens behaving similarly. We have investigated the structural relationships of animal food allergens and their evolutionary relatedness to human homologs to define how closely a protein must resemble a human counterpart to lose its allergenic potential. Profile-based sequence homology methods were used to classify animal food allergens into Pfam families, and in silico analyses of their evolutionary and structural relationships were performed. Animal food allergens could be classified into 3 main families--tropomyosins, EF-hand proteins, and caseins--along with 14 minor families each composed of 1 to 3 allergens. The evolutionary relationships of each of these allergen superfamilies showed that in general, proteins with a sequence identity to a human homolog above approximately 62% were rarely allergenic. Single substitutions in otherwise highly conserved regions containing IgE epitopes in EF-hand parvalbumins may modulate allergenicity. These data support the premise that certain protein structures are more allergenic than others. Contrasting with plant food allergens, animal allergens, such as the highly conserved tropomyosins, challenge the capability of the human immune system to discriminate between foreign and self-proteins. Such immune responses run close to becoming autoimmune responses. Exploiting the closeness between animal allergens and their human homologs in the development of recombinant allergens for immunotherapy will need to consider the potential for developing unanticipated autoimmune responses.
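The sequence-identity comparison underlying the approximately 62% observation can be illustrated with a short sketch. It assumes the two sequences are already aligned to equal length with `-` as the gap character, and it counts identity only over columns where neither sequence has a gap (one of several definitions in use). The function names and example sequences are hypothetical, and the cutoff is the approximate trend reported in the abstract, not a validated classifier.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip gapped columns
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def above_identity_cutoff(identity, cutoff=62.0):
    """True if identity to the human homolog exceeds the approximate cutoff."""
    return identity > cutoff


# Toy aligned fragments (made up for illustration).
identity = percent_identity("ACDE-F", "ACDQWF")
```

Here four of the five ungapped columns match, so `identity` is 80.0 and `above_identity_cutoff(identity)` is `True`.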
Evolutionary and Functional Relationships in the Truncated Hemoglobin Family.

PubMed

Bustamante, Juan P; Radusky, Leandro; Boechi, Leonardo; Estrin, Darío A; Ten Have, Arjen; Martí, Marcelo A

2016-01-01

Predicting function from sequence is an important goal in current biological research, and although broad functional assignment is possible when a protein is assigned to a family, predicting functional specificity with accuracy is not straightforward. If function is provided by key structural properties and the relevant properties can be computed using the sequence as the starting point, it should in principle be possible to predict function in detail. The truncated hemoglobin family presents an interesting benchmark study due to its ubiquity, sequence diversity in the context of a conserved fold and the number of characterized members. Their functions are tightly related to O2 affinity and reactivity, as determined by the association and dissociation rate constants, both of which can be predicted and analyzed using in silico tools. In the present work we have applied a strategy, which combines homology modeling with molecular-based energy calculations, to predict and analyze function of all known truncated hemoglobins in an evolutionary context. Our results show that truncated hemoglobins present conserved family features, but that their structure is flexible enough to allow the switch from high to low affinity in a few evolutionary steps. Most proteins display moderate to high oxygen affinities and multiple ligand migration paths, which, besides some minor trends, show heterogeneous distributions throughout the phylogenetic tree, again suggesting fast functional adaptation. Our data not only deepen our comprehension of the structural basis governing ligand affinity, but also highlight some interesting functional evolutionary trends.
Evolutionary and Functional Relationships in the Truncated Hemoglobin Family

PubMed Central

Bustamante, Juan P.; Radusky, Leandro; Boechi, Leonardo; Estrin, Darío A.; ten Have, Arjen; Martí, Marcelo A.

2016-01-01

Predicting function from sequence is an important goal in current biological research, and although broad functional assignment is possible when a protein is assigned to a family, predicting functional specificity with accuracy is not straightforward. If function is provided by key structural properties and the relevant properties can be computed using the sequence as the starting point, it should in principle be possible to predict function in detail. The truncated hemoglobin family presents an interesting benchmark study due to its ubiquity, sequence diversity in the context of a conserved fold and the number of characterized members. Their functions are tightly related to O2 affinity and reactivity, as determined by the association and dissociation rate constants, both of which can be predicted and analyzed using in silico tools. In the present work we have applied a strategy, which combines homology modeling with molecular-based energy calculations, to predict and analyze function of all known truncated hemoglobins in an evolutionary context. Our results show that truncated hemoglobins present conserved family features, but that their structure is flexible enough to allow the switch from high to low affinity in a few evolutionary steps. Most proteins display moderate to high oxygen affinities and multiple ligand migration paths, which, besides some minor trends, show heterogeneous distributions throughout the phylogenetic tree, again suggesting fast functional adaptation. Our data not only deepen our comprehension of the structural basis governing ligand affinity, but also highlight some interesting functional evolutionary trends. PMID:26788940
EvoluCode: Evolutionary Barcodes as a Unifying Framework for Multilevel Evolutionary Data.

PubMed

Linard, Benjamin; Nguyen, Ngoc Hoan; Prosdocimi, Francisco; Poch, Olivier; Thompson, Julie D

2012-01-01

Evolutionary systems biology aims to uncover the general trends and principles governing the evolution of biological networks. An essential part of this process is the reconstruction and analysis of the evolutionary histories of these complex, dynamic networks. Unfortunately, the methodologies for representing and exploiting such complex evolutionary histories in large scale studies are currently limited. Here, we propose a new formalism, called EvoluCode (Evolutionary barCode), which allows the integration of different evolutionary parameters (eg, sequence conservation, orthology, synteny …) in a unifying format and facilitates the multilevel analysis and visualization of complex evolutionary histories at the genome scale. The advantages of the approach are demonstrated by constructing barcodes representing the evolution of the complete human proteome. Two large-scale studies are then described: (i) the mapping and visualization of the barcodes on the human chromosomes and (ii) automatic clustering of the barcodes to highlight protein subsets sharing similar evolutionary histories and their functional analysis. The methodologies developed here open the way to the efficient application of other data mining and knowledge extraction techniques in evolutionary systems biology studies. A database containing all EvoluCode data is available at: http://lbgi.igbmc.fr/barcodes.
Evolutionary genomics of miniature inverted-repeat transposable elements (MITEs) in Brassica.

PubMed

Nouroz, Faisal; Noreen, Shumaila; Heslop-Harrison, J S

2015-12-01

Miniature inverted-repeat transposable elements (MITEs) are truncated derivatives of autonomous DNA transposons, and are dispersed abundantly in most eukaryotic genomes. We aimed to characterize various MITE families in Brassica in terms of their presence, sequence characteristics and evolutionary activity. Dot plot analyses involving comparison of homoeologous bacterial artificial chromosome (BAC) sequences allowed identification of 15 novel families of mobile MITEs. Of these, 5 were Stowaway-like with TA Target Site Duplications (TSDs), 4 Tourist-like with TAA/TTA TSDs, 5 Mutator-like with 9-10 bp TSDs and 1 novel MITE (BoXMITE1) flanked by 3 bp TSDs. Our data suggested that there are about 30,000 MITE-related sequences in Brassica rapa and B. oleracea genomes. In situ hybridization showed one abundant family was dispersed in the A-genome, while another was located near 45S rDNA sites. PCR analysis using primers flanking sequences of MITE elements detected MITE insertion polymorphisms between and within the three Brassica (AA, BB, CC) genomes, with many insertions being specific to single genomes and others showing evidence of more recent evolutionary insertions. Our BAC sequence comparison strategy enables identification of evolutionarily active MITEs with no prior knowledge of MITE sequences. The details of MITE families reported in Brassica enable their identification, characterization and annotation. Insertion polymorphisms of MITEs and their transposition activity indicate an important mechanism of genome evolution and diversification. MITE families derived from known Mariner, Harbinger and Mutator DNA transposons were discovered, as well as some novel structures. The identification of Brassica MITEs will have broad applications in Brassica genomics, breeding, hybridization and phylogeny through their use as DNA markers.
Rapid evolutionary change of common bean (Phaseolus vulgaris L) plastome, and the genomic diversification of legume chloroplasts

PubMed Central

Guo, Xianwu; Castillo-Ramírez, Santiago; González, Víctor; Bustos, Patricia; Luís Fernández-Vázquez, José; Santamaría, Rosa Isela; Arellano, Jesús; Cevallos, Miguel A; Dávila, Guillermo

2007-01-01

Background: Fabaceae (legumes) is one of the largest families of flowering plants, and some members are important crops. In contrast to what we know about their great diversity or economic importance, our knowledge at the genomic level of chloroplast genomes (cpDNAs or plastomes) for these crops is limited. Results: We sequenced the complete genome of the common bean (Phaseolus vulgaris cv. Negro Jamapa) chloroplast. The plastome of P. vulgaris is a 150,285 bp circular molecule. It has gene content similar to that of other legume plastomes, but contains two pseudogenes, rpl33 and rps16. A distinct inversion occurred at the junction points of trnH-GUG/rpl14 and rps19/rps8, as in adzuki bean [1]. These two pseudogenes and the inversion were confirmed in 10 varieties representing the two domestication centers of the bean. Genomic comparative analysis indicated that inversions generally occur in legume plastomes and the magnitude and localization of insertions/deletions (indels) also vary. The analysis of repeat sequences demonstrated that patterns and sequences of tandem repeats had an important impact on sequence diversification between legume plastomes and tandem repeats did not belong to dispersed repeats. Interestingly, P. vulgaris plastome had higher evolutionary rates of change on both genomic and gene levels than G. max, which could be the consequence of pressure from both mutation and natural selection. Conclusion: Legume chloroplast genomes are widely diversified in gene content, gene order, indel structure, abundance and localization of repetitive sequences, intracellular sequence exchange and evolutionary rates. The P. vulgaris plastome is a rapidly evolving genome. PMID:17623083
Evolution of the arginase fold and functional diversity

PubMed Central

Dowling, Daniel P.; Costanzo, Luigi Di; Gennadios, Heather A.; Christianson, David W.

2009-01-01

The large number of protein structures deposited in the Protein Data Bank allows for the identification of novel structural superfamilies based on conservation of fold in addition to conservation of amino acid sequence. Since sequence diverges more rapidly than fold in protein evolution, proteins with little or no significant sequence identity are occasionally observed to adopt similar folds, thereby reflecting unanticipated evolutionary relationships. Here, we review the unique α/β fold first observed in the manganese metalloenzyme rat liver arginase, consisting of a parallel 8 stranded β-sheet surrounded by several helices, and its evolutionary relationship with the zinc-requiring and/or iron-requiring histone deacetylases and acetylpolyamine amidohydrolases. Structural comparisons reveal key features of the core α/β fold that contribute to the divergent metal ion specificity and stoichiometry required for the chemical and biological functions of these enzymes. PMID:18360740
Elucidation of cross-species proteomic effects in human and hominin bone proteome identification through a bioinformatics experiment.

PubMed

Welker, F

2018-02-20

The study of ancient protein sequences is increasingly focused on the analysis of older samples, including those of ancient hominins. The analysis of such ancient proteomes thereby potentially suffers from "cross-species proteomic effects": the loss of peptide and protein identifications at increased evolutionary distances due to a larger number of protein sequence differences between the database sequence and the analyzed organism. Error-tolerant proteomic search algorithms should theoretically overcome this problem at both the peptide and protein level; however, this has not been demonstrated. If error-tolerant searches do not overcome the cross-species proteomic issue then there might be inherent biases in the identified proteomes. Here, a bioinformatics experiment is performed to test this using a set of modern human bone proteomes and three independent searches against sequence databases at increasing evolutionary distances: the human (0 Ma), chimpanzee (6-8 Ma) and orangutan (16-17 Ma) reference proteomes, respectively. Incorrectly suggested amino acid substitutions are absent when employing adequate filtering criteria for mutable Peptide Spectrum Matches (PSMs), but roughly half of the mutable PSMs were not recovered. As a result, peptide and protein identification rates are higher in error-tolerant mode compared to non-error-tolerant searches but did not recover protein identifications completely. Data indicates that peptide length and the number of mutations between the target and database sequences are the main factors influencing mutable PSM identification. The error-tolerant results suggest that the cross-species proteomics problem is not overcome at increasing evolutionary distances, even at the protein level. Peptide and protein loss has the potential to significantly impact divergence dating and proteome comparisons when using ancient samples as there is a bias towards the identification of conserved sequences and proteins. 
Effects are minimized between moderately divergent proteomes, as indicated by almost complete recovery of informative positions in the search against the chimpanzee proteome (≈90%, 6-8 Ma). This provides a bioinformatic background to future phylogenetic and proteomic analysis of ancient hominin proteomes, including the future description of novel hominin amino acid sequences, but also has negative implications for the study of fast-evolving proteins in hominins, non-hominin animals, and ancient bacterial proteins in evolutionary contexts.
Spontaneous Spatial Mapping of Learned Sequence in Chimpanzees: Evidence for a SNARC-Like Effect

PubMed Central

Adachi, Ikuma

2014-01-01

In the last couple of decades, there has been a growing number of reports on space-based representation of numbers and serial order in humans. In the present study, to explore evolutionary origins of such representations, we examined whether our closest evolutionary relatives, chimpanzees, map an acquired sequence onto space in a similar way to humans. The subjects had been trained to perform a number sequence task in which they touched a sequence of “small” to “large” Arabic numerals presented in random locations on the monitor. This task was presented in sessions that also included test trials consisting of only two numerals (1 and 9) horizontally arranged. On half of the trials 1 was located to the left of 9, whereas on the other half 1 was to the right of 9. The chimpanzees' performance was systematically influenced by the spatial arrangement of the stimuli; specifically, they responded quicker when 1 was on the left and 9 on the right compared to the other way around. This result suggests that chimpanzees, like humans, spontaneously map a learned sequence onto space. PMID:24643044
The use of museum specimens with high-throughput DNA sequencers

PubMed Central

Burrell, Andrew S.; Disotell, Todd R.; Bergey, Christina M.

2015-01-01

Natural history collections have long been used by morphologists, anatomists, and taxonomists to probe the evolutionary process and describe biological diversity. These biological archives also offer great opportunities for genetic research in taxonomy, conservation, systematics, and population biology. They allow assays of past populations, including those of extinct species, giving context to present patterns of genetic variation and direct measures of evolutionary processes. Despite this potential, museum specimens are difficult to work with because natural postmortem processes and preservation methods fragment and damage DNA. These problems have restricted geneticists’ ability to use natural history collections primarily by limiting how much of the genome can be surveyed. Recent advances in DNA sequencing technology, however, have radically changed this, making truly genomic studies from museum specimens possible. We review the opportunities and drawbacks of the use of museum specimens, and suggest how to best execute projects when incorporating such samples. Several high-throughput (HT) sequencing methodologies, including whole genome shotgun sequencing, sequence capture, and restriction digests (demonstrated here), can be used with archived biomaterials. PMID:25532801
Artificial Intelligence, DNA Mimicry, and Human Health.

PubMed

Stefano, George B; Kream, Richard M

2017-08-14

The molecular evolution of genomic DNA across diverse plant and animal phyla involved dynamic registrations of sequence modifications to maintain existential homeostasis to increasingly complex patterns of environmental stressors. As an essential corollary, driver effects of positive evolutionary pressure are hypothesized to effect concerted modifications of genomic DNA sequences to meet expanded platforms of regulatory controls for successful implementation of advanced physiological requirements. It is also clearly apparent that preservation of updated registries of advantageous modifications of genomic DNA sequences requires coordinate expansion of convergent cellular proofreading/error correction mechanisms that are encoded by reciprocally modified genomic DNA. Computational expansion of operationally defined DNA memory extends to coordinate modification of coding and previously under-emphasized noncoding regions that now appear to represent essential reservoirs of untapped genetic information amenable to evolutionary driven recruitment into the realm of biologically active domains. Additionally, expansion of DNA memory potential via chemical modification and activation of noncoding sequences is targeted to vertical augmentation and integration of an expanded cadre of transcriptional and epigenetic regulatory factors affecting linear coding of protein amino acid sequences within open reading frames.
KEPLER ECLIPSING BINARIES WITH DELTA SCUTI/GAMMA DORADUS PULSATING COMPONENTS. I. KIC 9851944

DOE Office of Scientific and Technical Information (OSTI.GOV)

Guo, Zhao; Gies, Douglas R.; Matson, Rachel A.

2016-07-20

KIC 9851944 is a short-period (P = 2.16 days) eclipsing binary in the Kepler field of view. By combining the analysis of Kepler photometry and phase-resolved spectra from Kitt Peak National Observatory and Lowell Observatory, we determine the atmospheric and physical parameters of both stars. The two components have very different radii (2.27 R_⊙, 3.19 R_⊙) but close masses (1.76 M_⊙, 1.79 M_⊙) and effective temperatures (7026, 6902 K), indicating different evolutionary stages. The hotter primary is still on the main sequence (MS), while the cooler and larger secondary star has evolved to the post-MS, burning hydrogen in a shell. A comparison with coeval evolutionary models shows that it requires solar metallicity and a higher mass ratio to fit the radii and temperatures of both stars simultaneously. Both components show δ Scuti-type pulsations, which we interpret as p-modes and p and g mixed modes. After a close examination of the evolution of δ Scuti pulsational frequencies, we make a comparison of the observed frequencies with those calculated from MESA/GYRE.

The Role of Rotation in the Evolution of Massive Stars

NASA Technical Reports Server (NTRS)

Heap, Sara R.; Lanz, Thierry M.

2002-01-01

Recent evolutionary models of massive stars predict important effects of rotation including: increasing the rate of mass-loss; lowering the effective gravity; altering the evolutionary track on the HRD; extending the main-sequence phase (both on the HR diagram and in time); and mixing of CNO-processed elements up to the stellar surface. Observations suggest that rotation is a more important factor at lower metallicities because of higher initial rotational velocities and weaker winds. This makes the SMC, a low-metallicity galaxy (Z= 0.2 solar Z), an excellent environment for discerning the role of rotation in massive stars. We report on a FUSE + STIS + optical spectral analysis of 17 O-type stars in the SMC, where we found an enormous range in N abundances. Three stars in the sample have the same (low) CN abundances as the nebular material out of which they formed, namely C = 0.085 solar C and N = 0.034 solar N. However, more than half show N approx. solar N, an enrichment factor of 30X! Such unexpectedly high levels of N have ramifications for the evolution of massive stars including precursors to supernovae. They also raise questions about the sources of nitrogen in the early universe.
A traditional evolutionary history of foot-and-mouth disease viruses in Southeast Asia challenged by analyses of non-structural protein coding sequences

USDA-ARS's Scientific Manuscript database

Molecular epidemiology and evolution of foot-and-mouth disease virus (FMDV) are widely studied using genomic sequences encoding VP1, the capsid protein containing the most relevant antigenic domains. Although sequencing of the full viral genome is not used as a routine diagnostic or surveillance too...
Unusual chromosomal organization of telomeric sequences and expeditious karyotypic differentiation in the recently evolved Mus terricolor complex.

PubMed

Sharma, G G; Sharma, T

1998-01-01

The Mus terricolor complex displays a stable homozygous arrangement of autosomal heterochromatin variations in the form of accretion of definitive autosomal short arms among three nonoverlapping populations, in concert with an expeditious evolutionary differentiation into three chromosomal species: M. terricolor I, II, and III. In contrast to the highly conservative M. musculus-like chromosomes in the coexisting sibling species, M. booduga, reshuffling and differentiation of centric heterochromatin has occurred in harmony with a revision of centric configurations, resulting in acrocentric and submetacentric autosomes. The chromosomal distribution of the prevalent vertebrate telomeric sequence (TTAGGG)n was examined by fluorescence in situ hybridization to metaphase cells of M. terricolor I, II, and III. An unusual centric organization of internal telomeric sequences was detected in all the submetacentric and acrocentric autosomes. An auxiliary role of these presumably fragile, recombinogenic telomeric sequences in the evolutionary revision of centric configurations in the terricolor complex is hypothesized.
Entropic fluctuations in DNA sequences

NASA Astrophysics Data System (ADS)

Thanos, Dimitrios; Li, Wentian; Provata, Astero

2018-03-01

The Local Shannon Entropy (LSE) in blocks is used as a complexity measure to study the information fluctuations along DNA sequences. The LSE of a DNA block maps the local base arrangement information to a single numerical value. It is shown that, despite this reduction of information, LSE allows the extraction of meaningful information related to the detection of repetitive sequences in whole chromosomes and is useful in finding evolutionary differences between organisms. More specifically, large regions of tandem repeats, such as centromeres, can be detected based on their low LSE fluctuations along the chromosome. Furthermore, an empirical investigation of the appropriate block sizes is provided and the relationship of LSE properties with the structure of the underlying repetitive units is revealed by using both computational and mathematical methods. Sequence similarity between the genomic DNA of closely related species also leads to similar LSE values at the orthologous regions. As an application, the LSE covariance function is used to measure the evolutionary distance between several primate genomes.
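To make the block-wise entropy idea concrete, the following is a minimal sketch only. It assumes the entropy of each block is computed from single-base frequencies in non-overlapping blocks; the abstract does not give the exact definition, and the authors may use longer words or overlapping windows. The function names and the toy sequence are hypothetical.

```python
import math
from collections import Counter

def block_entropy(block):
    """Shannon entropy (in bits) of the base composition of one block."""
    counts = Counter(block)
    n = len(block)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def local_shannon_entropy(seq, block_size):
    """Entropy of each consecutive, non-overlapping block of length block_size.

    Assumption: single-base frequencies; a trailing partial block is dropped.
    """
    return [
        block_entropy(seq[i:i + block_size])
        for i in range(0, len(seq) - block_size + 1, block_size)
    ]

# Toy input (hypothetical): a tandem AT repeat followed by a mixed region.
seq = "AT" * 8 + "ACGTTGCAGTCAACGT"
profile = local_shannon_entropy(seq, block_size=8)
print(profile)
```

In this toy input the repeat blocks use only two bases and the mixed blocks use all four in equal proportion, so the profile steps from a lower to a higher plateau. That mirrors, in a very simplified way, the idea of locating repeat-rich regions from their entropy behaviour.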
De novo assembly and next-generation sequencing to analyse full-length gene variants from codon-barcoded libraries.

PubMed

Cho, Namjin; Hwang, Byungjin; Yoon, Jung-ki; Park, Sangun; Lee, Joongoo; Seo, Han Na; Lee, Jeewon; Huh, Sunghoon; Chung, Jinsoo; Bang, Duhee

2015-09-21

Interpreting epistatic interactions is crucial for understanding evolutionary dynamics of complex genetic systems and unveiling structure and function of genetic pathways. Although high resolution mapping of en masse variant libraries enables molecular biologists to address genotype-phenotype relationships, long-read sequencing technology remains indispensable to assess functional relationship between mutations that lie far apart. Here, we introduce JigsawSeq for multiplexed sequence identification of pooled gene variant libraries by combining a codon-based molecular barcoding strategy and de novo assembly of short-read data. We first validate JigsawSeq on small sub-pools and observe high precision and recall under various experimental settings. With extensive simulations, we then apply JigsawSeq to large-scale gene variant libraries to show that our method can be reliably scaled using next-generation sequencing. JigsawSeq may serve as a rapid screening tool for functional genomics and offer the opportunity to explore evolutionary trajectories of protein variants.
Mathematical model and metaheuristics for simultaneous balancing and sequencing of a robotic mixed-model assembly line

NASA Astrophysics Data System (ADS)

Li, Zixiang; Janardhanan, Mukund Nilakantan; Tang, Qiuhua; Nielsen, Peter

2018-05-01

This article presents the first method to simultaneously balance and sequence robotic mixed-model assembly lines (RMALB/S), which involves three sub-problems: task assignment, model sequencing and robot allocation. A new mixed-integer programming model is developed to minimize makespan and, using the CPLEX solver, small-size problems are solved to optimality. Two metaheuristics, the restarted simulated annealing algorithm and co-evolutionary algorithm, are developed and improved to address this NP-hard problem. The restarted simulated annealing method replaces the current temperature with a new temperature to restart the search process. The co-evolutionary method uses a restart mechanism to generate a new population by modifying several vectors simultaneously. The proposed algorithms are tested on a set of benchmark problems and compared with five other high-performing metaheuristics. The proposed algorithms outperform their original editions and the benchmarked methods. The proposed algorithms are able to solve the balancing and sequencing problem of a robotic mixed-model assembly line effectively and efficiently.
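The restart idea described for simulated annealing can be illustrated with a toy sketch. This is not the authors' algorithm. The objective (`cost`), the swap neighbourhood, and all parameter values are hypothetical stand-ins for the assembly-line model, and the restart here simply resets the temperature to its starting value.

```python
import math
import random

def cost(order):
    """Toy objective (hypothetical): total displacement of items from their index."""
    return sum(abs(item - pos) for pos, item in enumerate(order))

def restarted_sa(order, t_start=10.0, t_min=0.01, cooling=0.95,
                 restarts=3, seed=0):
    """Simulated annealing on a permutation with temperature restarts."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    current = list(order)
    best = list(order)
    for _ in range(restarts):
        temp = t_start  # restart: the cooled temperature is replaced by a fresh one
        while temp > t_min:
            # Neighbour move: swap two distinct positions.
            i, j = rng.sample(range(len(current)), 2)
            candidate = list(current)
            candidate[i], candidate[j] = candidate[j], candidate[i]
            delta = cost(candidate) - cost(current)
            # Accept improvements always, worse moves with Boltzmann probability.
            if delta <= 0 or rng.random() < math.exp(-delta / temp):
                current = candidate
                if cost(current) < cost(best):
                    best = list(current)
            temp *= cooling
    return best

start = [4, 3, 2, 1, 0, 7, 6, 5]
result = restarted_sa(start)
print(cost(start), cost(result))
```

Because the search continues from the current solution after each restart while the best solution is tracked separately, a restart can help the search leave a local optimum without losing the best result found so far.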
The rRNA evolution and procaryotic phylogeny

NASA Technical Reports Server (NTRS)

Fox, G. E.

1986-01-01

Studies of ribosomal RNA primary structure allow reconstruction of phylogenetic trees for prokaryotic organisms. Such studies reveal a major dichotomy among the bacteria that separates them into eubacteria and archaebacteria. Both groupings are further segmented into several major divisions. The results obtained from 5S rRNA sequences are essentially the same as those obtained with the 16S rRNA data. In the case of Gram negative bacteria the ribosomal RNA sequencing results can also be directly compared with hybridization studies and cytochrome c sequencing studies. There is again excellent agreement among the several methods. It seems likely then that the overall picture of microbial phylogeny that is emerging from the RNA sequence studies is a good approximation of the true history of these organisms. The RNA data allow examination of the evolutionary process in a semi-quantitative way. The secondary structures of these RNAs are largely established. As a result it is possible to recognize examples of local structural evolution. Evolutionary pathways accounting for these events can be proposed and their probability can be assessed.
Nucleotide sequences of immunoglobulin eta genes of chimpanzee and orangutan: DNA molecular clock and hominoid evolution

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sakoyama, Y.; Hong, K.J.; Byun, S.M.

To determine the phylogenetic relationships among hominoids and the dates of their divergence, the complete nucleotide sequences of the constant region of the immunoglobulin eta-chain (C_eta1) genes from chimpanzee and orangutan have been determined. These sequences were compared with the human eta-chain constant-region sequence. A molecular clock (silent molecular clock), measured by the degree of sequence divergence at the synonymous (silent) positions of protein-encoding regions, was introduced for the present study. From the comparison of nucleotide sequences of α1-antitrypsin and β- and δ-globulin genes between humans and Old World monkeys, the silent molecular clock was calibrated: the mean evolutionary rate of silent substitution was determined to be 1.56 × 10^-9 substitutions per site per year. Using the silent molecular clock, the mean divergence dates of chimpanzee and orangutan from the human lineage were estimated as 6.4 ± 2.6 million years and 17.3 ± 4.5 million years, respectively. It was also shown that the evolutionary rate of primate genes is considerably slower than those of other mammalian genes.
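The arithmetic behind a clock estimate of this kind can be sketched as follows. The sketch assumes the standard pairwise relation K = 2rT, where both lineages accumulate substitutions after the split; the abstract does not state the formula actually used. The divergence values K are back-calculated here from the reported dates purely for illustration and are not figures from the record.

```python
RATE = 1.56e-9  # silent substitutions per site per year, as quoted in the abstract

def divergence_time_years(k_silent, rate=RATE):
    """Divergence time from pairwise silent divergence, assuming K = 2 * rate * T."""
    return k_silent / (2.0 * rate)

def implied_divergence(t_years, rate=RATE):
    """Pairwise silent divergence implied by a divergence time (inverse relation)."""
    return 2.0 * rate * t_years

# Back-calculated, illustrative values (not reported in the abstract).
k_chimp = implied_divergence(6.4e6)    # about 0.020 substitutions per silent site
k_orang = implied_divergence(17.3e6)   # about 0.054 substitutions per silent site

print(k_chimp, k_orang)
print(divergence_time_years(k_chimp) / 1e6, "million years")
```

Under this relation, roughly 2% silent divergence between human and chimpanzee sequences would correspond to the 6.4-million-year estimate at the calibrated rate.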
The Most Deeply Conserved Noncoding Sequences in Plants Serve Similar Functions to Those in Vertebrates Despite Large Differences in Evolutionary Rates

PubMed Central

Burgess, Diane; Freeling, Michael

2014-01-01

In vertebrates, conserved noncoding elements (CNEs) are functionally constrained sequences that can show striking conservation over >400 million years of evolutionary distance and frequently are located megabases away from target developmental genes. Conserved noncoding sequences (CNSs) in plants are much shorter, and it has been difficult to detect conservation among distantly related genomes. In this article, we show not only that CNS sequences can be detected throughout the eudicot clade of flowering plants, but also that a subset of 37 CNSs can be found in all flowering plants (diverging ∼170 million years ago). These CNSs are functionally similar to vertebrate CNEs, being highly associated with transcription factor and development genes and enriched in transcription factor binding sites. Some of the most highly conserved sequences occur in genes encoding RNA binding proteins, particularly the RNA splicing–associated SR genes. Differences in sequence conservation between plants and animals are likely to reflect differences in the biology of the organisms, with plants being much more able to tolerate genomic deletions and whole-genome duplication events due, in part, to their far greater fecundity compared with vertebrates. PMID:24681619
Two-phase vesicles: a study on evolutionary and stationary models.

PubMed

Sahebifard, MohammadMahdi; Shahidi, Alireza; Ziaei-Rad, Saeed

2017-05-01

In the current article, the dynamic evolution of two-phase vesicles is presented as an extension to a previous stationary model and based on an equilibrium of local forces. In the simplified model, ignoring the effects of membrane inertia, a dynamic equilibrium between the membrane bending potential and local fluid friction is considered in each phase. The equilibrium equations at the domain borders are completed by extended introduction of membrane section reactions. We show that in some cases, the results of stationary and evolutionary models are in agreement with each other and also with experimental observations, while in others the two models differ markedly. The value of our approach is that we can account for unresponsive points of uncertainty using our equations with the local velocity of the lipid membranes and calculating the intermediate states (shapes) in the consequent evolutionary, or response, path.
Detailed phylogenetic analysis of primate T-lymphotropic virus type 1 (PTLV-1) sequences from orangutans (Pongo pygmaeus) reveals new insights into the evolutionary history of PTLV-1 in Asia.

PubMed

Reid, Michael J C; Switzer, William M; Schillaci, Michael A; Ragonnet-Cronin, Manon; Joanisse, Isabelle; Caminiti, Kyna; Lowenberger, Carl A; Galdikas, Birute Mary F; Sandstrom, Paul A; Brooks, James I

2016-09-01

While human T-lymphotropic virus type 1 (HTLV-1) originates from ancient cross-species transmission of simian T-lymphotropic virus type 1 (STLV-1) from infected nonhuman primates, much debate exists on whether the first HTLV-1 occurred in Africa, or in Asia during early human evolution and migration. This topic is complicated by a lack of representative Asian STLV-1 to infer PTLV-1 evolutionary histories. In this study we obtained new STLV-1 LTR and tax sequences from a wild-born Bornean orangutan (Pongo pygmaeus) and performed detailed phylogenetic analyses using both maximum likelihood and Bayesian inference of available Asian PTLV-1 and African STLV-1 sequences. Phylogenies, divergence dates and nucleotide substitution rates were co-inferred and compared using six different molecular clock calibrations in a Bayesian framework, including both archaeological and/or nucleotide substitution rate calibrations. We then combined our molecular results with paleobiogeographical and ecological data to infer the most likely evolutionary history of PTLV-1. Based on the preferred models our analyses robustly inferred an Asian source for PTLV-1 with cross-species transmission of STLV-1 likely from a macaque (Macaca sp.) to an orangutan about 37.9-48.9kya, and to humans between 20.3-25.5kya. An orangutan diversification of STLV-1 commenced approximately 6.4-7.3kya. Our analyses also inferred that HTLV-1 was first introduced into Australia ~3.1-3.7kya, corresponding to both genetic and archaeological changes occurring in Australia at that time. Finally, HTLV-1 appears in Melanesia at ~2.3-2.7kya corresponding to the migration of the Lapita peoples into the region. Our results also provide an important future reference for calibrating information essential for PTLV evolutionary timescale inference. 
Longer sequence data, or full genomes from a greater representation of Asian primates, including gibbons, leaf monkeys, and Sumatran orangutans are needed to fully elucidate these evolutionary dates and relationships using the model criteria suggested herein. Copyright © 2016 Elsevier B.V. All rights reserved.
The evolutionary rate dynamically tracks changes in HIV-1 epidemics

DOE Office of Scientific and Technical Information (OSTI.GOV)

Maljkovic-berry, Irina; Athreya, Gayathri; Daniels, Marcus

Large-sequence datasets provide an opportunity to investigate the dynamics of pathogen epidemics. Thus, a fast method to estimate the evolutionary rate from large and numerous phylogenetic trees becomes necessary. Based on minimizing tip height variances, we optimize the root in a given phylogenetic tree to estimate the most homogenous evolutionary rate between samples from at least two different time points. Simulations showed that the method had no bias in the estimation of evolutionary rates and that it was robust to tree rooting and topological errors. We show that the evolutionary rates of HIV-1 subtype B and C epidemics have changed over time, with the rate of evolution inversely correlated to the rate of virus spread. For subtype B, the evolutionary rate slowed down and tracked the start of the HAART era in 1996. Subtype C in Ethiopia showed an increase in the evolutionary rate when the prevalence increase markedly slowed down in 1995. Thus, we show that the evolutionary rate of HIV-1 on the population level dynamically tracks epidemic events.
Characterization of 17 chaperone-usher fimbriae encoded by Proteus mirabilis reveals strong conservation

PubMed Central

Kuan, Lisa; Schaffer, Jessica N.; Zouzias, Christos D.

2014-01-01

Proteus mirabilis is a Gram-negative enteric bacterium that causes complicated urinary tract infections, particularly in patients with indwelling catheters. Sequencing of clinical isolate P. mirabilis HI4320 revealed the presence of 17 predicted chaperone-usher fimbrial operons. We classified these fimbriae into three groups by their genetic relationship to other chaperone-usher fimbriae. Sixteen of these fimbriae are encoded by all seven currently sequenced P. mirabilis genomes. The predicted protein sequence of the major structural subunit for 14 of these fimbriae was highly conserved (≥95 % identity), whereas three other structural subunits (Fim3A, UcaA and Fim6A) were variable. Further examination of 58 clinical isolates showed that 14 of the 17 predicted major structural subunit genes of the fimbriae were present in most strains (>85 %). Transcription of the predicted major structural subunit genes for all 17 fimbriae was measured under different culture conditions designed to mimic conditions in the urinary tract. The majority of the fimbrial genes were induced during stationary phase, static culture or colony growth when compared to exponential-phase aerated culture. Major structural subunit proteins for six of these fimbriae were detected using MS of proteins sheared from the surface of broth-cultured P. mirabilis, demonstrating that this organism may produce multiple fimbriae within a single culture. The high degree of conservation of P. mirabilis fimbriae stands in contrast to uropathogenic Escherichia coli and Salmonella enterica, which exhibit greater variability in their fimbrial repertoires. These findings suggest there may be evolutionary pressure for P. mirabilis to maintain a large fimbrial arsenal. PMID:24809384
Evolutionary dynamics and the phase structure of the minority game

NASA Astrophysics Data System (ADS)

Yuan, Baosheng; Chen, Kan

2004-06-01

We show that a simple evolutionary scheme, when applied to the minority game (MG), changes the phase structure of the game. In this scheme each agent evolves individually whenever his wealth reaches the specified bankruptcy level, in contrast to the evolutionary schemes used in the previous works. We show that evolution greatly suppresses herding behavior, and it leads to better overall performance of the agents. Similar to the standard nonevolutionary MG, the dependence of the standard deviation σ on the number of agents N and the memory length m can be characterized by a universal curve. We suggest a crowd-anticrowd theory for understanding the effect of evolution in the MG.
A complete mitochondrial genome of wheat (Triticum aestivum cv. Chinese Yumai), and fast evolving mitochondrial genes in higher plants.

PubMed

Cui, Peng; Liu, Huitao; Lin, Qiang; Ding, Feng; Zhuo, Guoyin; Hu, Songnian; Liu, Dongcheng; Yang, Wenlong; Zhan, Kehui; Zhang, Aimin; Yu, Jun

2009-12-01

Plant mitochondrial genomes, encoding necessary proteins involved in the system of energy production, play an important role in the development and reproduction of the plant. They occupy a specific evolutionary pattern relative to their nuclear counterparts. Here, we determined the winter wheat (Triticum aestivum cv. Chinese Yumai) mitochondrial genome, with a length of 452,526 bp, by shotgun sequencing its BAC library. It contains 202 genes, including 35 known protein-coding genes, three rRNA and 17 tRNA genes, as well as 149 open reading frames (ORFs; greater than 300 bp in length). The sequence is almost identical to the previously reported sequence of the spring wheat (T. aestivum cv. Chinese Spring); we only identified seven SNPs (three transitions and four transversions) and 10 indels (insertions and deletions) between the two independently acquired sequences, and all variations were found in non-coding regions. This result confirmed the accuracy of the previously reported mitochondrial sequence of the Chinese Spring wheat. The nucleotide frequency and codon usage of wheat are common among the lineage of higher plants, with a high AT-content of 58%. Molecular evolutionary analysis demonstrated that plant mitochondrial genomes evolved at different rates, which may correlate with substantial variations in metabolic rate and generation time among plant lineages. In addition, through the estimation of the ratio of non-synonymous to synonymous substitution rates between orthologous mitochondrion-encoded genes of higher plants, we found an accelerated evolutionary rate that seems to be the result of relaxed selection.
Evolution of ribozymes in the presence of a mineral surface

PubMed Central

Stephenson, James D.; Popović, Milena; Bristow, Thomas F.

2016-01-01

Mineral surfaces are often proposed as the sites of critical processes in the emergence of life. Clay minerals in particular are thought to play significant roles in the origin of life including polymerizing, concentrating, organizing, and protecting biopolymers. In these scenarios, the impact of minerals on biopolymer folding is expected to influence evolutionary processes. These processes include both the initial emergence of functional structures in the presence of the mineral and the subsequent transition away from the mineral-associated niche. The initial evolution of function depends upon the number and distribution of sequences capable of functioning in the presence of the mineral, and the transition to new environments depends upon the overlap between sequences that evolve on the mineral surface and sequences that can perform the same functions in the mineral's absence. To examine these processes, we evolved self-cleaving ribozymes in vitro in the presence or absence of Na-saturated montmorillonite clay mineral particles. Starting from a shared population of random sequences, RNA populations were evolved in parallel, along separate evolutionary trajectories. Comparative sequence analysis and activity assays show that the impact of this clay mineral on functional structure selection was minimal; it neither prevented common structures from emerging, nor did it promote the emergence of new structures. This suggests that montmorillonite does not improve RNA's ability to evolve functional structures; however, it also suggests that RNAs that do evolve in contact with montmorillonite retain the same structures in mineral-free environments, potentially facilitating an evolutionary transition away from a mineral-associated niche. PMID:27793980
MySSP: Non-stationary evolutionary sequence simulation, including indels

PubMed Central

Rosenberg, Michael S.

2007-01-01

MySSP is a new program for the simulation of DNA sequence evolution across a phylogenetic tree. Although many programs are available for sequence simulation, MySSP is unique in its inclusion of indels, flexibility in allowing for non-stationary patterns, and output of ancestral sequences. Some of these features can individually be found in existing programs, but not all have previously been available in a single package. PMID:19325855
TAS3 miR390-dependent loci in non-vascular land plants: towards a comprehensive reconstruction of the gene evolutionary history.

PubMed

Morozov, Sergey Y; Milyutina, Irina A; Erokhina, Tatiana N; Ozerova, Liudmila V; Troitsky, Alexey V; Solovyev, Andrey G

2018-01-01

Trans-acting small interfering RNAs (ta-siRNAs) are transcribed from protein non-coding genomic TAS loci and belong to a plant-specific class of endogenous small RNAs. These siRNAs have been found to regulate gene expression in most taxa including seed plants, gymnosperms, ferns and mosses. In this study, bioinformatic and experimental PCR-based approaches were used as tools to analyze TAS3 and TAS6 loci in transcriptomes and genomic DNAs from representatives of evolutionarily distant non-vascular plant taxa such as Bryophyta, Marchantiophyta and Anthocerotophyta. We revealed previously undiscovered TAS3 loci in plant classes Sphagnopsida and Anthocerotopsida, as well as TAS6 loci in Bryophyta classes Tetraphidiopsida, Polytrichopsida, Andreaeopsida and Takakiopsida. These data further unveil the evolutionary pathway of the miR390-dependent TAS3 loci in land plants. We also identified charophyte alga sequences coding for SUPPRESSOR OF GENE SILENCING 3 (SGS3), which is required for generation of ta-siRNAs in plants, and hypothesized that the appearance of TAS3-related sequences could have taken place at a very early step in the evolutionary transition from charophyte algae to the earliest common ancestor of land plants.
Observing Clonal Dynamics across Spatiotemporal Axes: A Prelude to Quantitative Fitness Models for Cancer.

PubMed

McPherson, Andrew W; Chan, Fong Chun; Shah, Sohrab P

2018-02-01

The ability to accurately model evolutionary dynamics in cancer would allow for prediction of progression and response to therapy. As a prelude to quantitative understanding of evolutionary dynamics, researchers must gather observations of in vivo tumor evolution. High-throughput genome sequencing now provides the means to profile the mutational content of evolving tumor clones from patient biopsies. Together with the development of models of tumor evolution, reconstructing evolutionary histories of individual tumors generates hypotheses about the dynamics of evolution that produced the observed clones. In this review, we provide a brief overview of the concepts involved in predicting evolutionary histories, and provide a workflow based on bulk and targeted-genome sequencing. We then describe the application of this workflow to time series data obtained for transformed and progressed follicular lymphomas (FL), and contrast the observed evolutionary dynamics between these two subtypes. We next describe results from a spatial sampling study of high-grade serous (HGS) ovarian cancer, propose mechanisms of disease spread based on the observed clonal mixtures, and provide examples of diversification through subclonal acquisition of driver mutations and convergent evolution. Finally, we state implications of the techniques discussed in this review as a necessary but insufficient step on the path to predictive modelling of disease dynamics. Copyright © 2018 Cold Spring Harbor Laboratory Press; all rights reserved.
Effects of Darwinian Selection and Mutability on Rate of Broadly Neutralizing Antibody Evolution during HIV-1 Infection

PubMed Central

Sheng, Zizhang; Schramm, Chaim A.; Connors, Mark; Morris, Lynn; Mascola, John R.; Kwong, Peter D.; Shapiro, Lawrence

2016-01-01

Accumulation of somatic mutations in antibody variable regions is critical for antibody affinity maturation, with HIV-1 broadly neutralizing antibodies (bnAbs) generally requiring years to develop. We recently found that the rate at which mutations accumulate decreases over time, but the mechanism governing this slowing is unclear. In this study, we investigated whether natural selection and/or mutability of the antibody variable region contributed significantly to the observed decrease in rate. We used longitudinally sampled sequences of immunoglobulin transcripts of single lineages from each of 3 donors, as determined by next generation sequencing. We estimated the evolutionary rates of the complementarity determining regions (CDRs), which are most significant for functional selection, and found they evolved about 1.5- to 2-fold faster than the framework regions. We also analyzed the presence of AID hotspots and coldspots at different points in lineage development and observed an average decrease in mutability of less than 10 percent over time. Altogether, the correlation between Darwinian selection strength and evolutionary rate trended toward significance, especially for CDRs, but cannot fully explain the observed changes in evolutionary rate. Changes in the mutability modulated by AID hotspots and coldspots correlated only weakly with evolutionary rates. The combined effects of Darwinian selection and mutability contribute substantially to, but do not fully explain, evolutionary rate change for HIV-1-targeting bnAb lineages. PMID:27191167

BLAST and FASTA similarity searching for multiple sequence alignment.

PubMed

Pearson, William R

2014-01-01

BLAST, FASTA, and other similarity searching programs seek to identify homologous proteins and DNA sequences based on excess sequence similarity. If two sequences share much more similarity than expected by chance, the simplest explanation for the excess similarity is common ancestry, that is, homology. The most effective similarity searches compare protein sequences, rather than DNA sequences, for sequences that encode proteins, and use expectation values, rather than percent identity, to infer homology. The BLAST and FASTA packages of sequence comparison programs provide programs for comparing protein and DNA sequences to protein databases (the most sensitive searches). Protein and translated-DNA comparisons to protein databases routinely allow evolutionary look back times from 1 to 2 billion years; DNA:DNA searches are 5-10-fold less sensitive. BLAST and FASTA can be run on popular web sites, but can also be downloaded and installed on local computers. With local installation, target databases can be customized for the sequence data being characterized. With today's very large protein databases, search sensitivity can also be improved by searching smaller comprehensive databases, for example, a complete protein set from an evolutionarily neighboring model organism. By default, BLAST and FASTA use scoring strategies targeted for distant evolutionary relationships; for comparisons involving short domains or queries, or searches that seek relatively close homologs (e.g. mouse-human), shallower scoring matrices will be more effective. Both BLAST and FASTA provide very accurate statistical estimates, which can be used to reliably identify protein sequences that diverged more than 2 billion years ago.
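The role of expectation values and database size described above can be illustrated with the standard Karlin-Altschul relation, E = m n 2^(-S'), where S' = (lambda S - ln K) / ln 2 is the bit score. The following is a minimal sketch, not the code of either package: it ignores the effective-length (edge) corrections that real programs apply, and the lambda and K values in the usage note are illustrative placeholders that should be replaced by the values reported in an actual search output.

```python
import math


def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to a bit score: S' = (lambda*S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def expect_value(bits, query_len, db_len):
    """Expected number of chance hits scoring at least `bits`: E = m * n * 2**(-S').

    query_len (m) and db_len (n) are in residues; effective-length
    corrections are deliberately left out of this sketch.
    """
    return query_len * db_len * 2.0 ** (-bits)


# The same 50-bit alignment is far more significant against a small,
# comprehensive database than against a very large one.
e_large = expect_value(50, query_len=300, db_len=1e9)
e_small = expect_value(50, query_len=300, db_len=1e7)
```

Here `e_small` is exactly one hundredth of `e_large`, which is the arithmetic behind the abstract's advice to search a smaller comprehensive database. A raw score can be converted first, for example `bit_score(100, lam=0.267, k=0.041)`, with those two parameters assumed for illustration only.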
The Awesome Power of Yeast Evolutionary Genetics: New Genome Sequences and Strain Resources for the Saccharomyces sensu stricto Genus

PubMed Central

Scannell, Devin R.; Zill, Oliver A.; Rokas, Antonis; Payen, Celia; Dunham, Maitreya J.; Eisen, Michael B.; Rine, Jasper; Johnston, Mark; Hittinger, Chris Todd

2011-01-01

High-quality, well-annotated genome sequences and standardized laboratory strains fuel experimental and evolutionary research. We present improved genome sequences of three species of Saccharomyces sensu stricto yeasts: S. bayanus var. uvarum (CBS 7001), S. kudriavzevii (IFO 1802T and ZP 591), and S. mikatae (IFO 1815T), and describe their comparison to the genomes of S. cerevisiae and S. paradoxus. The new sequences, derived by assembling millions of short DNA sequence reads together with previously published Sanger shotgun reads, have vastly greater long-range continuity and far fewer gaps than the previously available genome sequences. New gene predictions defined a set of 5261 protein-coding orthologs across the five most commonly studied Saccharomyces yeasts, enabling a re-examination of the tempo and mode of yeast gene evolution and improved inferences of species-specific gains and losses. To facilitate experimental investigations, we generated genetically marked, stable haploid strains for all three of these Saccharomyces species. These nearly complete genome sequences and the collection of genetically marked strains provide a valuable toolset for comparative studies of gene function, metabolism, and evolution, and render Saccharomyces sensu stricto the most experimentally tractable model genus. These resources are freely available and accessible through www.SaccharomycesSensuStricto.org. PMID:22384314
Phase diagrams for an evolutionary prisoner's dilemma game on two-dimensional lattices

NASA Astrophysics Data System (ADS)

Szabó, György; Vukov, Jeromos; Szolnoki, Attila

2005-10-01

The effects of payoffs and noise on the maintenance of cooperative behavior are studied in an evolutionary prisoner’s dilemma game with players located on the sites of different two-dimensional lattices. This system exhibits a phase transition from a mixed state of cooperators and defectors to a homogeneous one where only the defectors remain alive. Using Monte Carlo simulations and the generalized mean-field approximations we have determined the phase boundaries (critical points) separating the two phases on the plane of the temperature (noise) and temptation to choose defection. In the zero temperature limit the cooperation can be sustained only for those connectivity structures where three-site clique percolation occurs.
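The kind of Monte Carlo simulation mentioned above can be sketched as follows. The abstract does not state the payoff values or the update rule, so everything specific here is an assumption: a square lattice with periodic boundaries, payoffs of 1 for mutual cooperation, b (the temptation) for a defector facing a cooperator and 0 otherwise, and a Fermi-type imitation rule in which the parameter `noise` plays the role of temperature. All function and parameter names are hypothetical.

```python
import math
import random

NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def payoff(grid, x, y, b):
    """Summed payoff of site (x, y) against its four nearest neighbours.

    Strategies: 1 = cooperator, 0 = defector. Assumed payoffs: mutual
    cooperation 1, defector exploiting a cooperator b, all other pairings 0.
    """
    n = len(grid)
    total = 0.0
    for dx, dy in NEIGHBOURS:
        if grid[(x + dx) % n][(y + dy) % n] == 1:
            total += 1.0 if grid[x][y] == 1 else b
    return total


def adoption_probability(own, other, noise):
    """Fermi rule: probability of copying a neighbour whose payoff is `other`."""
    z = (own - other) / noise
    if z > 700.0:  # exp(z) would overflow; the probability is effectively 0
        return 0.0
    return 1.0 / (1.0 + math.exp(z))


def simulate(n=20, b=1.05, noise=0.1, sweeps=50, seed=0):
    """Return the final fraction of cooperators on an n x n periodic lattice."""
    rng = random.Random(seed)
    grid = [[rng.randint(0, 1) for _ in range(n)] for _ in range(n)]
    for _ in range(sweeps * n * n):  # one sweep = n*n attempted updates
        x, y = rng.randrange(n), rng.randrange(n)
        dx, dy = rng.choice(NEIGHBOURS)
        nx, ny = (x + dx) % n, (y + dy) % n
        if grid[x][y] == grid[nx][ny]:
            continue  # nothing to imitate
        p = adoption_probability(
            payoff(grid, x, y, b), payoff(grid, nx, ny, b), noise
        )
        if rng.random() < p:
            grid[x][y] = grid[nx][ny]
    return sum(map(sum, grid)) / (n * n)
```

Scanning `simulate` over a grid of `b` and `noise` values and recording where the cooperator fraction drops to zero would give a rough picture of a phase boundary; locating critical points reliably needs much larger lattices and longer runs than these defaults.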
Comparable contributions of structural-functional constraints and expression level to the rate of protein sequence evolution

PubMed Central

Wolf, Maxim Y; Wolf, Yuri I; Koonin, Eugene V

2008-01-01

Background Proteins show a broad range of evolutionary rates. Understanding the factors that are responsible for the characteristic rate of evolution of a given protein arguably is one of the major goals of evolutionary biology. A long-standing general assumption used to be that the evolution rate is, primarily, determined by the specific functional constraints that affect the given protein. These constraints were traditionally thought to depend both on the specific features of the protein's structure and its biological role. The advent of systems biology brought about new types of data, such as expression level and protein-protein interactions, and unexpectedly, a variety of correlations between protein evolution rate and these variables have been observed. The strongest connections by far were repeatedly seen between protein sequence evolution rate and the expression level of the respective gene. It has been hypothesized that this link is due to the selection for the robustness of the protein structure to mistranslation-induced misfolding that is particularly important for highly expressed proteins and is the dominant determinant of the sequence evolution rate. Results This work is an attempt to assess the relative contributions of protein domain structure and function, on the one hand, and expression level on the other hand, to the rate of sequence evolution. To this end, we performed a genome-wide analysis of the effect of the fusion of a pair of domains in multidomain proteins on the difference in the domain-specific evolutionary rates. The mistranslation-induced misfolding hypothesis would predict that, within multidomain proteins, fused domains, on average, should evolve at substantially closer rates than the same domains in different proteins because, within a multidomain protein, all domains are translated at the same rate.
We performed a comprehensive comparison of the evolutionary rates of mammalian and plant protein domains that are either joined in multidomain proteins or contained in distinct proteins. Substantial homogenization of evolutionary rates in multidomain proteins was, indeed, observed in both animals and plants, although highly significant differences between domain-specific rates remained. The contributions of the translation rate, as determined by the effect of the fusion of a pair of domains within a multidomain protein, and intrinsic, domain-specific structural-functional constraints appear to be comparable in magnitude. Conclusion Fusion of domains in a multidomain protein results in substantial homogenization of the domain-specific evolutionary rates but significant differences between domain-specific evolution rates remain. Thus, the rate of translation and intrinsic structural-functional constraints both exert sizable and comparable effects on sequence evolution. Reviewers This article was reviewed by Sergei Maslov, Dennis Vitkup, Claus Wilke (nominated by Orly Alter), and Allan Drummond (nominated by Joel Bader). For the full reviews, please go to the Reviewers' Reports section. PMID:18840284
Evolutionary trajectory of Pack-MULEs is determined by their epigenetic status

USDA-ARS's Scientific Manuscript database

Acquisition and rearrangement of host genes by transposable elements is one mechanism to increase gene diversity. The rice genome is replete with such sequences and while ~3,000 Pack-Mutator-like transposable elements containing gene sequences (Pack-MULEs) have been identified, their function remains...
EXors and the stellar birthline

NASA Astrophysics Data System (ADS)

Moody, Mackenzie S. L.; Stahler, Steven W.

2017-04-01

We assess the evolutionary status of EXors. These low-mass, pre-main-sequence stars repeatedly undergo sharp luminosity increases, each a year or so in duration. We place into the HR diagram all EXors that have documented quiescent luminosities and effective temperatures, and thus determine their masses and ages. Two alternate sets of pre-main-sequence tracks are used, and yield similar results. Roughly half of EXors are embedded objects, i.e., they appear observationally as Class I or flat-spectrum infrared sources. We find that these are relatively young and are located close to the stellar birthline in the HR diagram. Optically visible EXors, on the other hand, are situated well below the birthline. They have ages of several Myr, typical of classical T Tauri stars. Judging from the limited data at hand, we find no evidence that binary companions trigger EXor eruptions; this issue merits further investigation. We draw several general conclusions. First, repetitive luminosity outbursts do not occur in all pre-main-sequence stars, and are not in themselves a sign of extreme youth. They persist, along with other signs of activity, in a relatively small subset of these objects. Second, the very existence of embedded EXors demonstrates that at least some Class I infrared sources are not true protostars, but very young pre-main-sequence objects still enshrouded in dusty gas. Finally, we believe that the embedded pre-main-sequence phase is of observational and theoretical significance, and should be included in a more complete account of early stellar evolution.
Phylogenetic estimates of diversification rate are affected by molecular rate variation.

PubMed

Duchêne, D A; Hua, X; Bromham, L

2017-10-01

Molecular phylogenies are increasingly being used to investigate the patterns and mechanisms of macroevolution. In particular, node heights in a phylogeny can be used to detect changes in rates of diversification over time. Such analyses rest on the assumption that node heights in a phylogeny represent the timing of diversification events, which in turn rests on the assumption that evolutionary time can be accurately predicted from DNA sequence divergence. But there are many influences on the rate of molecular evolution, which might also influence node heights in molecular phylogenies, and thus affect estimates of diversification rate. In particular, a growing number of studies have revealed an association between the net diversification rate estimated from phylogenies and the rate of molecular evolution. Such an association might, by influencing the relative position of node heights, systematically bias estimates of diversification time. We simulated the evolution of DNA sequences under several scenarios where rates of diversification and molecular evolution vary through time, including models where diversification and molecular evolutionary rates are linked. We show that commonly used methods, including metric-based, likelihood and Bayesian approaches, can have a low power to identify changes in diversification rate when molecular substitution rates vary. Furthermore, the association between the rates of speciation and molecular evolution rate can cause the signature of a slowdown or speedup in speciation rates to be lost or misidentified. These results suggest that the multiple sources of variation in molecular evolutionary rates need to be considered when inferring macroevolutionary processes from phylogenies. © 2017 European Society For Evolutionary Biology. Journal of Evolutionary Biology © 2017 European Society For Evolutionary Biology.
Modelling and strategy optimisation for a kind of networked evolutionary games with memories under the bankruptcy mechanism

NASA Astrophysics Data System (ADS)

Fu, Shihua; Li, Haitao; Zhao, Guodong

2018-05-01

This paper investigates the evolutionary dynamics and strategy optimisation of a class of networked evolutionary games whose strategy-updating rules incorporate a 'bankruptcy' mechanism, considering the case in which a player goes bankrupt after a run of consecutive low profits from the game. First, using the semi-tensor product of matrices, the evolutionary dynamics of such games are expressed as a higher-order logical dynamic system and then converted into algebraic form, on the basis of which the dynamics of the given games can be analysed. Second, the strategy optimisation problem is investigated, and free-type control sequences are designed to maximise the total payoff of the whole game. Finally, an illustrative example is given to demonstrate the effectiveness of the new results.
Characterization of Hepatitis C Virus (HCV) Envelope Diversification from Acute to Chronic Infection within a Sexually Transmitted HCV Cluster by Using Single-Molecule, Real-Time Sequencing

PubMed Central

Ho, Cynthia K. Y.; Raghwani, Jayna; Koekkoek, Sylvie; Liang, Richard H.; Van der Meer, Jan T. M.; Van Der Valk, Marc; De Jong, Menno; Pybus, Oliver G.

2016-01-01

ABSTRACT In contrast to other available next-generation sequencing platforms, PacBio single-molecule, real-time (SMRT) sequencing has the advantage of generating long reads albeit with a relatively higher error rate in unprocessed data. Using this platform, we longitudinally sampled and sequenced the hepatitis C virus (HCV) envelope genome region (1,680 nucleotides [nt]) from individuals belonging to a cluster of sexually transmitted cases. All five subjects were coinfected with HIV-1 and a closely related strain of HCV genotype 4d. In total, 50 samples were analyzed by using SMRT sequencing. By using 7 passes of circular consensus sequencing, the error rate was reduced to 0.37%, and the median number of sequences was 612 per sample. A further reduction of insertions was achieved by alignment against a sample-specific reference sequence. However, in vitro recombination during PCR amplification could not be excluded. Phylogenetic analysis supported close relationships among HCV sequences from the four male subjects and subsequent transmission from one subject to his female partner. Transmission was characterized by a strong genetic bottleneck. Viral genetic diversity was low during acute infection and increased upon progression to chronicity but subsequently fluctuated during chronic infection, caused by the alternate detection of distinct coexisting lineages. SMRT sequencing combines long reads with sufficient depth for many phylogenetic analyses and can therefore provide insights into within-host HCV evolutionary dynamics without the need for haplotype reconstruction using statistical algorithms. IMPORTANCE Next-generation sequencing has revolutionized the study of genetically variable RNA virus populations, but for phylogenetic and evolutionary analyses, longer sequences than those generated by most available platforms, while minimizing the intrinsic error rate, are desired. 
Here, we demonstrate for the first time that PacBio SMRT sequencing technology can be used to generate full-length HCV envelope sequences at the single-molecule level, providing a data set with large sequencing depth for the characterization of intrahost viral dynamics. The selection of consensus reads derived from at least 7 full circular consensus sequencing rounds significantly reduced the intrinsic high error rate of this method. We used this method to genetically characterize a unique transmission cluster of sexually transmitted HCV infections, providing insight into the distinct evolutionary pathways in each patient over time and identifying the transmission-associated genetic bottleneck as well as fluctuations in viral genetic diversity over time, accompanied by dynamic shifts in viral subpopulations. PMID:28077634
Venom proteomic and venomous glands transcriptomic analysis of the Egyptian scorpion Scorpio maurus palmatus (Arachnida: Scorpionidae).

PubMed

Abdel-Rahman, Mohamed A; Quintero-Hernandez, Veronica; Possani, Lourival D

2013-11-01

Proteomic analysis of the venom of the scorpion Scorpio maurus palmatus was performed using reverse-phase HPLC separation followed by mass spectrometry determination. Sixty-five components were identified with molecular masses varying from 413 to 14,009 Da. The largest share of peptides (41.5%) had masses from 3 to 5 kDa, which may represent linear antimicrobial peptides and KScTxs. Also, 155 expressed sequence tags (ESTs) were analyzed through construction of a cDNA library prepared from a pair of venom glands. About 77% of the ESTs correspond to toxin-like peptides and proteins with definite open reading frames. The cDNA sequencing results also show the presence of sequences whose putative products have sequence similarity with antimicrobial peptides (24%), insecticidal toxins, β-NaScTxs, κ-KScTxs, α-KScTxs, calcines and La1-like peptides. Also, we have obtained 23 atypical types of venom molecules not recorded in other scorpion species. Moreover, 9% of the total ESTs revealed significant similarities with proteins involved in the cellular processes of these scorpion venom glands. This is the first set of molecular masses and transcripts described from this species, in which various venom molecules have been identified. They belong to either known or unassigned types of scorpion venom peptides and proteins, and provide valuable information for evolutionary analysis and venomics. Copyright © 2013 Elsevier Ltd. All rights reserved.
The Blue Straggler Star Population in NGC 1261: Evidence for a Post-core-collapse Bounce State

NASA Astrophysics Data System (ADS)

Simunovic, Mirko; Puzia, Thomas H.; Sills, Alison

2014-11-01

We present a multi-passband photometric study of the Blue Straggler Star (BSS) population in the Galactic globular cluster (GC) NGC 1261, using available space- and ground-based survey data. The inner BSS population is found to have two distinct sequences in the color-magnitude diagram (CMD), similar to double BSS sequences detected in other GCs. These well defined sequences are presumably linked to single short-lived events such as core collapse, which are expected to boost the formation of BSSs. In agreement with this, we find a BSS sequence in NGC 1261 which can be well reproduced individually by a theoretical model prediction of a 2 Gyr old population of stellar collision products, which are expected to form in the denser inner regions during short-lived core contraction phases. Additionally, we report the occurrence of a group of BSSs with unusually blue colors in the CMD, which are consistent with a corresponding model of a 200 Myr old population of stellar collision products. The properties of the NGC 1261 BSS populations, including their spatial distributions, suggest an advanced dynamical evolutionary state of the cluster, but the core of this GC does not show the classical signatures of core collapse. We argue that these apparent contradictions provide evidence for a post-core-collapse bounce state seen in dynamical simulations of old GCs.
THE RELATION BETWEEN GALAXY STRUCTURE AND SPECTRAL TYPE: IMPLICATIONS FOR THE BUILDUP OF THE QUIESCENT GALAXY POPULATION AT 0.5 < z < 2.0

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yano, Michael; Kriek, Mariska; Wel, Arjen van der

We present the relation between galaxy structure and spectral type, using a K-selected galaxy sample at 0.5 < z < 2.0. Based on similarities between the UV-to-NIR spectral energy distributions (SEDs), we classify galaxies into 32 spectral types. The different types span a wide range in evolutionary phases, and thus, in combination with available CANDELS/F160W imaging, are ideal to study the structural evolution of galaxies. Effective radii (R_e) and Sérsic parameters (n) have been measured for 572 individual galaxies, and for each type, we determine R_e at fixed stellar mass by correcting for the mass-size relation. We use the rest-frame U − V versus V − J diagram to investigate evolutionary trends. When moving into the direction perpendicular to the star-forming sequence, in which we see the Hα equivalent width and the specific star formation rate (sSFR) decrease, we find a decrease in R_e and an increase in n. On the quiescent sequence we find an opposite trend, with older redder galaxies being larger. When splitting the sample into redshift bins, we find that young post-starburst galaxies are most prevalent at z > 1.5 and significantly smaller than all other galaxy types at the same redshift. This result suggests that the suppression of star formation may be associated with significant structural evolution at z > 1.5. At z < 1, galaxy types with intermediate sSFRs (10^−11.5 to 10^−10.5 yr^−1) do not have post-starburst SED shapes. These galaxies have similar sizes as older quiescent galaxies, implying that they can passively evolve onto the quiescent sequence, without increasing the average size of the quiescent galaxy population.
Floral gene resources from basal angiosperms for comparative genomics research

PubMed Central

Albert, Victor A; Soltis, Douglas E; Carlson, John E; Farmerie, William G; Wall, P Kerr; Ilut, Daniel C; Solow, Teri M; Mueller, Lukas A; Landherr, Lena L; Hu, Yi; Buzgo, Matyas; Kim, Sangtae; Yoo, Mi-Jeong; Frohlich, Michael W; Perl-Treves, Rafael; Schlarbaum, Scott E; Bliss, Barbara J; Zhang, Xiaohong; Tanksley, Steven D; Oppenheimer, David G; Soltis, Pamela S; Ma, Hong; dePamphilis, Claude W; Leebens-Mack, James H

2005-01-01

Background The Floral Genome Project was initiated to bridge the genomic gap between the most broadly studied plant model systems. Arabidopsis and rice, although now completely sequenced and under intensive comparative genomic investigation, are separated by at least 125 million years of evolutionary time, and cannot in isolation provide a comprehensive perspective on structural and functional aspects of flowering plant genome dynamics. Here we discuss new genomic resources available to the scientific community, comprising cDNA libraries and Expressed Sequence Tag (EST) sequences for a suite of phylogenetically basal angiosperms specifically selected to bridge the evolutionary gaps between model plants and provide insights into gene content and genome structure in the earliest flowering plants. Results Random sequencing of cDNAs from representatives of phylogenetically important eudicot, non-grass monocot, and gymnosperm lineages has so far (as of 12/1/04) generated 70,514 ESTs and 48,170 assembled unigenes. Efficient sorting of EST sequences into putative gene families based on whole Arabidopsis/rice proteome comparison has permitted ready identification of cDNA clones for finished sequencing. Preliminarily, (i) proportions of functional categories among sequenced floral genes seem representative of the entire Arabidopsis transcriptome, (ii) many known floral gene homologues have been captured, and (iii) phylogenetic analyses of ESTs are providing new insights into the process of gene family evolution in relation to the origin and diversification of the angiosperms. Conclusion Initial comparisons illustrate the utility of the EST data sets toward discovery of the basic floral transcriptome. 
These first findings also afford the opportunity to address a number of conspicuous evolutionary genomic questions, including reproductive organ transcriptome overlap between angiosperms and gymnosperms, genome-wide duplication history, lineage-specific gene duplication and functional divergence, and analyses of adaptive molecular evolution. Since not all genes in the floral transcriptome will be associated with flowering, these EST resources will also be of interest to plant scientists working on other functions, such as photosynthesis, signal transduction, and metabolic pathways. PMID:15799777
Datasets for evolutionary comparative genomics

PubMed Central

Liberles, David A

2005-01-01

Many decisions about genome sequencing projects are directed by perceived gaps in the tree of life, or towards model organisms. With the goal of a better understanding of biology through the lens of evolution, however, there are additional genomes that are worth sequencing. One such rationale for whole-genome sequencing is discussed here, along with other important strategies for understanding the phenotypic divergence of species. PMID:16086856
Paleoenvironmental reconstruction and evolution of an Upper Cretaceous lacustrine-fluvial-deltaic sequence in the Parecis Basin, Brazil

NASA Astrophysics Data System (ADS)

Rubert, Rogerio R.; Mizusaki, Ana Maria Pimentel; Martinelli, Agustín G.; Urban, Camile

2017-12-01

The Cretaceous in the Brazilian Platform records events of magmatism, tectonism and sedimentation coupled to the Gondwana breakup. Some of these events are registered as sedimentary sequences in interior basins, such as in the Cretaceous sequence of the Alto Xingu Sub-basin, Parecis Basin, Central Brazil. This article proposes the faciologic characterization and paleoenvironmental reconstruction of the Cretaceous sequence of the eastern portion of the Parecis Basin and its relation to reactivated structures such as the Serra Formosa Arch. Based on data from both outcrops and drill cores, a paleoenvironmental and evolutionary reconstruction of the sequence is herein presented. The base of the studied section is characterized by chemical and low energy clastic sedimentation of Lake Bottom and Shoreline, in a context of fast initial subsidence and low sedimentation rate. As the subsidence process decreased, a deltaic progradation became dominant with deposition in a prodelta environment, followed by a deltaic front and deltaic plain interbedded with fluvial plain, and aeolian deposition completing the sequence. The inferred Coniacian-Santonian age is based on vertebrate (fishes and notosuchians) and ostracod fossils with regional chrono-correlates in the Adamantina (Bauru Group), Capacete (Sanfranciscana Basin), and Bajo de la Carpa (Neuquén Group, in Argentina) formations. The formation of a Coniacian depocenter in the Alto Xingu Sub-basin is associated with the Turonian-Coniacian reactivation event in the Peruvian Orogenic Phase of the Andean Orogeny, with the transfer of stresses to the intraplate setting, reactivating Proterozoic structures of the basement.
A history estimate and evolutionary analysis of rabies virus variants in China.

PubMed

Ming, Pinggang; Yan, Jiaxin; Rayner, Simon; Meng, Shengli; Xu, Gelin; Tang, Qing; Wu, Jie; Luo, Jing; Yang, Xiaoming

2010-03-01

To investigate the evolutionary dynamics of rabies virus (RABV) in China, we collected and sequenced 55 isolates sampled from 14 Chinese provinces over the last 40 years and performed a coalescent-based analysis of the G gene. This revealed that the RABV currently circulating in China is composed of three main groups. Bayesian coalescent analysis estimated the date of the most recent common ancestor for the current RABV Chinese strains to be 1412 (with a 95% confidence interval of 1006-1736). The estimated mean substitution rate for the G gene sequences (3.961 × 10^-4 substitutions per site per year) was in accordance with previous reports for RABV.
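To give a feel for what the reported rate and root date imply, the arithmetic can be sketched under a strict molecular clock. This is an illustration only: the abstract does not say which clock model was used, and the sampling year of 2010 is an assumption taken from the publication date rather than from the data.

```python
def expected_divergence(rate_per_site_per_year, years):
    """Expected substitutions per site along one lineage under a strict clock."""
    return rate_per_site_per_year * years


def years_since_split(pairwise_distance, rate_per_site_per_year):
    """Time back to the common ancestor of two tips sampled at the same time.

    The pairwise distance accumulates along both lineages, hence the factor 2.
    """
    return pairwise_distance / (2.0 * rate_per_site_per_year)


RATE = 3.961e-4        # substitutions per site per year, from the abstract
ROOT_YEAR = 1412       # estimated date of the common ancestor, from the abstract
SAMPLING_YEAR = 2010   # assumption: roughly the year of publication

root_to_tip = expected_divergence(RATE, SAMPLING_YEAR - ROOT_YEAR)
pairwise = 2.0 * root_to_tip
```

With these inputs, `root_to_tip` is about 0.24 substitutions per site and `pairwise` about 0.47, and `years_since_split(pairwise, RATE)` recovers the 598 years between the two dates.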
On the path to genetic novelties: insights from programmed DNA elimination and RNA splicing.

PubMed

Catania, Francesco; Schmitz, Jürgen

2015-01-01

Understanding how genetic novelties arise is a central goal of evolutionary biology. To this end, programmed DNA elimination and RNA splicing deserve special consideration. While programmed DNA elimination reshapes genomes by eliminating chromatin during organismal development, RNA splicing rearranges genetic messages by removing intronic regions during transcription. Small RNAs help to mediate this class of sequence reorganization, which is not error-free. It is this imperfection that makes programmed DNA elimination and RNA splicing excellent candidates for generating evolutionary novelties. Leveraging a number of these two processes' mechanistic and evolutionary properties, which have been uncovered over the past years, we present recently proposed models and empirical evidence for how splicing can shape the structure of protein-coding genes in eukaryotes. We also chronicle a number of intriguing similarities between the processes of programmed DNA elimination and RNA splicing, and highlight the role that the variation in the population-genetic environment may play in shaping their target sequences. © 2015 Wiley Periodicals, Inc.
PlantTFDB 3.0: a portal for the functional and evolutionary study of plant transcription factors

PubMed Central

Jin, Jinpu; Zhang, He; Kong, Lei; Gao, Ge; Luo, Jingchu

2014-01-01

With the aim to provide a resource for functional and evolutionary study of plant transcription factors (TFs), we updated the plant TF database PlantTFDB to version 3.0 (http://planttfdb.cbi.pku.edu.cn). After refining the TF classification pipeline, we systematically identified 129 288 TFs from 83 species, of which 67 species have genome sequences, covering main lineages of green plants. Besides the abundant annotation provided in the previous version, we generated more annotations for identified TFs, including expression, regulation, interaction, conserved elements, phenotype information, expert-curated descriptions derived from UniProt, TAIR and NCBI GeneRIF, as well as references to provide clues for functional studies of TFs. To help identify evolutionary relationship among identified TFs, we assigned 69 450 TFs into 3924 orthologous groups, and constructed 9217 phylogenetic trees for TFs within the same families or same orthologous groups, respectively. In addition, we set up a TF prediction server in this version for users to identify TFs from their own sequences. PMID:24174544
MEGA-CC: computing core of molecular evolutionary genetics analysis program for automated and iterative data analysis.

PubMed

Kumar, Sudhir; Stecher, Glen; Peterson, Daniel; Tamura, Koichiro

2012-10-15

There is a growing need in the research community to apply the molecular evolutionary genetics analysis (MEGA) software tool for batch processing a large number of datasets and to integrate it into analysis workflows. Therefore, we now make available the computing core of the MEGA software as a stand-alone executable (MEGA-CC), along with an analysis prototyper (MEGA-Proto). MEGA-CC provides users with access to all the computational analyses available through MEGA's graphical user interface version. This includes methods for multiple sequence alignment, substitution model selection, evolutionary distance estimation, phylogeny inference, substitution rate and pattern estimation, tests of natural selection and ancestral sequence inference. Additionally, we have upgraded the source code for phylogenetic analysis using the maximum likelihood methods for parallel execution on multiple processors and cores. Here, we describe MEGA-CC and outline the steps for using MEGA-CC in tandem with MEGA-Proto for iterative and automated data analysis. http://www.megasoftware.net/.
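As a rough illustration of one analysis named in the abstract above, evolutionary distance estimation, the following minimal Python sketch computes the proportion of differing sites (p-distance) between two aligned nucleotide sequences and applies the Jukes-Cantor correction. It is not MEGA-CC's implementation, and the function names and the handling of gaps are assumptions made for this example only.

```python
import math


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned nucleotide sequences.

    Columns containing a gap or an ambiguous base are skipped (an assumption
    of this sketch, similar in spirit to pairwise deletion).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in valid and b in valid:
            compared += 1
            if a != b:
                differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def jukes_cantor_distance(p: float) -> float:
    """Jukes-Cantor (1969) distance: d = -3/4 * ln(1 - 4p/3)."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


if __name__ == "__main__":
    # Hypothetical toy alignment, not taken from any dataset in this listing.
    p = p_distance("ACGTACGT", "ACGTACGA")
    print(p, jukes_cantor_distance(p))
```

The correction matters because the raw p-distance undercounts substitutions once multiple changes have hit the same site; the corrected value is always at least as large as p.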
Local Geometry and Evolutionary Conservation of Protein Surfaces Reveal the Multiple Recognition Patches in Protein-Protein Interactions

PubMed Central

Laine, Elodie; Carbone, Alessandra

2015-01-01

Protein-protein interactions (PPIs) are essential to all biological processes and they represent increasingly important therapeutic targets. Here, we present a new method for accurately predicting protein-protein interfaces, understanding their properties, origins and binding to multiple partners. Contrary to machine learning approaches, our method combines in a rational and very straightforward way three sequence- and structure-based descriptors of protein residues: evolutionary conservation, physico-chemical properties and local geometry. The implemented strategy yields very precise predictions for a wide range of protein-protein interfaces and discriminates them from small-molecule binding sites. Beyond its predictive power, the approach makes it possible to dissect interaction surfaces and unravel their complexity. We show how the analysis of the predicted patches can foster new strategies for PPI modulation and interaction surface redesign. The approach is implemented in JET2, an automated tool based on the Joint Evolutionary Trees (JET) method for sequence-based protein interface prediction. JET2 is freely available at www.lcqb.upmc.fr/JET2. PMID:26690684

Genomics of Actinobacteria: Tracing the Evolutionary History of an Ancient Phylum

PubMed Central

Ventura, Marco; Canchaya, Carlos; Tauch, Andreas; Chandra, Govind; Fitzgerald, Gerald F.; Chater, Keith F.; van Sinderen, Douwe

2007-01-01

Summary: Actinobacteria constitute one of the largest phyla among Bacteria and represent gram-positive bacteria with a high G+C content in their DNA. This bacterial group includes microorganisms exhibiting a wide spectrum of morphologies, from coccoid to fragmenting hyphal forms, as well as possessing highly variable physiological and metabolic properties. Furthermore, Actinobacteria members have adopted different lifestyles, and can be pathogens (e.g., Corynebacterium, Mycobacterium, Nocardia, Tropheryma, and Propionibacterium), soil inhabitants (Streptomyces), plant commensals (Leifsonia), or gastrointestinal commensals (Bifidobacterium). The divergence of Actinobacteria from other bacteria is ancient, making it impossible to identify the phylogenetically closest bacterial group to Actinobacteria. Genome sequence analysis has revolutionized every aspect of bacterial biology by enhancing the understanding of the genetics, physiology, and evolutionary development of bacteria. Various actinobacterial genomes have been sequenced, revealing a wide genomic heterogeneity probably as a reflection of their biodiversity. This review provides an account of the recent explosion of actinobacterial genomics data and an attempt to place this in a biological and evolutionary context. PMID:17804669
Double-stranded telomeric DNA binding proteins: Diversity matters.

PubMed

Červenák, Filip; Juríková, Katarína; Sepšiová, Regina; Neboháčová, Martina; Nosek, Jozef; Tomáška, L'ubomír

2017-01-01

Telomeric sequences constitute only a small fraction of the whole genome yet they are crucial for ensuring genomic stability. This function is in large part mediated by protein complexes recruited to telomeric sequences by specific telomere-binding proteins (TBPs). Although the principal tasks of nuclear telomeres are the same in all eukaryotes, TBPs in various taxa exhibit a surprising diversity indicating their distinct evolutionary origin. This diversity is especially pronounced in ascomycetous yeasts where they must have co-evolved with rapidly diversifying sequences of telomeric repeats. In this article we (i) provide a historical overview of the discoveries leading to the current list of TBPs binding to double-stranded (ds) regions of telomeres, (ii) describe examples of dsTBPs highlighting their diversity in even closely related species, and (iii) speculate about possible evolutionary trajectories leading to a long list of various dsTBPs fulfilling the same general role(s) in their own unique ways.
The genome sequence of the emerging common midwife toad virus identifies an evolutionary intermediate within ranaviruses.

PubMed

Mavian, Carla; López-Bueno, Alberto; Balseiro, Ana; Casais, Rosa; Alcamí, Antonio; Alejo, Alí

2012-04-01

Worldwide amphibian population declines have been ascribed to global warming, increasing pollution levels, and other factors directly related to human activities. These factors may additionally be favoring the emergence of novel pathogens. In this report, we have determined the complete genome sequence of the emerging common midwife toad ranavirus (CMTV), which has caused fatal disease in several amphibian species across Europe. Phylogenetic and gene content analyses of the first complete genomic sequence from a ranavirus isolated in Europe show that CMTV is an amphibian-like ranavirus (ALRV). However, the CMTV genome structure is novel and represents an intermediate evolutionary stage between the two previously described ALRV groups. We find that CMTV clusters with several other ranaviruses isolated from different hosts and locations which might also be included in this novel ranavirus group. This work sheds light on the phylogenetic relationships within this complex group of emerging, disease-causing viruses.
Computational analysis and functional expression of ancestral copepod luciferase.

PubMed

Takenaka, Yasuhiro; Noda-Ogura, Akiko; Imanishi, Tadashi; Yamaguchi, Atsushi; Gojobori, Takashi; Shigeri, Yasushi

2013-10-10

We recently reported the cDNA sequences of 11 copepod luciferases from the superfamily Augaptiloidea in the order Calanoida. They were classified into two groups, Metridinidae and Heterorhabdidae/Lucicutiidae families, by phylogenetic analyses. To elucidate the evolutionary processes, we have now further isolated 12 copepod luciferases from Augaptiloidea species (Metridia asymmetrica, Metridia curticauda, Pleuromamma scutullata, Pleuromamma xiphias, Lucicutia ovaliformis and Heterorhabdus tanneri). Codon-based synonymous/nonsynonymous tests of positive selection for 25 identified copepod luciferases suggested that positive Darwinian selection operated in the evolution of Heterorhabdidae luciferases, whereas two types of Metridinidae luciferases had diversified via a neutral mechanism. By in silico analysis of the decoded amino acid sequences of 25 copepod luciferases, we inferred two protein sequences as ancestral copepod luciferases. They were expressed in HEK293 cells where they exhibited notable luciferase activity both in intracellular lysates and cultured media, indicating that the luciferase activity was established before evolutionary diversification of these copepod species. © 2013.
Biophysical models of protein evolution: Understanding the patterns of evolutionary sequence divergence

PubMed Central

Echave, Julian; Wilke, Claus O.

2018-01-01

For decades, rates of protein evolution have been interpreted in terms of the vague concept of “functional importance”. Slowly evolving proteins or sites within proteins were assumed to be more functionally important and thus subject to stronger selection pressure. More recently, biophysical models of protein evolution, which combine evolutionary theory with protein biophysics, have completely revolutionized our view of the forces that shape sequence divergence. Slowly evolving proteins have been found to evolve slowly because of selection against toxic misfolding and misinteractions, linking their rate of evolution primarily to their abundance. Similarly, most slowly evolving sites in proteins are not directly involved in function, but mutating them has large impacts on protein structure and stability. Here, we review the studies of the emergent field of biophysical protein evolution that have shaped our current understanding of sequence divergence patterns. We also propose future research directions to develop this nascent field. PMID:28301766
Phylogenetic and Protein Sequence Analysis of Bacterial Chemoreceptors.

PubMed

Ortega, Davi R; Zhulin, Igor B

2018-01-01

Identifying chemoreceptors in sequenced bacterial genomes, revealing their domain architecture, inferring their evolutionary relationships, and comparing them to chemoreceptors of known function become important steps in genome annotation and chemotaxis research. Here, we describe bioinformatics procedures that enable such analyses, using two closely related bacterial genomes as examples.
Inquiry-Based Learning of Molecular Phylogenetics

ERIC Educational Resources Information Center

Campo, Daniel; Garcia-Vazquez, Eva

2008-01-01

Reconstructing phylogenies from nucleotide sequences is a challenge for students because it strongly depends on evolutionary models and computer tools that are frequently updated. We present here an inquiry-based course aimed at learning how to trace a phylogeny based on sequences existing in public databases. Computer tools are freely available…
The Ecological Rise of Whales Chronicled by the Fossil Record.

PubMed

Pyenson, Nicholas D

2017-06-05

The evolution of cetaceans is one of the best examples of macroevolution documented from the fossil record. While ecological transitions dominate each phase of cetacean history, this context is rarely stated explicitly. The first major ecological phase involves a transition from riverine and deltaic environments to marine ones, concomitant with dramatic evolutionary transformations documented in their early fossil record. The second major phase involves ecological shifts associated with evolutionary innovations: echolocation (facilitating hunting prey at depth) and filter-feeding (enhancing foraging efficiency on small prey). This latter phase involves body size shifts, attributable to changes in foraging depth and environmental forcing, as well as re-invasions of freshwater systems on continental basins by multiple lineages. Modern phenomena driving cetacean ecology, such as trophic dynamics and arms races, have an evolutionary basis that remains mostly unexamined. The fossil record of cetaceans provides an historical basis for understanding current ecological mechanisms and consequences, especially as global climate change rapidly alters ocean and river ecosystems at rates and scales comparable to those over geologic time. Published by Elsevier Ltd.
Phylogeny and Strain Typing of Escherichia coli, Inferred from Variation at Mononucleotide Repeat Loci

PubMed Central

Diamant, Eran; Palti, Yniv; Gur-Arie, Riva; Cohen, Helit; Hallerman, Eric M.; Kashi, Yechezkel

2004-01-01

Multilocus sequencing of housekeeping genes has been used previously for bacterial strain typing and for inferring evolutionary relationships among strains of Escherichia coli. In this study, we used shorter intergenic sequences that contained simple sequence repeats (SSRs) of repeating mononucleotide motifs (mononucleotide repeats [MNRs]) to infer the phylogeny of pathogenic and commensal E. coli strains. Seven noncoding loci (four MNRs and three non-SSRs) were sequenced in 27 strains, including enterohemorrhagic (six isolates of O157:H7), enteropathogenic, enterotoxigenic, B, and K-12 strains. The four MNRs were also sequenced in 20 representative strains of the E. coli reference (ECOR) collection. Sequence polymorphism was significantly higher at the MNR loci, including the flanking sequences, indicating a higher mutation rate in the sequences flanking the MNR tracts. The four MNR loci were amplifiable by PCR in the standard ECOR A, B1, and D groups, but only one (yaiN) in the B2 group was amplified, which is consistent with previous studies that suggested that B2 is the most ancient group. High sequence compatibility was found between the four MNR loci, indicating that they are in the same clonal frame. The phylogenetic trees that were constructed from the sequence data were in good agreement with those of previous studies that used multilocus enzyme electrophoresis. The results demonstrate that MNR loci are useful for inferring phylogenetic relationships and provide much higher sequence variation than housekeeping genes. Therefore, the use of MNR loci for multilocus sequence typing should prove efficient for clinical diagnostics, epidemiology, and evolutionary study of bacteria. PMID:15066845
High pressure ices.

PubMed

Hermann, Andreas; Ashcroft, N W; Hoffmann, Roald

2012-01-17

H₂O will be more resistant to metallization than previously thought. From computational evolutionary structure searches, we find a sequence of new stable and meta-stable structures for the ground state of ice in the 1-5 TPa (10 to 50 Mbar) regime, in the static approximation. The previously proposed Pbcm structure is superseded by a Pmc2₁ phase at p = 930 GPa, followed by a predicted transition to a P2₁ crystal structure at p = 1.3 TPa. This phase, featuring higher coordination at O and H, is stable over a wide pressure range, reaching 4.8 TPa. We analyze carefully the geometrical changes in the calculated structures, especially the buckling at the H in O-H-O motifs. All structures are insulating: chemistry burns a deep and (with pressure increase) lasting hole in the density of states near the highest occupied electronic levels of what might be component metallic lattices. Metallization of ice in our calculations occurs only near 4.8 TPa, where the metallic C2/m phase becomes most stable. In this regime, zero-point energies much larger than typical enthalpy differences suggest possible melting of the H sublattice, or even the entire crystal.
Evolutionary modes of emergence of short interspersed nuclear element (SINE) families in grasses.

PubMed

Kögler, Anja; Schmidt, Thomas; Wenke, Torsten

2017-11-01

Short interspersed nuclear elements (SINEs) are non-autonomous transposable elements which are propagated by retrotransposition and constitute an inherent part of the genome of most eukaryotic species. Knowledge of heterogeneous and highly abundant SINEs is crucial for de novo (or improvement of) annotation of whole genome sequences. We scanned Poaceae genome sequences of six important cereals (Oryza sativa, Triticum aestivum, Hordeum vulgare, Panicum virgatum, Sorghum bicolor, Zea mays) and Brachypodium distachyon to examine the diversity and evolution of SINE populations. We comparatively analyzed the structural features, distribution, evolutionary relation and abundance of 32 SINE families and subfamilies within grasses, comprising 11 052 individual copies. The investigation of activity profiles within the Poaceae provides insights into their species-specific diversification and amplification. We found that Poaceae SINEs (PoaS) fall into two length categories: simple SINEs of up to 180 bp and dimeric SINEs larger than 240 bp. Detailed analysis at the nucleotide level revealed that multimerization of related and unrelated SINE copies is an important evolutionary mechanism of SINE formation. We conclude that PoaS families diversify by massive reshuffling between SINE families, likely caused by insertion of truncated copies, and provide a model for this evolutionary scenario. Twenty-eight of 32 PoaS families and subfamilies show significant conservation, in particular either in the 5' or 3' regions, across Poaceae species and share large sequence stretches with one or more other PoaS families. © 2017 The Authors The Plant Journal © 2017 John Wiley & Sons Ltd.
The complex evolutionary dynamics of ancient and recent polyploidy in Leucaena (Leguminosae; Mimosoideae).

PubMed

Govindarajulu, Rajanikanth; Hughes, Colin E; Alexander, Patrick J; Bailey, C Donovan

2011-12-01

The evolutionary history of Leucaena has been impacted by polyploidy, hybridization, and divergent allopatric species diversification, suggesting that this is an ideal group to investigate the evolutionary tempo of polyploidy and the complexities of reticulation and divergence in plant diversification. Parsimony- and ML-based phylogenetic approaches were applied to 105 accessions sequenced for six sequence characterized amplified region-based nuclear encoded loci, nrDNA ITS, and four cpDNA regions. Hypotheses for the origin of tetraploid species were inferred using results derived from a novel species tree and established gene tree methods and from data on genome sizes and geographic distributions. The combination of comprehensively sampled multilocus DNA sequence data sets and a novel methodology provide strong resolution and support for the origins of all five tetraploid species. A minimum of four allopolyploidization events are required to explain the origins of these species. The origin(s) of one tetraploid pair (L. involucrata/L. pallida) can be equally explained by two unique allopolyploidizations or a single event followed by divergent speciation. Alongside other recent findings, a comprehensive picture of the complex evolutionary dynamics of polyploidy in Leucaena is emerging that includes paleotetraploidization, diploidization of the last common ancestor to Leucaena, allopatric divergence among diploids, and recent allopolyploid origins for tetraploid species likely associated with human translocation of seed. These results provide insights into the role of divergence and reticulation in a well-characterized angiosperm lineage and into traits of diploid parents and derived tetraploids (particularly self-compatibility and year-round flowering) favoring the formation and establishment of novel tetraploid combinations.
Short- and Long-term Evolutionary Dynamics of Bacterial Insertion Sequences: Insights from Wolbachia Endosymbionts

PubMed Central

Cerveau, Nicolas; Leclercq, Sébastien; Leroy, Elodie; Bouchon, Didier; Cordaux, Richard

2011-01-01

Transposable elements (TE) are one of the major driving forces of genome evolution, raising the question of the long-term dynamics underlying their evolutionary success. Long-term TE evolution can readily be reconstructed in eukaryotes, thanks to many degraded copies constituting genomic fossil records of past TE proliferations. By contrast, bacterial genomes usually experience high sequence turnover and short TE retention times, thereby obscuring ancient TE evolutionary patterns. We found that Wolbachia bacterial genomes contain 52–171 insertion sequence (IS) TEs. IS account for 11% of Wolbachia wRi, which is among the highest IS genomic coverage reported in prokaryotes to date. We show that many IS groups are currently expanding in various Wolbachia genomes and that IS horizontal transfers are frequent among strains, which can explain the apparent synchronicity of these IS proliferations. Remarkably, >70% of Wolbachia IS are nonfunctional. They constitute an unusual bacterial IS genomic fossil record providing direct empirical evidence for long-term IS evolutionary dynamics following successive periods of intense transpositional activity. Our results show that comprehensive IS annotations have the potential to provide new insights into prokaryote TE evolution and, more generally, prokaryote genome evolution. Indeed, the identification of an important IS genomic fossil record in Wolbachia demonstrates that IS elements are not always of recent origin, contrary to the conventional view of TE evolution in prokaryote genomes. Our results also raise the question of whether the abundance of IS fossils is specific to Wolbachia or may be a general, albeit overlooked, feature of prokaryote genomes. PMID:21940637
Evolutionary divergence of chloroplast FAD synthetase proteins

PubMed Central

2010-01-01

Background: Flavin adenine dinucleotide synthetases (FADSs) - a group of bifunctional enzymes that carry out the dual functions of riboflavin phosphorylation to produce flavin mononucleotide (FMN) and its subsequent adenylation to generate FAD in most prokaryotes - were studied in plants in terms of sequence, structure and evolutionary history. Results: Using a variety of bioinformatics methods we have found that FADS enzymes localized to the chloroplasts, which we term plant-like FADS proteins, are distributed across a variety of green plant lineages and constitute a divergent protein family clearly of cyanobacterial origin. The C-terminal module of these enzymes does not contain the typical riboflavin kinase active site sequence, while the N-terminal module is broadly conserved. These results agree with previous work reported by Sandoval et al. in 2008. Furthermore, our observations and preliminary experimental results indicate that the C-terminus of plant-like FADS proteins may contain a catalytic activity, but one different from that of their prokaryotic counterparts. In fact, homology models predict that plant-specific conserved residues constitute a distinct active site in the C-terminus. Conclusions: A structure-based sequence alignment and an in-depth evolutionary survey of FADS proteins, thought to be crucial in plant metabolism, are reported, which will be essential for the correct annotation of plant genomes and further structural and functional studies. This work is a contribution to our understanding of the evolutionary history of plant-like FADS enzymes, which constitute a new family of FADS proteins whose C-terminal module might be involved in a distinct catalytic activity. PMID:20955574
A reassessment of the evolutionary timescale of bat rabies viruses based upon glycoprotein gene sequences.

PubMed

Kuzmina, Natalia A; Kuzmin, Ivan V; Ellison, James A; Taylor, Steven T; Bergman, David L; Dew, Beverly; Rupprecht, Charles E

2013-10-01

Rabies, an acute progressive encephalomyelitis caused by viruses in the genus Lyssavirus, is one of the oldest known infectious diseases. Although dogs and other carnivores represent the greatest threat to public health as rabies reservoirs, it is commonly accepted that bats are the primary evolutionary hosts of lyssaviruses. Despite early historical documentation of rabies, molecular clock analyses indicate a quite young age of lyssaviruses, which is confusing. For example, the results obtained for partial and complete nucleoprotein gene sequences of rabies viruses (RABV), or for a limited number of glycoprotein gene sequences, indicated that the time of the most recent common ancestor (TMRCA) for current bat RABV diversity in the Americas lies in the seventeenth to eighteenth centuries and might be directly or indirectly associated with the European colonization. Conversely, several other reports demonstrated high genetic similarity between lyssavirus isolates, including RABV, obtained within a time interval of 25-50 years. In the present study, we attempted to re-estimate the age of several North American bat RABV lineages based on the largest set of complete and partial glycoprotein gene sequences compiled to date (n = 201) employing a codon substitution model. Although our results overlap with previous estimates in marginal areas of the 95% highest posterior density (HPD) interval, they suggest a longer evolutionary history of American bat RABV lineages (TMRCA at least 732 years, with a 95% HPD of 436-1107 years).
ESPERR: learning strong and weak signals in genomic sequence alignments to identify functional elements.

PubMed

Taylor, James; Tyekucheva, Svitlana; King, David C; Hardison, Ross C; Miller, Webb; Chiaromonte, Francesca

2006-12-01

Genomic sequence signals - such as base composition, presence of particular motifs, or evolutionary constraint - have been used effectively to identify functional elements. However, approaches based only on specific signals known to correlate with function can be quite limiting. When training data are available, application of computational learning algorithms to multispecies alignments has the potential to capture broader and more informative sequence and evolutionary patterns that better characterize a class of elements. However, effective exploitation of patterns in multispecies alignments is impeded by the vast number of possible alignment columns and by a limited understanding of which particular strings of columns may characterize a given class. We have developed a computational method, called ESPERR (evolutionary and sequence pattern extraction through reduced representations), which uses training examples to learn encodings of multispecies alignments into reduced forms tailored for the prediction of chosen classes of functional elements. ESPERR produces a greatly improved Regulatory Potential score, which can discriminate regulatory regions from neutral sites with excellent accuracy (approximately 94%). This score captures strong signals (GC content and conservation), as well as subtler signals (with small contributions from many different alignment patterns) that characterize the regulatory elements in our training set. ESPERR is also effective for predicting other classes of functional elements, as we show for DNaseI hypersensitive sites and highly conserved regions with developmental enhancer activity. Our software, training data, and genome-wide predictions are available from our Web site (http://www.bx.psu.edu/projects/esperr).
Purifying Selection Maintains Dosage-Sensitive Genes during Degeneration of the Threespine Stickleback Y Chromosome

PubMed Central

White, Michael A.; Kitano, Jun; Peichel, Catherine L.

2015-01-01

Sex chromosomes are subject to unique evolutionary forces that cause suppression of recombination, leading to sequence degeneration and the formation of heteromorphic chromosome pairs (i.e., XY or ZW). Although progress has been made in characterizing the outcomes of these evolutionary processes on vertebrate sex chromosomes, it is still unclear how recombination suppression and sequence divergence typically occur and how gene dosage imbalances are resolved in the heterogametic sex. The threespine stickleback fish (Gasterosteus aculeatus) is a powerful model system to explore vertebrate sex chromosome evolution, as it possesses an XY sex chromosome pair at relatively early stages of differentiation. Using a combination of whole-genome and transcriptome sequencing, we characterized sequence evolution and gene expression across the sex chromosomes. We uncovered two distinct evolutionary strata that correspond with known structural rearrangements on the Y chromosome. In the oldest stratum, only a handful of genes remain, and these genes are under strong purifying selection. By comparing sex-linked gene expression with expression of autosomal orthologs in an outgroup, we show that dosage compensation has not evolved in threespine sticklebacks through upregulation of the X chromosome in males. Instead, in the oldest stratum, the genes that still possess a Y chromosome allele are enriched for genes predicted to be dosage sensitive in mammals and yeast. Our results suggest that dosage imbalances may have been avoided at haploinsufficient genes by retaining function of the Y chromosome allele through strong purifying selection. PMID:25818858
Accelerated Evolution of the ASPM Gene Controlling Brain Size Begins Prior to Human Brain Expansion

PubMed Central

Solomon, Gregory; Gersch, William; Yoon, Young-Ho; Collura, Randall; Ruvolo, Maryellen; Barrett, J. Carl; Woods, C. Geoffrey; Walsh, Christopher A

2004-01-01

Primary microcephaly (MCPH) is a neurodevelopmental disorder characterized by global reduction in cerebral cortical volume. The microcephalic brain has a volume comparable to that of early hominids, raising the possibility that some MCPH genes may have been evolutionary targets in the expansion of the cerebral cortex in mammals and especially primates. Mutations in ASPM, which encodes the human homologue of a fly protein essential for spindle function, are the most common known cause of MCPH. Here we have isolated large genomic clones containing the complete ASPM gene, including promoter regions and introns, from chimpanzee, gorilla, orangutan, and rhesus macaque by transformation-associated recombination cloning in yeast. We have sequenced these clones and show that whereas much of the sequence of ASPM is substantially conserved among primates, specific segments are subject to high Ka/Ks ratios (nonsynonymous/synonymous DNA changes) consistent with strong positive selection for evolutionary change. The ASPM gene sequence shows accelerated evolution in the African hominoid clade, and this precedes hominid brain expansion by several million years. Gorilla and human lineages show particularly accelerated evolution in the IQ domain of ASPM. Moreover, ASPM regions under positive selection in primates are also the most highly diverged regions between primates and nonprimate mammals. We report the first direct application of TAR cloning technology to the study of human evolution. Our data suggest that evolutionary selection of specific segments of the ASPM sequence strongly relates to differences in cerebral cortical size. PMID:15045028

Evolutionary models of interstellar chemistry

NASA Technical Reports Server (NTRS)

Prasad, Sheo S.

1987-01-01

The goal of evolutionary models of interstellar chemistry is to understand how interstellar clouds came to be the way they are, how they will change with time, and to place them in an evolutionary sequence with other celestial objects such as stars. An improved Mark II version of an earlier model of chemistry in dynamically evolving clouds is presented. The Mark II model suggests that the conventional elemental C/O ratio less than one can explain the observed abundances of CI and the nondetection of O2 in dense clouds. Coupled chemical-dynamical models seem to have the potential to generate many observable discriminators of the evolutionary tracks. This is exciting, because, in general, purely dynamical models do not yield enough verifiable discriminators of the predicted tracks.
Modeling populations of rotationally mixed massive stars

NASA Astrophysics Data System (ADS)

Brott, I.

2011-02-01

Massive stars can be considered as cosmic engines. With their high luminosities, strong stellar winds and violent deaths they drive the evolution of galaxies throughout the history of the universe. Despite the importance of massive stars, their evolution is still poorly understood. Two major issues have plagued evolutionary models of massive stars until today: mixing and mass loss. On the main sequence, the effects of mass loss remain limited in the considered mass and metallicity range, so this thesis concentrates on the role of mixing in massive stars. This thesis approaches this problem at the crossroads between observations and simulations. The main question is: do evolutionary models of single stars, accounting for the effects of rotation, reproduce the observed properties of real stars? In particular, we are interested in whether the evolutionary models can reproduce the surface abundance changes during the main-sequence phase. To constrain our models we build a population synthesis model for the sample of the VLT-FLAMES Survey of Massive Stars, for which star-formation history and rotational velocity distribution are well constrained. We consider the four main regions of the Hunter diagram: nitrogen-unenriched slow rotators and nitrogen-enriched fast rotators, which are predicted by theory, and nitrogen-enriched slow rotators and nitrogen-unenriched fast rotators, which are not predicted by our model. We conclude that currently these comparisons are not sufficient to verify the theory of rotational mixing. Physical processes in addition to rotational mixing appear necessary to explain the stars in the latter two regions. The chapters of this thesis have been published in the following journals: Ch. 2: "Rotating Massive Main-Sequence Stars I: Grids of Evolutionary Models and Isochrones", I. Brott, S. E. de Mink, M. Cantiello, N. Langer, A. de Koter, C. J. Evans, I. Hunter, C. Trundle, J. S. Vink, submitted to Astronomy & Astrophysics. Ch. 3: "The VLT-FLAMES Survey of Massive Stars: Rotation and Nitrogen Enrichment as the Key to Understanding Massive Star Evolution", I. Hunter, I. Brott, D. J. Lennon, N. Langer, C. Trundle, A. de Koter, C. J. Evans and R. S. I. Ryans, The Astrophysical Journal, 2008, 676, L29-L32. Ch. 4: "The VLT-FLAMES Survey of Massive Stars: Constraints on Stellar Evolution from the Chemical Compositions of Rapidly Rotating Galactic and Magellanic Cloud B-type Stars", I. Hunter, I. Brott, N. Langer, D. J. Lennon, P. L. Dufton, I. D. Howarth, R. S. I. Ryans, C. Trundle, C. Evans, A. de Koter and S. J. Smartt, Astronomy & Astrophysics, 2009, 496, 841-853. Ch. 5: "Rotating Massive Main-Sequence Stars II: Simulating a Population of LMC early B-type Stars as a Test of Rotational Mixing", I. Brott, C. J. Evans, I. Hunter, A. de Koter, N. Langer, P. L. Dufton, M. Cantiello, C. Trundle, D. J. Lennon, S. E. de Mink, S.-C. Yoon, P. Anders, submitted to Astronomy & Astrophysics. Ch. 6: "The Nature of B Supergiants: Clues From a Steep Drop in Rotation Rates at 22 000 K - The possibility of Bi-stability braking", Jorick S. Vink, I. Brott, G. Graefener, N. Langer, A. de Koter, D. J. Lennon, Astronomy & Astrophysics, 2010, 512, L7.
Core principles of evolutionary medicine

PubMed Central

Grunspan, Daniel Z; Nesse, Randolph M; Barnes, M Elizabeth; Brownell, Sara E

2018-01-01

Background and objectives: Evolutionary medicine is a rapidly growing field that uses the principles of evolutionary biology to better understand, prevent and treat disease, and that uses studies of disease to advance basic knowledge in evolutionary biology. Overarching principles of evolutionary medicine have been described in publications, but our study is the first to systematically elicit core principles from a diverse panel of experts in evolutionary medicine. These principles should be useful to advance recent recommendations made by The Association of American Medical Colleges and the Howard Hughes Medical Institute to make evolutionary thinking a core competency for pre-medical education. Methodology: The Delphi method was used to elicit and validate a list of core principles for evolutionary medicine. The study included four surveys administered in sequence to 56 expert panelists. The initial open-ended survey created a list of possible core principles; the three subsequent surveys winnowed the list and assessed the accuracy and importance of each principle. Results: For fourteen core principles, at least 80% of the panelists agreed or strongly agreed that they were important core principles for evolutionary medicine. These principles overlapped with key concepts discussed in other articles on evolutionary medicine. Conclusions and implications: This set of core principles will be helpful for researchers and instructors in evolutionary medicine. We recommend that evolutionary medicine instructors use the list of core principles to construct learning goals. Evolutionary medicine is a young field, so this list of core principles will likely change as the field develops further. PMID:29493660
Core principles of evolutionary medicine: A Delphi study.

PubMed

Grunspan, Daniel Z; Nesse, Randolph M; Barnes, M Elizabeth; Brownell, Sara E

2018-01-01

Evolutionary medicine is a rapidly growing field that uses the principles of evolutionary biology to better understand, prevent and treat disease, and that uses studies of disease to advance basic knowledge in evolutionary biology. Overarching principles of evolutionary medicine have been described in publications, but our study is the first to systematically elicit core principles from a diverse panel of experts in evolutionary medicine. These principles should be useful to advance recent recommendations made by The Association of American Medical Colleges and the Howard Hughes Medical Institute to make evolutionary thinking a core competency for pre-medical education. The Delphi method was used to elicit and validate a list of core principles for evolutionary medicine. The study included four surveys administered in sequence to 56 expert panelists. The initial open-ended survey created a list of possible core principles; the three subsequent surveys winnowed the list and assessed the accuracy and importance of each principle. For fourteen core principles, at least 80% of the panelists agreed or strongly agreed that they were important core principles for evolutionary medicine. These principles overlapped with key concepts discussed in other articles on evolutionary medicine. This set of core principles will be helpful for researchers and instructors in evolutionary medicine. We recommend that evolutionary medicine instructors use the list of core principles to construct learning goals. Evolutionary medicine is a young field, so this list of core principles will likely change as the field develops further.
Evolutionary dynamics of Hepatitis C virus in a chronic HIV co-infected patient and its correlation with the immune status.

PubMed

Culasso, Andrés Carlos Alberto; Monzani, María Cecilia; Baré, Patricia; Campos, Rodolfo Hector

2018-05-04

HCV evolutionary dynamics play a key role in infection onset, maintenance of chronicity, pathogenicity, and fixation of drug-resistance variants, and are thought to be one of the main obstacles to the development of an effective vaccine. Previous studies in HCV/HIV co-infected patients suggest that a decline in the immune status is related to increases in the HCV intra-host genetic diversity. However, these findings are based on single-point sequence diversity measures or coalescence analyses in several virus-host interactions. In this work, we describe the molecular evolution of the HCV-E2 region in a single HIV-co-infected patient with two clearly defined immune conditions. The phylogenetic analysis of the HCV-1a sequences from the studied patient showed that he was co-infected with three different viral lineages. These lineages were not evenly detected throughout time. The sequence diversity and coalescence analyses of these lineages suggested the action of different evolutionary patterns in different immune conditions: a slow-rate, drift-like process in an immunocompromised condition (low levels of CD4+ T lymphocytes); and a fast-rate, variant-switch process in an immunocompetent condition (high levels of CD4+ T lymphocytes). Copyright © 2017. Published by Elsevier B.V.
Physics and evolution of thermophilic adaptation.

PubMed

Berezovsky, Igor N; Shakhnovich, Eugene I

2005-09-06

Analysis of structures and sequences of several hyperthermostable proteins from various sources reveals two major physical mechanisms of their thermostabilization. The first mechanism is "structure-based," whereby some hyperthermostable proteins are significantly more compact than their mesophilic homologues, while no particular interaction type appears to cause stabilization; rather, a sheer number of interactions is responsible for thermostability. Other hyperthermostable proteins employ an alternative, "sequence-based" mechanism of their thermal stabilization. They do not show pronounced structural differences from mesophilic homologues. Rather, a small number of apparently strong interactions is responsible for high thermal stability of these proteins. High-throughput comparative analysis of structures and complete genomes of several hyperthermophilic archaea and bacteria revealed that organisms develop diverse strategies of thermophilic adaptation by using, to a varying degree, two fundamental physical mechanisms of thermostability. The choice of a particular strategy depends on the evolutionary history of an organism. Proteins from organisms that originated in an extreme environment, such as hyperthermophilic archaea (Pyrococcus furiosus), are significantly more compact and more hydrophobic than their mesophilic counterparts. Alternatively, organisms that evolved as mesophiles but later recolonized a hot environment (Thermotoga maritima) relied in their evolutionary strategy of thermophilic adaptation on "sequence-based" mechanism of thermostability. We propose an evolutionary explanation of these differences based on physical concepts of protein designability.
Genetic distances and phylogenetic trees of different Awassi sheep populations based on DNA sequencing.

PubMed

Al-Atiyat, R M; Aljumaah, R S

2014-08-27

This study aimed to estimate evolutionary distances and to reconstruct phylogenetic trees between different Awassi sheep populations. Thirty-two sheep individuals from three different geographical areas of Jordan and the Kingdom of Saudi Arabia (KSA) were randomly sampled. DNA was extracted from the tissue samples and sequenced using the T7 promoter universal primer. Different phylogenetic trees were reconstructed from 0.64-kb DNA sequences using the MEGA software with the best-fitting general time reversible (GTR) distance model. Three methods of distance estimation were then used. The maximum composite likelihood test was considered for reconstructing maximum likelihood, neighbor-joining and UPGMA trees. The maximum likelihood tree indicated three major clusters separated by cytosine (C) and thymine (T). The greatest distance was shown between the South sheep and North sheep. On the other hand, the KSA sheep as an outgroup showed a shorter evolutionary distance to the North sheep population than to the others. The neighbor-joining and UPGMA trees showed quite reliable clusters of evolutionary differentiation of Jordan sheep populations from the Saudi population. The overall results support geographical information and ecological types of the sheep populations studied. In summary, the resulting phylogenetic trees may add to the limited information about the genetic relatedness and phylogeny of Awassi sheep in nearby Arab countries.
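As a rough illustration of what an "evolutionary distance" between two aligned DNA sequences means, the sketch below computes the proportion of differing sites (p-distance) and applies the Jukes-Cantor correction. This is a deliberately simpler stand-in, not the GTR or maximum composite likelihood estimates that the abstract attributes to MEGA; the function names are hypothetical.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Ignore alignment columns where either sequence has a gap.
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def jukes_cantor(p):
    """Jukes-Cantor corrected distance (assumption: equal base frequencies and rates)."""
    if p >= 0.75:
        raise ValueError("p-distance too large for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)
```

A matrix of such pairwise distances is the usual input to distance-based tree methods such as neighbor-joining and UPGMA.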
Turning gold into ‘junk’: transposable elements utilize central proteins of cellular networks

PubMed Central

Abrusán, György; Szilágyi, András; Zhang, Yang; Papp, Balázs

2013-01-01

The numerous discovered cases of domesticated transposable element (TE) proteins led to the recognition that TEs are a significant source of evolutionary innovation. However, much less is known about the reverse process, whether and to what degree the evolution of TEs is influenced by the genome of their hosts. We addressed this issue by searching for cases of incorporation of host genes into the sequence of TEs and examined the systems-level properties of these genes using the Saccharomyces cerevisiae and Drosophila melanogaster genomes. We identified 51 cases where the evolutionary scenario was the incorporation of a host gene fragment into a TE consensus sequence, and we show that both the yeast and fly homologues of the incorporated protein sequences have central positions in the cellular networks. An analysis of selective pressure (Ka/Ks ratio) detected significant selection in 37% of the cases. Recent research on retrovirus-host interactions shows that virus proteins preferentially target hubs of the host interaction networks enabling them to take over the host cell using only a few proteins. We propose that TEs face a similar evolutionary pressure to evolve proteins with high interacting capacities and take some of the necessary protein domains directly from their hosts. PMID:23341038
Evolutionary implications of phylogenetic analyses of the gene transfer agent (GTA) of Rhodobacter capsulatus.

PubMed

Lang, Andrew S; Taylor, Terumi A; Beatty, J Thomas

2002-11-01

The gene transfer agent (GTA) of the α-proteobacterium Rhodobacter capsulatus is a cell-controlled genetic exchange vector. Genes that encode the GTA structure are clustered in a 15-kb region of the R. capsulatus chromosome, and some of these genes show sequence similarity to known bacteriophage head and tail genes. However, the production of GTA is controlled at the level of transcription by a cellular two-component signal transduction system. This paper describes homologues of both the GTA structural gene cluster and the GTA regulatory genes in the α-proteobacteria Rhodopseudomonas palustris, Rhodobacter sphaeroides, Caulobacter crescentus, Agrobacterium tumefaciens and Brucella melitensis. These sequences were used in a phylogenetic tree approach to examine the evolutionary relationships of selected GTA proteins to these homologues and (pro)phage proteins, which was compared to a 16S rRNA tree. The data indicate that a GTA-like element was present in a single progenitor of the extant species that contain both GTA structural cluster and regulatory gene homologues. The evolutionary relationships of GTA structural proteins to (pro)phage proteins indicated by the phylogenetic tree patterns suggest a predominantly vertical descent of GTA-like sequences in the α-proteobacteria and little past gene exchange with (pro)phages.
BAYESIAN PROTEIN STRUCTURE ALIGNMENT.

PubMed

Rodriguez, Abel; Schmidler, Scott C

The analysis of the three-dimensional structure of proteins is an important topic in molecular biochemistry. Structure plays a critical role in defining the function of proteins and is more strongly conserved than amino acid sequence over evolutionary timescales. A key challenge is the identification and evaluation of structural similarity between proteins; such analysis can aid in understanding the role of newly discovered proteins and help elucidate evolutionary relationships between organisms. Computational biologists have developed many clever algorithmic techniques for comparing protein structures, however, all are based on heuristic optimization criteria, making statistical interpretation somewhat difficult. Here we present a fully probabilistic framework for pairwise structural alignment of proteins. Our approach has several advantages, including the ability to capture alignment uncertainty and to estimate key "gap" parameters which critically affect the quality of the alignment. We show that several existing alignment methods arise as maximum a posteriori estimates under specific choices of prior distributions and error models. Our probabilistic framework is also easily extended to incorporate additional information, which we demonstrate by including primary sequence information to generate simultaneous sequence-structure alignments that can resolve ambiguities obtained using structure alone. This combined model also provides a natural approach for the difficult task of estimating evolutionary distance based on structural alignments. The model is illustrated by comparison with well-established methods on several challenging protein alignment examples.
Predicting protein contact map using evolutionary and physical constraints by integer programming.

PubMed

Wang, Zhiyong; Xu, Jinbo

2013-07-01

A protein contact map describes the pairwise spatial and functional relationship of residues in a protein and contains key information for protein 3D structure prediction. Although studied extensively, it remains challenging to predict a contact map using only sequence information. Most existing methods predict the contact map matrix element by element, ignoring correlation among contacts and the physical feasibility of the whole contact map. A couple of recent methods predict the contact map by using mutual information, taking into consideration contact correlation and enforcing a sparsity restraint, but these methods demand a very large number of sequence homologs for the protein under consideration, and the resultant contact map may still be physically infeasible. This article presents a novel method, PhyCMAP, for contact map prediction, integrating both evolutionary and physical restraints by machine learning and integer linear programming. The evolutionary restraints are much more informative than mutual information, and the physical restraints specify a more concrete relationship among contacts than the sparsity restraint. As such, our method greatly reduces the solution space of the contact map matrix and, thus, significantly improves prediction accuracy. Experimental results confirm that PhyCMAP outperforms currently popular methods no matter how many sequence homologs are available for the protein under consideration. http://raptorx.uchicago.edu.
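The mutual-information baseline mentioned in the abstract can be sketched as follows: for two columns of a multiple sequence alignment, mutual information measures how strongly the residues co-vary. This is a minimal sketch of that baseline only, not of PhyCMAP; the function name and the toy columns are hypothetical.

```python
import math
from collections import Counter


def column_mutual_information(col_i, col_j):
    """Mutual information (in bits) between two alignment columns of equal depth."""
    n = len(col_i)
    if n == 0 or n != len(col_j):
        raise ValueError("columns must be non-empty and of equal length")
    freq_i = Counter(col_i)
    freq_j = Counter(col_j)
    freq_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        # Compare the joint frequency with the product of the marginals.
        mi += p_ab * math.log2(p_ab / ((freq_i[a] / n) * (freq_j[b] / n)))
    return mi
```

With only four sequences, as in a toy case, the estimate is very noisy, which is in line with the abstract's remark that such methods need a very large number of sequence homologs.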
Two-Stage orders sequencing system for mixed-model assembly

NASA Astrophysics Data System (ADS)

Zemczak, M.; Skolud, B.; Krenczyk, D.

2015-11-01

In the paper, the authors focus on the NP-hard problem of order sequencing, formulated similarly to the Car Sequencing Problem (CSP). The object of the research is an assembly line in an automotive company, on which a few different product models, each in a certain number of versions, are assembled on shared resources arranged in a line. This type of production is usually referred to as mixed-model production, and arose from the necessity of manufacturing customized products on the basis of very specific orders from individual clients. Because competition in the automotive market is strong, producers are nowadays obliged to let each client specify a large number of features of the product they are willing to buy. Because the problem is NP-hard, only satisfactory solutions are sought within the given time period, as no method for finding the optimal solution has yet been found. Most researchers who applied approximate methods (e.g. evolutionary algorithms) to sequencing problems abandoned the research after the testing phase, as they were unable to obtain reproducible results and had difficulty determining the quality of the solutions obtained. Therefore a new approach to the problem, presented in this paper as a sequencing system, is being developed. The sequencing system consists of a set of defined rules implemented in a computer environment. The system works in two stages. The first stage determines the place in the storage buffer to which particular production orders should be sent. In the second stage, precise sets of sequences are determined and evaluated for particular parts of the storage buffer under given criteria.
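To make the link to the Car Sequencing Problem concrete, the sketch below evaluates a candidate order sequence against the classic CSP capacity rule "at most q orders with a given option in any window of p consecutive orders". The abstract does not state which criteria the authors' system uses, so the rule, the function name and the sample data are assumptions for illustration.

```python
def count_violations(sequence, option, max_in_window, window):
    """Count sliding windows in which `option` appears more often than allowed.

    `sequence` is a list of sets, one set of options per production order.
    """
    violations = 0
    for start in range(len(sequence) - window + 1):
        used = sum(option in order for order in sequence[start:start + window])
        if used > max_in_window:
            violations += 1
    return violations


# Hypothetical line of four orders; "sunroof" may appear at most once per 2 orders.
orders = [{"sunroof"}, {"sunroof"}, set(), {"sunroof"}]
print(count_violations(orders, "sunroof", max_in_window=1, window=2))
```

A count of this kind is one possible way to score candidate sequences so that different runs can be compared on the same scale.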
A haploid system of sex determination in the brown alga Ectocarpus sp.

PubMed

Ahmed, Sophia; Cock, J Mark; Pessia, Eugenie; Luthringer, Remy; Cormier, Alexandre; Robuchon, Marine; Sterck, Lieven; Peters, Akira F; Dittami, Simon M; Corre, Erwan; Valero, Myriam; Aury, Jean-Marc; Roze, Denis; Van de Peer, Yves; Bothwell, John; Marais, Gabriel A B; Coelho, Susana M

2014-09-08

A common feature of most genetic sex-determination systems studied so far is that sex is determined by nonrecombining genomic regions, which can be of various sizes depending on the species. These regions have evolved independently and repeatedly across diverse groups. A number of such sex-determining regions (SDRs) have been studied in animals, plants, and fungi, but very little is known about the evolution of sexes in other eukaryotic lineages. We report here the sequencing and genomic analysis of the SDR of Ectocarpus, a brown alga that has been evolving independently from plants, animals, and fungi for over one billion years. In Ectocarpus, sex is expressed during the haploid phase of the life cycle, and both the female (U) and the male (V) sex chromosomes contain nonrecombining regions. The U and V of this species have been diverging for more than 70 million years, yet gene degeneration has been modest, and the SDR is relatively small, with no evidence for evolutionary strata. These features may be explained by the occurrence of strong purifying selection during the haploid phase of the life cycle and the low level of sexual dimorphism. V is dominant over U, suggesting that femaleness may be the default state, adopted when the male haplotype is absent. The Ectocarpus UV system has clearly had an evolutionary trajectory distinct not only from the well-studied XY and ZW systems but also from the UV systems described so far. Nonetheless, some striking similarities exist, indicating remarkable universality of the underlying processes shaping sex chromosome evolution across distant lineages. Copyright © 2014 Elsevier Ltd. All rights reserved.
Evolutionary paths of streptococcal and staphylococcal superantigens

PubMed Central

2012-01-01

Background: Streptococcus pyogenes (GAS) harbors several superantigens (SAgs) in the prophage region of its genome, although speG and smez are not located in this region. The diversity of SAgs is thought to arise during horizontal transfer, but their evolutionary pathways have not yet been determined. We recently completed sequencing the entire genome of S. dysgalactiae subsp. equisimilis (SDSE), the closest relative of GAS. Although speG is the only SAg gene of SDSE, speG was present in only 50% of clinical SDSE strains and smez in none. In this study, we analyzed the evolutionary paths of streptococcal and staphylococcal SAgs. Results: We compared the sequences of the 12–60 kb speG regions of nine SDSE strains, five speG+ and four speG–. We found that the synteny of this region was highly conserved, whether or not the speG gene was present. Synteny analyses based on genome-wide comparisons of GAS and SDSE indicated that speG is the direct descendant of a common ancestor of streptococcal SAgs, whereas smez was deleted from SDSE after SDSE and GAS split from a common ancestor. Cumulative nucleotide skew analysis of SDSE genomes suggested that speG was located outside segments of steeper slopes than the stable region in the genome, whereas the region flanking smez was unstable, as expected from the results of GAS. We also detected a previously undescribed staphylococcal SAg gene, selW, and a staphylococcal SAg-like gene, ssl, in the core genomes of all Staphylococcus aureus strains sequenced. Amino acid substitution analyses, based on dN/dS window analysis of the products encoded by speG, selW and ssl, suggested that all three genes have been subjected to strong positive selection. Evolutionary analysis based on the Bayesian Markov chain Monte Carlo method showed that each clade included at least one direct descendant. Conclusions: Our findings reveal a plausible model for the comprehensive evolutionary pathway of streptococcal and staphylococcal SAgs. PMID:22900646
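A cumulative skew curve of the kind referred to in the Results can be sketched as below. The abstract does not say which nucleotide skew was used, so this example assumes the common GC skew, (G - C) / (G + C), summed over consecutive non-overlapping windows; the function name and window size are hypothetical.

```python
def cumulative_gc_skew(sequence, window=1000):
    """Running sum of per-window GC skew, (G - C) / (G + C), along a sequence."""
    totals = []
    running = 0.0
    for start in range(0, len(sequence) - window + 1, window):
        chunk = sequence[start:start + window].upper()
        g = chunk.count("G")
        c = chunk.count("C")
        # Windows without any G or C contribute zero skew.
        running += (g - c) / (g + c) if (g + c) else 0.0
        totals.append(running)
    return totals
```

Plotted against genome position, abrupt changes in the slope of this curve are what an analysis of this kind would inspect when looking for atypical segments.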
Variability and genetic structure of the population of watermelon mosaic virus infecting melon in Spain.

PubMed

Moreno, I M; Malpica, J M; Díaz-Pendón, J A; Moriones, E; Fraile, A; García-Arenal, F

2004-01-05

The genetic structure of the population of Watermelon mosaic virus (WMV) in Spain was analysed by the biological and molecular characterisation of isolates sampled from its main host plant, melon. The population was a highly homogeneous one, built of a single pathotype, and comprising isolates closely related genetically. There was indication of temporal replacement of genotypes, but not of spatial structure of the population. Analyses of nucleotide sequences in three genomic regions, that is, in the cistrons for the P1, cylindrical inclusion (CI) and capsid (CP) proteins, showed lower similar values of nucleotide diversity for the P1 than for the CI or CP cistrons. The CI protein and the CP were under tighter evolutionary constraints than the P1 protein. Also, for the CI and CP cistrons, but not for the P1 cistron, two groups of sequences, defining two genetic strains, were apparent. Thus, different genomic regions of WMV show different evolutionary dynamics. Interestingly, for the CI and CP cistrons, sequences were clustered into two regions of the sequence space, defining the two strains above, and no intermediary sequences were identified. Recombinant isolates were found, accounting for at least 7% of the population. These recombinants presented two interesting features: (i) crossover points were detected between the analysed regions in the CI and CP cistrons, but not between those in the P1 and CI cistrons, (ii) crossover points were not observed within the analysed coding regions for the P1, CI or CP proteins. This indicates strong selection against isolates with recombinant proteins, even when originated from closely related strains. Hence, data indicate that genotypes of WMV, generated by mutation or recombination, outside of acceptable, discrete, regions in the evolutionary space, are eliminated from the virus population by negative selection.
Global ecological pattern of ammonia-oxidizing archaea.

PubMed

Cao, Huiluo; Auguet, Jean-Christophe; Gu, Ji-Dong

2013-01-01

The global distribution of ammonia-oxidizing archaea (AOA), which play a pivotal role in the nitrification process, has been confirmed through numerous ecological studies. Though newly available amoA (ammonia monooxygenase subunit A) gene sequences from new environments are accumulating rapidly in public repositories, a lack of information on the ecological and evolutionary factors shaping community assembly of AOA on the global scale is apparent. We conducted a meta-analysis on uncultured AOA using roughly 6,200 archaeal amoA gene sequences, so as to reveal their community distribution patterns along a wide spectrum of physicochemical conditions and habitat types. The sequences were dereplicated at the 95% identity level, resulting in a dataset containing 1,476 archaeal amoA gene sequences from eight habitat types: soil, freshwater, freshwater sediment, estuarine sediment, marine water, marine sediment, geothermal system, and symbiosis. The updated comprehensive amoA phylogeny was composed of three major monophyletic clusters (i.e. Nitrosopumilus, Nitrosotalea, Nitrosocaldus) and a non-monophyletic cluster constituted mostly by soil and sediment sequences that we named Nitrososphaera. Diversity measurements indicated that marine and estuarine sediments as well as symbionts might be the largest reservoirs of AOA diversity. Phylogenetic analyses were further carried out using macroevolutionary analyses to explore the diversification pattern and rates of nitrifying archaea. In contrast to other habitats that displayed constant diversification rates, marine planktonic AOA interestingly exhibit a very recent and accelerating diversification rate congruent with the lowest phylogenetic diversity observed in their habitats. This result suggested the existence of AOA communities with different evolutionary history in the different habitats. Based on an up-to-date amoA phylogeny, this analysis provided insights into the possible evolutionary mechanisms and environmental parameters that shape AOA community assembly at global scale.
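Dereplication at a fixed identity level can be illustrated with a greedy scheme: each sequence is kept as a new representative unless it is at least 95% identical to a representative already kept. The sketch assumes pre-aligned sequences of equal length and is not the tool the authors used, which the abstract does not name; the function names are hypothetical.

```python
def identity(seq_a, seq_b):
    """Fraction of identical positions between two aligned, equal-length sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(a == b for a, b in zip(seq_a, seq_b)) / len(seq_a)


def dereplicate(sequences, threshold=0.95):
    """Greedy dereplication: keep a sequence only if no kept one is this similar."""
    representatives = []
    for seq in sequences:
        if not any(identity(seq, rep) >= threshold for rep in representatives):
            representatives.append(seq)
    return representatives
```

The result of a greedy pass depends on input order, so real pipelines typically sort sequences (for example by length or abundance) before clustering.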
Spliced DNA Sequences in the Paramecium Germline: Their Properties and Evolutionary Potential

PubMed Central

Catania, Francesco; McGrath, Casey L.; Doak, Thomas G.; Lynch, Michael

2013-01-01

Despite playing a crucial role in germline-soma differentiation, the evolutionary significance of developmentally regulated genome rearrangements (DRGRs) has received scant attention. An example of DRGR is DNA splicing, a process that removes segments of DNA interrupting genic and/or intergenic sequences. Perhaps best known for shaping immune-system genes in vertebrates, DNA splicing plays a central role in the life of ciliated protozoa, where thousands of germline DNA segments are eliminated after sexual reproduction to regenerate a functional somatic genome. Here, we identify and chronicle the properties of 5,286 sequences that putatively undergo DNA splicing (i.e., internal eliminated sequences [IESs]) across the genomes of three closely related species of the ciliate Paramecium (P. tetraurelia, P. biaurelia, and P. sexaurelia). The study reveals that these putative IESs share several physical characteristics. Although our results are consistent with excision events being largely conserved between species, episodes of differential IES retention/excision occur, may have a recent origin, and frequently involve coding regions. Our findings indicate interconversion between somatic (often coding) DNA sequences and noncoding IESs, and provide insights into the role of DNA splicing in creating potentially functional genetic innovation. PMID:23737328
Genetic markers, genotyping methods & next generation sequencing in Mycobacterium tuberculosis

PubMed Central

Desikan, Srinidhi; Narayanan, Sujatha

2015-01-01

Molecular epidemiology (ME) is one of the main areas in tuberculosis research and is widely used to study the transmission, epidemics and outbreaks of tubercle bacilli. It exploits the presence of various polymorphisms in the genome of the bacteria that can be used as genetic markers. Many DNA typing methods apply these genetic markers to differentiate various strains and to study the evolutionary relationships between them. The three widely used genotyping tools to differentiate Mycobacterium tuberculosis strains are IS6110 restriction fragment length polymorphism (RFLP), spacer oligotyping (Spoligotyping), and mycobacterial interspersed repeat units - variable number of tandem repeats (MIRU-VNTR). A new prospect for ME was introduced with the development of whole genome sequencing (WGS) and next generation sequencing (NGS) methods, in which the entire genome is sequenced; this not only helps in pointing out minute differences between the various sequences but also saves time and cost. NGS is also found to be useful in identifying single nucleotide polymorphisms (SNPs), in comparative genomics and in studying various aspects of transmission dynamics. These techniques enable the identification of mycobacterial strains and also facilitate the study of their phylogenetic and evolutionary traits. PMID:26205019
A novel approach to multiple sequence alignment using hadoop data grids.

PubMed

Sudha Sadasivam, G; Baktavatchalam, G

2010-01-01

Multiple alignment of protein sequences helps to determine evolutionary linkage and to predict molecular structures. The factors to be considered while aligning multiple sequences are speed and accuracy of alignment. Although dynamic programming algorithms produce accurate alignments, they are computation intensive. In this paper we propose a time efficient approach to sequence alignment that also produces quality alignment. The dynamic nature of the algorithm coupled with data and computational parallelism of hadoop data grids improves the accuracy and speed of sequence alignment. The principle of block splitting in hadoop coupled with its scalability facilitates alignment of very large sequences.
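As background for the dynamic programming algorithms this abstract refers to, here is a minimal sketch of pairwise global alignment scoring in the Needleman-Wunsch style. It is not the Hadoop-based method proposed in the paper; the function name and the toy scoring values (match +1, mismatch -1, linear gap -2) are assumptions for illustration, and real protein alignments would typically use a substitution matrix instead.

```python
def needleman_wunsch_score(a: str, b: str, match: int = 1,
                           mismatch: int = -1, gap: int = -2) -> int:
    """Optimal global alignment score of a and b under a linear gap penalty.

    Only two rows of the dynamic programming table are kept, so memory is
    O(len(b)) while time remains O(len(a) * len(b)).
    """
    # Row 0: aligning the empty prefix of a against prefixes of b costs only gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap inserted in b
            left = curr[j - 1] + gap  # gap inserted in a
            curr[j] = max(diag, up, left)
        prev = curr
    return prev[len(b)]
```

The quadratic cost per pair is what makes exact multiple alignment computationally intensive and motivates splitting the work across many nodes.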
A 5.8S nuclear ribosomal RNA gene sequence database: applications to ecology and evolution

NASA Technical Reports Server (NTRS)

Cullings, K. W.; Vogler, D. R.

1998-01-01

We compiled a 5.8S nuclear ribosomal gene sequence database for animals, plants, and fungi using both newly generated and GenBank sequences. We demonstrate the utility of this database as an internal check to determine whether the target organism and not a contaminant has been sequenced, as a diagnostic tool for ecologists and evolutionary biologists to determine the placement of asexual fungi within larger taxonomic groups, and as a tool to help identify fungi that form ectomycorrhizae.

Complete genome sequence of a Chinese isolate of pepper vein yellows virus and evolutionary analysis based on the CP, MP and RdRp coding regions.

PubMed

Liu, Maoyan; Liu, Xiangning; Li, Xun; Zhang, Deyong; Dai, Liangyin; Tang, Qianjun

2016-03-01

The genome sequence of pepper vein yellows virus (PeVYV) (PeVYV-HN, accession number KP326573), isolated from pepper plants (Capsicum annuum L.) grown at the Hunan Vegetables Institute (Changsha, Hunan, China), was determined by deep sequencing of small RNAs. The PeVYV-HN genome consists of 6244 nucleotides, contains six open reading frames (ORFs), and is similar to that of an isolate (AB594828) from Japan. Its genomic organization is similar to that of members of the genus Polerovirus. Sequence analysis revealed that PeVYV-HN shared 92% sequence identity with the Japanese PeVYV genome at both the nucleotide and amino acid levels. Evolutionary analysis based on the coat protein (CP), movement protein (MP), and RNA-dependent RNA polymerase (RdRP) showed that PeVYV could be divided into two major lineages corresponding to their geographical origins. The Asian isolates have a higher population expansion frequency than the African isolates. Negative selection and genetic drift (founder effect) were found to be the potential drivers of the molecular evolution of PeVYV. Moreover, recombination was not the distinct cause of PeVYV evolution. This is the first report of a complete genomic sequence of PeVYV in China.
Phenotype–genotype correlation in Hirschsprung disease is illuminated by comparative analysis of the RET protein sequence

PubMed Central

Kashuk, Carl S.; Stone, Eric A.; Grice, Elizabeth A.; Portnoy, Matthew E.; Green, Eric D.; Sidow, Arend; Chakravarti, Aravinda; McCallion, Andrew S.

2005-01-01

The ability to discriminate between deleterious and neutral amino acid substitutions in the genes of patients remains a significant challenge in human genetics. The increasing availability of genomic sequence data from multiple vertebrate species allows sequence conservation and physicochemical properties of residues to be used for functional prediction. In this study, the RET receptor tyrosine kinase serves as a model disease gene in which a broad spectrum (≥116) of disease-associated mutations has been identified among patients with Hirschsprung disease and multiple endocrine neoplasia type 2. We report the alignment of the human RET protein sequence with the orthologous sequences of 12 non-human vertebrates (eight mammalian, one avian, and three teleost species), their comparative analysis, the evolutionary topology of the RET protein, and predicted tolerance for all published missense mutations. We show that, although evolutionary conservation alone provides significant information to predict the effect of a RET mutation, a model that combines comparative sequence data with analysis of physicochemical properties in a quantitative framework provides far greater accuracy. Although the ability to discern the impact of a mutation is imperfect, our analyses permit substantial discrimination between predicted functional classes of RET mutations and disease severity even for a multigenic disease such as Hirschsprung disease. PMID:15956201
Grain boundary phases in bcc metals

DOE PAGES

Frolov, T.; Setyawan, W.; Kurtz, R. J.; ...

2018-01-01

Evolutionary grand-canonical search predicts novel grain boundary structures and multiple grain boundary phases in elemental body-centered cubic (bcc) metals represented by tungsten, tantalum and molybdenum.
The first complete chloroplast genome of the Genistoid legume Lupinus luteus: evidence for a novel major lineage-specific rearrangement and new insights regarding plastome evolution in the legume family

PubMed Central

Martin, Guillaume E.; Rousseau-Gueutin, Mathieu; Cordonnier, Solenn; Lima, Oscar; Michon-Coudouel, Sophie; Naquin, Delphine; de Carvalho, Julie Ferreira; Aïnouche, Malika; Salmon, Armel; Aïnouche, Abdelkader

2014-01-01

Background and Aims: To date chloroplast genomes are available only for members of the non-protein amino acid-accumulating clade (NPAAA) Papilionoid lineages in the legume family (i.e. Millettioids, Robinoids and the ‘inverted repeat-lacking clade’, IRLC). It is thus very important to sequence plastomes from other lineages in order to better understand the unusual evolution observed in this model flowering plant family. To this end, the plastome of a lupine species, Lupinus luteus, was sequenced to represent the Genistoid lineage, a noteworthy but poorly studied legume group. Methods: The plastome of L. luteus was reconstructed using Roche-454 and Illumina next-generation sequencing. Its structure, repetitive sequences, gene content and sequence divergence were compared with those of other Fabaceae plastomes. PCR screening and sequencing were performed in other allied legumes in order to determine the origin of a large inversion identified in L. luteus. Key Results: The first sequenced Genistoid plastome (L. luteus: 155 894 bp) resulted in the discovery of a 36-kb inversion, embedded within the already known 50-kb inversion in the large single-copy (LSC) region of the Papilionoideae. This inversion occurs at the base or soon after the Genistoid emergence, and most probably resulted from a flip–flop recombination between identical 29-bp inverted repeats within two trnS genes. Comparative analyses of the chloroplast gene content of L. luteus vs. Fabaceae and extra-Fabales plastomes revealed the loss of the plastid rpl22 gene, and its functional relocation to the nucleus was verified using lupine transcriptomic data. An investigation into the evolutionary rate of coding and non-coding sequences among legume plastomes resulted in the identification of remarkably variable regions. Conclusions: This study resulted in the discovery of a novel, major 36-kb inversion, specific to the Genistoids. 
Chloroplast mutational hotspots were also identified, which contain novel and potentially informative regions for molecular evolutionary studies at various taxonomic levels in the legumes. Taken together, the results provide new insights into the evolutionary landscape of the legume plastome. PMID:24769537
Diversity Arrays Technology (DArT) for Pan-Genomic Evolutionary Studies of Non-Model Organisms

PubMed Central

James, Karen E.; Schneider, Harald; Ansell, Stephen W.; Evers, Margaret; Robba, Lavinia; Uszynski, Grzegorz; Pedersen, Niklas; Newton, Angela E.; Russell, Stephen J.; Vogel, Johannes C.; Kilian, Andrzej

2008-01-01

Background: High-throughput tools for pan-genomic study, especially the DNA microarray platform, have sparked a remarkable increase in data production and enabled a shift in the scale at which biological investigation is possible. The use of microarrays to examine evolutionary relationships and processes, however, is predominantly restricted to model or near-model organisms. Methodology/Principal Findings: This study explores the utility of Diversity Arrays Technology (DArT) in evolutionary studies of non-model organisms. DArT is a hybridization-based genotyping method that uses microarray technology to identify and type DNA polymorphism. Theoretically applicable to any organism (even one for which no prior genetic data are available), DArT has not yet been explored in exclusively wild sample sets, nor extensively examined in a phylogenetic framework. DArT recovered 1349 markers of largely low copy-number loci in two lineages of seed-free land plants: the diploid fern Asplenium viride and the haploid moss Garovaglia elegans. Direct sequencing of 148 of these DArT markers identified 30 putative loci including four routinely sequenced for evolutionary studies in plants. Phylogenetic analyses of DArT genotypes reveal phylogeographic and substrate specificity patterns in A. viride, a lack of phylogeographic pattern in Australian G. elegans, and additive variation in hybrid or mixed samples. Conclusions/Significance: These results enable methodological recommendations including procedures for detecting and analysing DArT markers tailored specifically to evolutionary investigations and practical factors informing the decision to use DArT, and raise evolutionary hypotheses concerning substrate specificity and biogeographic patterns. 
Thus DArT is a demonstrably valuable addition to the set of existing molecular approaches used to infer biological phenomena such as adaptive radiations, population dynamics, hybridization, introgression, ecological differentiation and phylogeography. PMID:18301759
TAS3 miR390-dependent loci in non-vascular land plants: towards a comprehensive reconstruction of the gene evolutionary history

PubMed Central

Milyutina, Irina A.; Erokhina, Tatiana N.; Ozerova, Liudmila V.; Troitsky, Alexey V.; Solovyev, Andrey G.

2018-01-01

Trans-acting small interfering RNAs (ta-siRNAs) are transcribed from protein non-coding genomic TAS loci and belong to a plant-specific class of endogenous small RNAs. These siRNAs have been found to regulate gene expression in most taxa including seed plants, gymnosperms, ferns and mosses. In this study, bioinformatic and experimental PCR-based approaches were used as tools to analyze TAS3 and TAS6 loci in transcriptomes and genomic DNAs from representatives of evolutionarily distant non-vascular plant taxa such as Bryophyta, Marchantiophyta and Anthocerotophyta. We revealed previously undiscovered TAS3 loci in plant classes Sphagnopsida and Anthocerotopsida, as well as TAS6 loci in Bryophyta classes Tetraphidiopsida, Polytrichopsida, Andreaeopsida and Takakiopsida. These data further unveil the evolutionary pathway of the miR390-dependent TAS3 loci in land plants. We also identified charophyte alga sequences coding for SUPPRESSOR OF GENE SILENCING 3 (SGS3), which is required for generation of ta-siRNAs in plants, and hypothesized that the appearance of TAS3-related sequences could take place at a very early step in the evolutionary transition from charophyte algae to the earliest common ancestor of land plants. PMID:29682420
Phylogeny and evolutionary histories of Pyrus L. revealed by phylogenetic trees and networks based on data from multiple DNA sequences.

PubMed

Zheng, Xiaoyan; Cai, Danying; Potter, Daniel; Postman, Joseph; Liu, Jing; Teng, Yuanwen

2014-11-01

Reconstructing the phylogeny of Pyrus has been difficult due to the wide distribution of the genus and lack of informative data. In this study, we collected 110 accessions representing 25 Pyrus species and constructed both phylogenetic trees and phylogenetic networks based on multiple DNA sequence datasets. Phylogenetic trees based on both cpDNA and nuclear LFY2int2-N (LN) data resulted in poor resolution; in particular, only five primary species were monophyletic in the LN tree. A phylogenetic network of LN suggested that reticulation caused by hybridization is one of the major evolutionary processes for Pyrus species. Polytomies of the gene trees and star-like structure of cpDNA networks suggested rapid radiation is another major evolutionary process, especially for the occidental species. Pyrus calleryana and P. regelii were the earliest diverged Pyrus species. Two North African species, P. cordata, P. spinosa and P. betulaefolia were descendants of primitive stock Pyrus species and still share some common molecular characters. Southwestern China, where a large number of P. pashia populations are found, is probably the most important diversification center of Pyrus. More accessions and nuclear genes are needed for further understanding the evolutionary histories of Pyrus. Copyright © 2014 Elsevier Inc. All rights reserved.
Evolutionary rates of mitochondrial genomes correspond to diversification rates and to contemporary species richness in birds and reptiles

PubMed Central

Eo, Soo Hyung; DeWoody, J. Andrew

2010-01-01

Rates of biological diversification should ultimately correspond to rates of genome evolution. Recent studies have compared diversification rates with phylogenetic branch lengths, but incomplete phylogenies hamper such analyses for many taxa. Herein, we use pairwise comparisons of confamilial sauropsid (bird and reptile) mitochondrial DNA (mtDNA) genome sequences to estimate substitution rates. These molecular evolutionary rates are considered in light of the age and species richness of each taxonomic family, using a random-walk speciation–extinction process to estimate rates of diversification. We find the molecular clock ticks at disparate rates in different families and at different genes. For example, evolutionary rates are relatively fast in snakes and lizards, intermediate in crocodilians and slow in turtles and birds. There was also rate variation across genes, where non-synonymous substitution rates were fastest at ATP8 and slowest at CO3. Family-by-gene interactions were significant, indicating that local clocks vary substantially among sauropsids. Most importantly, we find evidence that mitochondrial genome evolutionary rates are positively correlated with speciation rates and with contemporary species richness. Nuclear sequences are poorly represented among reptiles, but the correlation between rates of molecular evolution and species diversification also extends to 18 avian nuclear genes we tested. Thus, the nuclear data buttress our mtDNA findings. PMID:20610427
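To make the idea of estimating substitution rates from pairwise sequence comparisons concrete, the sketch below computes an uncorrected p-distance, applies the textbook Jukes-Cantor (JC69) correction for multiple hits, and converts the distance into a per-lineage rate given a divergence time. The abstract does not state which substitution model or calibration the authors used, so JC69, the function names, and the rate formula d / (2t) are illustrative assumptions only.

```python
import math


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned DNA sequences.

    Sites where either sequence has a gap or ambiguous base are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x in "ACGT" and y in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def jc69_distance(p: float) -> float:
    """Jukes-Cantor distance: expected substitutions per site given p."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must be in [0, 0.75); beyond that the sequences are saturated")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def substitution_rate(distance: float, divergence_time: float) -> float:
    """Rate per site per unit time along one lineage.

    The distance accumulates along both lineages since their split,
    hence the factor of two.
    """
    if divergence_time <= 0:
        raise ValueError("divergence_time must be positive")
    return distance / (2.0 * divergence_time)
```

For example, two sequences differing at 25% of sites have a JC69 distance of about 0.304 substitutions per site, noticeably more than the raw 0.25 because some sites have changed more than once.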
Evolution of heliobacteria: implications for photosynthetic reaction center complexes

NASA Technical Reports Server (NTRS)

Vermaas, W. F.; Blankenship, R. E. (Principal Investigator)

1994-01-01

The evolutionary position of the heliobacteria, a group of green photosynthetic bacteria with a photosynthetic apparatus functionally resembling Photosystem I of plants and cyanobacteria, has been investigated with respect to the evolutionary relationship to Gram-positive bacteria and cyanobacteria. On the basis of 16S rRNA sequence analysis, the heliobacteria appear to be most closely related to Gram-positive bacteria, but an evolutionary link to cyanobacteria is also evident. Interestingly, a 46-residue domain including the putative sixth membrane-spanning region of the heliobacterial reaction center protein shows rather strong similarity (33% identity and 72% similarity) to a region including the sixth membrane-spanning region of the CP47 protein, a chlorophyll-binding core antenna polypeptide of Photosystem II. The N-terminal half of the heliobacterial reaction center polypeptide shows a moderate sequence similarity (22% identity over 232 residues) with the CP47 protein, which is significantly more than the similarity with the Photosystem I core polypeptides in this region. An evolutionary model for photosynthetic reaction center complexes is discussed, in which an ancestral homodimeric reaction center protein (possibly resembling the heliobacterial reaction center protein) with 11 membrane-spanning regions per polypeptide has diverged to give rise to the core of Photosystem I, Photosystem II, and of the photosynthetic apparatus in green, purple, and heliobacteria.
A synopsis of test results and knowledge gained from the Phase-0 CSI evolutionary model

NASA Technical Reports Server (NTRS)

Belvin, W. Keith; Elliott, Kenny B.; Horta, Lucas G.

1993-01-01

The Phase-0 CSI Evolutionary Model (CEM) is a testbed for the study of space platform global line-of-sight (LOS) pointing. Now that the tests have been completed, a summary of hardware and closed-loop test experiences is necessary to ensure a timely dissemination of the knowledge gained. The testbed is described and modeling experiences are presented followed by a summary of the research performed by various investigators. Some early lessons on implementing the closed-loop controllers are described with particular emphasis on real-time computing requirements. A summary of closed-loop studies and a synopsis of test results are presented. Plans for evolving the CEM from phase 0 to phases 1 and 2 are also described. Subsequently, a summary of knowledge gained from the design and testing of the Phase-0 CEM is made.
Woese on the received view of evolution.

PubMed

Sarkar, Sahotra

2014-01-01

As part of his attempt to reconstruct the earliest phase of the evolution of life on Earth, Woese produced a compelling critique of the received view of evolution from the 20th century. This paper explicitly articulates two related features of that critique that are fundamental but the first of which has not been sufficiently clearly recognized in the context of evolutionary theorizing: (1) according to Woese's scenario of communal evolution during life's earliest phase (roughly, the first billion years of life on Earth), well-defined biological individuals (and, thus, individual lineages) did not exist; and (2) during that phase, evolutionary change took place through ubiquitous horizontal gene transfer (HGT) rather than through vertical transmission of features (including genes) and the combinatorics of HGT was the dominant mechanism of evolutionary change. Both factors present serious challenges to the received view of evolution and that framework would have to be radically altered to incorporate these factors. The extent to which this will be necessary will depend on whether Woese's scenario of collective early evolution is correct.
Building a model: developing genomic resources for common milkweed (Asclepias syriaca) with low coverage genome sequencing

Treesearch

Shannon C.K. Straub; Mark Fishbein; Tatyana Livshult; Zachary Foster; Matthew Parks; Kevin Weitemier; Richard C. Cronn; Aaron Liston

2011-01-01

Milkweeds (Asclepias L.) have been extensively investigated in diverse areas of evolutionary biology and ecology; however, there are few genetic resources available to facilitate and complement these studies. This study explored how low coverage genome sequencing of the common milkweed (Asclepias syriaca L.) could be useful in...
Building Phylogenetic Trees from DNA Sequence Data: Investigating Polar Bear and Giant Panda Ancestry.

ERIC Educational Resources Information Center

Maier, Caroline Alexandra

2001-01-01

Presents an activity in which students seek answers to questions about evolutionary relationships by using genetic databases and bioinformatics software. Students build genetic distance matrices and phylogenetic trees based on molecular sequence data using web-based resources. Provides a flowchart of steps involved in accessing, retrieving, and…
Genome Sequence of Fusarium oxysporum f. sp. melonis, a fungus causing wilt disease on melon

USDA-ARS's Scientific Manuscript database

This manuscript reports the genome sequence of F. oxysporum f. sp. melonis, a fungal pathogen that causes Fusarium wilt disease on melon (Cucumis melo). The project is part of a large comparative study designed to explore the genetic composition and evolutionary origin of this group of horizontally ...
Matrix metalloproteinases: structures, evolution, and diversification.

PubMed

Massova, I; Kotra, L P; Fridman, R; Mobashery, S

1998-09-01

A comprehensive sequence alignment of 64 members of the family of matrix metalloproteinases (MMPs) for the entire sequences, and subsequently the catalytic and the hemopexin-like domains, has been performed. The 64 MMPs were selected from plants, invertebrates, and vertebrates. The analyses disclosed that as many as 23 distinct subfamilies of these proteins are known to exist. Information from the sequence alignments was correlated with structures, both crystallographic as well as computational, of the catalytic domains for the 23 representative members of the MMP family. A survey of the metal binding sites and two loops containing variable sequences of amino acids, which are important for substrate interactions, are discussed. The collective data support the proposal that the assembly of the domains into multidomain enzymes was likely to be an early evolutionary event. This was followed by diversification, perhaps in parallel among the MMPs, in a subsequent evolutionary time scale. Analysis indicates that a retrograde structure simplification may have accounted for the evolution of MMPs with simple domain constituents, such as matrilysin, from the larger and more elaborate enzymes.
Influenza A virus evolution and spatio-temporal dynamics in Eurasian wild birds: a phylogenetic and phylogeographical study of whole-genome sequence data

PubMed Central

Lewis, Nicola S.; Verhagen, Josanne H.; Javakhishvili, Zurab; Russell, Colin A.; Lexmond, Pascal; Westgeest, Kim B.; Bestebroer, Theo M.; Halpin, Rebecca A.; Lin, Xudong; Ransier, Amy; Fedorova, Nadia B.; Stockwell, Timothy B.; Latorre-Margalef, Neus; Olsen, Björn; Smith, Gavin; Bahl, Justin; Wentworth, David E.; Waldenström, Jonas; Fouchier, Ron A. M.

2015-01-01

Low pathogenic avian influenza A viruses (IAVs) have a natural host reservoir in wild waterbirds and the potential to spread to other host species. Here, we investigated the evolutionary, spatial and temporal dynamics of avian IAVs in Eurasian wild birds. We used whole-genome sequences collected as part of an intensive long-term Eurasian wild bird surveillance study, and combined this genetic data with temporal and spatial information to explore the virus evolutionary dynamics. Frequent reassortment and co-circulating lineages were observed for all eight genomic RNA segments over time. There was no apparent species-specific effect on the diversity of the avian IAVs. There was a spatial and temporal relationship between the Eurasian sequences and significant viral migration of avian IAVs from West Eurasia towards Central Eurasia. The observed viral migration patterns differed between segments. Furthermore, we discuss the challenges faced when analysing these surveillance and sequence data, and the caveats to be borne in mind when drawing conclusions from the apparent results of such analyses. PMID:25904147
Lineage-specific genomics: Frequent birth and death in the human genome: The human genome contains many lineage-specific elements created by both sequence and functional turnover.

PubMed

Young, Robert S

2016-07-01

Frequent evolutionary birth and death events have created a large quantity of biologically important, lineage-specific DNA within mammalian genomes. The birth and death of DNA sequences is so frequent that the total number of these insertions and deletions in the human population remains unknown, although there are differences between these groups, e.g. transposable elements contribute predominantly to sequence insertion. Functional turnover - where the activity of a locus is specific to one lineage, but the underlying DNA remains conserved - can also drive birth and death. However, this does not appear to be a major driver of divergent transcriptional regulation. Both sequence and functional turnover have contributed to the birth and death of thousands of functional promoters in the human and mouse genomes. These findings reveal the pervasive nature of evolutionary birth and death and suggest that lineage-specific regions may play an important but previously underappreciated role in human biology and disease. © 2016 The Authors BioEssays Published by WILEY Periodicals, Inc.
Evolutionary relationships of a plant-pathogenic mycoplasmalike organism and Acholeplasma laidlawii deduced from two ribosomal protein gene sequences.

PubMed Central

Lim, P O; Sears, B B

1992-01-01

The families within the class Mollicutes are distinguished by their morphologies, nutritional requirements, and abilities to metabolize certain compounds. Biosystematic classification of the plant-pathogenic mycoplasmalike organisms (MLOs) has been difficult because these organisms have not been cultured in vitro, and hence their nutritional requirements have not been determined nor have physiological characterizations been possible. To investigate the evolutionary relationship of the MLOs to other members of the class Mollicutes, a segment of a ribosomal protein operon was cloned and sequenced from an aster yellows-type MLO which is pathogenic for members of the genus Oenothera and from Acholeplasma laidlawii. The deduced amino acid sequence data from the rpl22 and rps3 genes indicate that the MLOs are more closely related to A. laidlawii than to animal mycoplasmas, confirming previous results from 16S rRNA sequence comparisons. This conclusion is also supported by the finding that the UGA codon is not read as a tryptophan codon in the MLO and A. laidlawii, in contrast to its usage in Mycoplasma capricolum. PMID:1556079
In-silico studies of neutral drift for functional protein interaction networks

NASA Astrophysics Data System (ADS)

Ali, Md Zulfikar; Wingreen, Ned S.; Mukhopadhyay, Ranjan

We have developed a minimal physically-motivated model of protein-protein interaction networks. Our system consists of two classes of enzymes, activators (e.g. kinases) and deactivators (e.g. phosphatases), and the enzyme-mediated activation/deactivation rates are determined by sequence-dependent binding strengths between enzymes and their targets. The network is evolved by introducing random point mutations in the binding sequences where we assume that each new mutation is either fixed or entirely lost. We apply this model to studies of neutral drift in networks that yield oscillatory dynamics, where we start, for example, with a relatively simple network and allow it to evolve by adding nodes and connections while requiring that dynamics be conserved. Our studies demonstrate both the importance of employing a sequence-based evolutionary scheme and the relative rapidity (in evolutionary time) for the redistribution of function over new nodes via neutral drift. Surprisingly, in addition to this redistribution time we discovered another much slower timescale for network evolution, reflecting hidden order in sequence space that we interpret in terms of sparsely connected domains.

Systematic Error in Seed Plant Phylogenomics

PubMed Central

Zhong, Bojian; Deusch, Oliver; Goremykin, Vadim V.; Penny, David; Biggs, Patrick J.; Atherton, Robin A.; Nikiforova, Svetlana V.; Lockhart, Peter James

2011-01-01

Resolving the closest relatives of Gnetales has been an enigmatic problem in seed plant phylogeny. The problem is known to be difficult because of the extent of divergence between this diverse group of gymnosperms and their closest phylogenetic relatives. Here, we investigate the evolutionary properties of conifer chloroplast DNA sequences. To improve taxon sampling of Cupressophyta (non-Pinaceae conifers), we report sequences from three new chloroplast (cp) genomes of Southern Hemisphere conifers. We have applied a site pattern sorting criterion to study compositional heterogeneity, heterotachy, and the fit of conifer chloroplast genome sequences to a general time reversible + G substitution model. We show that non-time reversible properties of aligned sequence positions in the chloroplast genomes of Gnetales mislead phylogenetic reconstruction of these seed plants. When 2,250 of the most varied sites in our concatenated alignment are excluded, phylogenetic analyses favor a close evolutionary relationship between the Gnetales and Pinaceae—the Gnepine hypothesis. Our analytical protocol provides a useful approach for evaluating the robustness of phylogenomic inferences. Our findings highlight the importance of goodness of fit between substitution model and data for understanding seed plant phylogeny. PMID:22016337
A comprehensive aligned nifH gene database: a multipurpose tool for studies of nitrogen-fixing bacteria.

PubMed

Gaby, John Christian; Buckley, Daniel H

2014-01-01

We describe a nitrogenase gene sequence database that facilitates analysis of the evolution and ecology of nitrogen-fixing organisms. The database contains 32 954 aligned nitrogenase nifH sequences linked to phylogenetic trees and associated sequence metadata. The database includes 185 linked multigene entries including full-length nifH, nifD, nifK and 16S ribosomal RNA (rRNA) gene sequences. Evolutionary analyses enabled by the multigene entries support an ancient horizontal transfer of nitrogenase genes between Archaea and Bacteria and provide evidence that nifH has a different history of horizontal gene transfer from the nifDK enzyme core. Further analyses show that lineages in nitrogenase cluster I and cluster III have different rates of substitution within nifD, suggesting that nifD is under different selection pressure in these two lineages. Finally, we find that the genetic divergence of nifH and 16S rRNA genes does not correlate well at sequence dissimilarity values used commonly to define microbial species, as strains having <3% sequence dissimilarity in their 16S rRNA genes can have up to 23% dissimilarity in nifH. The nifH database has a number of uses including phylogenetic and evolutionary analyses, the design and assessment of primers/probes and the evaluation of nitrogenase sequence diversity. Database URL: http://www.css.cornell.edu/faculty/buckley/nifh.htm.
The Functional Human C-Terminome

PubMed Central

Hedden, Michael; Lyon, Kenneth F.; Brooks, Steven B.; David, Roxanne P.; Limtong, Justin; Newsome, Jacklyn M.; Novakovic, Nemanja; Rajasekaran, Sanguthevar; Thapar, Vishal; Williams, Sean R.; Schiller, Martin R.

2016-01-01

All translated proteins end with a carboxylic acid commonly called the C-terminus. Many short functional sequences (minimotifs) are located on or immediately proximal to the C-terminus. However, information about the function of protein C-termini has not been consolidated into a single source. Here, we built a new “C-terminome” database and web system focused on human proteins. Approximately 3,600 C-termini in the human proteome have a minimotif with an established molecular function. To help evaluate the function of the remaining C-termini in the human proteome, we inferred minimotifs identified by experimentation in rodent cells, predicted minimotifs based upon consensus sequence matches, and predicted novel highly repetitive sequences in C-termini. Predictions can be ranked by enrichment scores or Genomic Evolutionary Rate Profiling (GERP) scores, a measurement of evolutionary constraint. By searching for new anchored sequences on the last 10 amino acids of proteins in the human proteome with lengths between 3–10 residues and up to 5 degenerate positions in the consensus sequences, we have identified new consensus sequences that predict instances in the majority of human genes. All of this information is consolidated into a database that can be accessed through a C-terminome web system with search and browse functions for minimotifs and human proteins. A known consensus sequence-based predicted function is assigned to nearly half the proteins in the human proteome. Weblink: http://cterminome.bio-toolkit.com. PMID:27050421
Variation in the genomic locations and sequence conservation of STAR elements among staphylococcal species provides insight into DNA repeat evolution

PubMed Central

2012-01-01

Background: Staphylococcus aureus Repeat (STAR) elements are a type of interspersed intergenic direct repeat. In this study the conservation and variation in these elements was explored by bioinformatic analyses of published staphylococcal genome sequences and through sequencing of specific STAR element loci from a large set of S. aureus isolates. Results: Using bioinformatic analyses, we found that the STAR elements were located in different genomic loci within each staphylococcal species. There was no correlation between the number of STAR elements in each genome and the evolutionary relatedness of staphylococcal species; however, higher levels of repeats were observed in both S. aureus and S. lugdunensis compared to other staphylococcal species. Unexpectedly, sequencing of the internal spacer sequences of individual repeat elements from multiple isolates showed conservation at the sequence level within deep evolutionary lineages of S. aureus. Whilst individual STAR element loci were demonstrated to expand and contract, the sequences associated with each locus were stable and distinct from one another. Conclusions: The high degree of lineage and locus-specific conservation of these intergenic repeat regions suggests that STAR elements are maintained due to selective or molecular forces with some of these elements having an important role in cell physiology. The high prevalence in two of the more virulent staphylococcal species is indicative of a potential role for STAR elements in pathogenesis. PMID:23020678
A comprehensive aligned nifH gene database: a multipurpose tool for studies of nitrogen-fixing bacteria

PubMed Central

Gaby, John Christian; Buckley, Daniel H.

2014-01-01

We describe a nitrogenase gene sequence database that facilitates analysis of the evolution and ecology of nitrogen-fixing organisms. The database contains 32 954 aligned nitrogenase nifH sequences linked to phylogenetic trees and associated sequence metadata. The database includes 185 linked multigene entries including full-length nifH, nifD, nifK and 16S ribosomal RNA (rRNA) gene sequences. Evolutionary analyses enabled by the multigene entries support an ancient horizontal transfer of nitrogenase genes between Archaea and Bacteria and provide evidence that nifH has a different history of horizontal gene transfer from the nifDK enzyme core. Further analyses show that lineages in nitrogenase cluster I and cluster III have different rates of substitution within nifD, suggesting that nifD is under different selection pressure in these two lineages. Finally, we find that the genetic divergence of nifH and 16S rRNA genes does not correlate well at sequence dissimilarity values used commonly to define microbial species, as strains having <3% sequence dissimilarity in their 16S rRNA genes can have up to 23% dissimilarity in nifH. The nifH database has a number of uses including phylogenetic and evolutionary analyses, the design and assessment of primers/probes and the evaluation of nitrogenase sequence diversity. Database URL: http://www.css.cornell.edu/faculty/buckley/nifh.htm PMID:24501396
UET: a database of evolutionarily-predicted functional determinants of protein sequences that cluster as functional sites in protein structures.

PubMed

Lua, Rhonald C; Wilson, Stephen J; Konecki, Daniel M; Wilkins, Angela D; Venner, Eric; Morgan, Daniel H; Lichtarge, Olivier

2016-01-04

The structure and function of proteins underlie most aspects of biology and their mutational perturbations often cause disease. To identify the molecular determinants of function as well as targets for drugs, it is central to characterize the important residues and how they cluster to form functional sites. The Evolutionary Trace (ET) achieves this by ranking the functional and structural importance of the protein sequence positions. ET uses evolutionary distances to estimate functional distances and correlates genotype variations with those in the fitness phenotype. Thus, ET ranks are worse for sequence positions that vary among evolutionarily closer homologs but better for positions that vary mostly among distant homologs. This approach identifies functional determinants, predicts function, guides the mutational redesign of functional and allosteric specificity, and interprets the action of coding sequence variations in proteins, people and populations. Now, the UET database offers pre-computed ET analyses for the protein structure databank, and on-the-fly analysis of any protein sequence. A web interface retrieves ET rankings of sequence positions and maps results to a structure to identify functionally important regions. This UET database integrates several ways of viewing the results on the protein sequence or structure and can be found at http://mammoth.bcm.tmc.edu/uet/. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Some Physical Principles Governing Spatial and Temporal Organization in Living Systems

NASA Astrophysics Data System (ADS)

Ali, Md Zulfikar

Spatial and temporal organization in living organisms are crucial for a variety of biological functions and arise from the interplay of a large number of interacting molecules. One of the central questions in systems biology is to understand how such an intricate organization emerges from the molecular biochemistry of the cell. In this dissertation we explore two projects. The first project relates to pattern formation in a cell membrane as an example of spatial organization, and the second project relates to the evolution of oscillatory networks as a simple example of temporal organization. For the first project, we introduce a model for pattern formation in a two-component lipid bilayer and study the interplay between membrane composition and membrane geometry, demonstrating the existence of a rich phase diagram. Pattern formation is governed by the interplay between phase separation driven by lipid-lipid interactions and the tendency of lipid domains with high intrinsic curvature to deform the membrane away from its preferred position. Depending on membrane parameters, we find the formation of compact lipid micro-clusters or of striped domains. We calculate the stripe width analytically and find good agreement with stripe widths obtained from the simulations. For the second project, we introduce a minimal model for the evolution of functional protein-interaction networks using a sequence-based mutational algorithm and apply it to study the following problems. Using the model, we study robustness and designability of a 2-component network that generates oscillations. We completely enumerate the sequence space and the phenotypic space, and discuss the relationship between designability, robustness and evolvability. We further apply the model to studies of neutral drift in networks that yield oscillatory dynamics, e.g. starting with a relatively simple network and allowing it to evolve by adding nodes and connections while requiring that oscillatory dynamics be preserved.
Our studies demonstrate both the importance of employing a sequence-based evolutionary scheme and the relative rapidity (in evolutionary time) for the redistribution of function over new nodes via neutral drift. In addition we discovered another much slower timescale for network evolution, reflecting hidden order in sequence space that we interpret in terms of sparsely connected domains. Finally, we use the model to study the evolution of an oscillator from a non-oscillatory network under the influence of external periodic forcing as a model for evolution of circadian rhythm in living systems. We use a greedy algorithm based on optimizing biologically motivated fitness functions and find that the algorithm successfully produces oscillators. However, the distribution of free-period of evolved oscillators depends on the choice of fitness functions and the nature of forcing.
Evolution of antibiotic resistance in biofilm and planktonic P. aeruginosa populations exposed to sub-inhibitory levels of ciprofloxacin.

PubMed

Ahmed, Marwa N; Porse, Andreas; Sommer, Morten Otto Alexander; Høiby, Niels; Ciofu, Oana

2018-05-14

The opportunistic Gram-negative pathogen Pseudomonas aeruginosa, known for its intrinsic and acquired antibiotic resistance, has a notorious ability to form biofilms, which often facilitate chronic infections. The evolutionary paths to antibiotic resistance have mainly been investigated in planktonic cultures and are less studied in biofilms. We experimentally evolved P. aeruginosa PAO1 colony-biofilms and stationary-phase planktonic cultures for seven passages in the presence of sub-inhibitory levels (0.1 mg/L) of ciprofloxacin (CIP) and performed a genotypic (whole bacterial population sequencing) and phenotypic assessment of the populations. We observed a higher proportion of CIP resistance in the CIP-evolved biofilm populations compared to planktonic populations exposed to the same drug concentrations. However, the minimal inhibitory concentrations (MICs) of ciprofloxacin were lower in CIP-resistant isolates selected from biofilm populations compared to the MICs of CIP-resistant isolates from the planktonic cultures. We found common evolutionary trajectories between the different lineages, with mutations in known CIP resistance determinants as well as growth condition-dependent adaptations. A general trend towards a reduction in type IV-pili dependent motility (twitching) in CIP-evolved populations, and towards loss of virulence associated traits in the populations evolved in the absence of antibiotic, was observed. In conclusion, our data indicate that biofilms facilitate the development of low-level mutational resistance, probably due to the lower effective drug exposure compared to planktonic cultures. These results provide a framework for the selection process of resistant variants and the evolutionary mechanisms in the two different growth conditions. Copyright © 2018 American Society for Microbiology.
Evolutionary connections of biological kingdoms based on protein and nucleic acid sequence evidence

NASA Technical Reports Server (NTRS)

Dayhoff, M. O.

1983-01-01

Prokaryotic and eukaryotic evolutionary trees are developed from protein and nucleic-acid sequences by the methods of numerical taxonomy. Trees are presented for bacterial ferredoxins, 5S ribosomal RNA, c-type cytochromes, cytochromes c2 and c', and 5.8S ribosomal RNA; the implications for early evolution are discussed; and a composite tree showing the branching of the anaerobes, aerobes, archaebacteria, and eukaryotes is shown. Single lines are found for all oxygen-evolving photosynthetic forms and for the salt-loving and high-temperature forms of archaebacteria. It is argued that the eukaryote mitochondria, chloroplasts, and cytoplasmic host material are descended from free-living prokaryotes that formed symbiotic associations, with more than one symbiotic event involved in the evolution of each organelle.
Cloning and characterization of new bioluminescent proteins

NASA Astrophysics Data System (ADS)

Szent-Gyorgyi, Christopher; Ballou, Byron T.; Dagnal, Erich; Bryan, Bruce

1999-07-01

Over the past two years Prolume has undertaken a comprehensive program to clone luciferases and associated 'green fluorescent proteins' (GFPs) from marine animals that use coelenterazine as the luciferin. To date we have cloned several bioluminescent proteins, including two novel copepod luciferases and two anthozoan GFPs. These four proteins have sequences that differ greatly from previously cloned analogous proteins; the sequence diversity apparently is due to independent evolutionary origins and unusual evolutionary constraints. Thus coelenterazine-based bioluminescent systems may also manifest a variety of useful properties. We discuss from this taxonomic perspective the initial biochemical and spectral characterization of our cloned proteins. Emphasis is placed on the anthozoan luciferase-GFP systems, whose efficient resonance energy transfer has elicited much current interest.
Functional and evolutionary trade-offs co-occur between two consolidated memory phases in Drosophila melanogaster

PubMed Central

Lagasse, Fabrice; Moreno, Celine; Preat, Thomas; Mery, Frederic

2012-01-01

Memory is a complex and dynamic process that is composed of different phases. Its evolution under natural selection probably depends on a balance between fitness benefits and costs. In Drosophila, two separate forms of consolidated memory phases can be generated experimentally: anaesthesia-resistant memory (ARM) and long-term memory (LTM). In recent years, several studies have focused on the differences between these long-lasting memory types and have found that, at the functional level, ARM and LTM are antagonistic. How this functional relationship will affect their evolutionary dynamics remains unknown. We selected for flies with either improved ARM or improved LTM over several generations, and found that flies selected specifically for improvement of one consolidated memory phase show reduced performance in the other memory phase. We also found that improved LTM was linked to decreased longevity in male flies but not in females. Conversely, males with improved ARM had increased longevity. We found no correlation between either improved ARM or LTM and other phenotypic traits. This is, to our knowledge, the first evidence of a symmetrical evolutionary trade-off between two memory phases for the same learning task. Such trade-offs may have an important impact on the evolution of cognitive capacities. On a neural level, these results support the hypothesis that mechanisms underlying these forms of consolidated memory are, to some degree, antagonistic. PMID:22859595
From evolutionary computation to the evolution of things.

PubMed

Eiben, Agoston E; Smith, Jim

2015-05-28

Evolution has provided a source of inspiration for algorithm designers since the birth of computers. The resulting field, evolutionary computation, has been successful in solving engineering tasks ranging in outlook from the molecular to the astronomical. Today, the field is entering a new phase as evolutionary algorithms that take place in hardware are developed, opening up new avenues towards autonomous machines that can adapt to their environment. We discuss how evolutionary computation compares with natural evolution and what its benefits are relative to other computing approaches, and we introduce the emerging area of artificial evolution in physical systems.
Evolutionary advantage via common action of recombination and neutrality

NASA Astrophysics Data System (ADS)

Saakian, David B.; Hu, Chin-Kun

2013-11-01

We investigate evolution models with recombination and neutrality. We consider the Crow-Kimura (parallel) mutation-selection model with the neutral fitness landscape, in which there is a central peak with high fitness A, and some of 1-point mutants have the same high fitness A, while the fitness of other sequences is 0. We find that the effect of recombination and neutrality depends on the concrete version of both neutrality and recombination. We consider three versions of neutrality: (a) all the nearest neighbor sequences of the peak sequence have the same high fitness A; (b) all the l-point mutations in a piece of genome of length l≥1 are neutral; (c) the neutral sequences are randomly distributed among the nearest neighbors of the peak sequences. We also consider three versions of recombination: (I) the simple horizontal gene transfer (HGT) of one nucleotide; (II) the exchange of a piece of genome of length l, HGT-l; (III) two-point crossover recombination (2CR). For the case of (a), the 2CR gives a rather strong contribution to the mean fitness, much stronger than that of HGT for a large genome length L. For the random distribution of neutral sequences there is a critical degree of neutrality νc: for ν<νc with (νc-ν) not large, the 2CR suppresses the mean fitness while HGT increases it; for ν much larger than νc, the 2CR and HGT-l increase the mean fitness more than the HGT does. We also consider the recombination in the case of smooth fitness landscapes. The recombination gives some advantage in the evolutionary dynamics, where recombination distinguishes clearly the mean-field-like evolutionary factors from the fluctuation-like ones. By contrast, mutations affect the mean-field-like and fluctuation-like factors similarly. Consequently, recombination can accelerate the non-mean-field (fluctuation) type dynamics without considerably affecting the mean-field-like factors.
Depletion of CpG Dinucleotides in Papillomaviruses and Polyomaviruses: A Role for Divergent Evolutionary Pressures.

PubMed

Upadhyay, Mohita; Vivekanandan, Perumal

2015-01-01

Papillomaviruses and polyomaviruses are small ds-DNA viruses infecting a wide range of vertebrate hosts. Evidence supporting co-evolution of the virus with the host does not fully explain the evolutionary path of papillomaviruses and polyomaviruses. Studies analyzing CpG dinucleotide frequencies in virus genomes have provided interesting insights on virus evolution. CpG dinucleotide depletion has not been extensively studied among papillomaviruses and polyomaviruses. We sought to analyze the relative abundance of dinucleotides and the relative roles of evolutionary pressures in papillomaviruses and polyomaviruses. We studied 127 full-length sequences from papillomaviruses and 56 full-length sequences from polyomaviruses. We analyzed the relative abundance of dinucleotides, effective codon number (ENC), and differences in synonymous codon usage. We examined the association, if any, between the extent of CpG dinucleotide depletion and the evolutionary lineage of the infected host. We also investigated the contribution of mutational pressure and translational selection to the evolution of papillomaviruses and polyomaviruses. All papillomaviruses and polyomaviruses are CpG depleted. Interestingly, the evolutionary lineage of the infected host determines the extent of CpG depletion among papillomaviruses and polyomaviruses. CpG dinucleotide depletion was more pronounced among papillomaviruses and polyomaviruses infecting human and other mammals as compared to those infecting birds. Our findings demonstrate that CpG depletion among papillomaviruses is linked to mutational pressure, while CpG depletion among polyomaviruses is linked to translational selection. We also present evidence that suggests methylation of CpG dinucleotides may explain, at least in part, the depletion of CpG dinucleotides among papillomaviruses but not polyomaviruses. The extent of CpG depletion among papillomaviruses and polyomaviruses is linked to the evolutionary lineage of the infected host.
Our results highlight the existence of divergent evolutionary pressures leading to CpG dinucleotide depletion among small ds-DNA viruses infecting vertebrate hosts.
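The "relative abundance of dinucleotides" analyzed in this record is usually reported as an observed/expected odds ratio: the frequency of a dinucleotide divided by the product of the frequencies of its two bases. The abstract does not state the exact formula the authors used, so the sketch below assumes this common single-strand definition; the function name `dinucleotide_relative_abundance` is hypothetical. Values well below 1 indicate depletion.

```python
from collections import Counter


def dinucleotide_relative_abundance(seq: str, dinuc: str = "CG") -> float:
    """Observed/expected ratio f(XY) / (f(X) * f(Y)) for one dinucleotide.

    Assumption: single-strand frequencies over overlapping positions;
    published analyses may use a strand-symmetrized variant instead.
    """
    seq = seq.upper()
    n = len(seq)
    if n < 2 or len(dinuc) != 2:
        raise ValueError("need a sequence of length >= 2 and a 2-letter dinucleotide")

    base_counts = Counter(seq)
    # Count overlapping occurrences of the dinucleotide.
    pair_count = sum(1 for i in range(n - 1) if seq[i:i + 2] == dinuc)

    f_pair = pair_count / (n - 1)
    f_x = base_counts[dinuc[0]] / n
    f_y = base_counts[dinuc[1]] / n
    if f_x == 0 or f_y == 0:
        raise ValueError("a constituent base is absent; the ratio is undefined")
    return f_pair / (f_x * f_y)


if __name__ == "__main__":
    # Toy sequence, not a real viral genome.
    print(dinucleotide_relative_abundance("CACAGTGT"))
```

Computing this ratio per genome and grouping the results by host lineage would reproduce the kind of comparison the abstract describes between viruses of mammals and those of birds.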
The pipid root.

PubMed

Bewick, Adam J; Chain, Frédéric J J; Heled, Joseph; Evans, Ben J

2012-12-01

The estimation of phylogenetic relationships is an essential component of understanding evolution. Accurate phylogenetic estimation is difficult, however, when internodes are short and old, when genealogical discordance is common due to large ancestral effective population sizes or ancestral population structure, and when homoplasy is prevalent. Inference of divergence times is also hampered by unknown and uneven rates of evolution, the incomplete fossil record, uncertainty in relationships between fossil and extant lineages, and uncertainty in the age of fossils. Ideally, these challenges can be overcome by developing large "phylogenomic" data sets and by analyzing them with methods that accommodate features of the evolutionary process, such as genealogical discordance, recurrent substitution, recombination, ancestral population structure, gene flow after speciation among sampled and unsampled taxa, and variation in evolutionary rates. In some phylogenetic problems, it is possible to use information that is independent of fossils, such as the geological record, to identify putative triggers for diversification whose associated estimated divergence times can then be compared a posteriori with estimated relationships and ages of fossils. The history of diversification of pipid frog genera Pipa, Hymenochirus, Silurana, and Xenopus, for instance, is characterized by many of these evolutionary and analytical challenges. These frogs diversified dozens of millions of years ago, they have a relatively rich fossil record, their distributions span continental plates with a well characterized geological record of ancient connectivity, and there is considerable disagreement across studies in estimated evolutionary relationships. We used high throughput sequencing and public databases to generate a large phylogenomic data set with which we estimated evolutionary relationships using multilocus coalescence methods. 
We collected sequence data from Pipa, Hymenochirus, Silurana, and Xenopus and the outgroup taxon Rhinophrynus dorsalis from coding sequence of 113 autosomal regions, averaging ∼300 bp in length (range: 102-1695 bp) and also a portion of the mitochondrial genome. Analysis of these data using multiple approaches recovers strong support for the ((Xenopus, Silurana)(Pipa, Hymenochirus)) topology, and geologically calibrated divergence time estimates that are consistent with estimated ages and phylogenetic affinities of many fossils. These results provide new insights into the biogeography and chronology of pipid diversification during the breakup of Gondwanaland and illustrate how phylogenomic data may be necessary to tackle tough problems in molecular systematics. [Coalescence; gene tree; high-throughput sequencing; lineage sorting; pipid; species tree; Xenopus.]
The draft genome sequence and annotation of the desert woodrat Neotoma lepida.

PubMed

Campbell, Michael; Oakeson, Kelly F; Yandell, Mark; Halpert, James R; Dearing, Denise

2016-09-01

We present the de novo draft genome sequence for a vertebrate mammalian herbivore, the desert woodrat (Neotoma lepida). This species is of ecological and evolutionary interest with respect to ingestion, microbial detoxification and hepatic metabolism of toxic plant secondary compounds from the highly toxic creosote bush (Larrea tridentata) and the juniper shrub (Juniperus monosperma). The draft genome sequence and annotation have been deposited at GenBank under the accession LZPO01000000.
BEYOND THE MAIN SEQUENCE: TESTING THE ACCURACY OF STELLAR MASSES PREDICTED BY THE PARSEC EVOLUTIONARY TRACKS

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ghezzi, Luan; Johnson, John Asher, E-mail: lghezzi@cfa.harvard.edu

2015-10-20

Characterizing the physical properties of exoplanets and understanding their formation and orbital evolution requires precise and accurate knowledge of their host stars. Accurately measuring stellar masses is particularly important because they likely influence planet occurrence and the architectures of planetary systems. Single main-sequence stars typically have masses estimated from evolutionary tracks, which generally provide accurate results due to their extensive empirical calibration. However, the validity of this method for subgiants and giants has been called into question by recent studies, with suggestions that the masses of these evolved stars could have been overestimated. We investigate these concerns using a sample of 59 benchmark evolved stars with model-independent masses (from binary systems or asteroseismology) obtained from the literature. We find very good agreement between these benchmark masses and the ones estimated using evolutionary tracks. The average fractional difference in the mass interval ∼0.7–4.5 M⊙ is consistent with zero (−1.30 ± 2.42%), with no significant trends in the residuals relative to the input parameters. A good agreement between model-dependent and -independent radii (−4.81 ± 1.32%) and surface gravities (0.71 ± 0.51%) is also found. The consistency between independently determined ages for members of binary systems adds further support for the accuracy of the method employed to derive the stellar masses. Taken together, our results indicate that determination of masses of evolved stars using grids of evolutionary tracks is not significantly affected by systematic errors, and is thus valid for estimating the masses of isolated stars beyond the main sequence.
Clusters of ancestrally related genes that show paralogy in whole or in part are a major feature of the genomes of humans and other species.

PubMed

Walker, Michael B; King, Benjamin L; Paigen, Kenneth

2012-01-01

Arrangements of genes along chromosomes are a product of evolutionary processes, and we can expect that preferable arrangements will prevail over the span of evolutionary time, often being reflected in the non-random clustering of structurally and/or functionally related genes. Such non-random arrangements can arise by two distinct evolutionary processes: duplications of DNA sequences that give rise to clusters of genes sharing both sequence similarity and common sequence features and the migration together of genes related by function, but not by common descent. To provide a background for distinguishing between the two, which is important for future efforts to unravel the evolutionary processes involved, we here provide a description of the extent to which ancestrally related genes are found in proximity. Towards this purpose, we combined information from five genomic datasets, InterPro, SCOP, PANTHER, Ensembl protein families, and Ensembl gene paralogs. The results are provided in publicly available datasets (http://cgd.jax.org/datasets/clustering/paraclustering.shtml) describing the extent to which ancestrally related genes are in proximity beyond what is expected by chance (i.e. form paraclusters) in the human and nine other vertebrate genomes, as well as the D. melanogaster, C. elegans, A. thaliana, and S. cerevisiae genomes. With the exception of Saccharomyces, paraclusters are a common feature of the genomes we examined. In the human genome they are estimated to include at least 22% of all protein coding genes. Paraclusters are far more prevalent among some gene families than others, are highly species or clade specific and can evolve rapidly, sometimes in response to environmental cues. Altogether, they account for a large portion of the functional clustering previously reported in several genomes.
Measuring fit of sequence data to phylogenetic model: gain of power using marginal tests.

PubMed

Waddell, Peter J; Ota, Rissa; Penny, David

2009-10-01

Testing fit of data to model is fundamentally important to any science, but publications in the field of phylogenetics rarely do this. Such analyses discard fundamental aspects of science as prescribed by Karl Popper. Indeed, not without cause, Popper (Unended quest: an intellectual autobiography. Fontana, London, 1976) once argued that evolutionary biology was unscientific as its hypotheses were untestable. Here we trace developments in assessing fit from Penny et al. (Nature 297:197-200, 1982) to the present. We compare the general log-likelihood ratio statistic (the G or G² statistic) between the evolutionary tree model and the multinomial model with that of marginalized tests applied to an alignment (using placental mammal coding sequence data). It is seen that the most general test does not reject the fit of data to model (P ≈ 0.5), but the marginalized tests do. Tests on pairwise frequency (F) matrices strongly (P < 0.001) reject the most general phylogenetic (GTR) models commonly in use. It is also clear (P < 0.01) that the sequences are not stationary in their nucleotide composition. Deviations from stationarity and homogeneity seem to be unevenly distributed amongst taxa; not necessarily those expected from examining other regions of the genome. By marginalizing the 4^t patterns of the i.i.d. model to observed and expected parsimony counts, that is, from constant sites, to singletons, to parsimony informative characters of a minimum possible length, then the likelihood ratio test regains power, and it too rejects the evolutionary model with P < 0.001. Given such behavior over relatively recent evolutionary time, readers in general should maintain a healthy skepticism of results, as the scale of the systematic errors in published trees may really be far larger than the analytical methods (e.g., bootstrap) report.
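The G (or G²) statistic named in this abstract is the standard log-likelihood ratio statistic, G = 2 Σ O_i ln(O_i / E_i), which compares observed counts O_i with the counts E_i expected under a model. The sketch below shows only that general formula on illustrative numbers; it is not the authors' marginalized test, and the function name `g_statistic` and the example counts are assumptions.

```python
import math


def g_statistic(observed, expected):
    """Log-likelihood ratio statistic G = 2 * sum(O * ln(O / E)).

    Categories with an observed count of zero contribute nothing,
    following the usual convention that 0 * ln(0) is taken as 0.
    """
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")

    total = 0.0
    for o, e in zip(observed, expected):
        if o == 0:
            continue
        if e <= 0:
            raise ValueError("expected counts must be positive where observed > 0")
        total += o * math.log(o / e)
    return 2.0 * total


if __name__ == "__main__":
    # Illustrative counts only, e.g. site-pattern classes in an alignment.
    print(g_statistic([10, 20], [15, 15]))
```

In the setting of the abstract, the categories would be site patterns (or marginal classes such as constant sites, singletons and parsimony-informative sites), with expected counts taken from the fitted tree model. Collapsing the very large number of possible patterns into a few classes is what the authors report as restoring the test's power.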
Genetic identification and evolutionary trends of the seagrass Halophila nipponica in temperate coastal waters of Korea.

PubMed

Kim, Young Kyun; Kim, Seung Hyeon; Yi, Joo Mi; Kang, Chang-Keun; Short, Frederick; Lee, Kun-Seop

2017-01-01

Although seagrass species in the genus Halophila are generally distributed in tropical or subtropical regions, H. nipponica has been reported to occur in temperate coastal waters of the northwestern Pacific. Because H. nipponica occurs only in the warm temperate areas influenced by the Kuroshio Current and shows a tropical seasonal growth pattern, such as severely restricted growth in low water temperatures, it was hypothesized that this temperate Halophila species diverged from tropical species in the relatively recent evolutionary past. We used a phylogenetic analysis of internal transcribed spacer (ITS) regions to examine the genetic variability and evolutionary trend of H. nipponica. ITS sequences of H. nipponica from various locations in Korea and Japan were identical or showed very low sequence divergence (less than 3-base pair, bp, difference), confirming that H. nipponica from Japan and Korea are the same species. Halophila species in the section Halophila, which have simple phyllotaxy (a pair of petiolate leaves at the rhizome node), were separated into five well-supported clades by maximum parsimony analysis. H. nipponica grouped with H. okinawensis and H. gaudichaudii from the subtropical regions in the same clade, the latter two species having quite low ITS sequence divergence from H. nipponica (7-15-bp). H. nipponica in Clade I diverged 2.95 ± 1.08 million years ago from species in Clade II, which includes H. ovalis. According to geographical distribution and genetic similarity, H. nipponica appears to have diverged from a tropical species like H. ovalis and adapted to warm temperate environments. The results of divergence time estimates suggest that the temperate H. nipponica is an older species than the subtropical H. okinawensis and H. gaudichaudii and they may have different evolutionary histories.

Genetic identification and evolutionary trends of the seagrass Halophila nipponica in temperate coastal waters of Korea

PubMed Central

Kim, Young Kyun; Kim, Seung Hyeon; Yi, Joo Mi; Kang, Chang-Keun; Short, Frederick; Lee, Kun-Seop

2017-01-01

Although seagrass species in the genus Halophila are generally distributed in tropical or subtropical regions, H. nipponica has been reported to occur in temperate coastal waters of the northwestern Pacific. Because H. nipponica occurs only in the warm temperate areas influenced by the Kuroshio Current and shows a tropical seasonal growth pattern, such as severely restricted growth in low water temperatures, it was hypothesized that this temperate Halophila species diverged from tropical species in the relatively recent evolutionary past. We used a phylogenetic analysis of internal transcribed spacer (ITS) regions to examine the genetic variability and evolutionary trend of H. nipponica. ITS sequences of H. nipponica from various locations in Korea and Japan were identical or showed very low sequence divergence (less than 3-base pair, bp, difference), confirming that H. nipponica from Japan and Korea are the same species. Halophila species in the section Halophila, which have simple phyllotaxy (a pair of petiolate leaves at the rhizome node), were separated into five well-supported clades by maximum parsimony analysis. H. nipponica grouped with H. okinawensis and H. gaudichaudii from the subtropical regions in the same clade, the latter two species having quite low ITS sequence divergence from H. nipponica (7–15-bp). H. nipponica in Clade I diverged 2.95 ± 1.08 million years ago from species in Clade II, which includes H. ovalis. According to geographical distribution and genetic similarity, H. nipponica appears to have diverged from a tropical species like H. ovalis and adapted to warm temperate environments. The results of divergence time estimates suggest that the temperate H. nipponica is an older species than the subtropical H. okinawensis and H. gaudichaudii and they may have different evolutionary histories. PMID:28505209
Integrative View of α2,3-Sialyltransferases (ST3Gal) Molecular and Functional Evolution in Deuterostomes: Significance of Lineage-Specific Losses

PubMed Central

Petit, Daniel; Teppa, Elin; Mir, Anne-Marie; Vicogne, Dorothée; Thisse, Christine; Thisse, Bernard; Filloux, Cyril; Harduin-Lepers, Anne

2015-01-01

Sialyltransferases are responsible for the synthesis of a diverse range of sialoglycoconjugates predicted to be pivotal to deuterostomes’ evolution. In this work, we reconstructed the evolutionary history of the metazoan α2,3-sialyltransferases family (ST3Gal), a subset of sialyltransferases encompassing six subfamilies (ST3Gal I–ST3Gal VI) functionally characterized in mammals. Exploration of genomic and expressed sequence tag databases and search of conserved sialylmotifs led to the identification of a large data set of st3gal-related gene sequences. Molecular phylogeny and large scale sequence similarity network analysis identified four new vertebrate subfamilies called ST3Gal III-r, ST3Gal VII, ST3Gal VIII, and ST3Gal IX. To address the issue of the origin and evolutionary relationships of the st3gal-related genes, we performed comparative syntenic mapping of st3gal gene loci combined to ancestral genome reconstruction. The ten vertebrate ST3Gal subfamilies originated from genome duplication events at the base of vertebrates and are organized in three distinct and ancient groups of genes predating the early deuterostomes. Inferring st3gal gene family history identified also several lineage-specific gene losses, the significance of which was explored in a functional context. Toward this aim, spatiotemporal distribution of st3gal genes was analyzed in zebrafish and bovine tissues. In addition, molecular evolutionary analyses using specificity determining position and coevolved amino acid predictions led to the identification of amino acid residues with potential implication in functional divergence of vertebrate ST3Gal. We propose a detailed scenario of the evolutionary relationships of st3gal genes coupled to a conceptual framework of the evolution of ST3Gal functions. PMID:25534026
Hybrid intelligent methodology to design translation invariant morphological operators for Brazilian stock market prediction.

PubMed

Araújo, Ricardo de A

2010-12-01

This paper presents a hybrid intelligent methodology to design increasing translation invariant morphological operators applied to Brazilian stock market prediction (overcoming the random walk dilemma). The proposed Translation Invariant Morphological Robust Automatic phase-Adjustment (TIMRAA) method consists of a hybrid intelligent model composed of a Modular Morphological Neural Network (MMNN) with a Quantum-Inspired Evolutionary Algorithm (QIEA), which searches for the best time lags to reconstruct the phase space of the time series generator phenomenon and determines the initial (sub-optimal) parameters of the MMNN. Each individual of the QIEA population is further trained by the Back Propagation (BP) algorithm to improve the MMNN parameters supplied by the QIEA. Also, for each prediction model generated, it uses a behavioral statistical test and a phase fix procedure to adjust time phase distortions observed in stock market time series. Furthermore, an experimental analysis is conducted with the proposed method through four Brazilian stock market time series, and the achieved results are discussed and compared to results found with random walk models and the previously introduced Time-delay Added Evolutionary Forecasting (TAEF) and Morphological-Rank-Linear Time-lag Added Evolutionary Forecasting (MRLTAEF) methods. Copyright © 2010 Elsevier Ltd. All rights reserved.
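The abstract refers to choosing time lags to reconstruct the phase space of a time series. The sketch below shows only the generic delay-embedding step for a given set of lags. It is not the TIMRAA method: the evolutionary search over lags and the neural network are omitted, and the name `delay_embed` and the sample prices are assumptions made for illustration.

```python
def delay_embed(series, lags):
    """Build (inputs, target) pairs from a series using the given time lags.

    Each input vector is [x[t - lag] for lag in lags]; the target is x[t].
    """
    if not lags or min(lags) < 1:
        raise ValueError("lags must be positive integers")
    max_lag = max(lags)
    samples = []
    # Start at max_lag so every lagged index is valid.
    for t in range(max_lag, len(series)):
        inputs = [series[t - lag] for lag in lags]
        samples.append((inputs, series[t]))
    return samples


# Hypothetical price series, for illustration only.
prices = [10, 11, 13, 12, 15, 16]
for inputs, target in delay_embed(prices, lags=[1, 3]):
    print(inputs, "->", target)
```

A lag-selection procedure such as the one the paper describes would call a routine like this repeatedly with different candidate lag sets and score the resulting predictors.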
PhyloBot: A Web Portal for Automated Phylogenetics, Ancestral Sequence Reconstruction, and Exploration of Mutational Trajectories.

PubMed

Hanson-Smith, Victor; Johnson, Alexander

2016-07-01

The method of phylogenetic ancestral sequence reconstruction is a powerful approach for studying evolutionary relationships among protein sequence, structure, and function. In particular, this approach allows investigators to (1) reconstruct and "resurrect" (that is, synthesize in vivo or in vitro) extinct proteins to study how they differ from modern proteins, (2) identify key amino acid changes that, over evolutionary timescales, have altered the function of the protein, and (3) order historical events in the evolution of protein function. Widespread use of this approach has been slow among molecular biologists, in part because the methods require significant computational expertise. Here we present PhyloBot, a web-based software tool that makes ancestral sequence reconstruction easy. Designed for non-experts, it integrates all the necessary software into a single user interface. Additionally, PhyloBot provides interactive tools to explore evolutionary trajectories between ancestors, enabling the rapid generation of hypotheses that can be tested using genetic or biochemical approaches. Early versions of this software were used in previous studies to discover genetic mechanisms underlying the functions of diverse protein families, including V-ATPase ion pumps, DNA-binding transcription regulators, and serine/threonine protein kinases. PhyloBot runs in a web browser, and is available at the following URL: http://www.phylobot.com. The software is implemented in Python using the Django web framework, and runs on elastic cloud computing resources from Amazon Web Services. Users can create and submit jobs on our free server (at the URL listed above), or use our open-source code to launch their own PhyloBot server.
PhyloBot: A Web Portal for Automated Phylogenetics, Ancestral Sequence Reconstruction, and Exploration of Mutational Trajectories

PubMed Central

Hanson-Smith, Victor; Johnson, Alexander

2016-01-01

The method of phylogenetic ancestral sequence reconstruction is a powerful approach for studying evolutionary relationships among protein sequence, structure, and function. In particular, this approach allows investigators to (1) reconstruct and “resurrect” (that is, synthesize in vivo or in vitro) extinct proteins to study how they differ from modern proteins, (2) identify key amino acid changes that, over evolutionary timescales, have altered the function of the protein, and (3) order historical events in the evolution of protein function. Widespread use of this approach has been slow among molecular biologists, in part because the methods require significant computational expertise. Here we present PhyloBot, a web-based software tool that makes ancestral sequence reconstruction easy. Designed for non-experts, it integrates all the necessary software into a single user interface. Additionally, PhyloBot provides interactive tools to explore evolutionary trajectories between ancestors, enabling the rapid generation of hypotheses that can be tested using genetic or biochemical approaches. Early versions of this software were used in previous studies to discover genetic mechanisms underlying the functions of diverse protein families, including V-ATPase ion pumps, DNA-binding transcription regulators, and serine/threonine protein kinases. PhyloBot runs in a web browser, and is available at the following URL: http://www.phylobot.com. The software is implemented in Python using the Django web framework, and runs on elastic cloud computing resources from Amazon Web Services. Users can create and submit jobs on our free server (at the URL listed above), or use our open-source code to launch their own PhyloBot server. PMID:27472806
Sequencing of small RNAs of the fern Pleopeltis minima (Polypodiaceae) offers insight into the evolution of the microRNA repertoire in land plants

PubMed Central

Berruezo, Florencia; de Souza, Flávio S. J.; Picca, Pablo I.; Nemirovsky, Sergio I.; Martínez Tosar, Leandro; Rivero, Mercedes; Mentaberry, Alejandro N.

2017-01-01

MicroRNAs (miRNAs) are short, single stranded RNA molecules that regulate the stability and translation of messenger RNAs in diverse eukaryotic groups. Several miRNA genes are of ancient origin and have been maintained in the genomes of animal and plant taxa for hundreds of millions of years, playing key roles in development and physiology. In the last decade, genome and small RNA (sRNA) sequencing of several plant species have helped unveil the evolutionary history of land plants. Among these, the fern group (monilophytes) occupies a key phylogenetic position, as it represents the closest extant cousin taxon of seed plants, i.e. gymno- and angiosperms. However, in spite of their evolutionary, economic and ecological importance, no fern genome has been sequenced yet and few genomic resources are available for this group. Here, we sequenced the small RNA fraction of an epiphytic South American fern, Pleopeltis minima (Polypodiaceae), and compared it to plant miRNA databases, allowing for the identification of miRNA families that are shared by all land plants, shared by all vascular plants (tracheophytes) or shared by euphyllophytes (ferns and seed plants) only. Using the recently described transcriptome of another fern, Lygodium japonicum, we also estimated the degree of conservation of fern miRNA targets in relation to other plant groups. Our results pinpoint the origin of several miRNA families in the land plant evolutionary tree with more precision and are a resource for future genomic and functional studies of fern miRNAs. PMID:28494025
A high density physical map of chromosome 1BL supports evolutionary studies, map-based cloning and sequencing in wheat

PubMed Central

2013-01-01

Background: As for other major crops, achieving a complete wheat genome sequence is essential for the application of genomics to breeding new and improved varieties. To overcome the complexities of the large, highly repetitive and hexaploid wheat genome, the International Wheat Genome Sequencing Consortium established a chromosome-based strategy that was validated by the construction of the physical map of chromosome 3B. Here, we present improved strategies for the construction of highly integrated and ordered wheat physical maps, using chromosome 1BL as a template, and illustrate their potential for evolutionary studies and map-based cloning. Results: Using a combination of novel high throughput marker assays and an assembly program, we developed a high quality physical map representing 93% of wheat chromosome 1BL, anchored and ordered with 5,489 markers including 1,161 genes. Analysis of the gene space organization and evolution revealed that gene distribution and conservation along the chromosome results from the superimposition of the ancestral grass and recent wheat evolutionary patterns, leading to a peak of synteny in the central part of the chromosome arm and an increased density of non-collinear genes towards the telomere. With a density of about 11 markers per Mb, the 1BL physical map provides 916 markers, including 193 genes, for fine mapping the 40 QTLs mapped on this chromosome. Conclusions: Here, we demonstrate that high marker density physical maps can be developed in complex genomes such as wheat to accelerate map-based cloning, gain new insights into genome evolution, and provide a foundation for reference sequencing. PMID:23800011
Molecular diversity and evolutionary history of rabies virus strains circulating in the Balkans.

PubMed

McElhinney, L M; Marston, D A; Freuling, C M; Cragg, W; Stankov, S; Lalosevic, D; Lalosevic, V; Müller, T; Fooks, A R

2011-09-01

Molecular studies of European classical rabies viruses (RABV) have revealed a number of geographically clustered lineages. To study the diversity of Balkan RABV, partial nucleoprotein (N) gene sequences were analysed from a unique panel of isolates (n = 210), collected from various hosts between 1972 and 2006. All of the Balkan isolates grouped within the European/Middle East Lineage, with the majority most closely related to East European strains. A number of RABV from Bosnia & Herzegovina and Montenegro, collected between 1986 and 2006, grouped with the West European strains, believed to be responsible for the rabies epizootic that spread throughout Europe in the latter half of the 20th Century. In contrast, no Serbian RABV belonged to this sublineage. However, a distinct group of Serbian fox RABV provided further evidence for the southwards wildlife-mediated movement of rabies from Hungary, Romania and Serbia into Bulgaria. To determine the optimal region for evolutionary analysis, partial, full and concatenated N-gene and glycoprotein (G) gene sequences were compared. Whilst both the divergence times and evolutionary rates were similar irrespective of genomic region, the 95 % highest posterior density (HPD) limits were significantly reduced for full N-gene and concatenated NG-gene sequences compared with partial gene sequences. Bayesian coalescent analysis estimated the date of the most recent common ancestor of the Balkan RABV to be 1885 (95 % HPD, 1852-1913), and skyline plots suggested an expansion of the local viral population in 1980-1990, which coincides with the observed emergence of fox rabies in the region.
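The 95 % HPD limits reported above are the kind of interval usually summarized from Bayesian posterior samples as the narrowest range containing the stated share of draws. The sketch below shows that generic calculation on a list of samples. It is an assumption-laden illustration, not the software or procedure used in the study; `hpd_interval` and the sample values are hypothetical.

```python
import math


def hpd_interval(samples, mass=0.95):
    """Return the narrowest interval containing `mass` of the samples."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    n = len(ordered)
    k = max(1, math.ceil(mass * n))  # number of samples the interval must cover
    best = (ordered[0], ordered[k - 1])
    # Slide a window of k consecutive ordered samples and keep the narrowest.
    for i in range(1, n - k + 1):
        lo, hi = ordered[i], ordered[i + k - 1]
        if hi - lo < best[1] - best[0]:
            best = (lo, hi)
    return best


# Made-up draws of an ancestor date, for illustration only.
draws = [1850, 1870, 1880, 1884, 1886, 1890, 1900, 1915, 1930, 1960]
print(hpd_interval(draws, mass=0.5))
```

A narrower interval for the same mass indicates a more tightly constrained estimate, which is the sense in which the abstract compares full and partial gene sequences.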
Evolutionary Origins and Dynamics of Octoploid Strawberry Subgenomes Revealed by Dense Targeted Capture Linkage Maps

PubMed Central

Tennessen, Jacob A.; Govindarajulu, Rajanikanth; Ashman, Tia-Lynn; Liston, Aaron

2014-01-01

Whole-genome duplications are radical evolutionary events that have driven speciation and adaptation in many taxa. Higher-order polyploids have complex histories often including interspecific hybridization and dynamic genomic changes. This chromosomal reshuffling is poorly understood for most polyploid species, despite their evolutionary and agricultural importance, due to the challenge of distinguishing homologous sequences from each other. Here, we use dense linkage maps generated with targeted sequence capture to improve the diploid strawberry (Fragaria vesca) reference genome and to disentangle the subgenomes of the wild octoploid progenitors of cultivated strawberry, Fragaria virginiana and Fragaria chiloensis. Our novel approach, POLiMAPS (Phylogenetics Of Linkage-Map-Anchored Polyploid Subgenomes), leverages sequence reads to associate informative interhomeolog phylogenetic markers with linkage groups and reference genome positions. In contrast to a widely accepted model, we find that one of the four subgenomes originates with the diploid cytoplasm donor F. vesca, one with the diploid Fragaria iinumae, and two with an unknown ancestor close to F. iinumae. Extensive unidirectional introgression has converted F. iinumae-like subgenomes to be more F. vesca-like, but never the reverse, due either to homoploid hybridization in the F. iinumae-like diploid ancestors or else strong selection spreading F. vesca-like sequence among subgenomes through homeologous exchange. In addition, divergence between homeologous chromosomes has been substantially augmented by interchromosomal rearrangements. Our phylogenetic approach reveals novel aspects of the complicated web of genetic exchanges that occur during polyploid evolution and suggests a path forward for unraveling other agriculturally and ecologically important polyploid genomes. PMID:25477420
Sequencing of small RNAs of the fern Pleopeltis minima (Polypodiaceae) offers insight into the evolution of the microRNA repertoire in land plants.

PubMed

Berruezo, Florencia; de Souza, Flávio S J; Picca, Pablo I; Nemirovsky, Sergio I; Martínez Tosar, Leandro; Rivero, Mercedes; Mentaberry, Alejandro N; Zelada, Alicia M

2017-01-01

MicroRNAs (miRNAs) are short, single stranded RNA molecules that regulate the stability and translation of messenger RNAs in diverse eukaryotic groups. Several miRNA genes are of ancient origin and have been maintained in the genomes of animal and plant taxa for hundreds of millions of years, playing key roles in development and physiology. In the last decade, genome and small RNA (sRNA) sequencing of several plant species have helped unveil the evolutionary history of land plants. Among these, the fern group (monilophytes) occupies a key phylogenetic position, as it represents the closest extant cousin taxon of seed plants, i.e. gymno- and angiosperms. However, in spite of their evolutionary, economic and ecological importance, no fern genome has been sequenced yet and few genomic resources are available for this group. Here, we sequenced the small RNA fraction of an epiphytic South American fern, Pleopeltis minima (Polypodiaceae), and compared it to plant miRNA databases, allowing for the identification of miRNA families that are shared by all land plants, shared by all vascular plants (tracheophytes) or shared by euphyllophytes (ferns and seed plants) only. Using the recently described transcriptome of another fern, Lygodium japonicum, we also estimated the degree of conservation of fern miRNA targets in relation to other plant groups. Our results pinpoint the origin of several miRNA families in the land plant evolutionary tree with more precision and are a resource for future genomic and functional studies of fern miRNAs.
An evolutionary metabolic engineering approach for enhancing lipogenesis in Yarrowia lipolytica.

PubMed

Liu, Leqian; Pan, Anny; Spofford, Caitlin; Zhou, Nijia; Alper, Hal S

2015-05-01

Lipogenic organisms provide an ideal platform for biodiesel and oleochemical production. Through our previous rational metabolic engineering efforts, lipogenesis titers in Yarrowia lipolytica were significantly enhanced. However, the resulting strain still suffered from decreased biomass generation rates. Here, we employ a rapid evolutionary metabolic engineering approach linked with a floating cell enrichment process to improve lipogenesis rates, titers, and yields. Through this iterative process, we were able to ultimately improve yields from our prior strain by 55% to achieve production titers of 39.1g/L with upwards of 76% of the theoretical maximum yield of conversion. Isolated cells were saturated with up to 87% lipid content. An average specific productivity of 0.56g/L/h was achieved with a maximum instantaneous specific productivity of 0.89g/L/h during the lipid production phase in fermentation. Genomic sequencing of the evolved strains revealed a link between a decrease/loss of function mutation of succinate semialdehyde dehydrogenase, uga2, suggesting the importance of gamma-aminobutyric acid assimilation in lipogenesis. This linkage was validated through gene deletion experiments. This work presents an improved host strain that can serve as a platform for efficient oleochemical production. Copyright © 2015 International Metabolic Engineering Society. Published by Elsevier Inc. All rights reserved.
Clonal selection in xenografted TAM recapitulates the evolutionary process of myeloid leukemia in Down syndrome.

PubMed

Saida, Satoshi; Watanabe, Ken-ichiro; Sato-Otsubo, Aiko; Terui, Kiminori; Yoshida, Kenichi; Okuno, Yusuke; Toki, Tsutomu; Wang, RuNan; Shiraishi, Yuichi; Miyano, Satoru; Kato, Itaru; Morishima, Tatsuya; Fujino, Hisanori; Umeda, Katsutsugu; Hiramatsu, Hidefumi; Adachi, Souichi; Ito, Etsuro; Ogawa, Seishi; Ito, Mamoru; Nakahata, Tatsutoshi; Heike, Toshio

2013-05-23

Transient abnormal myelopoiesis (TAM) is a clonal preleukemic disorder that progresses to myeloid leukemia of Down syndrome (ML-DS) through the accumulation of genetic alterations. To investigate the mechanism of leukemogenesis in this disorder, a xenograft model of TAM was established using NOD/Shi-scid, interleukin (IL)-2Rγ(null) mice. Serial engraftment after transplantation of cells from a TAM patient who developed ML-DS a year later demonstrated their self-renewal capacity. A GATA1 mutation and no copy number alterations (CNAs) were detected in the primary patient sample by conventional genomic sequencing and CNA profiling. However, in serial transplantations, engrafted TAM-derived cells showed the emergence of divergent subclones with another GATA1 mutation and various CNAs, including a 16q deletion and 1q gain, which are clinically associated with ML-DS. Detailed genomic analysis identified minor subclones with a 16q deletion or this distinct GATA1 mutation in the primary patient sample. These results suggest that genetically heterogeneous subclones with varying leukemia-initiating potential already exist in the neonatal TAM phase, and ML-DS may develop from a pool of such minor clones through clonal selection. Our xenograft model of TAM may provide unique insight into the evolutionary process of leukemia.
Fish-T1K (Transcriptomes of 1,000 Fishes) Project: large-scale transcriptome data for fish evolution studies.

PubMed

Sun, Ying; Huang, Yu; Li, Xiaofeng; Baldwin, Carole C; Zhou, Zhuocheng; Yan, Zhixiang; Crandall, Keith A; Zhang, Yong; Zhao, Xiaomeng; Wang, Min; Wong, Alex; Fang, Chao; Zhang, Xinhui; Huang, Hai; Lopez, Jose V; Kilfoyle, Kirk; Zhang, Yong; Ortí, Guillermo; Venkatesh, Byrappa; Shi, Qiong

2016-01-01

Ray-finned fishes (Actinopterygii) represent more than 50 % of extant vertebrates and are of great evolutionary, ecologic and economic significance, but they are relatively underrepresented in 'omics studies. Increased availability of transcriptome data for these species will allow researchers to better understand changes in gene expression, and to carry out functional analyses. An international project known as the "Transcriptomes of 1,000 Fishes" (Fish-T1K) project has been established to generate RNA-seq transcriptome sequences for 1,000 diverse species of ray-finned fishes. The first phase of this project has produced transcriptomes from more than 180 ray-finned fishes, representing 142 species and covering 51 orders and 109 families. Here we provide an overview of the goals of this project and the work done so far.
Arctic biodiversity: Increasing richness accompanies shrinking refugia for a cold-associated tundra fauna

USGS Publications Warehouse

Hope, Andrew G.; Waltari, Eric; Malaney, Jason L.; Payer, David C.; Cook, J.A.; Talbot, Sandra L.

2015-01-01

As ancestral biodiversity responded dynamically to late-Quaternary climate changes, so are extant organisms responding to the warming trajectory of the Anthropocene. Ecological predictive modeling, statistical hypothesis tests, and genetic signatures of demographic change can provide a powerful integrated toolset for investigating these biodiversity responses to climate change, and relative resiliency across different communities. Within the biotic province of Beringia, we analyzed specimen localities and DNA sequences from 28 mammal species associated with boreal forest and Arctic tundra biomes to assess both historical distributional and evolutionary responses and then forecasted future changes based on statistical assessments of past and present trajectories, and quantified distributional and demographic changes in relation to major management regions within the study area. We addressed three sets of hypotheses associated with aspects of methodological, biological, and socio-political importance by asking (1) what is the consistency among implications of predicted changes based on the results of both ecological and evolutionary analyses; (2) what are the ecological and evolutionary implications of climate change considering either total regional diversity or distinct communities associated with major biomes; and (3) are there differences in management implications across regions? Our results indicate increasing Arctic richness through time that highlights a potential state shift across the Arctic landscape. However, within distinct ecological communities, we found a predicted decline in the range and effective population size of tundra species into several discrete refugial areas. 
Consistency in results based on a combination of both ecological and evolutionary approaches demonstrates increased statistical confidence by applying cross-discipline comparative analyses to conservation of biodiversity, particularly considering variable management regimes that seek to balance sustainable ecosystems with other anthropogenic values. Refugial areas for cold-adapted taxa appear to be persistent across both warm and cold climate phases and although fragmented, constitute vital regions for persistence of Arctic mammals.
Comparative Genome Sequence Analysis of the Bpa/Str Region in Mouse and Man

PubMed Central

Mallon, A.-M.; Platzer, M.; Bate, R.; Gloeckner, G.; Botcherby, M.R.M.; Nordsiek, G.; Strivens, M.A.; Kioschis, P.; Dangel, A.; Cunningham, D.; Straw, R.N.A.; Weston, P.; Gilbert, M.; Fernando, S.; Goodall, K.; Hunter, G.; Greystrong, J.S.; Clarke, D.; Kimberley, C.; Goerdes, M.; Blechschmidt, K.; Rump, A.; Hinzmann, B.; Mundy, C.R.; Miller, W.; Poustka, A.; Herman, G.E.; Rhodes, M.; Denny, P.; Rosenthal, A.; Brown, S.D.M.

2000-01-01

The progress of human and mouse genome sequencing programs presages the possibility of systematic cross-species comparison of the two genomes as a powerful tool for gene and regulatory element identification. As the opportunities to perform comparative sequence analysis emerge, it is important to develop parameters for such analyses and to examine the outcomes of cross-species comparison. Our analysis used gene prediction and a database search of 430 kb of genomic sequence covering the Bpa/Str region of the mouse X chromosome, and 745 kb of genomic sequence from the homologous human X chromosome region. We identified 11 genes in mouse and 13 genes and two pseudogenes in human. In addition, we compared the mouse and human sequences using pairwise alignment and searches for evolutionary conserved regions (ECRs) exceeding a defined threshold of sequence identity. This approach aided the identification of at least four further putative conserved genes in the region. Comparative sequencing revealed that this region is a mosaic in evolutionary terms, with considerably more rearrangement between the two species than realized previously from comparative mapping studies. Surprisingly, this region showed an extremely high LINE and low SINE content, low G+C content, and yet a relatively high gene density, in contrast to the low gene density usually associated with such regions. [The sequence data described in this paper have been submitted to EMBL under the following accession nos.: Mouse Genomic Sequence: Mouse contig A (AL021127), Mouse contig B (AL049866), BAC41M10 (AL136328), PAC303O11(AL136329). Human Genomic Sequence: Human contig 1 (U82671, U82670), Human contig 2 (U82695).] PMID:10854409
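The search for evolutionary conserved regions (ECRs) above an identity threshold can be pictured as a sliding-window scan over a pairwise alignment. The following is a minimal sketch of that idea, assuming two gapped sequences of equal length; the window size, threshold, and the name `conserved_windows` are illustrative choices and not the parameters used by the authors.

```python
def conserved_windows(aln_a, aln_b, window=10, min_identity=0.8):
    """Return (start, end, identity) for windows at or above the threshold.

    Identity is the fraction of window columns where both sequences carry
    the same non-gap character.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    hits = []
    for start in range(0, len(aln_a) - window + 1):
        cols = zip(aln_a[start:start + window], aln_b[start:start + window])
        matches = sum(1 for a, b in cols if a == b and a != "-")
        identity = matches / window
        if identity >= min_identity:
            hits.append((start, start + window, identity))
    return hits


# Toy alignment (made up): the first five columns are identical.
mouse = "AAAAACCCCC"
human = "AAAAAGGGGG"
print(conserved_windows(mouse, human, window=5, min_identity=1.0))
```

Overlapping hits would normally be merged into longer regions before being reported as candidate conserved elements; that step is left out here for brevity.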
Partial Shotgun Sequencing of the Boechera stricta Genome Reveals Extensive Microsynteny and Promoter Conservation with Arabidopsis

PubMed Central

Windsor, Aaron J.; Schranz, M. Eric; Formanová, Nataša; Gebauer-Jung, Steffi; Bishop, John G.; Schnabelrauch, Domenica; Kroymann, Juergen; Mitchell-Olds, Thomas

2006-01-01

Comparative genomics provides insight into the evolutionary dynamics that shape discrete sequences as well as whole genomes. To advance comparative genomics within the Brassicaceae, we have end sequenced 23,136 medium-sized insert clones from Boechera stricta, a wild relative of Arabidopsis (Arabidopsis thaliana). A significant proportion of these sequences, 18,797, are nonredundant and display highly significant similarity (BLASTn e-value ≤ 10^-30) to low copy number Arabidopsis genomic regions, including more than 9,000 annotated coding sequences. We have used this dataset to identify orthologous gene pairs in the two species and to perform a global comparison of DNA regions 5′ to annotated coding regions. On average, the 500 nucleotides upstream to coding sequences display 71.4% identity between the two species. In a similar analysis, 61.4% identity was observed between 5′ noncoding sequences of Brassica oleracea and Arabidopsis, indicating that regulatory regions are not as diverged among these lineages as previously anticipated. By mapping the B. stricta end sequences onto the Arabidopsis genome, we have identified nearly 2,000 conserved blocks of microsynteny (bracketing 26% of the Arabidopsis genome). A comparison of fully sequenced B. stricta inserts to their homologous Arabidopsis genomic regions indicates that indel polymorphisms >5 kb contribute substantially to the genome size difference observed between the two species. Further, we demonstrate that microsynteny inferred from end-sequence data can be applied to the rapid identification and cloning of genomic regions of interest from nonmodel species. These results suggest that among diploid relatives of Arabidopsis, small- to medium-scale shotgun sequencing approaches can provide rapid and cost-effective benefits to evolutionary and/or functional comparative genomic frameworks. PMID:16607030
Comment on "Protein sequences from mastodon and Tyrannosaurus rex revealed by mass spectrometry".

PubMed

Pevzner, Pavel A; Kim, Sangtae; Ng, Julio

2008-08-22

Asara et al. (Reports, 13 April 2007, p. 280) reported sequencing of Tyrannosaurus rex proteins and used them to establish the evolutionary relationships between birds and dinosaurs. We argue that the reported T. rex peptides may represent statistical artifacts and call for complete data release to enable experimental and computational verification of their findings.
Phylogeny and evolutionary histories of Pyrus L. revealed by phylogenetic trees and networks based on data from multiple DNA sequences

USDA-ARS's Scientific Manuscript database

Reconstructing the phylogeny of Pyrus has been difficult due to the wide distribution of the genus and lack of informative data. In this study, we collected 110 accessions representing 25 Pyrus species and constructed both phylogenetic trees and phylogenetic networks based on multiple DNA sequence d...
Protein sequences from mastodon and Tyrannosaurus rex revealed by mass spectrometry.

PubMed

Asara, John M; Schweitzer, Mary H; Freimark, Lisa M; Phillips, Matthew; Cantley, Lewis C

2007-04-13

Fossilized bones from extinct taxa harbor the potential for obtaining protein or DNA sequences that could reveal evolutionary links to extant species. We used mass spectrometry to obtain protein sequences from bones of a 160,000- to 600,000-year-old extinct mastodon (Mammut americanum) and a 68-million-year-old dinosaur (Tyrannosaurus rex). The presence of T. rex sequences indicates that their peptide bonds were remarkably stable. Mass spectrometry can thus be used to determine unique sequences from ancient organisms from peptide fragmentation patterns, a valuable tool to study the evolution and adaptation of ancient taxa from which genomic sequences are unlikely to be obtained.
Comparative Transcriptomes and EVO-DEVO Studies Depending on Next Generation Sequencing.

PubMed

Liu, Tiancheng; Yu, Lin; Liu, Lei; Li, Hong; Li, Yixue

2015-01-01

High-throughput technology has driven progress in omics studies, including genomics and transcriptomics. We review improvements in comparative omics studies attributable to the high-throughput measurements of next-generation sequencing technology. Comparative genomics has been successfully applied to evolutionary analysis, while comparative transcriptomics is used to compare expression profiles between two subjects by differential expression or differential coexpression, which enables its application in evolutionary developmental biology (EVO-DEVO) studies. EVO-DEVO studies focus on the evolutionary pressures affecting morphogenesis during development, and previous work has sought to identify the most conserved stages of embryonic development. Earlier measurements in these studies were based on morphological similarity at the macroscopic level, whereas new technology enables detection of similarity in molecular mechanisms at the microscopic level. Evolutionary models of embryonic development, including the "funnel-like" model and the "hourglass" model, have been evaluated by combining these new comparative transcriptomic methods with prior comparative genomic information. Although the technology has brought EVO-DEVO studies into a new era, technological and material limitations still exist, and further investigations require more careful study design and procedures.

Evolutionary and molecular foundations of multiple contemporary functions of the nitroreductase superfamily

PubMed Central

Akiva, Eyal; Copp, Janine N.; Tokuriki, Nobuhiko; Babbitt, Patricia C.

2017-01-01

Insight regarding how diverse enzymatic functions and reactions have evolved from ancestral scaffolds is fundamental to understanding chemical and evolutionary biology, and for the exploitation of enzymes for biotechnology. We undertook an extensive computational analysis using a unique and comprehensive combination of tools that include large-scale phylogenetic reconstruction to determine the sequence, structural, and functional relationships of the functionally diverse flavin mononucleotide-dependent nitroreductase (NTR) superfamily (>24,000 sequences from all domains of life, 54 structures, and >10 enzymatic functions). Our results suggest an evolutionary model in which contemporary subgroups of the superfamily have diverged in a radial manner from a minimal flavin-binding scaffold. We identified the structural design principle for this divergence: Insertions at key positions in the minimal scaffold that, combined with the fixation of key residues, have led to functional specialization. These results will aid future efforts to delineate the emergence of functional diversity in enzyme superfamilies, provide clues for functional inference for superfamily members of unknown function, and facilitate rational redesign of the NTR scaffold. PMID:29078300
Bats, Primates, and the Evolutionary Origins and Diversification of Mammalian Gammaherpesviruses

PubMed Central

Rojas-Anaya, Edith; Kolokotronis, Sergios-Orestis; Taboada, Blanca; Loza-Rubio, Elizabeth; Méndez-Ojeda, Maria L.; Osterrieder, Nikolaus

2016-01-01

Gammaherpesviruses (γHVs) are generally considered host specific and to have codiverged with their hosts over millions of years. This tenet is challenged here by broad-scale phylogenetic analysis of two viral genes using the largest sample of mammalian γHVs to date, integrating for the first time bat γHV sequences available from public repositories and newly generated viral sequences from two vampire bat species (Desmodus rotundus and Diphylla ecaudata). Bat and primate viruses frequently represented deep branches within the supported phylogenies and clustered among viruses from distantly related mammalian taxa. Following evolutionary scenario testing, we determined the number of host-switching and cospeciation events. Cross-species transmissions have occurred much more frequently than previously estimated, and most of the transmissions were attributable to bats and primates. We conclude that the evolution of the Gammaherpesvirinae subfamily has been driven by both cross-species transmissions and subsequent cospeciation within specific viral lineages and that the bat and primate orders may have potentially acted as superspreaders to other mammalian taxa throughout evolutionary history. PMID:27834200
Regulation of G-protein coupled receptor traffic by an evolutionary conserved hydrophobic signal.

PubMed

Angelotti, Tim; Daunt, David; Shcherbakova, Olga G; Kobilka, Brian; Hurt, Carl M

2010-04-01

Plasma membrane (PM) expression of G-protein coupled receptors (GPCRs) is required for activation by extracellular ligands; however, mechanisms that regulate PM expression of GPCRs are poorly understood. For some GPCRs, such as alpha2c-adrenergic receptors (alpha(2c)-ARs), heterologous expression in non-native cells results in limited PM expression and extensive endoplasmic reticulum (ER) retention. Recently, ER export/retention signals have been proposed to regulate cellular trafficking of several GPCRs. By utilizing a chimeric alpha(2a)/alpha(2c)-AR strategy, we identified an evolutionarily conserved hydrophobic sequence (ALAAALAAAAA) in the extracellular amino terminal region that is responsible in part for alpha(2c)-AR subtype-specific trafficking. To our knowledge, this is the first luminal ER retention signal reported for a GPCR. Removal or disruption of the ER retention signal dramatically increased PM expression and decreased ER retention. Conversely, transplantation of this hydrophobic sequence into alpha(2a)-ARs reduced their PM expression and increased ER retention. This evolutionarily conserved hydrophobic trafficking signal within alpha(2c)-ARs serves as a regulator of GPCR trafficking.
Functional and evolutionary relationships between bacteriorhodopsin and halorhodopsin in the archaebacterium, Halobacterium halobium

NASA Technical Reports Server (NTRS)

Lanyi, J. K.

1986-01-01

The archaebacteria occupy a unique place in phylogenetic trees constructed from analyses of sequences from key informational macromolecules, and their study continues to yield interesting ideas on the early evolution and divergence of biological forms. It is now known that the halobacteria among these species contain various retinal-proteins, resembling eukaryotic rhodopsins, but with different functions. Two of these pigments, located in the cytoplasmic membranes of the bacteria, are bacteriorhodopsin (a light-driven proton pump) and halorhodopsin (a light-driven chloride pump). Comparison of these systems is expected to reveal structure/function relationships in these simple (primitive?) energy transducing membrane components and evolutionary relationships which had produced the structural features which allow the divergent functions. Findings indicate that very different primary structures are needed for these proteins to accomplish their different functions. Indeed, analysis of partial amino acid sequences from halo-opsin shows already that few if any long segments exist which are homologous to bacterio-opsin. Either these proteins diverged a very long time ago to allow for the observed differences, or the evolutionary clock in the halobacteria runs faster than usual.
Ecological and evolutionary genomics of marine photosynthetic organisms.

PubMed

Coelho, Susana M; Simon, Nathalie; Ahmed, Sophia; Cock, J Mark; Partensky, Frédéric

2013-02-01

Environmental (ecological) genomics aims to understand the genetic basis of relationships between organisms and their abiotic and biotic environments. It is a rapidly progressing field of research largely due to recent advances in the speed and volume of genomic data being produced by next generation sequencing (NGS) technologies. Building on information generated by NGS-based approaches, functional genomic methodologies are being applied to identify and characterize genes and gene systems of both environmental and evolutionary relevance. Marine photosynthetic organisms (MPOs) were poorly represented amongst the early genomic models, but this situation is changing rapidly. Here we provide an overview of the recent advances in the application of ecological genomic approaches to both prokaryotic and eukaryotic MPOs. We describe how these approaches are being used to explore the biology and ecology of marine cyanobacteria and algae, particularly with regard to their functions in a broad range of marine ecosystems. Specifically, we review the ecological and evolutionary insights gained from whole genome and transcriptome sequencing projects applied to MPOs and illustrate how their genomes are yielding information on the specific features of these organisms. © 2012 Blackwell Publishing Ltd.
Monitoring Observations of H2O and SiO Masers Toward Post-AGB Stars

NASA Astrophysics Data System (ADS)

Kim, Jaeheon; Cho, Se-Hyung; Yoon, Dong-Hwan

2016-12-01

We present the results of simultaneous monitoring observations of H_2O 6_{1,6}-5_{2,3} (22 GHz) and SiO J=1-0, 2-1, 3-2 maser lines (43, 86, 129 GHz) toward five post-AGB (candidate) stars, using the 21-m single-dish telescopes of the Korean VLBI Network. Depending on the target objects, 7 - 11 epochs of data were obtained. We detected both H_2O and SiO maser lines from four sources: OH16.1-0.3, OH38.10-0.13, OH65.5+1.3, and IRAS 19312+1950. We could not detect H_2O maser emission toward OH13.1+5.1 between the late OH/IR and post-AGB stage. The detected H_2O masers show typical double-peaked line profiles. The SiO masers from four sources, except IRAS 19312+1950, show the peaks around the stellar velocity as a single peak, whereas the SiO masers from IRAS 19312+1950 occur above the red peak of the H_2O maser. We analyzed the properties of detected maser lines, and investigated their evolutionary state through comparison with the full widths at zero power. The distribution of observed target sources was also investigated in the IRAS two-color diagram in relation with the evolutionary stage of post-AGB stars. From our analyses, the evolutionary sequence of observed sources is suggested as OH65.5+1.3 → OH13.1+5.1 → OH16.1-0.3 → OH38.10-0.13, except for IRAS 19312+1950. In addition, OH13.1+5.1 from which the H_2O maser has not been detected is suggested to be on the gateway toward the post-AGB stage. With respect to the enigmatic object, IRAS 19312+1950, we could not clearly figure out its nature. To properly explain the unusual phenomena of SiO and H_2O masers, it is essential to establish the relative locations and spatial distributions of two masers using VLBI technique. We also include the 1.2 - 160 μm spectral energy distribution using photometric data from the following surveys: 2MASS, WISE, MSX, IRAS, and AKARI (IRC and FIS). 
In addition, from the IRAS LRS spectra, we found that the depth of silicate absorption features shows significant variations depending on the evolutionary sequence, associated with the termination of AGB phase mass-loss.
On the Evolutionary Phase and Mass Loss of the Wolf-Rayet-like Stars in R136a

NASA Astrophysics Data System (ADS)

de Koter, Alex; Heap, Sara R.; Hubeny, Ivan

1997-03-01

We report on a systematic study of the most massive stars, in which we analyzed the spectra of four very luminous stars in the Large Magellanic Cloud. The stars lie in the 30 Doradus complex: three are located in the core of the compact cluster R136a (R136a1, R136a3, and R136a5), and the fourth (Melnick 42) is located about 8" north of R136a. Low-resolution spectra (<200 km s^-1) of these four stars were obtained with the GHRS and FOS spectrographs on the Hubble Space Telescope. The GHRS spectra cover the spectral range from 1200 to 1750 Å, and the FOS spectra from 3200 to 6700 Å. We derived the fundamental parameters of these stars by fitting the observations with model spectra calculated with the "ISA-WIND" code of de Koter et al. We find that all four stars are very hot (~45 kK), luminous, and rich in hydrogen. Their positions on the HR diagram imply that they are stars with masses in the range 60-90 M⊙ that are 2 million years old at most, and hence they are O-type main-sequence stars still in the core H-burning phase of evolution. Nevertheless, the spectra of two of the stars (R136a1, R136a3) mimic those of Wolf-Rayet stars in showing very strong He II emission lines. According to our calculations, this emission is a natural consequence of a very high mass-loss rate. We conjecture that the most massive stars in R136a (those with initial masses of ~100 M⊙ or more) are born as WR-like stars and that the high mass loss may perhaps be connected to the actual stellar formation process. Because the observed mass-loss rates are up to 3 times higher than assumed by evolutionary models, the main-sequence and post-main-sequence tracks of these stars will be qualitatively different from current models. The mass-loss rate is 3.5-8 times that predicted by the analytical solutions for radiation-driven winds of Kudritzki et al. (1989). 
However, using sophisticated Monte Carlo calculations of radiative driving in unified model atmospheres, we show that, while we cannot say for sure what initiates the wind, radiation pressure is probably sufficient to accelerate the wind to its observed terminal velocity, if one accounts for the effects of multiple photon scattering in the dense winds of the investigated stars.
Radiation transfer of models of massive star formation. III. The evolutionary sequence

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhang, Yichen; Tan, Jonathan C.; Hosokawa, Takashi

2014-06-20

We present radiation transfer simulations of evolutionary sequences of massive protostars forming from massive dense cores in environments of high mass surface densities, based on the Turbulent Core Model. The protostellar evolution is calculated with a multi-zone numerical model, with the accretion rate regulated by feedback from an evolving disk wind outflow cavity. The disk evolution is calculated assuming a fixed ratio of disk to protostellar mass, while the core envelope evolution assumes an inside-out collapse of the core with a fixed outer radius. In this framework, an evolutionary track is determined by three environmental initial conditions: the core mass M_c, the mass surface density of the ambient clump Σ_cl, and the ratio of the core's initial rotational to gravitational energy β_c. Evolutionary sequences with various M_c, Σ_cl, and β_c are constructed. We find that in a fiducial model with M_c = 60 M_☉, Σ_cl = 1 g cm^-2, and β_c = 0.02, the final mass of the protostar reaches at least ∼26 M_☉, making the final star formation efficiency ≳ 0.43. For each of the evolutionary tracks, radiation transfer simulations are performed at selected stages, with temperature profiles, spectral energy distributions (SEDs), and multiwavelength images produced. At a given stage, the envelope temperature depends strongly on Σ_cl, with higher temperatures in a higher Σ_cl core, but only weakly on M_c. The SED and MIR images depend sensitively on the evolving outflow cavity, which gradually widens as the protostar grows. The fluxes at ≲ 100 μm increase dramatically, and the far-IR peaks move to shorter wavelengths. The influence of Σ_cl and β_c (which determines disk size) is discussed. 
We find that, despite scatter caused by different M_c, Σ_cl, β_c, and inclinations, sources at a given evolutionary stage appear in similar regions of color-color diagrams, especially when using colors with fluxes at ≳ 70 μm, where scatter due to inclination is minimized, implying that such diagrams can be useful diagnostic tools for identifying the evolutionary stages of massive protostars. We discuss how intensity profiles along or perpendicular to the outflow axis are affected by environmental conditions and source evolution and can thus act as additional diagnostics of the massive star formation process.
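The fiducial numbers in this abstract can be checked with one line of arithmetic. The sketch below assumes that the star formation efficiency is defined as the final protostellar mass divided by the initial core mass; that definition is inferred from the quoted figures (26 / 60 ≈ 0.43) and is not spelled out in the abstract. The function and variable names are illustrative.

```python
def star_formation_efficiency(final_star_mass_msun: float,
                              initial_core_mass_msun: float) -> float:
    """Ratio of final stellar mass to initial core mass (both in solar masses).

    Assumption: this ratio is what the abstract calls the final star
    formation efficiency.
    """
    if initial_core_mass_msun <= 0:
        raise ValueError("initial core mass must be positive")
    return final_star_mass_msun / initial_core_mass_msun


# Fiducial model values quoted in the abstract: M_c = 60, final mass of at least ~26.
fiducial_efficiency = star_formation_efficiency(26.0, 60.0)
print(f"efficiency >= {fiducial_efficiency:.2f}")
```

Because the abstract gives the final mass as a lower bound, the resulting efficiency is likewise a lower bound.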
Identification of a current hot spot of HIV type 1 transmission in Mongolia by molecular epidemiological analysis.

PubMed

Davaalkham, Jagdagsuren; Unenchimeg, Puntsag; Baigalmaa, Chultem; Erdenetuya, Gombo; Nyamkhuu, Dulmaa; Shiino, Teiichiro; Tsuchiya, Kiyoto; Hayashida, Tsunefusa; Gatanaga, Hiroyuki; Oka, Shinichi

2011-10-01

We investigated the current molecular epidemiological status of HIV-1 in Mongolia, a country with very low incidence of HIV-1 though with rapid expansion in recent years. HIV-1 pol (1065 nt) and env (447 nt) genes were sequenced to construct phylogenetic trees. The evolutionary rates, molecular clock phylogenies, and other evolutionary parameters were estimated from heterochronous genomic sequences of HIV-1 subtype B by the Bayesian Markov chain Monte Carlo method. We obtained 41 sera from 56 reported HIV-1-positive cases as of May 2009. The main route of infection was men who have sex with men (MSM). Dominant subtypes were subtype B in 32 cases (78%) followed by subtype CRF02_AG (9.8%). The phylogenetic analysis of the pol gene identified two clusters in subtype B sequences. Cluster 1 consisted of 21 cases including MSM and other routes of infection, and cluster 2 consisted of eight MSM cases. The tree analyses demonstrated very short branch lengths in cluster 1, suggesting a surprisingly active expansion of HIV-1 transmission during a short period with the same ancestor virus. Evolutionary analysis indicated that the outbreak started around the early 2000s. This study identified a current hot spot of HIV-1 transmission and potential seed of the epidemic in Mongolia. Comprehensive preventive measures targeting this group are urgently needed.
Comparative Transcriptomics of Strawberries (Fragaria spp.) Provides Insights into Evolutionary Patterns.

PubMed

Qiao, Qin; Xue, Li; Wang, Qia; Sun, Hang; Zhong, Yang; Huang, Jinling; Lei, Jiajun; Zhang, Ticao

2016-01-01

Multiple closely related species with genomic sequences provide an ideal system for studies on comparative and evolutionary genomics, as well as the mechanism of speciation. The whole genome sequences of six strawberry species (Fragaria spp.) have been released, which provide one of the richest genomic resources of any plant genus. In this study, we first generated seven transcriptome sequences of Fragaria species de novo, with a total of 48,557-82,537 unigenes per species. Combined with 13 other species genomes in Rosales, we reconstructed a phylogenetic tree at the genomic level. The phylogenetic tree shows that Fragaria grouped closely with Rubus and that the Fragaria clade is divided into three subclades. East Asian species appeared in every subclade, suggesting that the genus originated in this area at ∼7.99 Mya. Four species found in mountains of Southwest China originated at ∼3.98 Mya, suggesting that rapid speciation occurred to adapt to changing environments following the uplift of the Qinghai-Tibet Plateau. Moreover, we identified 510 very significantly positively selected genes in the cultivated species F. × ananassa genome. This set of genes was enriched in functions related to specific agronomic traits, such as carbon metabolism and plant hormone signal transduction processes, which are directly related to fruit quality and flavor. These findings illustrate comprehensive evolutionary patterns in Fragaria and the genetic basis of fruit domestication of cultivated strawberry at the genomic/transcriptomic level.
Classification and Lineage Tracing of SH2 Domains Throughout Eukaryotes.

PubMed

Liu, Bernard A

2017-01-01

Today there exists a rapidly expanding number of sequenced genomes. Cataloging protein interaction domains such as the Src Homology 2 (SH2) domain across these various genomes can be accomplished with ease due to existing algorithms and prediction models. An evolutionary analysis of SH2 domains provides a step towards understanding how SH2 proteins integrated with existing signaling networks to position phosphotyrosine signaling as a crucial driver of robust cellular communication networks in metazoans. However, organizing and tracing SH2 domains across organisms and understanding their evolutionary trajectory remain a challenge. This chapter describes several methodologies for analyzing the evolutionary trajectory of SH2 domains, including a global SH2 domain classification system, which facilitates annotation of new SH2 sequences essential for tracing the lineage of SH2 domains throughout eukaryote evolution. This classification utilizes a combination of sequence homology, protein domain architecture, and the boundary positions between introns and exons within the SH2 domain or genes encoding these domains. Discrete SH2 families can then be traced across various genomes to provide insight into their origins. Furthermore, additional methods for examining potential mechanisms for divergence of SH2 domains, from structural changes to alterations in the protein domain content and genome duplication, will be discussed. Therefore, a better understanding of SH2 domain evolution may enhance our insight into the emergence of phosphotyrosine signaling and the expansion of protein interaction domains.
One pedigree we all may have come from - did Adam and Eve have the chromosome 2 fusion?

PubMed

Stankiewicz, Paweł

2016-01-01

In contrast to Great Apes, who have 48 chromosomes, modern humans and likely Neandertals and Denisovans have and had, respectively, 46 chromosomes. The reduction in chromosome number was caused by the head-to-head fusion of two ancestral chromosomes to form human chromosome 2 (HSA2) and may have contributed to the reproductive barrier with Great Apes. Next generation sequencing and molecular clock analyses estimated that this fusion arose prior to our last common ancestor with Neandertal and Denisovan hominins ~ 0.74 - 4.5 million years ago. I propose that, unlike recurrent Robertsonian translocations in humans, the HSA2 fusion was a single nonrecurrent event that spread through a small polygamous clan population bottleneck. Its heterozygous to homozygous conversion, fixation, and accumulation in the succeeding populations was likely facilitated by an evolutionary advantage through the genomic loss rather than deregulation of expression of the gene(s) flanking the HSA2 fusion site at 2q13. The origin of HSA2 might have been a critical evolutionary event influencing higher cognitive functions in various early subspecies of hominins. Next generation sequencing of Homo heidelbergensis and Homo erectus genomes and complete reconstruction of DNA sequence of the orthologous subtelomeric chromosomes in Great Apes should enable more precise timing of HSA2 formation and better understanding of its evolutionary consequences.
Carnivore-specific SINEs (Can-SINEs): distribution, evolution, and genomic impact.

PubMed

Walters-Conte, Kathryn B; Johnson, Diana L E; Allard, Marc W; Pecon-Slattery, Jill

2011-01-01

Short interspersed nuclear elements (SINEs) are a type of class 1 transposable element (retrotransposon) with features that allow investigators to resolve evolutionary relationships between populations and species while providing insight into genome composition and function. Characterization of a Carnivora-specific SINE family, Can-SINEs, has aided comparative genomic studies by providing rare genomic changes and neutral sequence variants often needed to resolve difficult evolutionary questions. In addition, Can-SINEs constitute a significant source of functional diversity within Carnivora. Publication of the whole-genome sequences of domestic dog, domestic cat, and giant panda serves as a valuable resource in comparative genomic inferences gleaned from Can-SINEs. In anticipation of forthcoming studies bolstered by new genomic data, this review describes the discovery and characterization of Can-SINE motifs as well as their composition, distribution, and effect on genome function. As the contribution of noncoding sequences to genomic diversity becomes more apparent, SINEs and other transposable elements will play an increasingly large role in mammalian comparative genomics.
Carnivore-Specific SINEs (Can-SINEs): Distribution, Evolution, and Genomic Impact

PubMed Central

Johnson, Diana L.E.; Allard, Marc W.; Pecon-Slattery, Jill

2011-01-01

Short interspersed nuclear elements (SINEs) are a type of class 1 transposable element (retrotransposon) with features that allow investigators to resolve evolutionary relationships between populations and species while providing insight into genome composition and function. Characterization of a Carnivora-specific SINE family, Can-SINEs, has aided comparative genomic studies by providing rare genomic changes and neutral sequence variants often needed to resolve difficult evolutionary questions. In addition, Can-SINEs constitute a significant source of functional diversity within Carnivora. Publication of the whole-genome sequences of domestic dog, domestic cat, and giant panda serves as a valuable resource in comparative genomic inferences gleaned from Can-SINEs. In anticipation of forthcoming studies bolstered by new genomic data, this review describes the discovery and characterization of Can-SINE motifs as well as their composition, distribution, and effect on genome function. As the contribution of noncoding sequences to genomic diversity becomes more apparent, SINEs and other transposable elements will play an increasingly large role in mammalian comparative genomics. PMID:21846743
Genotyping of ancient Mycobacterium tuberculosis strains reveals historic genetic diversity.

PubMed

Müller, Romy; Roberts, Charlotte A; Brown, Terence A

2014-04-22

The evolutionary history of the Mycobacterium tuberculosis complex (MTBC) has previously been studied by analysis of sequence diversity in extant strains, but not addressed by direct examination of strain genotypes in archaeological remains. Here, we use ancient DNA sequencing to type 11 single nucleotide polymorphisms and two large sequence polymorphisms in the MTBC strains present in 10 archaeological samples from skeletons from Britain and Europe dating to the second-nineteenth centuries AD. The results enable us to assign the strains to groupings and lineages recognized in the extant MTBC. We show that at least during the eighteenth-nineteenth centuries AD, strains of M. tuberculosis belonging to different genetic groups were present in Britain at the same time, possibly even at a single location, and we present evidence for a mixed infection in at least one individual. Our study shows that ancient DNA typing applied to multiple samples can provide sufficiently detailed information to contribute to both archaeological and evolutionary knowledge of the history of tuberculosis.
Sequencing of the sea lamprey (Petromyzon marinus) genome provides insights into vertebrate evolution

PubMed Central

Smith, Jeramiah J; Kuraku, Shigehiro; Holt, Carson; Sauka-Spengler, Tatjana; Jiang, Ning; Campbell, Michael S; Yandell, Mark D; Manousaki, Tereza; Meyer, Axel; Bloom, Ona E; Morgan, Jennifer R; Buxbaum, Joseph D; Sachidanandam, Ravi; Sims, Carrie; Garruss, Alexander S; Cook, Malcolm; Krumlauf, Robb; Wiedemann, Leanne M; Sower, Stacia A; Decatur, Wayne A; Hall, Jeffrey A; Amemiya, Chris T; Saha, Nil R; Buckley, Katherine M; Rast, Jonathan P; Das, Sabyasachi; Hirano, Masayuki; McCurley, Nathanael; Guo, Peng; Rohner, Nicolas; Tabin, Clifford J; Piccinelli, Paul; Elgar, Greg; Ruffier, Magali; Aken, Bronwen L; Searle, Stephen MJ; Muffato, Matthieu; Pignatelli, Miguel; Herrero, Javier; Jones, Matthew; Brown, C Titus; Chung-Davidson, Yu-Wen; Nanlohy, Kaben G; Libants, Scot V; Yeh, Chu-Yin; McCauley, David W; Langeland, James A; Pancer, Zeev; Fritzsch, Bernd; de Jong, Pieter J; Zhu, Baoli; Fulton, Lucinda L; Theising, Brenda; Flicek, Paul; Bronner, Marianne E; Warren, Wesley C; Clifton, Sandra W; Wilson, Richard K; Li, Weiming

2013-01-01

Lampreys are representatives of an ancient vertebrate lineage that diverged from our own ~500 million years ago. By virtue of this deeply shared ancestry, the sea lamprey (P. marinus) genome is uniquely poised to provide insight into the ancestry of vertebrate genomes and the underlying principles of vertebrate biology. Here, we present the first lamprey whole-genome sequence and assembly. We note challenges faced owing to its high content of repetitive elements and GC bases, as well as the absence of broad-scale sequence information from closely related species. Analyses of the assembly indicate that two whole-genome duplications likely occurred before the divergence of ancestral lamprey and gnathostome lineages. Moreover, the results help define key evolutionary events within vertebrate lineages, including the origin of myelin-associated proteins and the development of appendages. The lamprey genome provides an important resource for reconstructing vertebrate origins and the evolutionary events that have shaped the genomes of extant organisms. PMID:23435085
Phylogenetic relationships of bears (the Ursidae) inferred from mitochondrial DNA sequences.

PubMed

Zhang, Y P; Ryder, O A

1994-12-01

The phylogenetic relationships among some bear species are still open questions. We present here mitochondrial DNA sequences of D-loop region, cytochrome b, 12S rRNA, tRNA(Pro), and tRNA(Thr) genes from all bear species and the giant panda. A series of evolutionary trees with concordant topology has been derived based on the combined data set of all of the mitochondrial DNA sequences, which may have resolved the evolutionary relationships of all bear species: the ancestor of the spectacled bear diverged first, followed by the sloth bear; the brown bear and polar bear are sister taxa relative to the Asiatic black bear; the closest relative of the American black bear is the sun bear. Primers for forensic identification of the giant panda and bears are proposed. Analysis of these data, in combination with data from primates and antelopes, suggests that relative substitutional rates between different mitochondrial DNA regions may vary greatly among different taxa of the vertebrates.
Quantitative analysis of RNA-protein interactions on a massively parallel array for mapping biophysical and evolutionary landscapes

PubMed Central

Buenrostro, Jason D.; Chircus, Lauren M.; Araya, Carlos L.; Layton, Curtis J.; Chang, Howard Y.; Snyder, Michael P.; Greenleaf, William J.

2015-01-01

RNA-protein interactions drive fundamental biological processes and are targets for molecular engineering, yet quantitative and comprehensive understanding of the sequence determinants of affinity remains limited. Here we repurpose a high-throughput sequencing instrument to quantitatively measure binding and dissociation of MS2 coat protein to >10^7 RNA targets generated on a flow-cell surface by in situ transcription and inter-molecular tethering of RNA to DNA. We decompose the binding energy contributions from primary and secondary RNA structure, finding that differences in affinity are often driven by sequence-specific changes in association rates. By analyzing the biophysical constraints and modeling mutational paths describing the molecular evolution of MS2 from low- to high-affinity hairpins, we quantify widespread molecular epistasis, and a long-hypothesized structure-dependent preference for G:U base pairs over C:A intermediates in evolutionary trajectories. Our results suggest that quantitative analysis of RNA on a massively parallel array (RNAMaP) relationships across molecular variants. PMID:24727714
Ancient Recombination Events between Human Herpes Simplex Viruses

PubMed Central

Burrel, Sonia; Boutolleau, David; Ryu, Diane; Agut, Henri; Merkel, Kevin; Leendertz, Fabian H.

2017-01-01

Herpes simplex viruses 1 and 2 (HSV-1 and HSV-2) are seen as close relatives but also unambiguously considered as evolutionarily independent units. Here, we sequenced the genomes of 18 HSV-2 isolates characterized by divergent UL30 gene sequences to further elucidate the evolutionary history of this virus. Surprisingly, genome-wide recombination analyses showed that all HSV-2 genomes sequenced to date contain HSV-1 fragments. Using phylogenomic analyses, we could also show that two main HSV-2 lineages exist. One lineage is mostly restricted to sub-Saharan Africa whereas the other has reached a global distribution. Interestingly, only the worldwide lineage is characterized by ancient recombination events with HSV-1. Our findings highlight the complexity of HSV-2 evolution, a virus of putative zoonotic origin which later recombined with its human-adapted relative. They also suggest that coinfections with HSV-1 and 2 may have genomic and potentially functional consequences and should therefore be monitored more closely. PMID:28369565
Markov-modulated Markov chains and the covarion process of molecular evolution.

PubMed

Galtier, N; Jean-Marie, A

2004-01-01

The covarion (or site specific rate variation, SSRV) process of biological sequence evolution is a process by which the evolutionary rate of a nucleotide/amino acid/codon position can change in time. In this paper, we introduce time-continuous, space-discrete, Markov-modulated Markov chains as a model for representing SSRV processes, generalizing existing theory to any model of rate change. We propose a fast algorithm for diagonalizing the generator matrix of relevant Markov-modulated Markov processes. This algorithm makes phylogeny likelihood calculation tractable even for a large number of rate classes and a large number of states, so that SSRV models become applicable to amino acid or codon sequence datasets. Using this algorithm, we investigate the accuracy of the discrete approximation to the Gamma distribution of evolutionary rates, widely used in molecular phylogeny. We show that a relatively large number of classes is required to achieve accurate approximation of the exact likelihood when the number of analyzed sequences exceeds 20, both under the SSRV and among site rate variation (ASRV) models.
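The discrete approximation to the Gamma distribution of rates mentioned in this abstract can be illustrated with a short sketch. This is not the authors' diagonalization algorithm; it is a sampling-based illustration that assumes a mean-one Gamma distribution with shape `alpha` and equal-probability rate classes, each represented by its mean rate. The function name and parameters are hypothetical.

```python
import random

def discrete_gamma_rates(alpha, n_classes, n_samples=200_000, seed=0):
    """Approximate the mean rate of each equal-probability class of a
    mean-one Gamma distribution (shape alpha, scale 1/alpha)."""
    rng = random.Random(seed)
    # random.gammavariate takes (shape, scale); scale = 1/alpha gives mean 1.
    samples = sorted(rng.gammavariate(alpha, 1.0 / alpha) for _ in range(n_samples))
    size = n_samples // n_classes  # any leftover samples at the top are ignored
    rates = [sum(samples[i * size:(i + 1) * size]) / size for i in range(n_classes)]
    # Renormalise so the average rate over classes is exactly 1.
    mean = sum(rates) / n_classes
    return [r / mean for r in rates]

rates = discrete_gamma_rates(alpha=0.5, n_classes=4)
```

Increasing `n_classes` makes the step function follow the continuous distribution more closely, which is the trade-off between accuracy and cost that the abstract examines.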

The current status of REH theory. [Random Evolutionary Hits in biological molecular evolution]

NASA Technical Reports Server (NTRS)

Holmquist, R.; Jukes, T. H.

1981-01-01

A response is made to the evaluation of Fitch (1980) of REH (random evolutionary hits) theory for the evolutionary divergence of proteins and nucleic acids. Correct calculations for the beta hemoglobin mRNAs of the human, mouse and rabbit in the absence and presence of selective constraints are summarized, and it is shown that the alternative evolutionary analysis of Fitch underestimates the total fixed mutations. It is further shown that the model used by Fitch to test for the completeness of the count of total base substitutions is in fact a variant of REH theory. Considerations of the variance inherent in evolutionary estimations are also presented which show the REH model to produce no more variance than other evolutionary models. In the reply, it is argued that, despite the objections raised, REH theory applied to proteins gives inaccurate estimates of total gene substitutions. It is further contended that REH theory developed for nucleic sequences suffers from problems relating to the frequency of nucleotide substitutions, the identity of the codons accepting silent and amino acid-changing substitutions, and estimate uncertainties.
Genome-wide analysis of the cellulose synthase-like (Csl) gene family in bread wheat (Triticum aestivum L.).

PubMed

Kaur, Simerjeet; Dhugga, Kanwarpal S; Beech, Robin; Singh, Jaswinder

2017-11-03

Hemicelluloses are a diverse group of complex, non-cellulosic polysaccharides, which constitute approximately one-third of the plant cell wall and find use as dietary fibres, food additives and raw materials for biofuels. Genes involved in hemicellulose synthesis have not been extensively studied in small grain cereals. In efforts to isolate the sequences for the cellulose synthase-like (Csl) gene family from wheat, we identified 108 genes (hereafter referred to as TaCsl). Each gene was represented by two to three homeoalleles, which are named as TaCslXY_ZA, TaCslXY_ZB, or TaCslXY_ZD, where X denotes the Csl subfamily, Y the gene number and Z the wheat chromosome where it is located. A quarter of these genes were predicted to have 2 to 3 splice variants, resulting in a total of 137 putative translated products. Approximately 45% of TaCsl genes were located on chromosomes 2 and 3. Sequences from the subfamilies C and D were interspersed between the dicots and grasses but those from subfamily A clustered within each group of plants. Proximity of the dicot-specific subfamilies B and G, to the grass-specific subfamilies H and J, respectively, points to their common origin. In silico expression analysis in different tissues revealed that most of the genes were expressed ubiquitously and some were tissue-specific. More than half of the genes had introns in phase 0, one-third in phase 2, and a few in phase 1. Detailed characterization of the wheat Csl genes has enhanced the understanding of their structural, functional, and evolutionary features. This information will be helpful in designing experiments for genetic manipulation of hemicellulose synthesis with the goal of developing improved cultivars for biofuel production and increased tolerance against various stresses.
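The intron phases reported in this abstract follow the usual convention: an intron is in phase 0 if it lies between two codons, phase 1 if it falls after the first base of a codon, and phase 2 if it falls after the second. A minimal sketch of that bookkeeping is given below; the exon lengths are made-up illustrative values, not data from the wheat Csl genes, and the function name is hypothetical.

```python
def intron_phases(coding_exon_lengths):
    """Return the phase (0, 1 or 2) of each intron, given the lengths of
    the coding portions of consecutive exons in transcript order."""
    phases = []
    upstream = 0  # coding bases seen before the current intron
    for length in coding_exon_lengths[:-1]:  # no intron after the last exon
        upstream += length
        phases.append(upstream % 3)
    return phases

# Hypothetical gene with three coding exons of 99, 100 and 101 bp.
example = intron_phases([99, 100, 101])  # [0, 1]
```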
Adaptive molecular evolution of the two-pore channel 1 gene TPC1 in the karst-adapted genus Primulina (Gesneriaceae)

PubMed Central

Tao, Junjie; Feng, Chao; Ai, Bin; Kang, Ming

2016-01-01

Background and Aims: Limestone karst areas possess high floral diversity and endemism. The genus Primulina, which contributes to the unique calcicole flora, has high species richness and exhibits specific soil-based habitat associations that are mainly distributed on calcareous karst soils. The adaptive molecular evolutionary mechanism of the genus to karst calcium-rich environments is still not well understood. The Ca2+-permeable channel TPC1 was used in this study to test whether its gene is involved in the local adaptation of Primulina to karst high-calcium soil environments. Methods: Specific amplification and sequencing primers were designed and used to amplify the full-length coding sequences of TPC1 from cDNA of 76 Primulina species. The sequence alignment without recombination and the corresponding reconstructed phylogeny tree were used in molecular evolutionary analyses at the nucleic acid level and amino acid level, respectively. Finally, the identified sites under positive selection were labelled on the predicted secondary structure of TPC1. Key Results: Seventy-six full-length coding sequences of Primulina TPC1 were obtained. The length of the sequences varied between 2220 and 2286 bp and the insertion/deletion was located at the 5′ end of the sequences. No signal of substitution saturation was detected in the sequences, while significant recombination breakpoints were detected. The molecular evolutionary analyses showed that TPC1 was dominated by purifying selection and the selective pressures were not significantly different among species lineages. However, significant signals of positive selection were detected at both TPC1 codon level and amino acid level, and five sites under positive selective pressure were identified by at least three different methods. Conclusions: The Ca2+-permeable channel TPC1 may be involved in the local adaptation of Primulina to karst Ca2+-rich environments. Different species lineages suffered similar selective pressure associated with calcium in karst environments, and episodic diversifying selection at a few sites may play a major role in the molecular evolution of Primulina TPC1. PMID:27582362
Ribosomal DNA sequence heterogeneity reflects intraspecies phylogenies and predicts genome structure in two contrasting yeast species.

PubMed

West, Claire; James, Stephen A; Davey, Robert P; Dicks, Jo; Roberts, Ian N

2014-07-01

The ribosomal RNA encapsulates a wealth of evolutionary information, including genetic variation that can be used to discriminate between organisms at a wide range of taxonomic levels. For example, the prokaryotic 16S rDNA sequence is very widely used both in phylogenetic studies and as a marker in metagenomic surveys and the internal transcribed spacer region, frequently used in plant phylogenetics, is now recognized as a fungal DNA barcode. However, this widespread use does not escape criticism, principally due to issues such as difficulties in classification of paralogous versus orthologous rDNA units and intragenomic variation, both of which may be significant barriers to accurate phylogenetic inference. We recently analyzed data sets from the Saccharomyces Genome Resequencing Project, characterizing rDNA sequence variation within multiple strains of the baker's yeast Saccharomyces cerevisiae and its nearest wild relative Saccharomyces paradoxus in unprecedented detail. Notably, both species possess single locus rDNA systems. Here, we use these new variation datasets to assess whether a more detailed characterization of the rDNA locus can alleviate the second of these phylogenetic issues, sequence heterogeneity, while controlling for the first. We demonstrate that a strong phylogenetic signal exists within both datasets and illustrate how they can be used, with existing methodology, to estimate intraspecies phylogenies of yeast strains consistent with those derived from whole-genome approaches. We also describe the use of partial Single Nucleotide Polymorphisms, a type of sequence variation found only in repetitive genomic regions, in identifying key evolutionary features such as genome hybridization events and show their consistency with whole-genome Structure analyses. 
We conclude that our approach can transform rDNA sequence heterogeneity from a problem to a useful source of evolutionary information, enabling the estimation of highly accurate phylogenies of closely related organisms, and discuss how it could be extended to future studies of multilocus rDNA systems. [concerted evolution; genome hybridisation; phylogenetic analysis; ribosomal DNA; whole genome sequencing; yeast]. © The Author(s) 2014. Published by Oxford University Press, on behalf of the Society of Systematic Biologists.
Accounting for epistatic interactions improves the functional analysis of protein structures.

PubMed

Wilkins, Angela D; Venner, Eric; Marciano, David C; Erdin, Serkan; Atri, Benu; Lua, Rhonald C; Lichtarge, Olivier

2013-11-01

The constraints under which sequence, structure and function coevolve are not fully understood. Bringing this mutual relationship to light can reveal the molecular basis of binding, catalysis and allostery, thereby identifying function and rationally guiding protein redesign. Underlying these relationships are the epistatic interactions that occur when the consequences of a mutation to a protein are determined by the genetic background in which it occurs. Based on prior data, we hypothesize that epistatic forces operate most strongly between residues nearby in the structure, resulting in smooth evolutionary importance across the structure. We find that when residue scores of evolutionary importance are distributed smoothly between nearby residues, functional site prediction accuracy improves. Accordingly, we designed a novel measure of evolutionary importance that focuses on the interaction between pairs of structurally neighboring residues. This measure that we term pair-interaction Evolutionary Trace yields greater functional site overlap and better structure-based proteome-wide functional predictions. Our data show that the structural smoothness of evolutionary importance is a fundamental feature of the coevolution of sequence, structure and function. Mutations operate on individual residues, but selective pressure depends in part on the extent to which a mutation perturbs interactions with neighboring residues. In practice, this principle led us to redefine the importance of a residue in terms of the importance of its epistatic interactions with neighbors, yielding better annotation of functional residues, motivating experimental validation of a novel functional site in LexA and refining protein function prediction. lichtarge@bcm.edu. Supplementary data are available at Bioinformatics online.
Accounting for epistatic interactions improves the functional analysis of protein structures

PubMed Central

Wilkins, Angela D.; Venner, Eric; Marciano, David C.; Erdin, Serkan; Atri, Benu; Lua, Rhonald C.; Lichtarge, Olivier

2013-01-01

Motivation: The constraints under which sequence, structure and function coevolve are not fully understood. Bringing this mutual relationship to light can reveal the molecular basis of binding, catalysis and allostery, thereby identifying function and rationally guiding protein redesign. Underlying these relationships are the epistatic interactions that occur when the consequences of a mutation to a protein are determined by the genetic background in which it occurs. Based on prior data, we hypothesize that epistatic forces operate most strongly between residues nearby in the structure, resulting in smooth evolutionary importance across the structure. Methods and Results: We find that when residue scores of evolutionary importance are distributed smoothly between nearby residues, functional site prediction accuracy improves. Accordingly, we designed a novel measure of evolutionary importance that focuses on the interaction between pairs of structurally neighboring residues. This measure that we term pair-interaction Evolutionary Trace yields greater functional site overlap and better structure-based proteome-wide functional predictions. Conclusions: Our data show that the structural smoothness of evolutionary importance is a fundamental feature of the coevolution of sequence, structure and function. Mutations operate on individual residues, but selective pressure depends in part on the extent to which a mutation perturbs interactions with neighboring residues. In practice, this principle led us to redefine the importance of a residue in terms of the importance of its epistatic interactions with neighbors, yielding better annotation of functional residues, motivating experimental validation of a novel functional site in LexA and refining protein function prediction. Contact: lichtarge@bcm.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:24021383
The complete coding region sequence of river buffalo (Bubalus bubalis) SRY gene.

PubMed

Parma, Pietro; Feligini, Maria; Greppi, Gianfranco; Enne, Giuseppe

2004-02-01

The Y-linked SRY gene is responsible for testis determination in mammals. Mutations in this gene can lead to XY Gonadal Dysgenesis, an abnormal sexual phenotype described in humans, cattle, horses and river buffalo. We report here the complete river buffalo SRY sequence in order to enable the genetic diagnosis of this disease. The SRY sequence was also used to confirm the evolutionary divergence time between cattle and river buffalo 10 million years ago.
Development of a Prognostic Marker for Lung Cancer Using Analysis of Tumor Evolution

DTIC Science & Technology

2017-08-01

The goal of this project is to sequence the exomes of single tumor cells from tumors in order to construct evolutionary trees...dissociation, tumor cell isolation, whole genome amplification, and exome sequencing. We have begun to sequence the exomes of single cells and to...of populations, the evolution of tumor cells within a tumor can be diagrammed on a phylogenetic tree. The more diverse a tumor’s phylogenetic tree
Origins of Genes: "Big Bang" or Continuous Creation?

NASA Astrophysics Data System (ADS)

Keese, Paul K.; Gibbs, Adrian

1992-10-01

Many protein families are common to all cellular organisms, indicating that many genes have ancient origins. Genetic variation is mostly attributed to processes such as mutation, duplication, and rearrangement of ancient modules. Thus it is widely assumed that much of present-day genetic diversity can be traced by common ancestry to a molecular "big bang." A rarely considered alternative is that proteins may arise continuously de novo. One mechanism of generating different coding sequences is by "overprinting," in which an existing nucleotide sequence is translated de novo in a different reading frame or from noncoding open reading frames. The clearest evidence for overprinting is provided when the original gene function is retained, as in overlapping genes. Analysis of their phylogenies indicates which are the original genes and which are their informationally novel partners. We report here the phylogenetic relationships of overlapping coding sequences from steroid-related receptor genes and from tymovirus, luteovirus, and lentivirus genomes. For each pair of overlapping coding sequences, one is confined to a single lineage, whereas the other is more widespread. This suggests that the phylogenetically restricted coding sequence arose only in the progenitor of that lineage by translating an out-of-frame sequence to yield the new polypeptide. The production of novel exons by alternative splicing in thyroid receptor and lentivirus genes suggests that introns can be a valuable evolutionary source for overprinting. New genes and their products may drive major evolutionary changes.
Genomic structure and paralogous regions of the inversion breakpoint occurring between human chromosome 3p12.3 and orangutan chromosome 2.

PubMed

Yue, Y; Grossmann, B; Tsend-Ayush, E; Grützner, F; Ferguson-Smith, M A; Yang, F; Haaf, T

2005-01-01

Intrachromosomal duplications play a significant role in human genome pathology and evolution. To better understand the molecular basis of evolutionary chromosome rearrangements, we performed molecular cytogenetic and sequence analyses of the breakpoint region that distinguishes human chromosome 3p12.3 and orangutan chromosome 2. FISH with region-specific BAC clones demonstrated that the breakpoint-flanking sequences are duplicated intrachromosomally on orangutan 2 and human 3q21 as well as at many pericentromeric and subtelomeric sites throughout the genomes. Breakage and rearrangement of the human 3p12.3-homologous region in the orangutan lineage were associated with a partial loss of duplicated sequences in the breakpoint region. Consistent with our FISH mapping results, computational analysis of the human chromosome 3 genomic sequence revealed three 3p12.3-paralogous sequence blocks on human chromosome 3q21 and smaller blocks on the short arm end 3p26-->p25. This is consistent with the view that sequences from an ancestral site at 3q21 were duplicated at 3p12.3 in a common ancestor of orangutan and humans. Our results show that evolutionary chromosome rearrangements are associated with microduplications and microdeletions, contributing to the DNA differences between closely related species. Copyright (c) 2005 S. Karger AG, Basel.
Molecular cloning, sequence characterization and recombinant expression of Nanog gene in goat fibroblast cells using lentiviral based expression system.

PubMed

Singhal, Dinesh K; Singhal, Raxita; Malik, Hruda N; Kumar, Surender; Kumar, Sudarshan; Mohanty, Ashok K; Kaushik, Jai K; Malakar, Dhruba

2014-01-01

Nanog is a homeodomain-containing protein that plays important roles in the regulation of signaling pathways for the maintenance and induction of pluripotency in stem cells. Because of its unique expression in stem cells, it is also regarded as a pluripotency marker. In this study, the goat Nanog (gNanog) gene was amplified, cloned and characterized at the sequence level, with successful over-expression in the CHO-K1 cell line using a lentiviral-based system. The gNanog ORF is 903 bp long and codes for a Nanog protein of 300 amino acids (aa). The complete nucleotide sequence shows some evolutionary mutations in goat in comparison to other species, while the protein sequence of goat is highly similar to that of other species. Overall, the gNanog nucleotide sequence and predicted protein sequence showed high similarity and minimum divergence with cattle (96% identity/4% divergence) and buffalo (94%/5%), but low similarity and high divergence with pig (84%/15%), human (81%/23%) and mouse (69%/40%), indicating the evolutionary closeness of gNanog to cattle and buffalo. A gNanog lentiviral expression construct was prepared for over-expression of the Nanog gene in adult goat fibroblast cells. The lentiviral expression construct of Nanog enabled continuous protein expression for induction and maintenance of pluripotency. Western blotting revealed expression of the Nanog gene at the protein level, supporting the conclusion that the lentiviral expression system is highly promising for Nanog protein expression in differentiated goat cells.
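Two of the figures in this abstract can be checked with simple arithmetic. The sketch below assumes that percent identity is counted over aligned, gap-free columns and that the ORF length includes the stop codon; the abstract does not state either convention, and the sequences used are invented. The reported divergence values do not add up to 100 with the identity values, so they were presumably computed with a different (possibly model-corrected) measure that is not reproduced here.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)

def protein_length_from_orf(orf_bp):
    """Amino acids encoded by an ORF whose length includes the stop codon."""
    return orf_bp // 3 - 1

# 903 bp -> 301 codons -> 300 amino acids, matching the reported protein size.
gnanog_aa = protein_length_from_orf(903)

# Toy alignment (hypothetical sequences): 6 matches over 7 gap-free columns.
toy_identity = percent_identity("ATGGC-TA", "ATGACCTA")
```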
IMGD: an integrated platform supporting comparative genomics and phylogenetics of insect mitochondrial genomes

PubMed Central

Lee, Wonhoon; Park, Jongsun; Choi, Jaeyoung; Jung, Kyongyong; Park, Bongsoo; Kim, Donghan; Lee, Jaeyoung; Ahn, Kyohun; Song, Wonho; Kang, Seogchan; Lee, Yong-Hwan; Lee, Seunghwan

2009-01-01

Background Sequences and organization of the mitochondrial genome have been used as markers to investigate evolutionary history and relationships in many taxonomic groups. The rapidly increasing mitochondrial genome sequences from diverse insects provide ample opportunities to explore various global evolutionary questions in the superclass Hexapoda. To adequately support such questions, it is imperative to establish an informatics platform that facilitates the retrieval and utilization of available mitochondrial genome sequence data. Results The Insect Mitochondrial Genome Database (IMGD) is a new integrated platform that archives the mitochondrial genome sequences from 25,747 hexapod species, including 112 completely sequenced and 20 nearly completed genomes and 113,985 partially sequenced mitochondrial genomes. The Species-driven User Interface (SUI) of IMGD supports data retrieval and diverse analyses at multi-taxon levels. The Phyloviewer implemented in IMGD provides three methods for drawing phylogenetic trees and displays the resulting trees on the web. The SNP database incorporated to IMGD presents the distribution of SNPs and INDELs in the mitochondrial genomes of multiple isolates within eight species. A newly developed comparative SNU Genome Browser supports the graphical presentation and interactive interface for the identified SNPs/INDELs. Conclusion The IMGD provides a solid foundation for the comparative mitochondrial genomics and phylogenetics of insects. All data and functions described here are available at the web site . PMID:19351385
Phylogenetically Structured Differences in rRNA Gene Sequence Variation among Species of Arbuscular Mycorrhizal Fungi and Their Implications for Sequence Clustering

PubMed Central

Ekanayake, Saliya; Ruan, Yang; Schütte, Ursel M. E.; Kaonongbua, Wittaya; Fox, Geoffrey; Ye, Yuzhen; Bever, James D.

2016-01-01

ABSTRACT Arbuscular mycorrhizal (AM) fungi form mutualisms with plant roots that increase plant growth and shape plant communities. Each AM fungal cell contains a large amount of genetic diversity, but it is unclear if this diversity varies across evolutionary lineages. We found that sequence variation in the nuclear large-subunit (LSU) rRNA gene from 29 isolates representing 21 AM fungal species generally assorted into genus- and species-level clades, with the exception of species of the genera Claroideoglomus and Entrophospora. However, there were significant differences in the levels of sequence variation across the phylogeny and between genera, indicating that it is an evolutionarily constrained trait in AM fungi. These consistent patterns of sequence variation across both phylogenetic and taxonomic groups pose challenges to interpreting operational taxonomic units (OTUs) as approximations of species-level groups of AM fungi. We demonstrate that the OTUs produced by five sequence clustering methods using 97% or equivalent sequence similarity thresholds failed to match the expected species of AM fungi, although OTUs from AbundantOTU, CD-HIT-OTU, and CROP corresponded better to species than did OTUs from mothur or UPARSE. This lack of OTU-to-species correspondence resulted both from sequences of one species being split into multiple OTUs and from sequences of multiple species being lumped into the same OTU. The OTU richness therefore will not reliably correspond to the AM fungal species richness in environmental samples. Conservatively, this error can overestimate species richness by 4-fold or underestimate richness by one-half, and the direction of this error will depend on the genera represented in the sample. IMPORTANCE Arbuscular mycorrhizal (AM) fungi form important mutualisms with the roots of most plant species. Individual AM fungi are genetically diverse, but it is unclear whether the level of this diversity differs among evolutionary lineages. 
We found that the amount of sequence variation in an rRNA gene that is commonly used to identify AM fungal species varied significantly between evolutionary groups that correspond to different genera, with the exception of two genera that are genetically indistinguishable from each other. When we clustered groups of similar sequences into operational taxonomic units (OTUs) using five different clustering methods, these patterns of sequence variation caused the number of OTUs to either over- or underestimate the actual number of AM fungal species, depending on the genus. Our results indicate that OTU-based inferences about AM fungal species composition from environmental sequences can be improved if they take these taxonomically structured patterns of sequence variation into account. PMID:27260357
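The splitting and lumping described in this abstract follow directly from how threshold-based clustering works. The sketch below is a deliberately simplified greedy centroid clustering at 97% identity; it is not the algorithm of AbundantOTU, CD-HIT-OTU, CROP, mothur or UPARSE, and it assumes equal-length, pre-aligned sequences so that identity reduces to the fraction of matching positions. All names and sequences are hypothetical.

```python
def identity(a, b):
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)

def greedy_otus(seqs, threshold=0.97):
    """Assign each sequence to the first centroid it matches at or above the
    threshold; otherwise it becomes the centroid of a new OTU."""
    centroids, otus = [], []
    for s in seqs:
        for i, c in enumerate(centroids):
            if identity(s, c) >= threshold:
                otus[i].append(s)
                break
        else:
            centroids.append(s)
            otus.append([s])
    return otus

base = "A" * 100
one_diff = "A" * 99 + "C"       # 99% identical to base: same OTU
four_diff = "A" * 96 + "CCCC"   # 96% identical to base: new OTU
clusters = greedy_otus([base, one_diff, four_diff])
```

Under such a scheme, a species whose rRNA gene copies differ by more than 3% is split across several OTUs, while two species differing by less than 3% are lumped into one, which is the mismatch the authors report.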
Archaeogenetics in evolutionary medicine.

PubMed

Bouwman, Abigail; Rühli, Frank

2016-09-01

Archaeogenetics is the study of ancient DNA (aDNA) more than 70 years old. It is an important part of the wider study of many different areas of our past, including animal, plant and pathogen evolution and domestication events. Here, we specifically address the impact of archaeogenetics research on the broader field of evolutionary medicine. Studies of ancient hominid genomes help us to understand even modern health patterns. Human genetic microevolution, e.g. related to the ability to consume milk after weaning, and specifically genetic adaptation in disease susceptibility, e.g. towards malaria and other infectious diseases, are of the utmost importance among the contributions of archaeogenetics to the evolutionary understanding of human health and disease. With the increase in both the understanding of modern medical genetics and the ability to deep sequence ancient genetic information, the field of archaeogenetic evolutionary medicine is blossoming.
Sequence of the tomato chloroplast DNA and evolutionary comparison of solanaceous plastid genomes.

PubMed

Kahlau, Sabine; Aspinall, Sue; Gray, John C; Bock, Ralph

2006-08-01

Tomato, Solanum lycopersicum (formerly Lycopersicon esculentum), has long been one of the classical model species of plant genetics. More recently, solanaceous species have become a model of evolutionary genomics, with several EST projects and a tomato genome project having been initiated. As a first contribution toward deciphering the genetic information of tomato, we present here the complete sequence of the tomato chloroplast genome (plastome). The size of this circular genome is 155,461 base pairs (bp), with an average AT content of 62.14%. It contains 114 genes and conserved open reading frames (ycfs). Comparison with the previously sequenced plastid DNAs of Nicotiana tabacum and Atropa belladonna reveals patterns of plastid genome evolution in the Solanaceae family and identifies varying degrees of conservation of individual plastid genes. In addition, we discovered several new sites of RNA editing by cytidine-to-uridine conversion. A detailed comparison of editing patterns in the three solanaceous species highlights the dynamics of RNA editing site evolution in chloroplasts. To assess the level of intraspecific plastome variation in tomato, the plastome of a second tomato cultivar was sequenced. Comparison of the two genotypes (IPA-6, bred in South America, and Ailsa Craig, bred in Europe) revealed no nucleotide differences, suggesting that the plastomes of modern tomato cultivars display very little, if any, sequence variation.
Evolutionary conservation of regulatory elements in vertebrate HOX gene clusters

DOE Office of Scientific and Technical Information (OSTI.GOV)

Santini, Simona; Boore, Jeffrey L.; Meyer, Axel

2003-12-31

Due to their high degree of conservation, comparisons of DNA sequences among evolutionarily distantly related genomes permit the identification of functional regions in noncoding DNA. Hox genes are optimal candidate sequences for comparative genome analyses, because they are extremely conserved in vertebrates and occur in clusters. We aligned (Pipmaker) the nucleotide sequences of HoxA clusters of tilapia, pufferfish, striped bass, zebrafish, horn shark, human and mouse (over 500 million years of evolutionary distance). We identified several highly conserved intergenic sequences, likely to be important in gene regulation. Only a few of these putative regulatory elements have been previously described as being involved in the regulation of Hox genes, while several others are new elements that might have regulatory functions. The majority of these newly identified putative regulatory elements contain short fragments that are almost completely conserved and are identical to known binding sites for regulatory proteins (Transfac). The conserved intergenic regions located between the most rostrally expressed genes in the developing embryo are longer and better retained through evolution. We document that presumed regulatory sequences are retained differentially in either Aα or Aβ clusters resulting from a genome duplication in the fish lineage. This observation supports both the hypothesis that the conserved elements are involved in gene regulation and the Duplication-Deletion-Complementation model.
Satellite DNA Sequences in Canidae and Their Chromosome Distribution in Dog and Red Fox.

PubMed

Vozdova, Miluse; Kubickova, Svatava; Cernohorska, Halina; Fröhlich, Jan; Rubes, Jiri

2016-01-01

Satellite DNA is a characteristic component of mammalian centromeric heterochromatin, and a comparative analysis of its evolutionary dynamics can be used for phylogenetic studies. We analysed satellite and satellite-like DNA sequences available in NCBI for 4 species of the family Canidae (red fox, Vulpes vulpes, VVU; domestic dog, Canis familiaris, CFA; arctic fox, Vulpes lagopus, VLA; raccoon dog, Nyctereutes procyonoides procyonoides, NPR) by comparative sequence analysis, which revealed 86-90% intraspecies and 76-79% interspecies similarity. Comparative fluorescence in situ hybridisation in the red fox and dog showed signals of the red fox satellite probe in canine and vulpine autosomal centromeres, on VVUY, B chromosomes, and in the distal parts of VVU9q and VVU10p which were shown to contain nucleolus organiser regions. The CFA satellite probe stained autosomal centromeres only in the dog. The CFA satellite-like DNA did not show any significant sequence similarity with the satellite DNA of any species analysed and was localised to the centromeres of 9 canine chromosome pairs. No significant heterochromatin block was detected on the B chromosomes of the red fox. Our results show extensive heterogeneity of satellite sequences among Canidae and prove close evolutionary relationships between the red and arctic fox. © 2017 S. Karger AG, Basel.
The three phases of galaxy formation

NASA Astrophysics Data System (ADS)

Clauwens, Bart; Schaye, Joop; Franx, Marijn; Bower, Richard G.

2018-05-01

We investigate the origin of the Hubble sequence by analysing the evolution of the kinematic morphologies of central galaxies in the EAGLE cosmological simulation. By separating each galaxy into disc and spheroidal stellar components and tracing their evolution along the merger tree, we find that the morphology of galaxies follows a common evolutionary trend. We distinguish three phases of galaxy formation. These phases are determined primarily by mass, rather than redshift. For M* ≲ 10^9.5 M⊙ galaxies grow in a disorganised way, resulting in a morphology that is dominated by random stellar motions. This phase is dominated by in-situ star formation, partly triggered by mergers. In the mass range 10^9.5 M⊙ ≲ M* ≲ 10^10.5 M⊙ galaxies evolve towards a disc-dominated morphology, driven by in-situ star formation. The central spheroid (i.e. the bulge) at z = 0 consists mostly of stars that formed in-situ, yet the formation of the bulge is to a large degree associated with mergers. Finally, at M* ≳ 10^10.5 M⊙ growth through in-situ star formation slows down considerably and galaxies transform towards a more spheroidal morphology. This transformation is driven more by the buildup of spheroids than by the destruction of discs. Spheroid formation in these galaxies happens mostly by accretion at large radii of stars formed ex-situ (i.e. the halo rather than the bulge).
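The mass-based split into three phases described in this abstract can be written down as a small lookup. This is only an illustrative sketch: the function name and the phase labels are hypothetical, and the boundaries are the approximate values quoted in the abstract (the abstract uses ≲ and ≳, so the cuts are not sharp).

```python
import math

# Approximate phase boundaries from the abstract, in log10(M* / M_sun).
LOW_BREAK = 9.5
HIGH_BREAK = 10.5


def formation_phase(stellar_mass_msun: float) -> str:
    """Return an illustrative phase label for a central galaxy.

    The labels are shorthand invented for this sketch, not terms
    defined by the authors.
    """
    if stellar_mass_msun <= 0:
        raise ValueError("stellar mass must be positive")
    log_mass = math.log10(stellar_mass_msun)
    if log_mass < LOW_BREAK:
        return "disorganised growth"   # random stellar motions dominate
    if log_mass <= HIGH_BREAK:
        return "disc formation"        # in-situ star formation builds the disc
    return "spheroid formation"        # ex-situ accretion builds the spheroid
```

For example, `formation_phase(1e10)` falls in the middle interval, where the abstract reports evolution towards a disc-dominated morphology.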
Chromospheric variations in main-sequence stars

NASA Technical Reports Server (NTRS)

Baliunas, S. L.; Donahue, R. A.; Soon, J. H.; Horne, J. H.; Frazer, J.; Woodard-Eklund, L.; Bradford, M.; Rao, L. M.; Wilson, O. C.; Zhang, Q.

1995-01-01

The fluxes in passbands 0.1 nm wide and centered on the Ca II H and K emission cores have been monitored in 111 stars of spectral type F2-M2 on or near the main sequence in a continuation of an observing program started by O. C. Wilson. Most of the measurements began in 1966, with observations scheduled monthly until 1980, when observations were scheduled several times per week. The records, with a long-term precision of about 1.5%, display fluctuations that can be identified with variations on timescales similar to the 11 yr cycle of solar activity as well as axial rotation, and the growth and decay of emitting regions. We present the records of chromospheric emission and general conclusions about variations in surface magnetic activity on timescales greater than 1 yr but less than a few decades. The results for stars of spectral type G0-K5 V indicate a pattern of change in rotation and chromospheric activity on an evolutionary timescale, in which (1) young stars exhibit high average levels of activity, rapid rotation rates, no Maunder minimum phase and rarely display a smooth, cyclic variation; (2) stars of intermediate age (approximately 1-2 Gyr for 1 solar mass) have moderate levels of activity and rotation rates, and occasional smooth cycles; and (3) stars as old as the Sun and older have slower rotation rates, lower activity levels and smooth cycles with occasional Maunder minimum phases.
Draft genome of the medaka fish: a comprehensive resource for medaka developmental genetics and vertebrate evolutionary biology.

PubMed

Takeda, Hiroyuki

2008-06-01

The medaka Oryzias latipes is a small egg-laying freshwater teleost, and has become an excellent model system for developmental genetics and evolutionary biology. The medaka genome is relatively small in size, approximately 800 Mb, and the genome sequencing project was recently completed by Japanese research groups, providing a high-quality draft genome sequence of the inbred Hd-rR strain of medaka. In this review, I present an overview of the medaka genome project including genome resources, followed by specific findings obtained with the medaka draft genome. In particular, I focus on the analysis that was done by taking advantage of the medaka system, such as the sex chromosome differentiation and the regional history of medaka species using single nucleotide polymorphisms as genomic markers.

Revising the Evolutionary Stage of HD 163899: The Effects of Convective Overshooting and Rotation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ostrowski, Jakub; Daszyńska-Daszkiewicz, Jadwiga; Cugier, Henryk, E-mail: ostrowski@astro.uni.wroc.pl

We revise the evolutionary status of the B-type supergiant HD 163899 based on the new determinations of the mass–luminosity ratio, effective temperature, and rotational velocity, as well as on the interpretation of the oscillation spectrum of the star. The observed value of the nitrogen-to-carbon abundance fixes the value of the rotation rate of the star. Models more massive than those previously considered are now strongly preferred, and it is very likely that the star is still in the main-sequence stage. The rotationally induced mixing manifests as the nitrogen overabundance in the atmosphere, which agrees with our analysis of the HARPS spectra. Thus, HD 163899 probably belongs to a group of evolved nitrogen-rich main-sequence stars.
Phylogeny of lion tamarins (Leontopithecus spp) based on interphotoreceptor retinol binding protein intron sequences.

PubMed

Mundy, N I; Kelly, J

2001-05-01

The evolutionary relationships of the lion tamarins (Leontopithecus) were investigated using nuclear interphotoreceptor retinol binding protein (IRBP) intron sequences. Phylogenetic reconstructions strongly support the monophyly of the genus, and a sister relationship between the golden lion tamarin, Leontopithecus rosalia, and the black lion tamarin, L. chrysopygus, to the exclusion of the golden-headed lion tamarin, L. chrysomelas. The most parsimonious evolutionary reconstruction suggests that the ancestral lion tamarin and the common ancestor of L. rosalia and L. chrysopygus had predominantly black coats. This reconstruction is not consistent with a theory of orthogenetic evolution of coat color that was based on coat color evolution in marmosets and tamarins. An alternative reconstruction that is consistent with metachromism requires that ancestral lion tamarins had agouti hairs. Copyright 2001 Wiley-Liss, Inc.
The scope and strength of sex-specific selection in genome evolution.

PubMed

Wright, A E; Mank, J E

2013-09-01

Males and females share the vast majority of their genomes and yet are often subject to different, even conflicting, selection. Genomic and transcriptomic developments have made it possible to assess sex-specific selection at the molecular level, and it is clear that sex-specific selection shapes the evolutionary properties of several genomic characteristics, including transcription, post-transcriptional regulation, imprinting, genome structure and gene sequence. Sex-specific selection is strongly influenced by mating system, which also causes neutral evolutionary changes that affect different regions of the genome in different ways. Here, we synthesize theoretical and molecular work in order to provide a cohesive view of the role of sex-specific selection and mating system in genome evolution. We also highlight the need for a combined approach, incorporating both genomic data and experimental phenotypic studies, in order to understand precisely how sex-specific selection drives evolutionary change across the genome. © 2013 The Authors. Journal of Evolutionary Biology © 2013 European Society For Evolutionary Biology.
A Simple General Model of Evolutionary Dynamics

NASA Astrophysics Data System (ADS)

Thurner, Stefan

Evolution is a process in which some variations that emerge within a population (of, e.g., biological species or industrial goods) get selected, survive, and proliferate, whereas others vanish. Survival probability, proliferation, or production rates are associated with the "fitness" of a particular variation. We argue that the notion of fitness is an a posteriori concept in the sense that one can assign higher fitness to species or goods that survive but one can generally not derive or predict fitness per se. Whereas proliferation rates can be measured, fitness landscapes, that is, the inter-dependence of proliferation rates, cannot. For this reason we think that in a physical theory of evolution such notions should be avoided. Here we review a recent quantitative formulation of evolutionary dynamics that provides a framework for the co-evolution of species and their fitness landscapes (Thurner et al., 2010, Physica A 389, 747; Thurner et al., 2010, New J. Phys. 12, 075029; Klimek et al., 2009, Phys. Rev. E 82, 011901 (2010)). The corresponding model leads to a generic evolutionary dynamics characterized by phases of relative stability in terms of diversity, followed by phases of massive restructuring. These dynamical modes can be interpreted as punctuated equilibria in biology, or Schumpeterian business cycles (Schumpeter, 1939, Business Cycles, McGraw-Hill, London) in economics. We show that phase transitions that separate phases of high and low diversity can be approximated surprisingly well by mean-field methods. We demonstrate that the mathematical framework is suited to understand systemic properties of evolutionary systems, such as their proneness to collapse, or their potential for diversification. The framework suggests that evolutionary processes are naturally linked to self-organized criticality and to properties of production matrices, such as their eigenvalue spectra. Even though the model is phrased in general terms it is also practical in the sense that its predictions can be used to understand a series of experimental data ranging from the fossil record to macroeconomic indices.
Interpreting Mammalian Evolution using Fugu Genome Comparisons

DOE Office of Scientific and Technical Information (OSTI.GOV)

Stubbs, L; Ovcharenko, I; Loots, G G

2004-04-02

Comparative sequence analysis of the human and the pufferfish Fugu rubripes (fugu) genomes has revealed several novel functional coding and noncoding regions in the human genome. In particular, the fugu genome has been extremely valuable for identifying transcriptional regulatory elements in human loci harboring unusually high levels of evolutionary conservation to rodent genomes. In such regions, the large evolutionary distance between human and fishes provides an additional filter through which functional noncoding elements can be detected with high efficiency.
Metal-poor stars. IV - The evolution of red giants.

NASA Technical Reports Server (NTRS)

Rood, R. T.

1972-01-01

Detailed evolutionary calculations for six Population-II red giants are presented. The first five of these models are followed from the zero age main sequence to the onset of the helium flash. The sixth model allows the effect of direct electron-neutrino interactions to be estimated. The updated input physics and evolutionary code are described briefly. The results of the calculations are presented in a manner pertinent to later stages of evolution and suitable for comparison with observations.
The painted turtle, Chrysemys picta: a model system for vertebrate evolution, ecology, and human health.

PubMed

Valenzuela, Nicole

2009-07-01

Painted turtles (Chrysemys picta) are representatives of a vertebrate clade whose biology and phylogenetic position hold a key to our understanding of fundamental aspects of vertebrate evolution. These features make them an ideal emerging model system. Extensive ecological and physiological research provides the context in which to place new research advances in evolutionary genetics, genomics, evolutionary developmental biology, and ecological developmental biology which are enabled by current resources, such as a bacterial artificial chromosome (BAC) library of C. picta, and the imminent development of additional ones such as genome sequences and cDNA and expressed sequence tag (EST) libraries. This integrative approach will allow the research community to continue making advances to provide functional and evolutionary explanations for the lability of biological traits found not only among reptiles but vertebrates in general. Moreover, because humans and reptiles share a common ancestor, and given the ease of using nonplacental vertebrates in experimental biology compared with mammalian embryos, painted turtles are also an emerging model system for biomedical research. For example, painted turtles have been studied to understand many biological responses to overwintering and anoxia, as potential sentinels for environmental xenobiotics, and as a model to decipher the ecology and evolution of sexual development and reproduction. Thus, painted turtles are an excellent reptilian model system for studies with human health, environmental, ecological, and evolutionary significance.
The molecular biology and evolution of feline immunodeficiency viruses of cougars

PubMed Central

Poss, Mary; Ross, Howard; Rodrigo, Allen; Terwee, Julie; VandeWoude, Sue; Biek, Roman

2008-01-01

Feline immunodeficiency virus (FIV) is a lentivirus that has been identified in many members of the family Felidae but domestic cats are the only FIV host in which infection results in disease. We studied FIVpco infection of cougars (Puma concolor) as a model for asymptomatic lentivirus infections to understand the mechanisms of host-virus coexistence. Several natural cougar populations were evaluated to determine if there are any consequences of FIVpco infection on cougar fecundity, survival, or susceptibility to other infections. We have sequenced full length viral genomes and conducted a detailed analysis of viral molecular evolution on these sequences and on genome fragments of serially sampled animals to determine the evolutionary forces experienced by this virus in cougars. In addition, we have evaluated the molecular genetics of FIVpco in a new host, domestic cats, to determine the evolutionary consequences to a host-adapted virus associated with cross-species infection. Our results indicate that there are no significant differences in survival, fecundity or susceptibility to other infections between FIVpco-infected and uninfected cougars. The molecular evolution of FIVpco is characterized by a slower evolutionary rate and an absence of positive selection, but also by proviral and plasma viral loads comparable to those of epidemic lentiviruses such as HIV-1 or FIVfca. Evolutionary and recombination rates and selection profiles change significantly when FIVpco replicates in a new host. PMID:18295904
A pronounced evolutionary shift of the pseudoautosomal region boundary in house mice

PubMed Central

White, Michael A.; Ikeda, Akihiro; Payseur, Bret A.

2012-01-01

The pseudoautosomal region (PAR) is essential for the accurate pairing and segregation of the X and Y chromosomes during meiosis. Despite its functional significance, the PAR shows substantial evolutionary divergence in structure and sequence between mammalian species. An instructive example of PAR evolution is the house mouse Mus musculus domesticus (represented by the C57BL/6J strain), which has the smallest PAR among those that have been mapped. In C57BL/6J, the PAR boundary is located just ~700 kb from the distal end of the X chromosome, whereas the boundary is found at a more proximal position in Mus spretus, a species that diverged from house mice 2–4 million years ago. Here, we use a combination of genetic and physical mapping to document a pronounced shift in the PAR boundary in a second house mouse subspecies, Mus musculus castaneus (represented by the CAST/EiJ strain), ~430 kb proximal of the M. m. domesticus boundary. We demonstrate molecular evolutionary consequences of this shift, including a marked lineage-specific increase in sequence divergence within Mid1, a gene that resides entirely within the M. m. castaneus PAR but straddles the boundary in other subspecies. Our results extend observations of structural divergence in the PAR to closely related subspecies, pointing to major evolutionary changes in this functionally important genomic region over a short time period. PMID:22763584
A pronounced evolutionary shift of the pseudoautosomal region boundary in house mice.

PubMed

White, Michael A; Ikeda, Akihiro; Payseur, Bret A

2012-08-01

The pseudoautosomal region (PAR) is essential for the accurate pairing and segregation of the X and Y chromosomes during meiosis. Despite its functional significance, the PAR shows substantial evolutionary divergence in structure and sequence between mammalian species. An instructive example of PAR evolution is the house mouse Mus musculus domesticus (represented by the C57BL/6J strain), which has the smallest PAR among those that have been mapped. In C57BL/6J, the PAR boundary is located just ~700 kb from the distal end of the X chromosome, whereas the boundary is found at a more proximal position in Mus spretus, a species that diverged from house mice 2-4 million years ago. In this study we used a combination of genetic and physical mapping to document a pronounced shift in the PAR boundary in a second house mouse subspecies, Mus musculus castaneus (represented by the CAST/EiJ strain), ~430 kb proximal of the M. m. domesticus boundary. We demonstrate molecular evolutionary consequences of this shift, including a marked lineage-specific increase in sequence divergence within Mid1, a gene that resides entirely within the M. m. castaneus PAR but straddles the boundary in other subspecies. Our results extend observations of structural divergence in the PAR to closely related subspecies, pointing to major evolutionary changes in this functionally important genomic region over a short time period.
Comparative phylogeography and population genetics within Buteo lineatus reveals evidence of distinct evolutionary lineages

USGS Publications Warehouse

Hull, J.M.; Strobel, Bradley N.; Boal, C.W.; Hull, A.C.; Dykstra, C.R.; Irish, A.M.; Fish, A.M.; Ernest, H.B.

2008-01-01

Traditional subspecies classifications may suggest phylogenetic relationships that are discordant with evolutionary history and mislead evolutionary inference. To more accurately describe evolutionary relationships and inform conservation efforts, we investigated the genetic relationships and demographic histories of Buteo lineatus subspecies in eastern and western North America using 21 nuclear microsatellite loci and 375 base pairs of mitochondrial control region sequence. Frequency based analyses of mitochondrial sequence data support significant population distinction between eastern (B. l. lineatus/alleni/texanus) and western (B. l. elegans) subspecies of B. lineatus. This distinction was further supported by frequency and Bayesian analyses of the microsatellite data. We found evidence of differing demographic histories between regions; among eastern sites, mitochondrial data suggested that rapid population expansion occurred following the end of the last glacial maximum, with B. l. texanus population expansion preceding that of B. l. lineatus/alleni. No evidence of post-glacial population expansion was detected among western samples (B. l. elegans). Rather, microsatellite data suggest that the western population has experienced a recent bottleneck, presumably associated with extensive anthropogenic habitat loss during the 19th and 20th centuries. Our data indicate that eastern and western populations of B. lineatus are genetically distinct lineages, have experienced very different demographic histories, and suggest management as separate conservation units may be warranted. © 2008 Elsevier Inc. All rights reserved.
Patterns of Gondwana plant colonisation and diversification

NASA Astrophysics Data System (ADS)

Anderson, J. M.; Anderson, H. M.; Archangelsky, S.; Bamford, M.; Chandra, S.; Dettmann, M.; Hill, R.; McLoughlin, S.; Rösler, O.

Charting the broad patterns of vascular plant evolution for Gondwana against the major global environmental shifts and events is attempted here for the first time. This is based on the analysis of the major vascular plant-bearing formations of the southern continents (plus India) correlated against the standard geological time-scale. Australia, followed closely by South America, is shown to yield by far the most complete sequences of productive strata. Ten seminal turnover pulses in the unfolding evolutionary picture are identified and seen to be linked to continental drift, climate change and mass global extinctions. The rise of vascular plants along the tropical belt, for instance, followed closely after the end-Ordovician warming and extinction. Equally remarkable is that the Late Devonian extinction may have caused both the terrestrialisation of the vertebrates and the origin of the true gymnosperms. The end-Permian extinction, closure of Iapetus, together with warming, appears to have set in motion an unparalleled, explosive, gymnosperm radiation; whilst the Late Triassic extinction dramatically curtailed it. It is suggested that the latitudinal diversity gradient clearly recognised today, where species richness increases towards the tropics, may have been partly reversed during phases of Hot House climate. Evidence hints at this being particularly so at the heyday of the gymnosperms in the Late Triassic super-Hot House world. As for the origin of terrestrial, vascular, plant life, the angiosperms seem closely linked to a phase of marked shift from Ice House to Hot House. Insect and tetrapod evolutionary patterns are discussed in the context of the plants providing the base of the ever-changing ecosystems. Intimate co-evolution is often evident. This is not always the case: one example is the non-linkage between the dominant, giant, long-necked, herbivorous sauropod dinosaurs and the dramatic radiation of the flowering plants in the Mid Cretaceous.
Investigation of the protein osteocalcin of Camelops hesternus: Sequence, structure and phylogenetic implications

NASA Astrophysics Data System (ADS)

Humpula, James F.; Ostrom, Peggy H.; Gandhi, Hasand; Strahler, John R.; Walker, Angela K.; Stafford, Thomas W.; Smith, James J.; Voorhies, Michael R.; George Corner, R.; Andrews, Phillip C.

2007-12-01

Ancient DNA sequences offer an extraordinary opportunity to unravel the evolutionary history of ancient organisms. Protein sequences offer another reservoir of genetic information that has recently become tractable through the application of mass spectrometric techniques. The extent to which ancient protein sequences resolve phylogenetic relationships, however, has not been explored. We determined the osteocalcin amino acid sequence from the bone of an extinct Camelid (21 ka, Camelops hesternus) excavated from Isleta Cave, New Mexico and three bones of extant camelids: bactrian camel ( Camelus bactrianus); dromedary camel ( Camelus dromedarius) and guanaco ( Llama guanacoe) for a diagenetic and phylogenetic assessment. There was no difference in sequence among the four taxa. Structural attributes observed in both modern and ancient osteocalcin include a post-translation modification, Hyp 9, deamidation of Gln 35 and Gln 39, and oxidation of Met 36. Carbamylation of the N-terminus in ancient osteocalcin may result in blockage and explain previous difficulties in sequencing ancient proteins via Edman degradation. A phylogenetic analysis using osteocalcin sequences of 25 vertebrate taxa was conducted to explore osteocalcin protein evolution and the utility of osteocalcin sequences for delineating phylogenetic relationships. The maximum likelihood tree closely reflected generally recognized taxonomic relationships. For example, maximum likelihood analysis recovered rodents, birds and, within hominins, the Homo-Pan-Gorilla trichotomy. Within Artiodactyla, character state analysis showed that a substitution of Pro 4 for His 4 defines the Capra-Ovis clade within Artiodactyla. Homoplasy in our analysis indicated that osteocalcin evolution is not a perfect indicator of species evolution. Limited sequence availability prevented assigning functional significance to sequence changes. 
Our preliminary analysis of osteocalcin evolution represents an initial step towards a complete character analysis aimed at determining the evolutionary history of this functionally significant protein. We emphasize that ancient protein sequencing and phylogenetic analyses using amino acid sequences must pay close attention to post-translational modifications, amino acid substitutions due to diagenetic alteration and the impacts of isobaric amino acids on mass shifts and sequence alignments.
Evolutionary dynamics of group formation.

PubMed

Javarone, Marco Alberto; Marinazzo, Daniele

2017-01-01

Group formation is a ubiquitous phenomenon across different animal species, whose individuals cluster together forming communities of diverse size. Previous investigations suggest that, in general, this phenomenon might have similar underlying reasons across the species concerned, despite genetic and behavioral differences. Examples include improving individual safety (e.g. from predators) and increasing the probability of obtaining food resources. Remarkably, the group size might strongly vary from species to species, e.g. shoals of fishes and herds of lions, and sometimes even within the same species, e.g. tribes and families in human societies. Here we build on previous theories stating that the dynamics of group formation may have evolutionary roots, and we explore this fascinating hypothesis from a purely theoretical perspective, with a model using the framework of Evolutionary Game Theory. In our model we hypothesize that homogeneity constitutes a fundamental ingredient in these dynamics. Accordingly, we study a population that tries to form homogeneous groups, i.e. composed of similar agents. The formation of a group can be interpreted as a strategy. Notably, agents can form a group (receiving a 'group payoff'), or can act individually (receiving an 'individual payoff'). The phase diagram of the modeled population shows a sharp transition between the 'group phase' and the 'individual phase', characterized by a critical 'individual payoff'. Our results then support the hypothesis that the phenomenon of group formation has evolutionary roots.
Molecular dissection of transcriptional reprogramming of steviol glycosides synthesis in leaf tissue during developmental phase transitions in Stevia rebaudiana Bert.

PubMed

Singh, Gopal; Singh, Gagandeep; Singh, Pradeep; Parmar, Rajni; Paul, Navgeet; Vashist, Radhika; Swarnkar, Mohit Kumar; Kumar, Ashok; Singh, Sanatsujat; Singh, Anil Kumar; Kumar, Sanjay; Sharma, Ram Kumar

2017-09-19

Stevia is a natural source of commercially important steviol glycosides (SGs), which share a biosynthesis route with gibberellic acids (GAs) through the plastidal MEP and cytosolic MVA pathways. Ontogeny-dependent deviation in SG biosynthesis, one of the key factors for global cultivation of Stevia, has not been studied at the transcriptional level. To dissect the underlying molecular mechanism, we followed a global transcriptome sequencing approach and generated more than 100 million reads. Annotation of 41,262 de novo assembled transcripts identified all the genes required for SGs and GAs biosynthesis. Differential gene expression and quantitative analysis of important pathway genes (DXS, HMGR, KA13H) and gene regulators (WRKY, MYB, NAC TFs) indicated developmental phase dependent utilization of metabolic flux between SGs and GAs synthesis. Further, identification of 124 CYPs and 45 UGTs enriches the genomic resources, and their PPI network analysis with SGs/GAs biosynthesis proteins identifies putative candidates involved in metabolic changes, as supported by their developmental phase-dependent expression. These putative targets can expedite molecular breeding and genetic engineering efforts to enhance SGs content, biomass and yield. In the future, the generated dataset will be a useful resource for development of functional molecular markers for diversity characterization, genome mapping and evolutionary studies in Stevia.
Magnetic braking of stellar cores in red giants and supergiants

DOE Office of Scientific and Technical Information (OSTI.GOV)

Maeder, André; Meynet, Georges, E-mail: andre.maeder@unige, E-mail: georges.meynet@unige.ch

2014-10-01

Magnetic configurations, stable on the long term, appear to exist in various evolutionary phases, from main-sequence stars to white dwarfs and neutron stars. The large-scale ordered nature of these fields, often approximately dipolar, and their scaling according to the flux conservation scenario favor a fossil field model. We make some first estimates of the magnetic coupling between the stellar cores and the outer layers in red giants and supergiants. Analytical expressions of the truncation radius of the field coupling are established for a convective envelope and for a rotating radiative zone with horizontal turbulence. The timescales of the internal exchanges of angular momentum are considered. Numerical estimates are made on the basis of recent model grids. The direct magnetic coupling of the core to the extended convective envelope of red giants and supergiants appears unlikely. However, we find that the intermediate radiative zone is fully coupled to the core during the He-burning and later phases. This coupling is able to produce a strong spin down of the core of red giants and supergiants, also leading to relatively slowly rotating stellar remnants such as white dwarfs and pulsars. Some angular momentum is also transferred to the outer convective envelope of red giants and supergiants during the He-burning phase and later.
Centromere and telomere sequence alterations reflect the rapid genome evolution within the carnivorous plant genus Genlisea.

PubMed

Tran, Trung D; Cao, Hieu X; Jovtchev, Gabriele; Neumann, Pavel; Novák, Petr; Fojtová, Miloslava; Vu, Giang T H; Macas, Jiří; Fajkus, Jiří; Schubert, Ingo; Fuchs, Joerg

2015-12-01

Linear chromosomes of eukaryotic organisms invariably possess centromeres and telomeres to ensure proper chromosome segregation during nuclear divisions and to protect the chromosome ends from deterioration and fusion, respectively. While centromeric sequences may differ between species, with arrays of tandemly repeated sequences and retrotransposons being the most abundant sequence types in plant centromeres, telomeric sequences are usually highly conserved among plants and other organisms. The genome size of the carnivorous genus Genlisea (Lentibulariaceae) is highly variable. Here we study evolutionary sequence plasticity of these chromosomal domains at an intrageneric level. We show that Genlisea nigrocaulis (1C = 86 Mbp; 2n = 40) and G. hispidula (1C = 1550 Mbp; 2n = 40) differ as to their DNA composition at centromeres and telomeres. G. nigrocaulis and its close relative G. pygmaea revealed mainly 161 bp tandem repeats, while G. hispidula and its close relative G. subglabra displayed a combination of four retroelements at centromeric positions. G. nigrocaulis and G. pygmaea chromosome ends are characterized by the Arabidopsis-type telomeric repeats (TTTAGGG); G. hispidula and G. subglabra instead revealed two intermingled sequence variants (TTCAGG and TTTCAGG). These differences in centromeric and, surprisingly, also in telomeric DNA sequences, uncovered between groups with on average a > 9-fold genome size difference, emphasize the fast genome evolution within this genus. Such intrageneric evolutionary alteration of telomeric repeats with cytosine in the guanine-rich strand, not yet known for plants, might impact the epigenetic telomere chromatin modification. © 2015 The Authors The Plant Journal © 2015 John Wiley & Sons Ltd.
Evolutionary relationships among Pinus (Pinaceae) subsections inferred from multiple low-copy nuclear loci.

Treesearch

John Syring; Ann Willyard; Richard Cronn; Aaron Liston

2005-01-01

Sequence data from nrITS and cpDNA have failed to fully resolve phylogenetic relationships among Pinus species. Four low-copy nuclear genes, developed from the screening of 73 mapped conifer anchor loci, were sequenced from 12 species representing all subsections. Individual loci do not uniformly support either the nrITS or cpDNA hypotheses and in...
Mapping Phylogenetic Trees to Reveal Distinct Patterns of Evolution

PubMed Central

Kendall, Michelle; Colijn, Caroline

2016-01-01

Evolutionary relationships are frequently described by phylogenetic trees, but a central barrier in many fields is the difficulty of interpreting data containing conflicting phylogenetic signals. We present a metric-based method for comparing trees which extracts distinct alternative evolutionary relationships embedded in data. We demonstrate detection and resolution of phylogenetic uncertainty in a recent study of anole lizards, leading to alternate hypotheses about their evolutionary relationships. We use our approach to compare trees derived from different genes of Ebolavirus and find that the VP30 gene has a distinct phylogenetic signature composed of three alternatives that differ in the deep branching structure. Key words: phylogenetics, evolution, tree metrics, genetics, sequencing. PMID:27343287
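As a rough illustration of what a metric-based tree comparison can look like, the sketch below scores two rooted trees by the Euclidean distance between their vectors of root-to-MRCA depths, one entry per pair of tips. This is an assumed, topology-only toy metric chosen for illustration and is not taken from the abstract; the nested-tuple tree encoding and the function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def mrca_depths(tree, depth=0, out=None):
    """Return (tips, out); out maps each tip pair to the depth of its MRCA.

    A tree is a tip name (str) or a tuple of subtrees (hypothetical encoding).
    Depth is counted in edges from the root.
    """
    if out is None:
        out = {}
    if isinstance(tree, str):
        return [tree], out
    child_tips = [mrca_depths(child, depth + 1, out)[0] for child in tree]
    # Tips that sit in different children of this node have this node as MRCA.
    for left, right in combinations(child_tips, 2):
        for a in left:
            for b in right:
                out[frozenset((a, b))] = depth
    return [tip for tips in child_tips for tip in tips], out


def tree_distance(tree1, tree2):
    """Euclidean distance between the MRCA-depth vectors of two rooted trees."""
    tips1, depths1 = mrca_depths(tree1)
    tips2, depths2 = mrca_depths(tree2)
    if sorted(tips1) != sorted(tips2):
        raise ValueError("trees must share the same tip labels")
    return sqrt(sum((depths1[pair] - depths2[pair]) ** 2 for pair in depths1))


t1 = (("A", "B"), ("C", "D"))
t2 = (("A", "C"), ("B", "D"))
print(tree_distance(t1, t2))
```

Pairwise distances of this kind could then be fed to clustering or multidimensional scaling to look for groups of trees that share a common shape.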
Phylogenetics.

PubMed

Sleator, Roy D

2011-04-01

The recent rapid expansion in the DNA and protein databases, arising from large-scale genomic and metagenomic sequence projects, has forced significant development in the field of phylogenetics: the study of the evolutionary relatedness of the planet's inhabitants. Advances in phylogenetic analysis have greatly transformed our view of the landscape of evolutionary biology, transcending the view of the tree of life that has shaped evolutionary theory since Darwinian times. Indeed, modern phylogenetic analysis no longer focuses on the restricted Darwinian-Mendelian model of vertical gene transfer, but must also consider the significant degree of lateral gene transfer, which connects and shapes almost all living things. Herein, I review the major tree-building methods, their strengths, weaknesses and future prospects.

A Stochastic Evolutionary Model for Protein Structure Alignment and Phylogeny

PubMed Central

Challis, Christopher J.; Schmidler, Scott C.

2012-01-01

We present a stochastic process model for the joint evolution of protein primary and tertiary structure, suitable for use in alignment and estimation of phylogeny. Indels arise from a classic Links model, and mutations follow a standard substitution matrix, whereas backbone atoms diffuse in three-dimensional space according to an Ornstein–Uhlenbeck process. The model allows for simultaneous estimation of evolutionary distances, indel rates, structural drift rates, and alignments, while fully accounting for uncertainty. The inclusion of structural information enables phylogenetic inference on time scales not previously attainable with sequence evolution models. The model also provides a tool for testing evolutionary hypotheses and improving our understanding of protein structural evolution. PMID:22723302
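The Ornstein-Uhlenbeck process mentioned above is a mean-reverting diffusion, dX = theta (mu - X) dt + sigma dW. A minimal one-dimensional sketch using an Euler-Maruyama discretization follows; the parameter names (theta, mu, sigma, dt) and the function name are assumptions for illustration and do not reproduce the authors' model of backbone atoms.

```python
import math
import random


def simulate_ou(x0, theta, mu, sigma, dt, n_steps, rng=None):
    """Simulate a 1-D Ornstein-Uhlenbeck path with Euler-Maruyama steps.

    theta: strength of the pull towards mu; sigma: noise scale (assumed names).
    """
    rng = rng or random.Random(0)  # fixed seed so repeated runs agree
    x = x0
    path = [x0]
    for _ in range(n_steps):
        noise = rng.gauss(0.0, 1.0)
        x = x + theta * (mu - x) * dt + sigma * math.sqrt(dt) * noise
        path.append(x)
    return path


# With sigma = 0 the path simply decays towards mu, which is easy to check by hand.
print(simulate_ou(x0=1.0, theta=1.0, mu=0.0, sigma=0.0, dt=0.1, n_steps=3))
```

The mean-reverting drift keeps the simulated coordinate near mu instead of letting it wander without bound, which is the usual reason for preferring this process over plain Brownian motion.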
Evolutionary conservation of sequence and secondary structures in CRISPR repeats

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kunin, Victor; Sorek, Rotem; Hugenholtz, Philip

Clustered Regularly Interspaced Palindromic Repeats (CRISPRs) are a novel class of direct repeats, separated by unique spacer sequences of similar length, that are present in approximately 40% of bacterial and all archaeal genomes analyzed to date. More than 40 gene families, called CRISPR-associated sequences (CAS), appear in conjunction with these repeats and are thought to be involved in the propagation and functioning of CRISPRs. It has been proposed that the CRISPR/CAS system samples, maintains a record of, and inactivates invasive DNA that the cell has encountered, and therefore constitutes a prokaryotic analog of an immune system. Here we analyze CRISPR repeats identified in 195 microbial genomes and show that they can be organized into multiple clusters based on sequence similarity. All individual repeats in any given cluster were inferred to form characteristic RNA secondary structure, ranging from non-existent to pronounced. Stable secondary structures included G:U base pairs and exhibited multiple compensatory base changes in the stem region, indicating evolutionary conservation and functional importance. We also show that the repeat-based classification corresponds to, and expands upon, a previously reported CAS gene-based classification including specific relationships between CRISPR and CAS subtypes.
Normalization of Complete Genome Characteristics: Application to Evolution from Primitive Organisms to Homo sapiens.

PubMed

Sorimachi, Kenji; Okayasu, Teiji; Ohhira, Shuji

2015-04-01

Normalized nucleotide and amino acid contents of complete genome sequences can be visualized as radar charts. The shapes of these charts depict the characteristics of an organism's genome. The normalized values calculated from the genome sequence theoretically exclude experimental errors. Further, because normalization is independent of both target size and kind, this procedure is applicable not only to single genes but also to whole genomes, which consist of a huge number of different genes. In this review, we discuss the applications of the normalization of the nucleotide and predicted amino acid contents of complete genomes to the investigation of genome structure and to evolutionary research from primitive organisms to Homo sapiens. Some of the results could never have been obtained from the analysis of individual nucleotide or amino acid sequences but were revealed only after the normalization of nucleotide and amino acid contents was applied to genome research. The discovery that genome structure was homogeneous was obtained only after normalization methods were applied to the nucleotide or predicted amino acid contents of genome sequences. Normalization procedures are also applicable to evolutionary research. Thus, normalization of the contents of whole genomes is a useful procedure that can help to characterize organisms.
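A minimal sketch of one way to normalize nucleotide content, assuming that normalization means dividing each base count by the total number of A, C, G and T bases so that the four values sum to 1. The function name and the toy sequence are hypothetical; the review's exact procedure may differ.

```python
from collections import Counter


def normalized_nucleotide_content(seq):
    """Return the fraction of A, C, G and T in seq, ignoring other symbols."""
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no A/C/G/T bases found")
    return {base: counts[base] / total for base in "ACGT"}


content = normalized_nucleotide_content("ACGTAC")
print(content)
```

Because each value is a fraction of the total, the result does not depend on sequence length, so a single gene and a whole genome can be placed on the same radar chart.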
Updating algal evolutionary relationships through plastid genome sequencing: did alveolate plastids emerge through endosymbiosis of an ochrophyte?

PubMed

Ševčíková, Tereza; Horák, Aleš; Klimeš, Vladimír; Zbránková, Veronika; Demir-Hilton, Elif; Sudek, Sebastian; Jenkins, Jerry; Schmutz, Jeremy; Přibyl, Pavel; Fousek, Jan; Vlček, Čestmír; Lang, B Franz; Oborník, Miroslav; Worden, Alexandra Z; Eliáš, Marek

2015-05-28

Algae with secondary plastids of a red algal origin, such as ochrophytes (photosynthetic stramenopiles), are diverse and ecologically important, yet their evolutionary history remains controversial. We sequenced plastid genomes of two ochrophytes, Ochromonas sp. CCMP1393 (Chrysophyceae) and Trachydiscus minutus (Eustigmatophyceae). A shared split of the clpC gene as well as phylogenomic analyses of concatenated protein sequences demonstrated that chrysophytes and eustigmatophytes form a clade, the Limnista, exhibiting an unexpectedly elevated rate of plastid gene evolution. Our analyses also indicate that the root of the ochrophyte phylogeny falls between the recently redefined Khakista and Phaeista assemblages. Taking advantage of the expanded sampling of plastid genome sequences, we revisited the phylogenetic position of the plastid of Vitrella brassicaformis, a member of Alveolata with the least derived plastid genome known for the whole group. The results varied depending on the dataset and phylogenetic method employed, but suggested that the Vitrella plastids emerged from a deep ochrophyte lineage rather than being derived vertically from a hypothetical plastid-bearing common ancestor of alveolates and stramenopiles. Thus, we hypothesize that the plastid in Vitrella, and potentially in other alveolates, may have been acquired by an endosymbiosis of an early ochrophyte.
Petrology and Geochemistry of D'Orbigny, Geochemistry of Sahara 99555, and the Origin of Angrites

NASA Technical Reports Server (NTRS)

Mittlefehldt, David W.; Killgore, Marvin; Lee, Michael T.

2001-01-01

We have done a detailed petrologic study of the angrite D'Orbigny, and a geochemical study of it and Sahara 99555. D'Orbigny is an igneous-textured rock composed of Ca-rich olivine, Al-Ti-diopside-hedenbergite, subcalcic kirschsteinite, two generations of hercynitic spinel and anorthite, with the mesostasis phases ulvöspinel, Ca-phosphate, a silicophosphate phase and Fe-sulfide. We report an unknown Fe-Ca-Al-Ti-silicate phase in the mesostasis not previously found in angrites. One hercynitic spinel is a large, rounded homogeneous grain of a different composition than the euhedral and zoned grains. We believe the former is a xenocryst, the first such described from angrites. The mafic phases are highly zoned; mg# of cores for olivine are approx. 64, and for clinopyroxene approx. 58, and both are zoned to Mg-free rims. The Ca content of olivine increases with decreasing mg#, until olivine with approx. 20 mole% Ca is overgrown by subcalcic kirschsteinite with Ca approx. 30-35 mole%. Detailed zoning sequences in olivine-subcalcic kirschsteinite and clinopyroxene show slight compositional reversals. There is no mineralogic control that can explain these reversals, and we believe they were likely caused by local additions of more primitive melt during crystallization of D'Orbigny. D'Orbigny is the most ferroan angrite with a bulk rock mg# of 32. Compositionally, it is virtually identical to Sahara 99555; the first set of compositionally identical angrites. Comparison with the other angrites shows that there is no simple petrogenetic sequence, partial melting with or without fractional crystallization, that can explain the angrite suite. Angra dos Reis remains a very anomalous angrite. Angrites show no evidence for the brecciation, shock, or impact or thermal metamorphism that affected the HED suite and ordinary chondrites. This suggests the angrite parent body may have followed a fundamentally different evolutionary path than did these other parent bodies.
Prediction of new high pressure structural sequence in thorium carbide: A first principles study

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sahoo, B. D., E-mail: bdsahoo@barc.gov.in; Joshi, K. D.; Gupta, Satish C.

2015-05-14

In the present work, we report the detailed electronic band structure calculations on thorium monocarbide. The comparison of enthalpies, derived for various phases using evolutionary structure search method in conjunction with first principles total energy calculations at several hydrostatic compressions, yielded a high pressure structural sequence of NaCl type (B1) → Pnma → Cmcm → CsCl type (B2) at hydrostatic pressures of ∼19 GPa, 36 GPa, and 200 GPa, respectively. However, the two high pressure experimental studies by Gerward et al. [J. Appl. Crystallogr. 19, 308 (1986); J. Less-Common Met. 161, L11 (1990)], one up to 36 GPa and the other up to 50 GPa, on substoichiometric thorium carbide samples with carbon deficiency of ∼20%, do not report any structural transition. The discrepancy between theory and experiment could be due to the non-stoichiometry of thorium carbide samples used in the experiment. Further, in order to substantiate the results of our static lattice calculations, we have determined the phonon dispersion relations for these structures from lattice dynamic calculations. The theoretically calculated phonon spectra reveal that the B1 phase fails dynamically at ∼33.8 GPa whereas the Pnma phase appears as a dynamically stable structure around the B1 to Pnma transition pressure. Similarly, the Cmcm structure also displays dynamic stability in the regime of its structural stability. The B2 phase becomes dynamically stable much below the Cmcm to B2 transition pressure. Additionally, we have derived various thermophysical properties such as zero pressure equilibrium volume, bulk modulus, its pressure derivative, Debye temperature, thermal expansion coefficient and Grüneisen parameter at 300 K and compared these with available experimental data. Further, the behavior of zero pressure bulk modulus, heat capacity and Helmholtz free energy has been examined as a function of temperature and compared with the experimental data of Danan [J. Nucl. Mater. 57, 280 (1975)].
A complete mitochondrial genome sequence of the wild two-humped camel (Camelus bactrianus ferus): an evolutionary history of camelidae

PubMed Central

Cui, Peng; Ji, Rimutu; Ding, Feng; Qi, Dan; Gao, Hongwei; Meng, He; Yu, Jun; Hu, Songnian; Zhang, Heping

2007-01-01

Background: The family Camelidae that evolved in North America during the Eocene survived with two distinct tribes, Camelini and Lamini. To investigate the evolutionary relationship between them and to further understand the evolutionary history of this family, we determined the complete mitochondrial genome sequence of the wild two-humped camel (Camelus bactrianus ferus), the only wild survivor of the Old World camel. Results: The mitochondrial genome sequence (16,680 bp) from C. bactrianus ferus contains 13 protein-coding, two rRNA, and 22 tRNA genes as well as a typical control region; this basic structure is shared by all metazoan mitochondrial genomes. Its protein-coding region exhibits codon usage common to all mammals and possesses the three cryptic stop codons shared by all vertebrates. C. bactrianus ferus together with the rest of mammalian species do not share a triplet nucleotide insertion (GCC) that encodes a proline residue found only in the nd1 gene of the New World camelid Lama pacos. That this lineage-specific insertion in the L. pacos mtDNA occurred after the split between the Old and New World camelids suggests that it may have functional implications, since a proline insertion in a protein backbone usually alters protein conformation significantly, and the nd1 gene has not been seen to be as polymorphic as the rest of the ND family genes among camelids. Our phylogenetic study based on complete mitochondrial genomes excluding the control region suggested that the divergence of the two tribes may have occurred in the early Miocene; it is much earlier than what was deduced from the fossil record (11 million years). An evolutionary history reconstructed for the family Camelidae based on cytb sequences suggested that the split of bactrian camel and dromedary may have occurred in North America before the tribe Camelini migrated from North America to Asia. Conclusion: Molecular clock analysis of complete mitochondrial genomes from C. bactrianus ferus and L. pacos suggested that the two tribes diverged from their common ancestor about 25 million years ago, much earlier than what was predicted based on fossil records. PMID:17640355
Insights into the phylogenetic positions of photosynthetic bacteria obtained from 5S rRNA and 16S rRNA sequence data

NASA Technical Reports Server (NTRS)

Fox, G. E.

1985-01-01

Comparisons of complete 16S ribosomal ribonucleic acid (rRNA) sequences established that the secondary structure of these molecules is highly conserved. Earlier work with 5S rRNA secondary structure revealed that when structural conservation exists the alignment of sequences is straightforward. The constancy of structure implies minimal functional change. Under these conditions a uniform evolutionary rate can be expected so that conditions are favorable for phylogenetic tree construction.
Resolving Evolutionary Relationships in Closely Related Species with Whole-Genome Sequencing Data

PubMed Central

Nater, Alexander; Burri, Reto; Kawakami, Takeshi; Smeds, Linnéa; Ellegren, Hans

2015-01-01

Using genetic data to resolve the evolutionary relationships of species is of major interest in evolutionary and systematic biology. However, reconstructing the sequence of speciation events, the so-called species tree, in closely related and potentially hybridizing species is very challenging. Processes such as incomplete lineage sorting and interspecific gene flow result in local gene genealogies that differ in their topology from the species tree, and analyses of few loci with a single sequence per species are likely to produce conflicting or even misleading results. To study these phenomena on a full phylogenomic scale, we use whole-genome sequence data from 200 individuals of four black-and-white flycatcher species with so far unresolved phylogenetic relationships to infer gene tree topologies and visualize genome-wide patterns of gene tree incongruence. Using phylogenetic analysis in nonoverlapping 10-kb windows, we show that gene tree topologies are extremely diverse and change on a very small physical scale. Moreover, we find strong evidence for gene flow among flycatcher species, with distinct patterns of reduced introgression on the Z chromosome. To resolve species relationships on the background of widespread gene tree incongruence, we used four complementary coalescent-based methods for species tree reconstruction, including complex modeling approaches that incorporate post-divergence gene flow among species. This allowed us to infer the most likely species tree with high confidence. Based on this finding, we show that regions of reduced effective population size, which have been suggested as particularly useful for species tree inference, can produce positively misleading species tree topologies. 
Our findings disclose the pitfalls of using loci potentially under selection as phylogenetic markers and highlight the potential of modeling approaches to disentangle species relationships in systems with large effective population sizes and post-divergence gene flow. PMID:26187295
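The windowed analysis described above depends on cutting each chromosome into non-overlapping segments. A minimal sketch of that partitioning step is given below, assuming half-open 0-based coordinates and keeping the shorter final window; the function name is hypothetical and the per-window tree inference itself is not shown.

```python
def nonoverlapping_windows(chrom_length, window_size=10_000):
    """Yield (start, end) half-open intervals that tile a chromosome."""
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    for start in range(0, chrom_length, window_size):
        # The last window is truncated at the chromosome end.
        yield start, min(start + window_size, chrom_length)


print(list(nonoverlapping_windows(25_000)))
```

Each interval would then be used to slice the alignment and infer one local gene tree, so that topology changes can be followed along the chromosome.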
Domain architecture conservation in orthologs

PubMed Central

2011-01-01

Background As orthologous proteins are expected to retain function more often than other homologs, they are often used for functional annotation transfer between species. However, ortholog identification methods do not take into account changes in domain architecture, which are likely to modify a protein's function. By domain architecture we refer to the sequential arrangement of domains along a protein sequence. To assess the level of domain architecture conservation among orthologs, we carried out a large-scale study of such events between human and 40 other species spanning the entire evolutionary range. We designed a score to measure domain architecture similarity and used it to analyze differences in domain architecture conservation between orthologs and paralogs relative to the conservation of primary sequence. We also statistically characterized the extents of different types of domain swapping events across pairs of orthologs and paralogs. Results The analysis shows that orthologs exhibit greater domain architecture conservation than paralogous homologs, even when differences in average sequence divergence are compensated for, for homologs that have diverged beyond a certain threshold. We interpret this as an indication of a stronger selective pressure on orthologs than paralogs to retain the domain architecture required for the proteins to perform a specific function. In general, orthologs as well as the closest paralogous homologs have very similar domain architectures, even at large evolutionary separation. The most common domain architecture changes observed in both ortholog and paralog pairs involved insertion/deletion of new domains, while domain shuffling and segment duplication/deletion were very infrequent. Conclusions On the whole, our results support the hypothesis that function conservation between orthologs demands higher domain architecture conservation than other types of homologs, relative to primary sequence conservation. 
This supports the notion that orthologs are functionally more similar than other types of homologs at the same evolutionary distance. PMID:21819573
A network approach to analyzing highly recombinant malaria parasite genes.

PubMed

Larremore, Daniel B; Clauset, Aaron; Buckee, Caroline O

2013-01-01

The var genes of the human malaria parasite Plasmodium falciparum present a challenge to population geneticists due to their extreme diversity, which is generated by high rates of recombination. These genes encode a primary antigen protein called PfEMP1, which is expressed on the surface of infected red blood cells and elicits protective immune responses. Var gene sequences are characterized by pronounced mosaicism, precluding the use of traditional phylogenetic tools that require bifurcating tree-like evolutionary relationships. We present a new method that identifies highly variable regions (HVRs), and then maps each HVR to a complex network in which each sequence is a node and two nodes are linked if they share an exact match of significant length. Here, networks of var genes that recombine freely are expected to have a uniformly random structure, but constraints on recombination will produce network communities that we identify using a stochastic block model. We validate this method on synthetic data, showing that it correctly recovers populations of constrained recombination, before applying it to the Duffy Binding Like-α (DBLα) domain of var genes. We find nine HVRs whose network communities map in distinctive ways to known DBLα classifications and clinical phenotypes. We show that the recombinational constraints of some HVRs are correlated, while others are independent. These findings suggest that this micromodular structuring facilitates independent evolutionary trajectories of neighboring mosaic regions, allowing the parasite to retain protein function while generating enormous sequence diversity. Our approach therefore offers a rigorous method for analyzing evolutionary constraints in var genes, and is also flexible enough to be easily applied more generally to any highly recombinant sequences.
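The network construction described above (each sequence is a node, and two nodes are linked if they share an exact match of sufficient length) can be sketched with k-mer sets, since two strings share an exact match of length at least k exactly when they share a k-mer. The threshold k, the function names and the toy sequences are assumptions for illustration; the HVR identification and the stochastic block model are not shown.

```python
from itertools import combinations


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_match_network(seqs, k):
    """Return edges (a, b) between sequence names that share a k-mer.

    seqs maps a name to a sequence string (hypothetical input format).
    """
    kmer_sets = {name: kmers(seq, k) for name, seq in seqs.items()}
    edges = set()
    for a, b in combinations(sorted(seqs), 2):
        if kmer_sets[a] & kmer_sets[b]:
            edges.add((a, b))
    return edges


toy = {"s1": "AAAACCCC", "s2": "GGAACCTT", "s3": "TTTTTTTT"}
print(shared_match_network(toy, k=4))
```

The resulting edge set can be passed to any community-detection routine; the all-pairs comparison here is quadratic in the number of sequences, which is acceptable for a sketch but not for large collections.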
A Network Approach to Analyzing Highly Recombinant Malaria Parasite Genes

PubMed Central

Larremore, Daniel B.; Clauset, Aaron; Buckee, Caroline O.

2013-01-01

The var genes of the human malaria parasite Plasmodium falciparum present a challenge to population geneticists due to their extreme diversity, which is generated by high rates of recombination. These genes encode a primary antigen protein called PfEMP1, which is expressed on the surface of infected red blood cells and elicits protective immune responses. Var gene sequences are characterized by pronounced mosaicism, precluding the use of traditional phylogenetic tools that require bifurcating tree-like evolutionary relationships. We present a new method that identifies highly variable regions (HVRs), and then maps each HVR to a complex network in which each sequence is a node and two nodes are linked if they share an exact match of significant length. Here, networks of var genes that recombine freely are expected to have a uniformly random structure, but constraints on recombination will produce network communities that we identify using a stochastic block model. We validate this method on synthetic data, showing that it correctly recovers populations of constrained recombination, before applying it to the Duffy Binding Like-α (DBLα) domain of var genes. We find nine HVRs whose network communities map in distinctive ways to known DBLα classifications and clinical phenotypes. We show that the recombinational constraints of some HVRs are correlated, while others are independent. These findings suggest that this micromodular structuring facilitates independent evolutionary trajectories of neighboring mosaic regions, allowing the parasite to retain protein function while generating enormous sequence diversity. Our approach therefore offers a rigorous method for analyzing evolutionary constraints in var genes, and is also flexible enough to be easily applied more generally to any highly recombinant sequences. PMID:24130474
From museums to genomics: old herbarium specimens shed light on a C3 to C4 transition.

PubMed

Besnard, Guillaume; Christin, Pascal-Antoine; Malé, Pierre-Jean G; Lhuillier, Emeline; Lauzeral, Christine; Coissac, Eric; Vorontsova, Maria S

2014-12-01

Collections of specimens held by natural history museums are invaluable material for biodiversity inventory and evolutionary studies, with specimens accumulated over 300 years readily available for sampling. Unfortunately, most museum specimens yield low-quality DNA. Recent advances in sequencing technologies, so-called next-generation sequencing, are revolutionizing phylogenetic investigations at a deep level. Here, the Illumina technology (HiSeq) was used on herbarium specimens of Sartidia (subfamily Aristidoideae, Poaceae), a small African-Malagasy grass lineage (six species) characteristic of wooded savannas, which is the C3 sister group of Stipagrostis, an important C4 genus from Africa and SW Asia. Complete chloroplast and nuclear ribosomal sequences were assembled for two Sartidia species, one of which (S. perrieri) is only known from a single specimen collected in Madagascar 100 years ago. Partial sequences of a few single-copy genes encoding phosphoenolpyruvate carboxylases (ppc) and malic enzymes (nadpme) were also assembled. Based on these data, the phylogenetic position of Malagasy Sartidia in the subfamily Aristidoideae was investigated and the biogeographical history of this genus was analysed with full species sampling. The evolutionary history of two genes for C4 photosynthesis (ppc-aL1b and nadpme-IV) in the group was also investigated. The gene encoding the C4 phosphoenolpyruvate carboxylase of Stipagrostis is absent from S. dewinteri, suggesting that it is not essential in C3 members of the group, which might have favoured its recruitment into a new metabolic pathway. Altogether, the inclusion of historical museum specimens in phylogenomic analyses of biodiversity opens new avenues for evolutionary studies. © The Author 2014. Published by Oxford University Press on behalf of the Society for Experimental Biology. All rights reserved. For permissions, please email: journals.permissions@oup.com.
Complete chloroplast genome sequences of Praxelis (Eupatorium catarium Veldkamp), an important invasive species.

PubMed

Zhang, Ying; Li, Lei; Yan, Ting Liang; Liu, Qiang

2014-10-01

Praxelis (Eupatorium catarium Veldkamp) is a new hazardous invasive plant species that has caused serious economic losses and environmental damage in the Northern hemisphere tropical and subtropical regions. Although previous studies focused on detecting the biological characteristics of this plant to prevent its expansion, little effort has been made to understand the impact of Praxelis on the ecosystem in an evolutionary process. The genetic information of Praxelis is required for further phylogenetic identification and evolutionary studies. Here, we report the complete Praxelis chloroplast (cp) genome sequence. The Praxelis chloroplast genome is 151,410 bp in length including a small single-copy region (18,547 bp) and a large single-copy region (85,311 bp) separated by a pair of inverted repeats (IRs; 23,776 bp). The genome contains 85 unique and 18 duplicated genes in the IR region. The gene content and organization are similar to other Asteraceae tribe cp genomes. We also analyzed the whole cp genome sequence, repeat structure, codon usage, contraction of the IR and gene structure/organization features between native and invasive Asteraceae plants, in order to understand the evolution of organelle genomes between native and invasive Asteraceae. Comparative analysis identified the 14 markers containing greater than 2% parsimony-informative characters, indicating that they are potential informative markers for barcoding and phylogenetic analysis. Moreover, a sister relationship between Praxelis and seven other species in Asteraceae was found based on phylogenetic analysis of 28 protein-coding sequences. Complete cp genome information is useful for plant phylogenetic and evolutionary studies within this invasive species and also within the Asteraceae family. Copyright © 2014 Elsevier B.V. All rights reserved.
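The region sizes quoted above can be checked against the reported genome length, assuming the usual quadripartite plastome layout in which the total is the large single-copy region plus the small single-copy region plus two copies of the inverted repeat. The variable names below are illustrative only.

```python
# Region lengths in bp, as reported in the abstract.
large_single_copy = 85_311
small_single_copy = 18_547
inverted_repeat = 23_776  # present twice in the circular genome
reported_total = 151_410

# Assumed quadripartite structure: LSC + SSC + 2 x IR.
computed_total = large_single_copy + small_single_copy + 2 * inverted_repeat
print(computed_total, computed_total == reported_total)
```

The computed sum agrees with the reported 151,410 bp, so the figures in the abstract are internally consistent.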
A Surrogate Approach to Study the Evolution of Noncoding DNA Elements That Organize Eukaryotic Genomes

PubMed Central

Vermaak, Danielle; Bayes, Joshua J.

2009-01-01

Comparative genomics provides a facile way to address issues of evolutionary constraint acting on different elements of the genome. However, several important DNA elements have not reaped the benefits of this new approach. Some have proved intractable to current day sequencing technology. These include centromeric and heterochromatic DNA, which are essential for chromosome segregation as well as gene regulation, but the highly repetitive nature of the DNA sequences in these regions make them difficult to assemble into longer contigs. Other sequences, like dosage compensation X chromosomal sites, origins of DNA replication, or heterochromatic sequences that encode piwi-associated RNAs, have proved difficult to study because they do not have recognizable DNA features that allow them to be described functionally or computationally. We have employed an alternate approach to the direct study of these DNA elements. By using proteins that specifically bind these noncoding DNAs as surrogates, we can indirectly assay the evolutionary constraints acting on these important DNA elements. We review the impact that such “surrogate strategies” have had on our understanding of the evolutionary constraints shaping centromeres, origins of DNA replication, and dosage compensation X chromosomal sites. These have begun to reveal that in contrast to the view that such structural DNA elements are either highly constrained (under purifying selection) or free to drift (under neutral evolution), some of them may instead be shaped by adaptive evolution and genetic conflicts (these are not mutually exclusive). These insights also help to explain why the same elements (e.g., centromeres and replication origins), which are so complex in some eukaryotic genomes, can be simple and well defined in other where similar conflicts do not exist. PMID:19635763
Sequencing and comparing whole mitochondrial genomes of animals

DOE Office of Scientific and Technical Information (OSTI.GOV)

Boore, Jeffrey L.; Macey, J. Robert; Medina, Monica

2005-04-22

Comparing complete animal mitochondrial genome sequences is becoming increasingly common for phylogenetic reconstruction and as a model for genome evolution. Not only are they much more informative than shorter sequences of individual genes for inferring evolutionary relatedness, but these data also provide sets of genome-level characters, such as the relative arrangements of genes, that can be especially powerful. We describe here the protocols commonly used for physically isolating mtDNA, for amplifying these by PCR or RCA, for cloning, sequencing, assembly, validation, and gene annotation, and for comparing both sequences and gene arrangements. On several topics, we offer general observations based on our experiences to date with determining and comparing complete mtDNA sequences.
The first complete chloroplast genome of the Genistoid legume Lupinus luteus: evidence for a novel major lineage-specific rearrangement and new insights regarding plastome evolution in the legume family.

PubMed

Martin, Guillaume E; Rousseau-Gueutin, Mathieu; Cordonnier, Solenn; Lima, Oscar; Michon-Coudouel, Sophie; Naquin, Delphine; de Carvalho, Julie Ferreira; Aïnouche, Malika; Salmon, Armel; Aïnouche, Abdelkader

2014-06-01

To date, chloroplast genomes are available only for members of the non-protein amino acid-accumulating clade (NPAAA) Papilionoid lineages in the legume family (i.e. Millettioids, Robinoids and the 'inverted repeat-lacking clade', IRLC). It is thus very important to sequence plastomes from other lineages in order to better understand the unusual evolution observed in this model flowering plant family. To this end, the plastome of a lupine species, Lupinus luteus, was sequenced to represent the Genistoid lineage, a noteworthy but poorly studied legume group. The plastome of L. luteus was reconstructed using Roche-454 and Illumina next-generation sequencing. Its structure, repetitive sequences, gene content and sequence divergence were compared with those of other Fabaceae plastomes. PCR screening and sequencing were performed in other allied legumes in order to determine the origin of a large inversion identified in L. luteus. Sequencing the first Genistoid plastome (L. luteus: 155 894 bp) revealed a 36-kb inversion, embedded within the already known 50-kb inversion in the large single-copy (LSC) region of the Papilionoideae. This inversion occurred at the base of, or soon after, the Genistoid emergence, and most probably resulted from a flip-flop recombination between identical 29-bp inverted repeats within two trnS genes. Comparative analyses of the chloroplast gene content of L. luteus vs. Fabaceae and extra-Fabales plastomes revealed the loss of the plastid rpl22 gene, and its functional relocation to the nucleus was verified using lupine transcriptomic data. An investigation into the evolutionary rate of coding and non-coding sequences among legume plastomes identified remarkably variable regions. This study resulted in the discovery of a novel, major 36-kb inversion, specific to the Genistoids. Chloroplast mutational hotspots were also identified, which contain novel and potentially informative regions for molecular evolutionary studies at various taxonomic levels in the legumes. Taken together, the results provide new insights into the evolutionary landscape of the legume plastome. © The Author 2014. Published by Oxford University Press on behalf of the Annals of Botany Company. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
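As an illustration of how recombination between a pair of inverted repeats can flip the segment lying between them, the following minimal Python sketch may help. It is not the authors' analysis: the function names, the toy sequence and the 3-bp repeat are all hypothetical, chosen only to show that the repeats stay in place while the interior is reverse-complemented, and that a second event restores the original orientation (hence "flip-flop").

```python
# Toy model of inversion via recombination between inverted repeats.
# All names and sequences here are illustrative assumptions.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string over A/C/G/T."""
    return seq.translate(COMPLEMENT)[::-1]


def invert_between_repeats(genome, repeat):
    """Invert the segment flanked by `repeat` and its reverse complement.

    The left copy is the first occurrence of `repeat`; the right copy is the
    first occurrence of its reverse complement downstream of the left copy.
    """
    left = genome.find(repeat)
    if left < 0:
        raise ValueError("repeat not found")
    start = left + len(repeat)
    right = genome.find(reverse_complement(repeat), start)
    if right < 0:
        raise ValueError("inverted copy of repeat not found")
    inner = genome[start:right]
    # The repeats themselves are unchanged; only the interior is flipped.
    return genome[:start] + reverse_complement(inner) + genome[right:]


toy = "TT" + "AAC" + "GATTACA" + "GTT" + "CC"
flipped = invert_between_repeats(toy, "AAC")
restored = invert_between_repeats(flipped, "AAC")
```

In this toy case the interior `GATTACA` becomes `TGTAATC` after one event and returns to `GATTACA` after a second.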
A discrete artificial bee colony algorithm for detecting transcription factor binding sites in DNA sequences.

PubMed

Karaboga, D; Aslan, S

2016-04-27

The great majority of biological sequences share significant similarity with other sequences as a result of evolutionary processes, and identifying these sequence similarities is one of the most challenging problems in bioinformatics. In this paper, we present a discrete artificial bee colony (ABC) algorithm, which is inspired by the intelligent foraging behavior of real honey bees, for the detection of highly conserved residue patterns or motifs within sequences. Experimental studies on three different data sets showed that the proposed discrete model, by adhering to the fundamental scheme of the ABC algorithm, produced competitive or better results than other metaheuristic motif discovery techniques.
COI (cytochrome oxidase-I) sequence based studies of Carangid fishes from Kakinada coast, India.

PubMed

Persis, M; Chandra Sekhar Reddy, A; Rao, L M; Khedkar, G D; Ravinder, K; Nasruddin, K

2009-09-01

Mitochondrial DNA cytochrome oxidase-I (COI) gene sequences were analyzed for species identification and phylogenetic relationships among commercially important Indian carangid fish species of very high food value. Sequence analysis of the COI gene very clearly indicated that all 28 fish species fell into five distinct groups, which are genetically distant from each other and exhibited identical phylogenetic reservation. The COI gene sequences from all 28 fishes provide sufficient phylogenetic and evolutionary information to distinguish the carangid species unambiguously. This study demonstrates the utility of the mtDNA COI gene sequence-based approach for rapid identification of fish species.
Comparison of the quality of different magnetic resonance image sequences of multiple myeloma.

PubMed

Sun, Zhao-yong; Zhang, Hai-bo; Li, Shuo; Wang, Yun; Xue, Hua-dan; Jin, Zheng-yu

2015-02-01

To compare the image quality of the T1WI fat phase, T1WI water phase, short tau inversion recovery (STIR) sequence, and diffusion weighted imaging (DWI) sequence in the evaluation of multiple myeloma (MM). A total of 20 MM patients were enrolled in this study. All patients underwent scanning with the coronal T1WI fat phase, coronal T1WI water phase, coronal STIR sequence, and axial DWI sequence. The image quality of the four different sequences was evaluated. Each image was divided into seven sections (head and neck, chest, abdomen, pelvis, thigh, leg, and foot), and the signal-to-noise ratio (SNR) of each section was measured at seven segments (skull, spine, pelvis, humerus, femur, tibia and fibula, and ribs). In addition, 20 active MM lesions were selected, and the contrast-to-noise ratio (CNR) of each scan sequence was calculated. The average image quality scores of the T1WI fat phase, T1WI water phase, STIR sequence, and DWI sequence were 4.19 ± 0.70, 4.16 ± 0.73, 3.89 ± 0.70, and 3.76 ± 0.68, respectively. The image quality scores of the T1WI fat phase and T1WI water phase were significantly higher than those of the STIR sequence (P=0.000 and P=0.001) and the DWI sequence (both P=0.000); however, there was no significant difference between the T1WI fat and T1WI water phases (P=0.723) or between the STIR and DWI sequences (P=0.167). The SNR of the T1WI fat phase was significantly higher than those of the other three sequences (all P=0.000), and there was no significant difference among the other three sequences (all P>0.05). Although the CNR of the DWI sequence was slightly higher than those of the other three sequences, there was no significant difference among them (all P>0.05). The T1WI fat phase, T1WI water phase, STIR sequence, and DWI sequence each have certain advantages, and they should be combined in the diagnosis of MM.
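The abstract does not state how SNR and CNR were computed. The short Python sketch below assumes one commonly used pair of definitions: SNR as the mean signal in a region of interest divided by the standard deviation of a background-noise region, and CNR as the absolute difference between mean lesion and mean reference-tissue signal divided by the same noise standard deviation. The function names and sample values are hypothetical and are not taken from the study.

```python
from statistics import mean, stdev


def snr(signal_roi, noise_roi):
    """Assumed definition: mean ROI signal / SD of background noise."""
    return mean(signal_roi) / stdev(noise_roi)


def cnr(lesion_roi, reference_roi, noise_roi):
    """Assumed definition: |mean lesion - mean reference| / SD of noise."""
    return abs(mean(lesion_roi) - mean(reference_roi)) / stdev(noise_roi)


# Made-up pixel intensities, for illustration only.
noise = [0, 2, 4]            # sample standard deviation is 2
marrow = [100, 100]          # reference tissue
lesion = [120, 120]          # hypothetical active lesion

marrow_snr = snr(marrow, noise)          # 100 / 2
lesion_cnr = cnr(lesion, marrow, noise)  # (120 - 100) / 2
```

Other conventions exist, for example ones that apply a correction factor to noise measured in magnitude images, so reported values are comparable only when the same definition is used throughout.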
