LinkFinder: An expert system that constructs phylogenic trees
NASA Technical Reports Server (NTRS)
Inglehart, James; Nelson, Peter C.
1991-01-01
An expert system has been developed using the C Language Integrated Production System (CLIPS) that automates the process of constructing DNA sequence based phylogenies (trees or lineages) that indicate evolutionary relationships. LinkFinder takes as input homologous DNA sequences from distinct individual organisms. It measures variations between the sequences, selects appropriate proportionality constants, and estimates the time that has passed since each pair of organisms diverged from a common ancestor. It then designs and outputs a phylogenic map summarizing these results. LinkFinder can find genetic relationships between different species, and between individuals of the same species, including humans. It was designed to take advantage of the vast amount of sequence data being produced by the Genome Project, and should be of value to evolution theorists who wish to utilize this data, but who have no formal training in molecular genetics. Evolutionary theory holds that distinct organisms carrying a common gene inherited that gene from a common ancestor. Homologous genes vary from individual to individual and species to species, and the amount of variation is now believed to be directly proportional to the time that has passed since divergence from a common ancestor. The proportionality constant must be determined experimentally; it varies considerably with the types of organisms and DNA molecules under study. Given an appropriate constant, and the variation between two DNA sequences, a simple linear equation gives the divergence time.
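The linear relationship described in the abstract above (variation proportional to time since divergence) can be sketched as follows. This is a minimal illustration, not LinkFinder's CLIPS implementation: the function names, the use of a simple proportion of differing sites as the "variation", and the constant k are all assumptions introduced here.

```python
def p_distance(seq_a, seq_b):
    """Fraction of aligned, ungapped sites at which two sequences differ.

    Assumes the two sequences are already aligned (same length, '-' for gaps).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip sites with a gap in either sequence
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def divergence_time(variation, k):
    """Solve variation = k * t for t.

    k is a hypothetical, experimentally determined proportionality constant
    (variation per unit time); the abstract notes it differs between
    organisms and DNA molecules.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return variation / k


# Example with made-up sequences and a made-up constant.
d = p_distance("ACGTACGTAC", "ACGTACGAAC")   # one difference in ten sites
t = divergence_time(d, k=0.02)               # in whatever time unit k uses
print(d, t)
```

The time unit of the result is set entirely by the unit of k, so the constant has to be chosen for the organisms and molecule type under study.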
Determining divergence times with a protein clock: update and reevaluation
NASA Technical Reports Server (NTRS)
Feng, D. F.; Cho, G.; Doolittle, R. F.; Bada, J. L. (Principal Investigator)
1997-01-01
A recent study of the divergence times of the major groups of organisms as gauged by amino acid sequence comparison has been expanded and the data have been reanalyzed with a distance measure that corrects for both constraints on amino acid interchange and variation in substitution rate at different sites. Beyond that, the availability of complete genome sequences for several eubacteria and an archaebacterium has had a great impact on the interpretation of certain aspects of the data. Thus, the majority of the archaebacterial sequences are not consistent with currently accepted views of the Tree of Life which cluster the archaebacteria with eukaryotes. Instead, they are either outliers or mixed in with eubacterial orthologs. The simplest resolution of the problem is to postulate that many of these sequences were carried into eukaryotes by early eubacterial endosymbionts about 2 billion years ago, only very shortly after or even coincident with the divergence of eukaryotes and archaebacteria. The strong resemblances of these same enzymes among the major eubacterial groups suggest that the cyanobacteria and Gram-positive and Gram-negative eubacteria also diverged at about this same time, whereas the much greater differences between archaebacterial and eubacterial sequences indicate these two groups may have diverged between 3 and 4 billion years ago.
Lashbrook, C C; Gonzalez-Bosch, C; Bennett, A B
1994-01-01
Two structurally divergent endo-beta-1,4-glucanase (EGase) cDNAs were cloned from tomato. Although both cDNAs (Cel1 and Cel2) encode potentially glycosylated, basic proteins of 51 to 53 kD and possess multiple amino acid domains conserved in both plant and microbial EGases, Cel1 and Cel2 exhibit only 50% amino acid identity at the overall sequence level. Amino acid sequence comparisons to other plant EGases indicate that tomato Cel1 is most similar to bean abscission zone EGase (68%), whereas Cel2 exhibits greatest sequence identity to avocado fruit EGase (57%). Sequence comparisons suggest the presence of at least two structurally divergent EGase families in plants. Unlike ripening avocado fruit and bean abscission zones in which a single EGase mRNA predominates, EGase expression in tomato reflects the overlapping accumulation of both Cel1 and Cel2 transcripts in ripening fruit and in plant organs undergoing cell separation. Cel1 mRNA contributes significantly to total EGase mRNA accumulation within plant organs undergoing cell separation (abscission zones and mature anthers), whereas Cel2 mRNA is most abundant in ripening fruit. The overlapping expression of divergent EGase genes within a single species may suggest that multiple activities are required for the cooperative disassembly of cell wall components during fruit ripening, floral abscission, and anther dehiscence. PMID:7994180
Single sample resolution of rare microbial dark matter in a marine invertebrate metagenome
DOE Office of Scientific and Technical Information (OSTI.GOV)
Miller, Ian J.; Weyna, Theodore R.; Fong, Stephen S.; ...
2016-09-29
Direct, untargeted sequencing of environmental samples (metagenomics) and de novo genome assembly enable the study of uncultured and phylogenetically divergent organisms. However, separating individual genomes from a mixed community has often relied on the differential-coverage analysis of multiple, deeply sequenced samples. In the metagenomic investigation of the marine bryozoan Bugula neritina, we uncovered seven bacterial genomes associated with a single B. neritina individual that appeared to be transient associates, two of which were unique to one individual and undetectable using certain “universal” 16S rRNA primers and probes. We recovered high quality genome assemblies for several rare instances of “microbial dark matter,” or phylogenetically divergent bacteria lacking genomes in reference databases, from a single tissue sample that was not subjected to any physical or chemical pre-treatment. One of these rare, divergent organisms has a small (593 kbp), poorly annotated genome with low GC content (20.9%) and a 16S rRNA gene with just 65% sequence similarity to the closest reference sequence. Lastly, our findings illustrate the importance of sampling strategy and de novo assembly of metagenomic reads to understand the extent and function of bacterial biodiversity.
Concerted evolution at the population level: pupfish HindIII satellite DNA sequences.
Elder, J F; Turner, B J
1994-01-01
The canonical monomers (approximately 170 bp) of an abundant (1.9 × 10^6 copies per diploid genome) satellite DNA sequence family in the genome of Cyprinodon variegatus, a "pupfish" that ranges along the Atlantic coast from Cape Cod to central Mexico, are divergent in base sequence in 10 of 12 samples collected from natural populations. The divergence involves substitutions, deletions, and insertions, is marked in scope (mean pairwise sequence similarity = 61.6%; range = 35-95.9%), is largely confined to the 3' half of the monomer, and is not correlated with the distance among collecting sites. Repetitive cloning and direct genomic sequencing experiments failed to detect intrapopulation and intraindividual variation, suggesting high levels of sequence homogeneity within populations. The satellite sequence has therefore undergone "concerted evolution," at the level of the local population. Concerted evolution has previously almost always been discussed in terms of the divergence of species or higher taxa; its intraspecific occurrence apparently has not been reported previously. The generality of the observation is difficult to evaluate, for although satellite DNAs from a large number of organisms have been studied in detail, there appear to be little or no other data on their sequence variation in natural populations. The relationship (if any) between concerted, population level, satellite DNA divergence and the extent of gene flow/genetic isolation among conspecific natural populations remains to be established. PMID:8302879
Generalization of Entropy Based Divergence Measures for Symbolic Sequence Analysis
Ré, Miguel A.; Azad, Rajeev K.
2014-01-01
Entropy based measures have been frequently used in symbolic sequence analysis. A symmetrized and smoothed form of Kullback-Leibler divergence or relative entropy, the Jensen-Shannon divergence (JSD), is of particular interest because of its sharing properties with families of other divergence measures and its interpretability in different domains including statistical physics, information theory and mathematical statistics. The uniqueness and versatility of this measure arise because of a number of attributes including generalization to any number of probability distributions and association of weights to the distributions. Furthermore, its entropic formulation allows its generalization in different statistical frameworks, such as, non-extensive Tsallis statistics and higher order Markovian statistics. We revisit these generalizations and propose a new generalization of JSD in the integrated Tsallis and Markovian statistical framework. We show that this generalization can be interpreted in terms of mutual information. We also investigate the performance of different JSD generalizations in deconstructing chimeric DNA sequences assembled from bacterial genomes including that of E. coli, S. enterica typhi, Y. pestis and H. influenzae. Our results show that the JSD generalizations bring in more pronounced improvements when the sequences being compared are from phylogenetically proximal organisms, which are often difficult to distinguish because of their compositional similarity. While small but noticeable improvements were observed with the Tsallis statistical JSD generalization, relatively large improvements were observed with the Markovian generalization. In contrast, the proposed Tsallis-Markovian generalization yielded more pronounced improvements relative to the Tsallis and Markovian generalizations, specifically when the sequences being compared arose from phylogenetically proximal organisms. PMID:24728338
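For orientation, the standard Shannon form of the weighted Jensen-Shannon divergence, JSD = H(sum_i w_i p_i) - sum_i w_i H(p_i), can be sketched as below. This covers only the baseline measure, not the Tsallis, Markovian, or combined generalizations proposed in the paper; the function names and the use of single-nucleotide composition as the probability distribution are assumptions made for illustration.

```python
import math
from collections import Counter


def composition(seq, alphabet="ACGT"):
    """Relative frequency of each alphabet symbol in seq (other symbols ignored)."""
    counts = Counter(ch for ch in seq.upper() if ch in alphabet)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no symbols from the alphabet")
    return [counts[ch] / total for ch in alphabet]


def shannon_entropy(p):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def jensen_shannon_divergence(dists, weights=None):
    """Weighted JSD of any number of distributions over the same alphabet.

    With weights omitted, every distribution gets equal weight. Weighting by
    sequence length is a common alternative (an assumption here, not taken
    from the abstract).
    """
    n = len(dists)
    if weights is None:
        weights = [1.0 / n] * n
    size = len(dists[0])
    mixture = [sum(w * d[i] for w, d in zip(weights, dists)) for i in range(size)]
    mean_entropy = sum(w * shannon_entropy(d) for w, d in zip(weights, dists))
    return shannon_entropy(mixture) - mean_entropy


# Example with made-up sequences: similar compositions give a small value.
p = composition("ACGTACGTAAGG")
q = composition("ACGTACGTCCTT")
print(jensen_shannon_divergence([p, q]))
```

With two equally weighted distributions and base-2 logarithms the value lies between 0 (identical compositions) and 1 bit (non-overlapping compositions).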
Yamada, Kazuhiko; Kamimura, Eikichi; Kondo, Mariko; Tsuchiya, Kimiyuki; Nishida-Umehara, Chizuko; Matsuda, Yoichi
2006-02-01
We molecularly cloned new families of site-specific repetitive DNA sequences from BglII- and EcoRI-digested genomic DNA of the Syrian hamster (Mesocricetus auratus, Cricetinae, Rodentia) and characterized them by chromosome in situ hybridization and filter hybridization. They were classified into six different types of repetitive DNA sequence families according to chromosomal distribution and genome organization. The hybridization patterns of the sequences were consistent with the distribution of C-positive bands and/or Hoechst-stained heterochromatin. The centromeric major satellite DNA and sex chromosome-specific and telomeric region-specific repetitive sequences were conserved in the same genus (Mesocricetus) but divergent in different genera. The chromosome-2-specific sequence was conserved in two genera, Mesocricetus and Cricetulus, and a low copy number of repetitive sequences on the heterochromatic chromosome arms were conserved in the subfamily Cricetinae but not in the subfamily Calomyscinae. By contrast, the other type of repetitive sequences on the heterochromatic chromosome arms, which had sequence similarities to a LINE sequence of rodents, was conserved through the three subfamilies, Cricetinae, Calomyscinae and Murinae. The nucleotide divergence of the repetitive sequences of heterochromatin was well correlated with the phylogenetic relationships of the Cricetinae species, and each sequence has been independently amplified and diverged in the same genome.
Complete genome sequence of a divergent strain of Japanese yam mosaic virus from China
USDA-ARS's Scientific Manuscript database
A novel strain of Japanese yam mosaic virus (JYMV-CN) was identified in a yam plant with foliar mottle symptoms in China. The complete genomic sequence of JYMV-CN was determined. Its genomic sequence of 9701 nucleotides encodes a polyprotein of 3247 amino acids. Its organization was virtually identi...
FRAGS: estimation of coding sequence substitution rates from fragmentary data
Swart, Estienne C; Hide, Winston A; Seoighe, Cathal
2004-01-01
Background: Rates of substitution in protein-coding sequences can provide important insights into evolutionary processes that are of biomedical and theoretical interest. Increased availability of coding sequence data has enabled researchers to estimate more accurately the coding sequence divergence of pairs of organisms. However the use of different data sources, alignment protocols and methods to estimate substitution rates leads to widely varying estimates of key parameters that define the coding sequence divergence of orthologous genes. Although complete genome sequence data are not available for all organisms, fragmentary sequence data can provide accurate estimates of substitution rates provided that an appropriate and consistent methodology is used and that differences in the estimates obtainable from different data sources are taken into account. Results: We have developed FRAGS, an application framework that uses existing, freely available software components to construct in-frame alignments and estimate coding substitution rates from fragmentary sequence data. Coding sequence substitution estimates for human and chimpanzee sequences, generated by FRAGS, reveal that methodological differences can give rise to significantly different estimates of important substitution parameters. The estimated substitution rates were also used to infer upper-bounds on the amount of sequencing error in the datasets that we have analysed. Conclusion: We have developed a system that performs robust estimation of substitution rates for orthologous sequences from a pair of organisms. Our system can be used when fragmentary genomic or transcript data is available from one of the organisms and the other is a completely sequenced genome within the Ensembl database. As well as estimating substitution statistics our system enables the user to manage and query alignment and substitution data. PMID:15005802
USDA-ARS's Scientific Manuscript database
Genome evolution influences a parasite’s pathogenicity, host-pathogen interactions, environmental constraints, and invasion biology, while genome assemblies form the basis of comparative sequence analyses. Given that closely related organisms typically maintain appreciable synteny, the genome asse...
Extensive Local Gene Duplication and Functional Divergence among Paralogs in Atlantic Salmon
Warren, Ian A.; Ciborowski, Kate L.; Casadei, Elisa; Hazlerigg, David G.; Martin, Sam; Jordan, William C.; Sumner, Seirian
2014-01-01
Many organisms can generate alternative phenotypes from the same genome, enabling individuals to exploit diverse and variable environments. A prevailing hypothesis is that such adaptation has been favored by gene duplication events, which generate redundant genomic material that may evolve divergent functions. Vertebrate examples of recent whole-genome duplications are sparse although one example is the salmonids, which have undergone a whole-genome duplication event within the last 100 Myr. The life-cycle of the Atlantic salmon, Salmo salar, depends on the ability to produce alternating phenotypes from the same genome, to facilitate migration and maintain its anadromous life history. Here, we investigate the hypothesis that genome-wide and local gene duplication events have contributed to the salmonid adaptation. We used high-throughput sequencing to characterize the transcriptomes of three key organs involved in regulating migration in S. salar: Brain, pituitary, and olfactory epithelium. We identified over 10,000 undescribed S. salar sequences and designed an analytic workflow to distinguish between paralogs originating from local gene duplication events or from whole-genome duplication events. These data reveal that substantial local gene duplications took place shortly after the whole-genome duplication event. Many of the identified paralog pairs have either diverged in function or become noncoding. Future functional genomics studies will reveal to what extent this rich source of divergence in genetic sequence is likely to have facilitated the evolution of extreme phenotypic plasticity required for an anadromous life-cycle. PMID:24951567
Richards, Stephen; Liu, Yue; Bettencourt, Brian R.; Hradecky, Pavel; Letovsky, Stan; Nielsen, Rasmus; Thornton, Kevin; Hubisz, Melissa J.; Chen, Rui; Meisel, Richard P.; Couronne, Olivier; Hua, Sujun; Smith, Mark A.; Zhang, Peili; Liu, Jing; Bussemaker, Harmen J.; van Batenburg, Marinus F.; Howells, Sally L.; Scherer, Steven E.; Sodergren, Erica; Matthews, Beverly B.; Crosby, Madeline A.; Schroeder, Andrew J.; Ortiz-Barrientos, Daniel; Rives, Catharine M.; Metzker, Michael L.; Muzny, Donna M.; Scott, Graham; Steffen, David; Wheeler, David A.; Worley, Kim C.; Havlak, Paul; Durbin, K. James; Egan, Amy; Gill, Rachel; Hume, Jennifer; Morgan, Margaret B.; Miner, George; Hamilton, Cerissa; Huang, Yanmei; Waldron, Lenée; Verduzco, Daniel; Clerc-Blankenburg, Kerstin P.; Dubchak, Inna; Noor, Mohamed A.F.; Anderson, Wyatt; White, Kevin P.; Clark, Andrew G.; Schaeffer, Stephen W.; Gelbart, William; Weinstock, George M.; Gibbs, Richard A.
2005-01-01
We have sequenced the genome of a second Drosophila species, Drosophila pseudoobscura, and compared this to the genome sequence of Drosophila melanogaster, a primary model organism. Throughout evolution the vast majority of Drosophila genes have remained on the same chromosome arm, but within each arm gene order has been extensively reshuffled, leading to a minimum of 921 syntenic blocks shared between the species. A repetitive sequence is found in the D. pseudoobscura genome at many junctions between adjacent syntenic blocks. Analysis of this novel repetitive element family suggests that recombination between offset elements may have given rise to many paracentric inversions, thereby contributing to the shuffling of gene order in the D. pseudoobscura lineage. Based on sequence similarity and synteny, 10,516 putative orthologs have been identified as a core gene set conserved over 25–55 million years (Myr) since the pseudoobscura/melanogaster divergence. Genes expressed in the testes had higher amino acid sequence divergence than the genome-wide average, consistent with the rapid evolution of sex-specific proteins. Cis-regulatory sequences are more conserved than random and nearby sequences between the species—but the difference is slight, suggesting that the evolution of cis-regulatory elements is flexible. Overall, a pattern of repeat-mediated chromosomal rearrangement, and high coadaptation of both male genes and cis-regulatory sequences emerges as important themes of genome divergence between these species of Drosophila. PMID:15632085
Microbial evolution of sulphate reduction when lateral gene transfer is geographically restricted.
Chi Fru, E
2011-07-01
Lateral gene transfer (LGT) is an important mechanism by which micro-organisms acquire new functions. This process has been suggested to be central to prokaryotic evolution in various environments. However, the influence of geographical constraints on the evolution of laterally acquired genes in microbial metabolic evolution is not yet well understood. In this study, the influence of geographical isolation on the evolution of laterally acquired dissimilatory sulphite reductase (dsr) gene sequences in the sulphate-reducing micro-organisms (SRM) was investigated. Sequences on four continental blocks related to SRM known to have received dsr by LGT were analysed using standard phylogenetic and multidimensional statistical methods. Sequences related to lineages with large genetic diversity correlated positively with habitat divergence. Those affiliated to Thermodesulfobacterium indicated strong biogeographical delineation; hydrothermal-vent sequences clustered independently from hot-spring sequences. Some of the hydrothermal-vent and hot-spring sequences suggested to have been acquired from a common ancestral source may have diverged upon isolation within distinct habitats. In contrast, analysis of some Desulfotomaculum sequences indicated they could have been transferred from different ancestral sources but converged upon isolation within the same niche. These results hint that, after lateral acquisition of dsr genes, barriers to gene flow probably play a strong role in their subsequent evolution.
Czesny, Sergiusz; Epifanio, John; Michalak, Pawel
2012-01-01
Alewife Alosa pseudoharengus, a small clupeid fish native to the Atlantic Ocean, has recently (∼150 years ago) invaded the North American Great Lakes and, despite the challenges of the freshwater environment, its populations exploded and disrupted local food web structures. This range expansion has been accompanied by dramatic changes at all levels of organization. Growth rates, size at maturation, and fecundity are only a few of the most distinct morphological and life history traits that contrast the two alewife morphs. A question arises to what extent these rapidly evolving differences between marine and freshwater varieties result from regulatory (including phenotypic plasticity) or structural mutations. To gain insights into expression changes and sequence divergence between marine and freshwater alewives, we sequenced transcriptomes of individuals from Lake Michigan and the Atlantic Ocean. Population-specific single nucleotide polymorphisms were rare but, interestingly, occurred in sequences of genes that also tended to show large differences in expression. Our results show that the striking phenotypic divergence between anadromous and lake alewives can be attributed to massive regulatory modifications rather than coding changes. PMID:22438868
Datasets for evolutionary comparative genomics
Liberles, David A
2005-01-01
Many decisions about genome sequencing projects are directed by perceived gaps in the tree of life, or towards model organisms. With the goal of a better understanding of biology through the lens of evolution, however, there are additional genomes that are worth sequencing. One such rationale for whole-genome sequencing is discussed here, along with other important strategies for understanding the phenotypic divergence of species. PMID:16086856
Shahin, Arwa; Smulders, Marinus J. M.; van Tuyl, Jaap M.; Arens, Paul; Bakker, Freek T.
2014-01-01
Next Generation Sequencing (NGS) may enable estimating relationships among genotypes using allelic variation of multiple nuclear genes simultaneously. We explored the potential and caveats of this strategy in four genetically distant Lilium cultivars to estimate their genetic divergence from transcriptome sequences using three approaches: POFAD (Phylogeny of Organisms from Allelic Data, uses allelic information of sequence data), RAxML (Randomized Accelerated Maximum Likelihood, tree building based on concatenated consensus sequences) and Consensus Network (constructing a network summarizing among gene tree conflicts). Twenty six gene contigs were chosen based on the presence of orthologous sequences in all cultivars, seven of which also had an orthologous sequence in Tulipa, used as out-group. The three approaches generated the same topology. Although the resolution offered by these approaches is high, in this case there was no extra benefit in using allelic information. We conclude that these 26 genes can be widely applied to construct a species tree for the genus Lilium. PMID:25368628
Candida ficus sp. nov., a novel yeast species from the gut of Apriona germari larvae.
Hui, Feng-Li; Niu, Qiu-Hong; Ke, Tao; Liu, Zheng
2012-11-01
A novel yeast species is described based on three strains from the gut of wood-boring larvae collected in a tree trunk of Ficus carica cultivated in parks near Nanyang, central China. Phylogenetic analysis based on sequences of the D1/D2 domains of the large subunit rRNA gene showed that these strains occurred in a separate clade that was genetically distinct from all known ascomycetous yeasts. In terms of pairwise sequence divergence, the novel strains differed by 15.3% divergence from the type strain of Pichia terricola, and by 15.8% divergence from the type strains of Pichia exigua and Candida rugopelliculosa in the D1/D2 domains. All three are ascomycetous yeasts in the Pichia clade. Unlike P. terricola, P. exigua and C. rugopelliculosa, the novel isolates did not ferment glucose. The name Candida ficus sp. nov. is proposed to accommodate these highly divergent organisms, with STN-8(T) (=CICC 1980(T)=CBS 12638(T)) as the type strain.
Archaebacterial rhodopsin sequences: Implications for evolution
NASA Technical Reports Server (NTRS)
Lanyi, J. K.
1991-01-01
It was proposed over 10 years ago that the archaebacteria represent a separate kingdom which diverged very early from the eubacteria and eukaryotes. It follows that investigations of archaebacterial characteristics might reveal features of early evolution. So far, two genes, one for bacteriorhodopsin and another for halorhodopsin, both from Halobacterium halobium, have been sequenced. We cloned and sequenced the gene coding for the polypeptide of another one of these rhodopsins, a halorhodopsin in Natronobacterium pharaonis. Peptide sequencing of cyanogen bromide fragments, and immuno-reactions of the protein and synthetic peptides derived from the C-terminal gene sequence, confirmed that the open reading frame was the structural gene for the pharaonis halorhodopsin polypeptide. The flanking DNA sequences of this gene, as well as those of other bacterial rhodopsins, were compared to previously proposed archaebacterial consensus sequences. In pairwise comparisons of the open reading frame with DNA sequences for bacterio-opsin and halo-opsin from Halobacterium halobium, silent divergences were calculated. These indicate very considerable evolutionary distance between each pair of genes, even in the same organism. In spite of this, the three protein sequences show extensive similarities, indicating strong selective pressures.
Complete genome sequence of a divergent strain of lettuce chlorosis virus from Periwinkle in China
USDA-ARS's Scientific Manuscript database
A novel strain of Lettuce chlorosis virus (LCV) was identified from periwinkle in China (PW) with foliar interveinal chlorosis and plant dwarfing. Complete nucleotide (nt) sequences of genomic RNA1 and RNA2 of the virus are 8,602 nt and 8,456 nt, respectively. The genomic organization of LCV-PW rese...
Bass, David; Moureau, Gregory; Tang, Shuoya; McAlister, Erica; Culverwell, C. Lorna; Glücksman, Edvard; Wang, Hui; Brown, T. David K.; Gould, Ernest A.; Harbach, Ralph E.; de Lamballerie, Xavier; Firth, Andrew E.
2013-01-01
We investigated whether small RNA (sRNA) sequenced from field-collected mosquitoes and chironomids (Diptera) can be used as a proxy signature of viral prevalence within a range of species and viral groups, using sRNAs sequenced from wild-caught specimens, to inform total RNA deep sequencing of samples of particular interest. Using this strategy, we sequenced from adult Anopheles maculipennis s.l. mosquitoes the apparently nearly complete genome of one previously undescribed virus related to chronic bee paralysis virus, and, from a pool of Ochlerotatus caspius and Oc. detritus mosquitoes, a nearly complete entomobirnavirus genome. We also reconstructed long sequences (1503-6557 nt) related to at least nine other viruses. Crucially, several of the sequences detected were reconstructed from host organisms highly divergent from those in which related viruses have been previously isolated or discovered. It is clear that viral transmission and maintenance cycles in nature are likely to be significantly more complex and taxonomically diverse than previously expected. PMID:24260463
Divergence and Mosaicism among Virulent Soil Phages of the Burkholderia cepacia Complex
Summer, Elizabeth J.; Gonzalez, Carlos F.; Bomer, Morgan; Carlile, Thomas; Embry, Addie; Kucherka, Amalie M.; Lee, Jonte; Mebane, Leslie; Morrison, William C.; Mark, Louise; King, Maria D.; LiPuma, John J.; Vidaver, Anne K.; Young, Ry
2006-01-01
We have determined the genomic sequences of four virulent myophages, Bcep1, Bcep43, BcepB1A, and Bcep781, whose hosts are soil isolates of the Burkholderia cepacia complex. Despite temporal and spatial separations between initial isolations, three of the phages (Bcep1, Bcep43, and Bcep781, designated the Bcep781 group) exhibit 87% to 99% sequence identity to one another and most coding region differences are due to synonymous nucleotide substitutions, a hallmark of neutral genetic drift. Phage BcepB1A has a very different genome organization but is clearly a mosaic with respect to many of the genes of the Bcep781 group, as is a defective prophage element in Photorhabdus luminescens. Functions were assigned to 27 out of 71 predicted genes of Bcep1 despite extreme sequence divergence. Using a lambda repressor fusion technique, 10 Bcep781-encoded proteins were identified for their ability to support homotypic interactions. While head and tail morphogenesis genes have retained canonical gene order despite extreme sequence divergence, genes involved in DNA metabolism and host lysis are not organized as in other phages. This unusual genome arrangement may contribute to the ability of the Bcep781-like phages to maintain a unified genomic type. However, the Bcep781 group phages can also engage in lateral gene transfer events with otherwise unrelated phages, a process that contributes to the broader-scale genomic mosaicism prevalent among the tailed phages. PMID:16352842
Lischer, Heidi E L; Excoffier, Laurent; Heckel, Gerald
2014-04-01
Phylogenetic reconstruction of the evolutionary history of closely related organisms may be difficult because of the presence of unsorted lineages and of a relatively high proportion of heterozygous sites that are usually not handled well by phylogenetic programs. Genomic data may provide enough fixed polymorphisms to resolve phylogenetic trees, but the diploid nature of sequence data remains analytically challenging. Here, we performed a phylogenomic reconstruction of the evolutionary history of the common vole (Microtus arvalis) with a focus on the influence of heterozygosity on the estimation of intraspecific divergence times. We used genome-wide sequence information from 15 voles distributed across the European range. We provide a novel approach to integrate heterozygous information in existing phylogenetic programs by repeated random haplotype sampling from sequences with multiple unphased heterozygous sites. We evaluated the impact of the use of full, partial, or no heterozygous information for tree reconstructions on divergence time estimates. All results consistently showed four deep and strongly supported evolutionary lineages in the vole data. These lineages undergoing divergence processes split only at the end or after the last glacial maximum based on calibration with radiocarbon-dated paleontological material. However, the incorporation of information from heterozygous sites had a significant impact on absolute and relative branch length estimations. Ignoring heterozygous information led to an overestimation of divergence times between the evolutionary lineages of M. arvalis. We conclude that the exclusion of heterozygous sites from evolutionary analyses may cause biased and misleading divergence time estimates in closely related taxa.
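The repeated random haplotype sampling described in this abstract can be illustrated with a minimal sketch. It assumes, as an illustration only, that unphased heterozygous sites are encoded with two-allele IUPAC ambiguity codes; the function names `sample_haplotype` and `sample_replicates` are hypothetical and are not taken from the authors' software. Each replicate resolves every heterozygous site to one of its two alleles at random, yielding a haploid alignment that an existing phylogenetic program could read.

```python
import random

# Two-allele IUPAC ambiguity codes (standard nomenclature).
IUPAC_HET = {
    "R": "AG", "Y": "CT", "S": "CG",
    "W": "AT", "K": "GT", "M": "AC",
}


def sample_haplotype(genotype, rng):
    """Resolve each heterozygous code to one of its two alleles at random."""
    return "".join(
        rng.choice(IUPAC_HET[base]) if base in IUPAC_HET else base
        for base in genotype
    )


def sample_replicates(genotypes, n_replicates, seed=0):
    """Return n_replicates haploid alignments drawn from unphased genotypes.

    genotypes maps a sample name to its genotype string (hypothetical input
    format, assumed here for illustration).
    """
    rng = random.Random(seed)
    return [
        {name: sample_haplotype(seq, rng) for name, seq in genotypes.items()}
        for _ in range(n_replicates)
    ]


if __name__ == "__main__":
    toy = {"vole_1": "ACRTY", "vole_2": "ACGTC"}
    for replicate in sample_replicates(toy, 3):
        print(replicate)
```

Trees estimated from many such replicates can then be compared to see how much the branch lengths depend on the heterozygous sites, which is the kind of sensitivity the abstract reports.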
Pohl, Nélida; Sison-Mangus, Marilou P; Yee, Emily N; Liswi, Saif W; Briscoe, Adriana D
2009-05-13
The increase in availability of genomic sequences for a wide range of organisms has revealed gene duplication to be a relatively common event. Encounters with duplicate gene copies have consequently become almost inevitable in the context of collecting gene sequences for inferring species trees. Here we examine the effect of incorporating duplicate gene copies evolving at different rates on tree reconstruction and time estimation of recent and deep divergences in butterflies. Sequences from ultraviolet-sensitive (UVRh), blue-sensitive (BRh), and long-wavelength sensitive (LWRh) opsins, EF-1α, and COI were obtained from 27 taxa representing the five major butterfly families (5535 bp total). Both BRh and LWRh are present in multiple copies in some butterfly lineages and the different copies evolve at different rates. Regardless of the phylogenetic reconstruction method used, we found that analyses of combined data sets using either slower or faster evolving copies of duplicate genes resulted in a single topology in agreement with our current understanding of butterfly family relationships based on morphology and molecules. Interestingly, individual analyses of BRh and LWRh sequences also recovered these family-level relationships. Two different relaxed clock methods resulted in similar divergence time estimates at the shallower nodes in the tree, regardless of whether faster or slower evolving copies were used, with larger discrepancies observed at deeper nodes in the phylogeny. The time of divergence between the monarch butterfly Danaus plexippus and the queen D. gilippus (15.3-35.6 Mya) was found to be much older than the time of divergence between monarch co-mimic Limenitis archippus and red-spotted purple L. arthemis (4.7-13.6 Mya), and overlapping with the time of divergence of the co-mimetic passionflower butterflies Heliconius erato and H. melpomene (13.5-26.1 Mya). 
Our family-level results are congruent with recent estimates found in the literature and indicate an age of 84-113 million years for the divergence of all butterfly families. These results are consistent with diversification of the butterfly families following the radiation of angiosperms and suggest that some classes of opsin genes may be usefully employed for both phylogenetic reconstruction and divergence time estimation.
The contribution of alu elements to mutagenic DNA double-strand break repair.
Morales, Maria E; White, Travis B; Streva, Vincent A; DeFreece, Cecily B; Hedges, Dale J; Deininger, Prescott L
2015-03-01
Alu elements make up the largest family of human mobile elements, numbering 1.1 million copies and comprising 11% of the human genome. As a consequence of evolution and genetic drift, Alu elements of various sequence divergence exist throughout the human genome. Alu/Alu recombination has been shown to cause approximately 0.5% of new human genetic diseases and contribute to extensive genomic structural variation. To begin understanding the molecular mechanisms leading to these rearrangements in mammalian cells, we constructed Alu/Alu recombination reporter cell lines containing Alu elements ranging in sequence divergence from 0%-30% that allow detection of both Alu/Alu recombination and large non-homologous end joining (NHEJ) deletions that range from 1.0 to 1.9 kb in size. Introduction of as little as 0.7% sequence divergence between Alu elements resulted in a significant reduction in recombination, which indicates even small degrees of sequence divergence reduce the efficiency of homology-directed DNA double-strand break (DSB) repair. Further reduction in recombination was observed in a sequence divergence-dependent manner for diverged Alu/Alu recombination constructs with up to 10% sequence divergence. With greater levels of sequence divergence (15%-30%), we observed a significant increase in DSB repair due to a shift from Alu/Alu recombination to variable-length NHEJ which removes sequence between the two Alu elements. This increase in NHEJ deletions depends on the presence of Alu sequence homeology (similar but not identical sequences). Analysis of recombination products revealed that Alu/Alu recombination junctions occur more frequently in the first 100 bp of the Alu element within our reporter assay, just as they do in genomic Alu/Alu recombination events. 
This is the first extensive study characterizing the influence of Alu element sequence divergence on DNA repair, which will inform predictions regarding the effect of Alu element sequence divergence on both the rate and nature of DNA repair events.
Han, Xiang Y; Sizer, Kurt C; Thompson, Erika J; Kabanja, Juma; Li, Jun; Hu, Peter; Gómez-Valero, Laura; Silva, Francisco J
2009-10-01
Mycobacterium lepromatosis is a newly discovered leprosy-causing organism. Preliminary phylogenetic analysis of its 16S rRNA gene and a few other gene segments revealed significant divergence from Mycobacterium leprae, a well-known cause of leprosy, that justifies the status of M. lepromatosis as a new species. In this study we analyzed the sequences of 20 genes and pseudogenes (22,814 nucleotides). Overall, the level of matching of these sequences with M. leprae sequences was 90.9%, which substantiated the species-level difference; the levels of matching for the 16S rRNA genes and 14 protein-encoding genes were 98.0% and 93.1%, respectively, but the level of matching for five pseudogenes was only 79.1%. Five conserved protein-encoding genes were selected to construct phylogenetic trees and to calculate the numbers of synonymous substitutions (dS values) and nonsynonymous substitutions (dN values) in the two species. Robust phylogenetic trees constructed using concatenated alignment of these genes placed M. lepromatosis and M. leprae in a tight cluster with long terminal branches, implying that the divergence occurred long ago. The dS and dN values were also much higher than those for other closest pairs of mycobacteria. The dS values were 14 to 28% of the dS values for M. leprae and Mycobacterium tuberculosis, a more divergent pair of species. These results thus indicate that M. lepromatosis and M. leprae diverged approximately 10 million years ago. The M. lepromatosis pseudogenes analyzed that were also pseudogenes in M. leprae showed nearly neutral evolution, and their relative ages were similar to those of M. leprae pseudogenes, suggesting that they were pseudogenes before divergence. Taken together, the results described above indicate that M. lepromatosis and M. leprae diverged from a common ancestor after the massive gene inactivation event described previously for M. leprae.
NASA Astrophysics Data System (ADS)
Nallaseth, Ferez Soli
The Y-chromosome presents a unique cytogenetic framework for the evolution of nucleotide sequences. Alignment of nine Y-chromosomal fragments in their increasing Y-specific/non Y-specific (male/female) sequence divergence ratios was directly and inversely related to their interspersion on these two respective genomic fractions. Sequence analysis confirmed a direct relationship between divergence ratios and the Alu, LINE-1, Satellite and their derivative oligonucleotide contents. Thus their relocation on the Y-chromosome is followed by sequence divergence rather than the well documented concerted evolution of these non-coding progenitor repeated sequences. Five of the nine Y-chromosomal fragments are non-pseudoautosomal and transcribed into heterogeneous poly(A)+ RNA and thus can be retrotransposed. Evolutionary and computer analysis identified homologous oligonucleotide tracts in several human loci suggesting common and random mechanistic origins. Dysgenic genomes represent the accelerated evolution driving sequence divergence (McClintock, 1984). Sex reversal and sterility characterizing dysgenesis occurs in C57BL/6JY^Pos but not in 129/SvY^Pos derivative strains. High frequency, random, multi-locus deletion products of the feral Y^Pos-chromosome are generated in the germlines of F1(C57BL/6J X 129/SvY^Pos) (male) and C57BL/6JY^Pos (male) but not in 129/SvY^Pos (male). Equal, 10^-1, 10^-2, and 0 copies (relative to males) of Y^Pos-specific deletion products respectively characterize C57BL/6JY^Pos (HC), (LC), (T) and (F) females. The testes determining loci of inactive Y^Pos-chromosomes in C57BL/6JY^Pos HC females are the preferentially deleted/rearranged Y^Pos-sequences. Disruption of regulation of plasma testosterone and hepatic MUP-A mRNA levels, TRD of a 4.7 Kbp EcoR1 fragment suggest disruption of autosomal/X-chromosomal sequences. 
These data and the highly repeated progenitor (Alu, GATA, LINE-1) sequence content of deletion products confirmed the previously unidentified loss of genetic control of mammalian chromosome biology and hybrid dysgenesis.
NASA Astrophysics Data System (ADS)
Xu, Jiajie; Jiang, Bo; Chai, Sanming; He, Yuan; Zhu, Jianyi; Shen, Zonggen; Shen, Songdong
2016-09-01
Filamentous Bangia, which are distributed extensively throughout the world, have simple and similar morphological characteristics. Scientists can classify these organisms using molecular markers in combination with morphology. We successfully sequenced the complete nuclear ribosomal DNA, approximately 13 kb in length, from a marine Bangia population. We further analyzed the small subunit ribosomal DNA gene (nrSSU) and the internal transcribed spacer (ITS) sequence regions along with nine other marine, and two freshwater Bangia samples from China. Pairwise distances of the nrSSU and 5.8S ribosomal DNA gene sequences show the marine samples grouping together with low divergences (0-0.003; 0-0.006, respectively) from each other, but high divergences (0.123-0.126; 0.198, respectively) from freshwater samples. An exception is the marine sample collected from Weihai, which shows high divergence from both other marine samples (0.063-0.065; 0.129, respectively) and the freshwater samples (0.097; 0.120, respectively). A maximum likelihood phylogenetic tree based on a combined SSU-ITS dataset shows the samples divided into three clades, with the two marine sample clades containing Bangia spp. from North America, Europe, Asia, and Australia; and one freshwater clade, containing Bangia atropurpurea from North America and China.
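Pairwise divergences like those quoted above are fractions of differing sites between aligned sequences. The abstract does not state which distance model was used, so the sketch below assumes the simplest case, the uncorrected p-distance, and skips alignment columns that contain a gap. The function name `p_distance` and the toy sequences are illustrative only.

```python
def p_distance(seq_a, seq_b, gap="-"):
    """Uncorrected pairwise distance between two aligned sequences.

    Columns where either sequence has a gap are ignored; the result is the
    number of differing columns divided by the number of compared columns.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return differing / compared


if __name__ == "__main__":
    # Toy aligned fragments, not real Bangia data.
    print(p_distance("ACGTACGT", "ACGTACGA"))  # one difference in eight columns
    print(p_distance("AC-TACGT", "ACGTACGA"))  # gapped column is skipped
```

A model-corrected distance (for example one that accounts for multiple substitutions at a site) would give larger values than the p-distance for divergent pairs.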
2011-01-01
Background The Bacillus cereus sensu lato group consists of six species (B. anthracis, B. cereus, B. mycoides, B. pseudomycoides, B. thuringiensis, and B. weihenstephanensis). While classical microbial taxonomy proposed these organisms as distinct species, newer molecular phylogenies and comparative genome sequencing suggest that these organisms should be classified as a single species (thus, we will refer to these organisms collectively as the Bc species-group). How do we account for the underlying similarity of these phenotypically diverse microbes? It has been established for some time that the most rapidly evolving and evolutionarily flexible portions of the bacterial genome are regulatory sequences and transcriptional networks. Other studies have suggested that the sigma factor gene family of these organisms has diverged and expanded significantly relative to their ancestors; sigma factors are those portions of the bacterial transcriptional apparatus that control RNA polymerase recognition for promoter selection. Thus, examining sigma factor divergence in these organisms would concurrently examine both regulatory sequences and transcriptional networks important for divergence. We began this examination by comparison to the sigma factor gene set of B. subtilis. Results Phylogenetic analysis of the Bc species-group utilizing 157 single-copy genes of the family Bacillaceae suggests that several taxonomic revisions of the genus Bacillus should be considered. Within the Bc species-group there is little indication that the currently recognized species form related sub-groupings, suggesting that they are members of the same species. The sigma factor gene family encoded by the Bc species-group appears to be the result of a dynamic gene-duplication and gene-loss process that in previous analyses underestimated the true heterogeneity of the sigma factor content in the Bc species-group. 
Conclusions Expansion of the sigma factor gene family appears to have preferentially occurred within the extracytoplasmic function (ECF) sigma factor genes, while the primary alternative (PA) sigma factor genes are, in general, highly conserved with those found in B. subtilis. Divergence of the sigma-controlled transcriptional regulons among various members of the Bc species-group likely has a major role in explaining the diversity of phenotypic characteristics seen in members of the Bc species-group. PMID:21864360
Chromosomal Speciation in the Genomics Era: Disentangling Phylogenetic Evolution of Rock-wallabies.
Potter, Sally; Bragg, Jason G; Blom, Mozes P K; Deakin, Janine E; Kirkpatrick, Mark; Eldridge, Mark D B; Moritz, Craig
2017-01-01
The association of chromosome rearrangements (CRs) with speciation is well established, and there is a long history of theory and evidence relating to "chromosomal speciation." Genomic sequencing has the potential to provide new insights into how reorganization of genome structure promotes divergence, and in model systems has demonstrated reduced gene flow in rearranged segments. However, there are limits to what we can understand from a small number of model systems, which each only tell us about one episode of chromosomal speciation. Progressing from patterns of association between chromosome (and genic) change, to understanding processes of speciation requires both comparative studies across diverse systems and integration of genome-scale sequence comparisons with other lines of evidence. Here, we showcase a promising example of chromosomal speciation in a non-model organism, the endemic Australian marsupial genus Petrogale. We present initial phylogenetic results from exon-capture that resolve a history of divergence associated with extensive and repeated CRs. Yet it remains challenging to disentangle gene tree heterogeneity caused by recent divergence and gene flow in this and other such recent radiations. We outline a way forward for better integration of comparative genomic sequence data with evidence from molecular cytogenetics, and analyses of shifts in the recombination landscape and potential disruption of meiotic segregation and epigenetic programming. In all likelihood, CRs impact multiple cellular processes and these effects need to be considered together, along with effects of genic divergence. Understanding the effects of CRs together with genic divergence will require development of more integrative theory and inference methods. Together, new data and analysis tools will combine to shed light on long standing questions of how chromosome and genic divergence promote speciation.
Yang, Zujun; Zhang, Tao; Bolshoy, Alexander; Beharav, Alexander; Nevo, Eviatar
2009-05-01
'Evolution Canyon' (ECI) at Lower Nahal Oren, Mount Carmel, Israel, is an optimal natural microscale model for unravelling evolution in action highlighting the twin evolutionary processes of adaptation and speciation. A major model organism in ECI is wild barley, Hordeum spontaneum, the progenitor of cultivated barley, which displays dramatic interslope adaptive and speciational divergence on the 'African' dry slope (AS) and the 'European' humid slope (ES), separated on average by 200 m. Here we examined interslope single nucleotide polymorphism (SNP) sequences and the expression diversity of the drought resistant dehydrin 1 gene (Dhn1) between the opposite slopes. We analysed 47 plants (genotypes), 4-10 individuals in each of seven stations (populations) in an area of 7000 m², for Dhn1 sequence diversity located in the 5' upstream flanking region of the gene. We found significant levels of Dhn1 genic diversity represented by 29 haplotypes, derived from 45 SNPs in a total of 708 bp sites. Most of the haplotypes, 25 out of 29 (= 86.2%), were represented by one genotype; hence, unique to one population. Only a single haplotype was common to both slopes. Genetic divergence of sequence and haplotype diversity was generally and significantly different among the populations and slopes. Nucleotide diversity was higher on the AS, whereas haplotype diversity was higher on the ES. Interslope divergence was significantly higher than intraslope divergence. The applied Tajima D rejected neutrality of the SNP diversity. The Dhn1 expression under dehydration indicated interslope divergent expression between AS and ES genotypes, reinforcing Dhn1 associated with drought resistance of wild barley at 'Evolution Canyon'. These results are inexplicable by mutation, gene flow, or chance effects, and support adaptive natural microclimatic selection as the major evolutionary divergent driving force.
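The two summary statistics contrasted in this abstract, haplotype diversity and nucleotide diversity, can be sketched as follows. The abstract does not give its estimators, so this assumes the standard definitions: Nei's unbiased haplotype diversity, H = n/(n-1) × (1 − Σ p_i²), and nucleotide diversity π as the mean number of pairwise differences per site. The function names and the toy haplotypes are illustrative and are not the Dhn1 data.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity: n/(n-1) * (1 - sum of p_i squared)."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sequences")
    freqs = [count / n for count in Counter(haplotypes).values()]
    return n / (n - 1) * (1 - sum(f * f for f in freqs))


def nucleotide_diversity(haplotypes):
    """Mean pairwise differences per site across all pairs of aligned sequences."""
    if len(haplotypes) < 2:
        raise ValueError("need at least two sequences")
    length = len(haplotypes[0])
    if length == 0 or any(len(h) != length for h in haplotypes):
        raise ValueError("sequences must be non-empty and equally long")
    pairs = list(combinations(haplotypes, 2))
    total = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return total / (len(pairs) * length)


if __name__ == "__main__":
    toy = ["AAAA", "AAAT", "AAAT", "CAAA"]  # toy sample, not real data
    print(haplotype_diversity(toy))
    print(nucleotide_diversity(toy))
```

The two measures can rank populations differently, as reported for the two slopes: many rare haplotypes that differ at few sites raise haplotype diversity, while a few haplotypes that differ at many sites raise nucleotide diversity.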
Genome Sequence of the Novel Marine Member of the Gammaproteobacteria Strain HTCC5015
Thrash, J. Cameron; Stingl, Ulrich; Cho, Jang-Cheon; Ferriera, Steve; Johnson, Justin; Vergin, Kevin L.; Giovannoni, Stephen J.
2010-01-01
HTCC5015 is a novel, highly divergent marine member of the Gammaproteobacteria, currently without a cultured representative with greater than 89% 16S rRNA gene identity to itself. The organism was isolated from water collected from Hydrostation S south of Bermuda using high-throughput dilution-to-extinction culturing techniques. Here we present the genome sequence of the unique Gammaproteobacterium strain HTCC5015. PMID:20472792
Improvisation in evolution of genes and genomes: whose structure is it anyway?
Shakhnovich, Boris E; Shakhnovich, Eugene I
2008-06-01
Significant progress has been made in recent years in a variety of seemingly unrelated fields such as sequencing, protein structure prediction, and high-throughput transcriptomics and metabolomics. At the same time, new microscopic models have been developed that made it possible to analyze the evolution of genes and genomes from first principles. The results from these efforts enable, for the first time, a comprehensive insight into the evolution of complex systems and organisms on all scales--from sequences to organisms and populations. Every newly sequenced genome uncovers new genes, families, and folds. Where do these new genes come from? How do gene duplication and subsequent divergence of sequence and structure affect the fitness of the organism? What role does regulation play in the evolution of proteins and folds? Emerging synergism between data and modeling provides first robust answers to these questions.
Centromere location in Arabidopsis is unaltered by extreme divergence in CENH3 protein sequence
2017-01-01
During cell division, spindle fibers attach to chromosomes at centromeres. The DNA sequence at regional centromeres is fast evolving with no conserved genetic signature for centromere identity. Instead CENH3, a centromere-specific histone H3 variant, is the epigenetic signature that specifies centromere location across both plant and animal kingdoms. Paradoxically, CENH3 is also adaptively evolving. An ongoing question is whether CENH3 evolution is driven by a functional relationship with the underlying DNA sequence. Here, we demonstrate that despite extensive protein sequence divergence, CENH3 histones from distant species assemble centromeres on the same underlying DNA sequence. We first characterized the organization and diversity of centromere repeats in wild-type Arabidopsis thaliana. We show that A. thaliana CENH3-containing nucleosomes exhibit a strong preference for a unique subset of centromeric repeats. These sequences are largely missing from the genome assemblies and represent the youngest and most homogeneous class of repeats. Next, we tested the evolutionary specificity of this interaction in a background in which the native A. thaliana CENH3 is replaced with CENH3s from distant species. Strikingly, we find that CENH3 from Lepidium oleraceum and Zea mays, although specifying epigenetically weaker centromeres that result in genome elimination upon outcrossing, show a binding pattern on A. thaliana centromere repeats that is indistinguishable from the native CENH3. Our results demonstrate positional stability of a highly diverged CENH3 on independently evolved repeats, suggesting that the sequence specificity of centromeres is determined by a mechanism independent of CENH3. PMID:28223399
Genome analysis and polar tube firing dynamics of mosquito-infecting microsporidia
USDA-ARS's Scientific Manuscript database
Microsporidia are highly divergent fungi that are obligate intracellular pathogens of a wide range of host organisms. Here we review recent findings from the genome sequences of mosquito-infecting microsporidian species Edhazardia aedis and Vavraia culicis, which show large differences in genome siz...
Pohl, Nélida; Sison-Mangus, Marilou P; Yee, Emily N; Liswi, Saif W; Briscoe, Adriana D
2009-01-01
Background The increase in availability of genomic sequences for a wide range of organisms has revealed gene duplication to be a relatively common event. Encounters with duplicate gene copies have consequently become almost inevitable in the context of collecting gene sequences for inferring species trees. Here we examine the effect of incorporating duplicate gene copies evolving at different rates on tree reconstruction and time estimation of recent and deep divergences in butterflies. Results Sequences from ultraviolet-sensitive (UVRh), blue-sensitive (BRh), and long-wavelength sensitive (LWRh) opsins,EF-1α and COI were obtained from 27 taxa representing the five major butterfly families (5535 bp total). Both BRh and LWRh are present in multiple copies in some butterfly lineages and the different copies evolve at different rates. Regardless of the phylogenetic reconstruction method used, we found that analyses of combined data sets using either slower or faster evolving copies of duplicate genes resulted in a single topology in agreement with our current understanding of butterfly family relationships based on morphology and molecules. Interestingly, individual analyses of BRh and LWRh sequences also recovered these family-level relationships. Two different relaxed clock methods resulted in similar divergence time estimates at the shallower nodes in the tree, regardless of whether faster or slower evolving copies were used, with larger discrepancies observed at deeper nodes in the phylogeny. The time of divergence between the monarch butterfly Danaus plexippus and the queen D. gilippus (15.3–35.6 Mya) was found to be much older than the time of divergence between monarch co-mimic Limenitis archippus and red-spotted purple L. arthemis (4.7–13.6 Mya), and overlapping with the time of divergence of the co-mimetic passionflower butterflies Heliconius erato and H. melpomene (13.5–26.1 Mya). 
Our family-level results are congruent with recent estimates found in the literature and indicate an age of 84–113 million years for the divergence of all butterfly families. Conclusion These results are consistent with diversification of the butterfly families following the radiation of angiosperms and suggest that some classes of opsin genes may be usefully employed for both phylogenetic reconstruction and divergence time estimation. PMID:19439087
Sequence space and the ongoing expansion of the protein universe.
Povolotskaya, Inna S; Kondrashov, Fyodor A
2010-06-17
The need to maintain the structural and functional integrity of an evolving protein severely restricts the repertoire of acceptable amino-acid substitutions. However, it is not known whether these restrictions impose a global limit on how far homologous protein sequences can diverge from each other. Here we explore the limits of protein evolution using sequence divergence data. We formulate a computational approach to study the rate of divergence of distant protein sequences and measure this rate for ancient proteins, those that were present in the last universal common ancestor. We show that ancient proteins are still diverging from each other, indicating an ongoing expansion of the protein sequence universe. The slow rate of this divergence is imposed by the sparseness of functional protein sequences in sequence space and the ruggedness of the protein fitness landscape: approximately 98 per cent of sites cannot accept an amino-acid substitution at any given moment but a vast majority of all sites may eventually be permitted to evolve when other, compensatory, changes occur. Thus, approximately 3.5 × 10^9 yr has not been enough to reach the limit of divergent evolution of proteins, and for most proteins the limit of sequence similarity imposed by common function may not exceed that of random sequences.
Centromere location in Arabidopsis is unaltered by extreme divergence in CENH3 protein sequence.
Maheshwari, Shamoni; Ishii, Takayoshi; Brown, C Titus; Houben, Andreas; Comai, Luca
2017-03-01
During cell division, spindle fibers attach to chromosomes at centromeres. The DNA sequence at regional centromeres is fast evolving with no conserved genetic signature for centromere identity. Instead CENH3, a centromere-specific histone H3 variant, is the epigenetic signature that specifies centromere location across both plant and animal kingdoms. Paradoxically, CENH3 is also adaptively evolving. An ongoing question is whether CENH3 evolution is driven by a functional relationship with the underlying DNA sequence. Here, we demonstrate that despite extensive protein sequence divergence, CENH3 histones from distant species assemble centromeres on the same underlying DNA sequence. We first characterized the organization and diversity of centromere repeats in wild-type Arabidopsis thaliana. We show that A. thaliana CENH3-containing nucleosomes exhibit a strong preference for a unique subset of centromeric repeats. These sequences are largely missing from the genome assemblies and represent the youngest and most homogeneous class of repeats. Next, we tested the evolutionary specificity of this interaction in a background in which the native A. thaliana CENH3 is replaced with CENH3s from distant species. Strikingly, we find that CENH3 from Lepidium oleraceum and Zea mays, although specifying epigenetically weaker centromeres that result in genome elimination upon outcrossing, show a binding pattern on A. thaliana centromere repeats that is indistinguishable from the native CENH3. Our results demonstrate positional stability of a highly diverged CENH3 on independently evolved repeats, suggesting that the sequence specificity of centromeres is determined by a mechanism independent of CENH3. © 2017 Maheshwari et al.; Published by Cold Spring Harbor Laboratory Press.
[Structural organization of 5S ribosomal DNA of Rosa rugosa].
Tynkevych, Iu O; Volkov, R A
2014-01-01
In order to clarify molecular organization of the genomic region encoding 5S rRNA in diploid species Rosa rugosa several 5S rDNA repeated units were cloned and sequenced. Analysis of the obtained sequences revealed that only one length variant of 5S rDNA repeated units, which contains intact promoter elements in the intergenic spacer region (IGS) and appears to be transcriptionally active is present in the genome. Additionally, a limited number of 5S rDNA pseudogenes lacking a portion of coding sequence and the complete IGS was detected. A high level of sequence similarity (from 93.7 to 97.5%) between the IGS of major 5S rDNA variants of East Asian R. rugosa and North American R. nitida was found indicating comparatively recent divergence of these species.
Workman, Rachael E; Myrka, Alexander M; Wong, G William; Tseng, Elizabeth; Welch, Kenneth C; Timp, Winston
2018-03-01
Hummingbirds oxidize ingested nectar sugars directly to fuel foraging but cannot sustain this fuel use during fasting periods, such as during the night or during long-distance migratory flights. Instead, fasting hummingbirds switch to oxidizing stored lipids that are derived from ingested sugars. The hummingbird liver plays a key role in moderating energy homeostasis and this remarkable capacity for fuel switching. Additionally, the liver is the principal location of de novo lipogenesis, which can occur at exceptionally high rates, such as during premigratory fattening. Yet understanding how this tissue and the whole organism moderate energy turnover is hampered by a lack of information regarding how relevant enzymes differ in sequence, expression, and regulation. We generated a de novo transcriptome of the hummingbird liver using PacBio full-length cDNA sequencing (Iso-Seq), yielding 8.6 Gb of sequencing data, or 2.6M reads from 4 different size fractions. We analyzed data using the SMRTAnalysis v3.1 Iso-Seq pipeline, then clustered isoforms into gene families to generate de novo gene contigs using Cogent. We performed orthology analysis to identify closely related sequences between our transcriptome and other avian and human gene sets. Finally, we closely examined homology of critical lipid metabolism genes between our transcriptome data and avian and human genomes. We confirmed high levels of sequence divergence within hummingbird lipogenic enzymes, suggesting a high probability of adaptive divergent function in the hepatic lipogenic pathways. Our results leverage cutting-edge technology and a novel bioinformatics pipeline to provide a first direct look at the transcriptome of this incredible organism.
Choudhary, Kumari S.; Mih, Nathan; Monk, Jonathan; Kavvas, Erol; Yurkovich, James T.; Sakoulas, George; Palsson, Bernhard O.
2018-01-01
Two-component systems (TCSs) consist of a histidine kinase and a response regulator. Here, we evaluated the conservation of the AgrAC TCS among 149 completely sequenced Staphylococcus aureus strains. It is composed of four genes: agrBDCA. We found that: (i) AgrAC system (agr) was found in all but one of the 149 strains, (ii) the agr positive strains were further classified into four agr types based on AgrD protein sequences, (iii) the four agr types not only specified the chromosomal arrangement of the agr genes but also the sequence divergence of AgrC histidine kinase protein, which confers signal specificity, (iv) the sequence divergence was reflected in distinct structural properties especially in the transmembrane region and second extracellular binding domain, and (v) there was a strong correlation between the agr type and the virulence genomic profile of the organism. Taken together, these results demonstrate that bioinformatic analysis of the agr locus leads to a classification system that correlates with the presence of virulence factors and protein structural properties. PMID:29887846
Smith, Jeramiah J; Kuraku, Shigehiro; Holt, Carson; Sauka-Spengler, Tatjana; Jiang, Ning; Campbell, Michael S; Yandell, Mark D; Manousaki, Tereza; Meyer, Axel; Bloom, Ona E; Morgan, Jennifer R; Buxbaum, Joseph D; Sachidanandam, Ravi; Sims, Carrie; Garruss, Alexander S; Cook, Malcolm; Krumlauf, Robb; Wiedemann, Leanne M; Sower, Stacia A; Decatur, Wayne A; Hall, Jeffrey A; Amemiya, Chris T; Saha, Nil R; Buckley, Katherine M; Rast, Jonathan P; Das, Sabyasachi; Hirano, Masayuki; McCurley, Nathanael; Guo, Peng; Rohner, Nicolas; Tabin, Clifford J; Piccinelli, Paul; Elgar, Greg; Ruffier, Magali; Aken, Bronwen L; Searle, Stephen MJ; Muffato, Matthieu; Pignatelli, Miguel; Herrero, Javier; Jones, Matthew; Brown, C Titus; Chung-Davidson, Yu-Wen; Nanlohy, Kaben G; Libants, Scot V; Yeh, Chu-Yin; McCauley, David W; Langeland, James A; Pancer, Zeev; Fritzsch, Bernd; de Jong, Pieter J; Zhu, Baoli; Fulton, Lucinda L; Theising, Brenda; Flicek, Paul; Bronner, Marianne E; Warren, Wesley C; Clifton, Sandra W; Wilson, Richard K; Li, Weiming
2013-01-01
Lampreys are representatives of an ancient vertebrate lineage that diverged from our own ~500 million years ago. By virtue of this deeply shared ancestry, the sea lamprey (P. marinus) genome is uniquely poised to provide insight into the ancestry of vertebrate genomes and the underlying principles of vertebrate biology. Here, we present the first lamprey whole-genome sequence and assembly. We note challenges faced owing to its high content of repetitive elements and GC bases, as well as the absence of broad-scale sequence information from closely related species. Analyses of the assembly indicate that two whole-genome duplications likely occurred before the divergence of ancestral lamprey and gnathostome lineages. Moreover, the results help define key evolutionary events within vertebrate lineages, including the origin of myelin-associated proteins and the development of appendages. The lamprey genome provides an important resource for reconstructing vertebrate origins and the evolutionary events that have shaped the genomes of extant organisms. PMID:23435085
Gómez, Africa; Serra, Manuel; Carvalho, Gary R; Lunt, David H
2002-07-01
Continental lake-dwelling zooplanktonic organisms have long been considered cosmopolitan species with little geographic variation in spite of the isolation of their habitats. Evidence of morphological cohesiveness and high dispersal capabilities supports this interpretation. However, this view has been challenged recently as many such species have been shown either to comprise cryptic species complexes or to exhibit marked population genetic differentiation and strong phylogeographic structuring at a regional scale. Here we investigate the molecular phylogeny of the cosmopolitan passively dispersing rotifer Brachionus plicatilis (Rotifera: Monogononta) species complex using nucleotide sequence variation from both nuclear (ribosomal internal transcribed spacer 1, ITS1) and mitochondrial (cytochrome c oxidase subunit I, COI) genes. Analysis of rotifer resting eggs from 27 salt lakes in the Iberian Peninsula plus lakes from four continents revealed nine genetically divergent lineages. The high level of sequence divergence, absence of hybridization, and extensive sympatry observed support the specific status of these lineages. Sequence divergence estimates indicate that the B. plicatilis complex began diversifying many millions of years ago, yet has shown relatively high levels of morphological stasis. We discuss these results in relation to the ecology and genetics of aquatic invertebrates possessing dispersive resting propagules and address the apparent contradiction between zooplanktonic population structure and their morphological stasis.
A little bit of sex matters for genome evolution in asexual plants.
Hojsgaard, Diego; Hörandl, Elvira
2015-01-01
Genome evolution in asexual organisms is theoretically expected to be shaped by various factors: first, hybrid origin and polyploidy confer a genomic constitution of highly heterozygous genotypes with multiple copies of genes; second, asexuality confers a lack of recombination and variation in populations, which reduces the efficiency of selection against deleterious mutations; hence, the accumulation of mutations and a gradual increase in mutational load (Muller's ratchet) would lead to rapid extinction of asexual lineages; third, allelic sequence divergence is expected to result in rapid divergence of lineages (Meselson effect). Recent transcriptome studies on the asexual polyploid complex Ranunculus auricomus using single-nucleotide polymorphisms confirmed neutral allelic sequence divergence within a short time frame, but rejected a hypothesis of a genome-wide accumulation of mutations in asexuals compared to sexuals, except for a few genes related to reproductive development. We discuss a general model in which the observed incidence of facultative sexuality in plants may unmask deleterious mutations with partial dominance and expose them efficiently to purging selection. A little bit of sex may help to avoid genomic decay and extinction.
Ribeiro, José R de A; Carvalho, Patrícia M B de; Cabral, Anderson de S; Macrae, Andrew; Mendonça-Hagler, Leda C S; Berbara, Ricardo L L; Hagler, Allen N
2011-10-01
A novel yeast species within the Metschnikowiaceae is described based on a strain from the sugarcane (Saccharum sp.) rhizoplane of an organically managed farm in Rio de Janeiro, Brazil. The D1/D2 domain of the large subunit ribosomal RNA gene sequence analysis showed that the closest related species were Candida tsuchiyae with 86.2% and Candida thailandica with 86.7% of sequence identity. All three are anamorphs in the Clavispora opuntiae clade. The name Candida middelhoveniana sp. nov. is proposed to accommodate this highly divergent organism with the type strain Instituto de Microbiologia, Universidade Federal do Rio de Janeiro (IMUFRJ) 51965(T) (=Centraalbureau voor Schimmelcultures (CBS) 12306(T), Universidade Federal de Minas Gerais (UFMG)-70(T), DBVPG 8031(T)) and the GenBank/EMBL/DDBJ accession number for the D1/D2 domain LSU rDNA sequence is FN428871. The Mycobank deposit number is MB 519801.
Comparing and combining distance-based and character-based approaches for barcoding turtles.
Reid, B N; LE, M; McCord, W P; Iverson, J B; Georges, A; Bergmann, T; Amato, G; Desalle, R; Naro-Maciel, E
2011-11-01
Molecular barcoding can serve as a powerful tool in wildlife forensics and may prove to be a vital aid in conserving organisms that are threatened by illegal wildlife trade, such as turtles (Order Testudines). We produced cytochrome oxidase subunit one (COI) sequences (650 bp) for 174 turtle species and combined these with publicly available sequences for 50 species to produce a data set representative of the breadth of the order. Variability within the barcode region was assessed, and the utility of both distance-based and character-based methods for species identification was evaluated. For species in which genetic material from more than one individual was available (n = 69), intraspecific divergences were 1.3% on average, although divergences greater than the customary 2% barcode threshold occurred within 15 species. High intraspecific divergences could indicate species with a high degree of internal genetic structure or possibly even cryptic species, although introgression is also probable in some of these taxa. Divergences between species of the same genus were 6.4% on average; however, 49 species were <2% divergent from congeners. Low levels of interspecific divergence could be caused by recent evolutionary radiations coupled with the low rates of mtDNA evolution previously observed in turtles. Complementing distance-based barcoding with character-based methods for identifying diagnostic sets of nucleotides provided better resolution in several cases where distance-based methods failed to distinguish species. An online identification engine was created to provide character-based identifications. This study constitutes the first comprehensive barcoding effort for this seriously threatened order. © 2011 Blackwell Publishing Ltd.
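The distance-based step described above can be illustrated with a minimal Python sketch. It assumes an uncorrected p-distance (the proportion of differing sites) between two pre-aligned COI sequences; the abstract does not state which distance measure the authors used, so this choice, the function names, and the handling of gaps are illustrative assumptions only. The 2% value is the customary barcode threshold mentioned in the abstract.

```python
BARCODE_THRESHOLD = 0.02  # customary 2% barcode threshold cited in the abstract


def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped
    (an assumption of this sketch, not a detail taken from the study).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "ACGT" and b in "ACGT":
            compared += 1
            if a != b:
                differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def exceeds_threshold(seq_a: str, seq_b: str,
                      threshold: float = BARCODE_THRESHOLD) -> bool:
    """True when the pairwise divergence is strictly above the threshold."""
    return p_distance(seq_a, seq_b) > threshold
```

For a 650 bp barcode, 13 differing sites correspond to exactly 2% divergence, so under this sketch a pair would need at least 14 differences to exceed the threshold.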
Zill, Oliver A.; Scannell, Devin R.; Kuei, Jeffrey; Sadhu, Meru; Rine, Jasper
2012-01-01
The genetic bases for species-specific traits are widely sought, but reliable experimental methods with which to identify functionally divergent genes are lacking. In the Saccharomyces genus, interspecies complementation tests can be used to evaluate functional conservation and divergence of biological pathways or networks. Silent information regulator (SIR) proteins in S. bayanus provide an ideal test case for this approach because they show remarkable divergence in sequence and paralog number from those found in the closely related S. cerevisiae. We identified genes required for silencing in S. bayanus using a genetic screen for silencing-defective mutants. Complementation tests in interspecies hybrids identified an evolutionarily conserved Sir-protein-based silencing machinery, as defined by two interspecies complementation groups (SIR2 and SIR3). However, recessive mutations in S. bayanus SIR4 isolated from this screen could not be complemented by S. cerevisiae SIR4, revealing species-specific functional divergence in the Sir4 protein despite conservation of the overall function of the Sir2/3/4 complex. A cladistic complementation series localized the occurrence of functional changes in SIR4 to the S. cerevisiae and S. paradoxus branches of the Saccharomyces phylogeny. Most of this functional divergence mapped to sequence changes in the Sir4 PAD. Finally, a hemizygosity modifier screen in the interspecies hybrids identified additional genes involved in S. bayanus silencing. Thus, interspecies complementation tests can be used to identify (1) mutations in genetically underexplored organisms, (2) loci that have functionally diverged between species, and (3) evolutionary events of functional consequence within a genus. PMID:22923378
Sikorav, J L; Duval, N; Anselmet, A; Bon, S; Krejci, E; Legay, C; Osterlund, M; Reimund, B; Massoulié, J
1988-01-01
In this paper, we show the existence of alternative splicing in the 3' region of the coding sequence of Torpedo acetylcholinesterase (AChE). We describe two cDNA structures which both diverge from the previously described coding sequence of the catalytic subunit of asymmetric (A) forms (Schumacher et al., 1986; Sikorav et al., 1987). They both contain a coding sequence followed by a non-coding sequence and a poly(A) stretch. Both of these structures were shown to exist in poly(A)+ RNAs, by S1 mapping experiments. The divergent region encoded by the first sequence corresponds to the precursor of the globular dimeric form (G2a), since it contains the expected C-terminal amino acids, Ala-Cys. These amino acids are followed by a 29 amino acid extension which contains a hydrophobic segment and must be replaced by a glycolipid in the mature protein. Analyses of intact G2a AChE showed that the common domain of the protein contains intersubunit disulphide bonds. The divergent region of the second type of cDNA consists of an adjacent genomic sequence, which is removed as an intron in A and Ga mRNAs, but may encode a distinct, less abundant catalytic subunit. The structures of the cDNA clones indicate that they are derived from minor mRNAs, shorter than the three major transcripts which have been described previously (14.5, 10.5 and 5.5 kb). Oligonucleotide probes specific for the asymmetric and globular terminal regions hybridize with the three major transcripts, indicating that their size is determined by 3'-untranslated regions which are not related to the differential splicing leading to A and Ga forms. PMID:3181125
Conceptual issues in Bayesian divergence time estimation
2016-01-01
Bayesian inference of species divergence times is an unusual statistical problem, because the divergence time parameters are not identifiable unless both fossil calibrations and sequence data are available. Commonly used marginal priors on divergence times derived from fossil calibrations may conflict with node order on the phylogenetic tree causing a change in the prior on divergence times for a particular topology. Care should be taken to avoid confusing this effect with changes due to informative sequence data. This effect is illustrated with examples. A topology-consistent prior that preserves the marginal priors is defined and examples are constructed. Conflicts between fossil calibrations and relative branch lengths (based on sequence data) can cause estimates of divergence times that are grossly incorrect, yet have a narrow posterior distribution. An example of this effect is given; it is recommended that overly narrow posterior distributions of divergence times should be carefully scrutinized. This article is part of the themed issue ‘Dating species divergences using rocks and clocks’. PMID:27325831
Conceptual issues in Bayesian divergence time estimation.
Rannala, Bruce
2016-07-19
Bayesian inference of species divergence times is an unusual statistical problem, because the divergence time parameters are not identifiable unless both fossil calibrations and sequence data are available. Commonly used marginal priors on divergence times derived from fossil calibrations may conflict with node order on the phylogenetic tree causing a change in the prior on divergence times for a particular topology. Care should be taken to avoid confusing this effect with changes due to informative sequence data. This effect is illustrated with examples. A topology-consistent prior that preserves the marginal priors is defined and examples are constructed. Conflicts between fossil calibrations and relative branch lengths (based on sequence data) can cause estimates of divergence times that are grossly incorrect, yet have a narrow posterior distribution. An example of this effect is given; it is recommended that overly narrow posterior distributions of divergence times should be carefully scrutinized. This article is part of the themed issue 'Dating species divergences using rocks and clocks'. © 2016 The Author(s).
An improved approximate-Bayesian model-choice method for estimating shared evolutionary history
2014-01-01
Background To understand biological diversification, it is important to account for large-scale processes that affect the evolutionary history of groups of co-distributed populations of organisms. Such events predict temporally clustered divergence times, a pattern that can be estimated using genetic data from co-distributed species. I introduce a new approximate-Bayesian method for comparative phylogeographical model-choice that estimates the temporal distribution of divergences across taxa from multi-locus DNA sequence data. The model is an extension of that implemented in msBayes. Results By reparameterizing the model, introducing more flexible priors on demographic and divergence-time parameters, and implementing a non-parametric Dirichlet-process prior over divergence models, I improved the robustness, accuracy, and power of the method for estimating shared evolutionary history across taxa. Conclusions The results demonstrate that the improved performance of the new method is due to (1) more appropriate priors on divergence-time and demographic parameters that avoid prohibitively small marginal likelihoods for models with more divergence events, and (2) the Dirichlet-process providing a flexible prior on divergence histories that does not strongly disfavor models with intermediate numbers of divergence events. The new method yields more robust estimates of posterior uncertainty, and thus greatly reduces the tendency to incorrectly estimate models of shared evolutionary history with strong support. PMID:24992937
Yamada, Kazuhiko; Nishida-Umehara, Chizuko; Matsuda, Yoichi
2004-03-01
We isolated a new family of satellite DNA sequences from HaeIII- and EcoRI-digested genomic DNA of the Blakiston's fish owl ( Ketupa blakistoni). The repetitive sequences were organized in tandem arrays of the 174 bp element, and localized to the centromeric regions of all macrochromosomes, including the Z and W chromosomes, and microchromosomes. This hybridization pattern was consistent with the distribution of C-band-positive centromeric heterochromatin, and the satellite DNA sequences occupied 10% of the total genome as a major component of centromeric heterochromatin. The sequences were homogenized between macro- and microchromosomes in this species, and therefore intraspecific divergence of the nucleotide sequences was low. The 174 bp element cross-hybridized to the genomic DNA of six other Strigidae species, but not to that of the Tytonidae, suggesting that the satellite DNA sequences are conserved in the same family but fairly divergent between the different families in the Strigiformes. Secondly, the centromeric satellite DNAs were cloned from eight Strigidae species, and the nucleotide sequences of 41 monomer fragments were compared within and between species. Molecular phylogenetic relationships of the nucleotide sequences were highly correlated with both the taxonomy based on morphological traits and the phylogenetic tree constructed by DNA-DNA hybridization. These results suggest that the satellite DNA sequence has evolved by concerted evolution in the Strigidae and that it is a good taxonomic and phylogenetic marker to examine genetic diversity between Strigiformes species.
Proudhon, D; Wei, J; Briat, J; Theil, E C
1996-03-01
Ferritin, a protein widespread in nature, concentrates iron approximately 10^11- to 10^12-fold above the solubility within a spherical shell of 24 subunits; it derives in plants and animals from a common ancestor (based on sequence) but displays a cytoplasmic location in animals compared to the plastid in contemporary plants. Ferritin gene regulation in plants and animals is altered by development, hormones, and excess iron; iron signals target DNA in plants but mRNA in animals. Evolution has thus conserved the two end points of ferritin gene expression, the physiological signals and the protein structure, while allowing some divergence of the genetic mechanisms. Comparison of ferritin gene organization in plants and animals, made possible by the cloning of a dicot (soybean) ferritin gene presented here and the recent cloning of two monocot (maize) ferritin genes, shows evolutionary divergence in ferritin gene organization between plants and animals but conservation among plants or among animals; divergence in the genetic mechanism for iron regulation is reflected by the absence in all three plant genes of the IRE, a highly conserved, noncoding sequence in vertebrate animal ferritin mRNA. In plant ferritin genes, the number of introns (n = 7) is higher than in animals (n = 3). Second, no intron positions are conserved when ferritin genes of plants and animals are compared, although all ferritin gene introns are in the coding region; within kingdoms, the intron positions in ferritin genes are conserved. Finally, secondary protein structure has no apparent relationship to intron/exon boundaries in plant ferritin genes, whereas in animal ferritin genes the correspondence is high. 
The structural differences in introns/exons among phylogenetically related ferritin coding sequences and the high conservation of the gene structure within plant or animal kingdoms suggest that kingdom-specific functional constraints may exist to maintain a particular intron/exon pattern within ferritin genes. In the case of plants, where ferritin gene intron placement is unrelated to triplet codons or protein structure, and where ferritin is targeted to the plastid, the selection pressure on gene organization may relate to RNA function and plastid/nuclear signaling.
Workman, Rachael E; Myrka, Alexander M; Wong, G William; Tseng, Elizabeth
2018-01-01
Abstract Background Hummingbirds oxidize ingested nectar sugars directly to fuel foraging but cannot sustain this fuel use during fasting periods, such as during the night or during long-distance migratory flights. Instead, fasting hummingbirds switch to oxidizing stored lipids that are derived from ingested sugars. The hummingbird liver plays a key role in moderating energy homeostasis and this remarkable capacity for fuel switching. Additionally, the liver is the principal location of de novo lipogenesis, which can occur at exceptionally high rates, such as during premigratory fattening. Yet understanding how this tissue and the whole organism moderate energy turnover is hampered by a lack of information regarding how relevant enzymes differ in sequence, expression, and regulation. Findings We generated a de novo transcriptome of the hummingbird liver using PacBio full-length cDNA sequencing (Iso-Seq), yielding 8.6 Gb of sequencing data, or 2.6M reads from 4 different size fractions. We analyzed data using the SMRTAnalysis v3.1 Iso-Seq pipeline, then clustered isoforms into gene families to generate de novo gene contigs using Cogent. We performed orthology analysis to identify closely related sequences between our transcriptome and other avian and human gene sets. Finally, we closely examined homology of critical lipid metabolism genes between our transcriptome data and avian and human genomes. Conclusions We confirmed high levels of sequence divergence within hummingbird lipogenic enzymes, suggesting a high probability of adaptive divergent function in the hepatic lipogenic pathways. Our results leverage cutting-edge technology and a novel bioinformatics pipeline to provide a first direct look at the transcriptome of this incredible organism. PMID:29618047
Perina, Alejandra; Seoane, David; González-Tizón, Ana M; Rodríguez-Fariña, Fernanda; Martínez-Lage, Andrés
2011-10-17
In higher eukaryotes, the 5S ribosomal DNA (5S rDNA) is organized in tandem arrays with repeat units that consist of a transcribed region (5S) and a variable nontranscribed spacer (NTS). Until recently the 5S rDNA was thought to be subject to concerted evolution; however, in several taxa, sequence divergence levels between the 5S and the NTS were found to be higher than expected under this model. Accordingly, many studies have shown that birth-and-death processes and selection can drive the evolution of 5S rDNA. Analyses of 5S rDNA evolution have found several 5S rDNA types in the genome, with low levels of nucleotide variation in the 5S region and a highly divergent spacer region. Molecular organization and nucleotide sequence of the 5S ribosomal DNA multigene family (5S rDNA) were investigated in three Pollicipes species in an evolutionary context. The nucleotide sequence variation revealed that several 5S rDNA variants occur in Pollicipes genomes. They are clustered in up to seven different types based on differences in their nontranscribed spacers (NTS). Five different units of 5S rDNA were characterized in P. pollicipes and two different units in P. elegans and P. polymerus. Analysis of these sequences showed that identical types were shared among species and that two pseudogenes were present. We predicted the secondary structure and characterized the upstream and downstream conserved elements. Phylogenetic analysis showed an among-species clustering pattern of 5S rDNA types. These results suggest that the evolution of Pollicipes 5S rDNA is driven by birth-and-death processes with strong purifying selection.
2011-01-01
Background In higher eukaryotes, the 5S ribosomal DNA (5S rDNA) is organized in tandem arrays with repeat units that consist of a transcribed region (5S) and a variable nontranscribed spacer (NTS). Until recently the 5S rDNA was thought to be subject to concerted evolution; however, in several taxa, sequence divergence levels between the 5S and the NTS were found to be higher than expected under this model. Accordingly, many studies have shown that birth-and-death processes and selection can drive the evolution of 5S rDNA. Analyses of 5S rDNA evolution have found several 5S rDNA types in the genome, with low levels of nucleotide variation in the 5S region and a highly divergent spacer region. Molecular organization and nucleotide sequence of the 5S ribosomal DNA multigene family (5S rDNA) were investigated in three Pollicipes species in an evolutionary context. Results The nucleotide sequence variation revealed that several 5S rDNA variants occur in Pollicipes genomes. They are clustered in up to seven different types based on differences in their nontranscribed spacers (NTS). Five different units of 5S rDNA were characterized in P. pollicipes and two different units in P. elegans and P. polymerus. Analysis of these sequences showed that identical types were shared among species and that two pseudogenes were present. We predicted the secondary structure and characterized the upstream and downstream conserved elements. Phylogenetic analysis showed an among-species clustering pattern of 5S rDNA types. Conclusions These results suggest that the evolution of Pollicipes 5S rDNA is driven by birth-and-death processes with strong purifying selection. PMID:22004418
Kim, Young-Kyu; Park, Chong-wook; Kim, Ki-Joong
2009-03-31
The chloroplast DNA sequences of Megaleranthis saniculifolia, an endemic and monotypic endangered plant species, were completed in this study (GenBank FJ597983). The genome is 159,924 bp in length. It harbors a pair of IR regions consisting of 26,608 bp each. The lengths of the LSC and SSC regions are 88,326 bp and 18,382 bp, respectively. The structural organizations, gene and intron contents, gene orders, AT contents, codon usages, and transcription units of the Megaleranthis chloroplast genome are similar to those of typical land plant cp DNAs. However, the detailed features of Megaleranthis chloroplast genomes are substantially different from that of Ranunculus, which belongs to the same family, the Ranunculaceae. First, the Megaleranthis cp DNA was 4,797 bp longer than that of Ranunculus due to an expanded IR region into the SSC region and duplicated sequence elements in several spacer regions of the Megaleranthis cp genome. Second, the chloroplast genomes of Megaleranthis and Ranunculus evidence 5.6% sequence divergence in the coding regions, 8.9% sequence divergence in the intron regions, and 18.7% sequence divergence in the intergenic spacer regions, respectively. In both the coding and noncoding regions, average nucleotide substitution rates differed markedly, depending on the genome position. Our data strongly implicate the positional effects of the evolutionary modes of chloroplast genes. The genes evidencing higher levels of base substitutions also have higher incidences of indel mutations and low Ka/Ks ratios. A total of 54 simple sequence repeat loci were identified from the Megaleranthis cp genome. The existence of rich cp SSR loci in the Megaleranthis cp genome provides a rare opportunity to study the population genetic structures of this endangered species. Our phylogenetic trees based on the two independent markers, the nuclear ITS and chloroplast matK sequences, strongly support the inclusion of the Megaleranthis to the Trollius. 
Therefore, our molecular trees support Ohwi's original treatment of Megaleranthis saniculifolia as Trollius chosenensis Ohwi.
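The region lengths reported in this abstract can be checked against the stated genome size with a short Python sketch. It assumes the usual quadripartite chloroplast layout (one LSC, one SSC, and two copies of the IR); the constant and function names are illustrative, and the Ranunculus length computed here is only the value implied by subtracting the reported 4,797 bp difference, not a figure given in the abstract.

```python
# Values reported in the abstract (bp).
LSC_BP = 88_326            # large single-copy region
SSC_BP = 18_382            # small single-copy region
IR_BP = 26_608             # each of the two inverted-repeat copies
REPORTED_TOTAL_BP = 159_924
EXCESS_OVER_RANUNCULUS_BP = 4_797


def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total genome length for one LSC, one SSC and a pair of IR copies."""
    return lsc + ssc + 2 * ir


total_bp = quadripartite_length(LSC_BP, SSC_BP, IR_BP)

# Implied by subtraction only; the abstract does not state this number.
implied_ranunculus_bp = REPORTED_TOTAL_BP - EXCESS_OVER_RANUNCULUS_BP

print(total_bp, total_bp == REPORTED_TOTAL_BP)  # 159924 True
print(implied_ranunculus_bp)                     # 155127
```

The sum of the four regions reproduces the reported 159,924 bp, so the figures in the abstract are internally consistent.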
Bhatia, S; Singh Negi, M; Lakshmikumaran, M
1996-11-01
EcoRI restriction of the B. nigra rDNA recombinants, isolated from a lambda genomic library, showed that the 3.9-kb fragment corresponded to the Intergenic Spacer (IGS), which was sequenced and found to be 3,928 bp in size. Sequence and dot-matrix analyses showed that the organization of the B. nigra rDNA IGS was typical of most rDNA spacers, consisting of a central repetitive region and flanking unique sequences on either side. The repetitive region was composed of two repeat families, RF 'A' and RF 'B'. The B. nigra RF 'A' consisted of a tandem array of three full-length copies of a 106-bp sequence element. RF 'B' was composed of 66 tandemly repeated elements. Each 'B' element was only 21 bp in size, and this is the smallest repeat unit identified in plant rDNA to date. The putative transcription initiation site (TIS) was identified as nucleotide position 3,110. Based on the sequence analysis it was suggested that the present organization of the repeat families was generated by successive cycles of deletions and amplifications and was being maintained by homogenization processes such as gene conversion and crossing-over. A detailed comparison of the rDNA IGS sequences of the three diploid Brassica species, namely B. nigra, B. campestris, and B. oleracea, was carried out. First, comparisons revealed that B. campestris and B. oleracea were close to each other, as the repeat families in both showed high sequence homology. Second, the repeat elements in both species were organized in an interspersed manner. Third, a 52-bp sequence, present just downstream of the repeats in B. campestris, was found to be identical to the B. oleracea repeats, thereby suggesting a common progenitor. On the other hand, in B. nigra no interspersion pattern of organization of repeats was observed. Further, the B. nigra RF 'A' was identified as distinct from the repeat families of B. campestris and B. oleracea. 
Based on this analysis, it was suggested that during speciation B. campestris and B. oleracea evolved in one lineage whereas B. nigra diverged into a separate lineage. The comparative analysis of the IGS helped in identifying not only conserved ancestral sequence motifs of possible functional significance such as promoters and enhancers, but also sequences which showed variation between the three diploid species and were therefore identified as species-specific sequences.
2012-01-01
Background Adaptive divergence driven by environmental heterogeneity has long been a fascinating topic in ecology and evolutionary biology. The study of the genetic basis of adaptive divergence has, however, been greatly hampered by a lack of genomic information. The recent development of transcriptome sequencing provides an unprecedented opportunity to generate large amounts of genomic data for detailed investigations of the genetics of adaptive divergence in non-model organisms. Herein, we used the Illumina sequencing platform to sequence the transcriptome of brain and liver tissues from a single individual of the Vinous-throated Parrotbill, Paradoxornis webbianus bulomachus, an ecologically important avian species in Taiwan with a wide elevational range of sea level to 3100 m. Results Our 10.1 Gbp of sequences were first assembled based on Zebra Finch (Taeniopygia guttata) and chicken (Gallus gallus) RNA references. The remaining reads were then de novo assembled. After filtering out contigs with low coverage (<10X), we retained 67,791 of 487,336 contigs, which covered approximately 5.3% of the P. w. bulomachus genome. Of 7,779 contigs retained for a top-hit species distribution analysis, the majority (about 86%) were matched to known Zebra Finch and chicken transcripts. We also annotated 6,365 contigs to gene ontology (GO) terms: in total, 122 GO-slim terms were assigned, including biological process (41%), molecular function (32%), and cellular component (27%). Many potential genetic markers for future adaptive genomic studies were also identified: 8,589 single nucleotide polymorphisms, 1,344 simple sequence repeats and 109 candidate genes that might be involved in elevational or climate adaptation. Conclusions Our study shows that transcriptome data can serve as a rich genetic resource, even for a single run of short-read sequencing from a single individual of a non-model species. 
This is the first study providing transcriptomic information for species in the avian superfamily Sylvioidea, which comprises more than 1,000 species. Our data can be used to study adaptive divergence in heterogeneous environments and investigate other important ecological and evolutionary questions in parrotbills from different populations and even in other species in the Sylvioidea. PMID:22530590
Swaggart, Kayleigh A.; Pavlicev, Mihaela; Muglia, Louis J.
2015-01-01
The molecular mechanisms controlling human birth timing at term, or resulting in preterm birth, have been the focus of considerable investigation, but limited insights have been gained over the past 50 years. In part, these processes have remained elusive because of divergence in reproductive strategies and physiology shown by model organisms, making extrapolation to humans uncertain. Here, we summarize the evolution of progesterone signaling and variation in pregnancy maintenance and termination. We use this comparative physiology to support the hypothesis that selective pressure on genomic loci involved in the timing of parturition has shaped human birth timing, and that these loci can be identified with comparative genomic strategies. Previous limitations imposed by divergence of mechanisms provide an important new opportunity to elucidate fundamental pathways of parturition control through increasing availability of sequenced genomes and associated reproductive physiology characteristics across diverse organisms. PMID:25646385
Sperm Bindin Divergence under Sexual Selection and Concerted Evolution in Sea Stars.
Patiño, Susana; Keever, Carson C; Sunday, Jennifer M; Popovic, Iva; Byrne, Maria; Hart, Michael W
2016-08-01
Selection associated with competition among males or sexual conflict between mates can create positive selection for high rates of molecular evolution of gamete recognition genes and lead to reproductive isolation between species. We analyzed coding sequence and repetitive domain variation in the gene encoding the sperm acrosomal protein bindin in 13 diverse sea star species. We found that bindin has a conserved coding sequence domain structure in all 13 species, with several repeated motifs in a large central region that is similar among all sea stars in organization but highly divergent among genera in nucleotide and predicted amino acid sequence. More bindin codons and lineages showed positive selection for high relative rates of amino acid substitution in genera with gonochoric outcrossing adults (and greater expected strength of sexual selection) than in selfing hermaphrodites. That difference is consistent with the expectation that selfing (a highly derived mating system) may moderate the strength of sexual selection and limit the accumulation of bindin amino acid differences. The results implicate both positive selection on single codons and concerted evolution within the repetitive region in bindin divergence, and suggest that both single amino acid differences and repeat differences may affect sperm-egg binding and reproductive compatibility. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
RNA regulators responding to ribosomal protein S15 are frequent in sequence space
Slinger, Betty L.; Meyer, Michelle M.
2016-01-01
There are several natural examples of distinct RNA structures that interact with the same ligand to regulate the expression of homologous genes in different organisms. One essential question regarding this phenomenon is whether such RNA regulators are the result of convergent or divergent evolution. Are the RNAs derived from some common ancestor and diverged to the point where we cannot identify the similarity, or have multiple solutions to the same biological problem arisen independently? A key variable in assessing these alternatives is how frequently such regulators arise within sequence space. Ribosomal protein S15 is autogenously regulated via an RNA regulator in many bacterial species; four apparently distinct regulators have been functionally validated in different bacterial phyla. Here, we explore how frequently such regulators arise within a partially randomized sequence population. We find many RNAs that interact specifically with ribosomal protein S15 from Geobacillus kaustophilus with biologically relevant dissociation constants. Furthermore, of the six sequences we characterize, four show regulatory activity in an Escherichia coli reporter assay. Subsequent footprinting and mutagenesis analysis indicates that protein binding proximal to regulatory features such as the Shine–Dalgarno sequence is sufficient to enable regulation, suggesting that regulation in response to S15 is relatively easily acquired. PMID:27580716
Makowsky, Robert; Cox, Christian L; Roelke, Corey; Chippindale, Paul T
2010-11-01
Determining the appropriate gene for phylogeny reconstruction can be a difficult process. Rapidly evolving genes tend to resolve recent relationships, but suffer from alignment issues and increased homoplasy among distantly related species. Conversely, slowly evolving genes generally perform best for deeper relationships, but lack sufficient variation to resolve recent relationships. We determine the relationship between sequence divergence and Bayesian phylogenetic reconstruction ability using both natural and simulated datasets. The natural data are based on 28 well-supported relationships within the subphylum Vertebrata. Sequences of 12 genes were acquired and Bayesian analyses were used to determine phylogenetic support for correct relationships. Simulated datasets were designed to determine whether an optimal range of sequence divergence exists across extreme phylogenetic conditions. Across all genes we found that an optimal range of divergence for resolving the correct relationships does exist, although, as expected, this level of divergence depends on the distance metric. Simulated datasets show that an optimal range of sequence divergence exists across diverse topologies and models of evolution. We determine that a simple-to-measure property of genetic sequences (genetic distance) is related to phylogenetic reconstruction ability in Bayesian analyses. This information should be useful for selecting the most informative gene to resolve any relationships, especially those that are difficult to resolve, as well as minimizing both cost and confounding information during project design. Copyright © 2010. Published by Elsevier Inc.
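The remark that the optimal divergence level depends on the distance metric can be illustrated with a short sketch. It assumes the Jukes-Cantor (JC69) correction as the alternative to the raw proportion of differing sites; the abstract does not say which metrics were compared, so the function name and the example values are illustrative only.

```python
import math

def jc69_distance(p):
    """Jukes-Cantor corrected distance for an observed proportion p of differing sites.

    d = -(3/4) * ln(1 - 4p/3); undefined (saturated) once p reaches 0.75.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a proportion between 0 and 1")
    if p >= 0.75:
        return math.inf  # sequences are saturated under this model
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

# Illustrative proportions only: the corrected distance grows faster than p,
# so a "best" divergence window shifts depending on which metric is reported.
for p in (0.05, 0.30, 0.60):
    print(p, jc69_distance(p))
```

The raw and corrected values nearly agree for small p and separate sharply as p approaches saturation, which is one reason a stated optimal range only makes sense together with its metric.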
Zardus, John D; Etter, Ron J; Chase, Michael R; Rex, Michael A; Boyle, Elizabeth E
2006-03-01
The deep-sea soft-sediment environment hosts a diverse and highly endemic fauna of uncertain origin. We know little about how this fauna evolved because geographic patterns of genetic variation, the essential information for inferring patterns of population differentiation and speciation, are poorly understood. Using formalin-fixed specimens from archival collections, we quantify patterns of genetic variation in the protobranch bivalve Deminucula atacellana, a species widespread throughout the Atlantic Ocean at bathyal and abyssal depths. Samples were taken from 18 localities in the North American, West European and Argentine basins. A hypervariable region of mitochondrial 16S rDNA was amplified by polymerase chain reaction (PCR) and sequenced from 130 individuals, revealing 21 haplotypes. With several important exceptions, haplotypes are unique to each basin. Overall gene diversity is high (h = 0.73) with pronounced population structure (Phi(ST) = 0.877) and highly significant geographic associations (P < 0.0001). Sequences cluster into four major clades corresponding to differences in geography and depth. Genetic divergence was much greater among populations at different depths within the same basin than among those at similar depths but separated by thousands of kilometres. Isolation by distance probably explains much of the interbasin variation. Depth-related divergence may reflect historical patterns of colonization or strong environmental selective gradients. Broadly distributed deep-sea organisms can possess highly genetically divergent populations, despite the lack of any morphological divergence.
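For readers unfamiliar with the gene diversity statistic (h) quoted above, the following is a minimal sketch. It assumes Nei's unbiased estimator, h = n/(n-1) * (1 - sum of squared haplotype frequencies); the abstract does not state which estimator was used, and the sample labels are invented.

```python
from collections import Counter

def haplotype_diversity(haplotypes):
    """Nei's unbiased gene diversity: h = n/(n-1) * (1 - sum(p_i^2)).

    `haplotypes` is one haplotype label per sampled individual.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sampled individuals")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)

# Hypothetical sample: four individuals carrying three haplotypes.
sample = ["A", "A", "B", "C"]
print(haplotype_diversity(sample))
```

A value near 0 means almost every individual shares one haplotype; a value near 1 means almost every individual carries a different one.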
Kullback Leibler divergence in complete bacterial and phage genomes
Akhter, Sajia; Aziz, Ramy K.; Kashef, Mona T.; Ibrahim, Eslam S.; Bailey, Barbara; Edwards, Robert A.
2017-01-01
The amino acid content of the proteins encoded by a genome may predict the coding potential of that genome and may reflect lifestyle restrictions of the organism. Here, we calculated the Kullback–Leibler divergence from the mean amino acid content as a metric to compare the amino acid composition for a large set of bacterial and phage genome sequences. Using these data, we demonstrate that (i) there is a significant difference between amino acid utilization in different phylogenetic groups of bacteria and phages; (ii) many of the bacteria with the most skewed amino acid utilization profiles, or the bacteria that host phages with the most skewed profiles, are endosymbionts or parasites; (iii) the skews in the distribution are not restricted to certain metabolic processes but are common across all bacterial genomic subsystems; (iv) amino acid utilization profiles strongly correlate with GC content in bacterial genomes but very weakly correlate with the G+C percent in phage genomes. These findings might be exploited to distinguish coding from non-coding sequences in large data sets, such as metagenomic sequence libraries, to help in prioritizing subsequent analyses. PMID:29204318
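The metric described above, the Kullback-Leibler divergence of a genome's amino acid composition from the mean composition, can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the pseudocount, the base-2 logarithm, the unweighted mean across genomes, and all function names are choices made here.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def aa_frequencies(proteins, pseudocount=1.0):
    """Amino acid frequencies over all proteins of one genome (pseudocount is an assumption)."""
    counts = Counter()
    for protein in proteins:
        counts.update(ch for ch in protein.upper() if ch in AMINO_ACIDS)
    total = sum(counts[a] for a in AMINO_ACIDS) + pseudocount * len(AMINO_ACIDS)
    return {a: (counts[a] + pseudocount) / total for a in AMINO_ACIDS}

def kl_divergence(p, q):
    """D_KL(p || q) in bits; both arguments map each amino acid to a probability."""
    return sum(p[a] * math.log2(p[a] / q[a]) for a in AMINO_ACIDS if p[a] > 0)

def divergence_from_mean(genomes):
    """KL divergence of each genome's composition from the unweighted mean composition."""
    freqs = {name: aa_frequencies(prots) for name, prots in genomes.items()}
    mean = {a: sum(f[a] for f in freqs.values()) / len(freqs) for a in AMINO_ACIDS}
    return {name: kl_divergence(f, mean) for name, f in freqs.items()}

# Hypothetical toy "genomes": lists of protein sequences.
toy = {"genome_1": ["MKKLLA", "MAAGG"], "genome_2": ["MWWYYF", "MCCHH"]}
print(divergence_from_mean(toy))
```

Genomes whose composition matches the mean score close to zero; skewed compositions, such as those the abstract associates with endosymbionts and parasites, would score higher.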
Obanda, Vincent; Michuki, George; Jowers, Michael J; Rumberia, Cecilia; Mutinda, Mathew; Lwande, Olivia Wesula; Wangoru, Kihara; Kasiiti-Orengo, Jacquiline; Yongo, Moses; Angelone-Alasaad, Samer
2016-07-01
Following mass deaths of Laughing Doves (Streptopelia senegalensis) in different localities throughout Kenya, internal organs obtained during necropsy of two moribund birds were sampled and analyzed by next generation sequencing. We isolated the virulent strain of pigeon paramyxovirus type-1 (PPMV-1), PPMV1/Laughing Dove/Kenya/Isiolo/B2/2012, which had a characteristic fusion gene motif (110)GGRRQKRF(117). We obtained a partial full genome of 15,114 nucleotides. The phylogenetic relationship based on the fusion gene and genomic sequence grouped our isolate as class II genotype VI, a group of viruses commonly isolated from wild birds but potentially lethal to chickens (Gallus gallus domesticus). The fusion gene isolate clustered with PPMV-1 strains from pigeons (Columbidae) in Nigeria. The complete genome showed a basal and highly divergent lineage to American, European, and Asian strains, indicating a divergent evolutionary pathway. The isolated strain is highly virulent and apparently species-specific to Laughing Doves in Kenya. Risk of transmission of such a strain to poultry is potentially high whereas the cyclic epizootic in doves is a threat to conservation of wild Columbidae in Kenya.
Ashworth, Justin; Plaisier, Christopher L.; Lo, Fang Yin; Reiss, David J.; Baliga, Nitin S.
2014-01-01
Widespread microbial genome sequencing presents an opportunity to understand the gene regulatory networks of non-model organisms. This requires knowledge of the binding sites for transcription factors whose DNA-binding properties are unknown or difficult to infer. We adapted a protein structure-based method to predict the specificities and putative regulons of homologous transcription factors across diverse species. As a proof-of-concept we predicted the specificities and transcriptional target genes of divergent archaeal feast/famine regulatory proteins, several of which are encoded in the genome of Halobacterium salinarum. This was validated by comparison to experimentally determined specificities for transcription factors in distantly related extremophiles, chromatin immunoprecipitation experiments, and cis-regulatory sequence conservation across eighteen related species of halobacteria. Through this analysis we were able to infer that Halobacterium salinarum employs a divergent local trans-regulatory strategy to regulate genes (carA and carB) involved in arginine and pyrimidine metabolism, whereas Escherichia coli employs an operon. The prediction of gene regulatory binding sites using structure-based methods is useful for the inference of gene regulatory relationships in new species that are otherwise difficult to infer. PMID:25255272
de Souza, Gustavo A.; Arntzen, Magnus Ø.; Fortuin, Suereta; Schürch, Anita C.; Målen, Hiwa; McEvoy, Christopher R. E.; van Soolingen, Dick; Thiede, Bernd; Warren, Robin M.; Wiker, Harald G.
2011-01-01
Precise annotation of genes or open reading frames is still a difficult task that results in divergence even for data generated from the same genomic sequence. This has an impact on further proteomic studies, and also compromises the characterization of clinical isolates with many specific genetic variations that may not be represented in the selected database. We recently developed software called multistrain mass spectrometry prokaryotic database builder (MSMSpdbb) that can merge protein databases from several sources and be applied to any prokaryotic organism, in a proteomic-friendly approach. We generated a database for the Mycobacterium tuberculosis complex (using three strains of Mycobacterium bovis and five of M. tuberculosis), and analyzed data collected from two laboratory strains and two clinical isolates of M. tuberculosis. We identified 2561 proteins, of which 24 were present in M. tuberculosis H37Rv samples, but not annotated in the M. tuberculosis H37Rv genome. We were also able to identify 280 nonsynonymous single amino acid polymorphisms and confirm 367 translational start sites. As a proof of concept we applied the database to whole-genome DNA sequencing data of one of the clinical isolates, which allowed the validation of 116 predicted single amino acid polymorphisms and the annotation of 131 N-terminal start sites. Moreover we identified regions not present in the original M. tuberculosis H37Rv sequence, indicating strain divergence or errors in the reference sequence. In conclusion, we demonstrated the potential of using a merged database to better characterize laboratory or clinical bacterial strains. PMID:21030493
Wang, Xiao-Wei; Zhao, Qiong-Yi; Luan, Jun-Bo; Wang, Yu-Jun; Yan, Gen-Hong; Liu, Shu-Sheng
2012-10-04
Background Genomic divergence between invasive and native species may provide insight into the molecular basis underlying specific characteristics that drive the invasion and displacement of closely related species. In this study, we sequenced the transcriptome of an indigenous species, Asia II 3, of the Bemisia tabaci complex and compared its genetic divergence with the transcriptomes of two invasive whitefly species, Middle East Asia Minor 1 (MEAM1) and Mediterranean (MED), respectively. Results More than 16 million reads of 74 base pairs in length were obtained for the Asia II 3 species using the Illumina sequencing platform. These reads were assembled into 52,535 distinct sequences (mean size: 466 bp) and 16,596 sequences were annotated with an E-value above 10^-5. Protein family comparisons revealed obvious diversification among the transcriptomes of these species, suggesting species-specific adaptations during whitefly evolution. At the same time, substantial conservation of the whitefly transcriptomes was also evident, despite their differences. The overall divergence of coding sequences between the orthologous gene pairs of Asia II 3 and MEAM1 is 1.73%, which is comparable to the average divergence of Asia II 3 and MED transcriptomes (1.84%) and much higher than that of MEAM1 and MED (0.83%). This is consistent with the previous phylogenetic analyses and crossing experiments suggesting these are distinct species. We also identified hundreds of highly diverged genes, compiled sequence identity data into gene functional groups, and found that the most divergent gene classes are Cytochrome P450, Glutathione metabolism and Oxidative phosphorylation. These results strongly suggest that the divergence of genes related to metabolism might be the driving force of the MEAM1 and Asia II 3 differentiation. 
We also analyzed single nucleotide polymorphisms within the orthologous gene pairs of indigenous and invasive whiteflies, which are helpful for investigating associations between alleles and phenotypes. Conclusions Our data present the most comprehensive sequences for the indigenous whitefly species Asia II 3. The extensive comparisons of Asia II 3, MEAM1 and MED transcriptomes will serve as an invaluable resource for revealing the genetic basis of whitefly invasion and the molecular mechanisms underlying their biological differences. PMID:23036081
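The overall coding-sequence divergence figures above (for example 1.73%) are percentages of differing sites across orthologous pairs. One simple way to compute such a figure is sketched below, assuming pre-aligned sequences, that gaps and ambiguous bases are skipped, and that the overall value pools sites across all pairs. The abstract does not describe the exact procedure, so the helper names and toy data are hypothetical.

```python
def count_differences(seq_a, seq_b):
    """Return (mismatches, compared_sites) for two aligned sequences, skipping non-ACGT sites."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "ACGT" and b in "ACGT":
            compared += 1
            mismatches += a != b
    return mismatches, compared

def overall_divergence_percent(ortholog_pairs):
    """Percent of differing sites pooled over all aligned orthologous pairs."""
    total_mismatches = total_compared = 0
    for seq_a, seq_b in ortholog_pairs:
        mismatches, compared = count_differences(seq_a, seq_b)
        total_mismatches += mismatches
        total_compared += compared
    if total_compared == 0:
        raise ValueError("no comparable sites")
    return 100.0 * total_mismatches / total_compared

# Hypothetical aligned orthologous pairs from two species.
pairs = [("ACGTACGT", "ACGTACGA"), ("AC-TGG", "ACGTGA")]
print(overall_divergence_percent(pairs))
```

Pooling sites weights long genes more heavily than short ones; averaging per-gene percentages instead would be an equally defensible convention and can give a different number.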
Fine organization of genomic regions tagged to the 5S rDNA locus of the bread wheat 5B chromosome.
Sergeeva, Ekaterina M; Shcherban, Andrey B; Adonina, Irina G; Nesterov, Michail A; Beletsky, Alexey V; Rakitin, Andrey L; Mardanov, Andrey V; Ravin, Nikolai V; Salina, Elena A
2017-11-14
The multigene family encoding the 5S rRNA, one of the most important structural and functional parts of the large ribosomal subunit, is an obligate component of all eukaryotic genomes. 5S rDNA has long been a favored target for cytological and phylogenetic studies due to the inherent peculiarities of its structural organization, such as the tandem arrays of repetitive units and their high interspecific divergence. The complex polyploid nature of the genome of bread wheat, Triticum aestivum, and the technically difficult task of sequencing clusters of tandem repeats mean that the detailed organization of extended genomic regions containing 5S rRNA genes remains unclear. This is despite the recent progress made in wheat genomic sequencing. Using pyrosequencing of BAC clones, in this work we studied the organization of two distinct 5S rDNA-tagged regions of the 5BS chromosome of bread wheat. Three BAC clones containing 5S rDNA were identified in the 5BS chromosome-specific BAC library of Triticum aestivum. Using the results of pyrosequencing and assembly, we obtained six 5S rDNA-containing contigs with a total length of 140,417 bp, and two sets (pools) of individual 5S rDNA sequences belonging to separate, but closely located genomic regions on the 5BS chromosome. Both regions are characterized by the presence of approximately 70-80 copies of 5S rDNA; however, they are completely different in their structural organization. The first region contained highly diverged short-type 5S rDNA units that were disrupted by multiple insertions of transposable elements. The second region contained the more conserved long-type 5S rDNA, organized as a single tandem array. FISH using probes specific to both 5S rDNA unit types showed differences in the distribution and intensity of signals on the chromosomes of polyploid wheat species and their diploid progenitors. 
A detailed structural organization of two closely located 5S rDNA-tagged genomic regions on the 5BS chromosome of bread wheat has been established. These two regions differ in the organization of both 5S rDNA and the neighboring sequences comprised of transposable elements, implying different modes of evolution for these regions.
Neural Encoding and Integration of Learned Probabilistic Sequences in Avian Sensory-Motor Circuitry
Brainard, Michael S.
2013-01-01
Many complex behaviors, such as human speech and birdsong, reflect a set of categorical actions that can be flexibly organized into variable sequences. However, little is known about how the brain encodes the probabilities of such sequences. Behavioral sequences are typically characterized by the probability of transitioning from a given action to any subsequent action (which we term “divergence probability”). In contrast, we hypothesized that neural circuits might encode the probability of transitioning to a given action from any preceding action (which we term “convergence probability”). The convergence probability of repeatedly experienced sequences could naturally become encoded by Hebbian plasticity operating on the patterns of neural activity associated with those sequences. To determine whether convergence probability is encoded in the nervous system, we investigated how auditory-motor neurons in vocal premotor nucleus HVC of songbirds encode different probabilistic characterizations of produced syllable sequences. We recorded responses to auditory playback of pseudorandomly sequenced syllables from the bird's repertoire, and found that variations in responses to a given syllable could be explained by a positive linear dependence on the convergence probability of preceding sequences. Furthermore, convergence probability accounted for more response variation than other probabilistic characterizations, including divergence probability. Finally, we found that responses integrated over >7–10 syllables (∼700–1000 ms) with the sign, gain, and temporal extent of integration depending on convergence probability. Our results demonstrate that convergence probability is encoded in sensory-motor circuitry of the song-system, and suggest that encoding of convergence probability is a general feature of sensory-motor circuits. PMID:24198363
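The two quantities defined above differ only in how transition counts are normalized: divergence probability conditions on the preceding syllable, convergence probability on the following one. A minimal sketch, assuming songs are given as strings of single-letter syllable labels (the labels and songs are invented, not data from the study):

```python
from collections import Counter

def transition_probabilities(songs):
    """Return (divergence, convergence) dictionaries keyed by (previous, next) syllable.

    divergence[(a, b)]  = P(next = b | current = a)  = count(a->b) / count(a->any)
    convergence[(a, b)] = P(previous = a | current = b) = count(a->b) / count(any->b)
    """
    pair_counts = Counter()
    for song in songs:
        pair_counts.update(zip(song, song[1:]))  # adjacent syllable pairs
    from_totals, to_totals = Counter(), Counter()
    for (a, b), n in pair_counts.items():
        from_totals[a] += n
        to_totals[b] += n
    divergence = {(a, b): n / from_totals[a] for (a, b), n in pair_counts.items()}
    convergence = {(a, b): n / to_totals[b] for (a, b), n in pair_counts.items()}
    return divergence, convergence

# Hypothetical songs: each character is one syllable.
divergence, convergence = transition_probabilities(["abc", "abd", "cbd"])
print(divergence[("a", "b")], convergence[("a", "b")])
```

In this toy set, "a" is always followed by "b" (divergence probability 1), yet "b" is preceded by "a" only two times out of three, so the same transition can rank very differently under the two measures.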
Plant centromere organization: a dynamic structure with conserved functions.
Ma, Jianxin; Wing, Rod A; Bennetzen, Jeffrey L; Jackson, Scott A
2007-03-01
Although the structural features of centromeres from most multicellular eukaryotes remain to be characterized, recent analyses of the complete sequences of two centromeric regions of rice, together with data from Arabidopsis thaliana and maize, have illuminated the considerable size variation and sequence divergence of plant centromeres. Despite the severe suppression of meiotic chromosomal exchange in centromeric and pericentromeric regions of rice, the centromere core shows high rates of unequal homologous recombination in the absence of chromosomal exchange, resulting in frequent and extensive DNA rearrangement. Not only is the sequence of centromeric tandem and non-tandem repeats highly variable but also the copy number, spacing, order and orientation, providing ample natural variation as the basis for selection of superior centromere performance. This review article focuses on the structural and evolutionary dynamics of plant centromere organization and the potential molecular mechanisms responsible for the rapid changes of centromeric components.
Tong, Ying; Zheng, Kang; Zhao, Shufang; Xiao, Guanxiu; Luo, Chen
2012-11-01
Recent studies demonstrated that sequence divergence in both the transcriptional regulatory region and the coding region contributes to the subfunctionalization of duplicate genes. However, whether sequence divergence in the 3'-untranslated region (3'-UTR) has an impact on the subfunctionalization of duplicate genes remains unclear. Here, we identified two diverging duplicate vsx1 (visual system homeobox-1) loci in goldfish, named vsx1A1 and vsx1A2. Phylogenetic analysis suggests that vsx1A1 and vsx1A2 may have arisen from a duplication of vsx1 after the separation of goldfish and zebrafish. Sequence comparison revealed that divergence in both transcriptional and translational regulatory regions is higher than divergence in the introns. vsx1A2 is expressed during blastula and gastrula stages and in adult retina but is silenced from segmentation stage to hatching stage, whereas vsx1A1 is expressed from segmentation onward. Given that zebrafish vsx1 is expressed at all developmental stages and in the adult retina, it appears that goldfish vsx1A1 and vsx1A2 are in the process of sharing out the functions of ancestral vsx1. The different but overlapping temporal expression patterns of vsx1A1 and vsx1A2 suggest that sequence divergence in the promoter region of duplicate vsx1 is not sufficient for partitioning the functions of ancestral vsx1. By comparing vsx1A1 and vsx1A2 3'-UTR-linked green fluorescent protein gene expression patterns, we demonstrated that the 3'-UTR of vsx1A1 retains, but the 3'-UTR of vsx1A2 has lost, the capability of mediating bipolar cell specific expression during retina development. These results indicate that sequence divergence in the 3'-UTRs has a clear effect on subfunctionalization of the duplicate genes. © 2012 WILEY PERIODICALS, INC.
Chromosome rearrangements via template switching between diverged repeated sequences
Anand, Ranjith P.; Tsaponina, Olga; Greenwell, Patricia W.; Lee, Cheng-Sheng; Du, Wei; Petes, Thomas D.
2014-01-01
Recent high-resolution genome analyses of cancer and other diseases have revealed the occurrence of microhomology-mediated chromosome rearrangements and copy number changes. Although some of these rearrangements appear to involve nonhomologous end-joining, many must have involved mechanisms requiring new DNA synthesis. Models such as microhomology-mediated break-induced replication (MM-BIR) have been invoked to explain these rearrangements. We examined BIR and template switching between highly diverged sequences in Saccharomyces cerevisiae, induced during repair of a site-specific double-strand break (DSB). Our data show that such template switches are robust mechanisms that give rise to complex rearrangements. Template switches between highly divergent sequences appear to be mechanistically distinct from the initial strand invasions that establish BIR. In particular, such jumps are less constrained by sequence divergence and exhibit a different pattern of microhomology junctions. BIR traversing repeated DNA sequences frequently results in complex translocations analogous to those seen in mammalian cells. These results suggest that template switching among repeated genes is a potent driver of genome instability and evolution. PMID:25367035
Structure and evolution of cereal genomes.
Paterson, Andrew H; Bowers, John E; Peterson, Daniel G; Estill, James C; Chapman, Brad A
2003-12-01
The cereal species, of central importance to our diet, began to diverge 50-70 million years ago. For the past few thousand years, these species have undergone largely parallel selection regimes associated with domestication and improvement. The rice genome sequence provides a platform for organizing information about diverse cereals, and together with genetic maps and sequence samples from other cereals is yielding new insights into both the shared and the independent dimensions of cereal evolution. New data and population-based approaches are identifying genes that have been involved in cereal improvement. Reduced-representation sequencing promises to accelerate gene discovery in many large-genome cereals, and to better link the under-explored genomes of 'orphan' cereals with state-of-the-art knowledge.
2010-01-01
Background Multiple sequence alignments are used to study gene or protein function, phylogenetic relations, genome evolution hypotheses and even gene polymorphisms. Virtually without exception, all available tools focus on conserved segments or residues. Small divergent regions, however, are biologically important for specific quantitative polymerase chain reaction, genotyping, molecular markers and preparation of specific antibodies, and yet have received little attention. As a consequence, they must be selected empirically by the researcher. AlignMiner has been developed to fill this gap in bioinformatic analyses. Results AlignMiner is a Web-based application for detection of conserved and divergent regions in alignments of conserved sequences, focusing particularly on divergence. It accepts alignments (protein or nucleic acid) obtained using any of a variety of algorithms, which does not appear to have a significant impact on the final results. AlignMiner uses different scoring methods for assessing conserved/divergent regions, Entropy being the method that provides the highest number of regions with the greatest length, and Weighted being the most restrictive. Conserved/divergent regions can be generated either with respect to the consensus sequence or to one master sequence. The resulting data are presented in a graphical interface developed in AJAX, which provides remarkable user interaction capabilities. Users do not need to wait until execution is complete and can even inspect their results on a different computer. Data can be downloaded onto a user disk, in standard formats. In silico and experimental proof-of-concept cases have shown that AlignMiner can be successfully used to design specific polymerase chain reaction primers as well as potential epitopes for antibodies. Primer design is assisted by a module that deploys several oligonucleotide parameters for designing primers "on the fly".
Conclusions AlignMiner can be used to reliably detect divergent regions via several scoring methods that provide different levels of selectivity. Its predictions have been verified by experimental means. Hence, it is expected that its usage will save researchers' time and ensure an objective selection of the best-possible divergent region when closely related sequences are analysed. AlignMiner is freely available at http://www.scbi.uma.es/alignminer. PMID:20525162
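The entropy-based scoring mentioned in the AlignMiner abstract can be illustrated with a minimal sketch. This is not AlignMiner's implementation: the function names `column_entropy` and `divergent_regions`, the `threshold` and `min_len` parameters, and the choice to treat gaps as an ordinary symbol are all assumptions made for illustration. The sketch scores each alignment column by its Shannon entropy and reports runs of high-entropy columns as candidate divergent regions.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of one alignment column.

    Assumption: a gap character is counted like any other symbol.
    """
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def divergent_regions(alignment, threshold=0.5, min_len=2):
    """Return half-open (start, end) column ranges whose entropy exceeds threshold.

    `threshold` and `min_len` are illustrative defaults, not published values.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    scores = [column_entropy([seq[i] for seq in alignment]) for i in range(length)]

    regions = []
    start = None
    for i, score in enumerate(scores):
        if score > threshold:
            if start is None:
                start = i  # open a new candidate region
        else:
            if start is not None and i - start >= min_len:
                regions.append((start, i))
            start = None
    # Close a region that runs to the end of the alignment.
    if start is not None and length - start >= min_len:
        regions.append((start, length))
    return regions


if __name__ == "__main__":
    toy_alignment = [
        "AAAACGAAA",
        "AAAAGTAAA",
        "AAAATCAAA",
    ]
    print(divergent_regions(toy_alignment))
```

A fully conserved column scores 0 bits, and a column in which every sequence differs scores the maximum, so raising `threshold` makes the selection more restrictive.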
Bodewes, R; Kik, M J L; Raj, V Stalin; Schapendonk, C M E; Haagmans, B L; Smits, S L; Osterhaus, A D M E
2013-06-01
Arenaviruses are bi-segmented negative-stranded RNA viruses, which were until recently only detected in rodents and humans. Now highly divergent arenaviruses have been identified in boid snakes with inclusion body disease (IBD). Here, we describe the identification of a new species and variants of the highly divergent arenaviruses, which were detected in tissues of captive boid snakes with IBD in The Netherlands by next-generation sequencing. Phylogenetic analysis of the complete sequence of the open reading frames of the four predicted proteins of one of the detected viruses revealed that this virus was most closely related to the recently identified Golden Gate virus, while considerable sequence differences were observed between the highly divergent arenaviruses detected in this study. These findings add to the recent identification of the highly divergent arenaviruses in boid snakes with IBD in the United States and indicate that these viruses also circulate among boid snakes in Europe.
The Korarchaeota: Archaeal orphans representing an ancestral lineage of life
DOE Office of Scientific and Technical Information (OSTI.GOV)
Elkins, James G.; Kunin, Victor; Anderson, Iain
Based on conserved cellular properties, all life on Earth can be grouped into different phyla which belong to the primary domains Bacteria, Archaea, and Eukarya. However, tracing back their evolutionary relationships has been impeded by horizontal gene transfer and gene loss. Within the Archaea, the kingdoms Crenarchaeota and Euryarchaeota exhibit a profound divergence. In order to elucidate the evolution of these two major kingdoms, representatives of more deeply diverged lineages would be required. Based on their environmental small subunit ribosomal (ss RNA) sequences, the Korarchaeota had been originally suggested to have an ancestral relationship to all known Archaea although this assessment has been refuted. Here we describe the cultivation and initial characterization of the first member of the Korarchaeota, highly unusual, ultrathin filamentous cells about 0.16 µm in diameter. A complete genome sequence obtained from enrichment cultures revealed an unprecedented combination of signature genes which were thought to be characteristic of either the Crenarchaeota, Euryarchaeota, or Eukarya. Cell division appears to be mediated through a FtsZ-dependent mechanism which is highly conserved throughout the Bacteria and Euryarchaeota. An rpb8 subunit of the DNA-dependent RNA polymerase was identified which is absent from other Archaea and has been described as a eukaryotic signature gene. In addition, the representative organism possesses a ribosome structure typical for members of the Crenarchaeota. Based on its gene complement, this lineage likely diverged near the separation of the two major kingdoms of Archaea. Further investigations of these unique organisms may shed additional light onto the evolution of extant life.
Asaf, Sajjad; Khan, Abdul Latif; Khan, Muhammad Aaqil; Waqas, Muhammad; Kang, Sang-Mo; Yun, Byung-Wook; Lee, In-Jung
2017-08-08
We investigated the complete chloroplast (cp) genomes of non-model Arabidopsis halleri ssp. gemmifera and Arabidopsis lyrata ssp. petraea using Illumina paired-end sequencing to understand their genetic organization and structure. Detailed bioinformatics analysis revealed genome sizes of both subspecies ranging between 154.4~154.5 kbp, with a large single-copy region (84,197~84,158 bp), a small single-copy region (17,738~17,813 bp) and a pair of inverted repeats (IRa/IRb; 26,264~26,259 bp). Both cp genomes encode 130 genes, including 85 protein-coding genes, eight ribosomal RNA genes and 37 transfer RNA genes. Whole cp genome comparison of A. halleri ssp. gemmifera and A. lyrata ssp. petraea, along with ten other Arabidopsis species, showed an overall high degree of sequence similarity, with divergence among some intergenic spacers. The location and distribution of repeat sequences were determined, and sequence divergences of shared genes were calculated among related species. Comparative phylogenetic analysis of the entire genomic data set and 70 shared genes between both cp genomes confirmed the previous phylogeny and generated phylogenetic trees with the same topologies. The sister species of A. halleri ssp. gemmifera is A. umezawana, whereas the closest relative of A. lyrata ssp. petraea is A. arenicola.
Laughter and the Management of Divergent Positions in Peer Review Interactions
Raclaw, Joshua; Ford, Cecilia E.
2017-01-01
In this paper we focus on how participants in peer review interactions use laughter as a resource as they publicly report divergence of evaluative positions, divergence that is typical in the give and take of joint grant evaluation. Using the framework of conversation analysis, we examine the infusion of laughter and multimodal laugh-relevant practices into sequences of talk in meetings of grant reviewers deliberating on the evaluation and scoring of high-level scientific grant applications. We focus on a recurrent sequence in these meetings, what we call the score-reporting sequence, in which the assigned reviewers first announce the preliminary scores they have assigned to the grant. We demonstrate that such sequences are routine sites for the use of laugh practices to navigate the initial moments in which divergence of opinion is made explicit. In the context of meetings convened for the purposes of peer review, laughter thus serves as a valuable resource for managing the socially delicate but institutionally required reporting of divergence and disagreement that is endemic to meetings where these types of evaluative tasks are a focal activity. PMID:29170594
NASA Technical Reports Server (NTRS)
Marsh, T. L.; Reich, C. I.; Whitelock, R. B.; Olsen, G. J.; Woese, C. R. (Principal Investigator)
1994-01-01
The first step in transcription initiation in eukaryotes is mediated by the TATA-binding protein, a subunit of the transcription factor IID complex. We have cloned and sequenced the gene for a presumptive homolog of this eukaryotic protein from Thermococcus celer, a member of the Archaea (formerly archaebacteria). The protein encoded by the archaeal gene is a tandem repeat of a conserved domain, corresponding to the repeated domain in its eukaryotic counterparts. Molecular phylogenetic analyses of the two halves of the repeat are consistent with the duplication occurring before the divergence of the archaeal and eukaryotic domains. In conjunction with previous observations of similarity in RNA polymerase subunit composition and sequences and the finding of a transcription factor IIB-like sequence in Pyrococcus woesei (a relative of T. celer), it appears that major features of the eukaryotic transcription apparatus were well-established before the origin of eukaryotic cellular organization. The divergence between the two halves of the archaeal protein is less than that between the halves of the individual eukaryotic sequences, indicating that the average rate of sequence change in the archaeal protein has been less than in its eukaryotic counterparts. To the extent that this lower rate applies to the genome as a whole, a clearer picture of the early genes (and gene families) that gave rise to present-day genomes is more apt to emerge from the study of sequences from the Archaea than from the corresponding sequences from eukaryotes.
Taxonomy, genetic organization, and life cycle of Pneumocystis carinii.
Cushion, M T
1998-12-01
Pneumocystis carinii was initially misidentified as a protozoan parasite. Recent molecular and biochemical analyses provide unequivocal evidence for placement of P. carinii with the fungi, and that P. carinii is most likely an ascomycete. Genetic investigations further show that P. carinii derived from different mammalian hosts (human, rat, mouse, and ferret) exhibit considerable chromosomal and gene sequence divergence indicating that they are likely of different species. The life cycle of P. carinii has not been definitively established, but available evidence is reviewed in light of classification of this organism as a fungus.
Hsieh, Y-C; Chung, J-D; Wang, C-N; Chang, C-T; Chen, C-Y; Hwang, S-Y
2013-01-01
Elucidation of the evolutionary processes that constrain or facilitate adaptive divergence is a central goal in evolutionary biology, especially in non-model organisms. We tested whether changes in dynamics of gene flow (historical vs contemporary) caused population isolation and examined local adaptation in response to environmental selective forces in fragmented Rhododendron oldhamii populations. Variation in 26 expressed sequence tag-simple sequence repeat loci from 18 populations in Taiwan was investigated by examining patterns of genetic diversity, inbreeding, geographic structure, recent bottlenecks, and historical and contemporary gene flow. Selection associated with environmental variables was also examined. Bayesian clustering analysis revealed four regional population groups of north, central, south and southeast with significant genetic differentiation. Historical bottlenecks beginning 9168–13,092 years ago and ending 1584–3504 years ago were revealed by estimates using approximate Bayesian computation for all four regional samples analyzed. Recent migration within and across geographic regions was limited. However, major dispersal sources were found within geographic regions. Altitudinal clines of allelic frequencies of environmentally associated positively selected outliers were found, indicating adaptive divergence. Our results point to a transition from historical population connectivity toward contemporary population isolation and divergence on a regional scale. Spatial and temporal dispersal differences may have resulted in regional population divergence and local adaptation associated with environmental variables, which may have played roles as selective forces at a regional scale. PMID:23591517
Azospirillum Genomes Reveal Transition of Bacteria from Aquatic to Terrestrial Environments
Khalsa-Moyers, Gurusahai; Alexandre, Gladys; Sukharnikov, Leonid O.; Wuichet, Kristin; Hurst, Gregory B.; McDonald, W. Hayes; Robertson, Jon S.; Barbe, Valérie; Calteau, Alexandra; Rouy, Zoé; Mangenot, Sophie; Prigent-Combaret, Claire; Normand, Philippe; Boyer, Mickaël; Siguier, Patricia; Dessaux, Yves; Elmerich, Claudine; Condemine, Guy; Krishnen, Ganisan; Kennedy, Ivan; Paterson, Andrew H.; González, Victor; Mavingui, Patrick; Zhulin, Igor B.
2011-01-01
Fossil records indicate that life appeared in marine environments ∼3.5 billion years ago (Gyr) and transitioned to terrestrial ecosystems nearly 2.5 Gyr. Sequence analysis suggests that “hydrobacteria” and “terrabacteria” might have diverged as early as 3 Gyr. Bacteria of the genus Azospirillum are associated with roots of terrestrial plants; however, virtually all their close relatives are aquatic. We obtained genome sequences of two Azospirillum species and analyzed their gene origins. While most Azospirillum house-keeping genes have orthologs in its close aquatic relatives, this lineage has obtained nearly half of its genome from terrestrial organisms. The majority of genes encoding functions critical for association with plants are among horizontally transferred genes. Our results show that transition of some aquatic bacteria to terrestrial habitats occurred much later than the suggested initial divergence of hydro- and terrabacterial clades. The birth of the genus Azospirillum approximately coincided with the emergence of vascular plants on land. PMID:22216014
Phylogenetic shadowing of primate sequences to find functional regions of the human genome.
Boffelli, Dario; McAuliffe, Jon; Ovcharenko, Dmitriy; Lewis, Keith D; Ovcharenko, Ivan; Pachter, Lior; Rubin, Edward M
2003-02-28
Nonhuman primates represent the most relevant model organisms to understand the biology of Homo sapiens. The recent divergence and associated overall sequence conservation between individual members of this taxon have nonetheless largely precluded the use of primates in comparative sequence studies. We used sequence comparisons of an extensive set of Old World and New World monkeys and hominoids to identify functional regions in the human genome. Analysis of these data enabled the discovery of primate-specific gene regulatory elements and the demarcation of the exons of multiple genes. Much of the information content of the comprehensive primate sequence comparisons could be captured with a small subset of phylogenetically close primates. These results demonstrate the utility of intraprimate sequence comparisons to discover common mammalian as well as primate-specific functional elements in the human genome, which are unattainable through the evaluation of more evolutionarily distant species.
TRStalker: an efficient heuristic for finding fuzzy tandem repeats.
Pellegrini, Marco; Renda, M Elena; Vecchio, Alessio
2010-06-15
Genomes in higher eukaryotic organisms contain a substantial amount of repeated sequences. Tandem Repeats (TRs) constitute a large class of repetitive sequences that are originated via phenomena such as replication slippage and are characterized by close spatial contiguity. They play an important role in several molecular regulatory mechanisms, and also in several diseases (e.g. in the group of trinucleotide repeat disorders). While for TRs with a low or medium level of divergence the current methods are rather effective, the problem of detecting TRs with higher divergence (fuzzy TRs) is still open. The detection of fuzzy TRs is propaedeutic to enriching our view of their role in regulatory mechanisms and diseases. Fuzzy TRs are also important as tools to shed light on the evolutionary history of the genome, where higher divergence correlates with more remote duplication events. We have developed an algorithm (christened TRStalker) with the aim of detecting efficiently TRs that are hard to detect because of their inherent fuzziness, due to high levels of base substitutions, insertions and deletions. To attain this goal, we developed heuristics to solve a Steiner version of the problem for which the fuzziness is measured with respect to a motif string not necessarily present in the input string. This problem is akin to the 'generalized median string' that is known to be an NP-hard problem. Experiments with both synthetic and biological sequences demonstrate that our method performs better than current state of the art for fuzzy TRs and that the fuzzy TRs of the type we detect are indeed present in important biological sequences. TRStalker will be integrated in the web-based TRs Discovery Service (TReaDS) at bioalgo.iit.cnr.it. Supplementary data are available at Bioinformatics online.
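The idea of measuring fuzziness relative to a motif string, as described in the abstract above, can be sketched in a few lines. This is an illustrative assumption, not the TRStalker heuristic: the names `edit_distance` and `repeat_fuzziness` are hypothetical, the repeat copies and the motif are taken as already known, and fuzziness is defined here simply as the mean edit distance per motif base.

```python
def edit_distance(a, b):
    """Levenshtein distance: substitutions, insertions and deletions each cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion from a
                            curr[j - 1] + 1,      # insertion into a
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def repeat_fuzziness(copies, motif):
    """Mean edit distance between each repeat copy and the motif, per motif base.

    Assumption: the motif need not occur among the copies themselves.
    """
    if not copies or not motif:
        raise ValueError("need at least one copy and a non-empty motif")
    total = sum(edit_distance(copy, motif) for copy in copies)
    return total / (len(copies) * len(motif))


if __name__ == "__main__":
    copies = ["ACGT", "ACCT", "AGT"]  # one exact, one substituted, one with a deletion
    print(repeat_fuzziness(copies, "ACGT"))
```

A value of 0 corresponds to a perfect tandem repeat, and larger values indicate more diverged copies. Finding the motif that minimises this quantity is the hard part of the problem and is not attempted here.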
Analysis of the Macaca mulatta transcriptome and the sequence divergence between Macaca and human.
Magness, Charles L; Fellin, P Campion; Thomas, Matthew J; Korth, Marcus J; Agy, Michael B; Proll, Sean C; Fitzgibbon, Matthew; Scherer, Christina A; Miner, Douglas G; Katze, Michael G; Iadonato, Shawn P
2005-01-01
We report the initial sequencing and comparative analysis of the Macaca mulatta transcriptome. Cloned sequences from 11 tissues, nine animals, and three species (M. mulatta, M. fascicularis, and M. nemestrina) were sampled, resulting in the generation of 48,642 sequence reads. These data represent an initial sampling of the putative rhesus orthologs for 6,216 human genes. Mean nucleotide diversity within M. mulatta and sequence divergence among M. fascicularis, M. nemestrina, and M. mulatta are also reported.
Middleton, Christopher P.; Senerchia, Natacha; Stein, Nils; Akhunov, Eduard D.; Keller, Beat
2014-01-01
Using Roche/454 technology, we sequenced the chloroplast genomes of 12 Triticeae species, including bread wheat, barley and rye, as well as the diploid progenitors and relatives of bread wheat Triticum urartu, Aegilops speltoides and Ae. tauschii. Two wild tetraploid taxa, Ae. cylindrica and Ae. geniculata, were also included. Additionally, we incorporated wild Einkorn wheat Triticum boeoticum and its domesticated form T. monococcum and two Hordeum spontaneum (wild barley) genotypes. Chloroplast genomes were used for overall sequence comparison, phylogenetic analysis and dating of divergence times. We estimate that barley diverged from rye and wheat approximately 8–9 million years ago (MYA). The genome donors of hexaploid wheat diverged between 2.1–2.9 MYA, while rye diverged from Triticum aestivum approximately 3–4 MYA, more recently than previously estimated. Interestingly, the A genome taxa T. boeoticum and T. urartu were estimated to have diverged approximately 570,000 years ago. As these two have a reproductive barrier, the divergence time estimate also provides an upper limit for the time required for the formation of a species boundary between the two. Furthermore, we conclusively show that the chloroplast genome of hexaploid wheat was contributed by the B genome donor and that this unknown species diverged from Ae. speltoides about 980,000 years ago. Additionally, sequence alignments identified a translocation of a chloroplast segment to the nuclear genome which is specific to the rye/wheat lineage. We propose the presented phylogeny and divergence time estimates as a reference framework for future studies on Triticeae. PMID:24614886
Resnyk, C W; Carré, W; Wang, X; Porter, T E; Simon, J; Le Bihan-Duval, E; Duclos, M J; Aggrey, S E; Cogburn, L A
2017-08-16
Decades of intensive genetic selection in the domestic chicken (Gallus gallus domesticus) have enabled the remarkably rapid growth of today's broiler (meat-type) chickens. However, this enhanced growth rate was accompanied by several unfavorable traits (i.e., increased visceral fatness, leg weakness, and disorders of metabolism and reproduction). The present descriptive analysis of the abdominal fat transcriptome aimed to identify functional genes and biological pathways that likely contribute to an extreme difference in visceral fatness of divergently selected broiler chickens. We used the Del-Mar 14 K Chicken Integrated Systems microarray to take time-course snapshots of global gene transcription in abdominal fat of juvenile [1-11 weeks of age (wk)] chickens divergently selected on bodyweight at two ages (8 and 36 wk). Further, an RNA sequencing analysis was completed on the same abdominal fat samples taken from high-growth (HG) and low-growth (LG) cockerels at 7 wk, the age with the greatest divergence in body weight (3.2-fold) and visceral fatness (19.6-fold). Time-course microarray analysis revealed 312 differentially expressed genes (FDR ≤ 0.05) as the main effect of genotype (HG versus LG), 718 genes in the interaction of age and genotype, and 2918 genes as the main effect of age. The RNA sequencing analysis identified 2410 differentially expressed genes in abdominal fat of HG versus LG chickens at 7 wk. The HG chickens are fatter and over-express numerous genes that support higher rates of visceral adipogenesis and lipogenesis. In abdominal fat of LG chickens, we found higher expression of many genes involved in hemostasis, energy catabolism and endocrine signaling, which likely contribute to their leaner phenotype and slower growth. Many transcription factors and their direct target genes identified in HG and LG chickens could be involved in their divergence in adiposity and growth rate.
The present analyses of the visceral fat transcriptome in chickens divergently selected for a large difference in growth rate and abdominal fatness clearly demonstrate that abdominal fat is a very dynamic metabolic and endocrine organ in the chicken. The HG chickens overexpress many transcription factors and their direct target genes, which should enhance in situ lipogenesis and ultimately adiposity. Our observation of enhanced expression of hemostasis and endocrine-signaling genes in diminished abdominal fat of LG cockerels provides insight into genetic mechanisms involved in divergence of abdominal fatness and somatic growth in avian and perhaps mammalian species, including humans.
Ghouila, Amel; Florent, Isabelle; Guerfali, Fatma Zahra; Terrapon, Nicolas; Laouini, Dhafer; Yahia, Sadok Ben; Gascuel, Olivier; Bréhélin, Laurent
2014-01-01
Identification of protein domains is a key step for understanding protein function. Hidden Markov Models (HMMs) have proved to be a powerful tool for this task. The Pfam database notably provides a large collection of HMMs which are widely used for the annotation of proteins in sequenced organisms. This is done via sequence/HMM comparisons. However, this approach may lack sensitivity when searching for domains in divergent species. Recently, methods for HMM/HMM comparisons have been proposed and proved to be more sensitive than sequence/HMM approaches in certain cases. However, these approaches are usually not used for protein domain discovery at a genome scale, and the benefit that could be expected from their utilization for this problem has not been investigated. Using proteins of P. falciparum and L. major as examples, we investigate the extent to which HMM/HMM comparisons can identify new domain occurrences not already identified by sequence/HMM approaches. We show that although HMM/HMM comparisons are much more sensitive than sequence/HMM comparisons, they are not sufficiently accurate to be used as a standalone complement of sequence/HMM approaches at the genome scale. Hence, we propose to use domain co-occurrence — the general domain tendency to preferentially appear along with some favorite domains in the proteins — to improve the accuracy of the approach. We show that the combination of HMM/HMM comparisons and co-occurrence domain detection boosts protein annotations. At an estimated False Discovery Rate of 5%, it revealed 901 and 1098 new domains in Plasmodium and Leishmania proteins, respectively. Manual inspection of part of these predictions shows that it contains several domain families that were missing in the two organisms. All new domain occurrences have been integrated in the EuPathDomains database, along with the GO annotations that can be deduced. PMID:24901648
Sequence-Level Mechanisms of Human Epigenome Evolution
Prendergast, James G.D.; Chambers, Emily V.; Semple, Colin A.M.
2014-01-01
DNA methylation and chromatin states play key roles in development and disease. However, the extent of recent evolutionary divergence in the human epigenome and the influential factors that have shaped it are poorly understood. To determine the links between genome sequence and human epigenome evolution, we examined the divergence of DNA methylation and chromatin states following segmental duplication events in the human lineage. Chromatin and DNA methylation states were found to have been generally well conserved following a duplication event, with the evolution of the epigenome largely uncoupled from the total number of genetic changes in the surrounding DNA sequence. However, the epigenome at tissue-specific, distal regulatory regions was observed to be unusually prone to diverge following duplication, with particular sequence differences, altering known sequence motifs, found to be associated with divergence in patterns of DNA methylation and chromatin. Alu elements were found to have played a particularly prominent role in shaping human epigenome evolution, and we show that human-specific AluY insertion events are strongly linked to the evolution of the DNA methylation landscape and gene expression levels, including at key neurological genes in the human brain. Studying paralogous regions within the same sample enables the study of the links between genome and epigenome evolution while controlling for biological and technical variation. We show DNA methylation and chromatin divergence between duplicated regions are linked to the divergence of particular genetic motifs, with Alu elements having played a disproportionate role in the evolution of the epigenome in the human lineage. PMID:24966180
Ned B. Klopfenstein; John W. Hanna; Amy L. Ross-Davis; Jane E. Stewart; Yuko Ota; Rosario Medel-Ortiz; Miguel Armando Lopez-Ramirez; Ruben Damian Elias-Roman; Dionicio Alvarado-Rosales; Mee-Sook Kim
2013-01-01
Armillaria plays diverse ecological roles in forests worldwide, which has inspired interest in understanding phylogenetic relationships within and among species of this genus. Previous rDNA sequence-based phylogenetic analyses of Armillaria have shown general relationships among widely divergent taxa, but rDNA sequences were not reliable for separating closely related...
The Evolutionary Origin of Epithelial Cell-Cell Adhesion Mechanisms
Miller, Phillip W.; Clarke, Donald N.; Weis, William I.; Lowe, Christopher J.; Nelson, W. James
2014-01-01
SUMMARY A simple epithelium forms a barrier between the outside and the inside of an organism, and is the first organized multicellular tissue found in evolution. We examine the relationship between the evolution of epithelia and specialized cell-cell adhesion proteins comprising the classical cadherin/β-catenin/α-catenin complex (CCC). A review of the divergent functional properties of the CCC in metazoans and non-metazoans, and an updated phylogenetic coverage of the CCC using recent genomic data reveal: 1) The core CCC likely originated before the last common ancestor of unikonts and their closest bikont sister taxa. 2) Formation of the CCC may have constrained sequence evolution of the classical cadherin cytoplasmic domain and β-catenin in metazoa. 3) The α-catenin binding domain in β-catenin appears to be the favored mutation site for disrupting β-catenin function in the CCC. 4) The ancestral function of the α/β-catenin heterodimer appears to be an actin-binding module. In some metazoan groups, more complex functions of α-catenin were gained by sequence divergence in the non-actin binding (N-, M-) domains. 5) Allosteric regulation of α-catenin, rather than loss of function mutations, may have evolved for more complex regulation of the actin cytoskeleton. PMID:24210433
Joseph, Sneha; Poriya, Paresh; Kundu, Rahul
2016-11-01
The present study reports the phylogenetic relationship of six zoanthid species belonging to three genera, Isaurus, Palythoa, and Zoanthus, identified using systematic computational analysis of mtDNA gene sequences. All six species are first recorded from the coasts of Kathiawar Peninsula, India. Genus Isaurus is represented by Isaurus tuberculatus, genus Zoanthus is represented by Zoanthus kuroshio and Zoanthus sansibaricus, while genus Palythoa is represented by Palythoa tuberculosa, P. sp. JVK-2006 and Palythoa heliodiscus. Results of the present study revealed that among the various species observed along the coastline, sequence identity ranged from a minimum of 96% to a maximum of 99%. An interspecific divergence of 1-4% and negligible intraspecific divergence was observed. These results not only highlighted the efficiency of the COI gene region in species identification but also demonstrated the genetic variability of zoanthids along the Saurashtra coastline of the west coast of India.
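The percentages in the abstract above relate to each other by simple arithmetic, which a short sketch can make concrete. It assumes the uncorrected p-distance (the fraction of compared sites that differ) as the divergence measure; the abstract does not state which distance model the authors used. The function names `p_distance` and `percent_identity` and the toy sequences are hypothetical.

```python
def p_distance(seq_a, seq_b, gap="-"):
    """Uncorrected p-distance between two aligned sequences.

    Columns where either sequence has a gap are skipped (an assumption;
    other gap treatments are possible).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable (ungapped) sites")
    return mismatches / compared


def percent_identity(seq_a, seq_b):
    """Percent identity is the complement of percent divergence."""
    return 100.0 * (1.0 - p_distance(seq_a, seq_b))


if __name__ == "__main__":
    # Toy 25-site fragments differing at one site, i.e. 4% divergence.
    ref = "ACGTACGTACGTACGTACGTACGTA"
    alt = "ACGTACGTACGTACGTACGTACGTC"
    print(f"divergence: {100 * p_distance(ref, alt):.1f}%")
    print(f"identity:   {percent_identity(ref, alt):.1f}%")
```

Under this measure, a divergence of 1-4% is the same statement as a sequence identity of 96-99%.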
DOE Office of Scientific and Technical Information (OSTI.GOV)
Castelle, Cindy J.; Brown, Christopher T.; Thomas, Brian C.; ...
2017-01-09
The Candidate Phyla Radiation (CPR) is a large group of bacteria, the scale of which approaches that of all other bacteria. CPR organisms are inferred to depend on other community members for many basic cellular building blocks and all appear to be obligate anaerobes. To date, there has been no evidence for any significant respiratory capacity in an organism from this radiation. Here we report a curated draft genome for Candidatus Parcunitrobacter nitroensis, a member of the Parcubacteria (OD1) superphylum of the CPR. The genome encodes versatile energy pathways, including fermentative and respiratory capacities, nitrogen and fatty acid metabolism, as well as the first complete electron transport chain described for a member of the CPR. The sequences of all of these enzymes are highly divergent from sequences found in other organisms, suggesting that these capacities were not recently acquired from non-CPR organisms. Although the wide respiration-based repertoire points to a different lifestyle compared to other CPR bacteria, we predict similar obligate dependence on other organisms or the microbial community. The results substantially expand the known metabolic potential of CPR bacteria, although sequence comparisons indicate that these capacities are very rare in members of this radiation.
Nucleotide sequences of bovine alpha S1- and kappa-casein cDNAs.
Stewart, A F; Willis, I M; Mackinlay, A G
1984-01-01
The nucleotide sequences corresponding to bovine alpha S1- and kappa-casein mRNAs are presented. An unusual alpha S1-casein cDNA has been characterised whose 5' end commences upstream from its putative TATA box. The alpha S1-casein mRNA is compared to rat alpha-casein mRNA and two components of divergence are identified. Firstly, the two sequences have diverged at a high point mutation rate and the rate of amino acid replacement by this mechanism is at least as great as the rate of divergence of any other part of the mRNAs. Secondly, the protein coding sequence has been subjected to several insertion/deletion events, one of which may be an example of exon shuffling. The kappa-casein mRNA sequence verifies the proposition that it has arisen from a different ancestral gene to the other caseins. PMID:6328443
Ecological Divergence of a Novel Group of Chloroflexus Strains along a Geothermal Gradient
Weltzer, Michael L.
2013-01-01
Environmental gradients are expected to promote the diversification and coexistence of ecological specialists adapted to local conditions. Consistent with this view, genera of phototrophic microorganisms in alkaline geothermal systems generally appear to consist of anciently divergent populations which have specialized on different temperature habitats. At White Creek (Lower Geyser Basin, Yellowstone National Park), however, a novel, 16S rRNA-defined lineage of the filamentous anoxygenic phototroph Chloroflexus (OTU 10, phylum Chloroflexi) occupies a much wider thermal niche than other 16S rRNA-defined groups of phototrophic bacteria. This suggests that Chloroflexus OTU 10 is either an ecological generalist or, alternatively, a group of cryptic thermal specialists which have recently diverged. To distinguish between these alternatives, we first isolated laboratory strains of Chloroflexus OTU 10 from along the White Creek temperature gradient. These strains are identical for partial gene sequences encoding the 16S rRNA and malonyl coenzyme A (CoA) reductase. However, strains isolated from upstream and downstream samples could be distinguished based on sequence variation at pcs, which encodes the propionyl-CoA synthase of the 3-hydroxypropionate pathway of carbon fixation used by the genus Chloroflexus. We next demonstrated that strains have diverged in temperature range for growth. Specifically, we obtained evidence for a positive correlation between thermal niche breadth and temperature optimum, with strains isolated from lower temperatures exhibiting greater thermal specialization than the most thermotolerant strain. The study has implications for our understanding of both the process of niche diversification of microorganisms and how diversity is organized in these hot spring communities. PMID:23263946
Ecological divergence of a novel group of Chloroflexus strains along a geothermal gradient.
Weltzer, Michael L; Miller, Scott R
2013-02-01
Environmental gradients are expected to promote the diversification and coexistence of ecological specialists adapted to local conditions. Consistent with this view, genera of phototrophic microorganisms in alkaline geothermal systems generally appear to consist of anciently divergent populations which have specialized on different temperature habitats. At White Creek (Lower Geyser Basin, Yellowstone National Park), however, a novel, 16S rRNA-defined lineage of the filamentous anoxygenic phototroph Chloroflexus (OTU 10, phylum Chloroflexi) occupies a much wider thermal niche than other 16S rRNA-defined groups of phototrophic bacteria. This suggests that Chloroflexus OTU 10 is either an ecological generalist or, alternatively, a group of cryptic thermal specialists which have recently diverged. To distinguish between these alternatives, we first isolated laboratory strains of Chloroflexus OTU 10 from along the White Creek temperature gradient. These strains are identical for partial gene sequences encoding the 16S rRNA and malonyl coenzyme A (CoA) reductase. However, strains isolated from upstream and downstream samples could be distinguished based on sequence variation at pcs, which encodes the propionyl-CoA synthase of the 3-hydroxypropionate pathway of carbon fixation used by the genus Chloroflexus. We next demonstrated that strains have diverged in temperature range for growth. Specifically, we obtained evidence for a positive correlation between thermal niche breadth and temperature optimum, with strains isolated from lower temperatures exhibiting greater thermal specialization than the most thermotolerant strain. The study has implications for our understanding of both the process of niche diversification of microorganisms and how diversity is organized in these hot spring communities.
El-Sherry, Shiem; Ogedengbe, Mosun E; Hafeez, Mian A; Barta, John R
2013-07-01
Multiple 18S rDNA sequences were obtained from two single-oocyst-derived lines of each of Eimeria meleagrimitis and Eimeria adenoeides. After analysing the 15 new 18S rDNA sequences from two lines of E. meleagrimitis and 17 new sequences from two lines of E. adenoeides, there were clear indications that divergent, paralogous 18S rDNA copies existed within the nuclear genome of E. meleagrimitis. In contrast, mitochondrial cytochrome c oxidase subunit I (COI) partial sequences from all lines of a particular Eimeria sp. were identical and, in phylogenetic analyses, COI sequences clustered unambiguously in monophyletic and highly-supported clades specific to individual Eimeria sp. Phylogenetic analysis of the new 18S rDNA sequences from E. meleagrimitis showed that they formed two distinct clades: Type A with four new sequences; and Type B with nine new sequences; both Types A and B sequences were obtained from each of the single-oocyst-derived lines of E. meleagrimitis. Together these rDNA types formed a well-supported E. meleagrimitis clade. Types A and B 18S rDNA sequences from E. meleagrimitis had a mean sequence identity of only 97.4% whereas mean sequence identity within types was 99.1-99.3%. The observed intraspecific sequence divergence among E. meleagrimitis 18S rDNA sequence types was even higher (approximately 2.6%) than the interspecific sequence divergence present between some well-recognized species such as Eimeria tenella and Eimeria necatrix (1.1%). Our observations suggest that, unlike COI sequences, 18S rDNA sequences are not reliable molecular markers to be used alone for species identification with coccidia, although 18S rDNA sequences have clear utility for phylogenetic reconstruction of apicomplexan parasites at the genus and higher taxonomic ranks. Copyright © 2013. Published by Elsevier Ltd.
A novel, highly divergent ssDNA virus identified in Brazil infecting apple, pear and grapevine.
Basso, Marcos Fernando; da Silva, José Cleydson Ferreira; Fajardo, Thor Vinícius Martins; Fontes, Elizabeth Pacheco Batista; Zerbini, Francisco Murilo
2015-12-02
Fruit trees of temperate and tropical climates are of great economic importance worldwide and several viruses have been reported affecting their productivity and longevity. Fruit trees of different Brazilian regions displaying virus-like symptoms were evaluated for infection by circular DNA viruses. Seventy-four fruit trees were sampled and a novel, highly divergent, monopartite circular ssDNA virus was cloned from apple, pear and grapevine trees. Forty-five complete viral genomes were sequenced, with a size of approx. 3.4 kb and organized into five ORFs. Deduced amino acid sequences showed identities in the range of 38% with unclassified circular ssDNA viruses, nanoviruses and alphasatellites (putative Replication-associated protein, Rep), and begomo-, curto- and mastreviruses (putative coat protein, CP, and movement protein, MP). A large intergenic region contains a short palindromic sequence capable of forming a hairpin-like structure with the loop sequence TAGTATTAC, identical to the conserved nonanucleotide of circoviruses, nanoviruses and alphasatellites. Recombination events were not detected and phylogenetic analysis showed a relationship with circo-, nano- and geminiviruses. PCR confirmed the presence of this novel ssDNA virus in field plants. Infectivity tests using the cloned viral genome confirmed its ability to infect apple and pear tree seedlings, but not Nicotiana benthamiana. The name "Temperate fruit decay-associated virus" (TFDaV) is proposed for this novel virus. Copyright © 2015 Elsevier B.V. All rights reserved.
Miller, Emily J.; Neaves, Linda E.; Zenger, Kyall R.; Herbert, Catherine A.
2017-01-01
The tammar wallaby (Notamacropus eugenii) is one of the most intensively studied of all macropodids and was the first Australasian marsupial to have its genome sequenced. However, comparatively little is known about genetic diversity and differentiation amongst the morphologically distinct allopatric populations of tammar wallabies found in Western (WA) and South Australia (SA). Here we compare autosomal and Y-linked microsatellite genotypes, as well as sequence data (~600 bp) from the mitochondrial DNA (mtDNA) control region (CR) in tammar wallabies from across their distribution. Levels of diversity at autosomal microsatellite loci were typically high in the WA mainland and Kangaroo Island (SA) populations (A = 8.9–10.6; He = 0.77–0.78) but significantly reduced in other endemic island populations (A = 3.8–4.1; He = 0.41–0.48). Autosomal and Y-linked microsatellite loci revealed a pattern of significant differentiation amongst populations, especially between SA and WA. The Kangaroo Island and introduced New Zealand populations showed limited differentiation. Multiple divergent mtDNA CR haplotypes were identified within both SA and WA populations. The CR haplotypes of tammar wallabies from SA and WA show reciprocal monophyly and are highly divergent (14.5%), with levels of sequence divergence more typical of different species. Within WA tammar wallabies, island populations each have unique clusters of highly related CR haplotypes and each is most closely related to different WA mainland haplotypes. Y-linked microsatellite haplotypes show a similar pattern of divergence although levels of diversity are lower. In light of these differences, we suggest that two subspecies of tammar wallaby be recognized: Notamacropus eugenii eugenii in SA and N. eugenii derbianus in WA. The extensive neutral genetic diversity and inter-population differentiation identified within tammar wallabies should further increase the species' value and usefulness as a model organism. PMID:28257440
Tseng, Shu-Ping; Li, Shou-Hsien; Hsieh, Chia-Hung; Wang, Hurng-Yi; Lin, Si-Min
2014-10-01
Dating the time of divergence and understanding speciation processes are central to the study of the evolutionary history of organisms but are notoriously difficult. The difficulty is largely rooted in variations in the ancestral population size or in the genealogy variation across loci. To depict the speciation processes and divergence histories of three monophyletic Takydromus species endemic to Taiwan, we sequenced 20 nuclear loci and combined with one mitochondrial locus published in GenBank. They were analysed by a multispecies coalescent approach within a Bayesian framework. Divergence dating based on the gene tree approach showed high variation among loci, and the divergence was estimated at an earlier date than when derived by the species-tree approach. To test whether variations in the ancestral population size accounted for the majority of this variation, we conducted computer inferences using isolation-with-migration (IM) and approximate Bayesian computation (ABC) frameworks. The results revealed that gene flow during the early stage of speciation was strongly favoured over the isolation model, and the initiation of the speciation process was far earlier than the dates estimated by gene- and species-based divergence dating. Due to their limited dispersal ability, it is suggested that geographical isolation may have played a major role in the divergence of these Takydromus species. Nevertheless, this study reveals a more complex situation and demonstrates that gene flow during the speciation process cannot be overlooked and may have a great impact on divergence dating. By using multilocus data and incorporating Bayesian coalescence approaches, we provide a more biologically realistic framework for delineating the divergence history of Takydromus. © 2014 John Wiley & Sons Ltd.
Differences in Brain Transcriptomes of Closely Related Baikal Coregonid Species
Bychenko, Oksana S.; Sukhanova, Lyubov V.; Azhikina, Tatyana L.; Skvortsov, Timofey A.; Belomestnykh, Tuyana V.; Sverdlov, Eugene D.
2014-01-01
The aim of this work was to get deeper insight into genetic factors involved in the adaptive divergence of closely related species, specifically two representatives of Baikal coregonids, Baikal whitefish (Coregonus baicalensis Dybowski) and Baikal omul (Coregonus migratorius Georgi), which diverged from a common ancestor as recently as 10–20 thousand years ago. Using the Serial Analysis of Gene Expression method, we obtained libraries of short representative cDNA sequences (tags) from the brains of Baikal whitefish and omul. A comparative analysis of the libraries revealed quantitative differences among ~4% of the tags of the fishes under study. Based on the similarity of these tags with cDNA of known organisms, we identified candidate genes taking part in adaptive divergence. The most important candidate genes related to the adaptation of Baikal whitefish and Baikal omul, identified in this work, belong to the genes of cell metabolism, nervous and immune systems, protein synthesis, and regulatory genes as well as to DTSsa4 Tc1-like transposons which are widespread among fishes. PMID:24719892
Blazier, J Chris; Ruhlman, Tracey A; Weng, Mao-Lun; Rehman, Sumaiyah K; Sabir, Jamal S M; Jansen, Robert K
2016-04-18
Genes for the plastid-encoded RNA polymerase (PEP) persist in the plastid genomes of all photosynthetic angiosperms. However, three unrelated lineages (Annonaceae, Passifloraceae and Geraniaceae) have been identified with unusually divergent open reading frames (ORFs) in the conserved region of rpoA, the gene encoding the PEP α subunit. We used sequence-based approaches to evaluate whether these genes retain function. Both gene sequences and complete plastid genome sequences were assembled and analyzed from each of the three angiosperm families. Multiple lines of evidence indicated that the rpoA sequences are likely functional despite retaining as low as 30% nucleotide sequence identity with rpoA genes from outgroups in the same angiosperm order. The ratio of non-synonymous to synonymous substitutions indicated that these genes are under purifying selection, and bioinformatic prediction of conserved domains indicated that functional domains are preserved. One of the lineages (Pelargonium, Geraniaceae) contains species with multiple rpoA-like ORFs that show evidence of ongoing inter-paralog gene conversion. The plastid genomes containing these divergent rpoA genes have experienced extensive structural rearrangement, including large expansions of the inverted repeat. We propose that illegitimate recombination, not positive selection, has driven the divergence of rpoA.
Clustering evolving proteins into homologous families.
Chan, Cheong Xin; Mahbob, Maisarah; Ragan, Mark A
2013-04-08
Clustering sequences into groups of putative homologs (families) is a critical first step in many areas of comparative biology and bioinformatics. The performance of clustering approaches in delineating biologically meaningful families depends strongly on characteristics of the data, including content bias and degree of divergence. New, highly scalable methods have recently been introduced to cluster the very large datasets being generated by next-generation sequencing technologies. However, there has been little systematic investigation of how characteristics of the data impact the performance of these approaches. Using clusters from a manually curated dataset as reference, we examined the performance of a widely used graph-based Markov clustering algorithm (MCL) and a greedy heuristic approach (UCLUST) in delineating protein families coded by three sets of bacterial genomes of different G+C content. Both MCL and UCLUST generated clusters that are comparable to the reference sets at specific parameter settings, although UCLUST tends to under-cluster compositionally biased sequences (G+C content 33% and 66%). Using simulated data, we sought to assess the individual effects of sequence divergence, rate heterogeneity, and underlying G+C content. Performance decreased with increasing sequence divergence, decreasing among-site rate variation, and increasing G+C bias. Two MCL-based methods recovered the simulated families more accurately than did UCLUST. MCL using local alignment distances is more robust across the investigated range of sequence features than are greedy heuristics using distances based on global alignment. Our results demonstrate that sequence divergence, rate heterogeneity and content bias can individually and in combination affect the accuracy with which MCL and UCLUST can recover homologous protein families. For application to data that are more divergent, and exhibit higher among-site rate variation and/or content bias, MCL may often be the better choice, especially if computational resources are not limiting.
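The Markov clustering idea named in the abstract above can be illustrated with a minimal sketch. This is a plain-Python toy version of the generic MCL loop (expansion, inflation, convergence), not the software or parameter settings used in the study; the function names, the `inflation=2.0` default, and the 0/1 similarity matrix are assumptions made for illustration only.

```python
def _normalize_columns(m):
    """Rescale every column so it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] for j in range(n)] for i in range(n)]


def _matmul(a, b):
    """Plain-Python square matrix product."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mcl(similarity, inflation=2.0, max_iter=100, tol=1e-9):
    """Toy Markov clustering on a symmetric, non-negative similarity matrix.

    Returns a set of frozensets of node indices. `inflation` is an
    illustrative default, not a value taken from the abstract.
    """
    n = len(similarity)
    # Add self-loops so every column has a positive sum, then normalize.
    m = [[float(similarity[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = _normalize_columns(m)
    for _ in range(max_iter):
        expanded = _matmul(m, m)                       # expansion: spread flow
        inflated = _normalize_columns(                 # inflation: sharpen flow
            [[v ** inflation for v in row] for row in expanded])
        delta = max(abs(inflated[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = inflated
        if delta < tol:
            break
    clusters = set()
    for i in range(n):
        if m[i][i] > 1e-6:  # node i retains flow to itself: an attractor
            clusters.add(frozenset(j for j in range(n) if m[i][j] > 1e-6))
    return clusters


# Hypothetical input: two families of three sequences, with hits only
# inside each family.
toy = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
families = mcl(toy)
```

In practice the matrix entries would be alignment-derived similarity scores rather than 0/1 values, and a sparse, optimized implementation would be needed for datasets of the size the abstract describes.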
Sun, Cheng; Wyngaard, Grace; Walton, D Brian; Wichman, Holly A; Mueller, Rachel Lockridge
2014-03-11
Chromatin diminution is the programmed deletion of DNA from presomatic cell or nuclear lineages during development, producing single organisms that contain two different nuclear genomes. Phylogenetically diverse taxa undergo chromatin diminution: some ciliates, nematodes, copepods, and vertebrates. In cyclopoid copepods, chromatin diminution occurs in taxa with massively expanded germline genomes; depending on species, germline genome sizes range from 15-75 Gb, 12-74 Gb of which are lost from pre-somatic cell lineages at germline-soma differentiation. This is more than an order of magnitude more sequence than is lost from other taxa. To date, the sequences excised from copepods have not been analyzed using large-scale genomic datasets, and the processes underlying germline genomic gigantism in this clade, as well as the functional significance of chromatin diminution, have remained unknown. Here, we used high-throughput genomic sequencing and qPCR to characterize the germline and somatic genomes of Mesocyclops edax, a freshwater cyclopoid copepod with a germline genome of ~15 Gb and a somatic genome of ~3 Gb. We show that most of the excised DNA consists of repetitive sequences that are either 1) verifiable transposable elements (TEs), or 2) non-simple repeats of likely TE origin. Repeat elements in both genomes are skewed towards younger (i.e. less divergent) elements. Excised DNA is a non-random sample of the germline repeat element landscape; younger elements, and high frequency DNA transposons and LINEs, are disproportionately eliminated from the somatic genome. Our results suggest that germline genome expansion in M. edax reflects explosive repeat element proliferation, and that billions of base pairs of such repeats are deleted from the somatic genome every generation. Thus, we hypothesize that chromatin diminution is a mechanism that controls repeat element load, and that this load can evolve to be divergent between tissue types within single organisms.
2014-01-01
Background: Chromatin diminution is the programmed deletion of DNA from presomatic cell or nuclear lineages during development, producing single organisms that contain two different nuclear genomes. Phylogenetically diverse taxa undergo chromatin diminution: some ciliates, nematodes, copepods, and vertebrates. In cyclopoid copepods, chromatin diminution occurs in taxa with massively expanded germline genomes; depending on species, germline genome sizes range from 15–75 Gb, 12–74 Gb of which are lost from pre-somatic cell lineages at germline-soma differentiation. This is more than an order of magnitude more sequence than is lost from other taxa. To date, the sequences excised from copepods have not been analyzed using large-scale genomic datasets, and the processes underlying germline genomic gigantism in this clade, as well as the functional significance of chromatin diminution, have remained unknown. Results: Here, we used high-throughput genomic sequencing and qPCR to characterize the germline and somatic genomes of Mesocyclops edax, a freshwater cyclopoid copepod with a germline genome of ~15 Gb and a somatic genome of ~3 Gb. We show that most of the excised DNA consists of repetitive sequences that are either 1) verifiable transposable elements (TEs), or 2) non-simple repeats of likely TE origin. Repeat elements in both genomes are skewed towards younger (i.e. less divergent) elements. Excised DNA is a non-random sample of the germline repeat element landscape; younger elements, and high frequency DNA transposons and LINEs, are disproportionately eliminated from the somatic genome. Conclusions: Our results suggest that germline genome expansion in M. edax reflects explosive repeat element proliferation, and that billions of base pairs of such repeats are deleted from the somatic genome every generation. Thus, we hypothesize that chromatin diminution is a mechanism that controls repeat element load, and that this load can evolve to be divergent between tissue types within single organisms. PMID:24618421
Jennings, W Bryan; Wogel, Henrique; Bilate, Marcos; Salles, Rodrigo de O L; Buckup, Paulo A
2016-09-01
The microhylid frogs belonging to the genus Arcovomer have been reported from lowland Atlantic Rainforest in the Brazilian states of Espírito Santo, Rio de Janeiro, and São Paulo. Here, we use DNA barcoding to assess levels of genetic divergence between apparently isolated populations in Espírito Santo and Rio de Janeiro. Our mtDNA data, consisting of cytochrome oxidase subunit I (COI) nucleotide sequences, reveal 13.2% uncorrected and 30.4% TIM2 + I + Γ corrected genetic divergences between these two populations. This level of divergence exceeds the suggested 10% uncorrected divergence threshold for elevating amphibian populations to candidate species using this marker, which implies that the Espírito Santo population is a species distinct from Arcovomer passarellii. Calibration of our model-corrected sequence divergence estimates suggests that the time of population divergence falls between 12 and 29 million years ago.
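The "uncorrected" divergence quoted in the abstract above is conventionally the p-distance: the proportion of aligned sites at which two sequences differ. The sketch below shows that calculation and the 10% threshold check. The function names, the decision to skip gaps and ambiguous bases, and the short example strings are assumptions for illustration; the model-corrected (TIM2 + I + Γ) estimate requires a fitted substitution model and is not reproduced here.

```python
# 10% uncorrected COI divergence threshold mentioned in the abstract.
CANDIDATE_SPECIES_THRESHOLD = 0.10


def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected p-distance between two aligned nucleotide sequences.

    Sites where either sequence has a gap or an ambiguous base are
    skipped (an assumption of this sketch, not a rule from the abstract).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return mismatches / compared


def exceeds_threshold(seq_a: str, seq_b: str,
                      threshold: float = CANDIDATE_SPECIES_THRESHOLD) -> bool:
    """True when the uncorrected divergence is strictly above the threshold."""
    return p_distance(seq_a, seq_b) > threshold


# Made-up toy alignment: one mismatch over five comparable sites.
example = p_distance("ACGT-A", "ACGA-A")
```

A real barcode comparison would run over the full aligned COI fragment for every specimen pair, and the strict inequality used here is one possible reading of "exceeds".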
DNA barcodes for dragonflies and damselflies (Odonata) of Mindanao, Philippines.
Casas, Princess Angelie S; Sing, Kong-Wah; Lee, Ping-Shin; Nuñeza, Olga M; Villanueva, Reagan Joseph T; Wilson, John-James
2018-03-01
Reliable species identification provides a sounder basis for use of species in the order Odonata as biological indicators and for their conservation, an urgent concern as many species are threatened with imminent extinction. We generated 134 COI barcodes from 36 morphologically identified species of Odonata collected from Mindanao Island, representing 10 families and 19 genera. Intraspecific sequence divergences ranged from 0 to 6.7% with four species showing more than 2%, while interspecific sequence divergences ranged from 0.5 to 23.3% with seven species showing less than 2%. Consequently, no distinct gap was observed between intraspecific and interspecific DNA barcode divergences. The numerous islands of the Philippine archipelago may have facilitated rapid speciation in the Odonata and resulted in low interspecific sequence divergences among closely related groups of species. This study contributes DNA barcodes for 36 morphologically identified species of Odonata reported from Mindanao including 31 species with no previous DNA barcode records.
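The "barcode gap" discussed in the abstract above is the separation between the largest within-species divergence and the smallest between-species divergence. A minimal sketch of that check follows. The record layout, function name, and species labels are hypothetical, and the four toy distances are chosen only to mirror the reported extremes (0 to 6.7% intraspecific, 0.5 to 23.3% interspecific); they are not the study's data.

```python
def barcode_gap(pairwise):
    """Summarize intra- versus interspecific divergence.

    `pairwise` is an iterable of (species_a, species_b, distance) tuples,
    one per specimen pair. Returns (max_intra, min_inter, has_gap), where
    has_gap is True only if every interspecific distance exceeds every
    intraspecific distance.
    """
    intra = [d for sp_a, sp_b, d in pairwise if sp_a == sp_b]
    inter = [d for sp_a, sp_b, d in pairwise if sp_a != sp_b]
    if not intra or not inter:
        raise ValueError("need both intra- and interspecific comparisons")
    max_intra = max(intra)
    min_inter = min(inter)
    return max_intra, min_inter, min_inter > max_intra


# Hypothetical specimen pairs whose extremes mirror the reported ranges.
toy_pairs = [
    ("sp_A", "sp_A", 0.000),
    ("sp_B", "sp_B", 0.067),
    ("sp_A", "sp_B", 0.005),
    ("sp_C", "sp_D", 0.233),
]
max_intra, min_inter, has_gap = barcode_gap(toy_pairs)
```

With these values the ranges overlap, so `has_gap` is false, which is the situation the abstract describes as "no distinct gap".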
Llopart, Ana
2018-05-01
The hemizygosity of the X (Z) chromosome fully exposes the fitness effects of mutations on that chromosome and has evolutionary consequences on the relative rates of evolution of X and autosomes. Specifically, several population genetics models predict increased rates of evolution in X-linked loci relative to autosomal loci. This prediction of faster-X evolution has been evaluated and confirmed for both protein coding sequences and gene expression. In the case of faster-X evolution for gene expression divergence, it is often assumed that variation in 5' noncoding sequences is associated with variation in transcript abundance between species but a formal, genomewide test of this hypothesis is still missing. Here, I use whole genome sequence data in Drosophila yakuba and D. santomea to evaluate this hypothesis and report positive correlations between sequence divergence at 5' noncoding sequences and gene expression divergence. I also examine polymorphism and divergence in 9,279 noncoding sequences located at the 5' end of annotated genes and detected multiple signals of positive selection. Notably, I used the traditional synonymous sites as neutral reference to test for adaptive evolution, but I also used bases 8-30 of introns <65 bp, which have been proposed to be a better neutral choice. X-linked genes with high degree of male-biased expression show the most extreme adaptive pattern at 5' noncoding regions, in agreement with faster-X evolution for gene expression divergence and a higher incidence of positively selected recessive mutations. © 2018 The Authors. Molecular Ecology Published by John Wiley & Sons Ltd.
Lobo, Jorge; Ferreira, Maria S; Antunes, Ilisa C; Teixeira, Marcos A L; Borges, Luisa M S; Sousa, Ronaldo; Gomes, Pedro A; Costa, Maria Helena; Cunha, Marina R; Costa, Filipe O
2017-02-01
In this study we compared DNA barcode-suggested species boundaries with morphology-based species identifications in the amphipod fauna of the southern European Atlantic coast. DNA sequences of the cytochrome c oxidase subunit I barcode region (COI-5P) were generated for 43 morphospecies (178 specimens) collected along the Portuguese coast which, together with publicly available COI-5P sequences, produced a final dataset comprising 68 morphospecies and 295 sequences. Seventy-five BINs (Barcode Index Numbers) were assigned to these morphospecies, of which 48 were concordant (i.e., 1 BIN = 1 species), 8 were taxonomically discordant, and 19 were singletons. Twelve species had matching sequences (<2% distance) with conspecifics from distant locations (e.g., North Sea). Seven morphospecies were assigned to multiple, and highly divergent, BINs, including specimens of Corophium multisetosum (18% divergence) and Dexamine spiniventris (16% divergence), which originated from sampling locations on the west coast of Portugal (only about 36 and 250 km apart, respectively). We also found deep divergence (4%-22%) among specimens of seven species from Portugal compared to those from the North Sea and Italy. The detection of evolutionarily meaningful divergence among populations of several amphipod species from southern Europe reinforces the need for a comprehensive re-assessment of the diversity of this faunal group.
Chambers, E Anne; Hebert, Paul D N
2016-01-01
High rates of species discovery and loss have led to the urgent need for more rapid assessment of species diversity in the herpetofauna. DNA barcoding allows for the preliminary identification of species based on sequence divergence. Prior DNA barcoding work on reptiles and amphibians has revealed higher biodiversity counts than previously estimated due to cases of cryptic and undiscovered species. Past studies have provided DNA barcodes for just 14% of the North American herpetofauna, revealing the need for expanded coverage. This study extends the DNA barcode reference library for North American herpetofauna, assesses the utility of this approach in aiding species delimitation, and examines the correspondence between current species boundaries and sequence clusters designated by the BIN system. Sequences were obtained from 730 specimens, representing 274 species (43%) from the North American herpetofauna. Mean intraspecific divergences were 1% and 3%, while average congeneric sequence divergences were 16% and 14% in amphibians and reptiles, respectively. BIN assignments corresponded with current species boundaries in 79% of amphibians, 100% of turtles, and 60% of squamates. Deep divergences (>2%) were noted in 35% of squamate and 16% of amphibian species, and low divergences (<2%) occurred in 12% of reptiles and 23% of amphibians, patterns reflected in BIN assignments. Sequence recovery declined with specimen age, and variation in recovery success was noted among collections. Within collections, barcodes effectively flagged seven mislabeled tissues, and barcode fragments were recovered from five formalin-fixed specimens. This study demonstrates that DNA barcodes can effectively flag errors in museum collections, while BIN splits and merges reveal taxa belonging to deeply diverged or hybridizing lineages. This study is the first effort to compile a reference library of DNA barcodes for herpetofauna on a continental scale.
Chambers, E. Anne; Hebert, Paul D. N.
2016-01-01
Background: High rates of species discovery and loss have led to the urgent need for more rapid assessment of species diversity in the herpetofauna. DNA barcoding allows for the preliminary identification of species based on sequence divergence. Prior DNA barcoding work on reptiles and amphibians has revealed higher biodiversity counts than previously estimated due to cases of cryptic and undiscovered species. Past studies have provided DNA barcodes for just 14% of the North American herpetofauna, revealing the need for expanded coverage. Methodology/Principal Findings: This study extends the DNA barcode reference library for North American herpetofauna, assesses the utility of this approach in aiding species delimitation, and examines the correspondence between current species boundaries and sequence clusters designated by the BIN system. Sequences were obtained from 730 specimens, representing 274 species (43%) from the North American herpetofauna. Mean intraspecific divergences were 1% and 3%, while average congeneric sequence divergences were 16% and 14% in amphibians and reptiles, respectively. BIN assignments corresponded with current species boundaries in 79% of amphibians, 100% of turtles, and 60% of squamates. Deep divergences (>2%) were noted in 35% of squamate and 16% of amphibian species, and low divergences (<2%) occurred in 12% of reptiles and 23% of amphibians, patterns reflected in BIN assignments. Sequence recovery declined with specimen age, and variation in recovery success was noted among collections. Within collections, barcodes effectively flagged seven mislabeled tissues, and barcode fragments were recovered from five formalin-fixed specimens. Conclusions/Significance: This study demonstrates that DNA barcodes can effectively flag errors in museum collections, while BIN splits and merges reveal taxa belonging to deeply diverged or hybridizing lineages. This study is the first effort to compile a reference library of DNA barcodes for herpetofauna on a continental scale. PMID:27116180
Lim, K Yoong; Kovarik, Ales; Matyasek, Roman; Chase, Mark W; Knapp, Sandra; McCarthy, Elizabeth; Clarkson, James J; Leitch, Andrew R
2006-12-01
Combining phylogenetic reconstructions of species relationships with comparative genomic approaches is a powerful way to decipher evolutionary events associated with genome divergence. Here, we reconstruct the history of karyotype and tandem repeat evolution in species of diploid Nicotiana section Alatae. By analysis of plastid DNA, we resolved two clades with high bootstrap support, one containing N. alata, N. langsdorffii, N. forgetiana and N. bonariensis (called the n = 9 group) and another containing N. plumbaginifolia and N. longiflora (called the n = 10 group). Despite little plastid DNA sequence divergence, we observed, via fluorescent in situ hybridization, substantial chromosomal repatterning, including altered chromosome numbers, structure and distribution of repeats. Effort was focussed on 35S and 5S nuclear ribosomal DNA (rDNA) and the HRS60 satellite family of tandem repeats comprising the elements HRS60, NP3R and NP4R. We compared divergence of these repeats in diploids and polyploids of Nicotiana. There are dramatic shifts in the distribution of the satellite repeats and complete replacement of intergenic spacers (IGSs) of 35S rDNA associated with divergence of the species in section Alatae. We suggest that sequence homogenization has replaced HRS60 family repeats at sub-telomeric regions, but that this process may not occur, or occurs more slowly, when the repeats are found at intercalary locations. Sequence homogenization acts more rapidly (at least two orders of magnitude) on 35S rDNA than 5S rDNA and sub-telomeric satellite sequences. This rapid rate of divergence is analogous to that found in polyploid species, and is therefore, in plants, not only associated with polyploidy.
Whole genome investigation of a divergent clade of the pathogen Streptococcus suis
Baig, Abiyad; Weinert, Lucy A.; Peters, Sarah E.; Howell, Kate J.; Chaudhuri, Roy R.; Wang, Jinhong; Holden, Matthew T. G.; Parkhill, Julian; Langford, Paul R.; Rycroft, Andrew N.; Wren, Brendan W.; Tucker, Alexander W.; Maskell, Duncan J.
2015-01-01
Streptococcus suis is a major porcine and zoonotic pathogen responsible for significant economic losses in the pig industry and an increasing number of human cases. Multiple isolates of S. suis show marked genomic diversity. Here, we report the analysis of whole genome sequences of nine pig isolates that caused disease typical of S. suis and had phenotypic characteristics of S. suis, but whose genomes were divergent from those of many other S. suis isolates. Comparison of protein sequences predicted from divergent genomes with those from normal S. suis reduced the size of the core genome from 793 to only 397 genes. Divergence was clear if phylogenetic analysis was performed on reduced core genes and MLST alleles. Phylogenies based on certain other genes (16S rRNA, sodA, recN, and cpn60) did not show divergence for all isolates, suggesting recombination between some divergent isolates and normal S. suis for these genes. Indeed, there is evidence of recent recombination between the divergent and normal S. suis genomes for 249 of 397 core genes. In addition, phylogenetic analysis based on the 16S rRNA gene and 132 genes that were conserved between the divergent isolates and representatives of the broader Streptococcus genus showed that divergent isolates were more closely related to S. suis. Six out of nine divergent isolates possessed an S. suis-like capsule region with variation in capsular gene sequences, but the remaining three did not have a discrete capsule locus. The majority (40/70) of virulence-associated genes in normal S. suis were present in the divergent genomes. Overall, the divergent isolates extend the current diversity of the S. suis species, but the phenotypic similarities and the large amount of gene exchange with normal S. suis give insufficient evidence to assign these isolates to a new species or subspecies. Further sampling and whole genome analysis of more isolates are warranted to understand the diversity of the species. PMID:26583006
Tracking the origins of the cave bear (Ursus spelaeus) by mitochondrial DNA sequencing.
Hänni, C; Laudet, V; Stehelin, D; Taberlet, P
1994-01-01
The different European populations of Ursus arctos, the brown bear, were recently studied for mitochondrial DNA polymorphism. Two clearly distinct lineages (eastern and western) were found, which may have diverged approximately 850,000 years ago. In this context, it was interesting to study the cave bear, Ursus spelaeus, a species which became extinct 20,000 years ago. In this study, we have amplified and sequenced a fragment of 139-bp in the mitochondrial DNA control region of a 40,000-year-old specimen of U. spelaeus. Phylogenetic reconstructions using this sequence and the European brown bear sequences already published suggest that U. spelaeus diverged from an early offshoot of U. arctos--i.e., approximately at the same time as the divergence of the two main lineages of U. arctos. This divergence probably took place at the earliest glaciation, likely due to geographic separation during the earlier Quaternary cold periods. This result is in agreement with the paleontological data available and suggests a good correspondence between molecular and morphological data. PMID:7991628
Zhang, Honghai; Chen, Lei
2011-03-01
The dhole (Cuon alpinus) is the only extant species in the genus Cuon (Carnivora: Canidae). In the present study, the complete mitochondrial genome of the dhole was sequenced. The total length is 16672 base pairs, which is the shortest in Canidae. Sequence analysis revealed that most mitochondrial genomic functional regions were highly consistent among canids except the CSB domain of the control region. The difference in length among the Canidae mitochondrial genome sequences is mainly due to the number of short tandemly repeated segments in the CSB domain. Phylogenetic analysis was performed on the concatenated data set of 14 mitochondrial genes of 8 canids using maximum parsimony (MP), maximum likelihood (ML) and Bayesian inference (BI) methods. The genera Vulpes and Nyctereutes formed a sister group and split first within Canidae, followed by Cuon. The divergence within the genus Canis was the latest. The divergence of domestic dogs after that of Canis lupus laniger is fully supported by all three topologies. Pairwise sequence divergence data of different mitochondrial genes among canids were also determined. Except for the synonymous substitutions in protein-coding genes, the control region exhibits the highest sequence divergences. The synonymous rates are approximately two to six times higher than those of the non-synonymous sites, except for a slightly higher rate in the non-synonymous substitution between Cuon alpinus and Vulpes vulpes. 16S rRNA genes have a slightly faster sequence divergence than 12S rRNA and tRNA genes. Based on nucleotide substitutions of tRNA genes and rRNA genes, the times since divergence between the dhole and other canids, and between domestic dogs and three subspecies of wolves, were evaluated. The result indicates that Vulpes and Nyctereutes have a close phylogenetic relationship and the divergence of Nyctereutes is a little earlier. 
The Tibetan wolf may be an archaic pedigree within wolf subspecies. The genetic distance between wolves and domestic dogs is less than that among different subspecies of wolves. The domestication of dogs was about 1.56-1.92 million years ago or even earlier.
Bayesian estimation of post-Messinian divergence times in Balearic Island lizards.
Brown, R P; Terrasa, B; Pérez-Mellado, V; Castro, J A; Hoskisson, P A; Picornell, A; Ramon, M M
2008-07-01
Phylogenetic relationships and timings of major cladogenesis events are investigated in the Balearic Island lizards Podarcis lilfordi and P. pityusensis using 2675 bp of mitochondrial and nuclear DNA sequences. Partitioned Bayesian and Maximum Parsimony analyses provided a well-resolved phylogeny with high node-support values. Bayesian MCMC estimation of node dates was investigated by comparing means of posterior distributions from different subsets of the sequence against the most robust analysis which used multiple partitions and allowed for rate heterogeneity among branches under a rate-drift model. Evolutionary rates were systematically underestimated and thus divergence times overestimated when sequences containing lower numbers of variable sites were used (based on ingroup node constraints). The following analyses allowed the best recovery of node times under the constant-rate (i.e., perfect clock) model: (i) all cytochrome b sequence (partitioned by codon position), (ii) cytochrome b (codon position 3 alone), (iii) NADH dehydrogenase (subunits 1 and 2; partitioned by codon position), (iv) cytochrome b and NADH dehydrogenase sequence together (six gene-codon partitions), (v) all unpartitioned sequence, (vi) a full multipartition analysis (nine partitions). Of these, only (iv) and (vi) performed well under the rate-drift model. These findings have significant implications for dating of recent divergence times in other taxa. The earliest P. lilfordi cladogenesis event (divergence of Menorcan populations) occurred before the end of the Pliocene, some 2.6 Ma. Subsequent events led to a West Mallorcan lineage (2.0 Ma ago), followed 1.2 Ma ago by divergence of populations from the southern part of the Cabrera archipelago from a widely-distributed group from north Cabrera, northern and southern Mallorcan islets. Divergence within P. pityusensis is more recent with the main Ibiza and Formentera clades sharing a common ancestor at about 1.0 Ma ago. 
Climatic and sea level changes are likely to have initiated cladogenesis, with lineages making secondary contact during periodic landbridge formation. This oscillating cross-archipelago pattern in which ancient divergence is followed by repeated contact resembles that seen between East-West refugia populations from mainland Europe.
Mouse Vk gene classification by nucleic acid sequence similarity.
Strohal, R; Helmberg, A; Kroemer, G; Kofler, R
1989-01-01
Analyses of immunoglobulin (Ig) variable (V) region gene usage in the immune response, estimates of V gene germline complexity, and other nucleic acid hybridization-based studies depend on the extent to which such genes are related (i.e., sequence similarity) and their organization in gene families. While mouse Igh heavy chain V region (VH) gene families are relatively well-established, a corresponding systematic classification of Igk light chain V region (Vk) genes has not been reported. The present analysis, in the course of which we reviewed the known extent of the Vk germline gene repertoire and Vk gene usage in a variety of responses to foreign and self antigens, provides a classification of mouse Vk genes in gene families composed of members with greater than 80% overall nucleic acid sequence similarity. This classification differed in several aspects from that of VH genes: only some Vk gene families were as clearly separated (by greater than 25% sequence dissimilarity) as typical VH gene families; most Vk gene families were closely related and, in several instances, members from different families were very similar (greater than 80%) over large sequence portions; frequently, classification by nucleic acid sequence similarity diverged from existing classifications based on amino-terminal protein sequence similarity. Our data have implications for Vk gene analyses by nucleic acid hybridization and describe potentially important differences in sequence organization between VH and Vk genes.
Chen, Peng; Han, Yuqing; Zhu, Chaoying; Gao, Bin; Ruan, Luzhang
2017-12-01
The complete mitochondrial genome sequences of Porzana fusca and Porzana pusilla were determined. The two avian species share a high degree of homology in terms of mitochondrial genome organization and gene arrangement. Their corresponding mitochondrial genomes are 16,935 and 16,978 bp and consist of 37 genes and a control region. Their PCGs were both 11,365 bp long and have similar structure. Their tRNA gene sequences could be folded into the canonical cloverleaf secondary structure, except for tRNA-Ser (AGY), which lost its "DHU" arm. Based on the concatenated nucleotide sequences of the complete mitochondrial DNA genes of 16 Rallidae species, reconstruction of phylogenetic trees and analysis of the molecular clock of P. fusca and P. pusilla indicated that these species form a sister group, which in turn is sister to Rallina eurizonoides. The genus Gallirallus is a sister group to genus Lewinia, and these groups in turn are sister groups to genus Porphyrio. Moreover, molecular clock analyses suggested that the basal divergence of Rallidae could be traced back to 40.47 (41.46‒39.45) million years ago (Mya), and the divergence of Porzana occurred approximately 5.80 (15.16‒0.79) Mya.
Koloniuk, Igor; Fránová, Jana; Sarkisova, Tatiana; Přibylová, Jaroslava
2018-05-04
Strawberry crinkle disease is one of the major diseases that threatens strawberry production. Although the biological properties of the agent, strawberry crinkle virus (SCV), have been thoroughly investigated, its complete genome sequence has never been published. Existing RT-PCR-based detection relies on a partial sequence of the L protein gene, presumably the least expressed viral gene. Here, we present complete sequences of two divergent SCV isolates co-infecting a single plant, Fragaria x ananassa cv. Čačanská raná.
Lopez, Philippe; Halary, Sébastien; Bapteste, Eric
2015-10-26
Microbial genetic diversity is often investigated via the comparison of relatively similar 16S molecules through multiple alignments between reference sequences and novel environmental samples using phylogenetic trees, direct BLAST matches, or phylotype counts. However, are we missing novel lineages in the microbial dark universe by relying on standard phylogenetic and BLAST methods? If so, how can we probe that universe using alternative approaches? We performed a novel type of multi-marker analysis of genetic diversity exploiting the topology of inclusive sequence similarity networks. Our protocol identified 86 ancient gene families, well distributed and rarely transferred across the 3 domains of life, and retrieved their environmental homologs among 10 million predicted ORFs from human gut samples and other metagenomic projects. Numerous highly divergent environmental homologs were observed in gut samples, although the most divergent genes were over-represented in non-gut environments. In our networks, most divergent environmental genes grouped exclusively with uncultured relatives, in maximal cliques. Sequences within these groups were under strong purifying selection and presented a range of genetic variation comparable to that of a prokaryotic domain. Many gene families included environmental homologs that were highly divergent from cultured homologs: in 79 gene families (including 18 ribosomal proteins), Bacteria and Archaea were less divergent from each other than some groups of environmental sequences were from any cultured or viral homologs. Moreover, some groups of environmental homologs branched very deeply in phylogenetic trees of life, when they were not too divergent to be aligned. 
These results underline how limited our understanding of the most diverse elements of the microbial world remains, and encourage a deeper exploration of natural communities and their genetic resources, hinting at the possibility that still unknown yet major divisions of life have yet to be discovered.
The systematic sequencing of the cancer genome has led to the identification of numerous genetic alterations in cancer. However, a deeper understanding of the functional consequences of these alterations is necessary to guide appropriate therapeutic strategies. Here, we describe Onco-GPS (OncoGenic Positioning System), a data-driven analysis framework to organize individual tumor samples with shared oncogenic alterations onto a reference map defined by their underlying cellular states.
Muwonge, Apollo; Nanyunja, Miriam; Bwogi, Josephine; Lowe, Luis; Liffick, Stephanie L.; Bellini, William J.; Sylvester, Sempala
2005-01-01
We report the first genetic characterization of wildtype measles viruses from Uganda. Thirty-six virus isolates from outbreaks in 6 districts were analyzed from 2000 to 2002. Analyses of sequences of the nucleoprotein (N) and hemagglutinin (H) genes showed that the Ugandan isolates were all closely related, and phylogenetic analysis indicated that these viruses were members of a unique group within clade D. Sequences of the Ugandan viruses were not closely related to any of the World Health Organization reference sequences representing the 22 currently recognized genotypes. The minimum nucleotide divergence between the Ugandan viruses and the most closely related reference strain, genotype D2, was 3.1% for the N gene and 2.6% for the H gene. Therefore, Ugandan viruses should be considered a new, proposed genotype (d10). This new sequence information will expand the utility of molecular epidemiologic techniques for describing measles transmission patterns in eastern Africa. PMID:16318690
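The "minimum nucleotide divergence" reported above is a statement about pairwise differences between aligned sequences. As a rough illustration only, the sketch below computes an uncorrected percent divergence (the proportion of differing sites) and takes the minimum over a set of isolates. The abstract does not say which distance measure the authors used, so the uncorrected measure, the function names `percent_divergence` and `minimum_divergence`, and the toy sequences are all assumptions made for this example.

```python
def percent_divergence(seq1, seq2):
    """Uncorrected percent divergence between two pre-aligned sequences.

    Sites where either sequence has a gap or ambiguous base are skipped.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to equal length")
    # Keep only columns where both sequences carry an unambiguous base.
    pairs = [(a, b) for a, b in zip(seq1.upper(), seq2.upper())
             if a in "ACGT" and b in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    differences = sum(a != b for a, b in pairs)
    return 100.0 * differences / len(pairs)


def minimum_divergence(isolates, reference):
    """Smallest percent divergence between any isolate and one reference."""
    return min(percent_divergence(seq, reference) for seq in isolates)


if __name__ == "__main__":
    # Hypothetical 10-base toy alignment, not real measles virus data.
    reference = "ACGTACGTAC"
    isolates = ["ACGTACGTAT", "ACGAACGTAT"]
    print(minimum_divergence(isolates, reference))
```

In practice the comparison would be repeated against every reference genotype, and the reference with the smallest value would be the "most closely related" strain.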
Blazier, J. Chris; Ruhlman, Tracey A.; Weng, Mao-Lun; Rehman, Sumaiyah K.; Sabir, Jamal S. M.; Jansen, Robert K.
2016-01-01
Genes for the plastid-encoded RNA polymerase (PEP) persist in the plastid genomes of all photosynthetic angiosperms. However, three unrelated lineages (Annonaceae, Passifloraceae and Geraniaceae) have been identified with unusually divergent open reading frames (ORFs) in the conserved region of rpoA, the gene encoding the PEP α subunit. We used sequence-based approaches to evaluate whether these genes retain function. Both gene sequences and complete plastid genome sequences were assembled and analyzed from each of the three angiosperm families. Multiple lines of evidence indicated that the rpoA sequences are likely functional despite retaining as low as 30% nucleotide sequence identity with rpoA genes from outgroups in the same angiosperm order. The ratio of non-synonymous to synonymous substitutions indicated that these genes are under purifying selection, and bioinformatic prediction of conserved domains indicated that functional domains are preserved. One of the lineages (Pelargonium, Geraniaceae) contains species with multiple rpoA-like ORFs that show evidence of ongoing inter-paralog gene conversion. The plastid genomes containing these divergent rpoA genes have experienced extensive structural rearrangement, including large expansions of the inverted repeat. We propose that illegitimate recombination, not positive selection, has driven the divergence of rpoA. PMID:27087667
Ashfaq, Muhammad; Prosser, Sean; Nasir, Saima; Masood, Mariyam; Ratnasingham, Sujeevan; Hebert, Paul D. N.
2015-01-01
The study analyzes sequence variation of two mitochondrial genes (COI, cytb) in Pediculus humanus from three countries (Egypt, Pakistan, South Africa) that have received little prior attention, and integrates these results with prior data. Analysis indicates a maximum K2P distance of 10.3% among 960 COI sequences and 13.8% among 479 cytb sequences. Three analytical methods (BIN, PTP, ABGD) reveal five concordant OTUs for COI and cytb. Neighbor-Joining analysis of the COI sequences confirms five clusters: three corresponding to previously recognized mitochondrial clades A, B, C and two new clades, “D” and “E”, showing 2.3% and 2.8% divergence from their nearest neighbors (NN). Cytb data corroborate five clusters, showing that clades “D” and “E” are both 4.6% divergent from their respective NN clades. Phylogenetic analysis supports the monophyly of all clusters recovered by NJ analysis. Divergence time estimates suggest that the earliest split of P. humanus clades occurred slightly more than one million years ago (MYa) and the latest about 0.3 MYa. Sequence divergences in COI and cytb among the five clades of P. humanus are 10X those in their human host, a difference that likely reflects both rate acceleration and the acquisition of lice clades from several archaic hominid lineages. PMID:26373806
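The K2P (Kimura two-parameter) distance quoted above corrects the raw proportion of differing sites for multiple substitutions, treating transitions and transversions separately. With P the proportion of transitional differences and Q the proportion of transversional differences, the standard formula is d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q). The following is a minimal sketch of that formula for one pair of pre-aligned sequences; the function name `k2p_distance` and the toy input are assumptions for illustration, and it is not the software pipeline used in the study.

```python
import math

PURINES = {"A", "G"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two pre-aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to equal length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        # Purine<->purine or pyrimidine<->pyrimidine is a transition.
        if (a in PURINES) == (b in PURINES):
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    term1 = 1.0 - 2.0 * p - q
    term2 = 1.0 - 2.0 * q
    if term1 <= 0.0 or term2 <= 0.0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term1) - 0.25 * math.log(term2)


if __name__ == "__main__":
    # Hypothetical toy pair with a single A<->G transition in 10 sites.
    print(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"))
```

For small divergences the corrected value stays close to the raw proportion of differences; the correction grows as sequences become more divergent.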
A Comparative Encyclopedia of DNA Elements in the Mouse Genome
Yue, Feng; Cheng, Yong; Breschi, Alessandra; Vierstra, Jeff; Wu, Weisheng; Ryba, Tyrone; Sandstrom, Richard; Ma, Zhihai; Davis, Carrie; Pope, Benjamin D.; Shen, Yin; Pervouchine, Dmitri D.; Djebali, Sarah; Thurman, Bob; Kaul, Rajinder; Rynes, Eric; Kirilusha, Anthony; Marinov, Georgi K.; Williams, Brian A.; Trout, Diane; Amrhein, Henry; Fisher-Aylor, Katherine; Antoshechkin, Igor; DeSalvo, Gilberto; See, Lei-Hoon; Fastuca, Meagan; Drenkow, Jorg; Zaleski, Chris; Dobin, Alex; Prieto, Pablo; Lagarde, Julien; Bussotti, Giovanni; Tanzer, Andrea; Denas, Olgert; Li, Kanwei; Bender, M. A.; Zhang, Miaohua; Byron, Rachel; Groudine, Mark T.; McCleary, David; Pham, Long; Ye, Zhen; Kuan, Samantha; Edsall, Lee; Wu, Yi-Chieh; Rasmussen, Matthew D.; Bansal, Mukul S.; Keller, Cheryl A.; Morrissey, Christapher S.; Mishra, Tejaswini; Jain, Deepti; Dogan, Nergiz; Harris, Robert S.; Cayting, Philip; Kawli, Trupti; Boyle, Alan P.; Euskirchen, Ghia; Kundaje, Anshul; Lin, Shin; Lin, Yiing; Jansen, Camden; Malladi, Venkat S.; Cline, Melissa S.; Erickson, Drew T.; Kirkup, Vanessa M; Learned, Katrina; Sloan, Cricket A.; Rosenbloom, Kate R.; de Sousa, Beatriz Lacerda; Beal, Kathryn; Pignatelli, Miguel; Flicek, Paul; Lian, Jin; Kahveci, Tamer; Lee, Dongwon; Kent, W. James; Santos, Miguel Ramalho; Herrero, Javier; Notredame, Cedric; Johnson, Audra; Vong, Shinny; Lee, Kristen; Bates, Daniel; Neri, Fidencio; Diegel, Morgan; Canfield, Theresa; Sabo, Peter J.; Wilken, Matthew S.; Reh, Thomas A.; Giste, Erika; Shafer, Anthony; Kutyavin, Tanya; Haugen, Eric; Dunn, Douglas; Reynolds, Alex P.; Neph, Shane; Humbert, Richard; Hansen, R. 
Scott; De Bruijn, Marella; Selleri, Licia; Rudensky, Alexander; Josefowicz, Steven; Samstein, Robert; Eichler, Evan E.; Orkin, Stuart H.; Levasseur, Dana; Papayannopoulou, Thalia; Chang, Kai-Hsin; Skoultchi, Arthur; Gosh, Srikanta; Disteche, Christine; Treuting, Piper; Wang, Yanli; Weiss, Mitchell J.; Blobel, Gerd A.; Good, Peter J.; Lowdon, Rebecca F.; Adams, Leslie B.; Zhou, Xiao-Qiao; Pazin, Michael J.; Feingold, Elise A.; Wold, Barbara; Taylor, James; Kellis, Manolis; Mortazavi, Ali; Weissman, Sherman M.; Stamatoyannopoulos, John; Snyder, Michael P.; Guigo, Roderic; Gingeras, Thomas R.; Gilbert, David M.; Hardison, Ross C.; Beer, Michael A.; Ren, Bing
2014-01-01
Summary: As the premier model organism in biomedical research, the laboratory mouse shares the majority of protein-coding genes with humans, yet the two mammals differ in significant ways. To gain greater insights into both shared and species-specific transcriptional and cellular regulatory programs in the mouse, the Mouse ENCODE Consortium has mapped transcription, DNase I hypersensitivity, transcription factor binding, chromatin modifications, and replication domains throughout the mouse genome in diverse cell and tissue types. By comparing with the human genome, we not only confirm substantial conservation in the newly annotated potential functional sequences, but also find a large degree of divergence of other sequences involved in transcriptional regulation, chromatin state and higher order chromatin organization. Our results illuminate the wide range of evolutionary forces acting on genes and their regulatory regions, and provide a general resource for research into mammalian biology and mechanisms of human diseases. PMID:25409824
A comparative encyclopedia of DNA elements in the mouse genome.
Yue, Feng; Cheng, Yong; Breschi, Alessandra; Vierstra, Jeff; Wu, Weisheng; Ryba, Tyrone; Sandstrom, Richard; Ma, Zhihai; Davis, Carrie; Pope, Benjamin D; Shen, Yin; Pervouchine, Dmitri D; Djebali, Sarah; Thurman, Robert E; Kaul, Rajinder; Rynes, Eric; Kirilusha, Anthony; Marinov, Georgi K; Williams, Brian A; Trout, Diane; Amrhein, Henry; Fisher-Aylor, Katherine; Antoshechkin, Igor; DeSalvo, Gilberto; See, Lei-Hoon; Fastuca, Meagan; Drenkow, Jorg; Zaleski, Chris; Dobin, Alex; Prieto, Pablo; Lagarde, Julien; Bussotti, Giovanni; Tanzer, Andrea; Denas, Olgert; Li, Kanwei; Bender, M A; Zhang, Miaohua; Byron, Rachel; Groudine, Mark T; McCleary, David; Pham, Long; Ye, Zhen; Kuan, Samantha; Edsall, Lee; Wu, Yi-Chieh; Rasmussen, Matthew D; Bansal, Mukul S; Kellis, Manolis; Keller, Cheryl A; Morrissey, Christapher S; Mishra, Tejaswini; Jain, Deepti; Dogan, Nergiz; Harris, Robert S; Cayting, Philip; Kawli, Trupti; Boyle, Alan P; Euskirchen, Ghia; Kundaje, Anshul; Lin, Shin; Lin, Yiing; Jansen, Camden; Malladi, Venkat S; Cline, Melissa S; Erickson, Drew T; Kirkup, Vanessa M; Learned, Katrina; Sloan, Cricket A; Rosenbloom, Kate R; Lacerda de Sousa, Beatriz; Beal, Kathryn; Pignatelli, Miguel; Flicek, Paul; Lian, Jin; Kahveci, Tamer; Lee, Dongwon; Kent, W James; Ramalho Santos, Miguel; Herrero, Javier; Notredame, Cedric; Johnson, Audra; Vong, Shinny; Lee, Kristen; Bates, Daniel; Neri, Fidencio; Diegel, Morgan; Canfield, Theresa; Sabo, Peter J; Wilken, Matthew S; Reh, Thomas A; Giste, Erika; Shafer, Anthony; Kutyavin, Tanya; Haugen, Eric; Dunn, Douglas; Reynolds, Alex P; Neph, Shane; Humbert, Richard; Hansen, R Scott; De Bruijn, Marella; Selleri, Licia; Rudensky, Alexander; Josefowicz, Steven; Samstein, Robert; Eichler, Evan E; Orkin, Stuart H; Levasseur, Dana; Papayannopoulou, Thalia; Chang, Kai-Hsin; Skoultchi, Arthur; Gosh, Srikanta; Disteche, Christine; Treuting, Piper; Wang, Yanli; Weiss, Mitchell J; Blobel, Gerd A; Cao, Xiaoyi; Zhong, Sheng; Wang, Ting; Good, Peter J; 
Lowdon, Rebecca F; Adams, Leslie B; Zhou, Xiao-Qiao; Pazin, Michael J; Feingold, Elise A; Wold, Barbara; Taylor, James; Mortazavi, Ali; Weissman, Sherman M; Stamatoyannopoulos, John A; Snyder, Michael P; Guigo, Roderic; Gingeras, Thomas R; Gilbert, David M; Hardison, Ross C; Beer, Michael A; Ren, Bing
2014-11-20
The laboratory mouse shares the majority of its protein-coding genes with humans, making it the premier model organism in biomedical research, yet the two mammals differ in significant ways. To gain greater insights into both shared and species-specific transcriptional and cellular regulatory programs in the mouse, the Mouse ENCODE Consortium has mapped transcription, DNase I hypersensitivity, transcription factor binding, chromatin modifications and replication domains throughout the mouse genome in diverse cell and tissue types. By comparing with the human genome, we not only confirm substantial conservation in the newly annotated potential functional sequences, but also find a large degree of divergence of sequences involved in transcriptional regulation, chromatin state and higher order chromatin organization. Our results illuminate the wide range of evolutionary forces acting on genes and their regulatory regions, and provide a general resource for research into mammalian biology and mechanisms of human diseases.
Bernard, Guillaume; Chan, Cheong Xin; Ragan, Mark A
2016-07-01
Alignment-free (AF) approaches have recently been highlighted as alternatives to methods based on multiple sequence alignment in phylogenetic inference. However, the sensitivity of AF methods to genome-scale evolutionary scenarios is little known. Here, using simulated microbial genome data we systematically assess the sensitivity of nine AF methods to three important evolutionary scenarios: sequence divergence, lateral genetic transfer (LGT) and genome rearrangement. Among these, AF methods are most sensitive to the extent of sequence divergence, less sensitive to low and moderate frequencies of LGT, and most robust against genome rearrangement. We describe the application of AF methods to three well-studied empirical genome datasets, and introduce a new application of the jackknife to assess node support. Our results demonstrate that AF phylogenomics is computationally scalable to multi-genome data and can generate biologically meaningful phylogenies and insights into microbial evolution.
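Alignment-free comparison generally replaces a multiple sequence alignment with some summary of each sequence, often its k-mer content. The abstract does not name the nine AF methods that were assessed, so the sketch below is a generic illustration rather than one of them: it uses the Jaccard distance between k-mer sets, and the function names and the default k = 4 are assumptions chosen for this example.

```python
def kmer_set(seq, k):
    """Return the set of all overlapping k-mers in a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard_distance(seq1, seq2, k=4):
    """Alignment-free distance: 1 - |shared k-mers| / |all k-mers|."""
    a, b = kmer_set(seq1, k), kmer_set(seq2, k)
    union = a | b
    if not union:
        raise ValueError("sequences are shorter than k")
    return 1.0 - len(a & b) / len(union)


def distance_matrix(seqs, k=4):
    """Pairwise distances for a list of sequences, as a nested list."""
    n = len(seqs)
    return [[jaccard_distance(seqs[i], seqs[j], k) for j in range(n)]
            for i in range(n)]


if __name__ == "__main__":
    # Hypothetical toy sequences; real use would take whole genomes.
    genomes = ["ACGTACGTACGT", "ACGTACGTTCGT", "TTTTGGGGCCCC"]
    for row in distance_matrix(genomes):
        print(["%.2f" % d for d in row])
```

A matrix like this can be passed to a distance-based tree builder such as neighbor-joining. Because k-mer sets ignore position, rearranging blocks of sequence changes few k-mers, whereas point substitutions each disrupt up to k of them, which is in line with the sensitivity ordering reported above.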
Limborg, Morten T.; Larson, Wesley; Shedd, Kyle; Seeb, Lisa W.; Seeb, James E.
2017-01-01
Preservation of heritable ecological diversity within species and populations is a key challenge for managing natural resources and wild populations. Salmonid fish are iconic and socio-economically important species for commercial, aquaculture, and recreational fisheries across the globe. Many salmonids are known to exhibit ecological divergence within species, including distinct feeding ecotypes within the same lakes. Here we used 5559 SNPs, derived from RAD sequencing, to perform population genetic comparisons between two dietary ecotypes of sockeye salmon (Oncorhynchus nerka) in Jo-Jo Lake, Alaska (USA). We tested the standing hypothesis that these two ecotypes are currently diverging as a result of adaptation to distinct dietary niches; results support earlier conclusions of a single panmictic population. The RAD sequence data revealed 40 new SNPs not previously detected in the species, and our sequence data can be used in future studies of ecotypic diversity in salmonid species.
Amores, Angel; Catchen, Julian; Ferrara, Allyse; Fontenot, Quenton; Postlethwait, John H.
2011-01-01
Genomic resources for hundreds of species of evolutionary, agricultural, economic, and medical importance are unavailable due to the expense of well-assembled genome sequences and difficulties with multigenerational studies. Teleost fish provide many models for human disease but possess anciently duplicated genomes that sometimes obfuscate connectivity. Genomic information representing a fish lineage that diverged before the teleost genome duplication (TGD) would provide an outgroup for exploring the mechanisms of evolution after whole-genome duplication. We exploited massively parallel DNA sequencing to develop meiotic maps with thrift and speed by genotyping F1 offspring of a single female and a single male spotted gar (Lepisosteus oculatus) collected directly from nature utilizing only polymorphisms existing in these two wild individuals. Using Stacks, software that automates the calling of genotypes from polymorphisms assayed by Illumina sequencing, we constructed a map containing 8406 markers. RNA-seq on two map-cross larvae provided a reference transcriptome that identified nearly 1000 mapped protein-coding markers and allowed genome-wide analysis of conserved synteny. Results showed that the gar lineage diverged from teleosts before the TGD and its genome is organized more similarly to that of humans than teleosts. Thus, spotted gar provides a critical link between medical models in teleost fish, to which gar is biologically similar, and humans, to which gar is genomically similar. Application of our F1 dense mapping strategy to species with no prior genome information promises to facilitate comparative genomics and provide a scaffold for ordering the numerous contigs arising from next generation genome sequencing. PMID:21828280
Molecular phylogenetic analysis of non-sexually transmitted strains of Haemophilus ducreyi.
Gaston, Jordan R; Roberts, Sally A; Humphreys, Tricia L
2015-01-01
Haemophilus ducreyi, the etiologic agent of chancroid, has been previously reported to show genetic variance in several key virulence factors, placing strains of the bacterium into two genetically distinct classes. Recent studies done in yaws-endemic areas of the South Pacific have shown that H. ducreyi is also a major cause of cutaneous limb ulcers (CLU) that are not sexually transmitted. To genetically assess CLU strains relative to the previously described class I, class II phylogenetic hierarchy, we examined nucleotide sequence diversity at 11 H. ducreyi loci, including virulence and housekeeping genes, which encompass approximately 1% of the H. ducreyi genome. Sequences for all 11 loci indicated that strains collected from leg ulcers exhibit DNA sequences homologous to class I strains of H. ducreyi. However, sequences for 3 loci, including a hemoglobin receptor (hgbA), serum resistance protein (dsrA), and a collagen adhesin (ncaA) contained informative amounts of variation. Phylogenetic analyses suggest that these non-sexually transmitted strains of H. ducreyi comprise a sub-clonal population within class I strains of H. ducreyi. Molecular dating suggests that CLU strains are the most recently developed, having diverged approximately 0.355 million years ago, fourteen times more recently than the class I/class II divergence. The CLU strains' divergence falls after the divergence of humans from chimpanzees, making it the first known H. ducreyi divergence event directly influenced by the selective pressures accompanying human hosts.
NASA Astrophysics Data System (ADS)
Hamid, Nur Athirah Abd; Ismail, Ismanizan
2013-11-01
Polygonum minus, known locally as Kesum, is an aromatic herb with a high secondary metabolite content. Alcohol dehydrogenase (ADH) is an important enzyme that catalyzes the reversible interconversion of alcohols and aldehydes with NAD(P)(H) as a cofactor. The main focus of this research was to identify the ADH gene. Total RNA was extracted from leaves of P. minus that had been treated with 150 μM jasmonic acid. The full-length cDNA sequence of ADH was isolated via rapid amplification of cDNA ends (RACE). Subsequently, in silico analysis was conducted on the full-length cDNA sequence, and PCR was performed on genomic DNA to determine the exon and intron organization. Two ADH sequences, designated PmADH1 and PmADH2, were successfully isolated. Both sequences have an ORF of 801 bp encoding 266 amino acid residues. Nucleotide sequence comparison of PmADH1 and PmADH2 indicated that the sequences are highly similar in the ORF region but divergent in the 3' untranslated regions (UTRs). The encoded amino acid sequences differ at residue 107: PmADH1 contains Gly (G), whereas PmADH2 contains Cys (C). The intron-exon organization of both sequences is also the same, with 3 introns and 4 exons. Based on in silico analysis, both sequences contain the "classical" short-chain alcohol dehydrogenases/reductases ((c) SDRs) conserved domain. The results suggest that both sequences are members of the short-chain alcohol dehydrogenase family.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Denef, Vincent; Shah, Manesh B; Verberkmoes, Nathan C
The recent surge in microbial genomic sequencing, combined with the development of high-throughput liquid chromatography-mass-spectrometry-based (LC/LC-MS/MS) proteomics, has raised the question of the extent to which genomic information of one strain or environmental sample can be used to profile proteomes of related strains or samples. Even with decreasing sequencing costs, it remains impractical to obtain genomic sequence for every strain or sample analyzed. Here, we evaluate how shotgun proteomics is affected by amino acid divergence between the sample and the genomic database using a probability-based model and a random mutation simulation model constrained by experimental data. To assess the effects of nonrandom distribution of mutations, we also evaluated identification levels using in silico peptide data from sequenced isolates with average amino acid identities (AAI) varying between 76 and 98%. We compared the predictions to experimental protein identification levels for a sample that was evaluated using a database that included genomic information for the dominant organism and for a closely related variant (95% AAI). The range of models set the boundaries at which half of the proteins in a proteomic experiment can be identified to be 77-92% AAI between orthologs in the sample and database. Consistent with this prediction, experimental data indicated loss of half the identifiable proteins at 90% AAI. Additional analysis indicated a 6.4% reduction of the initial protein coverage per 1% amino acid divergence and total identification loss at 86% AAI. Consequently, shotgun proteomics is capable of cross-strain identifications but avoids most cross-species false positives.
Comparative analysis of gene regulatory networks: from network reconstruction to evolution.
Thompson, Dawn; Regev, Aviv; Roy, Sushmita
2015-01-01
Regulation of gene expression is central to many biological processes. Although reconstruction of regulatory circuits from genomic data alone is therefore desirable, this remains a major computational challenge. Comparative approaches that examine the conservation and divergence of circuits and their components across strains and species can help reconstruct circuits as well as provide insights into the evolution of gene regulatory processes and their adaptive contribution. In recent years, advances in genomic and computational tools have led to a wealth of methods for such analysis at the sequence, expression, pathway, module, and entire network level. Here, we review computational methods developed to study transcriptional regulatory networks using comparative genomics, from sequence to functional data. We highlight how these methods use evolutionary conservation and divergence to reliably detect regulatory components as well as estimate the extent and rate of divergence. Finally, we discuss the promise and open challenges in linking regulatory divergence to phenotypic divergence and adaptation.
Traini, Alessandra; Iorizzo, Massimo; Mann, Harpartap; Bradeen, James M; Carputo, Domenico; Frusciante, Luigi; Chiusano, Maria Luisa
2013-01-01
Tuber-bearing potato species possess several genes that can be exploited to improve the genetic background of the cultivated potato Solanum tuberosum. Among them, S. bulbocastanum and S. commersonii are well known for their strong resistance to environmental stresses. However, scant information is available for these species in terms of genome organization, gene function, and regulatory networks. Consequently, genomic tools to assist breeding are meager, and efficient exploitation of these species has been limited so far. In this paper, we employed the reference genome sequences from cultivated potato and tomato and a collection of sequences of 1,423 potato Diversity Arrays Technology (DArT) markers that show polymorphic representation across the genomes of S. bulbocastanum and/or S. commersonii genotypes. Our results highlighted microscale genome sequence heterogeneity that may play a significant role in functional and structural divergence between related species. Our analytical approach provides knowledge of genome structural and sequence variability that could not be detected by transcriptome and proteome approaches.
Patarca, R; Dorta, B; Ramirez, J L
1982-01-01
As part of a project pertaining to the organization of ribosomal genes in Kinetoplastidae, we have created a database of published sequences of ribosomal nucleic acids, with information in Spanish. As a first step in their processing, we have written a computer program which introduces the new feature of determining the length of the fragments produced after single or multiple digestion with any of the known restriction enzymes. With this information we have detected conserved Sau3A sites: (i) at the 5' end of the 5.8S rRNA and at the 3' end of the small subunit rRNA, both included in similar larger sequences; (ii) in the 5.8S rRNA of vertebrates (a second one), which is not present in lower eukaryotes, showing a clear evolutionary divergence; and (iii) at the 5' terminus of the small subunit rRNA, included in a larger conserved sequence. The possible biological importance of these sequences is discussed. PMID:6278402
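The abstract above does not describe how its program works, so the following is only a minimal sketch of the general idea of computing fragment lengths after single or multiple digestion. The function name, the site table, and the toy sequence are all hypothetical; the one external fact relied on is that Sau3A (Sau3AI) recognises GATC and cuts immediately before the G.

```python
def fragment_lengths(sequence, sites):
    """Return fragment lengths after cutting a linear sequence.

    `sites` maps a recognition sequence to the offset of the cut
    within it (0 means the cut falls just before the first base).
    Passing several entries models a multiple digestion.
    """
    sequence = sequence.upper()
    cuts = set()
    for motif, offset in sites.items():
        start = sequence.find(motif)
        while start != -1:
            cuts.add(start + offset)
            start = sequence.find(motif, start + 1)
    # Ignore cuts at the very ends, which would give empty fragments.
    inner = sorted(c for c in cuts if 0 < c < len(sequence))
    positions = [0] + inner + [len(sequence)]
    return [b - a for a, b in zip(positions, positions[1:])]


# Sau3AI: recognition sequence GATC, cut before the G (offset 0).
SAU3AI = {"GATC": 0}

# Hypothetical toy sequence, not a real rRNA.
toy_sequence = "AAGATCTTTTGATCCC"
print(fragment_lengths(toy_sequence, SAU3AI))
```

Comparing the fragment-length lists obtained for homologous rRNA sequences from different organisms is one simple way conserved sites could be spotted, though whether the original program worked this way is an assumption.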
Caron, V; Norgate, M; Ede, F J; Nyman, T; Sunnucks, P
2013-02-01
Invasive organisms can have major impacts on the environment. Some invasive organisms are parthenogenetic in their invasive range and, therefore, exist as a number of asexual lineages (=clones). Determining the reproductive mode of invasive species has important implications for understanding the evolutionary genetics of such species, more especially, for management-relevant traits. The willow sawfly Nematus oligospilus Förster (Hymenoptera: Tenthredinidae) has been introduced unintentionally into several countries in the Southern Hemisphere where it has subsequently become invasive. To assess the population expansion, reproductive mode and host-plant relationships of this insect, microsatellite markers were developed and applied to natural populations sampled from the native and expanded range, along with sequencing of the cytochrome-oxidase I mitochondrial DNA (mtDNA) region. Other tenthredinids across a spectrum of taxonomic similarity to N. oligospilus and having a range of life strategies were also tested. Strict parthenogenesis was apparent within invasive N. oligospilus populations throughout the Southern Hemisphere, which comprised only a small number of genotypes. Sequences of mtDNA were identical for all individuals tested in the invasive range. The microsatellite markers were used successfully in several sawfly species, especially Nematus spp. and other genera of the Nematini tribe, with the degree of success inversely related to genetic divergence as estimated from COI sequences. The confirmation of parthenogenetic reproduction in N. oligospilus and the fact that it has a very limited pool of genotypes have important implications for understanding and managing this species and its biology, including in terms of phenotypic diversity, host relationships, implications for spread and future adaptive change. 
It would appear to be an excellent model study system for understanding evolution of invasive parthenogens that diverge without sexual reproduction and genetic recombination.
Corominas, Jordi; Ramayo-Caldas, Yuliaxis; Puig-Oliveras, Anna; Estellé, Jordi; Castelló, Anna; Alves, Estefania; Pena, Ramona N; Ballester, Maria; Folch, Josep M
2013-12-01
In pigs, adipose tissue is one of the principal organs involved in the regulation of lipid metabolism. It is particularly involved in the overall fatty acid synthesis with consequences in other lipid-target organs such as muscles and the liver. With this in mind, we have used massive, parallel high-throughput sequencing technologies to characterize the porcine adipose tissue transcriptome architecture in six Iberian x Landrace crossbred pigs showing extreme phenotypes for intramuscular fatty acid composition (three per group). High-throughput RNA sequencing was used to generate a whole characterization of adipose tissue (backfat) transcriptome. A total of 4,130 putative unannotated protein-coding sequences were identified in the 20% of reads which mapped in intergenic regions. Furthermore, 36% of the unmapped reads were represented by interspersed repeats, SINEs being the most abundant elements. Differential expression analyses identified 396 candidate genes among divergent animals for intramuscular fatty acid composition. Sixty-two percent of these genes (247/396) presented higher expression in the group of pigs with higher content of intramuscular SFA and MUFA, while the remaining 149 showed higher expression in the group with higher content of PUFA. Pathway analysis related these genes to biological functions and canonical pathways controlling lipid and fatty acid metabolisms. In concordance with the phenotypic classification of animals, the major metabolic pathway differentially modulated between groups was de novo lipogenesis, the group with more PUFA being the one that showed lower expression of lipogenic genes. These results will help in the identification of genetic variants at loci that affect fatty acid composition traits. The implications of these results range from the improvement of porcine meat quality traits to the application of the pig as an animal model of human metabolic diseases.
Divergence of Gene Body DNA Methylation and Evolution of Plant Duplicate Genes
Wang, Jun; Marowsky, Nicholas C.; Fan, Chuanzhu
2014-01-01
It has been shown that gene body DNA methylation is associated with gene expression. However, whether and how deviation of gene body DNA methylation between duplicate genes can influence their divergence remains largely unexplored. Here, we aim to elucidate the potential role of gene body DNA methylation in the fate of duplicate genes. We identified paralogous gene pairs from Arabidopsis and rice (Oryza sativa ssp. japonica) genomes and reprocessed their single-base resolution methylome data. We show that methylation in paralogous genes nonlinearly correlates with several gene properties including exon number/gene length, expression level and mutation rate. Further, we demonstrated that divergence of methylation level and pattern in paralogs indeed positively correlate with their sequence and expression divergences. This result held even after controlling for other confounding factors known to influence the divergence of paralogs. We observed that methylation level divergence might be more relevant to the expression divergence of paralogs than methylation pattern divergence. Finally, we explored the mechanisms that might give rise to the divergence of gene body methylation in paralogs. We found that exonic methylation divergence more closely correlates with expression divergence than intronic methylation divergence. We show that genomic environments (e.g., flanked by transposable elements and repetitive sequences) of paralogs generated by various duplication mechanisms are associated with the methylation divergence of paralogs. Overall, our results suggest that the changes in gene body DNA methylation could provide another avenue for duplicate genes to develop differential expression patterns and undergo different evolutionary fates in plant genomes. PMID:25310342
Tormey, Duncan; Colbourne, John K; Mockaitis, Keithanne; Choi, Jeong-Hyeon; Lopez, Jacqueline; Burkhart, Joshua; Bradshaw, William; Holzapfel, Christina
2015-10-06
Internal circadian (circa, about; dies, day) clocks enable organisms to maintain adaptive timing of their daily behavioral activities and physiological functions. Eukaryotic clocks consist of core transcription-translation feedback loops that generate a cycle and post-translational modifiers that maintain that cycle at about 24 h. We use the pitcher-plant mosquito, Wyeomyia smithii (subfamily Culicini, tribe Sabethini), to test whether evolutionary divergence of the circadian clock genes in this species, relative to other insects, has involved primarily genes in the core feedback loops or the post-translational modifiers. To date, there has been no reference transcriptome or genome sequence for any mosquito in the tribe Sabethini, which includes over 375 mainly circumtropical species. We sequenced, assembled and annotated the transcriptome of W. smithii containing nearly 95% of conserved single-copy orthologs in animal genomes. We used the translated contigs and singletons to determine the average rates of circadian clock-gene divergence in W. smithii relative to three other mosquito genera, to Drosophila, to the butterfly, Danaus, and to the wasp, Nasonia. Over 1.08 million cDNA sequence reads were obtained consisting of 432.5 million nucleotides. Their assembly produced 25,904 contigs and 54,418 singletons of which 62% and 28% are annotated as protein-coding genes, respectively, sharing homology with other animal proteomes. The W. smithii transcriptome includes all nine circadian transcription-translation feedback-loop genes and all eight post-translational modifier genes we sought to identify (Fig. 1). After aligning translated W. smithii contigs and singletons from this transcriptome with other insects, we determined that there was no significant difference in the average divergence of W. smithii from the six other taxa between the core feedback-loop genes and post-translational modifiers. 
The characterized transcriptome is sufficiently complete and of sufficient quality to have uncovered all of the insect circadian clock genes we sought to identify (Fig. 1). Relative divergence does not differ between core feedback-loop genes and post-translational modifiers of those genes in a Sabethine species (W. smithii) that has experienced a continual northward dispersal into temperate regions of progressively longer summer day lengths as compared with six other insect taxa. An associated microarray platform derived from this work will enable the investigation of functional genomics of circadian rhythmicity, photoperiodic time measurement, and diapause along a photic and seasonal geographic gradient.
Chi, Hongshu; Taik, Patricia; Foley, Emily J; Racicot, Alycia C; Gray, Hilary M; Guzzetta, Katherine E; Lin, Hsin-Yun; Song, Yen-Ling; Tung, Che-Huang; Zenke, Kosuke; Yoshinaga, Tomoyoshi; Cheng, Chao-Yin; Chang, Wei-Jen; Gong, Hui
2017-07-01
The ciliate protozoan Cryptocaryon irritans parasitizes marine fish and causes lethal white spot disease. Sporadic infections as well as large-scale outbreaks have been reported globally, and the parasite's broad host range poses a particular threat to the aquaculture and ornamental fish markets. In order to better understand C. irritans' population structure, we sequenced and compared mitochondrial cox-1, SSU rRNA, and ITS-1 sequences from 8 new isolates of C. irritans collected in China, Japan, and Taiwan. We detected two SSU rRNA haplotypes, which differ at three positions, separating the isolates into two main groups (I and II). Cox-1 sequences also support the division into two groups, and the cox-1 divergence between these two groups is unexpectedly high (9.28% for 1582 nucleotide positions). The divergence is much greater than that detected in Ichthyophthirius multifiliis, the ciliate protozoan causing freshwater white spot disease in fish, where intraspecies divergence in the cox-1 sequence is only 1.95%. ITS-1 sequences derived from these eight isolates and from all other C. irritans isolates (deposited in GenBank) not only support the two groups, but further suggest the presence of a third group with even greater sequence divergence. Finally, a small Ka/Ks ratio estimated from cox-1 sequences suggests that this gene in C. irritans remains under strong purifying selection. Taken together, the C. irritans species may consist of many subspecies and/or syngens. Further work is needed to determine if there is reproductive isolation between the groups we have defined.
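To make the reported cox-1 figure concrete, here is a minimal sketch of an uncorrected pairwise distance (the proportion of differing sites in an alignment). The abstract does not state which distance measure the authors used, so treating 9.28% as an uncorrected proportion is an assumption, and the function name and gap handling are illustrative choices.

```python
def p_distance(seq_a, seq_b, skip_chars="-N"):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap or an ambiguous base
    (any character in `skip_chars`) are left out of the comparison.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in skip_chars or b in skip_chars:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Back-of-envelope reading of the abstract's numbers: 9.28% of
# 1582 positions is roughly 147 differing sites, if the figure is
# an uncorrected proportion (an assumption).
print(round(0.0928 * 1582))
```

Under the same assumption, the 1.95% intraspecies figure quoted for Ichthyophthirius multifiliis would be a little under a fifth of the C. irritans between-group value.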
Schönberg, Anna; Theunert, Christoph; Li, Mingkun; Stoneking, Mark; Nasidze, Ivan
2011-09-01
To investigate the demographic history of human populations from the Caucasus and surrounding regions, we used high-throughput sequencing to generate 147 complete mtDNA genome sequences from random samples of individuals from three groups from the Caucasus (Armenians, Azeri and Georgians), and one group each from Iran and Turkey. Overall diversity is very high, with 144 different sequences that fall into 97 different haplogroups found among the 147 individuals. Bayesian skyline plots (BSPs) of population size change through time show a population expansion around 40-50 kya, followed by a constant population size, and then another expansion around 15-18 kya for the groups from the Caucasus and Iran. The BSP for Turkey differs the most from the others, with an increase from 35 to 50 kya followed by a prolonged period of constant population size, and no indication of a second period of growth. An approximate Bayesian computation approach was used to estimate divergence times between each pair of populations; the oldest divergence times were between Turkey and the other four groups from the South Caucasus and Iran (~400-600 generations), while the divergence time of the three Caucasus groups from each other was comparable to their divergence time from Iran (average of ~360 generations). These results illustrate the value of random sampling of complete mtDNA genome sequences that can be obtained with high-throughput sequencing platforms.
Franco, Bernardo; Hernández, Roberto; López-Villaseñor, Imelda
2012-09-01
Trichomonas vaginalis is a parasitic protozoan of both medical and biological relevance. Transcriptional studies in this organism have focused mainly on type II pol promoters, whereas the elements necessary for transcription by polI or polIII have not been investigated. Here, with the aid of a transient transcription system, we characterised the rDNA intergenic region, defining both the promoter and the terminator sequences required for transcription. We defined the promoter as a compact region of approximately 180 bp. We also identified a potential upstream control element (UCE) that was located 80 bp upstream of the transcription start point (TSP). A transcription termination element was identified within a 34 bp region that was located immediately downstream of the 28S coding sequence. The function of this element depends upon polarity and the presence of both a stretch of uridine residues (U's) and a hairpin structure in the transcript. Our observations provide a strong basis for the study of DNA recognition by the polI transcriptional machinery in this early divergent organism.
Arrach, Nabil; Fernández-Martín, Rafael; Cerdá-Olmedo, Enrique; Avalos, Javier
2001-01-01
Previous complementation and mapping of mutations that change the usual yellow color of the Zygomycete Phycomyces blakesleeanus to white or red led to the definition of two structural genes for carotene biosynthesis. We have cloned one of these genes, carRA, by taking advantage of its close linkage to the other, carB, responsible for phytoene dehydrogenase. The sequences of the wild type and six mutants have been established, compared with sequences in other organisms, and correlated with the mutant phenotypes. The carRA and carB coding sequences are separated by 1,381 untranslated nucleotides and are divergently transcribed. Gene carRA contains separate domains for two enzymes, lycopene cyclase and phytoene synthase, and regulates the overall activity of the pathway and its response to physical and chemical stimuli from the environment. The lycopene cyclase domain of carRA derived from a duplication of a gene from a common ancestor of fungi and Brevibacterium linens; the phytoene synthase domain is similar to the phytoene and squalene synthases of many organisms; but the regulatory functions appear to be specific to Phycomyces. PMID:11172012
Gaitán-Espitia, Juan Diego; Nespolo, Roberto F.; Opazo, Juan C.
2013-01-01
The complete sequences of three mitochondrial genomes from the land snail Cornu aspersum were determined. The mitogenome has a length of 14050 bp, and it encodes 13 protein-coding genes, 22 transfer RNA genes and two ribosomal RNA genes. It also includes nine small intergenic spacers and a large AT-rich intergenic spacer. The intra-specific divergence analysis revealed that COX1 has the lowest genetic differentiation, while the most divergent genes were NADH1, NADH3 and NADH4. With the exception of Euhadra herklotsi, the structural comparisons showed the same gene order within the family Helicidae, and nearly identical gene organization to that found in the order Pulmonata. Phylogenetic reconstruction recovered Basommatophora as a polyphyletic group, and Eupulmonata and Pulmonata as paraphyletic groups. Bayesian and Maximum Likelihood analyses showed that C. aspersum is a close relative of Cepaea nemoralis and, together with the other Helicidae species, forms a sister group of Albinaria caerulea, supporting the monophyly of the Stylommatophora clade. PMID:23826260
Feng, Bang; Liu, Jian Wei; Xu, Jianping; Zhao, Kuan; Ge, Zai Wei; Yang, Zhu L
2017-04-01
The Alpine porcini, Boletus reticuloceps, is an ectomycorrhizal mushroom distributed in subalpine areas of Southwest China, central China, and Taiwan Island. This distribution pattern makes it an ideal organism to infer how ectomycorrhizal fungi have reacted to historical tectonic and climatic changes, and to illustrate the mechanism for the disjunction of organisms between Southwest China and Taiwan. In this study, we explored the phylogeographic pattern of B. reticuloceps by microsatellite genotyping, DNA sequencing, ecological factor analysis, and species distribution modeling. Three genetic groups from the East Himalayas (EH), northern Hengduan Mountains (NHM), and southern Hengduan Mountains (SHM), were identified. The earlier divergent SHM group is found under Abies in moister environments, whereas the EH and NHM groups, which are physically separated by the Mekong-Salween Divide, are found mainly under Picea in drier environments. Samples from Taiwan showed a close relationship with the SHM group. High mountains did not form dispersal barriers among populations in each of the EH, NHM, and SHM groups, probably due to the relatively weak host specificity of B. reticuloceps. Our study indicated that ecological heterogeneity could have contributed to the divergence between the SHM and the NHM-EH groups, while physical barriers could have led to the divergence of the NHM and the EH groups. Dispersal into Taiwan via Central China during the Quaternary glaciations is likely to have shaped its disjunct distribution.
Godinho, R; Mendonça, B; Crespo, E G; Ferrand, N
2006-06-01
The study of nuclear genealogies in natural populations of nonmodel organisms is expected to provide novel insights into the evolutionary history of populations, especially when developed in the framework of well-established mtDNA phylogeographical scenarios. In the Iberian Peninsula, the endemic Schreiber's green lizard Lacerta schreiberi exhibits two highly divergent and allopatric mtDNA lineages that started to split during the late Pliocene. In this work, we performed a fine-scale analysis of the putative mtDNA contact zone together with a global analysis of the patterns of variation observed at the nuclear beta-fibrinogen intron 7 (beta-fibint7). Using a combination of DNA sequencing with single-strand conformational polymorphism (SSCP) analysis, we show that the observed genealogy at the beta-fibint7 locus reveals extensive admixture between two formerly isolated lizard populations while the two mtDNA lineages remain essentially allopatric. In addition, a private beta-fibint7 haplotype detected in the single population where both mtDNA lineages were found in sympatry is probably the result of intragenic recombination between the two more common and divergent beta-fibint7 haplotypes. Our results suggest that the progressive incorporation of nuclear genealogies in investigating the ancient demography and admixture dynamics of divergent genomes will be necessary to obtain a more comprehensive picture of the evolutionary history of organisms.
Goicoechea, P G; Herrán, A; Durand, J; Bodénès, C; Plomion, C; Kremer, A
2015-01-01
We analyzed the genetic mosaic of speciation in two hybridizing Mediterranean white oaks from the Iberian Peninsula (Quercus faginea Lamb. and Quercus pyrenaica Willd.). The two species show ecological divergence in flowering phenology, leaf morphology and composition, and in their basic or acidic soil preferences. Ninety expressed sequence tag-simple sequence repeats (EST-SSRs) and eight nuclear SSRs were genotyped in 96 trees from each species. Genotyping was designed in two steps. First, we used 69 markers evenly distributed over the 12 linkage groups (LGs) of the oak linkage map to confirm the species genetic identity of the sampled genotypes, and searched for differentiation outliers. Then, we genotyped 29 additional markers from the chromosome bins containing the outliers and repeated the multilocus scans. We found one or two additional outliers within four saturated bins, thus confirming that outliers are organized into clusters. Linkage disequilibrium (LD) was extensive, even for loosely linked and for independent markers. Consequently, score tests for association between two-marker haplotypes and the 'species trait' showed a broad genomic divergence, although substantial variation across the genome and within LGs was also observed. We discuss the influence of several confounding effects on neutrality tests and review the evolutionary processes leading to extensive LD. Finally, we examine how LD analyses within regions that contain outlier clusters and quantitative trait loci can help to identify regions of divergence and/or genomic hitchhiking in the light of predictions from ecological speciation theory. PMID:25515016
Vela, Ana I; Casas-Díaz, Encarna; Lavín, Santiago; Domínguez, Lucas; Fernández-Garayzábal, Jose F
2015-09-01
Four isolates of an unknown Gram-stain-positive, catalase-negative coccus-shaped organism, isolated from the pharynx of four wild rabbits, were characterized by phenotypic and molecular genetic methods. The micro-organisms were tentatively assigned to the genus Streptococcus based on cellular morphological and biochemical criteria, although the organisms did not appear to correspond to any species with a validly published name. Comparative 16S rRNA gene sequencing confirmed their identification as members of the genus Streptococcus, being most closely related phylogenetically to Streptococcus porcorum 682-03(T) (96.9% 16S rRNA gene sequence similarity). Analysis of rpoB and sodA gene sequences showed divergence values between the novel species and S. porcorum 682-03(T) (the closest phylogenetic relative determined from 16S rRNA gene sequences) of 18.1 and 23.9%, respectively. The novel bacterial isolate could be distinguished from the type strain of S. porcorum by several biochemical characteristics, such as the production of glycyl-tryptophan arylamidase and α-chymotrypsin, and the non-acidification of different sugars. Based on both phenotypic and phylogenetic findings, it is proposed that the unknown bacterium be assigned to a novel species of the genus Streptococcus, and named Streptococcus pharyngis sp. nov. The type strain is DICM10-00796B(T) ( = CECT 8754(T) = CCUG 66496(T)).
Howard, Thomas P.; Hayward, Andrew P.; Tordillos, Anthony; Fragoso, Christopher; Moreno, Maria A.; Tohme, Joe; Kausch, Albert P.; Mottinger, John P.; Dellaporta, Stephen L.
2014-01-01
Since their initial discovery, transposons have been widely used as mutagens for forward and reverse genetic screens in a range of organisms. The problems of high copy number and sequence divergence among related transposons have often limited the efficiency with which tagged genes can be identified. A method was developed to identify the locations of Mutator (Mu) transposons in the Zea mays genome using a simple enrichment method combined with genome resequencing to identify transposon junction fragments. The sequencing library was prepared from genomic DNA by digesting with a restriction enzyme that cuts within a perfectly conserved motif of the Mu terminal inverted repeats (TIR). Paired-end reads containing Mu TIR sequences were computationally identified and chromosomal sequences flanking the transposon were mapped to the maize reference genome. This method has been used to identify Mu insertions in a number of alleles and to isolate the previously unidentified lazy plant1 (la1) gene. The la1 gene is required for the negatively gravitropic response of shoots, and mutant plants lack the ability to sense gravity. Using bioinformatic and fluorescence microscopy approaches, we show that the la1 gene encodes a cell membrane and nuclear localized protein. Our Mu-Taq method is readily adaptable to identify the genomic locations of any insertion of a known sequence in any organism using any sequencing platform. PMID:24498020
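The computational step described above (finding reads that contain a known insertion end and keeping the chromosomal sequence beyond it) can be sketched roughly as follows. This is not the authors' pipeline: the function name, the exact-match search, the minimum flank length, and the placeholder motif are all hypothetical, and mapping the flanks to a reference genome would be done by a separate aligner that is not shown.

```python
def flanking_sequences(reads, insertion_end, min_flank=20):
    """Yield the sequence that follows a known insertion end in each read.

    `insertion_end` is the outermost stretch of the inserted element,
    oriented so that chromosomal DNA comes after it in the read.
    Reads without the motif, or with too short a flank, are skipped.
    """
    insertion_end = insertion_end.upper()
    for read in reads:
        read = read.upper()
        idx = read.find(insertion_end)
        if idx == -1:
            continue
        flank = read[idx + len(insertion_end):]
        if len(flank) >= min_flank:
            yield flank


# Placeholder motif and toy reads for illustration only; this is
# NOT the real Mu TIR sequence.
PLACEHOLDER_END = "ACGTACGTAC"
toy_reads = ["TTTTACGTACGTACGGGGGCCCCC", "GGGGGGGG"]
print(list(flanking_sequences(toy_reads, PLACEHOLDER_END, min_flank=5)))
```

A real implementation would likely tolerate mismatches and search both strands; the exact-match version is kept here only because it is short.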
Skoglund, Pontus; Götherström, Anders; Jakobsson, Mattias
2011-04-01
Despite recent technological advances in DNA sequencing, incomplete coverage remains an issue in population genomics, in particular for studies that include ancient samples. Here, we describe an approach to estimate population divergence times for non-overlapping sequence data that is based on probabilities of different genealogical topologies under a structured coalescent model. We show that the approach can be adapted to accommodate common problems such as sequencing errors and postmortem nucleotide misincorporations, and we use simulations to investigate biases involved with estimating genealogical topologies from empirical data. The approach relies on three reference genomes and should be particularly useful for future analysis of genomic data that comprise nonoverlapping sets of sequences, potentially from different points in time. We applied the method to shotgun sequence data from an ancient wolf together with extant dogs and wolves and found striking resemblance to previously described fine-scale population structure among dog breeds. When comparing modern dogs to four geographically distinct wolves, we find that the divergence time between dogs and an Indian wolf is smallest, followed by the divergence times to a Chinese wolf and a Spanish wolf, and a relatively long divergence time to an Alaskan wolf, suggesting that the origin of modern dogs is somewhere in Eurasia, potentially southern Asia. We find that less than two-thirds of all loci in the boxer and poodle genomes are more similar to each other than to a modern gray wolf and that--assuming complete isolation without gene flow--the divergence time between gray wolves and modern European dogs extends to 3,500 generations before the present, corresponding to approximately 10,000 years ago (95% confidence interval [CI]: 9,000-13,000).
We explicitly study the effect of gene flow between dogs and wolves on our estimates and show that a low rate of gene flow is compatible with an even earlier domestication date ∼30,000 years ago (95% CI: 15,000-90,000). This observation is in agreement with recent archaeological findings and indicates that human behavior necessary for domestication of wild animals could have appeared much earlier than the development of agriculture.
Chen, Chao; Wang, Huihua; Liu, Zhiguang; Chen, Xiao; Tang, Jiao; Meng, Fanming; Shi, Wei
2018-06-20
The mechanisms by which organisms adapt to variable environments are a fundamental question in evolutionary biology and are relevant to protecting important species in response to a changing climate. An interesting candidate to study this question is the honey bee Apis cerana, a keystone pollinator with a wide distribution throughout a large variety of climates, that exhibits rapid dispersal. Here, we re-sequenced the genome of 180 A. cerana individuals from eighteen populations throughout China. Using a population genomics approach, we observed considerable genetic variation in A. cerana. Patterns of genetic differentiation indicate high divergence at the subspecies level, and physical barriers rather than distance are the driving force for population divergence. Estimations of divergence time suggested that the main branches diverged between 300 and 500 ka. Analyses of the population history revealed a substantial influence of the Earth's climate on the effective population size of A. cerana, as increased population sizes were observed during warmer periods. Further analyses identified candidate genes under natural selection that are potentially related to honey bee cognition, temperature adaptation, and olfaction. Based on our results, A. cerana may have great potential in response to climate change. Our study provides fundamental knowledge of the evolution and adaptation of A. cerana.
Quantiprot - a Python package for quantitative analysis of protein sequences.
Konopka, Bogumił M; Marciniak, Marta; Dyrka, Witold
2017-07-17
The field of protein sequence analysis is dominated by tools rooted in substitution matrices and alignments. A complementary approach is provided by methods of quantitative characterization. A major advantage of the approach is that quantitative properties define a multidimensional solution space, where sequences can be related to each other and differences can be meaningfully interpreted. Quantiprot is a software package in Python, which provides a simple and consistent interface to multiple methods for quantitative characterization of protein sequences. The package can be used to calculate dozens of characteristics directly from sequences or using physico-chemical properties of amino acids. Besides basic measures, Quantiprot performs quantitative analysis of recurrence and determinism in the sequence, calculates distribution of n-grams and computes the Zipf's law coefficient. We propose three main fields of application of the Quantiprot package. First, quantitative characteristics can be used in alignment-free similarity searches, and in clustering of large and/or divergent sequence sets. Second, a feature space defined by quantitative properties can be used in comparative studies of protein families and organisms. Third, the feature space can be used for evaluating generative models, where large numbers of sequences generated by the model can be compared to actually observed sequences.
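To make the n-gram and Zipf's law measures mentioned above concrete, here is a small standard-library sketch. It deliberately does not use Quantiprot and makes no claim about that package's API or its exact definition of the coefficient; the function names are illustrative, and the coefficient is assumed here to be the least-squares slope of log(frequency) against log(rank).

```python
import math
from collections import Counter


def ngram_counts(seq, n=2):
    """Count overlapping n-grams in a sequence string."""
    return Counter(seq[i:i + n] for i in range(len(seq) - n + 1))


def zipf_slope(counts):
    """Least-squares slope of log(frequency) versus log(rank).

    A slope near -1 corresponds to the classic Zipf distribution.
    Requires at least two distinct n-grams.
    """
    freqs = sorted(counts.values(), reverse=True)
    if len(freqs) < 2:
        raise ValueError("need at least two distinct n-grams")
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Toy peptide string used only for illustration.
bigrams = ngram_counts("MKVLAAGLLALA", n=2)
slope = zipf_slope(bigrams)
```

Two sequences can then be compared through such scalar features without aligning them, which is the alignment-free use case the abstract describes.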
Hopple, J S; Vilgalys, R
1999-10-01
Phylogenetic relationships were investigated in the mushroom genus Coprinus based on sequence data from the nuclear encoded large-subunit rDNA gene. Forty-seven species of Coprinus and 19 additional species from the families Coprinaceae, Strophariaceae, Bolbitiaceae, Agaricaceae, Podaxaceae, and Montagneaceae were studied. A total of 1360 sites was sequenced across seven divergent domains and intervening sequences. A total of 302 phylogenetically informative characters was found. Ninety-eight percent of the average divergence between taxa was located within the divergent domains, with domains D2 and D8 being most divergent and domains D7 and D10 the least divergent. An empirical test of phylogenetic signal among divergent domains also showed that domains D2 and D3 had the lowest levels of homoplasy. Two equally most parsimonious trees were resolved using Wagner parsimony. A character-state weighted analysis produced 12 equally most parsimonious trees similar to those generated by Wagner parsimony. Phylogenetic analyses employing topological constraints suggest that none of the major taxonomic systems proposed for subgeneric classification is able to completely reflect phylogenetic relationships in Coprinus. A strict consensus integration of the two Wagner trees demonstrates the problematic nature of choosing outgroups within dark-spored mushrooms. The genus Coprinus is found to be polyphyletic and is separated into three distinct clades. Most Coprinus taxa belong to the first two clades, which together form a larger monophyletic group with Lacrymaria and Psathyrella in basal positions. A third clade contains members of Coprinus section Comati as well as the genus Leucocoprinus, Podaxis pistillaris, Montagnea arenaria, and Agaricus pocillator. This third clade is separated from the other species of Coprinus by members of the families Strophariaceae and Bolbitiaceae and the genus Panaeolus. Copyright 1999 Academic Press.
2010-01-01
Background Cryptic species complexes are common among anophelines. Previous phylogenetic analysis based on the complete mtDNA COI gene sequences detected paraphyly in the Neotropical malaria vector Anopheles marajoara. The "Folmer region" detects a single taxon using a 3% divergence threshold. Methods To test the paraphyletic hypothesis and examine the utility of the Folmer region, genealogical trees based on a concatenated (white + 3' COI sequences) dataset and pairwise differentiation of COI fragments were examined. The population structure and demographic history were based on partial COI sequences for 294 individuals from 14 localities in Amazonian Brazil. 109 individuals from 12 localities were sequenced for the nDNA white gene, and 57 individuals from 11 localities were sequenced for the ribosomal DNA (rDNA) internal transcribed spacer 2 (ITS2). Results Distinct A. marajoara lineages were detected by combined genealogical analysis and were also supported among COI haplotypes using a median joining network and AMOVA, with time since divergence during the Pleistocene (<100,000 ya). COI sequences at the 3' end were more variable, demonstrating significant pairwise differentiation (3.82%) compared to the more moderate 2.92% detected by the Folmer region. Lineage 1 was present in all localities, whereas lineage 2 was restricted mainly to the west. Mismatch distributions for both lineages were bimodal, likely due to multiple colonization events and spatial expansion (~798 - 81,045 ya). There appears to be gene flow within, not between lineages, and a partial barrier was detected near Rio Jari in Amapá state, separating western and eastern populations. In contrast, both nDNA data sets (white gene sequences with or without the retention of the 4th intron, and ITS2 sequences and length) detected a single A. marajoara lineage. 
Conclusions Strong support for combined data with significant differentiation detected in the COI and absent in the nDNA suggest that the divergence is recent, and detectable only by the faster evolving mtDNA. A within subgenus threshold of >2% may be more appropriate among sister taxa in cryptic anopheline complexes than the standard 3%. Differences in demographic history and climatic changes may have contributed to mtDNA lineage divergence in A. marajoara. PMID:20929572
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sakoyama, Y.; Hong, K.J.; Byun, S.M.
To determine the phylogenetic relationships among hominoids and the dates of their divergence, the complete nucleotide sequences of the constant region of the immunoglobulin eta-chain (C_eta1) genes from chimpanzee and orangutan have been determined. These sequences were compared with the human eta-chain constant-region sequence. A molecular clock (silent molecular clock), measured by the degree of sequence divergence at the synonymous (silent) positions of protein-encoding regions, was introduced for the present study. From the comparison of nucleotide sequences of α1-antitrypsin and β- and delta-globulin genes between humans and Old World monkeys, the silent molecular clock was calibrated: the mean evolutionary rate of silent substitution was determined to be 1.56 × 10^-9 substitutions per site per year. Using the silent molecular clock, the mean divergence dates of chimpanzee and orangutan from the human lineage were estimated as 6.4 ± 2.6 million years and 17.3 ± 4.5 million years, respectively. It was also shown that the evolutionary rate of primate genes is considerably slower than those of other mammalian genes.
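The arithmetic behind a calibrated clock of this kind can be sketched as follows. The abstract does not state its equation, so this assumes the standard pairwise relation t = K / (2r), where K is the number of substitutions per silent site separating two lineages and r is the per-lineage rate; the factor of 2 reflects that both lineages accumulate changes after the split. Function names are illustrative.

```python
SILENT_RATE = 1.56e-9  # substitutions per site per year, as quoted in the abstract


def divergence_time(k_per_site, rate=SILENT_RATE):
    """Years since divergence under a strict clock: t = K / (2 * r)."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return k_per_site / (2.0 * rate)


def implied_divergence(years, rate=SILENT_RATE):
    """Inverse relation: the pairwise divergence K implied by a date."""
    return 2.0 * rate * years


# Under these assumptions, a 6.4-million-year split corresponds to
# roughly 2% divergence at silent sites.
k_human_chimp = implied_divergence(6.4e6)
```

The reported uncertainties (for example ± 2.6 million years) come from the original analysis and are not reproduced by this point calculation.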
Bloom DNA Helicase Facilitates Homologous Recombination between Diverged Homologous Sequences
Kikuchi, Koji; Abdel-Aziz, H. Ismail; Taniguchi, Yoshihito; Yamazoe, Mitsuyoshi; Takeda, Shunichi; Hirota, Kouji
2009-01-01
Bloom syndrome, caused by inactivation of the Bloom DNA helicase (Blm), is characterized by increases in the level of sister chromatid exchange, homologous recombination (HR) associated with cross-over. It is therefore believed that Blm works as an anti-recombinase. Meanwhile, in Drosophila, DmBlm is required specifically to promote synthesis-dependent strand annealing (SDSA), a type of HR not associated with cross-over. However, conservation of Blm function in SDSA through higher eukaryotes has been a matter of debate. Here, we demonstrate the function of Blm in SDSA-type HR in the chicken DT40 B lymphocyte line, where Ig gene conversion diversifies the immunoglobulin V gene through intragenic HR between diverged homologous segments. This reaction is initiated by the activation-induced cytidine deaminase enzyme-mediated uracil formation at the V gene, which in turn is converted into an abasic site, presumably leading to a single strand gap. Ig gene conversion frequency was drastically reduced in BLM−/− cells. In addition, BLM−/− cells used limited donor segments harboring higher identity compared with other segments in Ig gene conversion events, suggesting that Blm can promote HR between diverged sequences. To further understand the role of Blm in HR between diverged homologous sequences, we measured the frequency of gene targeting induced by an I-SceI-endonuclease-mediated double-strand break. BLM−/− cells showed a more severe defect in the gene targeting frequency as the number of heterologous sequences increased at the double-strand break site. Conversely, the overexpression of Blm, even an ATPase-defective mutant, strongly stimulated gene targeting. In summary, Blm promotes HR between diverged sequences through a novel ATPase-independent mechanism. PMID:19661064
Diversity and Divergence of Dinoflagellate Histone Proteins
Marinov, Georgi K.; Lynch, Michael
2015-01-01
Histone proteins and the nucleosomal organization of chromatin are near-universal eukaryotic features, with the exception of dinoflagellates. Previous studies have suggested that histones do not play a major role in the packaging of dinoflagellate genomes, although several genomic and transcriptomic surveys have detected a full set of core histone genes. Here, transcriptomic and genomic sequence data from multiple dinoflagellate lineages are analyzed, and the diversity of histone proteins and their variants characterized, with particular focus on their potential post-translational modifications and the conservation of the histone code. In addition, the set of putative epigenetic mark readers and writers, chromatin remodelers and histone chaperones are examined. Dinoflagellates clearly express the most derived set of histones among all autonomous eukaryote nuclei, consistent with a combination of relaxation of sequence constraints imposed by the histone code and the presence of numerous specialized histone variants. The histone code itself appears to have diverged significantly in some of its components, yet others are conserved, implying conservation of the associated biochemical processes. Specifically, and with major implications for the function of histones in dinoflagellates, the results presented here strongly suggest that transcription through nucleosomal arrays happens in dinoflagellates. Finally, the plausible roles of histones in dinoflagellate nuclei are discussed. PMID:26646152
Weigand, Michael R; Sundin, George W
2012-08-21
The successful growth of hypermutator strains of bacteria contradicts a clear preference for lower mutation rates observed in the microbial world. Whether by general DNA repair deficiency or the inducible action of low-fidelity DNA polymerases, the evolutionary strategies of bacteria include methods of hypermutation. Although both raise mutation rate, general and inducible hypermutation operate through distinct molecular mechanisms and therefore likely impart unique adaptive consequences. Here we compare the influence of general and inducible hypermutation on adaptation in the model organism Pseudomonas aeruginosa PAO1 through experimental evolution. We observed divergent spectra of single base substitutions derived from general and inducible hypermutation by sequencing rpoB in spontaneous rifampicin-resistant (Rif(R)) mutants. Likewise, the pattern of mutation in a draft genome sequence of a derived inducible hypermutator isolate differed from those of general hypermutators reported in the literature. However, following experimental evolution, populations of both mutator types exhibited comparable improvements in fitness across varied conditions that differed from the highly specific adaptation of nonmutators. Our results suggest that despite their unique mutation spectra, general and inducible hypermutation can analogously influence the ecology and adaptation of bacteria, significantly shaping pathogenic populations where hypermutation has been most widely observed.
Detection of Plasmodium sp. in capybara.
dos Santos, Leonilda Correia; Curotto, Sandra Mara Rotter; de Moraes, Wanderlei; Cubas, Zalmir Silvino; Costa-Nascimento, Maria de Jesus; de Barros Filho, Ivan Roque; Biondo, Alexander Welker; Kirchgatter, Karin
2009-07-07
In the present study, we have microscopically and molecularly surveyed blood samples from 11 captive capybaras (Hydrochaeris hydrochaeris) from the Sanctuary Zoo for Plasmodium sp. infection. One animal was positive on blood smear by light microscopy. Polymerase chain reaction was carried out using a nested genus-specific protocol, which uses oligonucleotides from conserved sequences flanking a variable sequence region in the small subunit ribosomal RNA (ssrRNA) of all Plasmodium organisms. This revealed three positive animals. Products from two samples were purified and sequenced. The results showed less than 1% divergence between the two capybara sequences. When compared with GenBank sequences, a 55% similarity was obtained to Toxoplasma gondii and a higher similarity (73-77.2%) was found to ssrRNAs from Plasmodium species that infect reptiles, birds, rodents, and humans. The most similar Plasmodium sequence was from Plasmodium mexicanum, which infects lizards of North America, where around 78% identity was found. This work is the first report of Plasmodium in capybaras, and due to the low similarity with other Plasmodium species, we suggest it is a new species, which in the future could be named "Plasmodium hydrochaeri".
Olmsted, R A; Langley, R; Roelke, M E; Goeken, R M; Adger-Johnson, D; Goff, J P; Albert, J P; Packer, C; Laurenson, M K; Caro, T M
1992-10-01
The natural occurrence of lentiviruses closely related to feline immunodeficiency virus (FIV) in nondomestic felid species is shown here to be worldwide. Cross-reactive antibodies to FIV were common in several free-ranging populations of large cats, including East African lions and cheetahs of the Serengeti ecosystem and in puma (also called cougar or mountain lion) populations throughout North America. Infectious puma lentivirus (PLV) was isolated from several Florida panthers, a severely endangered relict puma subspecies inhabiting the Big Cypress Swamp and Everglades ecosystems in southern Florida. Phylogenetic analysis of PLV genomic sequences from disparate geographic isolates revealed appreciable divergence from domestic cat FIV sequences as well as between PLV sequences found in different North American locales. The level of sequence divergence between PLV and FIV was greater than the level of divergence between human and certain simian immunodeficiency viruses, suggesting that the transmission of FIV between feline species is infrequent and parallels in time the emergence of HIV from simian ancestors.
Horner, David S; Lefkimmiatis, Konstantinos; Reyes, Aurelio; Gissi, Carmela; Saccone, Cecilia; Pesole, Graziano
2007-01-01
Background Phylogenetic relationships between Lagomorpha, Rodentia and Primates and their allies (Euarchontoglires) have long been debated. While it is now generally agreed that Rodentia constitutes a monophyletic sister-group of Lagomorpha and that this clade (Glires) is sister to Primates and Dermoptera, higher-level relationships within Rodentia remain contentious. Results We have sequenced and performed extensive evolutionary analyses on the mitochondrial genome of the scaly-tailed flying squirrel Anomalurus sp., an enigmatic rodent whose phylogenetic affinities have been obscure and extensively debated. Our phylogenetic analyses of the coding regions of available complete mitochondrial genome sequences from Euarchontoglires suggest that Anomalurus is a sister taxon to the Hystricognathi, and that this clade represents the most basal divergence among sampled Rodentia. Bayesian dating methods incorporating a relaxed molecular clock provide divergence-time estimates which are consistently in agreement with the fossil record and which indicate a rapid radiation within Glires around 60 million years ago. Conclusion Taken together, the data presented provide a working hypothesis as to the phylogenetic placement of Anomalurus, underline the utility of mitochondrial sequences in the resolution of even relatively deep divergences and go some way to explaining the difficulty of conclusively resolving higher-level relationships within Glires with available data and methodologies. PMID:17288612
When are pathogen genome sequences informative of transmission events?
Ferguson, Neil; Jombart, Thibaut
2018-01-01
Recent years have seen the development of numerous methodologies for reconstructing transmission trees in infectious disease outbreaks from densely sampled whole genome sequence data. However, a fundamental and as of yet poorly addressed limitation of such approaches is the requirement for genetic diversity to arise on epidemiological timescales. Specifically, the position of infected individuals in a transmission tree can only be resolved by genetic data if mutations have accumulated between the sampled pathogen genomes. To quantify and compare the useful genetic diversity expected from genetic data in different pathogen outbreaks, we introduce here the concept of ‘transmission divergence’, defined as the number of mutations separating whole genome sequences sampled from transmission pairs. Using parameter values obtained by literature review, we simulate outbreak scenarios alongside sequence evolution using two models described in the literature to describe transmission divergence of ten major outbreak-causing pathogens. We find that while mean values vary significantly between the pathogens considered, their transmission divergence is generally very low, with many outbreaks characterised by large numbers of genetically identical transmission pairs. We describe the impact of transmission divergence on our ability to reconstruct outbreaks using two outbreak reconstruction tools, the R packages outbreaker and phybreak, and demonstrate that, in agreement with previous observations, genetic sequence data of rapidly evolving pathogens such as RNA viruses can provide valuable information on individual transmission events. Conversely, sequence data of pathogens with lower mean transmission divergence, including Streptococcus pneumoniae, Shigella sonnei and Clostridium difficile, provide little to no information about individual transmission events. 
Our results highlight the informational limitations of genetic sequence data in certain outbreak scenarios, and demonstrate the need to expand the toolkit of outbreak reconstruction tools to integrate other types of epidemiological data. PMID:29420641
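The intuition behind transmission divergence can be shown with a back-of-the-envelope calculation. This is a deliberately simplified sketch and not one of the two simulation models used in the study: it assumes mutations accumulate as a Poisson process at a constant rate, and every parameter value below is a hypothetical placeholder rather than an estimate for any real pathogen.

```python
import math


def expected_mutations(rate_per_site_per_day, genome_length, days_between_samples):
    """Mean number of mutations separating a transmission pair (Poisson mean)."""
    return rate_per_site_per_day * genome_length * days_between_samples


def prob_identical(mean_mutations):
    """Probability that a Poisson count with this mean is zero,
    i.e. that the two sampled genomes are genetically identical."""
    return math.exp(-mean_mutations)


# Hypothetical fast- and slow-evolving pathogens (illustrative values only).
fast = expected_mutations(1e-5, 10_000, 10)     # small genome, high rate
slow = expected_mutations(1e-9, 4_000_000, 10)  # large genome, low rate
p_fast_identical = prob_identical(fast)
p_slow_identical = prob_identical(slow)
```

When the mean is well below one, most transmission pairs are expected to carry identical genomes, which is the situation the abstract describes as providing little information about who infected whom.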
Molecular characterization and distribution of a 145-bp tandem repeat family in the genus Populus.
Rajagopal, J; Das, S; Khurana, D K; Srivastava, P S; Lakshmikumaran, M
1999-10-01
This report aims to describe the identification and molecular characterization of a 145-bp tandem repeat family that accounts for nearly 1.5% of the Populus genome. Three members of this repeat family were cloned and sequenced from Populus deltoides and P. ciliata. The dimers of the repeat were sequenced in order to confirm the head-to-tail organization of the repeat. Hybridization-based analysis using the 145-bp tandem repeat as a probe on genomic DNA gave rise to ladder patterns which were identified to be a result of methylation and (or) sequence heterogeneity. Analysis of the methylation pattern of the repeat family using methylation-sensitive isoschizomers revealed variable methylation of the C residues and lack of methylation of the A residues. Sequence comparisons between the monomers revealed a high degree of sequence divergence that ranged between 6% and 11% in P. deltoides and between 4.2% and 8.3% in P. ciliata. This indicated the presence of sub-families within the 145-bp tandem family of repeats. Divergence was mainly due to the accumulation of point mutations and was concentrated in the central region of the repeat. The 145-bp tandem repeat family did not show significant homology to known tandem repeats from plants. A short stretch of 36 bp was found to show homology of 66.7% to a centromeric repeat from Chironomus plumosus. Dot-blot analysis and Southern hybridization data revealed the presence of the repeat family in 13 of the 14 Populus species examined. The absence of the 145-bp repeat from P. euphratica suggested that this species is relatively distant from other members of the genus, which correlates with taxonomic classifications. The widespread occurrence of the tandem family in the genus indicated that this family may be of ancient origin.
Reding, Dawn M; Addis, Elizabeth A; Palacios, Maria G; Schwartz, Tonia S; Bronikowski, Anne M
2016-07-01
The insulin/insulin-like signaling pathway (IIS) has been shown to mediate life history trade-offs in mammalian model organisms, but the function of this pathway in wild and non-mammalian organisms is understudied. Populations of western terrestrial garter snakes (Thamnophis elegans) around Eagle Lake, California, have evolved variation in growth and maturation rates, mortality senescence rates, and annual reproductive output that partition into two ecotypes: "fast-living" and "slow-living". Thus, genes associated with the IIS network are good candidates for investigating the mechanisms underlying ecological divergence in this system. We reared neonates from each ecotype for 1.5 years under two thermal treatments. We then used qPCR to compare mRNA expression levels in three tissue types (brain, liver, skeletal muscle) for four genes (igf1, igf2, igf1r, igf2r), and we used radioimmunoassay to measure plasma IGF-1 and IGF-2 protein levels. Our results show that, in contrast to most mammalian model systems, igf2 mRNA and protein levels exceed those of igf1 and suggest an important role for igf2 in postnatal growth in reptiles. Thermal rearing treatment and recent growth had greater impacts on IGF levels than genetic background (i.e., ecotype), and the two ecotypes responded similarly. This suggests that observed ecotypic differences in field measures of IGFs may more strongly reflect plastic responses in different environments than evolutionary divergence. Future analyses of additional components of the IIS pathway and sequence divergence between the ecotypes will further illuminate how environmental and genetic factors influence the endocrine system and its role in mediating life history trade-offs. Copyright © 2016 Elsevier Inc. All rights reserved.
Pereira, J O P; Freitas, B M; Jorge, D M M; Torres, D C; Soares, C E A; Grangeiro, T B
2009-01-01
Melipona quinquefasciata is a ground-nesting South American stingless bee whose geographic distribution was believed to comprise only the central and southern states of Brazil. We obtained partial sequences (about 500-570 bp) of first internal transcribed spacer (ITS1) nuclear ribosomal DNA from Melipona specimens putatively identified as M. quinquefasciata collected from different localities in northeastern Brazil. To confirm the taxonomic identity of the northeastern samples, specimens from the state of Goiás (Central region of Brazil) were included for comparison. All sequences were deposited in GenBank (accession numbers EU073751-EU073759). The mean nucleotide divergence (excluding sites with insertions/deletions) in the ITS1 sequences was only 1.4%, ranging from 0 to 4.1%. When the sites with insertions/deletions were also taken into account, sequence divergences varied from 0 to 5.3%. In all pairwise comparisons, the ITS1 sequence from the specimens collected in Goiás was most divergent compared to the ITS1 sequences of the bees from the other locations. However, neighbor-joining phylogenetic analysis showed that all ITS1 sequences from northeastern specimens along with the sample of Goiás were resolved in a single clade with a bootstrap support of 100%. The ITS1 sequencing data thus support the occurrence of M. quinquefasciata in northeast Brazil.
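The two divergence figures reported above differ only in how alignment columns containing insertions/deletions are treated. A minimal sketch of that distinction, assuming two pre-aligned sequences with "-" marking gaps (the function name and the toy alignment are illustrative, not data from the study):

```python
def p_distance(a, b, count_gaps=False):
    """Proportion of differing sites between two aligned sequences.

    With count_gaps=False, columns where either sequence has a gap are
    skipped. With count_gaps=True, a gap opposite a base counts as a
    difference. Columns that are gaps in both sequences are always skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    sites = 0
    diffs = 0
    for x, y in zip(a, b):
        has_gap = x == "-" or y == "-"
        if has_gap and (x == y or not count_gaps):
            continue
        sites += 1
        if x != y:
            diffs += 1
    if sites == 0:
        raise ValueError("no comparable sites")
    return diffs / sites


# Toy alignment: one substitution and one indel column.
seq_a = "ACGT-A"
seq_b = "ACTTGA"
without_indels = p_distance(seq_a, seq_b)
with_indels = p_distance(seq_a, seq_b, count_gaps=True)
```

Counting indel columns can only raise or preserve the raw number of differences, which is consistent with the wider range the abstract reports when indels are included.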
Barcoding of fresh water fishes from Pakistan.
Karim, Asma; Iqbal, Asad; Akhtar, Rehan; Rizwan, Muhammad; Amar, Ali; Qamar, Usman; Jahan, Shah
2016-07-01
DNA barcoding is a taxonomic method that uses small genetic markers in organisms' mitochondrial DNA (mtDNA) for identification of particular species. It uses sequence diversity in a 658-base pair fragment near the 5' end of the mitochondrial cytochrome c oxidase subunit 1 (CO1) gene as a tool for species identification. DNA barcoding is a more accurate and reliable method than morphological identification. It is equally useful in juvenile and adult stages of fishes. The present study was conducted to genetically identify three farm fish species of Pakistan (Cyprinus carpio, Cirrhinus mrigala, and Ctenopharyngodon idella). All of them belong to the family Cyprinidae. The CO1 gene was amplified. PCR products were sequenced and analyzed with bioinformatic software. Conspecific, congeneric, and confamilial K2P nucleotide divergence was estimated. From these findings, it was concluded that the CO1 gene sequence may serve as a milestone for the identification of related species at the molecular level.
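K2P refers to the Kimura two-parameter distance, which corrects the observed divergence for multiple hits while treating transitions and transversions separately: d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q), where P and Q are the proportions of transitional and transversional differences. The sketch below implements that formula for two aligned sequences; the function name and the handling of ambiguous sites (they are skipped) are assumptions made for illustration, not a description of the software used in the study.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(a, b):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for x, y in zip(a.upper(), b.upper()):
        # Skip gaps and ambiguity codes; only unambiguous bases are compared.
        if x not in "ACGT" or y not in "ACGT":
            continue
        sites += 1
        if x == y:
            continue
        pair = {x, y}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1   # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    w1 = 1.0 - 2.0 * p - q
    w2 = 1.0 - 2.0 * q
    if w1 <= 0 or w2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(w1) - 0.25 * math.log(w2)


# Toy example: one transition in ten sites.
d = k2p_distance("AAAAAAAAAA", "GAAAAAAAAA")
```

In a barcoding study, such distances would be averaged within species, within genera, and within families to obtain the conspecific, congeneric, and confamilial values.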
Paraskevis, D; Magiorkinis, M; Vandamme, A M; Kostrikis, L G; Hatzakis, A
2001-03-01
Human immunodeficiency virus type 1 (HIV-1) has been classified into three main groups and 11 distinct subtypes. Moreover, several circulating recombinant forms (CRFs) of HIV-1 have been recently documented to have spread widely causing extensive HIV-1 epidemics. A subtype, initially designated I (CRF04_cpx), was documented in Cyprus and Greece and was found to comprise regions of sequence derived from subtypes A and G as well as regions of unclassified sequence. Re-analysis of the three full-length CRF04_cpx sequences that were available revealed a mosaic genomic organization of unique complexity comprising regions of sequence from at least five distinct subtypes, A, G, H, K and unclassified regions. These strains account for approximately 2% of the total HIV-1-infected population in Greece, thus providing evidence of the great capability of HIV-1 to recombine and produce highly divergent strains which can be spread successfully through different infection routes.
Camunas-Soler, Joan; Kertesz, Michael; De Vlaminck, Iwijn; Koh, Winston; Pan, Wenying; Martin, Lance; Neff, Norma F.; Okamoto, Jennifer; Wong, Ronald J.; Kharbanda, Sandhya; El-Sayed, Yasser; Blumenfeld, Yair; Stevenson, David K.; Shaw, Gary M.; Wolfe, Nathan D.; Quake, Stephen R.
2017-01-01
Blood circulates throughout the human body and contains molecules drawn from virtually every tissue, including the microbes and viruses which colonize the body. Through massive shotgun sequencing of circulating cell-free DNA from the blood, we identified hundreds of new bacteria and viruses which represent previously unidentified members of the human microbiome. Analyzing cumulative sequence data from 1,351 blood samples collected from 188 patients enabled us to assemble 7,190 contiguous regions (contigs) larger than 1 kbp, of which 3,761 are novel with little or no sequence homology in any existing databases. The vast majority of these novel contigs possess coding sequences, and we have validated their existence both by finding their presence in independent experiments and by performing direct PCR amplification. When their nearest neighbors are located in the tree of life, many of the organisms represent entirely novel taxa, showing that microbial diversity within the human body is substantially broader than previously appreciated. PMID:28830999
Haygood, M G; Distel, D L
1993-05-13
Bioluminescent symbioses range from facultative associations to highly adapted, apparently obligate ones. The family Anomalopidae (flashlight fishes) encompasses five genera of tropical reef fishes that have large suborbital light organs. The suborder Ceratioidei (deep-sea anglerfishes) contains 11 families. In nine of these, females have a bioluminescent lure that contains bacterial symbionts. In all other fish light-organ symbioses (occurring in 10 families in 5 orders), the symbionts belong to three Photobacterium species; nonsymbiotic luminous bacteria are Vibrio species. The bacteria are extracellular and tightly packed in tubules that communicate with the exterior, releasing bacteria into the gut of the host or the surrounding sea water. The released bacteria are usually cultivable and can contribute to planktonic populations. Although anomalopids release bacteria and ceratioids have pores that would allow release, the fate of these bacteria is unknown and they cannot be cultured by standard isolation techniques. We report here phylogenetic analysis of 16S ribosomal RNA gene sequences from light organs that show that anomalopid and ceratioid symbionts are not known luminous bacteria, but are new groups related to Vibrio spp. They are characterized by host specificity, deep divergence between symbionts from different genera (anomalopids) or families (ceratioids) and, possibly, parallel divergence of hosts and symbionts.
Mhc class II B gene evolution in East African cichlid fishes.
Figueroa, F; Mayer, W E; Sültmann, H; O'hUigin, C; Tichy, H; Satta, Y; Takezaki, N; Takahata, N; Klein, J
2000-06-01
A distinctive feature of essential major histocompatibility complex (Mhc) loci is their polymorphism characterized by large genetic distances between alleles and long persistence times of allelic lineages. Since the lineages often span several successive speciations, we investigated the behavior of the Mhc alleles during or close to the speciation phase. We sequenced exon 2 of the class II B locus 4 from 232 East African cichlid fishes representing 32 related species. The divergence times of the (sub)species ranged from 6,000 to 8.4 million years. Two types of evolutionary analysis were used to elucidate the pattern of exon 2 sequence divergence. First, phylogenetic methods were applied to reconstruct the most likely evolutionary pathways leading from the last common ancestor of the set to the extant sequences, and to assess the probable mechanisms involved in allelic diversification. Second, pairwise comparisons of sequences were carried out to detect differences seemingly incompatible with origin by nonparallel point mutations. The analysis revealed point mutations to be the most important mechanism behind allelic divergences, with recombination playing only an auxiliary part. Comparison of sequences from related species revealed evidence of random allelic (lineage) losses apparently associated with speciation. Sharing of identical alleles could be demonstrated between species that diverged 2 million years ago. The phylogeny of the exon was incongruent with that of the flanking introns, indicating either a high degree of convergent evolution at the peptide-binding region-encoding sites, or intron homogenization.
Amoikon, Tiemele Laurent Simon; Grondin, Cécile; Djéni, Théodore N'Dédé; Jacques, Noémie; Casaregola, Serge
2018-05-21
Analysis of yeasts isolated from various biotopes in French Guiana led to the identification of two strains isolated from flowers and designated CLIB 1634T and CLIB 1707T. Comparison of the D1/D2 domain of the large subunit (LSU D1/D2) rRNA gene sequences of CLIB 1634T and CLIB 1707T to those in the GenBank database revealed that these strains belong to the Starmerella clade. Strain CLIB 1634T was shown to diverge from the closely related Starmerella apicola type strain CBS 2868T with a sequence divergence of 1.34 % and 1.30 % in the LSU D1/D2 rRNA gene and internal transcribed spacer (ITS) sequences, respectively. Strain CLIB 1634T and Candida apicola CBS 2868T diverged by 3.81 % and 14.96 % at the level of the protein-coding gene partial sequences EF-1α and RPB2, respectively. CLIB 1707T was found to have sequence divergence of 3.88 % and 9.16 % in the LSU D1/D2 rRNA gene and ITS, respectively, from that of the most closely related species Starmerella ratchasimensis type strain CBS 10611T. The species Starmerella reginensis f.a., sp. nov. and Starmerella kourouensis f.a., sp. nov. are proposed to accommodate strains CLIB 1634T (=CBS 15247T) and CLIB 1707T (=CBS 15257T), respectively.
Intraspecific variation in Cryptocaryon irritans.
Diggles, B K; Adlard, R D
1997-01-01
Intraspecific variation in the ciliate Cryptocaryon irritans was examined using sequences of the first internal transcribed spacer region (ITS-1) of ribosomal DNA (rDNA) combined with developmental and morphological characters. Amplified rDNA sequences consisting of 151 bases of the flanking 18 S and 5.8 S regions, and the entire ITS-1 region (169 or 170 bases), were determined and compared for 16 isolates of C. irritans from Australia, Israel and the USA. There was one variable base between isolates in the 18 S region and 11 variable bases in the ITS-1 region. Despite their similar morphology, significant sequence variation (4.1% divergence) and developmental differences indicate that Australian C. irritans isolates from estuarine (Moreton Bay) and coral reef (Heron Island) environments are distinct. The Heron Island isolate was genetically closer to morphologically dissimilar isolates from Israel (1.8% divergence) and the USA (2.3% divergence) than it was to the Moreton Bay isolates. Three isolates maintained in our laboratory since February 1994 differed in sequence from earlier laboratory isolates (2.9% to 3.5% divergence), even though all were similar morphologically and originated from the same source. During this time the sequence of the isolates from wild fish in Moreton Bay remained unchanged. These genetic differences indicate the existence of a founder effect in laboratory populations of C. irritans. The genetic variation found here, combined with known morphological and developmental differences, is used to characterise four strains of C. irritans.
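The percent divergence values quoted in abstracts like the one above are commonly uncorrected pairwise distances (p-distances) between aligned sequences. The following is a minimal Python sketch of that calculation, not the authors' method: the function name `p_distance`, the choice to skip gap and ambiguity characters, and the toy sequences are all assumptions made for illustration.

```python
def p_distance(seq_a: str, seq_b: str, skip_chars: str = "-N?") -> float:
    """Uncorrected pairwise distance between two aligned sequences.

    Sites where either sequence has a gap or ambiguity character
    (assumed here to be '-', 'N' or '?') are left out of the comparison.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in skip_chars or b in skip_chars:
            continue  # site not comparable
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


if __name__ == "__main__":
    # Hypothetical toy alignment, not data from the study.
    isolate_1 = "ACGTACGTAC"
    isolate_2 = "ACGTACGTAT"
    print(f"{100 * p_distance(isolate_1, isolate_2):.1f}% divergence")
```

As plain arithmetic, 7 differing sites over 170 compared sites would give roughly 4.1% divergence; whether the reported figures were derived this way, or with a corrected distance model, is not stated in the abstract.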
Chakona, Albert; Swartz, Ernst R.; Gouws, Gavin
2013-01-01
This study used phylogenetic analyses of mitochondrial cytochrome b sequences to investigate genetic diversity within three broadly co-distributed freshwater fish genera (Galaxias, Pseudobarbus and Sandelia) to shed some light on the processes that promoted lineage diversification and shaped geographical distribution patterns. A total of 205 sequences of Galaxias, 177 sequences of Pseudobarbus and 98 sequences of Sandelia from 146 localities across nine river systems in the south-western Cape Floristic Region (CFR, South Africa) were used. The data were analysed using phylogenetic and haplotype network methods and divergence times for the clades retrieved were estimated using *BEAST. Nine extremely divergent (3.5–25.3%) lineages were found within Galaxias. Similarly, deep phylogeographic divergence was evident within Pseudobarbus, with four markedly distinct (3.8–10.0%) phylogroups identified. Sandelia had two deeply divergent (5.5–5.9%) lineages, but seven minor lineages with strong geographical congruence were also identified. The Miocene-Pliocene major sea-level transgression and the resultant isolation of populations in upland refugia appear to have driven widespread allopatric divergence within the three genera. Subsequent coalescence of rivers during the Pleistocene major sea-level regression as well as intermittent drainage connections during wet periods are proposed to have facilitated range expansion of lineages that currently occur across isolated river systems. The high degree of genetic differentiation recovered from the present and previous studies suggests that freshwater fish diversity within the south-western CFR may be vastly underestimated, and that taxonomic revisions are required. PMID:23951050
Echave, Julian; Wilke, Claus O.
2018-01-01
For decades, rates of protein evolution have been interpreted in terms of the vague concept of “functional importance”. Slowly evolving proteins or sites within proteins were assumed to be more functionally important and thus subject to stronger selection pressure. More recently, biophysical models of protein evolution, which combine evolutionary theory with protein biophysics, have completely revolutionized our view of the forces that shape sequence divergence. Slowly evolving proteins have been found to evolve slowly because of selection against toxic misfolding and misinteractions, linking their rate of evolution primarily to their abundance. Similarly, most slowly evolving sites in proteins are not directly involved in function, but mutating them has large impacts on protein structure and stability. Here, we review the studies of the emergent field of biophysical protein evolution that have shaped our current understanding of sequence divergence patterns. We also propose future research directions to develop this nascent field. PMID:28301766
Antell, Gregory C.; Zhong, Wen; Kercher, Katherine; Passic, Shendra; Williams, Jean; Liu, Yucheng; James, Tony; Jacobson, Jeffrey M.; Szep, Zsofia
2017-01-01
Vpr is an HIV-1 accessory protein that plays numerous roles during viral replication, some of which are cell type dependent. To test the hypothesis that HIV-1 tropism extends beyond the envelope into the vpr gene, studies were performed to identify the associations between coreceptor usage and Vpr variation in HIV-1-infected patients. Colinear HIV-1 Env-V3 and Vpr amino acid sequences were obtained from the LANL HIV-1 sequence database and from well-suppressed patients in the Drexel/Temple Medicine CNS AIDS Research and Eradication Study (CARES) Cohort. Genotypic classification of Env-V3 sequences as X4 (CXCR4-utilizing) or R5 (CCR5-utilizing) was used to group colinear Vpr sequences. To reveal the sequences associated with a specific coreceptor usage genotype, Vpr amino acid sequences were assessed for amino acid diversity and Jensen-Shannon divergence between the two groups. Five amino acid alphabets were used to comprehensively examine the impact of amino acid substitutions involving side chains with similar physicochemical properties. Positions 36, 37, 41, 89, and 96 of Vpr were characterized by statistically significant divergence across multiple alphabets when X4 and R5 sequence groups were compared. In addition, consensus amino acid switches were found at positions 37 and 41 in comparisons of the R5 and X4 sequence populations. These results suggest an evolutionary link between Vpr and gp120 in HIV-1-infected patients. PMID:28620613
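The Jensen-Shannon divergence mentioned above compares the residue frequency distributions of two sequence groups at one alignment position. The sketch below shows the standard base-2 definition in Python; it is an illustration under stated assumptions, not the study's pipeline. The helper names `residue_distribution` and `jensen_shannon` and the toy residue columns are hypothetical, and no significance testing or reduced amino acid alphabets are included.

```python
import math
from collections import Counter


def residue_distribution(residues, alphabet):
    """Relative frequency of each alphabet symbol in one alignment column."""
    counts = Counter(residues)
    total = sum(counts[symbol] for symbol in alphabet)
    if total == 0:
        raise ValueError("column contains no symbols from the alphabet")
    return [counts[symbol] / total for symbol in alphabet]


def jensen_shannon(p, q):
    """Jensen-Shannon divergence (base 2) between two distributions.

    Ranges from 0 (identical) to 1 (no shared support).
    """
    mixture = [(pi + qi) / 2 for pi, qi in zip(p, q)]

    def kl(x, y):
        # Kullback-Leibler divergence; terms with x_i == 0 contribute 0.
        return sum(xi * math.log2(xi / yi) for xi, yi in zip(x, y) if xi > 0)

    return 0.5 * kl(p, mixture) + 0.5 * kl(q, mixture)


if __name__ == "__main__":
    # Hypothetical residues at one position in two groups (not study data).
    x4_column = list("RRRRI")
    r5_column = list("IIIIR")
    alphabet = sorted(set(x4_column) | set(r5_column))
    p = residue_distribution(x4_column, alphabet)
    q = residue_distribution(r5_column, alphabet)
    print(f"JS divergence: {jensen_shannon(p, q):.3f}")
```

Using base-2 logarithms bounds the value between 0 and 1, which makes positions easier to compare; recoding the residues into a reduced alphabet before calling `residue_distribution` would be one way to mimic the multi-alphabet comparison the abstract describes.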
Evolutionary history of the enolase gene family.
Tracy, M R; Hedges, S B
2000-12-23
The enzyme enolase [EC 4.2.1.11] is found in all organisms, with vertebrates exhibiting tissue-specific isozymes encoded by three genes: alpha (α), beta (β), and gamma (γ) enolase. Limited taxonomic sampling of enolase has obscured the timing of gene duplication events. To help clarify the evolutionary history of the gene family, cDNAs were sequenced from six taxa representing major lineages of vertebrates: Chiloscyllium punctatum (shark), Amia calva (bowfin), Salmo trutta (trout), Latimeria chalumnae (coelacanth), Lepidosiren paradoxa (South American lungfish), and Neoceratodus forsteri (Australian lungfish). Phylogenetic analysis of all enolase and related gene sequences revealed an early gene duplication event prior to the last common ancestor of living organisms. Several distantly related archaebacterial sequences were designated as 'enolase-2', whereas all other enolase sequences were designated 'enolase-1'. Two of the three isozymes of enolase-1, alpha- and beta-enolase, were discovered in actinopterygian, sarcopterygian, and chondrichthian fishes. Phylogenetic analysis of vertebrate enolases revealed that the two gene duplications leading to the three isozymes of enolase-1 occurred subsequent to the divergence of living agnathans, near the Proterozoic/Phanerozoic boundary (approximately 550 Mya). Two copies of enolase, designated alpha(1) and alpha(2), were found in the trout and are presumed to be the result of a genome duplication event.
Cacheux, Lauriane; Ponger, Loïc; Gerbault-Seureau, Michèle; Loll, François; Gey, Delphine; Richard, Florence Anne; Escudé, Christophe
2018-06-01
Alpha satellite is the major repeated DNA element of primate centromeres. Specific evolutionary mechanisms have led to a great diversity of sequence families with peculiar genomic organization and distribution, which have till now been studied mostly in great apes. Using high throughput sequencing of alpha satellite monomers obtained by enzymatic digestion followed by computational and cytogenetic analysis, we compare here the diversity and genomic distribution of alpha satellite DNA in two related Old World monkey species, Cercopithecus pogonias and Cercopithecus solatus, which are known to have diverged about seven million years ago. Two main families of monomers, called C1 and C2, are found in both species. A detailed analysis of our datasets revealed the existence of numerous subfamilies within the centromeric C1 family. Although the most abundant subfamily is conserved between both species, our FISH experiments clearly show that some subfamilies are specific for each species and that their distribution is restricted to a subset of chromosomes, thereby pointing to the existence of recurrent amplification/homogenization events. The pericentromeric C2 family is very abundant on the short arm of all acrocentric chromosomes in both species, pointing to specific mechanisms that lead to this distribution. Results obtained using two different restriction enzymes are fully consistent with a predominant monomeric organization of alpha satellite DNA which coexists with higher order organization patterns in the Cercopithecus pogonias genome. Our study suggests a high dynamics of alpha satellite DNA in Cercopithecini, with recurrent apparition of new sequence variants and interchromosomal sequence transfer.
Ingram, G C; Goodrich, J; Wilkinson, M D; Simon, R; Haughn, G W; Coen, E S
1995-09-01
The unusual floral organs (ufo) mutant of Arabidopsis has flowers with variable homeotic organ transformations and inflorescence-like characteristics. To determine the relationship between UFO and previously characterized meristem and organ identity genes, we cloned UFO and determined its expression pattern. The UFO gene shows extensive homology with FIMBRIATA (FIM), a gene mediating between meristem and organ identity genes in Antirrhinum. All three UFO mutant alleles that we sequenced are predicted to produce truncated proteins. UFO transcripts were first detected in early floral meristems, before organ identity genes had been activated. At later developmental stages, UFO expression is restricted to the junction between sepal and petal primordia. Phenotypic, genetic, and expression pattern comparisons between UFO and FIM suggest that they are cognate homologs and play a similar role in mediating between meristem and organ identity genes. However, some differences in the functions and genetic interactions of UFO and FIM were apparent, indicating that changes in partially redundant pathways have occurred during the evolutionary divergence of Arabidopsis and Antirrhinum.
Kaeding, Allison J.; Ast, Jennifer C.; Pearce, Meghan M.; Urbanczyk, Henryk; Kimura, Seishi; Endo, Hiromitsu; Nakamura, Masaru; Dunlap, Paul V.
2007-01-01
“Photobacterium mandapamensis” (proposed name) and Photobacterium leiognathi are closely related, phenotypically similar marine bacteria that form bioluminescent symbioses with marine animals. Despite their similarity, however, these bacteria can be distinguished phylogenetically by sequence divergence of their luminescence genes, luxCDAB(F)E, by the presence (P. mandapamensis) or the absence (P. leiognathi) of luxF and, as shown here, by the sequence divergence of genes involved in the synthesis of riboflavin, ribBHA. To gain insight into the possibility that P. mandapamensis and P. leiognathi are ecologically distinct, we used these phylogenetic criteria to determine the incidence of P. mandapamensis as a bioluminescent symbiont of marine animals. Five fish species, Acropoma japonicum (Perciformes, Acropomatidae), Photopectoralis panayensis and Photopectoralis bindus (Perciformes, Leiognathidae), Siphamia versicolor (Perciformes, Apogonidae), and Gadella jordani (Gadiformes, Moridae), were found to harbor P. mandapamensis in their light organs. Specimens of A. japonicum, P. panayensis, and P. bindus harbored P. mandapamensis and P. leiognathi together as cosymbionts of the same light organ. Regardless of cosymbiosis, P. mandapamensis was the predominant symbiont of A. japonicum, and it was the apparently exclusive symbiont of S. versicolor and G. jordani. In contrast, P. leiognathi was found to be the predominant symbiont of P. panayensis and P. bindus, and it appears to be the exclusive symbiont of other leiognathid fishes and a loliginid squid. A phylogenetic test for cospeciation revealed no evidence of codivergence between P. mandapamensis and its host fishes, indicating that coevolution apparently is not the basis for this bacterium's host preferences. These results, which are the first report of bacterial cosymbiosis in fish light organs and the first demonstration that P. leiognathi is not the exclusive light organ symbiont of leiognathid fishes, demonstrate that the host species ranges of P. mandapamensis and P. leiognathi are substantially distinct. The host range difference underscores possible differences in the environmental distributions and physiologies of these two bacterial species. PMID:17369329
Iftikhar, Romana; Ashfaq, Muhammad; Rasool, Akhtar; Hebert, Paul D N
2016-01-01
Although thrips are globally important crop pests and vectors of viral disease, species identifications are difficult because of their small size and inconspicuous morphological differences. Sequence variation in the mitochondrial COI-5' (DNA barcode) region has proven effective for the identification of species in many groups of insect pests. We analyzed barcode sequence variation among 471 thrips from various plant hosts in north-central Pakistan. The Barcode Index Number (BIN) system assigned these sequences to 55 BINs, while the Automatic Barcode Gap Discovery detected 56 partitions, a count that coincided with the number of monophyletic lineages recognized by Neighbor-Joining analysis and Bayesian inference. Congeneric species showed an average of 19% sequence divergence (range = 5.6% - 27%) at COI, while intraspecific distances averaged 0.6% (range = 0.0% - 7.6%). BIN analysis suggested that all intraspecific divergence >3.0% actually involved a species complex. In fact, sequences for three major pest species (Haplothrips reuteri, Thrips palmi, Thrips tabaci), and one predatory thrips (Aeolothrips intermedius) showed deep intraspecific divergences, providing evidence that each is a cryptic species complex. The study compiles the first barcode reference library for the thrips of Pakistan, and examines global haplotype diversity in four important pest thrips.
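The 3.0% rule of thumb reported above can be applied as a simple screen over per-species maximum intraspecific distances. The Python sketch below illustrates that screen only; it does not reproduce BIN or ABGD analysis. The function name `flag_candidate_complexes`, the species labels, and the distance values are hypothetical placeholders.

```python
def flag_candidate_complexes(max_intraspecific_pct, threshold_pct=3.0):
    """Return species whose maximum intraspecific divergence exceeds a threshold.

    `max_intraspecific_pct` maps a species label to its largest pairwise
    COI distance in percent. The default threshold follows the 3.0% figure
    quoted in the abstract.
    """
    return sorted(
        species
        for species, distance in max_intraspecific_pct.items()
        if distance > threshold_pct
    )


if __name__ == "__main__":
    # Hypothetical values, invented for illustration only.
    distances = {"species_A": 0.4, "species_B": 7.6, "species_C": 3.0}
    for species in flag_candidate_complexes(distances):
        print(f"{species}: possible cryptic species complex")
```

A species sitting exactly at the threshold is not flagged here because the comparison is strict, matching the ">3.0%" wording in the abstract.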
Extensive concerted evolution of rice paralogs and the road to regaining independence.
Wang, Xiyin; Tang, Haibao; Bowers, John E; Feltus, Frank A; Paterson, Andrew H
2007-11-01
Many genes duplicated by whole-genome duplications (WGDs) are more similar to one another than expected. We investigated whether concerted evolution through conversion and crossing over, well-known to affect tandem gene clusters, also affects dispersed paralogs. Genome sequences for two Oryza subspecies reveal appreciable gene conversion in the approximately 0.4 MY since their divergence, with a gradual progression toward independent evolution of older paralogs. Since divergence from subspecies indica, approximately 8% of japonica paralogs produced 5-7 MYA on chromosomes 11 and 12 have been affected by gene conversion and several reciprocal exchanges of chromosomal segments, while approximately 70-MY-old "paleologs" resulting from a genome duplication (GD) show much less conversion. Sequence similarity analysis in proximal gene clusters also suggests more conversion between younger paralogs. About 8% of paleologs may have been converted since rice-sorghum divergence approximately 41 MYA. Domain-encoding sequences are more frequently converted than nondomain sequences, suggesting a sort of circularity--that sequences conserved by selection may be further conserved by relatively frequent conversion. The higher level of concerted evolution in the 5-7 MY-old segmental duplication may reflect the behavior of many genomes within the first few million years after duplication or polyploidization.
Comparative sequence analyses of sixteen reptilian paramyxoviruses
Ahne, W.; Batts, W.N.; Kurath, G.; Winton, J.R.
1999-01-01
Viral genomic RNA of Fer-de-Lance virus (FDLV), a paramyxovirus highly pathogenic for reptiles, was reverse transcribed and cloned. Plasmids with significant sequence similarities to the hemagglutinin-neuraminidase (HN) and polymerase (L) genes of mammalian paramyxoviruses were identified by BLAST search. Partial sequences of the FDLV genes were used to design primers for amplification by nested polymerase chain reaction (PCR) and sequencing of 518-bp L gene and 352-bp HN gene fragments from a collection of 15 previously uncharacterized reptilian paramyxoviruses. Phylogenetic analyses of the partial L and HN sequences produced similar trees in which there were two distinct subgroups of isolates that were supported with maximum bootstrap values, and several intermediate isolates. Within each subgroup the nucleotide divergence values were less than 2.5%, while the divergence between the two subgroups was 20-22%. This indicated that the two subgroups represent distinct virus species containing multiple virus strains. The five intermediate isolates had nucleotide divergence values of 11-20% and may represent additional distinct species. In addition to establishing diversity among reptilian paramyxoviruses, the phylogenetic groupings showed some correlation with geographic location, and clearly demonstrated a low level of host species-specificity within these viruses. Copyright (C) 1999 Elsevier Science B.V.
srRNA evolution and phylogenetic relationships of the genus Naegleria (Protista: Rhizopoda).
Baverstock, P R; Illana, S; Christy, P E; Robinson, B S; Johnson, A M
1989-05-01
A rapid RNA sequencing technique was used to partially sequence the small-subunit ribosomal RNA (srRNA) of four species of the amoeboid genus Naegleria. The extent of nucleotide sequence divergence between the two most divergent species was roughly similar to that found between mammals and frogs. However, the pattern of variation among the Naegleria species was quite different from that found for those species of tetrapods characterized to date. A phylogenetic analysis of the consensus Naegleria sequence showed that Naegleria was not monophyletic with either Acanthamoeba castellanii or Dictyostelium discoideum, two other amoebas for which sequences were available. It was shown that the semiconserved regions of the srRNA molecule evolve in a clocklike fashion and that the clock is time dependent rather than generation dependent.
Accurate read-based metagenome characterization using a hierarchical suite of unique signatures
Freitas, Tracey Allen K.; Li, Po-E; Scholz, Matthew B.; Chain, Patrick S. G.
2015-01-01
A major challenge in the field of shotgun metagenomics is the accurate identification of organisms present within a microbial community, based on classification of short sequence reads. Though existing microbial community profiling methods have attempted to rapidly classify the millions of reads output from modern sequencers, the combination of incomplete databases, similarity among otherwise divergent genomes, errors and biases in sequencing technologies, and the large volumes of sequencing data required for metagenome sequencing has led to unacceptably high false discovery rates (FDR). Here, we present the application of a novel, gene-independent and signature-based metagenomic taxonomic profiling method with significantly and consistently smaller FDR than any other available method. Our algorithm circumvents false positives using a series of non-redundant signature databases and examines Genomic Origins Through Taxonomic CHAllenge (GOTTCHA). GOTTCHA was tested and validated on 20 synthetic and mock datasets ranging in community composition and complexity, was applied successfully to data generated from spiked environmental and clinical samples, and robustly demonstrates superior performance compared with other available tools. PMID:25765641
Taxonomic Resolutions Based on 18S rRNA Genes: A Case Study of Subclass Copepoda
Wu, Shu; Xiong, Jie; Yu, Yuhe
2015-01-01
Biodiversity studies are commonly conducted using 18S rRNA genes. In this study, we compared the inter-species divergence of variable regions (V1–9) within the copepod 18S rRNA gene, and tested their taxonomic resolutions at different taxonomic levels. Our results indicate that the 18S rRNA gene is a good molecular marker for the study of copepod biodiversity, and our conclusions are as follows: 1) 18S rRNA genes are highly conserved intra-species (intra-species similarities are close to 100%); and could aid in species-level analyses, but with some limitations; 2) nearly-whole-length sequences and some partial regions (around V2, V4, and V9) of the 18S rRNA gene can be used to discriminate between samples at both the family and order levels (with a success rate of about 80%); 3) compared with other regions, V9 has a higher resolution at the genus level (with an identification success rate of about 80%); and 4) V7 is most divergent in length, and would be a good candidate marker for the phylogenetic study of Acartia species. This study also evaluated the correlation between similarity thresholds and the accuracy of using nuclear 18S rRNA genes for the classification of organisms in the subclass Copepoda. We suggest that sample identification accuracy should be considered when a molecular sequence divergence threshold is used for taxonomic identification, and that the lowest similarity threshold should be determined based on a pre-designated level of acceptable accuracy. PMID:26107258
Unscrambling butterfly oogenesis
2013-01-01
Background: Butterflies are popular model organisms to study physiological mechanisms underlying variability in oogenesis and egg provisioning in response to environmental conditions. Nothing is known, however, about the developmental mechanisms governing butterfly oogenesis, how polarity in the oocyte is established, or which particular maternal effect genes regulate early embryogenesis. To gain insights into these developmental mechanisms and to identify the conserved and divergent aspects of butterfly oogenesis, we analysed a de novo ovarian transcriptome of the Speckled Wood butterfly Pararge aegeria (L.), and compared the results with known model organisms such as Drosophila melanogaster and Bombyx mori. Results: A total of 17306 contigs were annotated, with 30% possibly novel or highly divergent sequences observed. Pararge aegeria females expressed 74.5% of the genes that are known to be essential for D. melanogaster oogenesis. We discuss in great detail the genes involved in all aspects of oogenesis, including vitellogenesis and choriogenesis, plus those implicated in hormonal control of oogenesis and transgenerational hormonal effects. Compared to other insects, a number of significant differences were observed in the genes involved in stem cell maintenance and differentiation in the germarium, establishment of oocyte polarity, and in several aspects of maternal regulation of zygotic development. Conclusions: This study provides valuable resources to investigate a number of divergent aspects of butterfly oogenesis requiring further research. In order to fully unscramble butterfly oogenesis, we now also have the resources to investigate expression patterns of oogenesis genes under a range of environmental conditions, and to establish their function. PMID:23622113
2015-01-01
Culex pipiens, an invasive mosquito and vector of West Nile virus in the US, has two morphologically indistinguishable forms that differ dramatically in behavior and physiology. Cx. pipiens form pipiens is primarily a bird-feeding temperate mosquito, while the sub-tropical Cx. pipiens form molestus thrives in sewers and feeds on mammals. Because the feral form can diapause during the cold winters but the domestic form cannot, the two Cx. pipiens forms are allopatric in northern Europe and, although viable, hybrids are rare. Cx. pipiens form molestus has spread across all inhabited continents and hybrids of the two forms are common in the US. Here we elucidate the genes and gene families with the greatest divergence rates between these phenotypically diverged mosquito populations, and discuss them in light of their potential biological and ecological effects. After generating and assembling novel transcriptome data for each population, we performed pairwise tests for nonsynonymous divergence (Ka) of homologous coding sequences and examined gene ontology terms that were statistically over-represented in those sequences with the greatest divergence rates. We identified genes involved in digestion (serine endopeptidases), innate immunity (fibrinogens and α-macroglobulins), hemostasis (D7 salivary proteins), olfaction (odorant binding proteins) and chitin binding (peritrophic matrix proteins). By examining molecular divergence between closely related yet phenotypically divergent forms of the same species, our results provide insights into the identity of rapidly-evolving genes between incipient species. Additionally, we found that families of signal transducers, ATP synthases and transcription regulators remained identical at the amino acid level, thus constituting conserved components of the Cx. pipiens proteome. We provide a reference with which to gauge the divergence reported in this analysis by performing a comparison of transcriptome sequences from conspecific (yet allopatric) populations of another member of the Cx. pipiens complex, Cx. quinquefasciatus. PMID:25755934
Overvoorde, P J; Chao, W S; Grimes, H D
1997-06-20
Photoaffinity labeling of a soybean cotyledon membrane fraction identified a sucrose-binding protein (SBP). Subsequent studies have shown that the SBP is a unique plasma membrane protein that mediates the linear uptake of sucrose in the presence of up to 30 mM external sucrose when ectopically expressed in yeast. Analysis of the SBP-deduced amino acid sequence indicates it lacks sequence similarity with other known transport proteins. Data presented here, however, indicate that the SBP shares significant sequence and structural homology with the vicilin-like seed storage proteins that organize into homotrimers. These similarities include a repeated sequence that forms the basis of the reiterated domain structure characteristic of the vicilin-like protein family. In addition, analytical ultracentrifugation and nonreducing SDS-polyacrylamide gel electrophoresis demonstrate that the SBP appears to be organized into oligomeric complexes with a Mr indicative of the existence of SBP homotrimers and homodimers. The structural similarity shared by the SBP and vicilin-like proteins provides a novel framework to explore the mechanistic basis of SBP-mediated sucrose uptake. Expression of the maize Glb protein (a vicilin-like protein closely related to the SBP) in yeast demonstrates that a closely related vicilin-like protein is unable to mediate sucrose uptake. Thus, despite sequence and structural similarities shared by the SBP and the vicilin-like protein family, the SBP is functionally divergent from other members of this group.
[Hepatitis C virus: sequence homology of a European isolate and divergence from the prototype].
Seelig, R; Seelig, H P; Renz, M
1991-08-01
The polymerase chain reaction (PCR) detected specific hepatitis C viral (HCV) RNA sequences in liver biopsies from two patients with chronic hepatitis, in the tissue of a liver graft, in plasma from four chronic non-A, non-B hepatitis (NANBH) patients and, for the first time, in an infectious anti-D-immunoglobulin preparation. A comparison of the viral sequences coding for a region of the nonstructural NS3 protein from the liver tissues revealed only a very small degree of sequence divergence at both the cDNA and the amino acid level (between 0 and 5%). The sequence similarities of the RNA isolated from plasma of the four chronic NANBH patients and the anti-D-immunoglobulin preparation were in part somewhat lower but still high overall (between 90 and 100%). In contrast, all eight cDNA and amino acid sequences exhibited a significantly higher degree of divergence in comparison with the HCV prototype sequence (between 29 and 32%) than among themselves (between 0 and 10%). This unexpectedly high sequence similarity of the eight European isolates and their low homology to the North American prototype sequence is indicative of the existence of different types of HCV. This will be important not only for epidemiological studies but also for the development of effective diagnostic procedures and vaccines. Concerning the pathogenesis of NANBH, a double infection or a helper mechanism has to be considered: in addition to the C virus, sequences of another virus particle were found in the infectious IgG preparation as well as in the liver biopsies.
Horai, S; Hayasaka, K; Kondo, R; Tsugane, K; Takahata, N
1995-01-01
We analyzed the complete mitochondrial DNA (mtDNA) sequences of three humans (African, European, and Japanese), three African apes (common and pygmy chimpanzees, and gorilla), and one orangutan in an attempt to estimate most accurately the substitution rates and divergence times of hominoid mtDNAs. Nonsynonymous substitutions and substitutions in RNA genes have accumulated with an approximately clock-like regularity. From these substitutions and under the assumption that the orangutan and African apes diverged 13 million years ago, we obtained a divergence time for humans and chimpanzees of 4.9 million years. This divergence time permitted calibration of the synonymous substitution rate (3.89 x 10^-8/site per year). To obtain the substitution rate in the displacement (D)-loop region, we compared the three human mtDNAs and measured the relative abundance of substitutions in the D-loop region and at synonymous sites. The estimated substitution rate in the D-loop region was 7.00 x 10^-8/site per year. Using both synonymous and D-loop substitutions, we inferred the age of the last common ancestor of the human mtDNAs as 143,000 +/- 18,000 years. The shallow ancestry of human mtDNAs, together with the observation that the African sequence is the most diverged among humans, strongly supports the recent African origin of modern humans, Homo sapiens sapiens. PMID:7530363
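The clock-based dating described in this abstract can be illustrated with a minimal sketch. It assumes a strict molecular clock and a rate expressed per lineage, so that a pairwise distance d (substitutions per site) accumulated along both branches gives T = d / (2r). The function name and the example distance are hypothetical; only the synonymous rate is taken from the abstract, and whether the factor of 2 applies to that published rate is an assumption of this sketch.

```python
def divergence_time(distance, rate_per_site_per_year):
    """Years since two lineages split under a strict clock: T = d / (2 * r).

    Assumption: the rate is per lineage, so the pairwise distance
    accumulates along both branches (hence the factor of 2).
    """
    if rate_per_site_per_year <= 0:
        raise ValueError("rate must be positive")
    if distance < 0:
        raise ValueError("distance must be non-negative")
    return distance / (2.0 * rate_per_site_per_year)


# Synonymous rate quoted in the abstract (per site per year).
SYNONYMOUS_RATE = 3.89e-8

# Hypothetical pairwise synonymous distance, chosen only for illustration.
example_distance = 0.0389

years = divergence_time(example_distance, SYNONYMOUS_RATE)
print(f"Estimated divergence time: {years:,.0f} years")
```

The same function can be inverted to calibrate a rate from a known split: given a fossil-based divergence time T and an observed distance d, r = d / (2T).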
Population genomics of parallel hybrid zones in the mimetic butterflies, H. melpomene and H. erato
Ruiz, Mayté; Salazar, Patricio; Counterman, Brian; Medina, Jose Alejandro; Ortiz-Zuazaga, Humberto; Morrison, Anna; Papa, Riccardo
2014-01-01
Hybrid zones can be valuable tools for studying evolution and identifying genomic regions responsible for adaptive divergence and underlying phenotypic variation. Hybrid zones between subspecies of Heliconius butterflies can be very narrow and are maintained by strong selection acting on color pattern. The comimetic species, H. erato and H. melpomene, have parallel hybrid zones in which both species undergo a change from one color pattern form to another. We use restriction-associated DNA sequencing to obtain several thousand genome-wide sequence markers and use these to analyze patterns of population divergence across two pairs of parallel hybrid zones in Peru and Ecuador. We compare two approaches for analysis of this type of data—alignment to a reference genome and de novo assembly—and find that alignment gives the best results for species both closely (H. melpomene) and distantly (H. erato, ∼15% divergent) related to the reference sequence. Our results confirm that the color pattern controlling loci account for the majority of divergent regions across the genome, but we also detect other divergent regions apparently unlinked to color pattern differences. We also use association mapping to identify previously unmapped color pattern loci, in particular the Ro locus. Finally, we identify a new cryptic population of H. timareta in Ecuador, which occurs at relatively low altitude and is mimetic with H. melpomene malleti. PMID:24823669
Chloroplast Genome Evolution in Early Diverged Leptosporangiate Ferns
Kim, Hyoung Tae; Chung, Myong Gi; Kim, Ki-Joong
2014-01-01
In this study, the chloroplast (cp) genome sequences from three early diverged leptosporangiate ferns were completed and analyzed in order to understand the evolution of the genome of the fern lineages. The complete cp genome sequence of Osmunda cinnamomea (Osmundales) was 142,812 base pairs (bp). The cp genome structure was similar to that of eusporangiate ferns. The gene/intron losses that frequently occurred in the cp genome of leptosporangiate ferns were not found in the cp genome of O. cinnamomea. In addition, putative RNA editing sites in the cp genome were rare in O. cinnamomea, even though the sites were frequently predicted to be present in leptosporangiate ferns. The complete cp genome sequence of Diplopterygium glaucum (Gleicheniales) was 151,007 bp and has a 9.7 kb inversion between the trnL-CAA and trnV-GCA genes when compared to O. cinnamomea. Several repeated sequences were detected around the inversion break points. The complete cp genome sequence of Lygodium japonicum (Schizaeales) was 157,142 bp and a deletion of the rpoC1 intron was detected. This intron loss was shared by all of the studied species of the genus Lygodium. The GC contents and the effective numbers of codons (ENCs) in ferns varied significantly when compared to seed plants. The ENC values of the early diverged leptosporangiate ferns showed intermediate levels between eusporangiate and core leptosporangiate ferns. However, our phylogenetic tree based on all of the cp gene sequences clearly indicated that the cp genome similarity between O. cinnamomea (Osmundales) and eusporangiate ferns are symplesiomorphies, rather than synapomorphies. Therefore, our data is in agreement with the view that Osmundales is a distinct early diverged lineage in the leptosporangiate ferns. PMID:24823358
Chloroplast genome evolution in early diverged leptosporangiate ferns.
Kim, Hyoung Tae; Chung, Myong Gi; Kim, Ki-Joong
2014-05-01
In this study, the chloroplast (cp) genome sequences from three early diverged leptosporangiate ferns were completed and analyzed in order to understand the evolution of the genome of the fern lineages. The complete cp genome sequence of Osmunda cinnamomea (Osmundales) was 142,812 base pairs (bp). The cp genome structure was similar to that of eusporangiate ferns. The gene/intron losses that frequently occurred in the cp genome of leptosporangiate ferns were not found in the cp genome of O. cinnamomea. In addition, putative RNA editing sites in the cp genome were rare in O. cinnamomea, even though the sites were frequently predicted to be present in leptosporangiate ferns. The complete cp genome sequence of Diplopterygium glaucum (Gleicheniales) was 151,007 bp and has a 9.7 kb inversion between the trnL-CAA and trnV-GCA genes when compared to O. cinnamomea. Several repeated sequences were detected around the inversion break points. The complete cp genome sequence of Lygodium japonicum (Schizaeales) was 157,142 bp and a deletion of the rpoC1 intron was detected. This intron loss was shared by all of the studied species of the genus Lygodium. The GC contents and the effective numbers of codons (ENCs) in ferns varied significantly when compared to seed plants. The ENC values of the early diverged leptosporangiate ferns showed intermediate levels between eusporangiate and core leptosporangiate ferns. However, our phylogenetic tree based on all of the cp gene sequences clearly indicated that the cp genome similarity between O. cinnamomea (Osmundales) and eusporangiate ferns are symplesiomorphies, rather than synapomorphies. Therefore, our data is in agreement with the view that Osmundales is a distinct early diverged lineage in the leptosporangiate ferns.
Fourment, Mathieu; Holmes, Edward C
2014-07-24
Early methods for estimating divergence times from gene sequence data relied on the assumption of a molecular clock. More sophisticated methods were created to model rate variation and used auto-correlation of rates, local clocks, or the so-called "uncorrelated relaxed clock", where substitution rates are assumed to be drawn from a parametric distribution. In the case of Bayesian inference methods, the impact of the prior on branching times is not clearly understood, and if the amount of data is limited the posterior could be strongly influenced by the prior. We develop a maximum likelihood method--Physher--that uses local or discrete clocks to estimate evolutionary rates and divergence times from heterochronous sequence data. Using two empirical data sets we show that our discrete clock estimates are similar to those obtained by other methods, and that Physher outperformed some methods in the estimation of the root age of an influenza virus data set. A simulation analysis suggests that Physher can outperform a Bayesian method when the real topology contains two long branches below the root node, even when evolution is strongly clock-like. These results suggest it is advisable to use a variety of methods to estimate evolutionary rates and divergence times from heterochronous sequence data. Physher and the associated data sets used here are available online at http://code.google.com/p/physher/.
A DNA Barcode Library for North American Ephemeroptera: Progress and Prospects
Webb, Jeffrey M.; Jacobus, Luke M.; Funk, David H.; Zhou, Xin; Kondratieff, Boris; Geraci, Christy J.; DeWalt, R. Edward; Baird, Donald J.; Richard, Barton; Phillips, Iain; Hebert, Paul D. N.
2012-01-01
DNA barcoding of aquatic macroinvertebrates holds much promise as a tool for taxonomic research and for providing the reliable identifications needed for water quality assessment programs. A prerequisite for identification using barcodes is a reliable reference library. We gathered 4165 sequences from the barcode region of the mitochondrial cytochrome c oxidase subunit I gene representing 264 nominal and 90 provisional species of mayflies (Insecta: Ephemeroptera) from Canada, Mexico, and the United States. No species shared barcode sequences and all can be identified with barcodes with the possible exception of some Caenis. Minimum interspecific distances ranged from 0.3–24.7% (mean: 12.5%), while the average intraspecific divergence was 1.97%. The latter value was inflated by the presence of very high divergences in some taxa. In fact, nearly 20% of the species included two or three haplotype clusters showing greater than 5.0% sequence divergence and some values are as high as 26.7%. Many of the species with high divergences are polyphyletic and likely represent species complexes. Indeed, many of these polyphyletic species have numerous synonyms and individuals in some barcode clusters show morphological attributes characteristic of the synonymized species. In light of our findings, it is imperative that type or topotype specimens be sequenced to correctly associate barcode clusters with morphological species concepts and to determine the status of currently synonymized species. PMID:22666447
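The divergence figures in this abstract are pairwise distances between aligned barcode sequences. As a rough illustration, the sketch below computes the uncorrected p-distance (the proportion of differing sites) and flags pairs exceeding the 5.0% figure mentioned in the abstract. This is a simplified stand-in: barcoding studies commonly use a model-corrected distance, and the abstract does not say which measure was used. The function names and the toy sequences are hypothetical.

```python
def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap ('-') or an ambiguous base ('N')
    are skipped, so the result reflects only comparable positions.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def exceeds_threshold(seq_a, seq_b, threshold=0.05):
    """True when the p-distance is greater than the threshold (5% by default)."""
    return p_distance(seq_a, seq_b) > threshold


# Toy aligned fragments, invented for illustration only.
hap_1 = "ACGTACGTACGTACGTACGT"
hap_2 = "ACGTACGTACGTACGTACGA"  # one difference in 20 sites
hap_3 = "ACGAACGTTCGTACGTACGA"  # three differences relative to hap_1

print(p_distance(hap_1, hap_2), exceeds_threshold(hap_1, hap_2))
print(p_distance(hap_1, hap_3), exceeds_threshold(hap_1, hap_3))
```

A strict "greater than" comparison is used so that a pair at exactly 5% is not flagged, matching the abstract's wording of "greater than 5.0%".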
Zarza, Eugenia; Reynoso, Victor H; Emerson, Brent C
2008-07-01
While Quaternary climatic changes are considered by some to have been a major factor promoting speciation within the neotropics, others suggest that much of the neotropical species diversity originated before the Pleistocene. Using mitochondrial and nuclear sequence data, we evaluate the relative importance of Pleistocene and pre-Pleistocene events within the evolutionary history of the Mexican iguana Ctenosaura pectinata, and related species. Results support the existence of cryptic lineages with strong mitochondrial divergence (> 4%) among them. Some of these lineages form zones of secondary contact, with one of them hybridizing with C. hemilopha. Evolutionary network analyses reveal the oldest populations of C. pectinata to be those of the northern and southern Mexican coastal regions. Inland and mid-latitudinal coastal populations are younger in age as a consequence of a history of local extinction within these regions followed by re-colonization. Estimated divergence times suggest that C. pectinata originated during the Pliocene, whereas geographically distinct mitochondrial DNA lineages first started to diverge during the Pliocene, with subsequent divergence continuing through the Pleistocene. Our results highlight the influence of both Pliocene and Pleistocene events in shaping the geographical distribution of genetic variation within neotropical lowland organisms. Areas of high genetic diversity in southern Mexico were detected; this finding, together with the high levels of genetic diversity within C. pectinata, has implications for the conservation of this threatened species.
Sequence-structure relationships in RNA loops: establishing the basis for loop homology modeling.
Schudoma, Christian; May, Patrick; Nikiforova, Viktoria; Walther, Dirk
2010-01-01
The specific function of RNA molecules frequently resides in their seemingly unstructured loop regions. We performed a systematic analysis of RNA loops extracted from experimentally determined three-dimensional structures of RNA molecules. A comprehensive loop-structure data set was created and organized into distinct clusters based on structural and sequence similarity. We detected clear evidence of the hallmark of homology present in the sequence-structure relationships in loops. Loops differing by <25% in sequence identity fold into very similar structures. Thus, our results support the application of homology modeling for RNA loop model building. We established a threshold that may guide the sequence divergence-based selection of template structures for RNA loop homology modeling. Of all possible sequences that are, under the assumption of isosteric relationships, theoretically compatible with actual sequences observed in RNA structures, only a small fraction is contained in the Rfam database of RNA sequences and classes implying that the actual RNA loop space may consist of a limited number of unique loop structures and conserved sequences. The loop-structure data sets are made available via an online database, RLooM. RLooM also offers functionalities for the modeling of RNA loop structures in support of RNA engineering and design efforts.
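The threshold reported in this abstract suggests a simple way to shortlist template loops for a query sequence. The sketch below is one possible reading: it keeps same-length candidates whose sequences differ from the query by less than 25% of positions and ranks them closest first. The function names, the restriction to equal-length loops, and the toy sequences are assumptions made for illustration; this is not the RLooM selection procedure.

```python
def percent_difference(query, template):
    """Percentage of positions at which two equal-length sequences differ."""
    if len(query) != len(template):
        raise ValueError("sequences must have the same length")
    if not query:
        raise ValueError("sequences must be non-empty")
    mismatches = sum(1 for q, t in zip(query, template) if q != t)
    return 100.0 * mismatches / len(query)


def select_templates(query, candidates, max_difference=25.0):
    """Return (difference, name) pairs for usable templates, closest first.

    Candidates of a different length are skipped, and only those differing
    by less than max_difference percent are kept.
    """
    hits = []
    for name, sequence in candidates.items():
        if len(sequence) != len(query):
            continue
        difference = percent_difference(query, sequence)
        if difference < max_difference:
            hits.append((difference, name))
    return sorted(hits)


# Toy loop sequences, invented for illustration only.
candidates = {
    "loop_A": "GAAAUGCA",
    "loop_B": "GAAAUGCU",
    "loop_C": "CUUUACGU",
    "loop_D": "GAAA",  # different length, skipped
}

for difference, name in select_templates("GAAAUGCA", candidates):
    print(f"{name}: {difference:.1f}% different")
```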
Multi-species Identification of Polymorphic Peptide Variants via Propagation in Spectral Networks*
Bandeira, Nuno
2016-01-01
Peptide and protein identification remains challenging in organisms with poorly annotated or rapidly evolving genomes, as are commonly encountered in environmental or biofuels research. Such limitations render tandem mass spectrometry (MS/MS) database search algorithms ineffective as they lack corresponding sequences required for peptide-spectrum matching. We address this challenge with the spectral networks approach to (1) match spectra of orthologous peptides across multiple related species and then (2) propagate peptide annotations from identified to unidentified spectra. We here present algorithms to assess the statistical significance of spectral alignments (Align-GF), reduce the impurity in spectral networks, and accurately estimate the error rate in propagated identifications. Analyzing three related Cyanothece species, a model organism for biohydrogen production, spectral networks identified peptides from highly divergent sequences from networks with dozens of variant peptides, including thousands of peptides in species lacking a sequenced genome. Our analysis further detected the presence of many novel putative peptides even in genomically characterized species, thus suggesting the possibility of gaps in our understanding of their proteomic and genomic expression. A web-based pipeline for spectral networks analysis is available at http://proteomics.ucsd.edu/software. PMID:27609420
Population and genomic analysis of the genus Halorubrum
Fullmer, Matthew S.; Soucy, Shannon M.; Swithers, Kristen S.; Makkay, Andrea M.; Wheeler, Ryan; Ventosa, Antonio; Gogarten, J. Peter; Papke, R. Thane
2014-01-01
The Halobacteria are known to engage in frequent gene transfer and homologous recombination. For stably diverged lineages to persist some checks on the rate of between lineage recombination must exist. We surveyed a group of isolates from the Aran-Bidgol endorheic lake in Iran and sequenced a selection of them. Multilocus Sequence Analysis (MLSA) and Average Nucleotide Identity (ANI) revealed multiple clusters (phylogroups) of organisms present in the lake. Patterns of intein and Clustered Regularly Interspaced Short Palindromic Repeats (CRISPRs) presence/absence and their sequence similarity, GC usage along with the ANI and the identities of the genes used in the MLSA revealed that two of these clusters share an exchange bias toward others in their phylogroup while showing reduced rates of exchange with other organisms in the environment. However, a third cluster, composed in part of named species from other areas of central Asia, displayed many indications of variability in exchange partners, from within the lake as well as outside the lake. We conclude that barriers to gene exchange exist between the two purely Aran-Bidgol phylogroups, and that the third cluster with members from other regions is not a single population and likely reflects an amalgamation of several populations. PMID:24782836
Mihalov-Kovács, Eszter; Martella, Vito; Lanave, Gianvito; Bodnar, Livia; Fehér, Enikő; Marton, Szilvia; Kemenesi, Gábor; Jakab, Ferenc; Bányai, Krisztián
2017-03-15
Canine astrovirus RNA was detected in the stools of 17/63 (26.9%) samples, using either a broadly reactive consensus RT-PCR for astroviruses or random RT-PCR coupled with massive deep sequencing. The complete or nearly complete genome sequences of five canine astroviruses were reconstructed, which allowed the genome organization to be mapped and the genetic diversity of these viruses to be investigated. The genome was about 6.6kb in length and contained three open reading frames (ORFs) flanked by a 5' UTR, and a 3' UTR plus a poly-A tail. ORF1a and ORF1b overlapped by 43 nucleotides while the ORF2 overlapped by 8 nucleotides with the 3' end of ORF1b. Upon genome comparison, four strains (HUN/2012/2, HUN/2012/6, HUN/2012/115, and HUN/2012/135) were more related genetically to each other and to UK canine astroviruses (88-96% nt identity), whilst strain HUN/2012/126 was more divergent (75-76% nt identity). In the ORF1b and ORF2, strains HUN/2012/2, HUN/2012/6, and HUN/2012/135 were related genetically to other canine astroviruses identified formerly in Europe and China, whereas strain HUN/2012/126 was related genetically to a divergent canine astrovirus strain, ITA/2010/Zoid. For one canine astrovirus, HUN/2012/8, only a 3.2kb portion of the genome, at the 3' end, could be determined. Interestingly, this strain possessed unique genetic signatures (including a longer ORF1b/ORF2 overlap and a longer 3'UTR) and it was divergent in both ORF1b and ORF2 from all other canine astroviruses, with the highest nucleotide sequence identity (68% and 63%, respectively) to a mink astrovirus, thus suggesting a possible event of interspecies transmission. The genetic heterogeneity of canine astroviruses may pose a challenge for the diagnostics and for future prophylaxis strategies.
Hirata, Daisuke; Mano, Tsutomu; Abramov, Alexei V; Baryshnikov, Gennady F; Kosintsev, Pavel A; Vorobiev, Alexandr A; Raichev, Evgeny G; Tsunoda, Hiroshi; Kaneko, Yayoi; Murata, Koichi; Fukui, Daisuke; Masuda, Ryuichi
2013-07-01
To further elucidate the migration history of the brown bears (Ursus arctos) on Hokkaido Island, Japan, we analyzed the complete mitochondrial DNA (mtDNA) sequences of 35 brown bears from Hokkaido, the southern Kuril Islands (Etorofu and Kunashiri), Sakhalin Island, and the Eurasian Continent (continental Russia, Bulgaria, and Tibet), and those of four polar bears. Based on these sequences, we reconstructed the maternal phylogeny of the brown bear and estimated divergence times to investigate the timing of brown bear migrations, especially in northeastern Eurasia. Our gene tree showed the mtDNA haplotypes of all 73 brown and polar bears to be divided into eight divergent lineages. The brown bear on Hokkaido was divided into three lineages (central, eastern, and southern). The Sakhalin brown bear grouped with eastern European and western Alaskan brown bears. Etorofu and Kunashiri brown bears were closely related to eastern Hokkaido brown bears and could have diverged from the eastern Hokkaido lineage after formation of the channel between Hokkaido and the southern Kuril Islands. Tibetan brown bears diverged early in the eastern lineage. Southern Hokkaido brown bears were closely related to North American brown bears.
Variable sexually dimorphic gene expression in laboratory strains of Drosophila melanogaster.
Baker, Dean A; Meadows, Lisa A; Wang, Jing; Dow, Julian At; Russell, Steven
2007-12-10
Wild-type laboratory strains of model organisms are typically kept in isolation for many years, with the action of genetic drift and selection on mutational variation causing lineages to diverge with time. Natural populations from which such strains are established show that gender-specific interactions in particular drive many aspects of sequence level and transcriptional level variation. Here, our goal was to identify genes that display transcriptional variation between laboratory strains of Drosophila melanogaster, and to explore evidence of gender-biased interactions underlying that variability. Transcriptional variation among the laboratory genotypes studied occurs more frequently in males than in females. Qualitative differences are also apparent, suggesting that genes within particular functional classes disproportionately display variation in gene expression. Our analysis indicates that genes with reproductive functions are most often divergent between genotypes in both sexes; however, a large proportion of female variation can also be attributed to genes without expression in the ovaries. The present study clearly shows that transcriptional variation between common laboratory strains of Drosophila can differ dramatically due to sexual dimorphism. Much of this variation reflects sex-specific challenges associated with divergent physiological trade-offs, morphology and regulatory pathways operating within males and females.
Chan, Yvonne H.; Venev, Sergey V.; Zeldovich, Konstantin B.; Matthews, C. Robert
2017-01-01
Sequence divergence of orthologous proteins enables adaptation to environmental stresses and promotes evolution of novel functions. Limits on evolution imposed by constraints on sequence and structure were explored using a model TIM barrel protein, indole-3-glycerol phosphate synthase (IGPS). Fitness effects of point mutations in three phylogenetically divergent IGPS proteins during adaptation to temperature stress were probed by auxotrophic complementation of yeast with prokaryotic, thermophilic IGPS. Analysis of beneficial mutations pointed to an unexpected, long-range allosteric pathway towards the active site of the protein. Significant correlations between the fitness landscapes of distant orthologues implicate both sequence and structure as primary forces in defining the TIM barrel fitness landscape and suggest that fitness landscapes can be translocated in sequence space. Exploration of fitness landscapes in the context of a protein fold provides a strategy for elucidating the sequence-structure-fitness relationships in other common motifs. PMID:28262665
A High-Density Linkage Map for Astyanax mexicanus Using Genotyping-by-Sequencing Technology
Carlson, Brian M.; Onusko, Samuel W.; Gross, Joshua B.
2014-01-01
The Mexican tetra, Astyanax mexicanus, is a unique model system consisting of cave-adapted and surface-dwelling morphotypes that diverged >1 million years (My) ago. This remarkable natural experiment has enabled powerful genetic analyses of cave adaptation. Here, we describe the application of next-generation sequencing technology to the creation of a high-density linkage map. Our map comprises more than 2200 markers populating 25 linkage groups constructed from genotypic data generated from a single genotyping-by-sequencing project. We leveraged emergent genomic and transcriptomic resources to anchor hundreds of anonymous Astyanax markers to the genome of the zebrafish (Danio rerio), the most closely related model organism to our study species. This facilitated the identification of 784 distinct connections between our linkage map and the Danio rerio genome, highlighting several regions of conserved genomic architecture between the two species despite ∼150 My of divergence. Using a Mendelian cave-associated trait as a proof-of-principle, we successfully recovered the genomic position of the albinism locus near the gene Oca2. Further, our map successfully informed the positions of unplaced Astyanax genomic scaffolds within particular linkage groups. This ability to identify the relative location, orientation, and linear order of unaligned genomic scaffolds will facilitate ongoing efforts to improve on the current early draft and assemble future versions of the Astyanax physical genome. Moreover, this improved linkage map will enable higher-resolution genetic analyses and catalyze the discovery of the genetic basis for cave-associated phenotypes. PMID:25520037
Shih, Kai-Ming; Chang, Chung-Te; Chung, Jeng-Der; Chiang, Yu-Chung; Hwang, Shih-Ying
2018-01-01
Double digest restriction site-associated DNA sequencing (ddRADseq) is a tool for delivering genome-wide single nucleotide polymorphism (SNP) markers for non-model organisms useful in resolving fine-scale population structure and detecting signatures of selection. This study performs population genetic analysis, based on ddRADseq data, of a coniferous species, Keteleeria davidiana var. formosana, disjunctly distributed in northern and southern Taiwan, for investigation of population adaptive divergence in response to environmental heterogeneity. A total of 13,914 SNPs were detected and used to assess genetic diversity, FST outlier detection, population genetic structure, and individual assignments of five populations (62 individuals) of K. davidiana var. formosana. Principal component analysis (PCA), individual assignments, and the neighbor-joining tree were successful in differentiating individuals between northern and southern populations of K. davidiana var. formosana, but apparent gene flow between the southern DW30 population and northern populations was also revealed. Fifteen of 23 highly differentiated SNPs identified were found to be strongly associated with environmental variables, suggesting isolation-by-environment (IBE). However, multiple matrix regression with randomization analysis revealed strong IBE as well as significant isolation-by-distance. Environmental impacts on divergence were found between populations of the North and South regions and also between the two southern neighboring populations. BLASTN annotation of the sequences flanking outlier SNPs gave significant hits for three of 23 markers that might have biological relevance to mitochondrial homeostasis involved in the survival of locally adapted lineages. Species delimitation between K. davidiana var. formosana and its ancestor, K. davidiana, was also examined (72 individuals). 
This study has produced highly informative population genomic data for the understanding of population attributes, such as diversity, connectivity, and adaptive divergence associated with large- and small-scale environmental heterogeneity in K. davidiana var. formosana. PMID:29449860
Beaudet, Denis; Terrat, Yves; Halary, Sébastien; de la Providencia, Ivan Enrique; Hijri, Mohamed
2013-01-01
Comparative mitochondrial genomics of arbuscular mycorrhizal fungi (AMF) provide new avenues to overcome long-lasting obstacles that have hampered studies aimed at understanding the community structure, diversity, and evolution of these multinucleated and genetically polymorphic organisms. AMF mitochondrial (mt) genomes are homogeneous within isolates, and their intergenic regions harbor numerous mobile elements that have rapidly diverged, including homing endonuclease genes, small inverted repeats, and plasmid-related DNA polymerase genes (dpo), making them suitable targets for the development of reliable strain-specific markers. However, these elements may also lead to genome rearrangements through homologous recombination, although this has never previously been reported in this group of obligate symbiotic fungi. To investigate whether such rearrangements are present and caused by mobile elements in AMF, the mitochondrial genomes from two Glomeraceae members (i.e., Glomus cerebriforme and Glomus sp.) with substantial mtDNA synteny divergence, were sequenced and compared with available glomeromycotan mitochondrial genomes. We used an extensive nucleotide/protein similarity network-based approach to investigate dpo diversity in AMF as well as in other organisms for which sequences are publicly available. We provide strong evidence of dpo-induced inter-haplotype recombination, leading to a reshuffled mitochondrial genome in Glomus sp. These findings raise questions as to whether AMF single spore cultivations artificially underestimate mtDNA genetic diversity. We assessed potential dpo dispersal mechanisms in AMF and inferred a robust phylogenetic relationship with plant mitochondrial plasmids. Along with other indirect evidence, our analyses indicate that members of the Glomeromycota phylum are potential donors of mitochondrial plasmids to plants.
Beaudet, Denis; Terrat, Yves; Halary, Sébastien; de la Providencia, Ivan Enrique; Hijri, Mohamed
2013-01-01
Comparative mitochondrial genomics of arbuscular mycorrhizal fungi (AMF) provide new avenues to overcome long-lasting obstacles that have hampered studies aimed at understanding the community structure, diversity, and evolution of these multinucleated and genetically polymorphic organisms. AMF mitochondrial (mt) genomes are homogeneous within isolates, and their intergenic regions harbor numerous mobile elements that have rapidly diverged, including homing endonuclease genes, small inverted repeats, and plasmid-related DNA polymerase genes (dpo), making them suitable targets for the development of reliable strain-specific markers. However, these elements may also lead to genome rearrangements through homologous recombination, although this has never previously been reported in this group of obligate symbiotic fungi. To investigate whether such rearrangements are present and caused by mobile elements in AMF, the mitochondrial genomes from two Glomeraceae members (i.e., Glomus cerebriforme and Glomus sp.) with substantial mtDNA synteny divergence, were sequenced and compared with available glomeromycotan mitochondrial genomes. We used an extensive nucleotide/protein similarity network-based approach to investigate dpo diversity in AMF as well as in other organisms for which sequences are publicly available. We provide strong evidence of dpo-induced inter-haplotype recombination, leading to a reshuffled mitochondrial genome in Glomus sp. These findings raise questions as to whether AMF single spore cultivations artificially underestimate mtDNA genetic diversity. We assessed potential dpo dispersal mechanisms in AMF and inferred a robust phylogenetic relationship with plant mitochondrial plasmids. Along with other indirect evidence, our analyses indicate that members of the Glomeromycota phylum are potential donors of mitochondrial plasmids to plants. PMID:23925788
NASA Technical Reports Server (NTRS)
Romano, Laura A.; Wray, Gregory A.
2003-01-01
Evolutionary changes in transcriptional regulation undoubtedly play an important role in creating morphological diversity. However, there is little information about the evolutionary dynamics of cis-regulatory sequences. This study examines the functional consequence of evolutionary changes in the Endo16 promoter of sea urchins. The Endo16 gene encodes a large extracellular protein that is expressed in the endoderm and may play a role in cell adhesion. Its promoter has been characterized in exceptional detail in the purple sea urchin, Strongylocentrotus purpuratus. We have characterized the structure and function of the Endo16 promoter from a second sea urchin species, Lytechinus variegatus. The Endo16 promoter sequences have evolved in a strongly mosaic manner since these species diverged approximately 35 million years ago: the most proximal region (module A) is conserved, but the remaining modules (B-G) are unalignable. Despite extensive divergence in promoter sequences, the pattern of Endo16 transcription is largely conserved during embryonic and larval development. Transient expression assays demonstrate that 2.2 kb of upstream sequence in either species is sufficient to drive GFP reporter expression that correctly mimics this pattern of Endo16 transcription. Reciprocal cross-species transient expression assays imply that changes have also evolved in the set of transcription factors that interact with the Endo16 promoter. Taken together, these results suggest that stabilizing selection on the transcriptional output may have operated to maintain a similar pattern of Endo16 expression in S. purpuratus and L. variegatus, despite dramatic divergence in promoter sequence and mechanisms of transcriptional regulation.
Protein sectors: evolutionary units of three-dimensional structure
Halabi, Najeeb; Rivoire, Olivier; Leibler, Stanislas; Ranganathan, Rama
2011-01-01
Proteins display a hierarchy of structural features at primary, secondary, tertiary, and higher-order levels, an organization that guides our current understanding of their biological properties and evolutionary origins. Here, we reveal a structural organization distinct from this traditional hierarchy by statistical analysis of correlated evolution between amino acids. Applied to the S1A serine proteases, the analysis indicates a decomposition of the protein into three quasi-independent groups of correlated amino acids that we term “protein sectors”. Each sector is physically connected in the tertiary structure, has a distinct functional role, and constitutes an independent mode of sequence divergence in the protein family. Functionally relevant sectors are evident in other protein families as well, suggesting that they may be general features of proteins. We propose that sectors represent a structural organization of proteins that reflects their evolutionary histories. PMID:19703402
Schneider, Sean E.; Thomas, James H.
2014-01-01
We show here that 105 regions in two Lepidoptera genomes appear to derive from horizontally transferred wasp DNA. We experimentally verified the presence of two of these sequences in a diverse set of silkworm (Bombyx mori) genomes. We hypothesize that these horizontal transfers are made possible by the unusual strategy many parasitoid wasps employ of injecting hosts with endosymbiotic polydnaviruses to minimize the host's defense response. Because these virus-like particles deliver wasp DNA to the cells of the host, there has been much interest in whether genetic information can be permanently transferred from the wasp to the host. Two transferred sequences code for a BEN domain, known to be associated with polydnaviruses and transcriptional regulation. These findings represent the first documented cases of horizontal transfer of genes between two organisms by a polydnavirus. This presents an interesting evolutionary paradigm in which host species can acquire new sequences from parasitoid wasps that attack them. Hymenoptera and Lepidoptera diverged ∼300 MYA, making this type of event a source of novel sequences for recipient species. Unlike many other cases of horizontal transfer between two eukaryote species, these sequence transfers can be explained without the need to invoke the sequences ‘hitchhiking’ on a third organism (e.g. retrovirus) capable of independent reproduction. The cellular machinery necessary for the transfer is contained entirely in the wasp genome. The work presented here is the first such discovery of what is likely to be a broader phenomenon among species affected by these wasps. PMID:25296163
SOMKE: kernel density estimation over data streams by sequences of self-organizing maps.
Cao, Yuan; He, Haibo; Man, Hong
2012-08-01
In this paper, we propose a novel method, SOMKE, for kernel density estimation (KDE) over data streams based on sequences of self-organizing maps (SOMs). In many stream data mining applications, the traditional KDE methods are infeasible because of their high computational cost, processing time, and memory requirements. To reduce the time and space complexity, we propose a SOM structure to obtain well-defined data clusters for estimating the underlying probability distributions of incoming data streams. The main idea is to build a series of SOMs over the data streams via two operations: creating and merging the SOM sequences. The creation phase produces the SOM sequence entries for windows of the data, capturing clustering information about the incoming data streams. The size of the SOM sequences can be further reduced by combining consecutive entries in the sequence based on the Kullback-Leibler divergence. Finally, the probability density functions over arbitrary time periods along the data streams can be estimated using such SOM sequences. We compare SOMKE with two other KDE methods for data streams, the M-kernel approach and the cluster kernel approach, in terms of accuracy and processing time for various stationary data streams. Furthermore, we also investigate the use of SOMKE over nonstationary (evolving) data streams, including a synthetic nonstationary data stream, a real-world financial data stream and a group of network traffic data streams. The simulation results illustrate the effectiveness and efficiency of the proposed approach.
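The merging step described in the abstract above can be illustrated with a minimal sketch. It is not the SOMKE algorithm itself: the abstract does not say how the Kullback-Leibler divergence is computed between SOM entries, so this example assumes each window has already been summarized as a discrete distribution over a fixed set of bins. The function names, the `threshold` parameter, and the weighted-average merge rule are all hypothetical.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """Return D_KL(p || q) for two discrete distributions of equal length."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:
            # eps guards against division by zero when q has an empty bin.
            total += pi * math.log(pi / max(qi, eps))
    return total


def merge_consecutive(entries, threshold):
    """Greedily merge adjacent entries whose KL divergence is below threshold.

    entries: list of (weight, distribution) tuples, one per data window.
    The merge rule (weighted average) is an assumption made for illustration.
    """
    if not entries:
        return []
    merged = [entries[0]]
    for weight, dist in entries[1:]:
        last_weight, last_dist = merged[-1]
        if kl_divergence(last_dist, dist) < threshold:
            total = last_weight + weight
            combined = [
                (last_weight * a + weight * b) / total
                for a, b in zip(last_dist, dist)
            ]
            merged[-1] = (total, combined)
        else:
            merged.append((weight, dist))
    return merged
```

With three windows where the first two share the same distribution and the third differs strongly, the first two collapse into a single entry of weight 2 and the third is kept separate.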
Jeukens, Julie; Bernatchez, Louis
2012-01-01
While gene expression divergence is known to be involved in adaptive phenotypic divergence and speciation, the relative importance of regulatory and structural evolution of genes is poorly understood. A recent next-generation sequencing experiment identified candidate genes potentially involved in the ongoing speciation of sympatric dwarf and normal lake whitefish (Coregonus clupeaformis), such as cytosolic malate dehydrogenase (MDH1), which showed both significant expression and sequence divergence. The main goal of this study was to investigate in more detail the signatures of natural selection in the regulatory and coding sequences of MDH1 in lake whitefish and to test for parallelism of these signatures with other coregonine species. Sequencing of the two regions in 118 fish from four sympatric pairs of whitefish and two cisco species revealed a total of 35 single nucleotide polymorphisms (SNPs), with more genetic diversity in European than in North American coregonine species. While the coding region was found to be under purifying selection, an SNP in the proximal promoter exhibited significant allele frequency divergence in a parallel manner among independent sympatric pairs of North American lake whitefish and European whitefish (C. lavaretus). According to transcription factor binding simulation for 22 regulatory haplotypes of MDH1, putative binding profiles were fairly conserved among species, except for the region around this SNP. Moreover, we found evidence for the role of this SNP in the regulation of MDH1 expression level. Overall, these results provide further evidence for the role of natural selection in gene regulation evolution among whitefish species pairs and suggest its possible link with patterns of phenotypic diversity observed in coregonine species. PMID:22408741
Expression Divergence Is Correlated with Sequence Evolution but Not Positive Selection in Conifers.
Hodgins, Kathryn A; Yeaman, Sam; Nurkowski, Kristin A; Rieseberg, Loren H; Aitken, Sally N
2016-06-01
The evolutionary and genomic determinants of sequence evolution in conifers are poorly understood, and previous studies have found only limited evidence for positive selection. Using RNAseq data, we compared gene expression profiles to patterns of divergence and polymorphism in 44 seedlings of lodgepole pine (Pinus contorta) and 39 seedlings of interior spruce (Picea glauca × engelmannii) to elucidate the evolutionary forces that shape their genomes and their plastic responses to abiotic stress. We found that rapidly diverging genes tend to have greater expression divergence, lower expression levels, reduced levels of synonymous site diversity, and longer proteins than slowly diverging genes. Similar patterns were identified for the untranslated regions, but with some exceptions. We found evidence that genes with low expression levels had a larger fraction of nearly neutral sites, suggesting a primary role for negative selection in determining the association between evolutionary rate and expression level. There was limited evidence for differences in the rate of positive selection among genes with divergent versus conserved expression profiles and some evidence supporting relaxed selection in genes diverging in expression between the species. Finally, we identified a small number of genes that showed evidence of site-specific positive selection using divergence data alone. However, estimates of the proportion of sites fixed by positive selection (α) were in the range of other plant species with large effective population sizes suggesting relatively high rates of adaptive divergence among conifers. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Patterns and rates of intron divergence between humans and chimpanzees
Gazave, Elodie; Marqués-Bonet, Tomàs; Fernando, Olga; Charlesworth, Brian; Navarro, Arcadi
2007-01-01
Background Introns, which constitute the largest fraction of eukaryotic genes and which had been considered to be neutral sequences, are increasingly acknowledged as having important functions. Several studies have investigated levels of evolutionary constraint along introns and across classes of introns of different length and location within genes. However, thus far these studies have yielded contradictory results. Results We present the first analysis of human-chimpanzee intron divergence, in which differences in the number of substitutions per intronic site (Ki) can be interpreted as the footprint of different intensities and directions of the pressures of natural selection. Our main findings are as follows: there was a strong positive correlation between intron length and divergence; there was a strong negative correlation between intron length and GC content; and divergence rates vary along introns and depending on their ordinal position within genes (for instance, first introns are more GC rich, longer and more divergent, and divergence is lower at the 3' and 5' ends of all types of introns). Conclusion We show that the higher divergence of first introns is related to their larger size. Also, the lower divergence of short introns suggests that they may harbor a relatively greater proportion of regulatory elements than long introns. Moreover, our results are consistent with the presence of functionally relevant sequences near the 5' and 3' ends of introns. Finally, our findings suggest that other parts of introns may also be under selective constraints. PMID:17309804
Lexer, C; Wüest, R O; Mangili, S; Heuertz, M; Stölting, K N; Pearman, P B; Forest, F; Salamin, N; Zimmermann, N E; Bossolini, E
2014-09-01
Understanding the drivers of population divergence, speciation and species persistence is of great interest to molecular ecology, especially for species-rich radiations inhabiting the world's biodiversity hotspots. The toolbox of population genomics holds great promise for addressing these key issues, especially if genomic data are analysed within a spatially and ecologically explicit context. We have studied the earliest stages of the divergence continuum in the Restionaceae, a species-rich and ecologically important plant family of the Cape Floristic Region (CFR) of South Africa, using the widespread CFR endemic Restio capensis (L.) H.P. Linder & C.R. Hardy as an example. We studied diverging populations of this morphotaxon for plastid DNA sequences and >14 400 nuclear DNA polymorphisms from Restriction site Associated DNA (RAD) sequencing and analysed the results jointly with spatial, climatic and phytogeographic data, using a Bayesian generalized linear mixed modelling (GLMM) approach. The results indicate that population divergence across the extreme environmental mosaic of the CFR is mostly driven by isolation by environment (IBE) rather than isolation by distance (IBD) for both neutral and non-neutral markers, consistent with genome hitchhiking or coupling effects during early stages of divergence. Mixed modelling of plastid DNA and single divergent outlier loci from a Bayesian genome scan confirmed the predominant role of climate and pointed to additional drivers of divergence, such as drift and ecological agents of selection captured by phytogeographic zones. Our study demonstrates the usefulness of population genomics for disentangling the effects of IBD and IBE along the divergence continuum often found in species radiations across heterogeneous ecological landscapes. © 2014 John Wiley & Sons Ltd.
Burgess, Diane; Freeling, Michael
2014-01-01
In vertebrates, conserved noncoding elements (CNEs) are functionally constrained sequences that can show striking conservation over >400 million years of evolutionary distance and frequently are located megabases away from target developmental genes. Conserved noncoding sequences (CNSs) in plants are much shorter, and it has been difficult to detect conservation among distantly related genomes. In this article, we show not only that CNS sequences can be detected throughout the eudicot clade of flowering plants, but also that a subset of 37 CNSs can be found in all flowering plants (diverging ∼170 million years ago). These CNSs are functionally similar to vertebrate CNEs, being highly associated with transcription factor and development genes and enriched in transcription factor binding sites. Some of the most highly conserved sequences occur in genes encoding RNA binding proteins, particularly the RNA splicing–associated SR genes. Differences in sequence conservation between plants and animals are likely to reflect differences in the biology of the organisms, with plants being much more able to tolerate genomic deletions and whole-genome duplication events due, in part, to their far greater fecundity compared with vertebrates. PMID:24681619
Short and long-term genome stability analysis of prokaryotic genomes.
Brilli, Matteo; Liò, Pietro; Lacroix, Vincent; Sagot, Marie-France
2013-05-08
Gene organization dynamics is actively studied because it provides useful evolutionary information, makes functional annotation easier and often helps to characterize pathogens. There is therefore a strong interest in understanding the variability of this trait and its possible correlations with life-style. Two kinds of events affect genome organization: on one hand, translocations and recombinations change the relative position of genes shared by two genomes (i.e. the backbone gene order); on the other, insertions and deletions leave the backbone gene order unchanged but alter the gene neighborhoods by breaking the syntenic regions. A complete picture of genome organization evolution therefore requires accounting for both kinds of events. We developed an approach in which we model chromosomes as graphs on which we compute different stability estimators; we consider genome rearrangements as well as the effect of gene insertions and deletions. In the first part of the paper, we fit a measure of backbone gene order conservation (hereinafter called backbone stability) against phylogenetic distance for over 3000 genome comparisons, improving existing models for the divergence in time of backbone stability. Intra- and inter-specific comparisons were treated separately to focus on different time-scales. The use of multiple genomes of the same species made it possible to identify genomes whose gene order diverges from that of their conspecifics. The inter-species analysis indicates that pathogens are more often unstable than non-pathogens. In the second part of the paper, we show that in pathogens, gene content dynamics (insertions and deletions) have a much more dramatic effect on genome organization stability than backbone rearrangements. In this work, we studied genome organization divergence taking into account the contribution of both genome order rearrangements and genome content dynamics.
By studying species with multiple sequenced genomes available, we were able to explore genome organization stability at different time-scales and to find significant differences between pathogen and non-pathogen species. The output of our framework also identifies the conserved gene clusters and/or partial occurrences thereof, making it possible to explore how gene clusters assembled during evolution.
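The distinction drawn above between backbone rearrangements and insertions or deletions can be made concrete with a small sketch. The abstract does not define its backbone stability estimator, so the measure below (the fraction of gene adjacencies in the shared backbone that are conserved between two genomes) is only a stand-in chosen for illustration. All function names are hypothetical, and the sketch assumes each gene identifier occurs once per genome.

```python
def backbone_order(genome, shared):
    """Keep only genes shared by both genomes, preserving their order."""
    return [gene for gene in genome if gene in shared]


def adjacency_set(order, circular=True):
    """Return the set of unordered neighbouring gene pairs in a gene order."""
    pairs = set()
    n = len(order)
    last = n if circular else n - 1
    for i in range(last):
        pairs.add(frozenset((order[i], order[(i + 1) % n])))
    return pairs


def adjacency_conservation(genome_a, genome_b, circular=True):
    """Fraction of backbone adjacencies of genome_a also present in genome_b.

    Illustrative proxy only; not the estimator used in the paper.
    """
    shared = set(genome_a) & set(genome_b)
    if len(shared) < 2:
        return 0.0
    adj_a = adjacency_set(backbone_order(genome_a, shared), circular)
    adj_b = adjacency_set(backbone_order(genome_b, shared), circular)
    return len(adj_a & adj_b) / len(adj_a)
```

Under this proxy, an inserted gene that is absent from the other genome leaves the score at 1.0, because the backbone order is unchanged, whereas swapping two backbone genes lowers it.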
Larson, Wesley A; Seeb, Lisa W; Everett, Meredith V; Waples, Ryan K; Templin, William D; Seeb, James E
2014-01-01
Recent advances in population genomics have made it possible to detect previously unidentified structure, obtain more accurate estimates of demographic parameters, and explore adaptive divergence, potentially revolutionizing the way genetic data are used to manage wild populations. Here, we identified 10 944 single-nucleotide polymorphisms using restriction-site-associated DNA (RAD) sequencing to explore population structure, demography, and adaptive divergence in five populations of Chinook salmon (Oncorhynchus tshawytscha) from western Alaska. Patterns of population structure were similar to those of past studies, but our ability to assign individuals back to their region of origin was greatly improved (>90% accuracy for all populations). We also calculated effective size with and without removing physically linked loci identified from a linkage map, a novel method for nonmodel organisms. Estimates of effective size were generally above 1000 and were biased downward when physically linked loci were not removed. Outlier tests based on genetic differentiation identified 733 loci and three genomic regions under putative selection. These markers and genomic regions are excellent candidates for future research and can be used to create high-resolution panels for genetic monitoring and population assignment. This work demonstrates the utility of genomic data to inform conservation in highly exploited species with shallow population structure. PMID:24665338
Endo, Megumi; Hirose, Mamiko; Honda, Masanao; Koga, Hiroyuki; Morino, Yoshiaki; Kiyomoto, Masato; Wada, Hiroshi
2018-06-15
The marine environment around Japan experienced significant changes during the Cenozoic Era. In this study, we report findings suggesting that this dynamic history left behind traces in the genomes of the Japanese sand dollar species Peronella japonica and P. rubra. Although mitochondrial Cytochrome C Oxidase I sequences did not indicate fragmentation of the current local populations of P. japonica around Japan, two different types of intron sequence were found in the Alx1 locus. We inferred that past fragmentation of the populations accounts for the presence of two types of nuclear sequences as alleles in the Alx1 intron of P. japonica. It is likely that the split populations have intermixed in recent times; hence, we did not detect polymorphisms in the sequences reflecting the current localization of the species. In addition, we found two allelic sequences of the Alx1 intron in the sister species P. rubra. The divergence times of the two types of Alx1 intron sequences were estimated at approximately 14.9 and 4.0 million years ago for P. japonica and P. rubra, respectively. Our study indicates that information from the intron sequences of nuclear genes can enhance our understanding of past genetic events in organisms. Copyright © 2018 Elsevier B.V. All rights reserved.
Baurens, Franc-Christophe; Bocs, Stéphanie; Rouard, Mathieu; Matsumoto, Takashi; Miller, Robert N G; Rodier-Goud, Marguerite; MBéguié-A-MBéguié, Didier; Yahiaoui, Nabila
2010-07-16
Comparative sequence analysis of complex loci such as resistance gene analog clusters makes it possible to estimate the degree of sequence conservation and the mechanisms of divergence at the intraspecies level. In banana (Musa sp.), two diploid wild species, Musa acuminata (A genome) and Musa balbisiana (B genome), contribute to the polyploid genome of many cultivars. The M. balbisiana species is associated with vigour and tolerance to pests and disease, and little is known about the genome structure and haplotype diversity within this species. Here, we compare two genomic sequences of 253 and 223 kb corresponding to two haplotypes of the RGA08 resistance gene analog locus in M. balbisiana "Pisang Klutuk Wulung" (PKW). Sequence comparison revealed two regions of contrasting features. The first is a highly colinear gene-rich region where the two haplotypes diverge only by single nucleotide polymorphisms and two repetitive element insertions. The second corresponds to a large cluster of RGA08 genes, with 13 and 18 predicted RGA genes and pseudogenes spread over 131 and 152 kb respectively on each haplotype. The RGA08 cluster is enriched in repetitive element insertions and in duplicated non-coding intergenic sequences, including low complexity regions, and shows structural variations between haplotypes. Although some allelic relationships are retained, a large diversity of RGA08 genes occurs in this single M. balbisiana genotype, with several RGA08 paralogs specific to each haplotype. The RGA08 gene family has evolved by mechanisms of unequal recombination, intragenic sequence exchange and diversifying selection. An unequal recombination event taking place between duplicated non-coding intergenic sequences resulted in a different RGA08 gene content between haplotypes, pointing to the role of such duplicated regions in the evolution of RGA clusters. Based on the synonymous substitution rate in coding sequences, we estimated a 1 million year divergence time for these M. balbisiana haplotypes.
A large RGA08 gene cluster identified in wild banana corresponds to a highly variable genomic region between haplotypes surrounded by conserved flanking regions. The high level of sequence identity (70 to 99%) of the genic and intergenic regions suggests a recent and rapid evolution of this cluster in M. balbisiana.
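A divergence time derived from synonymous substitutions is commonly obtained from the molecular clock relation T = Ks / (2r), where Ks is the number of synonymous substitutions per synonymous site between the two sequences and r is the substitution rate per site per year. The sketch below shows that arithmetic. The abstract reports neither Ks nor r, so the values in the usage note are purely illustrative and are not taken from the study.

```python
def divergence_time_years(ks, rate_per_site_per_year):
    """Estimate divergence time as T = Ks / (2 * r).

    ks: synonymous substitutions per synonymous site between two sequences.
    rate_per_site_per_year: assumed synonymous substitution rate (r).
    The factor 2 accounts for substitutions accumulating on both lineages.
    """
    if rate_per_site_per_year <= 0:
        raise ValueError("substitution rate must be positive")
    return ks / (2.0 * rate_per_site_per_year)
```

For example, a hypothetical Ks of 0.009 combined with an assumed rate of 4.5e-9 substitutions per site per year gives a divergence time of roughly one million years.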
Kistler, Amy L; Gancz, Ady; Clubb, Susan; Skewes-Cox, Peter; Fischer, Kael; Sorber, Katherine; Chiu, Charles Y; Lublin, Avishai; Mechani, Sara; Farnoushi, Yigal; Greninger, Alexander; Wen, Christopher C; Karlene, Scott B; Ganem, Don; DeRisi, Joseph L
2008-01-01
Background Proventricular dilatation disease (PDD) is a fatal disorder threatening domesticated and wild psittacine birds worldwide. It is characterized by lymphoplasmacytic infiltration of the ganglia of the central and peripheral nervous system, leading to central nervous system disorders as well as disordered enteric motility and associated wasting. For almost 40 years, a viral etiology for PDD has been suspected, but to date no candidate etiologic agent has been reproducibly linked to the disease. Results Analysis of 2 PDD case-control series collected independently on different continents using a pan-viral microarray revealed a bornavirus hybridization signature in 62.5% of the PDD cases (5/8) and none of the controls (0/8). Ultra high throughput sequencing was utilized to recover the complete viral genome sequence from one of the virus-positive PDD cases. This revealed a bornavirus-like genome organization for this agent with a high degree of sequence divergence from all prior bornavirus isolates. We propose the name avian bornavirus (ABV) for this agent. Further specific ABV PCR analysis of an additional set of independently collected PDD cases and controls yielded a significant difference in ABV detection rate among PDD cases (71%, n = 7) compared to controls (0%, n = 14) (P = 0.01; Fisher's Exact Test). Partial sequence analysis of a total of 16 ABV isolates we have now recovered from these and an additional set of cases reveals at least 5 distinct ABV genetic subgroups. Conclusion These studies clearly demonstrate the existence of an avian reservoir of remarkably diverse bornaviruses and provide a compelling candidate in the search for an etiologic agent of PDD. PMID:18671869
Singhal, Dinesh K; Singhal, Raxita; Malik, Hruda N; Kumar, Surender; Kumar, Sudarshan; Mohanty, Ashok K; Kaushik, Jai K; Malakar, Dhruba
2014-01-01
Nanog is a homeodomain-containing protein that plays important roles in the regulation of signaling pathways for the maintenance and induction of pluripotency in stem cells. Because of its unique expression in stem cells, it is also regarded as a pluripotency marker. In this study, the goat Nanog (gNanog) gene was amplified, cloned and characterized at the sequence level, with successful over-expression in the CHO-K1 cell line using a lentiviral-based system. The gNanog ORF is 903 bp long and codes for a Nanog protein of 300 amino acids (aa). The complete nucleotide sequence shows some evolutionary mutations in goat in comparison to other species. The protein sequence of goat is highly similar to that of other species. Overall, the gNanog nucleotide sequence and predicted protein sequence showed high similarity and minimum divergence with cattle (96 % identity/4 % divergence) and buffalo (94/5 %), and low similarity and high divergence with pig (84/15 %), human (81/23 %) and mouse (69/40 %), indicating the evolutionary closeness of gNanog to cattle and buffalo. A gNanog lentiviral expression construct was prepared for over-expression of the Nanog gene in adult goat fibroblast cells. The lentiviral expression construct of Nanog enabled continuous protein expression for induction and maintenance of pluripotency. Western blotting revealed the expression of the Nanog gene at the protein level, which supports the conclusion that the lentiviral expression system is highly promising for Nanog protein expression in differentiated goat cells.
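The relation between the reported ORF length and protein length can be checked with a short calculation: 903 bp is 301 codons, and if the ORF length is assumed to include the stop codon, 300 codons remain to encode amino acids. The helper below is a hypothetical illustration of that arithmetic; the assumption that the 903 bp figure includes the stop codon is inferred from the numbers and is not stated in the abstract.

```python
def protein_length_from_orf(orf_length_bp, includes_stop_codon=True):
    """Return the number of amino acids encoded by an ORF of the given length.

    Assumes the ORF length is a whole number of codons. By default the final
    codon is treated as a stop codon that encodes no amino acid.
    """
    if orf_length_bp <= 0 or orf_length_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_length_bp // 3
    return codons - 1 if includes_stop_codon else codons


print(protein_length_from_orf(903))  # prints 300
```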
Huntingtin gene evolution in Chordata and its peculiar features in the ascidian Ciona genus
Gissi, Carmela; Pesole, Graziano; Cattaneo, Elena; Tartari, Marzia
2006-01-01
Background To gain insight into the evolutionary features of the huntingtin (htt) gene in Chordata, we have sequenced and characterized the full-length htt mRNA in the ascidian Ciona intestinalis, a basal chordate emerging as a new invertebrate model organism. Moreover, taking advantage of the availability of genomic and EST sequences, the htt gene structure of a number of chordate species, including the congeneric ascidian Ciona savignyi and the vertebrates Xenopus and Gallus, was reconstructed. Results The C. intestinalis htt transcript exhibits some peculiar features, such as spliced leader trans-splicing in the 98 nt-long 5' untranslated region (UTR), an alternative splicing in the coding region, eight alternative polyadenylation sites, and no similarity in either the 5' or the 3' UTR to the homolog of the congeneric C. savignyi. The predicted protein is 2946 amino acids long, shorter than its vertebrate homologs, and lacks the polyQ and the polyP stretches found in the N-terminal regions of mammalian homologs. The exon-intron organization of the htt gene is almost identical among vertebrates, and significantly conserved between Ciona and vertebrates, allowing us to hypothesize an ancestral chordate gene consisting of at least 40 coding exons. Conclusion During chordate diversification, events of gain/loss, sliding, phase changes, and expansion of introns occurred in both vertebrate and ascidian lineages predominantly in the 5'-half of the htt gene, where there is also evidence of lineage-specific evolutionary dynamics in vertebrates. In contrast, the 3'-half of the gene is highly conserved in all chordates at the level of both gene structure and protein sequence. Between the two Ciona species, a fast evolutionary rate and/or an early divergence time is suggested by the absence of significant similarity between UTRs, protein divergence comparable to that observed between mammals and fishes, and a different distribution of repetitive elements. PMID:17092333
Extensive Concerted Evolution of Rice Paralogs and the Road to Regaining Independence
Wang, Xiyin; Tang, Haibao; Bowers, John E.; Feltus, Frank A.; Paterson, Andrew H.
2007-01-01
Many genes duplicated by whole-genome duplications (WGDs) are more similar to one another than expected. We investigated whether concerted evolution through conversion and crossing over, well-known to affect tandem gene clusters, also affects dispersed paralogs. Genome sequences for two Oryza subspecies reveal appreciable gene conversion in the ∼0.4 MY since their divergence, with a gradual progression toward independent evolution of older paralogs. Since divergence from subspecies indica, ∼8% of japonica paralogs produced 5–7 MYA on chromosomes 11 and 12 have been affected by gene conversion and several reciprocal exchanges of chromosomal segments, while ∼70-MY-old “paleologs” resulting from a genome duplication (GD) show much less conversion. Sequence similarity analysis in proximal gene clusters also suggests more conversion between younger paralogs. About 8% of paleologs may have been converted since rice–sorghum divergence ∼41 MYA. Domain-encoding sequences are more frequently converted than nondomain sequences, suggesting a sort of circularity—that sequences conserved by selection may be further conserved by relatively frequent conversion. The higher level of concerted evolution in the 5–7 MY-old segmental duplication may reflect the behavior of many genomes within the first few million years after duplication or polyploidization. PMID:18039882
A Generalized Least-Squares Estimate for the Origin of Sporophytic Self-Incompatibility
Uyenoyama, M. K.
1995-01-01
Analysis of nucleotide sequences that regulate the expression of self-incompatibility in flowering plants affords a direct means of examining classical hypotheses for the origin and evolution of this major feature of mating systems. Departing from the classical view of monophyly of all forms of self-incompatibility, the current paradigm for the origin of self-incompatibility postulates multiple episodes of recruitment and modification of preexisting genes. In Brassica, the S locus, which regulates sporophytic self-incompatibility, shows homology to a multigene family present both in self-compatible congeners and in groups for which this form of self-incompatibility is atypical. A phylogenetic analysis of S-allele sequences together with homologous sequences that do not cosegregate with self-incompatibility permits dating the change of function that marked the origin of self-incompatibility. A generalized least-squares method is introduced that provides closed-form expressions for estimates and standard errors for function-specific divergence rates and times of divergence among sequences. This analysis suggests that the age of the sporophytic self-incompatibility system expressed in Brassica exceeds species divergence within the genus by four- to fivefold. The extraordinarily high levels of sequence diversity exhibited by S alleles appear to reflect their ancient derivation, with the alternative hypothesis of hypermutability rejected by the analysis. PMID:7713446
LeDuc, Richard G; Robertson, Kelly M; Pitman, Robert L
2008-08-23
Recently, three visually distinct forms of killer whales (Orcinus orca) were described from Antarctic waters and designated as types A, B and C. Based on consistent differences in prey selection and habitat preferences, morphological divergence and apparent lack of interbreeding among these broadly sympatric forms, it was suggested that they may represent separate species. To evaluate this hypothesis, we compared complete sequences of the mitochondrial control region from 81 Antarctic killer whale samples, including 9 type A, 18 type B, 47 type C and 7 type-undetermined individuals. We found three fixed differences that separated type A from B and C, and a single fixed difference that separated type C from A and B. These results are consistent with reproductive isolation among the different forms, although caution is needed in drawing further conclusions. Despite dramatic differences in morphology and ecology, the relatively low levels of sequence divergence in Antarctic killer whales indicate that these evolutionary changes occurred relatively rapidly and recently.
Özdemir, Ebru; Altındağ, Ahmet; Kandemir, İrfan
2017-05-01
Daphnia is a freshwater zooplankton genus with controversial taxonomy due to its high morphological variation linked to environmental factors, and to inter-specific hybridization and polyploidy in some groups. The aim of the present study is to examine the molecular diversity of some Daphnia species in Turkey and to establish DNA barcodes of Turkish Daphnia species. Sequence analysis was performed using a 540 bp region of the cytochrome oxidase subunit I gene of mitochondrial DNA. A total of 34 haplotypes have been identified for Turkey. The Daphnia pulex complex was divided into two clades with 16.1% sequence divergence according to molecular taxonomy based on the Kimura 2-parameter model. The clade that diverged from Daphnia pulex by 16.1% was found to show 99% similarity with Daphnia cf. pulicaria (sensu Alonso 1996) instead of Daphnia pulicaria Forbes, 1893. Furthermore, this study has contributed to Turkish zoogeography by demonstrating the distribution of Daphnia species in Turkey.
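The Kimura 2-parameter divergence mentioned in this abstract can be illustrated with a short sketch. It assumes two pre-aligned sequences of equal length and uses the standard K2P formula d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)), where P and Q are the proportions of transitions and transversions. The function name and the toy sequences are hypothetical and are not taken from the study.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned DNA sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    compared = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in BASES or b not in BASES:
            continue  # gap or ambiguity code: ignore this site
        compared += 1
        if a == b:
            continue
        same_class = (a in PURINES) == (b in PURINES)
        if same_class:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared
    q = transversions / compared
    inner = (1 - 2 * p - q) * math.sqrt(max(1 - 2 * q, 0.0))
    if inner <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(inner)


# Toy example: one transition in ten sites (hypothetical data).
d = k2p_distance("AAAAAAAAAA", "GAAAAAAAAA")
print(f"K2P distance: {d:.4f}")
```

The corrected distance is slightly larger than the raw proportion of differing sites because the model allows for multiple substitutions at the same site.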
Divergence with gene flow within the recent chipmunk radiation (Tamias)
Sullivan, J; Demboski, J R; Bell, K C; Hird, S; Sarver, B; Reid, N; Good, J M
2014-01-01
Increasing data have supported the importance of divergence with gene flow (DGF) in the generation of biological diversity. In such cases, lineage divergence occurs on a shorter timescale than does the completion of reproductive isolation. Although it is critical to explore the mechanisms driving divergence and preventing homogenization by hybridization, it is equally important to document cases of DGF in nature. Here we synthesize data that have accumulated over the last dozen or so years on DGF in the chipmunk (Tamias) radiation with new data that quantify very high rates of mitochondrial DNA (mtDNA) introgression among para- and sympatric species in the T. quadrivittatus group in the central and southern Rocky Mountains. These new data (188 cytochrome b sequences) bring the total number of sequences up to 1871; roughly 16% (298) of the chipmunks we have sequenced exhibit introgressed mtDNA. This includes ongoing introgression between subspecies and between both closely related and distantly related taxa. In addition, we have identified several taxa that are apparently fixed for ancient introgressions and in which there is no evidence of ongoing introgression. A recurrent observation is that these introgressions occur between ecologically and morphologically diverged, sometimes non-sister taxa that engage in well-documented niche partitioning. Thus, the chipmunk radiation in western North America represents an excellent mammalian example of speciation in the face of recurrent gene flow among lineages and where biogeography, habitat differentiation and mating systems suggest important roles for both ecological and sexual selection. PMID:24781803
Mitochondrial divergence between slow- and fast-aging garter snakes.
Schwartz, Tonia S; Arendsee, Zebulun W; Bronikowski, Anne M
2015-11-01
Mitochondrial function has long been hypothesized to be intimately involved in aging processes--either directly through declining efficiency of mitochondrial respiration and ATP production with advancing age, or indirectly, e.g., through increased mitochondrial production of damaging free radicals with age. Yet we lack a comprehensive understanding of the evolution of mitochondrial genotypes and phenotypes across diverse animal models, particularly in species that have extremely labile physiology. Here, we measure mitochondrial genome-types and transcription in ecotypes of garter snakes (Thamnophis elegans) that are adapted to disparate habitats and have diverged in aging rates and lifespans despite residing in close proximity. Using two RNA-seq datasets, we (1) reconstruct the garter snake mitochondrial genome sequence and bioinformatically identify regulatory elements, (2) test for divergence of mitochondrial gene expression between the ecotypes and in response to heat stress, and (3) test for sequence divergence in mitochondrial protein-coding regions in these slow-aging (SA) and fast-aging (FA) naturally occurring ecotypes. At the nucleotide sequence level, we confirmed two (duplicated) mitochondrial control regions one of which contains a glucocorticoid response element (GRE). Gene expression of protein-coding genes was higher in FA snakes relative to SA snakes for most genes, but was neither affected by heat stress nor an interaction between heat stress and ecotype. SA and FA ecotypes had unique mitochondrial haplotypes with amino acid substitutions in both CYTB and ND5. The CYTB amino acid change (Isoleucine → Threonine) was highly segregated between ecotypes. This divergence of mitochondrial haplotypes between SA and FA snakes contrasts with nuclear gene-flow estimates, but correlates with previously reported divergence in mitochondrial function (mitochondrial oxygen consumption, ATP production, and reactive oxygen species consequences). 
Kim, Young Kyun; Kim, Seung Hyeon; Yi, Joo Mi; Kang, Chang-Keun; Short, Frederick; Lee, Kun-Seop
2017-01-01
Although seagrass species in the genus Halophila are generally distributed in tropical or subtropical regions, H. nipponica has been reported to occur in temperate coastal waters of the northwestern Pacific. Because H. nipponica occurs only in the warm temperate areas influenced by the Kuroshio Current and shows a tropical seasonal growth pattern, such as severely restricted growth in low water temperatures, it was hypothesized that this temperate Halophila species diverged from tropical species in the relatively recent evolutionary past. We used a phylogenetic analysis of internal transcribed spacer (ITS) regions to examine the genetic variability and evolutionary trend of H. nipponica. ITS sequences of H. nipponica from various locations in Korea and Japan were identical or showed very low sequence divergence (less than 3-base pair, bp, difference), confirming that H. nipponica from Japan and Korea are the same species. Halophila species in the section Halophila, which have simple phyllotaxy (a pair of petiolate leaves at the rhizome node), were separated into five well-supported clades by maximum parsimony analysis. H. nipponica grouped with H. okinawensis and H. gaudichaudii from the subtropical regions in the same clade, the latter two species having quite low ITS sequence divergence from H. nipponica (7–15-bp). H. nipponica in Clade I diverged 2.95 ± 1.08 million years ago from species in Clade II, which includes H. ovalis. According to geographical distribution and genetic similarity, H. nipponica appears to have diverged from a tropical species like H. ovalis and adapted to warm temperate environments. The results of divergence time estimates suggest that the temperate H. nipponica is an older species than the subtropical H. okinawensis and H. gaudichaudii and they may have different evolutionary histories. PMID:28505209
Molecular Taxonomy of Phytopathogenic Fungi: A Case Study in Peronospora
Göker, Markus; García-Blázquez, Gema; Voglmayr, Hermann; Tellería, M. Teresa; Martín, María P.
2009-01-01
Background Inappropriate taxon definitions may have severe consequences in many areas. For instance, biologically sensible species delimitation of plant pathogens is crucial for measures such as plant protection or biological control and for comparative studies involving model organisms. However, delimiting species is challenging in the case of organisms for which often only molecular data are available, such as prokaryotes, fungi, and many unicellular eukaryotes. Even in the case of organisms with well-established morphological characteristics, molecular taxonomy is often necessary to emend current taxonomic concepts and to analyze DNA sequences directly sampled from the environment. Typically, for this purpose clustering approaches to delineate molecular operational taxonomic units have been applied using arbitrary choices regarding the distance threshold values, and the clustering algorithms. Methodology Here, we report on a clustering optimization method to establish a molecular taxonomy of Peronospora based on ITS nrDNA sequences. Peronospora is the largest genus within the downy mildews, which are obligate parasites of higher plants, and includes various economically important pathogens. The method determines the distance function and clustering setting that result in an optimal agreement with selected reference data. Optimization was based on both taxonomy-based and host-based reference information, yielding the same outcome. Resampling and permutation methods indicate that the method is robust regarding taxon sampling and errors in the reference data. Tests with newly obtained ITS sequences demonstrate the use of the re-classified dataset in molecular identification of downy mildews. Conclusions A corrected taxonomy is provided for all Peronospora ITS sequences contained in public databases. Clustering optimization appears to be broadly applicable in automated, sequence-based taxonomy. 
The method connects traditional and modern taxonomic disciplines by specifically addressing the issue of how to optimally account for both traditional species concepts and genetic divergence. PMID:19641601
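The idea of choosing the clustering setting that agrees best with reference data can be sketched in a few lines. This is a simplified illustration, not the authors' implementation: it assumes single-linkage clustering at a fixed distance threshold and scores agreement with the fraction of sequence pairs treated the same way by the clustering and by the reference labels (a Rand-index-style measure). All function names, the distance matrix, and the candidate thresholds are hypothetical.

```python
from itertools import combinations


def single_linkage_clusters(n, dist, threshold):
    """Cluster n items: link any two items whose distance is <= threshold."""
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(n), 2):
        if dist[i][j] <= threshold:
            parent[find(i)] = find(j)
    return [find(i) for i in range(n)]


def pair_agreement(labels, reference):
    """Fraction of item pairs on which clustering and reference agree."""
    pairs = list(combinations(range(len(labels)), 2))
    agree = sum(
        (labels[i] == labels[j]) == (reference[i] == reference[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def best_threshold(dist, reference, candidates):
    """Return (agreement, threshold) for the best-scoring candidate."""
    n = len(reference)
    scored = [
        (pair_agreement(single_linkage_clusters(n, dist, t), reference), t)
        for t in candidates
    ]
    return max(scored, key=lambda s: s[0])


# Hypothetical pairwise distances for four sequences from two species.
dist = [
    [0.00, 0.01, 0.20, 0.20],
    [0.01, 0.00, 0.20, 0.20],
    [0.20, 0.20, 0.00, 0.02],
    [0.20, 0.20, 0.02, 0.00],
]
reference = ["sp1", "sp1", "sp2", "sp2"]
score, threshold = best_threshold(dist, reference, [0.005, 0.05, 0.5])
print(score, threshold)
```

A fuller treatment would also search over distance functions and linkage methods, as the abstract describes, rather than over thresholds alone.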
Singh, Prashant; Singh, Satya Shila; Elster, Josef; Mishra, Arun Kumar
2013-06-01
In order to assess phylogeny and population genetics, and to approximate the future course of cyanobacterial evolution based on nifH gene sequences, 41 heterocystous cyanobacterial strains collected from all over India have been used in the present study. NifH gene sequence analysis data confirm that the heterocystous cyanobacteria are monophyletic while the Stigonematales show polyphyletic origin with grave intermixing. Further, analysis of nifH gene sequence data using intricate mathematical extrapolations revealed that the nucleotide diversity and recombination frequency are much greater in the Nostocales than in the Stigonematales. Similarly, DNA divergence studies showed significant values of divergence with greater gene conversion tracts in the unbranched (Nostocales) than in the branched (Stigonematales) strains. Our data strongly support the origin of true branching cyanobacterial strains from the unbranched strains.
USDA-ARS's Scientific Manuscript database
The complete nucleotide sequence of a recently discovered Florida (FL) isolate of Hibiscus infecting Cilevirus (HiCV) was determined by Sanger sequencing. The movement- and coat- protein gene sequences of the HiCV-FL isolate are more divergent than other genes of the previously sequenced HiCV-HA (Ha...
Miniprimer PCR, a New Lens for Viewing the Microbial World
Isenbarger, Thomas A.; Finney, Michael; Ríos-Velázquez, Carlos; Handelsman, Jo; Ruvkun, Gary
2008-01-01
Molecular methods based on the 16S rRNA gene sequence are used widely in microbial ecology to reveal the diversity of microbial populations in environmental samples. Here we show that a new PCR method using an engineered polymerase and 10-nucleotide “miniprimers” expands the scope of detectable sequences beyond those detected by standard methods using longer primers and Taq polymerase. After testing the method in silico to identify divergent ribosomal genes in previously cloned environmental sequences, we applied the method to soil and microbial mat samples, which revealed novel 16S rRNA gene sequences that would not have been detected with standard primers. Deeply divergent sequences were discovered with high frequency and included representatives that define two new division-level taxa, designated CR1 and CR2, suggesting that miniprimer PCR may reveal new dimensions of microbial diversity. PMID:18083877
Gao, Dongying; Jiang, Ning; Wing, Rod A.; Jiang, Jiming; Jackson, Scott A.
2015-01-01
Centromeres are important chromosomal regions necessary for eukaryotic cell segregation and replication. Due to high amounts of tandem repeats and transposons, centromeres have been difficult to sequence in most multicellular organisms, thus their sequence structure and evolution are poorly understood. In this study, we analyzed transposons in the centromere 8 (Cen8) from the African cultivated rice (O. glaberrima) and two subspecies of the Asian cultivated rice (O. sativa), indica and japonica. We detected much higher transposon contents (>69%) in centromere regions than in the whole genomes of O. sativa ssp. japonica and O. glaberrima (~35%). We compared the three Cen8s and identified numerous recent insertions of transposons that were frequently organized into multiple-layer nested blocks, similar to nested transposons in maize. Except for the Hopi retrotransposon, all LTR retrotransposons were shared but exhibit different abundances amongst the three Cen8s. Even though a majority of the transposons were located in intergenic regions, some gene-related transposons were found and may be involved in gene diversification. Chromatin immunoprecipitated (ChIP) data analysis revealed that 165 families from both Class I and Class II transposons were found in CENH3-associated chromatin sequences. These results indicate essential roles for transposons in centromeres and that the rapid divergence of the Cen8 sequences between the two cultivated rice species was primarily caused by recent transposon insertions. PMID:25904926
Kaplan, J B; Merkel, W K; Nichols, B P
1985-06-05
The amide group of glutamine is a source of nitrogen in the biosynthesis of a variety of compounds. These reactions are catalyzed by a group of enzymes known as glutamine amidotransferases; two of these, the glutamine amidotransferase subunits of p-aminobenzoate synthase and anthranilate synthase have been studied in detail and have been shown to be structurally and functionally related. In some micro-organisms, p-aminobenzoate synthase and anthranilate synthase share a common glutamine amidotransferase subunit. We report here the primary DNA and deduced amino acid sequences of the p-aminobenzoate synthase glutamine amidotransferase subunits from Salmonella typhimurium, Klebsiella aerogenes and Serratia marcescens. A comparison of these glutamine amidotransferase sequences to the sequences of ten others, including some that function specifically in either the p-aminobenzoate synthase or anthranilate synthase complexes and some that are shared by both synthase complexes, has revealed several interesting features of the structure and organization of these genes, and has allowed us to speculate as to the evolutionary history of this family of enzymes. We propose a model for the evolution of the p-aminobenzoate synthase and anthranilate synthase glutamine amidotransferase subunits in which the duplication and subsequent divergence of the genetic information encoding a shared glutamine amidotransferase subunit led to the evolution of two new pathway-specific enzymes.
Poomtien, Jamroonsri; Jindamorakot, Sasitorn; Limtong, Savitree; Pinphanichakarn, Pairoh; Thaniyavarn, Jiraporn
2013-01-01
Three yeast strains were isolated from industrial wastes in Thailand. Based on the phylogenetic sequence analysis of the D1/D2 region of the large subunit rRNA gene, the internal transcribed spacer (ITS1-5.8S rRNA gene-ITS2; ITS1-2) region, and their physiological characteristics, the three strains were found to represent two novel species of ascomycetous anamorphic yeast. Strain JP52(T) represents a novel species which was named Cyberlindnera samutprakarnensis sp. nov. (type strain JP52(T); = BCC 46825(T) = JCM 17816(T) = CBS 12528(T), MycoBank no. MB800879), which was differentiated from the closely related species Cyberlindnera mengyuniae CBS 10845(T) by 2.9 % sequence divergence in the D1/D2 region and 4.4 % sequence divergence in the ITS1-2. Strains JP59(T) and JP60 were identical in their D1/D2 and ITS1-2 regions, which were closely related to those of Scheffersomyces spartinae CBS 6059(T) by 0.9 and 1.0 % sequence divergence, respectively. In addition, supportive evidence of actin gene and translational elongation factor gene by sequence divergence of 6.5 % each confirmed their distinct status. Furthermore, JP59(T) and JP60 differed from the closely related species in some biochemical and physiological characteristics. These two strains were assigned to a single novel species which was named Candida thasaenensis sp. nov. (type JP59(T) = BCC 46828(T) = JCM 17817(T) = CBS 12529(T), MycoBank no. MB800880).
Cheng, Ji-Hong; Liu, Wen-Chun; Chang, Ting-Tsung; Hsieh, Sun-Yuan; Tseng, Vincent S
2017-10-01
Many studies have suggested that deletions in the hepatitis B virus (HBV) genome are associated with the development of progressive liver diseases, ultimately even resulting in hepatocellular carcinoma (HCC). Among the methods for detecting deletions from next-generation sequencing (NGS) data, few consider the characteristics of viruses, such as high evolution rates and high divergence among the different HBV genomes. Sequencing highly divergent HBV genomes with NGS technology produces millions of reads, so detecting the exact breakpoints of deletions in these large and complex data incurs a very high computational cost. We propose a novel analytical method named VirDelect (Virus Deletion Detect), which uses split-read alignment to detect exact breakpoints and a diversity variable to account for high divergence in single-end read data, such that the computational cost can be reduced without losing accuracy. We used four simulated read datasets and two real paired-end read datasets of HBV genome sequences to verify the accuracy of VirDelect using score functions. The experimental results show that VirDelect outperforms the state-of-the-art method Pindel in terms of accuracy score for all simulated datasets, and VirDelect had only two base errors even in the real datasets. VirDelect is also shown to deliver high accuracy in analyzing single-end read data as well as paired-end data. VirDelect can serve as an effective and efficient bioinformatics tool for physiologists and is applicable to further analyses of viruses with characteristics similar to HBV in genome length and high divergence. The software program of VirDelect can be downloaded at https://sourceforge.net/projects/virdelect/.
Guillet-Claude, Carine; Isabel, Nathalie; Pelgas, Betty; Bousquet, Jean
2004-12-01
Class I knox genes code for transcription factors that play an essential role in plant growth and development as central regulators of meristem cell identity. Based on the analysis of new cDNA sequences from various tissues and genomic DNA sequences, we identified a highly diversified group of class I knox genes in conifers. Phylogenetic analyses of complete amino acid sequences from various seed plants indicated that all conifer sequences formed a monophyletic group. Within conifers, four subgroups here named genes KN1 to KN4 were well delineated, each regrouping pine and spruce sequences. KN4 was sister group to KN3, which was sister group to KN1 and KN2. Genetic mapping on the genomes of two divergent Picea species indicated that KN1 and KN2 are located close to each other on the same linkage group, whereas KN3 and KN4 mapped on different linkage groups, correlating the more ancient divergence of these two genes. The proportion of synonymous and nonsynonymous substitutions suggested intense purifying selection for the four genes. However, rates of substitution per year indicated an evolution in two steps: faster rates were noted after gene duplications, followed subsequently by lower rates. Positive directional selection was detected for most of the internal branches harboring an accelerated rate of evolution. In addition, many sites with highly significant amino acid rate shift were identified between these branches. However, the tightly linked KN1 and KN2 did not diverge as much from each other. The implications of the correlation between phylogenetic, structural, and functional information are discussed in relation to the diversification of the knox-I gene family in conifers.
Low X/Y divergence in four pairs of papaya sex-linked genes.
Yu, Qingyi; Hou, Shaobin; Feltus, F Alex; Jones, Meghan R; Murray, Jan E; Veatch, Olivia; Lemke, Cornelia; Saw, Jimmy H; Moore, Richard C; Thimmapuram, Jyothi; Liu, Lei; Moore, Paul H; Alam, Maqsudul; Jiang, Jiming; Paterson, Andrew H; Ming, Ray
2008-01-01
Sex chromosomes in flowering plants, in contrast to those in animals, evolved relatively recently and only a few are heteromorphic. The homomorphic sex chromosomes of papaya show features of incipient sex chromosome evolution. We investigated the features of paired X- and Y-specific bacterial artificial chromosomes (BACs), and estimated the time of divergence in four pairs of sex-linked genes. We report the results of a comparative analysis of long contiguous genomic DNA sequences between the X and hermaphrodite Y (Y(h)) chromosomes. Numerous chromosomal rearrangements were detected in the male-specific region of the Y chromosome (MSY), including inversions, deletions, insertions, duplications and translocations, showing the dynamic evolutionary process on the MSY after recombination ceased. DNA sequence expansion was documented in the two regions of the MSY, demonstrating that the cytologically homomorphic sex chromosomes are heteromorphic at the molecular level. Analysis of sequence divergence between four X and Y(h) gene pairs resulted in an estimated age of divergence of between 0.5 and 2.2 million years, supporting a recent origin of the papaya sex chromosomes. Our findings indicate that sex chromosomes did not evolve at the family level in Caricaceae, and reinforce the theory that sex chromosomes evolve at the species level in some lineages.
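How a sequence divergence is turned into an age can be shown with a minimal molecular-clock sketch. It assumes a strict clock, under which two lineages that separated T years ago accumulate K = 2rT substitutions per site, so T = K / (2r). The divergence and rate values below are hypothetical placeholders, not the values used in the papaya study, and the function name is illustrative.

```python
def divergence_time_years(k_subs_per_site, rate_per_site_per_year):
    """Estimate divergence time under a strict molecular clock.

    k_subs_per_site: pairwise divergence K (substitutions per site).
    rate_per_site_per_year: substitution rate r along one lineage.
    Both lineages accumulate changes, hence the factor of 2.
    """
    if rate_per_site_per_year <= 0:
        raise ValueError("rate must be positive")
    if k_subs_per_site < 0:
        raise ValueError("divergence cannot be negative")
    return k_subs_per_site / (2.0 * rate_per_site_per_year)


# Hypothetical inputs: 1% divergence and an assumed rate of 4e-9 per site per year.
t = divergence_time_years(0.01, 4e-9)
print(f"Estimated divergence time: {t / 1e6:.2f} million years")
```

Because the estimate scales inversely with the assumed rate, the rate calibration typically dominates the uncertainty of such ages.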
Mohandesan, Elmira; Fitak, Robert R; Corander, Jukka; Yadamsuren, Adiya; Chuluunbat, Battsetseg; Abdelhadi, Omer; Raziq, Abdul; Nagy, Peter; Stalder, Gabrielle; Walzer, Chris; Faye, Bernard; Burger, Pamela A
2017-08-30
The genus Camelus is an interesting model to study adaptive evolution in the mitochondrial genome, as the three extant Old World camel species inhabit hot and low-altitude as well as cold and high-altitude deserts. We sequenced 24 camel mitogenomes and combined them with three previously published sequences to study the role of natural selection under different environmental pressures, and to advance our understanding of the evolutionary history of the genus Camelus. We confirmed the heterogeneity of divergence across different components of the electron transport system. Lineage-specific analysis of mitochondrial protein evolution revealed a significant effect of purifying selection in the concatenated protein-coding genes in domestic Bactrian camels. The estimated dN/dS < 1 in the concatenated protein-coding genes suggested purifying selection as the driving force shaping mitogenome diversity in camels. Additional analyses of the functional divergence in amino acid changes between species-specific lineages indicated fixed substitutions in various genes, with radical effects on the physicochemical properties of the protein products. The evolutionary time estimates revealed a divergence between domestic and wild Bactrian camels around 1.1 [0.58-1.8] million years ago (mya). This has major implications for the conservation and management of the critically endangered wild species, Camelus ferus.
Evolutionary Divergence of Aggregatibacter actinomycetemcomitans
Kittichotirat, W.; Bumgarner, R.E.; Chen, C.
2016-01-01
Gram-negative facultative Aggregatibacter actinomycetemcomitans is an oral pathogen associated with periodontitis. The genetic heterogeneity among A. actinomycetemcomitans strains has been long recognized. This study provides a comprehensive genomic analysis of A. actinomycetemcomitans and the closely related nonpathogenic Aggregatibacter aphrophilus. Whole genome sequencing by Illumina MiSeq platform was performed for 31 A. actinomycetemcomitans and 2 A. aphrophilus strains. Sequence similarity analysis shows a total of 3,220 unique genes across the 2 species, where 1,550 are core genes present in all genomes and 1,670 are variable genes (accessory genes) missing in at least 1 genome. Phylogenetic analysis based on 397 concatenated core genes distinguished A. aphrophilus and A. actinomycetemcomitans. The latter was in turn divided into 5 clades: clade b (serotype b), clade c (serotype c), clade e/f (serotypes e and f), clade a/d (serotypes a and d), and clade e′ (serotype e strains). Accessory genes accounted for 14.1% to 23.2% of the A. actinomycetemcomitans genomes, with a majority belonging to the category of poorly characterized by Cluster of Orthologous Groups classification. These accessory genes were often organized into genomic islands (n = 387) with base composition biases, suggesting their acquisitions via horizontal gene transfer. There was a greater degree of similarity in gene content and genomic islands among strains within clades than between clades. Strains of clade e′ isolated from human were found to be missing the genomic island that carries genes encoding cytolethal distending toxins. Taken together, the results suggest a pattern of sequential divergence, starting from the separation of A. aphrophilus and A. actinomycetemcomitans through gain and loss of genes and ending with the divergence of the latter species into distinct clades and serotypes. With differing constellations of genes, the A. actinomycetemcomitans clades may have evolved distinct adaptation strategies to the human oral cavity. PMID:26420795
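The core/accessory split described in the abstract above (core genes present in all genomes, accessory genes missing from at least one) can be sketched with plain set operations. This is a minimal illustration under the assumption that each genome has already been reduced to a set of gene-family identifiers; the strain and gene names are hypothetical, and the clustering step that produces such identifiers is not shown.

```python
from typing import Dict, Set, Tuple


def partition_pangenome(genomes: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Split the pan-genome into (core, accessory) gene sets.

    core: genes present in every genome.
    accessory: genes missing from at least one genome.
    """
    if not genomes:
        raise ValueError("at least one genome is required")
    gene_sets = list(genomes.values())
    pan = set().union(*gene_sets)
    core = set.intersection(*gene_sets)
    return core, pan - core


# Hypothetical toy input: three strains, four gene families.
toy = {
    "strain1": {"geneA", "geneB", "geneC"},
    "strain2": {"geneA", "geneB", "geneD"},
    "strain3": {"geneA", "geneB"},
}
core, accessory = partition_pangenome(toy)
print(sorted(core), sorted(accessory))

# The counts reported in the abstract are consistent with this partition:
assert 1_550 + 1_670 == 3_220
```

By construction the two sets are disjoint and their union is the full pan-genome, which is why the reported core and accessory counts sum to the total number of unique genes.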
The complete chloroplast genome sequence of the medicinal plant Salvia miltiorrhiza.
Qian, Jun; Song, Jingyuan; Gao, Huanhuan; Zhu, Yingjie; Xu, Jiang; Pang, Xiaohui; Yao, Hui; Sun, Chao; Li, Xian'en; Li, Chuyuan; Liu, Juyan; Xu, Haibin; Chen, Shilin
2013-01-01
Salvia miltiorrhiza is an important medicinal plant with great economic and medicinal value. The complete chloroplast (cp) genome sequence of Salvia miltiorrhiza, the first sequenced member of the Lamiaceae family, is reported here. The genome is 151,328 bp in length and exhibits a typical quadripartite structure of the large (LSC, 82,695 bp) and small (SSC, 17,555 bp) single-copy regions, separated by a pair of inverted repeats (IRs, 25,539 bp). It contains 114 unique genes, including 80 protein-coding genes, 30 tRNAs and four rRNAs. The genome structure, gene order, GC content and codon usage are similar to the typical angiosperm cp genomes. Four forward, three inverted and seven tandem repeats were detected in the Salvia miltiorrhiza cp genome. Simple sequence repeat (SSR) analysis among the 30 asterid cp genomes revealed that most SSRs are AT-rich, which contribute to the overall AT richness of these cp genomes. Additionally, fewer SSRs are distributed in the protein-coding sequences compared to the non-coding regions, indicating an uneven distribution of SSRs within the cp genomes. Entire cp genome comparison of Salvia miltiorrhiza and three other Lamiales cp genomes showed a high degree of sequence similarity and a relatively high divergence of intergenic spacers. Sequence divergence analysis discovered the ten most divergent and ten most conserved genes as well as their length variation, which will be helpful for phylogenetic studies in asterids. Our analysis also supports that both regional and functional constraints affect gene sequence evolution. Further, phylogenetic analysis demonstrated a sister relationship between Salvia miltiorrhiza and Sesamum indicum. The complete cp genome sequence of Salvia miltiorrhiza reported in this paper will facilitate population, phylogenetic and cp genetic engineering studies of this medicinal plant.
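The figures reported in the abstract above can be cross-checked with simple arithmetic: a quadripartite chloroplast genome consists of one large single-copy region, one small single-copy region and two copies of the inverted repeat. The short sketch below uses only the numbers given in the abstract; the function and variable names are illustrative.

```python
def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a quadripartite genome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


# Region lengths in bp, as reported in the abstract.
LSC_BP = 82_695
SSC_BP = 17_555
IR_BP = 25_539

total_bp = quadripartite_length(LSC_BP, SSC_BP, IR_BP)
print(total_bp)  # 151328, matching the reported genome length

# Gene counts reported in the abstract: protein-coding + tRNA + rRNA.
unique_genes = 80 + 30 + 4
print(unique_genes)  # 114
```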
Miller, Hilary C.; O’Meally, Denis; Ezaz, Tariq; Amemiya, Chris; Marshall-Graves, Jennifer A.; Edwards, Scott
2015-01-01
Major histocompatibility complex (MHC) genes are a central component of the vertebrate immune system and usually exist in a single genomic region. However, considerable differences in MHC organization and size exist between different vertebrate lineages. Reptiles occupy a key evolutionary position for understanding how variation in MHC structure evolved in vertebrates, but information on the structure of the MHC region in reptiles is limited. In this study, we investigate the organization and cytogenetic location of MHC genes in the tuatara (Sphenodon punctatus), the sole extant representative of the early-diverging reptilian order Rhynchocephalia. Sequencing and mapping of 12 clones containing class I and II MHC genes from a bacterial artificial chromosome library indicated that the core MHC region is located on chromosome 13q. However, duplication and translocation of MHC genes outside of the core region was evident, because additional class I MHC genes were located on chromosome 4p. We found a total of seven class I sequences and 11 class II β sequences, with evidence for duplication and pseudogenization of genes within the tuatara lineage. The tuatara MHC is characterized by high repeat content and low gene density compared with other species and we found no antigen processing or MHC framework genes on the MHC gene-containing clones. Our findings indicate substantial differences in MHC organization in tuatara compared with mammalian and avian MHCs and highlight the dynamic nature of the MHC. Further sequencing and annotation of tuatara and other reptile MHCs will determine if the tuatara MHC is representative of nonavian reptiles in general. PMID:25953959
Rebelling for a Reason: Protein Structural “Outliers”
Arumugam, Gandhimathi; Nair, Anu G.; Hariharaputran, Sridhar; Ramanathan, Sowdhamini
2013-01-01
Analysis of structural variation in domain superfamilies can reveal constraints in protein evolution which aids protein structure prediction and classification. Structure-based sequence alignment of distantly related proteins, organized in PASS2 database, provides clues about structurally conserved regions among different functional families. Some superfamily members show large structural differences which are functionally relevant. This paper analyses the impact of structural divergence on function for multi-member superfamilies, selected from the PASS2 superfamily alignment database. Functional annotations within superfamilies, with structural outliers or ‘rebels’, are discussed in the context of structural variations. Overall, these data reinforce the idea that functional similarities cannot be extrapolated from mere structural conservation. The implication for fold-function prediction is that the functional annotations can only be inherited with very careful consideration, especially at low sequence identities. PMID:24073209
COOLAIR Antisense RNAs Form Evolutionarily Conserved Elaborate Secondary Structures
Hawkes, Emily J.; Hennelly, Scott P.; Novikova, Irina V.; ...
2016-09-20
There is considerable debate about the functionality of long non-coding RNAs (lncRNAs). Lack of sequence conservation has been used to argue against functional relevance. Here, we investigated antisense lncRNAs, called COOLAIR, at the A. thaliana FLC locus and experimentally determined their secondary structure. The major COOLAIR variants are highly structured, organized by exon. The distally polyadenylated transcript has a complex multi-domain structure, altered by a single non-coding SNP defining a functionally distinct A. thaliana FLC haplotype. The A. thaliana COOLAIR secondary structure was used to predict COOLAIR exons in evolutionarily divergent Brassicaceae species. These predictions were validated through chemical probing and cloning. Despite the relatively low nucleotide sequence identity, the structures, including multi-helix junctions, show remarkable evolutionary conservation. In a number of places, the structure is conserved through covariation of a non-contiguous DNA sequence. This structural conservation supports a functional role for COOLAIR transcripts rather than, or in addition to, antisense transcription.
COOLAIR Antisense RNAs Form Evolutionarily Conserved Elaborate Secondary Structures
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hawkes, Emily J.; Hennelly, Scott P.; Novikova, Irina V.
There is considerable debate about the functionality of long non-coding RNAs (lncRNAs). Lack of sequence conservation has been used to argue against functional relevance. Here, we investigated antisense lncRNAs, called COOLAIR, at the A. thaliana FLC locus and experimentally determined their secondary structure. The major COOLAIR variants are highly structured, organized by exon. The distally polyadenylated transcript has a complex multi-domain structure, altered by a single non-coding SNP defining a functionally distinct A. thaliana FLC haplotype. The A. thaliana COOLAIR secondary structure was used to predict COOLAIR exons in evolutionarily divergent Brassicaceae species. These predictions were validated through chemical probing and cloning. Despite the relatively low nucleotide sequence identity, the structures, including multi-helix junctions, show remarkable evolutionary conservation. In a number of places, the structure is conserved through covariation of a non-contiguous DNA sequence. This structural conservation supports a functional role for COOLAIR transcripts rather than, or in addition to, antisense transcription.
Cianciulli, Antonia; Calvello, Rosa; Panaro, Maria A
2015-04-01
In the homologous genes studied, the exons and introns alternated in the same order in mouse and human. We studied, in both species: corresponding short segments of introns, whole corresponding introns and complete homologous genes. We considered the total number of nucleotides and the number and orientation of the SINE inserts. Comparisons of mouse and human data series showed that at the level of individual relatively short segments of intronic sequences the stochastic variability prevails in the local structuring, but at higher levels of organization a deterministic component emerges, conserved in mouse and human during the divergent evolution, despite the ample re-editing of the intronic sequences and the fact that processes such as SINE spread had taken place in an independent way in the two species. Intron conservation is negatively correlated with the SINE occupancy, suggesting that virus inserts interfere with the conservation of the sequences inherited from the common ancestor. Copyright © 2015 Elsevier Ltd. All rights reserved.
2013-01-01
Background: Deep sequencing of viruses isolated from infected hosts is an efficient way to measure population-genetic variation and can reveal patterns of dispersal and natural selection. In this study, we mined existing Illumina sequence reads to investigate single-nucleotide polymorphisms (SNPs) within two RNA viruses of the Western honey bee (Apis mellifera), deformed wing virus (DWV) and Israel acute paralysis virus (IAPV). All viral RNA was extracted from North American samples of honey bees or, in one case, the ectoparasitic mite Varroa destructor. Results: Coverage depth was generally lower for IAPV than DWV, and marked gaps in coverage occurred in several narrow regions (< 50 bp) of IAPV. These coverage gaps occurred across sequencing runs and were virtually unchanged when reads were re-mapped with greater permissiveness (up to 8% divergence), suggesting a recurrent sequencing artifact rather than strain divergence. Consensus sequences of DWV for each sample showed little phylogenetic divergence, low nucleotide diversity, and strongly negative values of Fu and Li’s D statistic, suggesting a recent population bottleneck and/or purifying selection. The Kakugo strain of DWV fell outside of all other DWV sequences at 100% bootstrap support. IAPV consensus sequences supported the existence of multiple clades as had been previously reported, and Fu and Li’s D was closer to neutral expectation overall, although a sliding-window analysis identified a significantly positive D within the protease region, suggesting selection maintains diversity in that region. Within-sample mean diversity was comparable between the two viruses on average, although for both viruses there was substantial variation among samples in mean diversity at third codon positions and in the number of high-diversity sites. FST values were bimodal for DWV, likely reflecting neutral divergence in two low-diversity populations, whereas IAPV had several sites that were strong outliers with very low FST. Conclusions: This initial survey of genetic variation within honey bee RNA viruses suggests future directions for studies examining the underlying causes of population-genetic structure in these economically important pathogens. PMID:23497218
Cornman, Robert Scott; Boncristiani, Humberto; Dainat, Benjamin; Chen, Yanping; vanEngelsdorp, Dennis; Weaver, Daniel; Evans, Jay D
2013-03-07
Deep sequencing of viruses isolated from infected hosts is an efficient way to measure population-genetic variation and can reveal patterns of dispersal and natural selection. In this study, we mined existing Illumina sequence reads to investigate single-nucleotide polymorphisms (SNPs) within two RNA viruses of the Western honey bee (Apis mellifera), deformed wing virus (DWV) and Israel acute paralysis virus (IAPV). All viral RNA was extracted from North American samples of honey bees or, in one case, the ectoparasitic mite Varroa destructor. Coverage depth was generally lower for IAPV than DWV, and marked gaps in coverage occurred in several narrow regions (< 50 bp) of IAPV. These coverage gaps occurred across sequencing runs and were virtually unchanged when reads were re-mapped with greater permissiveness (up to 8% divergence), suggesting a recurrent sequencing artifact rather than strain divergence. Consensus sequences of DWV for each sample showed little phylogenetic divergence, low nucleotide diversity, and strongly negative values of Fu and Li's D statistic, suggesting a recent population bottleneck and/or purifying selection. The Kakugo strain of DWV fell outside of all other DWV sequences at 100% bootstrap support. IAPV consensus sequences supported the existence of multiple clades as had been previously reported, and Fu and Li's D was closer to neutral expectation overall, although a sliding-window analysis identified a significantly positive D within the protease region, suggesting selection maintains diversity in that region. Within-sample mean diversity was comparable between the two viruses on average, although for both viruses there was substantial variation among samples in mean diversity at third codon positions and in the number of high-diversity sites. FST values were bimodal for DWV, likely reflecting neutral divergence in two low-diversity populations, whereas IAPV had several sites that were strong outliers with very low FST. This initial survey of genetic variation within honey bee RNA viruses suggests future directions for studies examining the underlying causes of population-genetic structure in these economically important pathogens.
Molecular clocks and the early evolution of metazoan nervous systems.
Wray, Gregory A
2015-12-19
The timing of early animal evolution remains poorly resolved, yet remains critical for understanding nervous system evolution. Methods for estimating divergence times from sequence data have improved considerably, providing a more refined understanding of key divergences. The best molecular estimates point to the origin of metazoans and bilaterians tens to hundreds of millions of years earlier than their first appearances in the fossil record. Both the molecular and fossil records are compatible, however, with the possibility of tiny, unskeletonized, low energy budget animals during the Proterozoic that had planktonic, benthic, or meiofaunal lifestyles. Such animals would likely have had relatively simple nervous systems equipped primarily to detect food, avoid inhospitable environments and locate mates. The appearance of the first macropredators during the Cambrian would have changed the selective landscape dramatically, likely driving the evolution of complex sense organs, sophisticated sensory processing systems, and diverse effector systems involved in capturing prey and avoiding predation. © 2015 The Author(s).
Kawano, Mitsuoki
2012-12-01
Toxin-antitoxin (TA) systems are categorized into three classes based on the type of antitoxin. In type I TA systems, the antitoxin is a small antisense RNA that inhibits translation of small toxic proteins by binding to the corresponding mRNAs. Those type I TA systems were originally identified as plasmid stabilization modules rendering a post-segregational killing (PSK) effect on the host cells. The type I TA loci also exist on the Escherichia coli chromosome but their biological functions are less clear. Genetic organization and regulatory elements of hok/sok and ldr/rdl families are very similar and the toxins are predicted to contain a transmembrane domain, but otherwise share no detectable sequence similarity. This review will give an overview of the type I TA modules of E. coli K-12, especially hok/sok, ldr/rdl and SOS-inducible symE/symR systems, which are regulated by divergently overlapping cis-encoded antisense RNAs.
Rastorguev, S M; Nedoluzhko, A V; Sharko, F S; Boulygina, E S; Sokolov, A S; Gruzdeva, N M; Skryabin, K G; Prokhortchouk, E B
2016-11-01
The three-spined stickleback (Gasterosteus aculeatus L.) is an important model organism for studying the molecular mechanisms of speciation and adaptation to salinity. Despite increased interest in microRNA discovery and a recent publication on microRNA prediction in the three-spined stickleback using bioinformatics approaches, there is still a lack of experimental support for these data. In this paper, high-throughput sequencing technology was applied to identify microRNA genes in gills of the three-spined stickleback. In total, 595 miRNA genes were discovered; half of them were predicted in previous computational studies and were confirmed here as microRNAs expressed in gill tissue. Moreover, 298 novel microRNA genes were identified. The presence of miRNA genes in selected 'divergence islands' was analysed and 10 miRNA genes were identified as not randomly located in 'divergence islands'. Regulatory regions of miRNA genes were found enriched with selective SNPs that may play a role in freshwater adaptation. © 2016 John Wiley & Sons Ltd.
Sequence analysis of MHC class I α2 from sockeye salmon (Oncorhynchus nerka).
McClelland, Erin K; Ming, Tobi J; Tabata, Amy; Miller, Kristina M
2011-09-01
Most studies assessing adaptive MHC diversity in salmon populations have focused on the classical class II DAB or DAA loci, as these have been most amenable to single PCR amplifications due to their relatively low level of sequence divergence. Herein, we report the characterization of the classical class I UBA α2 locus based on collections taken throughout the species range of sockeye salmon (Oncorhynchus nerka). Through use of multiple lineage-specific primer sets, denaturing gradient gel electrophoresis and sequencing, we identified thirty-four alleles from three highly divergent lineages. Sequence identity between lineages ranged from 30.0% to 56.8% but was relatively high within lineages. Allelic identity within the antigen recognition site (ARS) was greater than for the longer sequence. Global positive selection on UBA was seen at the sequence level (dN:dS = 1.012) with four codons under positive selection and 12 codons under negative selection. Crown Copyright © 2011. Published by Elsevier Ltd. All rights reserved.
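The abstract above reports sequence identity between lineages as percentages but does not specify how gaps were treated. The sketch below shows one common convention, assumed here for illustration: percent identity over aligned columns in which neither sequence has a gap. The function name and the example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences.

    Assumption: columns where either sequence has a gap are excluded
    from both the numerator and the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment: 5 ungapped columns, 4 of them identical.
print(percent_identity("ACGT-A", "ACCTTA"))  # 80.0
```

Other conventions (for example, dividing by the full alignment length or by the shorter sequence length) give different values, so identity figures are only comparable when the same convention is used.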
Estimation of primate speciation dates using local molecular clocks.
Yoder, A D; Yang, Z
2000-07-01
Protein-coding genes of the mitochondrial genomes from 31 mammalian species were analyzed to estimate the speciation dates within primates and also between rats and mice. Three calibration points were used based on paleontological data: one at 20-25 MYA for the hominoid/cercopithecoid divergence, one at 53-57 MYA for the cetacean/artiodactyl divergence, and the third at 110-130 MYA for the metatherian/eutherian divergence. Both the nucleotide and the amino acid sequences were analyzed, producing conflicting results. The global molecular clock was clearly violated for both the nucleotide and the amino acid data. Models of local clocks were implemented using maximum likelihood, allowing different evolutionary rates for some lineages while assuming rate constancy in others. Surprisingly, the highly divergent third codon positions appeared to contain phylogenetic information and produced more sensible estimates of primate divergence dates than did the amino acid sequences. Estimated dates varied considerably depending on the data type, the calibration point, and the substitution model but differed little among the four tree topologies used. We conclude that the calibration derived from the primate fossil record is too recent to be reliable; we also point out a number of problems in date estimation when the molecular clock does not hold. Despite these obstacles, we derived estimates of primate divergence dates that were well supported by the data and were generally consistent with the paleontological record. Estimation of the mouse-rat divergence date, however, was problematic.
Pannacciulli, Federica G; Maltagliati, Ferruccio; de Guttry, Christian; Achituv, Yair
2017-01-01
The model marine broadcast-spawner barnacle Chthamalus montagui was investigated to understand its genetic structure and quantify levels of population divergence, and to make inferences about historical demography in terms of time of divergence and changes in population size. We collected specimens from rocky shores of the north-east Atlantic Ocean (4 locations), Mediterranean Sea (8) and Black Sea (1). The 312 sequences (537 bp) of the mitochondrial cytochrome c oxidase I allowed the detection of 130 haplotypes. High within-location genetic variability was recorded, with haplotype diversity ranging between h = 0.750 and 0.967. Parameters of genetic divergence, haplotype network and Bayesian assignment analysis were consistent in rejecting the hypothesis of panmixia. C. montagui is genetically structured in three geographically discrete populations, which corresponded to north-eastern Atlantic Ocean, western-central Mediterranean Sea, and Aegean Sea-Black Sea. These populations are separated by two main effective barriers to gene flow located at the Almeria-Oran Front and in correspondence of the Cyclades Islands. According to the 'isolation with migration' model, adjacent population pairs diverged during the early to middle Pleistocene transition, a period in which geological events provoked significant changes in the structure and composition of palaeocommunities. Mismatch distributions, neutrality tests and Bayesian skyline plots showed past population expansions, which started approximately in the Mindel-Riss interglacial, in which ecological conditions were favourable for temperate species and calcium-uptaking marine organisms.
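The haplotype diversity h quoted in the abstract above is commonly computed with Nei's estimator, h = n/(n-1) × (1 − Σ p_i²), where n is the sample size and p_i is the frequency of haplotype i. The abstract does not state which estimator or software was used, so the sketch below is an assumption-labelled illustration with hypothetical haplotype labels.

```python
from collections import Counter
from typing import Hashable, Sequence


def haplotype_diversity(haplotypes: Sequence[Hashable]) -> float:
    """Nei's haplotype (gene) diversity: n/(n-1) * (1 - sum(p_i^2)).

    haplotypes: one label per sampled individual.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two samples are required")
    counts = Counter(haplotypes)
    sum_sq_freq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq_freq)


# Hypothetical sample: two haplotypes at equal frequency among four individuals.
print(round(haplotype_diversity(["H1", "H1", "H2", "H2"]), 4))  # 0.6667
```

The estimator is 0 when every individual carries the same haplotype and 1 when every individual carries a different one, so values such as 0.750 to 0.967 indicate that most sampled individuals within a location differ.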
Field, Mark C.; Adung’a, Vincent; Obado, Samson; Chait, Brian T.; Rout, Michael P.
2014-01-01
SUMMARY Trypanosomatids represent the causative agents of major diseases in humans, livestock and plants, with inevitable suffering and economic hardship as a result. They are also evolutionarily highly divergent organisms, and the many unique aspects of trypanosome biology provide opportunities in terms of identification of drug targets, the challenge of exploiting these putative targets, and at the same time significant scope for exploration of novel and divergent cell biology. We can estimate from genome sequences that the degree of divergence of trypanosomes from animals and fungi is extreme, with perhaps one third to one half of predicted trypanosome proteins having no known function based on homology or recognizable protein domains/architecture. Two highly important aspects of trypanosome biology are the flagellar pocket and the nuclear envelope, where in silico analysis clearly suggests great potential divergence in the proteome. The flagellar pocket is the sole site of endo- and exocytosis in trypanosomes and plays important roles in immune evasion via variant surface glycoprotein (VSG) trafficking and providing a location for sequestration of various invariant receptors. The trypanosome nuclear envelope has been largely unexplored, but by analogy with higher eukaryotes, roles in the regulation of chromatin and most significantly, in controlling VSG gene expression are expected. Here we discuss recent successful proteomics-based approaches towards characterization of the nuclear envelope and the endocytic apparatus, the identification of conserved and novel trypanosomatid-specific features, and the implications of these findings. PMID:22309600
Pannacciulli, Federica G.; de Guttry, Christian; Achituv, Yair
2017-01-01
The model marine broadcast-spawner barnacle Chthamalus montagui was investigated to understand its genetic structure and quantify levels of population divergence, and to make inferences about historical demography in terms of time of divergence and changes in population size. We collected specimens from rocky shores of the north-east Atlantic Ocean (4 locations), Mediterranean Sea (8) and Black Sea (1). The 312 sequences (537 bp) of the mitochondrial cytochrome c oxidase I allowed the detection of 130 haplotypes. High within-location genetic variability was recorded, with haplotype diversity ranging between h = 0.750 and 0.967. Parameters of genetic divergence, haplotype network and Bayesian assignment analysis were consistent in rejecting the hypothesis of panmixia. C. montagui is genetically structured in three geographically discrete populations, which corresponded to north-eastern Atlantic Ocean, western-central Mediterranean Sea, and Aegean Sea-Black Sea. These populations are separated by two main effective barriers to gene flow located at the Almeria-Oran Front and in correspondence of the Cyclades Islands. According to the ‘isolation with migration’ model, adjacent population pairs diverged during the early to middle Pleistocene transition, a period in which geological events provoked significant changes in the structure and composition of palaeocommunities. Mismatch distributions, neutrality tests and Bayesian skyline plots showed past population expansions, which started approximately in the Mindel-Riss interglacial, in which ecological conditions were favourable for temperate species and calcium-uptaking marine organisms. PMID:28594840
Divergence of Iron Metabolism in Wild Malaysian Yeast
Lee, Hana N.; Mostovoy, Yulia; Hsu, Tiffany Y.; Chang, Amanda H.; Brem, Rachel B.
2013-01-01
Comparative genomic studies have reported widespread variation in levels of gene expression within and between species. Using these data to infer organism-level trait divergence has proven to be a key challenge in the field. We have used a wild Malaysian population of S. cerevisiae as a test bed in the search to predict and validate trait differences based on observations of regulatory variation. Malaysian yeast, when cultured in standard medium, activated regulatory programs that protect cells from the toxic effects of high iron. Malaysian yeast also showed a hyperactive regulatory response during culture in the presence of excess iron and had a unique growth defect in conditions of high iron. Molecular validation experiments pinpointed the iron metabolism factors AFT1, CCC1, and YAP5 as contributors to these molecular and cellular phenotypes; in genome-scale sequence analyses, a suite of iron toxicity response genes showed evidence for rapid protein evolution in Malaysian yeast. Our findings support a model in which iron metabolism has diverged in Malaysian yeast as a consequence of a change in selective pressure, with Malaysian alleles shifting the dynamic range of iron response to low-iron concentrations and weakening resistance to extreme iron toxicity. By dissecting the iron scarcity specialist behavior of Malaysian yeast, our work highlights the power of expression divergence as a signpost for biologically and evolutionarily relevant variation at the organismal level. Interpreting the phenotypic relevance of gene expression variation is one of the primary challenges of modern genomics. PMID:24142925
Divergence of iron metabolism in wild Malaysian yeast.
Lee, Hana N; Mostovoy, Yulia; Hsu, Tiffany Y; Chang, Amanda H; Brem, Rachel B
2013-12-09
Comparative genomic studies have reported widespread variation in levels of gene expression within and between species. Using these data to infer organism-level trait divergence has proven to be a key challenge in the field. We have used a wild Malaysian population of S. cerevisiae as a test bed in the search to predict and validate trait differences based on observations of regulatory variation. Malaysian yeast, when cultured in standard medium, activated regulatory programs that protect cells from the toxic effects of high iron. Malaysian yeast also showed a hyperactive regulatory response during culture in the presence of excess iron and had a unique growth defect in conditions of high iron. Molecular validation experiments pinpointed the iron metabolism factors AFT1, CCC1, and YAP5 as contributors to these molecular and cellular phenotypes; in genome-scale sequence analyses, a suite of iron toxicity response genes showed evidence for rapid protein evolution in Malaysian yeast. Our findings support a model in which iron metabolism has diverged in Malaysian yeast as a consequence of a change in selective pressure, with Malaysian alleles shifting the dynamic range of iron response to low-iron concentrations and weakening resistance to extreme iron toxicity. By dissecting the iron scarcity specialist behavior of Malaysian yeast, our work highlights the power of expression divergence as a signpost for biologically and evolutionarily relevant variation at the organismal level. Interpreting the phenotypic relevance of gene expression variation is one of the primary challenges of modern genomics.
Aslamkhan, Amy G.; Thompson, Deborah M.; Perry, Jennifer L.; Bleasby, Kelly; Wolff, Natascha A.; Barros, Scott; Miller, David S.; Pritchard, John B.
2007-01-01
The flounder renal organic anion transporter (fOat) has substantial sequence homology to mammalian basolateral organic anion transporter orthologs (OAT1/Oat1 and OAT3/Oat3), suggesting that fOat may have functional properties of both mammalian forms. We therefore compared uptake of various substrates by rat Oat1 and Oat3 and human OAT1 and OAT3 with the fOat clone expressed in Xenopus oocytes. These data confirm that estrone sulfate is an excellent substrate for mammalian OAT3/Oat3 transporters but not for OAT1/Oat1 transporters. In contrast, 2,4-dichlorophenoxyacetic acid and adefovir are better transported by mammalian OAT1/Oat1 than by the OAT3/Oat3 clones. All three substrates were well transported by fOat-expressing Xenopus oocytes. fOat Km values were comparable to those obtained for mammalian OAT/Oat1/3 clones. We also characterized the ability of these substrates to inhibit uptake of the fluorescent substrate fluorescein in intact teleost proximal tubules isolated from the winter flounder (Pseudopleuronectes americanus) and killifish (Fundulus heteroclitus). The rank order of the IC50 values for inhibition of cellular fluorescein accumulation was similar to that for the Km values obtained in fOat-expressing oocytes, suggesting that fOat may be the primary teleost renal basolateral Oat. Assessment of the zebrafish (Danio rerio) genome indicated the presence of a single Oat (zfOat) with similarity to both mammalian OAT1/Oat1 and OAT3/Oat3. The puffer fish (Takifugu rubripes) also has an Oat (pfOat) similar to mammalian OAT1/Oat1 and OAT3/Oat3 members. Furthermore, phylogenetic analyses argue that the teleost Oat1/3-like genes diverged from a common ancestral gene in advance of the divergence of the mammalian OAT1/Oat1, OAT3/Oat3, and, possibly, Oat6 genes. PMID:16857889
Horn, T; Chang, C A; Urdea, M S
1997-01-01
The divergent synthesis of branched DNA (bDNA) comb structures is described. This new type of bDNA contains one unique oligonucleotide, the primary sequence, covalently attached through a comb-like branch network to many identical copies of a different oligonucleotide, the secondary sequence. The bDNA comb structures were assembled on a solid support and several synthesis parameters were investigated and optimized. The bDNA comb molecules were characterized by polyacrylamide gel electrophoretic methods and by controlled cleavage at periodate-cleavable moieties incorporated during synthesis. The developed chemistry allows synthesis of bDNA comb molecules containing multiple secondary sequences. In the accompanying article we describe the synthesis and characterization of large bDNA combs containing all four deoxynucleotides for use as signal amplifiers in nucleic acid quantification assays. PMID:9365265
Picard, François J.; Ke, Danbing; Boudreau, Dominique K.; Boissinot, Maurice; Huletsky, Ann; Richard, Dave; Ouellette, Marc; Roy, Paul H.; Bergeron, Michel G.
2004-01-01
A 761-bp portion of the tuf gene (encoding the elongation factor Tu) from 28 clinically relevant streptococcal species was obtained by sequencing amplicons generated using broad-range PCR primers. These tuf sequences were used to select Streptococcus-specific PCR primers and to perform phylogenetic analysis. The specificity of the PCR assay was verified using 102 different bacterial species, including the 28 streptococcal species. Genomic DNA purified from all streptococcal species was efficiently detected, whereas there was no amplification with DNA from 72 of the 74 nonstreptococcal bacterial species tested. There was cross-amplification with DNAs from Enterococcus durans and Lactococcus lactis. However, the 15 to 31% nucleotide sequence divergence in the 761-bp tuf portion of these two species compared to any streptococcal tuf sequence provides ample sequence divergence to allow the development of internal probes specific to streptococci. The Streptococcus-specific assay was highly sensitive for all 28 streptococcal species tested (i.e., detection limit of 1 to 10 genome copies per PCR). The tuf sequence data was also used to perform extensive phylogenetic analysis, which was generally in agreement with phylogeny determined on the basis of 16S rRNA gene data. However, the tuf gene provided a better discrimination at the streptococcal species level that should be particularly useful for the identification of very closely related species. In conclusion, tuf appears more suitable than the 16S ribosomal RNA gene for the development of diagnostic assays for the detection and identification of streptococcal species because of its higher level of species-specific genetic divergence. PMID:15297518
Rapid and Parallel Adaptive Evolution of the Visual System of Neotropical Midas Cichlid Fishes.
Torres-Dowdall, Julián; Pierotti, Michele E R; Härer, Andreas; Karagic, Nidal; Woltering, Joost M; Henning, Frederico; Elmer, Kathryn R; Meyer, Axel
2017-10-01
Midas cichlid fish are a Central American species flock containing 13 described species that has been dated to only a few thousand years old, a historical timescale infrequently associated with speciation. Their radiation involved the colonization of several clear water crater lakes from two turbid great lakes. Therefore, Midas cichlids have been subjected to widely varying photic conditions during their radiation. Being a primary signal relay for information from the environment to the organism, the visual system is under continuing selective pressure and a prime organ system for accumulating adaptive changes during speciation, particularly in the case of dramatic shifts in photic conditions. Here, we characterize the full visual system of Midas cichlids at organismal and genetic levels, to determine what types of adaptive changes evolved within the short time span of their radiation. We show that Midas cichlids have a diverse visual system with unexpectedly high intra- and interspecific variation in color vision sensitivity and lens transmittance. Midas cichlid populations in the clear crater lakes have convergently evolved visual sensitivities shifted toward shorter wavelengths compared with the ancestral populations from the turbid great lakes. This divergence in sensitivity is driven by changes in chromophore usage, differential opsin expression, opsin coexpression, and to a lesser degree by opsin coding sequence variation. The visual system of Midas cichlids has the evolutionary capacity to rapidly integrate multiple adaptations to changing light environments. Our data may indicate that, in early stages of divergence, changes in opsin regulation could precede changes in opsin coding sequence evolution.
Maruyama, Sandra Regina; Castro-Jorge, Luiza Antunes; Ribeiro, José Marcos Chaves; Gardinassi, Luiz Gustavo; Garcia, Gustavo Rocha; Brandão, Lucinda Giampietro; Rodrigues, Aline Rezende; Okada, Marcos Ituo; Abrão, Emiliana Pereira; Ferreira, Beatriz Rossetti; da Fonseca, Benedito Antonio Lopes; de Miranda-Santos, Isabel Kinney Ferreira
2013-01-01
Transcripts similar to those that encode the nonstructural (NS) proteins NS3 and NS5 from flaviviruses were found in a salivary gland (SG) complementary DNA (cDNA) library from the cattle tick Rhipicephalus microplus. Tick extracts were cultured with cells to enable the isolation of viruses capable of replicating in cultured invertebrate and vertebrate cells. Deep sequencing of the viral RNA isolated from culture supernatants provided the complete coding sequences for the NS3 and NS5 proteins and their molecular characterisation confirmed similarity with the NS3 and NS5 sequences from other flaviviruses. Despite this similarity, phylogenetic analyses revealed that this potentially novel virus may be a highly divergent member of the genus Flavivirus. Interestingly, we detected the divergent NS3 and NS5 sequences in ticks collected from several dairy farms widely distributed throughout three regions of Brazil. This is the first report of flavivirus-like transcripts in R. microplus ticks. This novel virus is a potential arbovirus because it replicated in arthropod and mammalian cells; furthermore, it was detected in a cDNA library from tick SGs and therefore may be present in tick saliva. It is important to determine whether and by what means this potential virus is transmissible and to monitor the virus as a potential emerging tick-borne zoonotic pathogen. PMID:24626302
Candida ruelliae sp. nov., a novel yeast species isolated from flowers of Ruellia sp. (Acanthaceae).
Saluja, Puja; Prasad, Gandham S
2008-06-01
Two novel yeast strains designated as 16Q1 and 16Q3 were isolated from flowers of the Ruellia species of the Acanthaceae family. The D1/D2 domain and ITS sequences of these two strains were identical. Sequence analysis of the D1/D2 domain of large-subunit rRNA gene indicated their relationship to species of the Candida haemulonii cluster. However, they differ from C. haemulonii by 14% nucleotide sequence divergence, from Candida pseudohaemulonii by 16.1% and from C. haemulonii type II by 16.5%. These strains also differ in 18 physiological tests from the type strain of C. haemulonii, and 12 and 16 tests, respectively, from C. pseudohaemulonii and C. haemulonii type II. They also differ from C. haemulonii and other related species by more than 13% sequence divergence in the internal transcribed spacer region. In the SSU rRNA gene sequences, strain 16Q1 differs by 1.7% nucleotide divergence from C. haemulonii. Sporulation was not observed in pure or mixed cultures on several media examined. All these data support the assignment of these strains to a novel species; we have named them as Candida ruelliae sp. nov., and designate strain 16Q1(T)=MTCC 7739(T)=CBS10815(T) as type strain of the novel species.
Next generation sequencing and analysis of a conserved transcriptome of New Zealand's kiwi.
Subramanian, Sankar; Huynen, Leon; Millar, Craig D; Lambert, David M
2010-12-15
Kiwi is a highly distinctive, flightless and endangered ratite bird endemic to New Zealand. To understand the patterns of molecular evolution of the nuclear protein-coding genes in brown kiwi (Apteryx australis mantelli) and to determine the timescale of avian history we sequenced a transcriptome obtained from a kiwi embryo using next generation sequencing methods. We then assembled the conserved protein-coding regions using the chicken proteome as a scaffold. Using 1,543 conserved protein coding genes we estimated the neutral evolutionary divergence between the kiwi and chicken to be ~45%, which is approximately equal to the divergence computed for the human-mouse pair using the same set of genes. A large fraction of genes was found to be under high selective constraint, as most of the expressed genes appeared to be involved in developmental gene regulation. Our study suggests a significant relationship between gene expression levels and protein evolution. Using sequences from over 700 nuclear genes we estimated the divergence between the two basal avian groups, Palaeognathae and Neognathae to be 132 million years, which is consistent with previous studies using mitochondrial genes. The results of this investigation revealed patterns of mutation and purifying selection in conserved protein coding regions in birds. Furthermore this study suggests a relatively cost-effective way of obtaining a glimpse into the fundamental molecular evolutionary attributes of a genome, particularly when no closely related genomic sequence is available.
Govindarajulu, Rajanikanth; Hughes, Colin E; Alexander, Patrick J; Bailey, C Donovan
2011-12-01
The evolutionary history of Leucaena has been impacted by polyploidy, hybridization, and divergent allopatric species diversification, suggesting that this is an ideal group to investigate the evolutionary tempo of polyploidy and the complexities of reticulation and divergence in plant diversification. Parsimony- and ML-based phylogenetic approaches were applied to 105 accessions sequenced for six sequence characterized amplified region-based nuclear encoded loci, nrDNA ITS, and four cpDNA regions. Hypotheses for the origin of tetraploid species were inferred using results derived from a novel species tree and established gene tree methods and from data on genome sizes and geographic distributions. The combination of comprehensively sampled multilocus DNA sequence data sets and a novel methodology provide strong resolution and support for the origins of all five tetraploid species. A minimum of four allopolyploidization events are required to explain the origins of these species. The origin(s) of one tetraploid pair (L. involucrata/L. pallida) can be equally explained by two unique allopolyploidizations or a single event followed by divergent speciation. Alongside other recent findings, a comprehensive picture of the complex evolutionary dynamics of polyploidy in Leucaena is emerging that includes paleotetraploidization, diploidization of the last common ancestor to Leucaena, allopatric divergence among diploids, and recent allopolyploid origins for tetraploid species likely associated with human translocation of seed. These results provide insights into the role of divergence and reticulation in a well-characterized angiosperm lineage and into traits of diploid parents and derived tetraploids (particularly self-compatibility and year-round flowering) favoring the formation and establishment of novel tetraploids combinations.
Hornok, Sándor; Wang, Yuanzhi; Otranto, Domenico; Keskin, Adem; Lia, Riccardo Paolo; Kontschán, Jenő; Takács, Nóra; Farkas, Róbert; Sándor, Attila D
2016-12-15
Haemaphysalis erinacei is one of the few ixodid tick species for which valid names of subspecies exist. Despite their disputed taxonomic status in the literature, these subspecies have not yet been compared with molecular methods. The aim of the present study was to investigate the phylogenetic relationships of H. erinacei subspecies, in the context of the first finding of this tick species in Romania. After morphological identification, DNA was extracted from five adults of H. e. taurica (from Romania and Turkey), four adults of H. e. erinacei (from Italy) and 17 adults of H. e. turanica (from China). From these samples fragments of the cytochrome c oxidase subunit 1 (cox1) and 16S rRNA genes were amplified via PCR and sequenced. Results showed that cox1 and 16S rRNA gene sequence divergences between H. e. taurica from Romania and H. e. erinacei from Italy were below 2%. However, the sequence divergences between H. e. taurica from Romania and H. e. turanica from China were high (up to 7.3% difference for the 16S rRNA gene), exceeding the reported level of sequence divergence between closely related tick species. At the same time, two adults of H. e. taurica from Turkey had higher 16S rRNA gene similarity to H. e. turanica from China (up to 97.5%) than to H. e. taurica from Romania (96.3%), but phylogenetically clustered more closely to H. e. taurica than to H. e. turanica. This is the first finding of H. erinacei in Romania, and the first (although preliminary) phylogenetic comparison of H. erinacei subspecies. Phylogenetic analyses did not support that the three H. erinacei subspecies evaluated here are of equal taxonomic rank, because the genetic divergence between H. e. turanica from China and H. e. taurica from Romania exceeded the usual level of sequence divergence between closely related tick species, suggesting that they might represent different species. Therefore, the taxonomic status of the subspecies of H. erinacei needs to be revised based on a larger number of specimens collected throughout its geographical range.
Phylogenetic analysis of Demodex caprae based on mitochondrial 16S rDNA sequence.
Zhao, Ya-E; Hu, Li; Ma, Jun-Xian
2013-11-01
Demodex caprae infests the hair follicles and sebaceous glands of goats worldwide, which not only seriously impairs goat farming, but also causes a big economic loss. However, there are few reports on the DNA level of D. caprae. To reveal the taxonomic position of D. caprae within the genus Demodex, the present study conducted phylogenetic analysis of D. caprae based on mt16S rDNA sequence data. D. caprae adults and eggs were obtained from a skin nodule of the goat suffering demodicidosis. The mt16S rDNA sequences of individual mite were amplified using specific primers, and then cloned, sequenced, and aligned. The sequence divergence, genetic distance, and transition/transversion rate were computed, and the phylogenetic trees in Demodex were reconstructed. Results revealed the 339-bp partial sequences of six D. caprae isolates were obtained, and the sequence identity was 100% among isolates. The pairwise divergences between D. caprae and Demodex canis or Demodex folliculorum or Demodex brevis were 22.2-24.0%, 24.0-24.9%, and 22.9-23.2%, respectively. The corresponding average genetic distances were 2.840, 2.926, and 2.665, and the average transition/transversion rates were 0.70, 0.55, and 0.54, respectively. The divergences, genetic distances, and transition/transversion rates of D. caprae versus the other three species all reached interspecies level. The five phylogenetic trees all presented that D. caprae clustered with D. brevis first, and then with D. canis, D. folliculorum, and Demodex injai in sequence. In conclusion, D. caprae is an independent species, and it is closer to D. brevis than to D. canis, D. folliculorum, or D. injai.
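The divergence and transition/transversion figures reported in the abstract above come from pairwise comparison of aligned sequences. As a minimal sketch only (the function name `compare_aligned` and the toy sequences are hypothetical, and this is not the software used in the study), uncorrected pairwise divergence and transition/transversion counts could be computed along these lines:

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def compare_aligned(seq_a, seq_b):
    """Return (p_distance, transitions, transversions) for two aligned DNA strings.

    Columns containing a gap or any character other than A, C, G, T are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # gap or ambiguous base: not counted
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; every other change is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable sites")
    p_distance = (transitions + transversions) / compared
    return p_distance, transitions, transversions


# Toy example (made-up sequences, not data from the study).
p, ts, tv = compare_aligned("ACGTACGTAC", "ACGTGCGTCC")
print(f"divergence = {p:.1%}, transitions = {ts}, transversions = {tv}")
```

The ratio `ts / tv` gives a simple transition/transversion rate when `tv` is non-zero; published analyses typically apply a substitution model rather than raw counts.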
Hybridization Reveals the Evolving Genomic Architecture of Speciation
Kronforst, Marcus R.; Hansen, Matthew E.B.; Crawford, Nicholas G.; Gallant, Jason R.; Zhang, Wei; Kulathinal, Rob J.; Kapan, Durrell D.; Mullen, Sean P.
2014-01-01
SUMMARY The rate at which genomes diverge during speciation is unknown, as are the physical dynamics of the process. Here, we compare full genome sequences of 32 butterflies, representing five species from a hybridizing Heliconius butterfly community, to examine genome-wide patterns of introgression and infer how divergence evolves during the speciation process. Our analyses reveal that initial divergence is restricted to a small fraction of the genome, largely clustered around known wing-patterning genes. Over time, divergence evolves rapidly, due primarily to the origin of new divergent regions. Furthermore, divergent genomic regions display signatures of both selection and adaptive introgression, demonstrating the link between microevolutionary processes acting within species and the origin of species across macroevolutionary timescales. Our results provide a uniquely comprehensive portrait of the evolving species boundary due to the role that hybridization plays in reducing the background accumulation of divergence at neutral sites. PMID:24183670
Dissecting the relationship between protein structure and sequence variation
NASA Astrophysics Data System (ADS)
Shahmoradi, Amir; Wilke, Claus; Wilke Lab Team
2015-03-01
Over the past decade several independent works have shown that some structural properties of proteins are capable of predicting protein evolution. The strength and significance of these structure-sequence relations, however, appear to vary widely among different proteins, with absolute correlation strengths ranging from 0.1 to 0.8. Here we present the results from a comprehensive search for the potential biophysical and structural determinants of protein evolution by studying more than 200 structural and evolutionary properties in a dataset of 209 monomeric enzymes. We discuss the main protein characteristics responsible for the general patterns of protein evolution, and identify sequence divergence as the main determinant of the strengths of virtually all structure-evolution relationships, explaining ~10-30% of observed variation in sequence-structure relations. In addition to sequence divergence, we identify several protein structural properties that are moderately but significantly coupled with the strength of sequence-structure relations. In particular, proteins with more homogeneous backbone hydrogen bond energies, large fractions of helical secondary structures and low fraction of beta sheets tend to have the strongest sequence-structure relation. BEACON-NSF center for the study of evolution in action.
BLAST and FASTA similarity searching for multiple sequence alignment.
Pearson, William R
2014-01-01
BLAST, FASTA, and other similarity searching programs seek to identify homologous proteins and DNA sequences based on excess sequence similarity. If two sequences share much more similarity than expected by chance, the simplest explanation for the excess similarity is common ancestry, that is, homology. The most effective similarity searches compare protein sequences, rather than DNA sequences, for sequences that encode proteins, and use expectation values, rather than percent identity, to infer homology. The BLAST and FASTA packages of sequence comparison programs provide programs for comparing protein and DNA sequences to protein databases (the most sensitive searches). Protein and translated-DNA comparisons to protein databases routinely allow evolutionary look back times from 1 to 2 billion years; DNA:DNA searches are 5-10-fold less sensitive. BLAST and FASTA can be run on popular web sites, but can also be downloaded and installed on local computers. With local installation, target databases can be customized for the sequence data being characterized. With today's very large protein databases, search sensitivity can also be improved by searching smaller comprehensive databases, for example, a complete protein set from an evolutionarily neighboring model organism. By default, BLAST and FASTA use scoring strategies targeted for distant evolutionary relationships; for comparisons involving short domains or queries, or searches that seek relatively close homologs (e.g. mouse-human), shallower scoring matrices will be more effective. Both BLAST and FASTA provide very accurate statistical estimates, which can be used to reliably identify protein sequences that diverged more than 2 billion years ago.
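The abstract's preference for expectation values over percent identity can be illustrated with the standard Karlin-Altschul relation E = m * n * 2^(-S'), where S' is the bit score, m the query length, and n the total database length. The sketch below is a simplification that ignores the effective-length corrections real search programs apply; the function name `evalue_from_bits` and all numbers are illustrative assumptions, not output from BLAST or FASTA.

```python
def evalue_from_bits(bit_score, query_length, database_length):
    """Expected number of chance alignments scoring at least `bit_score`.

    Uses E = m * n * 2**(-S'), with no edge-effect (effective length) correction.
    """
    if query_length <= 0 or database_length <= 0:
        raise ValueError("lengths must be positive")
    return query_length * database_length * 2.0 ** (-bit_score)


# Same alignment (same bit score), two hypothetical database sizes.
small_db = evalue_from_bits(50, query_length=300, database_length=10**7)
large_db = evalue_from_bits(50, query_length=300, database_length=10**9)
print(f"E (small database) = {small_db:.2e}")
print(f"E (large database) = {large_db:.2e}")
```

Because E scales linearly with the size of the search space, the same alignment is 100 times more significant in the smaller database here, which is consistent with the abstract's remark that searching a smaller comprehensive database can improve sensitivity.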
Asamizu, Erika; Nakamura, Yasukazu; Sato, Shusei; Tabata, Satoshi
2004-02-01
To perform a comprehensive analysis of genes expressed in a model legume, Lotus japonicus, a total of 74472 3'-end expressed sequence tags (EST) were generated from cDNA libraries produced from six different organs. Clustering of sequences was performed with an identity criterion of 95% for 50 bases, and a total of 20457 non-redundant sequences, 8503 contigs and 11954 singletons were generated. EST sequence coverage was analyzed by using the annotated L. japonicus genomic sequence and 1093 of the 1889 predicted protein-encoding genes (57.9%) were hit by the EST sequence(s). Gene content was compared to several plant species. Among the 8503 contigs, 471 were identified as sequences conserved only in leguminous species and these included several disease resistance-related genes. This suggested that in legumes, these genes may have evolved specifically to resist pathogen attack. The rate of gene sequence divergence was assessed by comparing similarity level and functional category based on the Gene Ontology (GO) annotation of Arabidopsis genes. This revealed that genes encoding ribosomal proteins, as well as those related to translation, photosynthesis, and cellular structure were more abundantly represented in the highly conserved class, and that genes encoding transcription factors and receptor protein kinases were abundantly represented in the less conserved class. To make the sequence information and the cDNA clones available to the research community, a Web database with useful services was created at http://www.kazusa.or.jp/en/plant/lotus/EST/.
Full-genome sequence and analysis of a novel human rhinovirus strain within a divergent HRV-A clade.
Rathe, Jennifer A; Liu, Xinyue; Tallon, Luke J; Gern, James E; Liggett, Stephen B
2010-01-01
Genome sequences of human rhinoviruses (HRV) have primarily been from stocks collected in the 1960s, with genomes and phylogeny of modern HRVs remaining undefined. Here, two modern isolates (hrv-A101 and hrv-A101-v1) collected approximately 8 years apart were sequenced in their entirety. Incorporation into our full-genome HRV alignment with subsequent phylogenetic network inference indicated that these represent a unique HRV-A, localized within a distinct divergent clade. They appear to have resulted from recombination of the hrv-65 and hrv-78 lineages. These results support our contention that there are unrecognized distinct HRV-A strains, and that recombination is evident in currently circulating strains.
Gonzalez, P; Barroso, G; Labarère, J
1998-10-05
The Basidiomycota Agrocybe aegerita (Aa) mitochondrial cox1 gene (6790 nucleotides), encoding a protein of 527aa (58377Da), is split by four large subgroup IB introns possessing site-specific endonucleases assumed to be involved in intron mobility. When compared to other fungal COX1 proteins, the Aa protein is closely related to the COX1 one of the Basidiomycota Schizophyllum commune (Sc). This clade reveals a relationship with the studied Ascomycota ones, with the exception of Schizosaccharomyces pombe (Sp) which ranges in an out-group position compared with both higher fungi divisions. When comparison is extended to other kingdoms, fungal COX1 sequences are found to be more related to algae and plant ones (more than 57.5% aa similarity) than to animal sequences (53.6% aa similarity), contrasting with the previously established close relationship between fungi and animals, based on comparisons of nuclear genes. The four Aa cox1 introns are homologous to Ascomycota or algae cox1 introns sharing the same location within the exonic sequences. The percentages of identity of the intronic nucleotide sequences suggest a possible acquisition by lateral transfers of ancestral copies or of their derived sequences. These identities extend over the whole intronic sequences, arguing in favor of a transfer of the complete intron rather than a transfer limited to the encoded ORF. The intron i4 shares 74% of identity, at the nucleotide level, with the Podospora anserina (Pa) intron i14, and up to 90.5% of aa similarity between the encoded proteins, i.e. the highest values reported to date between introns of two phylogenetically distant species. This low divergence argues for a recent lateral transfer between the two species. On the contrary, the low sequence identities (below 36%) observed between Aa i1 and the homologous Sp i1 or Prototheca wickerhamii (Pw) i1 suggest a long evolution time after the separation of these sequences. The introns i2 and i3 possessed intermediate percentages of identity with their homologous Ascomycota introns. This is the first report of the complete nucleotide sequence and molecular organization of a mitochondrial cox1 gene of any member of the Basidiomycota division.
2010-01-01
Background: Comparative sequence analysis of complex loci such as resistance gene analog clusters allows estimating the degree of sequence conservation and mechanisms of divergence at the intraspecies level. In banana (Musa sp.), two diploid wild species Musa acuminata (A genome) and Musa balbisiana (B genome) contribute to the polyploid genome of many cultivars. The M. balbisiana species is associated with vigour and tolerance to pests and disease and little is known on the genome structure and haplotype diversity within this species. Here, we compare two genomic sequences of 253 and 223 kb corresponding to two haplotypes of the RGA08 resistance gene analog locus in M. balbisiana "Pisang Klutuk Wulung" (PKW). Results: Sequence comparison revealed two regions of contrasting features. The first is a highly colinear gene-rich region where the two haplotypes diverge only by single nucleotide polymorphisms and two repetitive element insertions. The second corresponds to a large cluster of RGA08 genes, with 13 and 18 predicted RGA genes and pseudogenes spread over 131 and 152 kb respectively on each haplotype. The RGA08 cluster is enriched in repetitive element insertions, in duplicated non-coding intergenic sequences including low complexity regions and shows structural variations between haplotypes. Although some allelic relationships are retained, a large diversity of RGA08 genes occurs in this single M. balbisiana genotype, with several RGA08 paralogs specific to each haplotype. The RGA08 gene family has evolved by mechanisms of unequal recombination, intragenic sequence exchange and diversifying selection. An unequal recombination event taking place between duplicated non-coding intergenic sequences resulted in a different RGA08 gene content between haplotypes pointing out the role of such duplicated regions in the evolution of RGA clusters. Based on the synonymous substitution rate in coding sequences, we estimated a 1 million year divergence time for these M. balbisiana haplotypes. Conclusions: A large RGA08 gene cluster identified in wild banana corresponds to a highly variable genomic region between haplotypes surrounded by conserved flanking regions. High level of sequence identity (70 to 99%) of the genic and intergenic regions suggests a recent and rapid evolution of this cluster in M. balbisiana. PMID:20637079
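The abstract above dates the haplotype split from synonymous substitutions. Under a strict molecular clock this is commonly written T = Ks / (2r), where Ks is the number of synonymous substitutions per synonymous site between the two sequences and r is the substitution rate per site per year; the factor 2 accounts for both lineages accumulating changes. The sketch below only illustrates that arithmetic: the abstract reports neither Ks nor r, so the values used here are placeholders chosen to give a round number, and `divergence_time_years` is a hypothetical helper name.

```python
def divergence_time_years(ks, rate_per_site_per_year):
    """Divergence time under a strict clock: T = Ks / (2 * r)."""
    if ks < 0:
        raise ValueError("Ks must be non-negative")
    if rate_per_site_per_year <= 0:
        raise ValueError("rate must be positive")
    return ks / (2.0 * rate_per_site_per_year)


# Placeholder inputs (assumptions, not values taken from the study).
example_ks = 0.009      # synonymous substitutions per synonymous site
example_rate = 4.5e-9   # substitutions per site per year
t = divergence_time_years(example_ks, example_rate)
print(f"estimated divergence time: {t / 1e6:.2f} million years")
```

The estimate is inversely proportional to the assumed rate, so halving r doubles the inferred divergence time.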
Laskar, Boni A.; Bhattacharjee, Maloyjo J.; Dhar, Bishal; Mahadani, Pradosh; Kundu, Shantanu; Ghosh, Sankar K.
2013-01-01
Background: The taxonomic validity of Northeast Indian endemic Mahseer species, Tor progeneius and Neolissochilus hexastichus, has been argued repeatedly. This is mainly due to disagreements in recognizing the species based on morphological characters. Consequently, both the species have been concealed for many decades. DNA barcoding has become a promising and an independent technique for accurate species level identification. Therefore, utilization of such technique in association with the traditional morphotaxonomic description can resolve the species dilemma of this important group of sport fishes. Methodology/Principal Findings: Altogether, 28 mahseer specimens including paratypes were studied from different locations in Northeast India, and 24 morphometric characters were measured invariably. The Principal Component Analysis with morphometric data revealed five distinct groups of sample that were taxonomically categorized into 4 species, viz., Tor putitora, T. progeneius, Neolissochilus hexagonolepis and N. hexastichus. Analysis with a dataset of 76 DNA barcode sequences of different mahseer species exhibited that the queries of T. putitora and N. hexagonolepis clustered cohesively with the respective conspecific database sequences maintaining 0.8% maximum K2P divergence. The closest congeneric divergence was 3 times higher than the mean conspecific divergence and was considered as barcode gap. The maximum divergence among the samples of T. progeneius and T. putitora was 0.8% that was much below the barcode gap, indicating them being synonymous. The query sequences of N. hexastichus invariably formed a discrete and a congeneric clade with the database sequences and maintained the interspecific divergence that supported its distinct species status. Notably, N. hexastichus was encountered in a single site and seemed to be under threat. Conclusion: This study substantiated the identification of N. hexastichus to be a true species, and tentatively regarded T. progeneius to be a synonym of T. putitora. It would guide the conservationists to initiate priority conservation of N. hexastichus and T. putitora. PMID:23341979
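The K2P divergence cited in the abstract above refers to the Kimura two-parameter distance, d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q), where P and Q are the proportions of sites showing transitions and transversions. The following is a minimal sketch of that formula under the assumption that the counts have already been obtained from an alignment; the function name `k2p_distance` and the counts are hypothetical and do not reproduce the study's data or software.

```python
import math


def k2p_distance(n_transitions, n_transversions, n_sites):
    """Kimura two-parameter distance from transition/transversion counts."""
    if n_sites <= 0:
        raise ValueError("n_sites must be positive")
    p = n_transitions / n_sites      # proportion of transitional differences (P)
    q = n_transversions / n_sites    # proportion of transversional differences (Q)
    term1 = 1.0 - 2.0 * p - q
    term2 = 1.0 - 2.0 * q
    if term1 <= 0 or term2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term1) - 0.25 * math.log(term2)


# Illustrative counts: 4 transitions and 4 transversions over 1000 aligned sites.
d = k2p_distance(4, 4, 1000)
print(f"K2P distance = {d:.4%}")
```

For closely related sequences the corrected distance is only slightly larger than the raw proportion of differing sites, which is why within-species barcode divergences below 1% are nearly identical under either measure.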
Nilsson, Maria A; Härlid, Anna; Kullberg, Morgan; Janke, Axel
2010-05-01
The native rodents are the most species-rich placental mammal group on the Australian continent. Fossils of native Australian rodents belonging to the group Conilurini are known from Northern Australia at 4.5Ma. These fossil assemblages already display a rich diversity of rodents, but the exact timing of their arrival on the Australian continent is not yet established. The complete mitochondrial genomes of two native Australian rodents, Leggadina lakedownensis (Lakeland Downs mouse) and Pseudomys chapmani (Western Pebble-mound mouse) were sequenced for investigating their evolutionary history. The molecular data were used for studying the phylogenetic position and divergence times of the Australian rodents, using 12 calibration points and various methods. Phylogenetic analyses place the native Australian rodents as the sister-group to the genus Mus. The Mus-Conilurini calibration point (7.3-11.0Ma) is highly critical for estimating rodent divergence times, while the influence of the different algorithms on estimating divergence times is negligible. The influence of the data type was investigated, indicating that amino acid data are more likely to reflect the correct divergence times than nucleotide sequences. The study of the problems related to estimating divergence times in fast-evolving lineages such as rodents emphasizes the choice of data and calibration points as being critical. Furthermore, it is essential to include accurate calibration points for fast-evolving groups, because the divergence times can otherwise be estimated to be significantly older. The divergence times of the Australian rodents are highly congruent and are estimated to 6.5-7.2Ma, a date that is compatible with their fossil record.
Brettanomyces acidodurans sp. nov., a new acetic acid producing yeast species from olive oil.
Péter, Gábor; Dlauchy, Dénes; Tóbiás, Andrea; Fülöp, László; Podgoršek, Martina; Čadež, Neža
2017-05-01
Two yeast strains representing a hitherto undescribed yeast species were isolated from olive oil and spoiled olive oil originating from Spain and Israel, respectively. Both strains are strong acetic acid producers, equipped with considerable tolerance to acetic acid. The cultures are not short-lived. Cellobiose is fermented as well as several other sugars. The sequences of their large subunit (LSU) rRNA gene D1/D2 domain are very divergent from the sequences available in the GenBank. They differ from the closest hit, Brettanomyces naardenensis by about 27%, mainly substitutions. Sequence analyses of the concatenated dataset from genes of the small subunit (SSU) rRNA, LSU rRNA and translation elongation factor-1α (EF-1α) placed the two strains as an early diverging member of the Brettanomyces/Dekkera clade with high bootstrap support. Sexual reproduction was not observed. The name Brettanomyces acidodurans sp. nov. (holotype: NCAIM Y.02178 T ; isotypes: CBS 14519 T = NRRL Y-63865 T = ZIM 2626 T , MycoBank no.: MB 819608) is proposed for this highly divergent new yeast species.
Inácio, Vera; Rocheta, Margarida; Morais-Cecílio, Leonor
2014-01-01
The 35S ribosomal DNA (rDNA) units, repeated in tandem at one or more chromosomal loci, are separated by an intergenic spacer (IGS) containing functional elements involved in the regulation of transcription of downstream rRNA genes. In the present work, we have compared the IGS molecular organizations in two divergent species of Fagaceae, Fagus sylvatica and Quercus suber, aiming to comprehend the evolution of the IGS sequences within the family. Self- and cross-hybridization FISH was done on representative species of the Fagaceae. The IGS length variability and the methylation level of 18 and 25S rRNA genes were assessed in representatives of three genera of this family: Fagus, Quercus and Castanea. The intergenic spacers in Beech and Cork Oak showed similar overall organizations comprising putative functional elements needed for rRNA gene activity and containing a non-transcribed spacer (NTS), a promoter region, and a 5′-external transcribed spacer. In the NTS: the sub-repeats structure in Beech is more organized than in Cork Oak, sharing some short motifs which results in the lowest sequence similarity of the entire IGS; the AT-rich region differed in both spacers by a GC-rich block inserted in Cork Oak. The 5′-ETS is the region with the higher similarity, having nonetheless different lengths. FISH with the NTS-5′-ETS revealed fainter signals in cross-hybridization in agreement with the divergence between genera. The diversity of IGS lengths revealed variants from ∼2 kb in Fagus, and Quercus up to 5.3 kb in Castanea, and a lack of correlation between the number of variants and the number of rDNA loci in several species. Methylation of 25S Bam HI site was confirmed in all species and detected for the first time in the 18S of Q. suber and Q. faginea. These results provide important clues for the evolutionary trends of the rDNA 25S-18S IGS in the Fagaceae family. PMID:24893289
Ancient wolf lineages in India.
Sharma, Dinesh K; Maldonado, Jesus E; Jhala, Yadrendradev V; Fleischer, Robert C
2004-01-01
All previously obtained wolf (Canis lupus) and dog (Canis familiaris) mitochondrial (mt) DNA sequences fall within an intertwined and shallow clade (the 'wolf-dog' clade). We sequenced mtDNA of recent and historical samples from 45 wolves from throughout lowland peninsular India and 23 wolves from the Himalayas and Tibetan Plateau and compared these sequences with all available wolf and dog sequences. All 45 lowland Indian wolves have one of four closely related haplotypes that form a well-supported, divergent sister lineage to the wolf-dog clade. This unique lineage may have been independent for more than 400,000 years. Although seven Himalayan wolves from western and central Kashmir fall within the widespread wolf-dog clade, one from Ladakh in eastern Kashmir, nine from Himachal Pradesh, four from Nepal and two from Tibet form a very different basal clade. This lineage contains five related haplotypes that probably diverged from other canids more than 800,000 years ago, but we find no evidence of current barriers to admixture. Thus, the Indian subcontinent has three divergent, ancient and apparently parapatric mtDNA lineages within the morphologically delineated wolf. No haplotypes of either novel lineage are found within a sample of 37 Indian (or other) dogs. Thus, we find no evidence that these two taxa played a part in the domestication of canids. PMID:15101402
Amazonian phylogeography: mtDNA sequence variation in arboreal echimyid rodents (Caviomorpha).
da Silva, M N; Patton, J L
1993-09-01
Patterns of evolutionary relationships among haplotype clades of sequences of the mitochondrial cytochrome b DNA gene are examined for five genera of arboreal rodents of the Caviomorph family Echimyidae from the Amazon Basin. Data are available for 798 bp of sequence from a total of 24 separate localities in Peru, Venezuela, Bolivia, and Brazil for Mesomys, Isothrix, Makalata, Dactylomys, and Echimys. Sequence divergence, corrected for multiple hits, is extensive, ranging from less than 1% for comparisons within populations to over 20% among geographic units within genera. Both the degree of differentiation and the geographic patterning of the variation suggest that more than one species composes the Amazonian distribution of the currently recognized Mesomys hispidus, Isothrix bistriata, Makalata didelphoides, and Dactylomys dactylinus. There is general concordance in the geographic range of haplotype clades for each of these taxa, and the overall level of differentiation within them is largely equivalent. These observations suggest that a common vicariant history underlies the respective diversification of each genus. However, estimated times of divergence based on the rate of third position transversion substitutions for the major clades within each genus typically range above 1 million years. Thus, allopatric isolation precipitating divergence must have been considerably earlier than the late Pleistocene forest fragmentation events commonly invoked for Amazonian biota.
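The dating approach mentioned above (divergence time from the rate of third-position transversions) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' procedure: the sequences are assumed to be aligned, in frame, and to start at codon position 1; the function names are hypothetical; and any rate passed to `divergence_time` is a placeholder that would have to come from an external calibration.

```python
PURINES = {"A", "G"}


def third_position_transversions(seq_a, seq_b):
    """Proportion of third codon positions that differ by a transversion."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    sites = diffs = 0
    # indices 2, 5, 8, ... are third codon positions for an in-frame alignment
    for i in range(2, len(seq_a), 3):
        a, b = seq_a[i].upper(), seq_b[i].upper()
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        sites += 1
        # a transversion swaps a purine for a pyrimidine or vice versa
        if (a in PURINES) != (b in PURINES):
            diffs += 1
    if sites == 0:
        raise ValueError("no comparable third positions")
    return diffs / sites


def divergence_time(distance, rate_per_lineage):
    """Simple molecular clock: t = d / (2r).

    The factor 2 reflects that both lineages accumulate changes
    independently after the split. The time unit follows the unit
    of the rate (e.g. per site per million years).
    """
    return distance / (2.0 * rate_per_lineage)
```

The linear clock is the simplest possible choice; it applies no correction for multiple hits, which the abstract notes were corrected for in the actual analysis.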
Ren, Jindong; Du, Xue; Zeng, Tao; Chen, Li; Shen, Junda; Lu, Lizhi; Hu, Jianhong
2017-10-01
Long noncoding RNAs (lncRNAs) and divergently expressed genes exist widely in different tissues of mammals and birds, in which they are involved in various biological processes. However, there is limited information on their role in the regulation of normal biological processes during differentiation, development, and reproduction in birds. In this study, whole transcriptome strand-specific RNA sequencing of the ovary from young ducks (60 days), first-laying ducks (160 days), and old ducks, i.e., ducks that stopped laying eggs (490 days) was performed. The lncRNAs and mRNAs from these ducks were systematically analyzed and identified by duck genome sequencing in the three study groups. The transcriptome from the duck ovary comprised 15,011 protein-coding genes and 2905 lncRNAs; all the lncRNAs were identified as novel long noncoding transcripts. The comparison of transcriptome data from different study groups identified 2240 divergent transcription genes and 135 divergently expressed lncRNAs, which differed among the groups; most of them were significantly downregulated with age. Among the divergent genes, 38 genes were related to the reproductive process and 6 genes were upregulated. Further prediction analysis revealed that 52 lncRNAs were closely correlated with divergent reproductive mRNAs. More importantly, 6 remarkable lncRNAs were correlated significantly with the conversion of the ovary in different phases. Our results aid in the understanding of the divergent transcriptome of duck ovary in different phases and the underlying mechanisms that drive the specificity of protein-coding genes and lncRNAs in duck ovary.
USDA-ARS's Scientific Manuscript database
Porcine reproductive and respiratory syndrome virus (PRRSV) is widespread with a high variation in sequence and virulence among the divergent strains and causes an economically destructive disease. A viral ovarian tumor domain protease (vOTU) has been previously identified within the nonstructural protein...
New genes from old: asymmetric divergence of gene duplicates and the evolution of development.
Holland, Peter W H; Marlétaz, Ferdinand; Maeso, Ignacio; Dunwell, Thomas L; Paps, Jordi
2017-02-05
Gene duplications and gene losses have been frequent events in the evolution of animal genomes, with the balance between these two dynamic processes contributing to major differences in gene number between species. After gene duplication, it is common for both daughter genes to accumulate sequence change at approximately equal rates. In some cases, however, the accumulation of sequence change is highly uneven with one copy radically diverging from its paralogue. Such 'asymmetric evolution' seems commoner after tandem gene duplication than after whole-genome duplication, and can generate substantially novel genes. We describe examples of asymmetric evolution in duplicated homeobox genes of moths, molluscs and mammals, in each case generating new homeobox genes that were recruited to novel developmental roles. The prevalence of asymmetric divergence of gene duplicates has been underappreciated, in part, because the origin of highly divergent genes can be difficult to resolve using standard phylogenetic methods. This article is part of the themed issue 'Evo-devo in the genomics era, and the origins of morphological diversity'.
Xu, Jianping; Yan, Zhun; Guo, Hong
2009-06-01
The inheritance of mitochondrial genes and genomes is uniparental in most sexual eukaryotes. This pattern of inheritance makes mitochondrial genomes in natural populations effectively clonal. Here, we examined the mitochondrial population genetics of the emerging human pathogenic fungus Cryptococcus gattii. The DNA sequences for five mitochondrial DNA fragments were obtained from each of 50 isolates belonging to two evolutionarily divergent lineages, VGI and VGII. Our analyses revealed greater sequence diversity within VGI than within VGII, consistent with observations of the nuclear genes. The combined analyses of all five gene fragments indicated significant divergence between VGI and VGII. However, the five individual genealogies showed different relationships among the isolates, consistent with recent hybridization and mitochondrial gene transfer between the two lineages. Population genetic analyses of the multilocus data identified evidence for predominantly clonal mitochondrial population structures within both lineages. Interestingly, there were clear signatures of recombination among mitochondrial genes within the VGII lineage. Our analyses suggest historical mitochondrial genome divergence within C. gattii, but there is evidence for recent hybridization and recombination in the mitochondrial genome of this important human yeast pathogen.
Evaluating, Comparing, and Interpreting Protein Domain Hierarchies
2014-01-01
Abstract Arranging protein domain sequences hierarchically into evolutionarily divergent subgroups is important for investigating evolutionary history, for speeding up web-based similarity searches, for identifying sequence determinants of protein function, and for genome annotation. However, whether or not a particular hierarchy is optimal is often unclear, and independently constructed hierarchies for the same domain can often differ significantly. This article describes methods for statistically evaluating specific aspects of a hierarchy, for probing the criteria underlying its construction and for direct comparisons between hierarchies. Information theoretical notions are used to quantify the contributions of specific hierarchical features to the underlying statistical model. Such features include subhierarchies, sequence subgroups, individual sequences, and subgroup-associated signature patterns. Underlying properties are graphically displayed in plots of each specific feature's contributions, in heat maps of pattern residue conservation, in “contrast alignments,” and through cross-mapping of subgroups between hierarchies. Together, these approaches provide a deeper understanding of protein domain functional divergence, reveal uncertainties caused by inconsistent patterns of sequence conservation, and help resolve conflicts between competing hierarchies. PMID:24559108
Genome Sequences of Akhmeta Virus, an Early Divergent Old World Orthopoxvirus.
Gao, Jinxin; Gigante, Crystal; Khmaladze, Ekaterine; Liu, Pengbo; Tang, Shiyuyun; Wilkins, Kimberly; Zhao, Kun; Davidson, Whitni; Nakazawa, Yoshinori; Maghlakelidze, Giorgi; Geleishvili, Marika; Kokhreidze, Maka; Carroll, Darin S; Emerson, Ginny; Li, Yu
2018-05-12
Annotated whole genome sequences of three isolates of the Akhmeta virus (AKMV), a novel species of orthopoxvirus (OPXV), isolated from the Akhmeta and Vani regions of the country Georgia, are presented and discussed. The AKMV genome is similar in genomic content and structure to that of the cowpox virus (CPXV), but a lower sequence identity was found between AKMV and Old World OPXVs than between other known species of Old World OPXVs. Phylogenetic analysis showed that AKMV diverged prior to other Old World OPXV. AKMV isolates formed a monophyletic clade in the OPXV phylogeny, yet the sequence variability between AKMV isolates was higher than between the monkeypox virus strains in the Congo basin and West Africa. An AKMV isolate from Vani contained approximately six kb sequence in the left terminal region that shared a higher similarity with CPXV than with other AKMV isolates, whereas the rest of the genome was most similar to AKMV, suggesting recombination between AKMV and CPXV in a region containing several host range and virulence genes.
Evolution of nuclear rDNA ITS sequences in the Cladophora albida/sericea clade (Chlorophyta).
Bakker, F T; Olsen, J L; Stam, W T
1995-06-01
Ribosomal DNA ITS sequences were compared among 13 different species and biogeographic isolates from the monophyletic "albida/sericea clade" in the green algal genus Cladophora. Six distinct ITS sequence types were found, characterized by multiple insertions and deletions and high levels of nucleotide substitution. Conserved domains within the ITS regions indicate the presence of ITS secondary structure. Low transition/transversion ratios among the six types and nearly symmetrical tree-length frequency distributions indicate some saturation and low phylogenetic signal. Although branching order among five of the six ITS sequence types could not be resolved, estimates of ITS sequence divergence as compared with 18S divergence in a subset of the taxa suggest that the origin of the different ITS types is probably in the mid-Miocene (12 Ma ago) but that biogeographic isolates within a single ITS type (including both Pacific and Atlantic representatives) have probably dispersed on a time scale of thousands rather than millions of years.
RECOVIR Software for Identifying Viruses
NASA Technical Reports Server (NTRS)
Chakravarty, Sugoto; Fox, George E.; Zhu, Dianhui
2013-01-01
Most single-stranded RNA (ssRNA) viruses mutate rapidly to generate a large number of strains with highly divergent capsid sequences. Determining the capsid residues or nucleotides that uniquely characterize these strains is critical in understanding the strain diversity of these viruses. RECOVIR (an acronym for "recognize viruses") software predicts the strains of some ssRNA viruses from their limited sequence data. Novel phylogenetic-tree-based databases of protein or nucleic acid residues that uniquely characterize these virus strains are created. Strains of input virus sequences (partial or complete) are predicted through residue-wise comparisons with the databases. RECOVIR uses unique characterizing residues to identify automatically strains of partial or complete capsid sequences of picorna and caliciviruses, two of the most highly diverse ssRNA virus families. Partition-wise comparisons of the database residues with the corresponding residues of more than 300 complete and partial sequences of these viruses resulted in correct strain identification for all of these sequences. This study shows the feasibility of creating databases of hitherto unknown residues uniquely characterizing the capsid sequences of two of the most highly divergent ssRNA virus families. These databases enable automated strain identification from partial or complete capsid sequences of these human and animal pathogens.
Gene and domain duplication in the chordate Otx gene family: insights from amphioxus Otx.
Williams, N A; Holland, P W
1998-05-01
We report the genomic organization and deduced protein sequence of a cephalochordate member of the Otx homeobox gene family (AmphiOtx) and show its probable single-copy state in the genome. We also present molecular phylogenetic analysis indicating that there was a single ancestral Otx gene in the first chordates which was duplicated in the vertebrate lineage after it had split from the lineage leading to the cephalochordates. Duplication of a C-terminal protein domain has occurred specifically in the vertebrate lineage, strengthening the case for a single Otx gene in an ancestral chordate whose gene structure has been retained in an extant cephalochordate. Comparative analysis of protein sequences and published gene expression patterns suggests that the ancestral chordate Otx gene had roles in patterning the anterior mesendoderm and central nervous system. These roles were elaborated following Otx gene duplication in vertebrates, accompanied by regulatory and structural divergence, particularly of Otx1 descendant genes.
Hofman, Sebastian; Pabijan, Maciej; Osikowski, Artur; Litvinchuk, Spartak N; Szymura, Jacek M
2016-09-01
We present the full-length mitogenome sequences of four European water frog species: Pelophylax cypriensis, P. epeiroticus, P. kurtmuelleri and P. shqipericus. The mtDNA size varied from 17,363 to 17,895 bp, and its organization with the LPTF tRNA gene cluster preceding the 12S rRNA gene displayed the typical Neobatrachian arrangement. Maximum likelihood and Bayesian inference revealed a well-resolved mtDNA phylogeny of seven European Pelophylax species. The uncorrected p-distance among Pelophylax mitogenomes was 9.6 (range 0.01-0.13). Most divergent was the P. shqipericus mitogenome, clustering with the "P. lessonae" group, in contrast to the other three new Pelophylax mitogenomes related to the "P. bedriagae/ridibundus" lineage. The new mitogenomes resolve ambiguities of the phylogenetic placement of P. cretensis and P. epeiroticus.
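For readers unfamiliar with the uncorrected p-distance reported above, here is a small sketch of how a mean and range over all pairwise comparisons might be computed. It assumes the sequences are already aligned to equal length; the function names and the toy sequences in the usage lines are hypothetical and are not real Pelophylax data.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Uncorrected p-distance: differing sites / compared sites."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = diffs = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a != b:
            diffs += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return diffs / compared


def summarize_pairwise(seqs):
    """Return (mean, min, max) of p-distances over all pairs in a dict."""
    dists = [p_distance(seqs[x], seqs[y])
             for x, y in combinations(sorted(seqs), 2)]
    return sum(dists) / len(dists), min(dists), max(dists)


# Toy alignment for illustration only
toy = {"sp1": "ACGTACGTAC", "sp2": "ACGTACGTAA", "sp3": "TCGTACGTAA"}
mean_d, min_d, max_d = summarize_pairwise(toy)
```

The result is a proportion between 0 and 1; multiplying by 100 gives a percentage.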
Sex Chromosome Evolution in Amniotes: Applications for Bacterial Artificial Chromosome Libraries
Janes, Daniel E.; Valenzuela, Nicole; Ezaz, Tariq; Amemiya, Chris; Edwards, Scott V.
2011-01-01
Variability among sex chromosome pairs in amniotes denotes a dynamic history. Since amniotes diverged from a common ancestor, their sex chromosome pairs and, more broadly, sex-determining mechanisms have changed reversibly and frequently. These changes have been studied and characterized through the use of many tools and experimental approaches but perhaps most effectively through applications for bacterial artificial chromosome (BAC) libraries. Individual BAC clones carry 100–200 kb of sequence from one individual of a target species that can be isolated by screening, mapped onto karyotypes, and sequenced. With these techniques, researchers have identified differences and similarities in sex chromosome content and organization across amniotes and have addressed hypotheses regarding the frequency and direction of past changes. Here, we review studies of sex chromosome evolution in amniotes and the ways in which the field of research has been affected by the advent of BAC libraries. PMID:20981143
The Amphimedon queenslandica genome and the evolution of animal complexity
DOE Office of Scientific and Technical Information (OSTI.GOV)
Srivastava, Mansi; Simakov, Oleg; Chapman, Jarrod
2010-07-01
Sponges are an ancient group of animals that diverged from other metazoans over 600 million years ago. Here we present the draft genome sequence of Amphimedon queenslandica, a demosponge from the Great Barrier Reef, and show that it is remarkably similar to other animal genomes in content, structure and organization. Comparative analysis enabled by the sponge sequence reveals genomic events linked to the origin and early evolution of animals, including the appearance, expansion, and diversification of pan-metazoan transcription factor, signaling pathway, and structural genes. This diverse 'toolkit' of genes correlates with critical aspects of all metazoan body plans, and comprises cell cycle control and growth, development, somatic and germ cell specification, cell adhesion, innate immunity, and allorecognition. Notably, many of the genes associated with the emergence of animals are also implicated in cancer, which arises from defects in basic processes associated with metazoan multicellularity.
Saisawang, Chonticha; Ketterman, Albert J.
2014-01-01
Glutathione transferases (GST) are an ancient superfamily comprising a large number of paralogous proteins in a single organism. This multiplicity of GSTs has allowed the copies to diverge for neofunctionalization with proposed roles ranging from detoxication and oxidative stress response to involvement in signal transduction cascades. We performed a comparative genomic analysis using FlyBase annotations and Drosophila melanogaster GST sequences as templates to further annotate the GST orthologs in the 12 Drosophila sequenced genomes. We found that GST genes in the Drosophila subgenera have undergone repeated local duplications followed by transposition, inversion, and micro-rearrangements of these copies. The colinearity and orientations of the orthologous GST genes appear to be unique in many of the species which suggests that genomic rearrangement events have occurred multiple times during speciation. The high micro-plasticity of the genomes appears to have a functional contribution utilized for evolution of this gene family. PMID:25310450
Catalano, Sarah R; Whittington, Ian D; Donnellan, Stephen C; Bertozzi, Terry; Gillanders, Bronwyn M
2015-07-01
Dicyemids, poorly known parasites of benthic cephalopods, are one of the few phyla in which mitochondrial (mt) genome architecture departs from the typical ~16 kb circular metazoan genome. In addition to a putative circular genome, a series of mt minicircles that each comprises the mt encoded units (I-III) of the cytochrome c oxidase complex have been reported. Whether the structure of the mt minicircles is a consistent feature among dicyemid species is unknown. Here we analyse the complete cytochrome c oxidase subunit I (COI) minicircle molecule, containing the COI gene and an associated non-coding region (NCR), for ten dicyemid species, allowing for the first time comparisons between species of minicircle architecture, NCR function and inferences of minicircle replication. Divergence in COI nucleotide sequences between dicyemid species was high (average net divergence = 31.6%) while within-species diversity was lower (average net divergence = 0.2%). The NCR and putative 5' section of the COI gene were highly divergent between dicyemid species (average net nucleotide divergence of putative 5' COI section = 61.1%). No tRNA genes were found in the NCR, although palindrome sequences with the potential to form stem-loop structures were identified in some species, which may play a role in transcription or other biological processes.
CNL Disease Resistance Genes in Soybean and Their Evolutionary Divergence
Nepal, Madhav P; Benson, Benjamin V
2015-01-01
Disease resistance genes (R-genes) encode proteins involved in detecting pathogen attack and activating downstream defense molecules. Recent availability of soybean genome sequences makes it possible to examine the diversity of gene families including disease-resistant genes. The objectives of this study were to identify coiled-coil NBS-LRR (= CNL) R-genes in soybean, infer their evolutionary relationships, and assess structural as well as functional divergence of the R-genes. Profile hidden Markov models were used for sequence identification and model-based maximum likelihood was used for phylogenetic analysis, and variation in chromosomal positioning, gene clustering, and functional divergence were assessed. We identified 188 soybean CNL genes nested into four clades consistent with their orthologs in Arabidopsis. Gene clustering analysis revealed the presence of 41 gene clusters located on 13 different chromosomes. Analyses of the Ks-values and chromosomal positioning suggest duplication events occurring at varying timescales, and an extrapericentromeric positioning may have facilitated their rapid evolution. Each of the four CNL clades exhibited distinct patterns of gene expression. Phylogenetic analysis further supported the extrapericentromeric positioning effect on the divergence and retention of the CNL genes. The results are important for understanding the diversity and divergence of CNL genes in soybean, which would have implications for soybean crop improvement in the future. PMID:25922568
Functionally conserved enhancers with divergent sequences in distant vertebrates
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yang, Song; Oksenberg, Nir; Takayama, Sachiko; ...
2015-10-30
To examine the contributions of sequence and function conservation in the evolution of enhancers, we systematically identified enhancers whose sequences are not conserved among distant groups of vertebrate species, but have homologous function and are likely to be derived from a common ancestral sequence. In conclusion, our approach combined comparative genomics and epigenomics to identify potential enhancer sequences in the genomes of three groups of distantly related vertebrate species.
Genomic Sequence around Butterfly Wing Development Genes: Annotation and Comparative Analysis
Conceição, Inês C.; Long, Anthony D.; Gruber, Jonathan D.; Beldade, Patrícia
2011-01-01
Background Analysis of genomic sequence allows characterization of genome content and organization, and access beyond gene-coding regions for identification of functional elements. BAC libraries, where relatively large genomic regions are made readily available, are especially useful for species without a fully sequenced genome and can increase genomic coverage of phylogenetic and biological diversity. For example, no butterfly genome is yet available despite the unique genetic and biological properties of this group, such as diversified wing color patterns. The evolution and development of these patterns is being studied in a few target species, including Bicyclus anynana, where a whole-genome BAC library allows targeted access to large genomic regions. Methodology/Principal Findings We characterize ∼1.3 Mb of genomic sequence around 11 selected genes expressed in B. anynana developing wings. Extensive manual curation of in silico predictions, also making use of a large dataset of expressed genes for this species, identified repetitive elements and protein coding sequence, and highlighted an expansion of Alcohol dehydrogenase genes. Comparative analysis with orthologous regions of the lepidopteran reference genome allowed assessment of conservation of fine-scale synteny (with detection of new inversions and translocations) and of DNA sequence (with detection of high levels of conservation of non-coding regions around some, but not all, developmental genes). Conclusions The general properties and organization of the available B. anynana genomic sequence are similar to the lepidopteran reference, despite the more than 140 MY divergence. Our results lay the groundwork for further studies of new interesting findings in relation to both coding and non-coding sequence: 1) the Alcohol dehydrogenase expansion with higher similarity between the five tandemly-repeated B. anynana paralogs than with the corresponding B. 
mori orthologs, and 2) the high conservation of non-coding sequence around the genes wingless and Ecdysone receptor, both involved in multiple developmental processes including wing pattern formation. PMID:21909358
Horn, Susanne; Durka, Walter; Wolf, Ronny; Ermala, Aslak; Stubbe, Annegret; Stubbe, Michael; Hofreiter, Michael
2011-01-01
Background Beavers are one of the largest and ecologically most distinct rodent species. Little is known about their evolution and even their closest phylogenetic relatives have not yet been identified with certainty. Similarly, little is known about the timing of divergence events within the genus Castor. Methodology/Principal Findings We sequenced complete mitochondrial genomes from both extant beaver species and used these sequences to place beavers in the phylogenetic tree of rodents and date their divergence from other rodents as well as the divergence events within the genus Castor. Our analyses support the phylogenetic position of beavers as a sister lineage to the scaly tailed squirrel Anomalurus within the mouse related clade. Molecular dating places the divergence time of the lineages leading to beavers and Anomalurus as early as around 54 million years ago (mya). The living beaver species, Castor canadensis from North America and Castor fiber from Eurasia, although similar in appearance, appear to have diverged from a common ancestor more than seven mya. This result is consistent with the hypothesis that a migration of Castor from Eurasia to North America as early as 7.5 mya could have initiated their speciation. We date the common ancestor of the extant Eurasian beaver relict populations to around 210,000 years ago, much earlier than previously thought. Finally, the substitution rate of Castor mitochondrial DNA is considerably lower than that of other rodents. We found evidence that this is correlated with the longer life span of beavers compared to other rodents. Conclusions/Significance A phylogenetic analysis of mitochondrial genome sequences suggests a sister-group relationship between Castor and Anomalurus, and allows molecular dating of species divergence in congruence with paleontological data. 
The implementation of a relaxed molecular clock enabled us to estimate mitochondrial substitution rates and to evaluate the effect of life history traits on it. PMID:21307956
Miller, Hilary C; O'Meally, Denis; Ezaz, Tariq; Amemiya, Chris; Marshall-Graves, Jennifer A; Edwards, Scott
2015-05-07
Major histocompatibility complex (MHC) genes are a central component of the vertebrate immune system and usually exist in a single genomic region. However, considerable differences in MHC organization and size exist between different vertebrate lineages. Reptiles occupy a key evolutionary position for understanding how variation in MHC structure evolved in vertebrates, but information on the structure of the MHC region in reptiles is limited. In this study, we investigate the organization and cytogenetic location of MHC genes in the tuatara (Sphenodon punctatus), the sole extant representative of the early-diverging reptilian order Rhynchocephalia. Sequencing and mapping of 12 clones containing class I and II MHC genes from a bacterial artificial chromosome library indicated that the core MHC region is located on chromosome 13q. However, duplication and translocation of MHC genes outside of the core region was evident, because additional class I MHC genes were located on chromosome 4p. We found a total of seven class I sequences and 11 class II β sequences, with evidence for duplication and pseudogenization of genes within the tuatara lineage. The tuatara MHC is characterized by high repeat content and low gene density compared with other species and we found no antigen processing or MHC framework genes on the MHC gene-containing clones. Our findings indicate substantial differences in MHC organization in tuatara compared with mammalian and avian MHCs and highlight the dynamic nature of the MHC. Further sequencing and annotation of tuatara and other reptile MHCs will determine if the tuatara MHC is representative of nonavian reptiles in general. Copyright © 2015 Miller et al.
Genotype imputation in a coalescent model with infinitely-many-sites mutation
Huang, Lucy; Buzbas, Erkan O.; Rosenberg, Noah A.
2012-01-01
Empirical studies have identified population-genetic factors as important determinants of the properties of genotype-imputation accuracy in imputation-based disease association studies. Here, we develop a simple coalescent model of three sequences that we use to explore the theoretical basis for the influence of these factors on genotype-imputation accuracy, under the assumption of infinitely-many-sites mutation. Employing a demographic model in which two populations diverged at a given time in the past, we derive the approximate expectation and variance of imputation accuracy in a study sequence sampled from one of the two populations, choosing between two reference sequences, one sampled from the same population as the study sequence and the other sampled from the other population. We show that under this model, imputation accuracy—as measured by the proportion of polymorphic sites that are imputed correctly in the study sequence—increases in expectation with the mutation rate, the proportion of the markers in a chromosomal region that are genotyped, and the time to divergence between the study and reference populations. Each of these effects derives largely from an increase in information available for determining the reference sequence that is genetically most similar to the sequence targeted for imputation. We analyze as a function of divergence time the expected gain in imputation accuracy in the target using a reference sequence from the same population as the target rather than from the other population. Together with a growing body of empirical investigations of genotype imputation in diverse human populations, our modeling framework lays a foundation for extending imputation techniques to novel populations that have not yet been extensively examined. PMID:23079542
Genetic and phylogenetic divergence of feline immunodeficiency virus in the puma (Puma concolor).
Carpenter, M A; Brown, E W; Culver, M; Johnson, W E; Pecon-Slattery, J; Brousset, D; O'Brien, S J
1996-01-01
Feline immunodeficiency virus (FIV) is a lentivirus which causes an AIDS-like disease in domestic cats (Felis catus). A number of other felid species, including the puma (Puma concolor), carry a virus closely related to domestic cat FIV. Serological testing revealed the presence of antibodies to FIV in 22% of 434 samples from throughout the geographic range of the puma. FIV-Pco pol gene sequences isolated from pumas revealed extensive sequence diversity, greater than has been documented in the domestic cat. The puma sequences formed two highly divergent groups, analogous to the clades which have been defined for domestic cat and lion (Panthera leo) FIV. The puma clade A was made up of samples from Florida and California, whereas clade B consisted of samples from other parts of North America, Central America, and Brazil. The difference between these two groups was as great as that reported among three lion FIV clades. Within puma clades, sequence variation is large, comparable to between-clade differences seen for domestic cat clades, allowing recognition of 15 phylogenetic lineages (subclades) among puma FIV-Pco. Large sequence divergence among isolates, nearly complete species monophyly, and widespread geographic distribution suggest that FIV-Pco has evolved within the puma species for a long period. The sequence data provided evidence for vertical transmission of FIV-Pco from mothers to their kittens, for coinfection of individuals by two different viral strains, and for cross-species transmission of FIV from a domestic cat to a puma. These factors may all be important for understanding the epidemiology and natural history of FIV in the puma. PMID:8794304
USDA-ARS's Scientific Manuscript database
High-throughput sequencing of reduced representation genomic libraries has ushered in an era of genotyping-by-sequencing (GBS), where genome-wide genotype data can be obtained for nearly any species. However, there remains a need for imputation-free GBS methods for genotyping large samples taken fr...
Gaby, John Christian; Buckley, Daniel H
2014-01-01
We describe a nitrogenase gene sequence database that facilitates analysis of the evolution and ecology of nitrogen-fixing organisms. The database contains 32 954 aligned nitrogenase nifH sequences linked to phylogenetic trees and associated sequence metadata. The database includes 185 linked multigene entries including full-length nifH, nifD, nifK and 16S ribosomal RNA (rRNA) gene sequences. Evolutionary analyses enabled by the multigene entries support an ancient horizontal transfer of nitrogenase genes between Archaea and Bacteria and provide evidence that nifH has a different history of horizontal gene transfer from the nifDK enzyme core. Further analyses show that lineages in nitrogenase cluster I and cluster III have different rates of substitution within nifD, suggesting that nifD is under different selection pressure in these two lineages. Finally, we find that the genetic divergence of nifH and 16S rRNA genes does not correlate well at sequence dissimilarity values used commonly to define microbial species, as strains having <3% sequence dissimilarity in their 16S rRNA genes can have up to 23% dissimilarity in nifH. The nifH database has a number of uses including phylogenetic and evolutionary analyses, the design and assessment of primers/probes and the evaluation of nitrogenase sequence diversity. Database URL: http://www.css.cornell.edu/faculty/buckley/nifh.htm.
Liu, Mingjian; Fan, Xinpeng; Gao, Feng; Gao, Shan; Yu, Yuhe; Warren, Alan; Huang, Jie
2016-11-01
A cryptic species of the Tetrahymena pyriformis complex, Tetrahymena australis, has been known for a long time but never properly diagnosed based on taxonomic methods. The species name is thus invalid according to the International Code of Zoological Nomenclature. Recently, a population isolated from a freshwater lake in Wuhan, China was investigated using live observations, silver staining methods and gene sequence data. This organism can be separated from other described species of the T. pyriformis complex by its relatively small body size, the number of somatic kineties and differences in sequences of two genes, namely the small subunit ribosomal RNA (SSU rRNA) and the mitochondrial cytochrome c oxidase subunit I (cox1). We compared the SSU rRNA gene sequences of all available Tetrahymena species to reveal the nucleotide differences within this genus. The sequence of the Wuhan population is identical to two sequences of a previously isolated strain of T. australis (ATCC #30831). Phylogenetic analyses indicate that these three sequences (X56167, M98015, KT334373) cluster with Tetrahymena shanghaiensis (EF070256) in a polytomy. However, sequence divergence of the cox1 gene between the Wuhan population and another strain of T. australis (ATCC #30271) is 1.4%, suggesting that these may represent different subspecies. © 2016 The Author(s) Journal of Eukaryotic Microbiology © 2016 International Society of Protistologists.
Accuracy of taxonomy prediction for 16S rRNA and fungal ITS sequences
2018-01-01
Prediction of taxonomy for marker gene sequences such as 16S ribosomal RNA (rRNA) is a fundamental task in microbiology. Most experimentally observed sequences are diverged from reference sequences of authoritatively named organisms, creating a challenge for prediction methods. I assessed the accuracy of several algorithms using cross-validation by identity, a new benchmark strategy which explicitly models the variation in distances between query sequences and the closest entry in a reference database. When the accuracy of genus predictions was averaged over a representative range of identities with the reference database (100%, 99%, 97%, 95% and 90%), all tested methods had ≤50% accuracy on the currently-popular V4 region of 16S rRNA. Accuracy was found to fall rapidly with identity; for example, better methods were found to have V4 genus prediction accuracy of ∼100% at 100% identity but ∼50% at 97% identity. The relationship between identity and taxonomy was quantified as the probability that a rank is the lowest shared by a pair of sequences with a given pair-wise identity. With the V4 region, 95% identity was found to be a twilight zone where taxonomy is highly ambiguous because the probabilities that the lowest shared rank between pairs of sequences is genus, family, order or class are approximately equal. PMID:29682424
Gaby, John Christian; Buckley, Daniel H.
2014-01-01
We describe a nitrogenase gene sequence database that facilitates analysis of the evolution and ecology of nitrogen-fixing organisms. The database contains 32 954 aligned nitrogenase nifH sequences linked to phylogenetic trees and associated sequence metadata. The database includes 185 linked multigene entries including full-length nifH, nifD, nifK and 16S ribosomal RNA (rRNA) gene sequences. Evolutionary analyses enabled by the multigene entries support an ancient horizontal transfer of nitrogenase genes between Archaea and Bacteria and provide evidence that nifH has a different history of horizontal gene transfer from the nifDK enzyme core. Further analyses show that lineages in nitrogenase cluster I and cluster III have different rates of substitution within nifD, suggesting that nifD is under different selection pressure in these two lineages. Finally, we find that the genetic divergence of nifH and 16S rRNA genes does not correlate well at sequence dissimilarity values used commonly to define microbial species, as strains having <3% sequence dissimilarity in their 16S rRNA genes can have up to 23% dissimilarity in nifH. The nifH database has a number of uses including phylogenetic and evolutionary analyses, the design and assessment of primers/probes and the evaluation of nitrogenase sequence diversity. Database URL: http://www.css.cornell.edu/faculty/buckley/nifh.htm PMID:24501396
Deli, Temim; Kalkan, Evrim; Karhan, Selahattin Ünsal; Uzunova, Sonya; Keikhosravi, Alireza; Bilgin, Raşit; Schubart, Christoph D
2018-04-11
Recently, population genetic studies of Mediterranean marine species highlighted patterns of genetic divergence and phylogeographic breaks, due to the interplay between impacts of Pleistocene climate shifts and contemporary hydrographical barriers. These factors markedly shaped the distribution of marine organisms and their genetic makeup. The present study is part of an ongoing effort to understand the phylogeography and evolutionary history of the highly dispersive Mediterranean green crab, Carcinus aestuarii (Nardo, 1847), across the Mediterranean Sea. Recently, marked divergence between two highly separated haplogroups (genetic types I and II) of C. aestuarii was discerned across the Siculo-Tunisian Strait, suggesting an Early Pleistocene vicariant event. In order to better identify phylogeographic patterns in this species, a total of 263 individuals from 22 Mediterranean locations were analysed by comparing a 587 basepair region of the mitochondrial gene Cox1 (cytochrome oxidase subunit 1). The examined dataset is composed of both newly generated sequences (76) and previously investigated ones (187). Our results unveiled the occurrence of a highly divergent haplogroup (genetic type III) in the most north-eastern part of the Mediterranean Sea. Divergence between the most distinct type III and the common ancestor of both types I and II corresponds to the Early Pleistocene and coincides with the historical episode of separation between types I and II. Our results also revealed strong genetic divergence among adjacent regions (separating the Aegean and Marmara seas from the remaining distribution zone) and confirmed a sharp phylogeographic break across the Eastern Mediterranean. 
The recorded parapatric genetic divergence, with the potential existence of a contact zone between both groups in the Ionian Sea and notable differences in the demographic history, suggest the likely impact of paleoclimatic events, as well as past and contemporary oceanographic processes, in shaping genetic variability of this species. Our findings not only provide further evidence for the complex evolutionary history of the green crab in the Mediterranean Sea, but also stress the importance of investigating peripheral areas in the species' distribution zone in order to fully understand the distribution of genetic diversity and unravel hidden genetic units and local patterns of endemism.
DNA barcoding for molecular identification of Demodex based on mitochondrial genes.
Hu, Li; Yang, YuanJun; Zhao, YaE; Niu, DongLing; Yang, Rui; Wang, RuiLing; Lu, Zhaohui; Li, XiaoQi
2017-12-01
There has been no widely accepted DNA barcode for species identification of Demodex. In this study, we attempted to solve this issue. First, mitochondrial cox1-5' and 12S gene fragments of Demodex folliculorum, D. brevis, D. canis, and D. caprae were amplified, cloned, and sequenced for the first time; intra/interspecific divergences were computed and phylogenetic trees were reconstructed. Then, divergence frequency distribution plots of those two gene fragments were drawn together with those of the mtDNA cox1-middle region and 16S obtained in previous studies. Finally, their identification efficiency was evaluated by comparing barcoding gaps. Results indicated that 12S had the higher identification efficiency. Specifically, for the cox1-5' region of the four Demodex species, intraspecific divergences were less than 2.0%, and interspecific divergences were 21.1-31.0%; for 12S, intraspecific divergences were less than 1.4%, and interspecific divergences were 20.8-26.9%. The phylogenetic trees demonstrated that the four Demodex species clustered separately, and the divergence frequency distribution plot showed that the largest intraspecific divergence of 12S (1.4%) was less than that of the cox1-5' region (2.0%), cox1-middle region (3.1%), and 16S (2.8%). The barcoding gap of 12S was 19.4%, larger than that of the cox1-5' region (19.1%), cox1-middle region (11.3%), and 16S (13.0%); the interspecific divergence span of 12S was 6.2%, smaller than that of the cox1-5' region (10.0%), cox1-middle region (14.1%), and 16S (11.4%). Moreover, 12S has a moderate length (517 bp) that can be sequenced in a single read. Therefore, we propose that mtDNA 12S is more suitable than cox1 and 16S as a DNA barcode for classification and identification of Demodex at lower taxonomic levels.
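The barcoding-gap comparison in the abstract above reduces to simple arithmetic on the quoted divergence ranges. The following is a minimal sketch, assuming the gap is defined as the smallest interspecific divergence minus the largest intraspecific divergence, and the span as the width of the interspecific range; the dictionary layout and function names are illustrative and do not come from the paper. Only the two markers whose full ranges are quoted (cox1-5' and 12S) are included.

```python
# Divergence values (percent) quoted in the abstract.
# Each entry: (max intraspecific, min interspecific, max interspecific).
MARKERS = {
    "cox1-5'": (2.0, 21.1, 31.0),
    "12S": (1.4, 20.8, 26.9),
}


def barcoding_gap(max_intra, min_inter):
    """Assumed definition: smallest between-species minus largest within-species divergence."""
    return round(min_inter - max_intra, 1)


def interspecific_span(min_inter, max_inter):
    """Width of the interspecific divergence range."""
    return round(max_inter - min_inter, 1)


def summarize(markers):
    """Return the gap and span for every marker in the table."""
    return {
        name: {
            "gap": barcoding_gap(max_intra, min_inter),
            "span": interspecific_span(min_inter, max_inter),
        }
        for name, (max_intra, min_inter, max_inter) in markers.items()
    }


if __name__ == "__main__":
    for name, stats in summarize(MARKERS).items():
        print(f"{name}: gap {stats['gap']}%, span {stats['span']}%")
```

Under this definition the gaps reproduce the reported 19.1% (cox1-5') and 19.4% (12S). The spans computed from the rounded range endpoints come out 0.1 percentage points below the reported 10.0% and 6.2%, presumably because the authors worked from unrounded divergences.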
Keskin, Emre; Atar, Hasan Huseyin
2012-04-01
Mitochondrial DNA sequence variation in 655 bp fragments of the cytochrome c oxidase subunit I gene, known as the DNA barcode, of European anchovy (Engraulis encrasicolus) was evaluated by analyzing 1529 individuals representing 16 populations from the Black Sea, through the Marmara Sea and the Aegean Sea to the Mediterranean Sea. A total of 19 (2.9%) variable sites were found among individuals, and these defined 10 genetically diverged populations with an overall mean distance of 1.2%. The highest nucleotide divergence was found between samples from the eastern Mediterranean and the northern Aegean (2.2%). Evolutionary history analysis among the 16 populations clustered the Mediterranean Sea clades in one main branch and the other clades in another branch. The diverging pattern of the European anchovy populations, correlated with geographic dispersion, supports genetic structuring along the Black Sea-Marmara Sea-Aegean Sea-Mediterranean Sea quad.
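The summary statistics quoted in the abstract above (variable sites and mean distance) can be illustrated on a toy alignment. This is a minimal sketch, assuming uncorrected p-distances; the abstract does not say which distance model was used, and the three 10 bp sequences are invented purely for illustration.

```python
from itertools import combinations


def variable_sites(seqs):
    """Count alignment columns in which the sequences do not all share one base."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def p_distance(a, b):
    """Uncorrected proportion of aligned positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def mean_pairwise_distance(seqs):
    """Average p-distance over all unordered pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy alignment, not data from the study.
TOY_ALIGNMENT = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC"]

if __name__ == "__main__":
    print("variable sites:", variable_sites(TOY_ALIGNMENT))
    print("mean p-distance:", round(mean_pairwise_distance(TOY_ALIGNMENT), 3))
    # Sanity check on the abstract's own figure: 19 variable sites in 655 bp.
    print("19/655 as percent:", round(100 * 19 / 655, 1))
```

The last line checks that 19 variable sites out of 655 bp is consistent with the 2.9% stated in the abstract.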
Seeing chordate evolution through the Ciona genome sequence
Cañestro, Cristian; Bassham, Susan; Postlethwait, John H
2003-01-01
A draft sequence of the compact genome of the sea squirt Ciona intestinalis, a non-vertebrate chordate that diverged very early from other chordates, including vertebrates, illuminates how chordates originated and how vertebrate developmental innovations evolved. PMID:12620098
Akın, Ciğdem; Bilgin, C Can; Beerli, Peter; Westaway, Rob; Ohst, Torsten; Litvinchuk, Spartak N; Uzzell, Thomas; Bilgin, Metin; Hotz, Hansjürg; Guex, Gaston-Denis; Plötner, Jörg
2010-11-01
AIM: Our aims were to assess the phylogeographic patterns of genetic diversity in eastern Mediterranean water frogs and to estimate divergence times using different geological scenarios. We related divergence times to past geological events and discuss the relevance of our data for the systematics of eastern Mediterranean water frogs. LOCATION: The eastern Mediterranean region. METHODS: Genetic diversity and divergence were calculated using sequences of two protein-coding mitochondrial (mt) genes: ND2 (1038 bp, 119 sequences) and ND3 (340 bp, 612 sequences). Divergence times were estimated in a Bayesian framework under four geological scenarios representing alternative possible geological histories for the eastern Mediterranean. We then compared the different scenarios using Bayes factors and additional geological data. RESULTS: Extensive genetic diversity in mtDNA divides eastern Mediterranean water frogs into six main haplogroups (MHG). Three MHGs were identified on the Anatolian mainland; the most widespread MHG with the highest diversity is distributed from western Anatolia to the northern shore of the Caspian Sea, including the type locality of Pelophylax ridibundus. The other two Anatolian MHGs are restricted to south-eastern Turkey, occupying localities west and east of the Amanos mountain range. One of the remaining three MHGs is restricted to Cyprus; a second to the Levant; the third was found in the distribution area of European lake frogs (P. ridibundus group), including the Balkans. MAIN CONCLUSIONS: Based on geological evidence and estimates of genetic divergence we hypothesize that the water frogs of Cyprus have been isolated from the Anatolian mainland populations since the end of the Messinian salinity crisis (MSC), i.e. since c. 5.5-5.3 Ma, while our divergence time estimates indicate that the isolation of Crete from the mainland populations (Peloponnese, Anatolia) most likely pre-dates the MSC. 
The observed rates of divergence imply a time window of c. 1.6-1.1 million years for diversification of the largest Anatolian MHG; divergence between the two other Anatolian MHGs may have begun about 3.0 Ma, apparently as a result of uplift of the Amanos Mountains. Our mtDNA data suggest that the Anatolian water frogs and frogs from Cyprus represent several undescribed species.
Poortvliet, Marloes; Olsen, Jeanine L; Croll, Donald A; Bernardi, Giacomo; Newton, Kelly; Kollias, Spyros; O'Sullivan, John; Fernando, Daniel; Stevens, Guy; Galván Magaña, Felipe; Seret, Bernard; Wintner, Sabine; Hoarau, Galice
2015-02-01
Manta and devil rays are an iconic group of globally distributed pelagic filter feeders, yet their evolutionary history remains enigmatic. We employed next generation sequencing of mitogenomes for nine of the 11 recognized species and two outgroups; as well as additional Sanger sequencing of two mitochondrial and two nuclear genes in an extended taxon sampling set. Analysis of the mitogenome coding regions in a Maximum Likelihood and Bayesian framework provided a well-resolved phylogeny. The deepest divergences distinguished three clades with high support, one containing Manta birostris, Manta alfredi, Mobula tarapacana, Mobula japanica and Mobula mobular; one containing Mobula kuhlii, Mobula eregoodootenkee and Mobula thurstoni; and one containing Mobula munkiana, Mobula hypostoma and Mobula rochebrunei. Mobula remains paraphyletic with the inclusion of Manta, a result that is in agreement with previous studies based on molecular and morphological data. A fossil-calibrated Bayesian random local clock analysis suggests that mobulids diverged from Rhinoptera around 30 Mya. Subsequent divergences are characterized by long internodes followed by short bursts of speciation extending from an initial episode of divergence in the Early and Middle Miocene (19-17 Mya) to a second episode during the Pliocene and Pleistocene (3.6 Mya - recent). Estimates of divergence dates overlap significantly with periods of global warming, during which upwelling intensity - and related high primary productivity in upwelling regions - decreased markedly. These periods are hypothesized to have led to fragmentation and isolation of feeding regions leading to possible regional extinctions, as well as the promotion of allopatric speciation. The closely shared evolutionary history of mobulids in combination with ongoing threats from fisheries and climate change effects on upwelling and food supply, reinforces the case for greater protection of this charismatic family of pelagic filter feeders. 
Copyright © 2014 Elsevier Inc. All rights reserved.
Sex Chromosome Turnover Contributes to Genomic Divergence between Incipient Stickleback Species
Yoshida, Kohta; Makino, Takashi; Yamaguchi, Katsushi; Shigenobu, Shuji; Hasebe, Mitsuyasu; Kawata, Masakado; Kume, Manabu; Mori, Seiichi; Peichel, Catherine L.; Toyoda, Atsushi; Fujiyama, Asao; Kitano, Jun
2014-01-01
Sex chromosomes turn over rapidly in some taxonomic groups, where closely related species have different sex chromosomes. Although there are many examples of sex chromosome turnover, we know little about the functional roles of sex chromosome turnover in phenotypic diversification and genomic evolution. The sympatric pair of Japanese threespine stickleback (Gasterosteus aculeatus) provides an excellent system to address these questions: the Japan Sea species has a neo-sex chromosome system resulting from a fusion between an ancestral Y chromosome and an autosome, while the sympatric Pacific Ocean species has a simple XY sex chromosome system. Furthermore, previous quantitative trait locus (QTL) mapping demonstrated that the Japan Sea neo-X chromosome contributes to phenotypic divergence and reproductive isolation between these sympatric species. To investigate the genomic basis for the accumulation of genes important for speciation on the neo-X chromosome, we conducted whole genome sequencing of males and females of both the Japan Sea and the Pacific Ocean species. No substantial degeneration has yet occurred on the neo-Y chromosome, but the nucleotide sequence of the neo-X and the neo-Y has started to diverge, particularly at regions near the fusion. The neo-sex chromosomes also harbor an excess of genes with sex-biased expression. Furthermore, genes on the neo-X chromosome showed higher non-synonymous substitution rates than autosomal genes in the Japan Sea lineage. Genomic regions of higher sequence divergence between species, genes with divergent expression between species, and QTL for inter-species phenotypic differences were found not only at the regions near the fusion site, but also at other regions along the neo-X chromosome. Neo-sex chromosomes can therefore accumulate substitutions causing species differences even in the absence of substantial neo-Y degeneration. PMID:24625862
Hellberg, M E; Moy, G W; Vacquier, V D
2000-03-01
Male-specific proteins have increasingly been reported as targets of positive selection and are of special interest because of the role they may play in the evolution of reproductive isolation. We report the rapid interspecific divergence of cDNA encoding a major acrosomal protein of unknown function (TMAP) of sperm from five species of teguline gastropods. A mitochondrial DNA clock (calibrated by congeneric species divided by the Isthmus of Panama) estimates that these five species diverged 2-10 MYA. Inferred amino acid sequences reveal a propeptide that has diverged rapidly between species. The mature protein has diverged faster still due to high nonsynonymous substitution rates (> 25 nonsynonymous substitutions per site per 10^9 years). cDNA encoding the mature protein (89-100 residues) shows evidence of positive selection (Dn/Ds > 1) for 4 of 10 pairwise species comparisons. cDNA and predicted secondary-structure comparisons suggest that TMAP is neither orthologous nor paralogous to abalone lysin, and thus marks a second, phylogenetically independent, protein subject to strong positive selection in free-spawning marine gastropods. In addition, an internal repeat in one species (Tegula aureotincta) produces a duplicated cleavage site which results in two alternatively processed mature proteins differing by nine amino acid residues. Such alternative processing may provide a mechanism for introducing novel amino acid sequence variation at the amino-termini of proteins. Highly divergent TMAP N-termini from two other tegulines (Tegula regina and Norrisia norrisii) may have originated by such a mechanism.
Troggio, Michela; Surbanovski, Nada; Bianco, Luca; Moretto, Marco; Giongo, Lara; Banchi, Elisa; Viola, Roberto; Fernández, Felicidad Fernández; Costa, Fabrizio; Velasco, Riccardo; Cestaro, Alessandro; Sargent, Daniel James
2013-01-01
High throughput arrays for the simultaneous genotyping of thousands of single-nucleotide polymorphisms (SNPs) have made the rapid genetic characterisation of plant genomes and the development of saturated linkage maps a realistic prospect for many plant species of agronomic importance. However, the correct calling of SNP genotypes in divergent polyploid genomes using array technology can be problematic due to paralogy, and to divergence in probe sequences causing changes in probe binding efficiencies. An Illumina Infinium II whole-genome genotyping array was recently developed for the cultivated apple and used to develop a molecular linkage map for an apple rootstock progeny (M432), but a large proportion of segregating SNPs were not mapped in the progeny, due to unexpected genotype clustering patterns. To investigate the causes of this unexpected clustering we performed BLAST analysis of all probe sequences against the 'Golden Delicious' genome sequence and discovered evidence for paralogous annealing sites and probe sequence divergence for a high proportion of probes contained on the array. Following visual re-evaluation of the genotyping data generated for 8,788 SNPs for the M432 progeny using the array, we manually re-scored genotypes at 818 loci and mapped a further 797 markers to the M432 linkage map. The newly mapped markers included the majority of those that could not be mapped previously, as well as loci that were previously scored as monomorphic, but which segregated due to divergence leading to heterozygosity in probe annealing sites. An evaluation of the 8,788 probes in a diverse collection of Malus germplasm showed that more than half the probes returned genotype clustering patterns that were difficult or impossible to interpret reliably, highlighting implications for the use of the array in genome-wide association studies.
rpoB-Based Identification of Nonpigmented and Late-Pigmenting Rapidly Growing Mycobacteria
Adékambi, Toïdi; Colson, Philippe; Drancourt, Michel
2003-01-01
Nonpigmented and late-pigmenting rapidly growing mycobacteria (RGM) are increasingly isolated in clinical microbiology laboratories. Their accurate identification remains problematic because classification is labor intensive work and because new taxa are not often incorporated into classification databases. Also, 16S rRNA gene sequence analysis underestimates RGM diversity and does not distinguish between all taxa. We determined the complete nucleotide sequence of the rpoB gene, which encodes the bacterial β subunit of the RNA polymerase, for 20 RGM type strains. After using in-house software which analyzes and graphically represents variability stretches of 60 bp along the nucleotide sequence, our analysis focused on a 723-bp variable region exhibiting 83.9 to 97% interspecies similarity and 0 to 1.7% intraspecific divergence. Primer pair Myco-F-Myco-R was designed as a tool for both PCR amplification and sequencing of this region for molecular identification of RGM. This tool was used for identification of 63 RGM clinical isolates previously identified at the species level on the basis of phenotypic characteristics and by 16S rRNA gene sequence analysis. Of 63 clinical isolates, 59 (94%) exhibited <2% partial rpoB gene sequence divergence from 1 of 20 species under study and were regarded as correctly identified at the species level. Mycobacterium abscessus and Mycobacterium mucogenicum isolates were clearly distinguished from Mycobacterium chelonae; Mycobacterium mageritense isolates were clearly distinguished from “Mycobacterium houstonense.” Four isolates were not identified at the species level because they exhibited >3% partial rpoB gene sequence divergence from the corresponding type strain; they belonged to three taxa related to M. mucogenicum, Mycobacterium smegmatis, and Mycobacterium porcinum. For M. abscessus and M. mucogenicum, this partial sequence yielded a high genetic heterogeneity within the clinical isolates. 
We conclude that molecular identification by analysis of the 723-bp rpoB sequence is a rapid and accurate tool for identification of RGM. PMID:14662964
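The identification rule described in this abstract (an isolate is assigned to a species when its partial rpoB sequence diverges by less than 2% from a type strain) can be sketched as follows. This is a minimal illustration, not the authors' software: the function names, the use of uncorrected p-distance, and the assumption that sequences are already aligned to equal length are all hypothetical choices made here.

```python
def p_distance(a, b):
    """Uncorrected proportion of differing sites between two aligned sequences.

    Assumption: both sequences are pre-aligned and of equal length;
    columns with a gap ('-') in either sequence are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def identify(isolate, type_strains, species_cutoff=0.02):
    """Return (closest species, divergence, identified?).

    species_cutoff=0.02 mirrors the <2% criterion quoted in the abstract;
    type_strains is a hypothetical dict of species name -> aligned sequence.
    """
    name, dist = min(
        ((sp, p_distance(isolate, seq)) for sp, seq in type_strains.items()),
        key=lambda item: item[1],
    )
    return name, dist, dist < species_cutoff
```

An isolate whose closest type strain lies above the cutoff is returned with `identified` set to False, which corresponds to the four isolates the abstract reports as not identified at the species level.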
Webb, Kristen M; Rosenthal, Benjamin M
2011-01-01
The mitochondrial genome's non-recombinant mode of inheritance and relatively rapid rate of evolution have promoted its use as a marker for studying the biogeographic history and evolutionary interrelationships among many metazoan species. A modest portion of the mitochondrial genome has been defined for 12 species and genotypes of parasites in the genus Trichinella, but its adequacy in representing the mitochondrial genome as a whole remains unclear, as the complete coding sequence has been characterized only for Trichinella spiralis. Here, we sought to comprehensively describe the extent and nature of divergence between the mitochondrial genomes of T. spiralis (which poses the most appreciable zoonotic risk owing to its capacity to establish persistent infections in domestic pigs) and Trichinella murrelli (which is the most prevalent species in North American wildlife hosts, but which poses relatively little risk to the safety of pork). Next generation sequencing methodologies and scaffold and de novo assembly strategies were employed. The entire protein-coding region was sequenced (13,917 bp), along with a portion of the highly repetitive non-coding region (1524 bp) of the mitochondrial genome of T. murrelli with a combined average read depth of 250 reads. The accuracy of base calling, estimated from coding region sequence, was found to exceed 99.3%. Genome content and gene order were not found to be significantly different from those of T. spiralis. An overall inter-species sequence divergence of 9.5% was estimated. Significant variation was identified when the amount of variation between species at each gene was compared to the average amount of variation between species across the coding region. Next generation sequencing is a highly effective means to obtain previously unknown mitochondrial genome sequence.
Particular to parasites, the extremely deep coverage achieved through this method allows for the detection of sequence heterogeneity between the multiple individuals that necessarily comprise such templates. Copyright © 2010 Elsevier B.V. All rights reserved.
Li, Hao-Xi; Gottilla, Thomas M; Brewer, Marin Talbot
2017-10-01
Population divergence and speciation of closely related lineages can result from reproductive differences leading to genetic isolation. An increasing number of fungal diseases of plants and animals have been determined to be caused by morphologically indistinguishable species that are genetically distinct, thereby representing cryptic species. We were interested in identifying if mating systems among three Stagonosporopsis species (S. citrulli, S. cucurbitacearum, and S. caricae) causing gummy stem blight (GSB) of cucurbits or leaf spot and dry rot of papaya differed, possibly underlying species divergence. Additionally, we were interested in identifying evolutionary pressures acting on the genes controlling mating in these fungi. The mating-type loci (MAT1) of three isolates from each of the three species were identified in draft genome sequences. For the three species, MAT1 was structurally identical and contained both mating-type genes necessary for sexual reproduction, which suggests that all three species are homothallic. However, both MAT1-1-1 and MAT1-2-1 were divergent among species showing rapid evolution with a much greater number of amino acid-changing substitutions detected for the reproductive genes compared with genes flanking MAT1. Positive selection was detected in MAT1-2-1, especially in the highly conserved high mobility group (MATA_HMG-box) domain. Thus, the mating-type genes are rapidly evolving in GSB fungi, but a difference in mating systems among the three species does not underlie their divergence. Copyright © 2017 British Mycological Society. Published by Elsevier Ltd. All rights reserved.
Extraordinary Sequence Divergence at Tsga8, an X-linked Gene Involved in Mouse Spermiogenesis
Good, Jeffrey M.; Vanderpool, Dan; Smith, Kimberly L.; Nachman, Michael W.
2011-01-01
The X chromosome plays an important role in both adaptive evolution and speciation. We used a molecular evolutionary screen of X-linked genes potentially involved in reproductive isolation in mice to identify putative targets of recurrent positive selection. We then sequenced five very rapidly evolving genes within and between several closely related species of mice in the genus Mus. All five genes were involved in male reproduction and four of the genes showed evidence of recurrent positive selection. The most remarkable evolutionary patterns were found at Testis-specific gene a8 (Tsga8), a spermatogenesis-specific gene expressed during postmeiotic chromatin condensation and nuclear transformation. Tsga8 was characterized by extremely high levels of insertion–deletion variation of an alanine-rich repetitive motif in natural populations of Mus domesticus and M. musculus, differing in length from the reference mouse genome by up to 89 amino acids (27% of the total protein length). This population-level variation was coupled with striking divergence in protein sequence and length between closely related mouse species. Although no clear orthologs had previously been described for Tsga8 in other mammalian species, we have identified a highly divergent hypothetical gene on the rat X chromosome that shares clear orthology with the 5′ and 3′ ends of Tsga8. Further inspection of this ortholog verified that it is expressed in rat testis and shares remarkable similarity with mouse Tsga8 across several general features of the protein sequence despite no conservation of nucleotide sequence across over 60% of the rat-coding domain. Overall, Tsga8 appears to be one of the most rapidly evolving genes to have been described in rodents. We discuss the potential evolutionary causes and functional implications of this extraordinary divergence and the possible contribution of Tsga8 and the other four genes we examined to reproductive isolation in mice. PMID:21186189
Three Divergent Subpopulations of the Malaria Parasite Plasmodium knowlesi
Lin, Lee C.; Rovie-Ryan, Jeffrine J.; Kadir, Khamisah A.; Anderios, Fread; Hisam, Shamilah; Sharma, Reuben S.K.; Singh, Balbir; Conway, David J.
2017-01-01
Multilocus microsatellite genotyping of Plasmodium knowlesi isolates previously indicated 2 divergent parasite subpopulations in humans on the island of Borneo, each associated with a different macaque reservoir host species. Geographic divergence was also apparent, and independent sequence data have indicated particularly deep divergence between parasites from mainland Southeast Asia and Borneo. To resolve the overall population structure, multilocus microsatellite genotyping was conducted on a new sample of 182 P. knowlesi infections (obtained from 134 humans and 48 wild macaques) from diverse areas of Malaysia, first analyzed separately and then in combination with previous data. All analyses confirmed 2 divergent clusters of human cases in Malaysian Borneo, associated with long-tailed macaques and pig-tailed macaques, and a third cluster in humans and most macaques in peninsular Malaysia. High levels of pairwise divergence between each of these sympatric and allopatric subpopulations have implications for the epidemiology and control of this zoonotic species. PMID:28322705
Vorticity and divergence in the solar photosphere
NASA Technical Reports Server (NTRS)
Wang, YI; Noyes, Robert W.; Tarbell, Theodore D.; Title, Alan M.
1995-01-01
We have studied an outstanding sequence of continuum images of the solar granulation from Pic du Midi Observatory. We have calculated the horizontal vector flow field using a correlation tracking algorithm, and from this determined three scalar fields: the vertical component of the curl, the horizontal divergence, and the horizontal flow speed. The divergence field has a substantially longer coherence time and more power than does the curl field. Statistically, curl is better correlated with regions of negative divergence; that is, the vertical vorticity is higher in downflow regions, suggesting excess vorticity in intergranular lanes. The average value of the divergence is largest (i.e., outflow is largest) where the horizontal speed is large; we associate these regions with exploding granules. A numerical simulation of general convection also shows similar statistical differences between curl and divergence. Some individual small bright points in the granulation pattern show large local vorticities.
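The three scalar fields named in this abstract can be derived from a gridded horizontal flow field (vx, vy) with finite differences. The sketch below is only an illustration under stated assumptions: a regular grid stored as nested lists, central differences at interior points, and a hypothetical function name. The abstract does not describe how the authors computed the derivatives.

```python
import math


def flow_scalars(vx, vy, dx=1.0, dy=1.0):
    """Compute horizontal divergence, vertical curl, and speed on a regular grid.

    vx[j][i] and vy[j][i] are the x- and y-components at row j (y) and
    column i (x). Central differences are used at interior points only;
    divergence and curl on the border are left at 0.0 (an assumption of
    this sketch, not a physical statement).
    """
    ny, nx = len(vx), len(vx[0])
    div = [[0.0] * nx for _ in range(ny)]
    curl = [[0.0] * nx for _ in range(ny)]
    speed = [[math.hypot(vx[j][i], vy[j][i]) for i in range(nx)] for j in range(ny)]
    for j in range(1, ny - 1):
        for i in range(1, nx - 1):
            dvx_dx = (vx[j][i + 1] - vx[j][i - 1]) / (2 * dx)
            dvy_dy = (vy[j + 1][i] - vy[j - 1][i]) / (2 * dy)
            dvy_dx = (vy[j][i + 1] - vy[j][i - 1]) / (2 * dx)
            dvx_dy = (vx[j + 1][i] - vx[j - 1][i]) / (2 * dy)
            div[j][i] = dvx_dx + dvy_dy    # horizontal divergence
            curl[j][i] = dvy_dx - dvx_dy   # vertical component of the curl
    return div, curl, speed
```

A pure outflow field (vx = x, vy = y) gives positive divergence and zero curl, while a rigid rotation (vx = -y, vy = x) gives zero divergence and positive curl, which is a quick sanity check on the sign conventions.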
USDA-ARS's Scientific Manuscript database
The Noctuid moth, Spodoptera frugiperda (the fall armyworm), is endemic to the Western Hemisphere and appears to be undergoing sympatric speciation to produce two subpopulations that differ in their choice of host plants. The diverging “rice strain” and “corn strain” are morphologically indistinguis...
Dynamics of actin evolution in dinoflagellates.
Kim, Sunju; Bachvaroff, Tsvetan R; Handy, Sara M; Delwiche, Charles F
2011-04-01
Dinoflagellates have unique nuclei and intriguing genome characteristics with very high DNA content making complete genome sequencing difficult. In dinoflagellates, many genes are found in multicopy gene families, but the processes involved in the establishment and maintenance of these gene families are poorly understood. Understanding the dynamics of gene family evolution in dinoflagellates requires comparisons at different evolutionary scales. Studies of closely related species provide fine-scale information relative to species divergence, whereas comparisons of more distantly related species provides broad context. We selected the actin gene family as a highly expressed conserved gene previously studied in dinoflagellates. Of the 142 sequences determined in this study, 103 were from the two closely related species, Dinophysis acuminata and D. caudata, including full length and partial cDNA sequences as well as partial genomic amplicons. For these two Dinophysis species, at least three types of sequences could be identified. Most copies (79%) were relatively similar and in nucleotide trees, the sequences formed two bushy clades corresponding to the two species. In comparisons within species, only eight to ten nucleotide differences were found between these copies. The two remaining types formed clades containing sequences from both species. One type included the most similar sequences in between-species comparisons with as few as 12 nucleotide differences between species. The second type included the most divergent sequences in comparisons between and within species with up to 93 nucleotide differences between sequences. In all the sequences, most variation occurred in synonymous sites or the 5' UnTranslated Region (UTR), although there was still limited amino acid variation between most sequences. 
Several potential pseudogenes were found (approximately 10% of all sequences depending on species) with incomplete open reading frames due to frameshifts or early stop codons. Overall, variation in the actin gene family fits best with the "birth and death" model of evolution based on recent duplications, pseudogenes, and incomplete lineage sorting. Divergence between species was similar to variation within species, so that actin may be too conserved to be useful for phylogenetic estimation of closely related species.
Recombination, rearrangement, reshuffling, and divergence in a centromeric region of rice.
Ma, Jianxin; Bennetzen, Jeffrey L
2006-01-10
Centromeres have many unusual biological properties, including kinetochore attachment and severe repression of local meiotic recombination. These properties are partly an outcome, partly a cause, of unusual DNA structure in the centromeric region. Although several plant and animal genomes have been sequenced, most centromere sequences have not been completed or analyzed in depth. To shed light on the unique organization, variability, and evolution of centromeric DNA, detailed analysis of a 1.97-Mb sequence that includes centromere 8 (CEN8) of japonica rice was undertaken. Thirty-three long-terminal repeat (LTR)-retrotransposon families (including 11 previously unknown) were identified in the CEN8 region, totaling 245 elements and fragments that account for 67% of the region. The ratio of solo LTRs to intact elements in the CEN8 region is approximately 0.9:1, compared with approximately 2.2:1 in noncentromeric regions of rice. However, the ratio of solo LTRs to intact elements in the core of the CEN8 region ( approximately 2.5:1) is higher than in any other region investigated in rice, suggesting a hotspot for unequal recombination. Comparison of the CEN8 region of japonica and its orthologous segments from indica rice indicated that approximately 15% of the intact retrotransposons and solo LTRs were inserted into CEN8 after the divergence of japonica and indica from a common ancestor, compared with approximately 50% for previously studied euchromatic regions. Frequent DNA rearrangements were observed in the CEN8 region, including a 212-kb subregion that was found to be composed of three rearranged tandem repeats. Phylogenetic analysis also revealed recent segmental duplication and extensive rearrangement and reshuffling of the CentO satellite repeats.
Musumeci, Matias A.; Lozada, Mariana; Rial, Daniela V.; ...
2017-04-09
The goal of this work was to identify sequences encoding monooxygenase biocatalysts with novel features by in silico mining an assembled metagenomic dataset of polar and subpolar marine sediments. The targeted enzyme sequences were Baeyer-Villiger and bacterial cytochrome P450 monooxygenases (CYP153). These enzymes have wide-ranging applications, from the synthesis of steroids, antibiotics, mycotoxins and pheromones to the synthesis of monomers for polymerization and anticancer precursors, due to their extraordinary enantio-, regio-, and chemo- selectivity that are valuable features for organic synthesis. Phylogenetic analyses were used to select the most divergent sequences affiliated to these enzyme families among the 264 putative monooxygenases recovered from the ~14 million protein-coding sequences in the assembled metagenome dataset. Three-dimensional structure modeling and docking analysis suggested features useful in biotechnological applications in five metagenomic sequences, such as wide substrate range, novel substrate specificity or regioselectivity. Further analysis revealed structural features associated with psychrophilic enzymes, such as broader substrate accessibility, larger catalytic pockets or low domain interactions, suggesting that they could be applied in biooxidations at room or low temperatures, saving costs inherent to energy consumption. As a result, this work allowed the identification of putative enzyme candidates with promising features from metagenomes, providing a suitable starting point for further developments.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Musumeci, Matias A.; Lozada, Mariana; Rial, Daniela V.
The goal of this work was to identify sequences encoding monooxygenase biocatalysts with novel features by in silico mining an assembled metagenomic dataset of polar and subpolar marine sediments. The targeted enzyme sequences were Baeyer-Villiger and bacterial cytochrome P450 monooxygenases (CYP153). These enzymes have wide-ranging applications, from the synthesis of steroids, antibiotics, mycotoxins and pheromones to the synthesis of monomers for polymerization and anticancer precursors, due to their extraordinary enantio-, regio-, and chemo- selectivity that are valuable features for organic synthesis. Phylogenetic analyses were used to select the most divergent sequences affiliated to these enzyme families among the 264 putative monooxygenases recovered from the ~14 million protein-coding sequences in the assembled metagenome dataset. Three-dimensional structure modeling and docking analysis suggested features useful in biotechnological applications in five metagenomic sequences, such as wide substrate range, novel substrate specificity or regioselectivity. Further analysis revealed structural features associated with psychrophilic enzymes, such as broader substrate accessibility, larger catalytic pockets or low domain interactions, suggesting that they could be applied in biooxidations at room or low temperatures, saving costs inherent to energy consumption. As a result, this work allowed the identification of putative enzyme candidates with promising features from metagenomes, providing a suitable starting point for further developments.
Musumeci, Matías A; Lozada, Mariana; Rial, Daniela V; Mac Cormack, Walter P; Jansson, Janet K; Sjöling, Sara; Carroll, JoLynn; Dionisi, Hebe M
2017-04-09
The goal of this work was to identify sequences encoding monooxygenase biocatalysts with novel features by in silico mining an assembled metagenomic dataset of polar and subpolar marine sediments. The targeted enzyme sequences were Baeyer-Villiger and bacterial cytochrome P450 monooxygenases (CYP153). These enzymes have wide-ranging applications, from the synthesis of steroids, antibiotics, mycotoxins and pheromones to the synthesis of monomers for polymerization and anticancer precursors, due to their extraordinary enantio-, regio-, and chemo- selectivity that are valuable features for organic synthesis. Phylogenetic analyses were used to select the most divergent sequences affiliated to these enzyme families among the 264 putative monooxygenases recovered from the ~14 million protein-coding sequences in the assembled metagenome dataset. Three-dimensional structure modeling and docking analysis suggested features useful in biotechnological applications in five metagenomic sequences, such as wide substrate range, novel substrate specificity or regioselectivity. Further analysis revealed structural features associated with psychrophilic enzymes, such as broader substrate accessibility, larger catalytic pockets or low domain interactions, suggesting that they could be applied in biooxidations at room or low temperatures, saving costs inherent to energy consumption. This work allowed the identification of putative enzyme candidates with promising features from metagenomes, providing a suitable starting point for further developments.
Musumeci, Matías A.; Lozada, Mariana; Rial, Daniela V.; Mac Cormack, Walter P.; Jansson, Janet K.; Sjöling, Sara; Carroll, JoLynn; Dionisi, Hebe M.
2017-01-01
The goal of this work was to identify sequences encoding monooxygenase biocatalysts with novel features by in silico mining an assembled metagenomic dataset of polar and subpolar marine sediments. The targeted enzyme sequences were Baeyer–Villiger and bacterial cytochrome P450 monooxygenases (CYP153). These enzymes have wide-ranging applications, from the synthesis of steroids, antibiotics, mycotoxins and pheromones to the synthesis of monomers for polymerization and anticancer precursors, due to their extraordinary enantio-, regio-, and chemo- selectivity that are valuable features for organic synthesis. Phylogenetic analyses were used to select the most divergent sequences affiliated to these enzyme families among the 264 putative monooxygenases recovered from the ~14 million protein-coding sequences in the assembled metagenome dataset. Three-dimensional structure modeling and docking analysis suggested features useful in biotechnological applications in five metagenomic sequences, such as wide substrate range, novel substrate specificity or regioselectivity. Further analysis revealed structural features associated with psychrophilic enzymes, such as broader substrate accessibility, larger catalytic pockets or low domain interactions, suggesting that they could be applied in biooxidations at room or low temperatures, saving costs inherent to energy consumption. This work allowed the identification of putative enzyme candidates with promising features from metagenomes, providing a suitable starting point for further developments. PMID:28397770
Zhu, Tianqi; Dos Reis, Mario; Yang, Ziheng
2015-03-01
Genetic sequence data provide information about the distances between species or branch lengths in a phylogeny, but not about the absolute divergence times or the evolutionary rates directly. Bayesian methods for dating species divergences estimate times and rates by assigning priors on them. In particular, the prior on times (node ages on the phylogeny) incorporates information in the fossil record to calibrate the molecular tree. Because times and rates are confounded, our posterior time estimates will not approach point values even if an infinite amount of sequence data are used in the analysis. In a previous study we developed a finite-sites theory to characterize the uncertainty in Bayesian divergence time estimation in analysis of large but finite sequence data sets under a strict molecular clock. As most modern clock dating analyses use more than one locus and are conducted under relaxed clock models, here we extend the theory to the case of relaxed clock analysis of data from multiple loci (site partitions). Uncertainty in posterior time estimates is partitioned into three sources: sampling errors in the estimates of branch lengths in the tree for each locus due to limited sequence length, variation of substitution rates among lineages and among loci, and uncertainty in fossil calibrations. Using a simple but analogous estimation problem involving the multivariate normal distribution, we predict that as the number of loci (L) goes to infinity, the variance in posterior time estimates decreases and approaches the infinite-data limit at the rate of 1/L, and the limit is independent of the number of sites in the sequence alignment. We then confirmed the predictions by using computer simulation on phylogenies of two or three species, and by analyzing a real genomic data set for six primate species.
Our results suggest that with the fossil calibrations fixed, analyzing multiple loci or site partitions is the most effective way for improving the precision of posterior time estimation. However, even if a huge amount of sequence data is analyzed, considerable uncertainty will persist in time estimates. © The Author(s) 2014. Published by Oxford University Press on behalf of the Society of Systematic Biologists.
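The scaling claim in this abstract, that the variance of posterior time estimates falls toward a nonzero limit at a rate inversely proportional to the number of loci, can be illustrated with a toy model. Writing L for the number of loci (notation assumed here), the sketch uses Var(L) = V_limit + c / L. Both the functional form as coded and the parameter names are illustrative assumptions; the paper's actual derivation is not reproduced, and any numbers passed in are hypothetical.

```python
def posterior_variance(n_loci, limit_variance, rate_constant):
    """Toy model: variance = infinite-data limit + rate_constant / n_loci.

    limit_variance stands for the uncertainty that remains no matter how
    many loci are analyzed (attributed in the abstract to fossil calibrations
    and time/rate confounding); rate_constant scales the part that shrinks
    as loci are added. Both are hypothetical inputs.
    """
    if n_loci < 1:
        raise ValueError("n_loci must be at least 1")
    return limit_variance + rate_constant / n_loci
```

Under this toy model, doubling the number of loci halves only the reducible part of the variance, and the total never drops below `limit_variance`, which matches the abstract's point that considerable uncertainty persists even with very large data sets.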
A Contingency Model for Predicting Institutionalization of Innovation Across Divergent Organizations.
ERIC Educational Resources Information Center
Howes, Nancy J.
This study was undertaken to compare the variables related to the successful institutionalization of changes across divergent organizations, and to design, through cross-validation, an interorganization model of change. Descriptive survey questionnaires and structured interviews were the instruments used. The respondent sample consisted of 1,500…
2011-01-01
Background Parasites are evolutionary hitchhikers whose phylogenies often track the evolutionary history of their hosts. Incongruence in the evolutionary history of closely associated lineages can be explained through a variety of possible events including host switching and host independent speciation. However, in recently diverged lineages stochastic population processes, such as retention of ancestral polymorphism or secondary contact, can also explain discordant genealogies, even in fully co-speciating taxa. The relatively simple biogeographic arrangement of the Galápagos archipelago, compared with mainland biomes, provides a framework to identify stochastic and evolutionary informative components of genealogic data in these recently diverged organisms. Results Mitochondrial DNA sequences were obtained for four species of Galápagos mockingbirds and three sympatric species of ectoparasites - two louse and one mite species. These data were complemented with nuclear EF1α sequences in selected samples of parasites and with information from microsatellite loci in the mockingbirds. Mitochondrial sequence data revealed differences in population genetic diversity between all taxa and varying degrees of topological congruence between host and parasite lineages. A very low level of genetic variability and lack of congruence was found in one of the louse parasites, which was excluded from subsequent joint analysis of mitochondrial data. The reconciled multi-species tree obtained from the analysis is congruent with both the nuclear data and the geological history of the islands. Conclusions The gene genealogies of Galápagos mockingbirds and two of their ectoparasites show strong phylogeographic correlations, with instances of incongruence mostly explained by ancestral genetic polymorphism. A third parasite genealogy shows low levels of genetic diversity and little evidence of co-phylogeny with their hosts. 
These differences can mostly be explained by variation in life-history characteristics, primarily host specificity and dispersal capabilities. We show that pooling genetic data from organisms living in close ecological association reveals a more accurate phylogeographic history for these taxa. Our results have implications for the conservation and taxonomy of Galápagos mockingbirds and their parasites. PMID:21966954
J.B. Whittall; J. Syring; M. Parks; J. Buenrostro; C. Dick; A. Liston; R. Cronn
2010-01-01
Critical to conservation efforts and other investigations at low taxonomic levels, DNA sequence data offer important insights into the distinctiveness, biogeographic partitioning, and evolutionary histories of species. The resolving power of DNA sequences is often limited by insufficient variability at the intraspecific level. This is particularly true of studies...
Genome Sequence of the Yeast Clavispora lusitaniae Type Strain CBS 6936.
Durrens, Pascal; Klopp, Christophe; Biteau, Nicolas; Fitton-Ouhabi, Valérie; Dementhon, Karine; Accoceberry, Isabelle; Sherman, David J; Noël, Thierry
2017-08-03
Clavispora lusitaniae, an environmental saprophytic yeast belonging to the CTG clade of Candida, can behave occasionally as an opportunistic pathogen in humans. We report here the genome sequence of the type strain CBS 6936. Comparison with sequences of strain ATCC 42720 indicates conservation of chromosomal structure but significant nucleotide divergence. Copyright © 2017 Durrens et al.
Genome Sequence of the Yeast Clavispora lusitaniae Type Strain CBS 6936
Klopp, Christophe; Biteau, Nicolas; Fitton-Ouhabi, Valérie; Dementhon, Karine; Accoceberry, Isabelle; Sherman, David J.; Noël, Thierry
2017-01-01
Clavispora lusitaniae, an environmental saprophytic yeast belonging to the CTG clade of Candida, can behave occasionally as an opportunistic pathogen in humans. We report here the genome sequence of the type strain CBS 6936. Comparison with sequences of strain ATCC 42720 indicates conservation of chromosomal structure but significant nucleotide divergence. PMID:28774979
Osca, David; Templado, José; Zardoya, Rafael
2014-09-01
The complete nucleotide sequence of the mitochondrial (mt) genome of the deep-sea vent snail Ifremeria nautilei (Gastropoda: Abyssochrysoidea) was determined. The double-stranded circular molecule is 15,664 bp in length and encodes the typical 37 metazoan mitochondrial genes. The gene arrangement of the Ifremeria mt genome is most similar to the genome organization of caenogastropods and differs only in the relative position of the trnW gene. The deduced amino acid sequences of the mt protein-coding genes of the Ifremeria mt genome were aligned with orthologous sequences from representatives of the main lineages of gastropods, and phylogenetic relationships were inferred. The reconstructed phylogeny supports that Ifremeria belongs to Caenogastropoda and that it is closely related to hypsogastropod superfamilies. Results were compared with a reconstructed nuclear-based phylogeny. Moreover, a relaxed molecular-clock timetree calibrated with fossils dated the divergence of Abyssochrysoidea in the Late Jurassic-Early Cretaceous, indicating a relatively modern colonization of deep-sea environments by these snails. Copyright © 2014 Elsevier B.V. All rights reserved.
Developmentally distinct MYB genes encode functionally equivalent proteins in Arabidopsis.
Lee, M M; Schiefelbein, J
2001-05-01
The duplication and divergence of developmental control genes is thought to have driven morphological diversification during the evolution of multicellular organisms. To examine the molecular basis of this process, we analyzed the functional relationship between two paralogous MYB transcription factor genes, WEREWOLF (WER) and GLABROUS1 (GL1), in Arabidopsis. The WER and GL1 genes specify distinct cell types and exhibit non-overlapping expression patterns during Arabidopsis development. Nevertheless, reciprocal complementation experiments with a series of gene fusions showed that WER and GL1 encode functionally equivalent proteins, and their unique roles in plant development are entirely due to differences in their cis-regulatory sequences. Similar experiments with a distantly related MYB gene (MYB2) showed that its product cannot functionally substitute for WER or GL1. Furthermore, an analysis of the WER and GL1 proteins shows that conserved sequences correspond to specific functional domains. These results provide new insights into the evolution of the MYB gene family in Arabidopsis, and, more generally, they demonstrate that novel developmental gene function may arise solely by the modification of cis-regulatory sequences.
Indigenous species barcode database improves the identification of zooplankton
Yang, Jianghua; Zhang, Wanwan; Sun, Jingying; Xie, Yuwei; Zhang, Yimin; Burton, G. Allen; Yu, Hongxia
2017-01-01
Incompleteness and inaccuracy of DNA barcode databases is considered an important hindrance to the use of metabarcoding in biodiversity analysis of zooplankton at the species-level. Species barcoding by Sanger sequencing is inefficient for organisms with small body sizes, such as zooplankton. Here mitochondrial cytochrome c oxidase I (COI) fragment barcodes from 910 freshwater zooplankton specimens (87 morphospecies) were recovered by a high-throughput sequencing platform, Ion Torrent PGM. Intraspecific divergence of most zooplanktons was < 5%, except Branchionus leydign (Rotifer, 14.3%), Trichocerca elongate (Rotifer, 11.5%), Lecane bulla (Rotifer, 15.9%), Synchaeta oblonga (Rotifer, 5.95%) and Schmackeria forbesi (Copepod, 6.5%). Metabarcoding data of 28 environmental samples from Lake Tai were annotated by both an indigenous database and NCBI Genbank database. The indigenous database improved the taxonomic assignment of metabarcoding of zooplankton. Most zooplankton (81%) with barcode sequences in the indigenous database were identified by metabarcoding monitoring. Furthermore, the frequency and distribution of zooplankton were also consistent between metabarcoding and morphology identification. Overall, the indigenous database improved the taxonomic assignment of zooplankton. PMID:28977035
Penny, D; Hasegawa, M; Waddell, P J; Hendy, M D
1999-03-01
We explore the tree of mammalian mtDNA sequences, using particularly the LogDet transform on amino acid sequences, the distance Hadamard transform, and the Closest Tree selection criterion. The amino acid composition of different species show significant differences, even within mammals. After compensating for these differences, nearest-neighbor bootstrap results suggest that the tree is locally stable, though a few groups show slightly greater rearrangements when a large proportion of the constant sites are removed. Many parts of the trees we obtain agree with those on published protein ML trees. Interesting results include a preference for rodent monophyly. The detection of a few alternative signals to those on the optimal tree were obtained using the distance Hadamard transform (with results expressed as a Lento plot). One rearrangement suggested was the interchange of the position of primates and rodents on the optimal tree. The basic stability of the tree, combined with two calibration points (whale/cow and horse/rhinoceros), together with a distant secondary calibration from the mammal/bird divergence, allows inferences of the times of divergence of putative clades. Allowing for sampling variances due to finite sequence length, most major divergences amongst lineages leading to modern orders, appear to occur well before the Cretaceous/Tertiary (K/T) boundary. Implications arising from these early divergences are discussed, particularly the possibility of competition between the small dinosaurs and the new mammal clades.
Liu, Xia; Li, Yuan; Yang, Hongyuan; Zhou, Boyang
2018-04-09
The complete chloroplast (cp) genome of Talinum paniculatum (Caryophyllales), a plant with pharmaceutical efficacy similar to that of ginseng and a widely distributed and cultivated edible vegetable, was sequenced and analyzed. The cp genome of T. paniculatum is 156,929 bp long, with a pair of inverted repeats (IRs) of 25,751 bp each separated by a large single copy (LSC) region of 86,898 bp and a small single copy (SSC) region of 18,529 bp. The genome contains 83 protein-coding genes, 37 transfer RNA (tRNA) genes, eight ribosomal RNA (rRNA) genes and four pseudogenes. Fifty-one (51) repeat units and ninety-two (92) simple sequence repeats (SSRs) were found in the genome. Sequence alignment showed that the pseudogene rpl23 (ribosomal protein L23), located in the IR region, carries an AATT insertion relative to other Caryophyllales species. The trnK-UUU (tRNA-Lys) and rpl16 (ribosomal protein L16) genes have larger introns in T. paniculatum, and the matK (maturase K) gene, which is usually located in the intron of trnK-UUU, shows rich sequence divergence in Caryophyllales. Complete cp genome comparison with eight other Caryophyllales species indicated that the differences between T. paniculatum and P. oleracea were very slight, and that the most highly divergent regions occurred in intergenic spacers. Comparisons of IR boundaries among nine Caryophyllales species showed that T. paniculatum has a larger IR region with relatively slight contraction. The phylogenetic analysis of 35 Caryophyllales species and two outgroup species revealed that T. paniculatum and P. oleracea do not belong to the same family. All these results provide good opportunities for future identification and barcoding of Talinum species, for understanding the evolutionary mode of the Caryophyllales cp genome, and for molecular breeding of T. paniculatum with high pharmaceutical efficacy.
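The region lengths reported in this abstract can be checked against the stated genome size. The sketch below is only an arithmetic illustration, assuming the usual quadripartite layout (LSC + IR + SSC + IR); the function name quadripartite_length is a hypothetical helper, not something from the paper.

```python
# Region lengths (bp) as reported in the abstract.
LSC = 86_898            # large single copy region
SSC = 18_529            # small single copy region
IR = 25_751             # one inverted repeat; the genome carries two copies
REPORTED_TOTAL = 156_929


def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a circular genome laid out as LSC + IR + SSC + IR."""
    return lsc + ssc + 2 * ir


total = quadripartite_length(LSC, SSC, IR)
ir_share = 2 * IR / total  # fraction of the genome occupied by both IR copies

print(total == REPORTED_TOTAL)           # True: the reported parts sum to the reported total
print(f"IR share: {100 * ir_share:.1f}%")
```

The parts sum exactly to 156,929 bp, so the reported figures are internally consistent under that layout.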
Biological function in the twilight zone of sequence conservation.
Ponting, Chris P
2017-08-16
Strong DNA conservation among divergent species is an indicator of enduring functionality. With weaker sequence conservation we enter a vast 'twilight zone' in which sequence subject to transient or lower constraint cannot be distinguished easily from neutrally evolving, non-functional sequence. Twilight zone functional sequence is illuminated instead by principles of selective constraint and positive selection using genomic data acquired from within a species' population. Application of these principles reveals that despite being biochemically active, most twilight zone sequence is not functional.
A highly divergent Puumala virus lineage in southern Poland.
Rosenfeld, Ulrike M; Drewes, Stephan; Ali, Hanan Sheikh; Sadowska, Edyta T; Mikowska, Magdalena; Heckel, Gerald; Koteja, Paweł; Ulrich, Rainer G
2017-05-01
Puumala virus (PUUV) represents one of the most important hantaviruses in Central Europe. Phylogenetic analyses of PUUV strains indicate a strong genetic structuring of this hantavirus. Recently, PUUV sequences were identified in the natural reservoir, the bank vole (Myodes glareolus), collected in the northern part of Poland. The objective of this study was to evaluate the presence of PUUV in bank voles from southern Poland. A total of 72 bank voles were trapped in 2009 at six sites in this part of Poland. RT-PCR and IgG-ELISA analyses detected three PUUV positive voles at one trapping site. The PUUV-infected animals were identified by cytochrome b gene analysis to belong to the Carpathian and Eastern evolutionary lineages of bank vole. The novel PUUV S, M and L segment nucleotide sequences showed the closest similarity to sequences of the Russian PUUV lineage from Latvia, but were highly divergent to those previously found in northern Poland, Slovakia and Austria. In conclusion, the detection of a highly divergent PUUV lineage in southern Poland indicates the necessity of further bank vole monitoring in this region allowing rational public health measures to prevent human infections.
Testing the molecular clock using mechanistic models of fossil preservation and molecular evolution
2017-01-01
Molecular sequence data provide information about relative times only, and fossil-based age constraints are the ultimate source of information about absolute times in molecular clock dating analyses. Thus, fossil calibrations are critical to molecular clock dating, but competing methods are difficult to evaluate empirically because the true evolutionary time scale is never known. Here, we combine mechanistic models of fossil preservation and sequence evolution in simulations to evaluate different approaches to constructing fossil calibrations and their impact on Bayesian molecular clock dating, and the relative impact of fossil versus molecular sampling. We show that divergence time estimation is impacted by the model of fossil preservation, sampling intensity and tree shape. The addition of sequence data may improve molecular clock estimates, but accuracy and precision are dominated by the quality of the fossil calibrations. Posterior means and medians are poor representatives of true divergence times; posterior intervals provide a much more accurate estimate of divergence times, though they may be wide and often do not have high coverage probability. Our results highlight the importance of increased fossil sampling and improved statistical approaches to generating calibrations, which should incorporate the non-uniform nature of ecological and temporal fossil species distributions. PMID:28637852
Erickson, Harold P.
2009-01-01
Summary The eukaryotic cytoskeleton appears to have evolved from ancestral precursors related to prokaryotic FtsZ and MreB. FtsZ and MreB show 40−50% sequence identity across different bacterial and archaeal species. Here I suggest that this represents the limit of divergence that is consistent with maintaining their functions for cytokinesis and cell shape. Previous analyses have noted that tubulin and actin are highly conserved across eukaryotic species, but so divergent from their prokaryotic relatives as to be hardly recognizable from sequence comparisons. One suggestion for this extreme divergence of tubulin and actin is that it occurred as they evolved very different functions from FtsZ and MreB. I will present new arguments favoring this suggestion, and speculate on pathways. Moreover, the extreme conservation of tubulin and actin across eukaryotic species is not due to an intrinsic lack of variability, but is attributed to their acquisition of elaborate mechanisms for assembly dynamics and their interactions with multiple motor and binding proteins. A new structure-based sequence alignment identifies amino acids that are conserved from FtsZ to tubulins. The highly conserved amino acids are not those forming the subunit core or protofilament interface, but those involved in binding and hydrolysis of GTP. PMID:17563102
Mitochondrial genomes reveal the extinct Hippidion as an outgroup to all living equids.
Der Sarkissian, Clio; Vilstrup, Julia T; Schubert, Mikkel; Seguin-Orlando, Andaine; Eme, David; Weinstock, Jacobo; Alberdi, Maria Teresa; Martin, Fabiana; Lopez, Patricio M; Prado, Jose L; Prieto, Alfredo; Douady, Christophe J; Stafford, Tom W; Willerslev, Eske; Orlando, Ludovic
2015-03-01
Hippidions were equids with very distinctive anatomical features. They lived in South America 2.5 million years ago (Ma) until their extinction approximately 10 000 years ago. The evolutionary origin of the three known Hippidion morphospecies is still disputed. Based on palaeontological data, Hippidion could have diverged from the lineage leading to modern equids before 10 Ma. In contrast, a much later divergence date, with Hippidion nesting within modern equids, was indicated by partial ancient mitochondrial DNA sequences. Here, we characterized eight Hippidion complete mitochondrial genomes at 3.4-386.3-fold coverage using target-enrichment capture and next-generation sequencing. Our dataset reveals that the two morphospecies sequenced (H. saldiasi and H. principale) formed a monophyletic clade, basal to extant and extinct Equus lineages. This contrasts with previous genetic analyses and supports Hippidion as a distinct genus, in agreement with palaeontological models. We date the Hippidion split from Equus at 5.6-6.5 Ma, suggesting an early divergence in North America prior to the colonization of South America, after the formation of the Panamanian Isthmus 3.5 Ma and the Great American Biotic Interchange. © 2015 The Author(s) Published by the Royal Society. All rights reserved.
Smith, M. Alex; Fisher, Brian L; Hebert, Paul D.N
2005-01-01
The role of DNA barcoding as a tool to accelerate the inventory and analysis of diversity for hyperdiverse arthropods is tested using ants in Madagascar. We demonstrate how DNA barcoding helps address the failure of current inventory methods to rapidly respond to pressing biodiversity needs, specifically in the assessment of richness and turnover across landscapes with hyperdiverse taxa. In a comparison of inventories at four localities in northern Madagascar, patterns of richness were not significantly different when richness was determined using morphological taxonomy (morphospecies) or sequence divergence thresholds (Molecular Operational Taxonomic Unit(s); MOTU). However, sequence-based methods tended to yield greater richness and significantly lower indices of similarity than morphological taxonomy. MOTU determined using our molecular technique were a remarkably local phenomenon—indicative of highly restricted dispersal and/or long-term isolation. In cases where molecular and morphological methods differed in their assignment of individuals to categories, the morphological estimate was always more conservative than the molecular estimate. In those cases where morphospecies descriptions collapsed distinct molecular groups, sequence divergences of 16% (on average) were contained within the same morphospecies. Such high divergences highlight taxa for further detailed genetic, morphological, life history, and behavioral studies. PMID:16214741
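Grouping specimens into MOTU by a sequence divergence threshold can be sketched as single-linkage clustering on pairwise distances. This is a minimal illustration only: the abstract does not state the threshold or the distance measure used, so the uncorrected p-distance, the 0.1 cutoff, the toy sequences and the names p_distance and motu_clusters are all assumptions.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def motu_clusters(seqs: dict[str, str], threshold: float) -> list[set[str]]:
    """Single-linkage clusters: two specimens are linked when distance <= threshold."""
    parent = {name: name for name in seqs}

    def find(x: str) -> str:
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if p_distance(seqs[a], seqs[b]) <= threshold:
            parent[find(a)] = find(b)

    groups: dict[str, set[str]] = {}
    for name in seqs:
        groups.setdefault(find(name), set()).add(name)
    return list(groups.values())


# Toy aligned fragments (hypothetical data, not from the study).
toy = {
    "ant1": "ACGTACGTACGTACGTACGT",
    "ant2": "ACGTACGTACGTACGTACGA",  # 1 of 20 sites differs from ant1
    "ant3": "TGCATGCAACGTACGTACGT",  # 8 of 20 sites differ from ant1
}
clusters = motu_clusters(toy, threshold=0.1)
print(sorted(sorted(c) for c in clusters))
```

Single linkage is one of several possible choices; a stricter linkage rule or a different cutoff would change the MOTU count, which is part of why richness estimates from sequence thresholds and from morphology can differ.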
A DNA Barcode Library for North American Pyraustinae (Lepidoptera: Pyraloidea: Crambidae).
Yang, Zhaofu; Landry, Jean-François; Hebert, Paul D N
2016-01-01
Although members of the crambid subfamily Pyraustinae are frequently important crop pests, their identification is often difficult because many species lack conspicuous diagnostic morphological characters. DNA barcoding employs sequence diversity in a short standardized gene region to facilitate specimen identifications and species discovery. This study provides a DNA barcode reference library for North American pyraustines based upon the analysis of 1589 sequences recovered from 137 nominal species, 87% of the fauna. Data from 125 species were barcode compliant (>500bp, <1% n), and 99 of these taxa formed a distinct cluster that was assigned to a single BIN. The other 26 species were assigned to 56 BINs, reflecting frequent cases of deep intraspecific sequence divergence and a few instances of barcode sharing, creating a total of 155 BINs. Two systems for OTU designation, ABGD and BIN, were examined to check the correspondence between current taxonomy and sequence clusters. The BIN system performed better than ABGD in delimiting closely related species, while OTU counts with ABGD were influenced by the value employed for relative gap width. Different species with low or no interspecific divergence may represent cases of unrecognized synonymy, whereas those with high intraspecific divergence require further taxonomic scrutiny as they may involve cryptic diversity. The barcode library developed in this study will also help to advance understanding of relationships among species of Pyraustinae.
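The barcode compliance criterion quoted in this abstract (>500 bp, <1% n) and the BIN tally can be expressed as a short check. This is a sketch under stated assumptions: the function name is_barcode_compliant is hypothetical, and counting ambiguous calls as the letter N over the full sequence length is an assumed reading of the criterion.

```python
def is_barcode_compliant(seq: str, min_len: int = 500, max_n_frac: float = 0.01) -> bool:
    """True if the sequence is longer than min_len bp and under max_n_frac ambiguous N calls."""
    seq = seq.upper()
    if len(seq) <= min_len:
        return False
    return seq.count("N") / len(seq) < max_n_frac


# BIN tally as reported in the abstract.
single_bin_species = 99     # species that each formed one distinct BIN
bins_from_split_taxa = 56   # BINs covering the other 26 species
total_bins = single_bin_species + bins_from_split_taxa

print(total_bins)                               # 155, matching the reported total
print(is_barcode_compliant("ACGT" * 150))       # 600 bp, no N
print(is_barcode_compliant("ACGT" * 100))       # 400 bp, too short
```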
Analysis of the Prefoldin Gene Family in 14 Plant Species
Cao, Jun
2016-01-01
Prefoldin is a hexameric molecular chaperone complex present in all eukaryotes and archaea. The evolution of this gene family in plants is unknown. Here, I identified 140 prefoldin genes in 14 plant species. These prefoldin proteins were divided into nine groups through phylogenetic analysis. Highly conserved gene organization and motif distribution exist in each prefoldin group, implying their functional conservation. I also observed segmental duplication in the maize prefoldin gene family. Moreover, a few functional divergence sites were identified within each group pair. Functional network analyses identified 78 co-expressed genes, and most of them were involved in carrying, binding and kinase activity. Divergent expression profiles of the maize prefoldin genes were further investigated in different tissues and development periods and under auxin and some abiotic stresses. I also found a few cis-elements responding to abiotic stress and phytohormones in the upstream sequences of the maize prefoldin genes. The results provide a foundation for exploring the characterization of the prefoldin genes in plants and will offer insights for additional functional studies. PMID:27014333
Divergent and nonuniform gene expression patterns in mouse brain
Morris, John A.; Royall, Joshua J.; Bertagnolli, Darren; Boe, Andrew F.; Burnell, Josh J.; Byrnes, Emi J.; Copeland, Cathy; Desta, Tsega; Fischer, Shanna R.; Goldy, Jeff; Glattfelder, Katie J.; Kidney, Jolene M.; Lemon, Tracy; Orta, Geralyn J.; Parry, Sheana E.; Pathak, Sayan D.; Pearson, Owen C.; Reding, Melissa; Shapouri, Sheila; Smith, Kimberly A.; Soden, Chad; Solan, Beth M.; Weller, John; Takahashi, Joseph S.; Overly, Caroline C.; Lein, Ed S.; Hawrylycz, Michael J.; Hohmann, John G.; Jones, Allan R.
2010-01-01
Considerable progress has been made in understanding variations in gene sequence and expression level associated with phenotype, yet how genetic diversity translates into complex phenotypic differences remains poorly understood. Here, we examine the relationship between genetic background and spatial patterns of gene expression across seven strains of mice, providing the most extensive cellular-resolution comparative analysis of gene expression in the mammalian brain to date. Using comprehensive brainwide anatomic coverage (more than 200 brain regions), we applied in situ hybridization to analyze the spatial expression patterns of 49 genes encoding well-known pharmaceutical drug targets. Remarkably, over 50% of the genes examined showed interstrain expression variation. In addition, the variability was nonuniformly distributed across strain and neuroanatomic region, suggesting certain organizing principles. First, the degree of expression variance among strains mirrors genealogic relationships. Second, expression pattern differences were concentrated in higher-order brain regions such as the cortex and hippocampus. Divergence in gene expression patterns across the brain could contribute significantly to variations in behavior and responses to neuroactive drugs in laboratory mouse strains and may help to explain individual differences in human responsiveness to neuroactive drugs. PMID:20956311
Classification and Lineage Tracing of SH2 Domains Throughout Eukaryotes.
Liu, Bernard A
2017-01-01
Today there exists a rapidly expanding number of sequenced genomes. Cataloging protein interaction domains such as the Src Homology 2 (SH2) domain across these various genomes can be accomplished with ease due to existing algorithms and prediction models. An evolutionary analysis of SH2 domains provides a step towards understanding how SH2 proteins integrated with existing signaling networks to position phosphotyrosine signaling as a crucial driver of robust cellular communication networks in metazoans. However, organizing and tracing SH2 domains across organisms and understanding their evolutionary trajectory remains a challenge. This chapter describes several methodologies towards analyzing the evolutionary trajectory of SH2 domains, including a global SH2 domain classification system, which facilitates annotation of new SH2 sequences essential for tracing the lineage of SH2 domains throughout eukaryote evolution. This classification utilizes a combination of sequence homology, protein domain architecture and the boundary positions between introns and exons within the SH2 domain or genes encoding these domains. Discrete SH2 families can then be traced across various genomes to provide insight into their origins. Furthermore, additional methods for examining potential mechanisms for divergence of SH2 domains, from structural changes to alterations in the protein domain content and genome duplication, will be discussed. A better understanding of SH2 domain evolution may therefore enhance our insight into the emergence of phosphotyrosine signaling and the expansion of protein interaction domains.
Huang, Zhuo; Long, Hai; Wei, Yu-Ming; Yan, Ze-Hong; Zheng, You-Liang
2016-04-01
The α-gliadins account for 15-30% of the total storage protein in wheat endosperm and play important roles in dough extensibility and nutritional quality. On the other hand, they act as a main source of toxic peptides triggering celiac disease. In this study, 37 α-gliadins were isolated from three species of Aegilops section Sitopsis. Sequence similarity and phylogenetic analyses revealed novel allelic variation at Gli-2 loci of species of Sitopsis and regular organization of motifs in their repetitive domain. Based on comprehensive analyses of a large number of known sequences of bread wheat and its diploid genome progenitors, the distributions of four T cell epitopes and length variations of two polyglutamine domains were analyzed. Additionally, according to the organization of repeat motifs, we classified the α-gliadins of Triticum and Aegilops into eight types. Their most recent common ancestor and putative divergence patterns were further considered. This study provides new insights into the allelic variations of α-gliadins in Aegilops section Sitopsis, as well as the evolution of the α-gliadin multigene family among Triticum and Aegilops species.
Gene organization and alternative splicing of human prohormone convertase PC8.
Goodge, K A; Thomas, R J; Martin, T J; Gillespie, M T
1998-01-01
The mammalian Ca2+-dependent serine protease prohormone convertase PC8 is expressed ubiquitously, being transcribed as 3.5, 4.3 and 6.0 kb mRNA isoforms in various tissues. To determine the origin of these various mRNA isoforms we report the characterization of the human PC8 gene, which has been previously localized to chromosome 11q23-24. Consisting of 16 exons, the human PC8 gene spans approx. 27 kb. A comparison of the position of intron-exon junctions of the human PC8 gene with the gene structures of previously reported prohormone convertase genes demonstrated a divergence of the human PC8 from the highly conserved nature of the gene organization of this enzyme family. The nucleotide sequence of the 5'-flanking region of the human PC8 is reported and possesses putative promoter elements characteristic of a GC-rich promoter. Further supporting the potential role of a GC-rich promoter element, multiple transcriptional initiation sites within a 200 bp region were demonstrated. We propose that the various mRNA isoforms of PC8 result from the inclusion of intronic sequences within transcripts. PMID:9820811
Dennenmoser, Stefan; Vamosi, Steven M; Nolte, Arne W; Rogers, Sean M
2017-01-01
Understanding the genomic basis of adaptive divergence in the presence of gene flow remains a major challenge in evolutionary biology. In prickly sculpin (Cottus asper), an abundant euryhaline fish in northwestern North America, high genetic connectivity among brackish-water (estuarine) and freshwater (tributary) habitats of coastal rivers does not preclude the build-up of neutral genetic differentiation and emergence of different life history strategies. Because these two habitats present different osmotic niches, we predicted high genetic differentiation at known teleost candidate genes underlying salinity tolerance and osmoregulation. We applied whole-genome sequencing of pooled DNA samples (Pool-Seq) to explore adaptive divergence between two estuarine and two tributary habitats. Paired-end sequence reads were mapped against genomic contigs of European Cottus, and the gene content of candidate regions was explored based on comparisons with the threespine stickleback genome. Genes showing signals of repeated differentiation among brackish-water and freshwater habitats included functions such as ion transport and structural permeability in freshwater gills, which suggests that local adaptation to different osmotic niches might contribute to genomic divergence among habitats. Overall, the presence of both repeated and unique signatures of differentiation across many loci scattered throughout the genome is consistent with polygenic adaptation from standing genetic variation and locally variable selection pressures in the early stages of life history divergence. © 2016 John Wiley & Sons Ltd.
Kibenge, Molly J T; Iwamoto, Tokinori; Wang, Yingwei; Morton, Alexandra; Godoy, Marcos G; Kibenge, Frederick S B
2013-07-11
Piscine reovirus (PRV) is a newly discovered fish reovirus of anadromous and marine fish ubiquitous among fish in Norwegian salmon farms, and likely the causative agent of heart and skeletal muscle inflammation (HSMI). HSMI is an increasingly economically significant disease in Atlantic salmon (Salmo salar) farms. The nucleotide sequence data available for PRV are limited, and there is no genetic information on this virus outside of Norway and none from wild fish. RT-PCR amplification and sequencing were used to obtain the complete viral genome of PRV (10 segments) from western Canada and Chile. The genetic diversity among the PRV strains and their relationship to Norwegian PRV isolates were determined by phylogenetic analyses and sequence identity comparisons. PRV is distantly related to members of the genera Orthoreovirus and Aquareovirus and an unambiguous new genus within the family Reoviridae. The Canadian and Norwegian PRV strains are most divergent in the segment S1 and S4 encoded proteins. Phylogenetic analysis of PRV S1 sequences, for which the largest number of complete sequences from different "isolates" is available, grouped Norwegian PRV strains into a single genotype, Genotype I, with sub-genotypes, Ia and Ib. The Canadian PRV strains matched sub-genotype Ia and Chilean PRV strains matched sub-genotype Ib. PRV should be considered as a member of a new genus within the family Reoviridae with two major Norwegian sub-genotypes. The Canadian PRV diverged from Norwegian sub-genotype Ia around 2007 ± 1, whereas the Chilean PRV diverged from Norwegian sub-genotype Ib around 2008 ± 1.
Jiang, Yuan; Yang, Zhongqi; Wang, Xiaoyi; Hou, Yuxia
2015-01-01
The species belonging to Sclerodermus (Hymenoptera: Bethylidae) are currently the most important insect natural enemies of wood borer pests, mainly buprestid and cerambycid beetles, in China. However, some sibling species of this genus are very difficult to distinguish because of their similar morphological features. To address this issue, we conducted phylogenetic and genetic analyses of cytochrome oxidase subunit I (COI) and 28S rRNA gene sequences from eight species of Sclerodermus reared from different wood borer pests. The eight sibling species were as follows: S. guani Xiao et Wu, S. sichuanensis Xiao, S. pupariae Yang et Yao, and Sclerodermus spp. (Nos. 1–5). A 594-bp fragment of COI and a 750-bp fragment of 28S were subsequently sequenced. For COI, the G-C content was found to be low in all the species, averaging about 30.0%. Sequence divergences (Kimura-2-parameter distances) between congeneric species averaged 4.5%, and intraspecific divergences averaged about 0.09%. Further, the maximum sequence divergences between congeneric species and Sclerodermus sp. (No. 5) averaged about 16.5%. All 136 samples analyzed were included in six reciprocally monophyletic clades in the COI neighbor-joining (NJ) tree. The NJ tree inferred from the 28S rRNA sequence yielded almost identical results, but the samples from S. guani, S. sichuanensis, S. pupariae, and Sclerodermus spp. (Nos. 1–4) clustered together and only Sclerodermus sp. (No. 5) clustered separately. Our findings indicate that the standard barcode region of COI can be efficiently used to distinguish morphologically similar Sclerodermus species. Further, we speculate that Sclerodermus sp. (No. 5) might be a new species of Sclerodermus. PMID:25782000
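The Kimura 2-parameter distance mentioned in this abstract corrects the observed difference between two aligned sequences by treating transitions and transversions separately: d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitional and transversional differences. The sketch below is a minimal illustration, not the authors' pipeline; the function name k2p_distance and the choice to skip gaps and ambiguous bases are assumptions.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(a: str, b: str) -> float:
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for x, y in zip(a.upper(), b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue  # assumption: skip gaps and ambiguous bases
        compared += 1
        if x == y:
            continue
        pair = {x, y}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError when the sequences are too divergent (saturation).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# One transition in ten sites: the corrected distance exceeds the raw 0.10 difference.
print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 4))
```

For small divergences such as the intraspecific values reported here, the correction is slight; it grows as the observed difference increases.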
Identification of a divergent genotype of equine arteritis virus from South American donkeys.
Rivas, J; Neira, V; Mena, J; Brito, B; Garcia, A; Gutierrez, C; Sandoval, D; Ortega, R
2017-12-01
A novel equine arteritis virus (EAV) was isolated and sequenced from feral donkeys in Chile. Phylogenetic analysis indicates that the new virus and South African asinine strains diverged from equine EAV strains at least 100 years ago. The results indicate that asinine strains belong to a different EAV genotype. © 2017 Blackwell Verlag GmbH.
Evolution of the arginase fold and functional diversity
Dowling, Daniel P.; Costanzo, Luigi Di; Gennadios, Heather A.; Christianson, David W.
2009-01-01
The large number of protein structures deposited in the Protein Data Bank allows for the identification of novel structural superfamilies based on conservation of fold in addition to conservation of amino acid sequence. Since sequence diverges more rapidly than fold in protein evolution, proteins with little or no significant sequence identity are occasionally observed to adopt similar folds, thereby reflecting unanticipated evolutionary relationships. Here, we review the unique α/β fold first observed in the manganese metalloenzyme rat liver arginase, consisting of a parallel 8 stranded β-sheet surrounded by several helices, and its evolutionary relationship with the zinc-requiring and/or iron-requiring histone deacetylases and acetylpolyamine amidohydrolases. Structural comparisons reveal key features of the core α/β fold that contribute to the divergent metal ion specificity and stoichiometry required for the chemical and biological functions of these enzymes. PMID:18360740
Welker, F
2018-02-20
The study of ancient protein sequences is increasingly focused on the analysis of older samples, including those of ancient hominins. The analysis of such ancient proteomes thereby potentially suffers from "cross-species proteomic effects": the loss of peptide and protein identifications at increased evolutionary distances due to a larger number of protein sequence differences between the database sequence and the analyzed organism. Error-tolerant proteomic search algorithms should theoretically overcome this problem at both the peptide and protein level; however, this has not been demonstrated. If error-tolerant searches do not overcome the cross-species proteomic issue then there might be inherent biases in the identified proteomes. Here, a bioinformatics experiment is performed to test this using a set of modern human bone proteomes and three independent searches against sequence databases at increasing evolutionary distances: the human (0 Ma), chimpanzee (6-8 Ma) and orangutan (16-17 Ma) reference proteomes, respectively. Incorrectly suggested amino acid substitutions are absent when employing adequate filtering criteria for mutable Peptide Spectrum Matches (PSMs), but roughly half of the mutable PSMs were not recovered. As a result, peptide and protein identification rates are higher in error-tolerant mode compared to non-error-tolerant searches, but protein identifications are not recovered completely. The data indicate that peptide length and the number of mutations between the target and database sequences are the main factors influencing mutable PSM identification. The error-tolerant results suggest that the cross-species proteomics problem is not overcome at increasing evolutionary distances, even at the protein level. Peptide and protein loss has the potential to significantly impact divergence dating and proteome comparisons when using ancient samples, as there is a bias towards the identification of conserved sequences and proteins. Effects are minimized between moderately divergent proteomes, as indicated by almost complete recovery of informative positions in the search against the chimpanzee proteome (≈90%, 6-8 Ma). This provides a bioinformatic background to future phylogenetic and proteomic analysis of ancient hominin proteomes, including the future description of novel hominin amino acid sequences, but also has negative implications for the study of fast-evolving proteins in hominins, non-hominin animals, and ancient bacterial proteins in evolutionary contexts.
Getlekha, Nuntaporn; Cioffi, Marcelo de Bello; Maneechot, Nuntiya; Bertollo, Luiz Antônio Carlos; Supiwong, Weerayuth; Tanomtong, Alongklod; Molina, Wagner Franco
2018-02-01
Pomacentrus (damselfishes) is one of the most characteristic groups of fishes in the Indo-Pacific coral reef. Its 77 described species exhibit a complex taxonomy with cryptic lineages across their extensive distribution. Periods of evolutionary divergences between them are very variable, and the cytogenetic events that followed their evolutionary diversification are largely unknown. In this respect, analyses of chromosomal divergence, within a phylogenetic perspective, are particularly informative regarding karyoevolutionary trends. As such, we conducted conventional cytogenetic and cytogenomic analyses in four Pomacentrus species (Pomacentrus similis, Pomacentrus auriventris, Pomacentrus moluccensis, and Pomacentrus cuneatus), through the mapping of repetitive DNA classes and transposable elements, including 18S rDNA, 5S rDNA, (CA)15, (GA)15, (CAA)10, Rex6, and U2 snDNA as markers. P. auriventris and P. similis, belonging to the Pomacentrus coelestis complex, have indistinguishable karyotypes (2n = 48; NF = 48), with a peculiar syntenic organization of ribosomal genes. On the other hand, P. moluccensis and P. cuneatus, belonging to another clade, exhibit very different karyotypes (2n = 48, NF = 86 and 92, respectively), with a large number of bi-armed chromosomes, where multiple pericentric inversions played a significant role in their karyotype organization. In this sense, different chromosomal pathways followed the phyletic diversification in the Pomacentrus genus, making possible the characterization of two well-contrasting species groups regarding their karyotype features. Although pericentric inversions act as an effective postzygotic barrier in many organisms, which also appears to be the case for P. moluccensis and P. cuneatus, the extensive chromosomal similarities between the two species of the P. coelestis complex suggest only a minor role for chromosomal postzygotic barriers in the phyletic diversification of these species.
Medzihradszky, K F; Gibson, B W; Kaur, S; Yu, Z H; Medzihradszky, D; Burlingame, A L; Bass, N M
1992-02-01
The primary structure of a fatty-acid-binding protein (FABP) isolated from the liver of the nurse shark (Ginglymostoma cirratum) was determined by high-performance tandem mass spectrometry (employing multichannel array detection) and Edman degradation. Shark liver FABP consists of 132 amino acids with an acetylated N-terminal valine. The chemical molecular mass of the intact protein determined by electrospray ionization mass spectrometry (Mr = 15124 ± 2.5) was in good agreement with that calculated from the amino acid sequence (Mr = 15121.3). The amino acid sequence of shark liver FABP displays significantly greater similarity to the FABP expressed in mammalian heart, peripheral nerve myelin and adipose tissue (61-53% sequence similarity) than to the FABP expressed in mammalian liver (22% similarity). Phylogenetic trees derived from the comparison of the shark liver FABP amino acid sequence with the members of the mammalian fatty-acid/retinoid-binding protein gene family indicate the initial divergence of an ancestral gene into two major subfamilies: one comprising the genes for mammalian liver FABP and gastrotropin, the other comprising the genes for mammalian cellular retinol-binding proteins I and II, cellular retinoic-acid-binding protein, myelin P2 protein, adipocyte FABP, heart FABP and shark liver FABP, the latter having diverged from the ancestral gene that ultimately gave rise to the present-day mammalian heart FABP, adipocyte FABP and myelin P2 protein sequences. The sequence for intestinal FABP from the rat could be assigned to either subfamily, depending on the approach used for phylogenetic tree construction, but clearly diverged at a relatively early evolutionary time point. Indeed, sequences proximately ancestral or closely related to mammalian intestinal FABP, liver FABP, gastrotropin and the retinoid-binding group of proteins appear to have arisen prior to the divergence of shark liver FABP and should therefore also be present in elasmobranchs. 
The presence in shark liver of an FABP which differs substantially in primary structure from mammalian liver FABP, while being closely related to the FABP expressed in mammalian heart muscle, peripheral nerve myelin and adipocytes, opens a further dimension regarding the question of the existence of structure-dependent and tissue-specific specialization of FABP function in lipid metabolism.
Hass-Jacobus, Barbara L; Futrell-Griggs, Montona; Abernathy, Brian; Westerman, Rick; Goicoechea, Jose-Luis; Stein, Joshua; Klein, Patricia; Hurwitz, Bonnie; Zhou, Bin; Rakhshan, Fariborz; Sanyal, Abhijit; Gill, Navdeep; Lin, Jer-Young; Walling, Jason G; Luo, Mei Zhong; Ammiraju, Jetty Siva S; Kudrna, Dave; Kim, Hye Ran; Ware, Doreen; Wing, Rod A; Miguel, Phillip San; Jackson, Scott A
2006-01-01
Background With the completion of the genome sequence for rice (Oryza sativa L.), the focus of rice genomics research has shifted to the comparison of the rice genome with genomes of other species for gene cloning, breeding, and evolutionary studies. The genus Oryza includes 23 species that shared a common ancestor 8–10 million years ago making this an ideal model for investigations into the processes underlying domestication, as many of the Oryza species are still undergoing domestication. This study integrates high-throughput, hybridization-based markers with BAC end sequence and fingerprint data to construct physical maps of rice chromosome 1 orthologues in two wild Oryza species. Similar studies were undertaken in Sorghum bicolor, a species which diverged from cultivated rice 40–50 million years ago. Results Overgo markers, in conjunction with fingerprint and BAC end sequence data, were used to build sequence-ready BAC contigs for two wild Oryza species. The markers drove contig merges to construct physical maps syntenic to rice chromosome 1 in the wild species and provided evidence for at least one rearrangement on chromosome 1 of the O. sativa versus Oryza officinalis comparative map. When rice overgos were aligned to available S. bicolor sequence, 29% of the overgos aligned with three or fewer mismatches; of these, 41% gave positive hybridization signals. Overgo hybridization patterns supported colinearity of loci in regions of sorghum chromosome 3 and rice chromosome 1 and suggested that a possible genomic inversion occurred in this syntenic region in one of the two genomes after the divergence of S. bicolor and O. sativa. Conclusion The results of this study emphasize the importance of identifying conserved sequences in the reference sequence when designing overgo probes in order for those probes to hybridize successfully in distantly related species. 
As interspecific markers, overgos can be used successfully to construct physical maps in species which diverged less than 8 million years ago, and can be used in a more limited fashion to examine colinearity among species which diverged as much as 40 million years ago. Additionally, overgos are able to provide evidence of genomic rearrangements in comparative physical mapping studies. PMID:16895597
Genome-Wide Search Identifies 1.9 Mb from the Polar Bear Y Chromosome for Evolutionary Analyses
Bidon, Tobias; Schreck, Nancy; Hailer, Frank; Nilsson, Maria A.; Janke, Axel
2015-01-01
The male-inherited Y chromosome is the major haploid fraction of the mammalian genome, rendering Y-linked sequences an indispensable resource for evolutionary research. However, despite recent large-scale genome sequencing approaches, only a handful of Y chromosome sequences have been characterized to date, mainly in model organisms. Using polar bear (Ursus maritimus) genomes, we compare two different in silico approaches to identify Y-linked sequences: 1) Similarity to known Y-linked genes and 2) difference in the average read depth of autosomal versus sex chromosomal scaffolds. Specifically, we mapped available genomic sequencing short reads from a male and a female polar bear against the reference genome and identify 112 Y-chromosomal scaffolds with a combined length of 1.9 Mb. We verified the in silico findings for the longer polar bear scaffolds by male-specific in vitro amplification, demonstrating the reliability of the average read depth approach. The obtained Y chromosome sequences contain protein-coding sequences, single nucleotide polymorphisms, microsatellites, and transposable elements that are useful for evolutionary studies. A high-resolution phylogeny of the polar bear patriline shows two highly divergent Y chromosome lineages, obtained from analysis of the identified Y scaffolds in 12 previously published male polar bear genomes. Moreover, we find evidence of gene conversion among ZFX and ZFY sequences in the giant panda lineage and in the ancestor of ursine and tremarctine bears. Thus, the identification of Y-linked scaffold sequences from unordered genome sequences yields valuable data to infer phylogenomic and population-genomic patterns in bears. PMID:26019166
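The read-depth approach described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name, the input format (per-scaffold mean depths already normalized to each sample's genome-wide depth) and both ratio thresholds are assumptions chosen for illustration. The idea is that a Y-linked scaffold should attract almost no reads from a female, whereas an X-linked scaffold should show roughly twice the normalized depth in a female as in a male.

```python
def classify_scaffolds(male_depth, female_depth, y_max_ratio=0.3, x_min_ratio=1.5):
    """Label scaffolds by their female-to-male depth ratio.

    male_depth, female_depth: dicts mapping scaffold name to mean read depth,
    assumed to be normalized by each sample's genome-wide depth.
    y_max_ratio, x_min_ratio: hypothetical cut-offs, not values from the study.
    """
    calls = {}
    for scaffold, m in male_depth.items():
        f = female_depth.get(scaffold, 0.0)
        if m <= 0:
            # Without male coverage the ratio is undefined.
            calls[scaffold] = "no_male_coverage"
            continue
        ratio = f / m
        if ratio <= y_max_ratio:
            calls[scaffold] = "Y_candidate"   # female reads (nearly) absent
        elif ratio >= x_min_ratio:
            calls[scaffold] = "X_candidate"   # two X copies in the female
        else:
            calls[scaffold] = "autosomal"     # similar depth in both sexes
    return calls


example_male = {"scaf1": 1.0, "scaf2": 0.5, "scaf3": 0.5}
example_female = {"scaf1": 1.0, "scaf2": 1.0, "scaf3": 0.0}
example_calls = classify_scaffolds(example_male, example_female)
```

Candidates flagged this way would still need independent confirmation, which the abstract reports doing by male-specific in vitro amplification.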
Detection of a divergent variant of grapevine virus F by next-generation sequencing.
Molenaar, Nicholas; Burger, Johan T; Maree, Hans J
2015-08-01
The complete genome sequence of a South African isolate of grapevine virus F (GVF) is presented. It was first detected by metagenomic next-generation sequencing of field samples and validated through direct Sanger sequencing. The genome sequence of GVF isolate V5 consists of 7539 nucleotides and contains a poly(A) tail. It has a typical vitivirus genome arrangement that comprises five open reading frames (ORFs), which share only 88.96 % nucleotide sequence identity with the existing complete GVF genome sequence (JX105428).
Morard, Raphaël; Escarguel, Gilles; Weiner, Agnes K M; André, Aurore; Douady, Christophe J; Wade, Christopher M; Darling, Kate F; Ujiié, Yurika; Seears, Heidi A; Quillévéré, Frédéric; de Garidel-Thoron, Thibault; de Vargas, Colomban; Kucera, Michal
2016-09-01
Investigations of biodiversity, biogeography, and ecological processes rely on the identification of "species" as biologically significant, natural units of evolution. In this context, morphotaxonomy only provides an adequate level of resolution if reproductive isolation matches morphological divergence. In many groups of organisms, morphologically defined species often disguise considerable genetic diversity, which may be indicative of the existence of cryptic species. The diversity hidden by morphological species can be disentangled through genetic surveys, which also provide access to data on the ecological distribution of genetically circumscribed units. These units can be identified by unique DNA sequence motifs and allow studies of evolutionary and ecological processes at different levels of divergence. However, the nomenclature of genetically circumscribed units within morphological species is not regulated and lacks stability. This represents a major obstacle to efforts to synthesize and communicate data on genetic diversity for multiple stakeholders. We have been confronted with such an obstacle in our work on planktonic foraminifera, where the stakeholder community is particularly diverse, involving geochemists, paleoceanographers, paleontologists, and biologists, and the lack of stable nomenclature beyond the level of formal morphospecies prevents effective transfer of knowledge. To circumvent this problem, we have designed a stable, reproducible, and flexible nomenclature system for genetically circumscribed units, analogous to the principles of a formal nomenclature system. Our system is based on the definition of unique DNA sequence motifs collocated within an individual, their typification (in analogy with holotypes), utilization of their hierarchical phylogenetic structure to define levels of divergence below that of the morphospecies, and a set of nomenclature rules assuring stability. 
The resulting molecular operational taxonomic units remain outside the domain of current nomenclature codes, but are linked to formal morphospecies as regulated by the codes. Subsequently, we show how this system can be applied to classify genetically defined units using the SSU rDNA marker in planktonic foraminifera and we highlight its potential use for other groups of organisms where similarly high levels of connectivity between molecular and formal taxonomies can be achieved. © The Author(s) 2016. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
[A study on identification of edible bird's nests by DNA barcodes].
Chen, Yue-Juan; Liu, Wen-Jian; Chen, Dan-Na; Chieng, Sing-Hock; Jiang, Lin
2017-12-01
To provide a theoretical basis for the traceability and quality evaluation of edible bird's nests (EBNs), the Cytb sequence was applied to identify the origin of EBNs. A total of 39 experimental samples were collected from Malaysia, Indonesia, Vietnam and Thailand. Genomic DNA was extracted for PCR, and the amplified products were sequenced. A further 36 sequences were downloaded from GenBank, covering the edible-nest swiftlet, black-nest swiftlet, Mascarene swiftlet, Pacific swiftlet and Germain's swiftlet. MEGA 7.0 was used to compare the sequences by calculating intraspecific and interspecific distances and by constructing NJ and UPGMA phylogenetic trees based on the Kimura 2-parameter model. The results showed that the 39 samples came from three kinds of EBNs. Interspecific divergences were significantly greater than intraspecific ones. Samples could be successfully distinguished by the NJ and UPGMA phylogenetic trees. In conclusion, the Cytb sequence can distinguish the origin of EBNs and is efficient for tracing their species of origin. Copyright© by the Chinese Pharmaceutical Association.
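The Kimura 2-parameter distance named in this abstract has a simple closed form, sketched below for a pair of pre-aligned sequences. This is an illustrative re-implementation, not MEGA's code; the function name and the choice to skip gap or ambiguous columns are assumptions. With P the proportion of transitions and Q the proportion of transversions among compared sites, the distance is d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q).

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to equal length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: ignore gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    term1 = 1 - 2 * p - q
    term2 = 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term1) - 0.25 * math.log(term2)
```

Intraspecific and interspecific divergences of the kind compared in the abstract are then averages of this distance over within-species and between-species sequence pairs.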
Genomic and transcriptomic approaches to study immunology in cyprinids: What is next?
Petit, Jules; David, Lior; Dirks, Ron; Wiegertjes, Geert F
2017-10-01
Accelerated by the introduction of Next-Generation Sequencing (NGS), a number of genomes of cyprinid fish species have been drafted, leading to a highly valuable collective resource of comparative genome information on cyprinids (Cyprinidae). In addition, NGS-based transcriptome analyses of different developmental stages, organs, or cell types increasingly contribute to the understanding of complex physiological processes, including immune responses. Cyprinids are a highly interesting family because they comprise one of the most-diversified families of teleosts and because of their variation in ploidy level, with diploid, triploid, tetraploid, hexaploid and sometimes even octoploid species. The wealth of data obtained from NGS technologies provides both challenges and opportunities for immunological research, which will be discussed here. Correct interpretation of ploidy effects on immune responses requires knowledge of the degree of functional divergence between duplicated genes, which can differ even between closely related cyprinid fish species. We summarize NGS-based progress in analysing immune responses and discuss the importance of respecting the presence of (multiple) duplicated gene sequences when performing transcriptome analyses for detailed understanding of complex physiological processes. Progressively, advances in NGS technology are providing workable methods to further elucidate the implications of gene duplication events and functional divergence of duplicated genes and proteins involved in immune responses in cyprinids. We conclude by discussing how future applications of NGS technologies and analysis methods could enhance immunological research and understanding. Copyright © 2017 The Authors. Published by Elsevier Ltd. All rights reserved.
Jiang, Ke; Zhang, Peng
2011-01-01
TRPA1 is a calcium ion channel protein recently identified as the infrared receptor in pit organ-containing snakes. Therefore, understanding the molecular evolution of TRPA1 may help to illuminate the origin of “heat vision” in snakes and reveal the molecular mechanism of infrared sensitivity for TRPA1. To this end, we sequenced the infrared sensory gene TRPA1 in 24 snake species, representing nine snake families and multiple non-snake outgroups. We found that TRPA1 is under strong positive selection in the pit-bearing snakes studied, but not in other non-pit snakes and non-snake vertebrates. As a comparison, TRPV1, a gene closely related to TRPA1, was found to be under strong purifying selection in all the species studied, with no difference in the strength of selection between pit-bearing snakes and non-pit snakes. This finding demonstrates that the adaptive evolution of TRPA1 specifically occurred within the pit-bearing snakes and may be related to the functional modification for detecting infrared radiation. In addition, by comparing the TRPA1 protein sequences, we identified 11 amino acid sites that were diverged in pit-bearing snakes but conserved in non-pit snakes and other vertebrates, and 21 sites that were diverged only within pit-vipers but conserved in the remaining snakes. These specific amino acid substitutions may be functionally important for infrared sensing. PMID:22163322
Osborne, Megan J; Turner, Thomas F
2011-06-01
The major histocompatibility complex (MHC) is a critical component of the adaptive immune response in vertebrates. Due to the role that MHC plays in immunity, absence of variation within these genes may cause species to be vulnerable to emerging diseases. The freshwater fish family Cyprinidae comprises the most diverse and species-rich group of freshwater fish in the world, but some are imperiled. Despite considerable species richness and the long evolutionary history of the family, there are very few reports of MHC sequences (apart from a few model species), and no sequences are reported from endemic North American cyprinids (subfamily Leuciscinae). Here we isolate and characterize the MH Class II beta genes from complementary DNA and genomic DNA of the non-model, endangered Rio Grande silvery minnow (Hybognathus amarus), a North American cyprinid. Phylogenetic reconstruction revealed two groups of divergent MH alleles that are paralogous to previously described loci found in deeply divergent cyprinid taxa including common carp, zebrafish, African large barb and bream. Both groups of alleles were under the influence of diversifying selection yet not all individuals had alleles belonging to both allelic groups. We concluded that the general organization and pattern of variation of MH class II genes in Rio Grande silvery minnow is similar to that identified in other cyprinid fishes studied to date, despite distant evolutionary relationships and evidence of a severe genetic bottleneck. Copyright © 2011 Elsevier Ltd. All rights reserved.
Wei, Chaoling; Yang, Hua; Wang, Songbo; Zhao, Jian; Liu, Chun; Gao, Liping; Xia, Enhua; Lu, Ying; Tai, Yuling; She, Guangbiao; Sun, Jun; Cao, Haisheng; Tong, Wei; Gao, Qiang; Li, Yeyun; Deng, Weiwei; Jiang, Xiaolan; Wang, Wenzhao; Chen, Qi; Zhang, Shihua; Li, Haijing; Wu, Junlan; Wang, Ping; Li, Penghui; Shi, Chengying; Zheng, Fengya; Jian, Jianbo; Huang, Bei; Shan, Dai; Shi, Mingming; Fang, Congbing; Yue, Yi; Li, Fangdong; Li, Daxiang; Wei, Shu; Han, Bin; Jiang, Changjun; Yin, Ye; Xia, Tao; Zhang, Zhengzhu; Bennetzen, Jeffrey L; Zhao, Shancen; Wan, Xiaochun
2018-05-01
Tea, one of the world's most important beverage crops, provides numerous secondary metabolites that account for its rich taste and health benefits. Here we present a high-quality sequence of the genome of tea, Camellia sinensis var. sinensis (CSS), using both Illumina and PacBio sequencing technologies. At least 64% of the 3.1-Gb genome assembly consists of repetitive sequences, and the rest yields 33,932 high-confidence predictions of encoded proteins. Divergence between two major lineages, CSS and Camellia sinensis var. assamica (CSA), is calculated to ∼0.38 to 1.54 million years ago (Mya). Analysis of genic collinearity reveals that the tea genome is the product of two rounds of whole-genome duplications (WGDs) that occurred ∼30 to 40 and ∼90 to 100 Mya. We provide evidence that these WGD events, and subsequent paralogous duplications, had major impacts on the copy numbers of secondary metabolite genes, particularly genes critical to producing three key quality compounds: catechins, theanine, and caffeine. Analyses of transcriptome and phytochemistry data show that amplification and transcriptional divergence of genes encoding a large acyltransferase family and leucoanthocyanidin reductases are associated with the characteristic young leaf accumulation of monomeric galloylated catechins in tea, while functional divergence of a single member of the glutamine synthetase gene family yielded theanine synthetase. This genome sequence will facilitate understanding of tea genome evolution and tea metabolite pathways, and will promote germplasm utilization for breeding improved tea varieties. Copyright © 2018 the Author(s). Published by PNAS.
Chakraborty, Ujani; George, Carolyn M.; Lyndaker, Amy M.; Alani, Eric
2016-01-01
Single-strand annealing (SSA) is an important homologous recombination mechanism that repairs DNA double strand breaks (DSBs) occurring between closely spaced repeat sequences. During SSA, the DSB is acted upon by exonucleases to reveal complementary sequences that anneal and are then repaired through tail clipping, DNA synthesis, and ligation steps. In baker’s yeast, the Msh DNA mismatch recognition complex and the Sgs1 helicase act to suppress SSA between divergent sequences by binding to mismatches present in heteroduplex DNA intermediates and triggering a DNA unwinding mechanism known as heteroduplex rejection. Using baker’s yeast as a model, we have identified new factors and regulatory steps in heteroduplex rejection during SSA. First we showed that Top3-Rmi1, a topoisomerase complex that interacts with Sgs1, is required for heteroduplex rejection. Second, we found that the replication processivity clamp proliferating cell nuclear antigen (PCNA) is dispensable for heteroduplex rejection, but is important for repairing mismatches formed during SSA. Third, we showed that modest overexpression of Msh6 results in a significant increase in heteroduplex rejection; this increase is due to a compromise in Msh2-Msh3 function required for the clipping of 3′ tails. Thus 3′ tail clipping during SSA is a critical regulatory step in the repair vs. rejection decision; rejection is favored before the 3′ tails are clipped. Unexpectedly, Msh6 overexpression, through interactions with PCNA, disrupted heteroduplex rejection between divergent sequences in another recombination substrate. These observations illustrate the delicate balance that exists between repair and replication factors to optimize genome stability. PMID:26680658
Early animal evolution: emerging views from comparative biology and geology
NASA Technical Reports Server (NTRS)
Knoll, A. H.; Carroll, S. B.
1999-01-01
The Cambrian appearance of fossils representing diverse phyla has long inspired hypotheses about possible genetic or environmental catalysts of early animal evolution. Only recently, however, have data begun to emerge that can resolve the sequence of genetic and morphological innovations, environmental events, and ecological interactions that collectively shaped Cambrian evolution. Assembly of the modern genetic tool kit for development and the initial divergence of major animal clades occurred during the Proterozoic Eon. Crown group morphologies diversified in the Cambrian through changes in the genetic regulatory networks that organize animal ontogeny. Cambrian radiation may have been triggered by environmental perturbation near the Proterozoic-Cambrian boundary and subsequently amplified by ecological interactions within reorganized ecosystems.
DRS is far less divergent than streptococcal inhibitor of complement of group A streptococcus.
Sagar, Vivek; Kumar, Rajesh; Ganguly, Nirmal K; Menon, Thangam; Chakraborti, Anuradha
2007-04-01
When 100 group A streptococcus isolates were screened, drs, a variant of sic, was identified in emm12 and emm55 isolates. Molecular characterization showed that the drs gene sequence is highly conserved, unlike the sic gene sequence. However, the variation in gene size observed was due to the presence of extra internal repeat sequences.
Deep Sequencing Reveals a Divergent Ugandan cassava brown streak virus Isolate from Malawi
Winter, Stephan; Mukasa, Settumba; Tairo, Fred; Sseruwagi, Peter; Ndunguru, Joseph; Duffy, Siobain
2017-01-01
ABSTRACT Illumina sequencing of RNA from a cassava cutting from northern Malawi produced a genome of Ugandan cassava brown streak virus (UCBSV-MW-NB7_2013). Sequence comparisons revealed stronger similarity to an isolate from nearby Tanzania (93.4% pairwise nucleotide identity) than to those previously reported from Malawi (86.9 to 87.0%). PMID:28818908
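Pairwise nucleotide identity figures such as those quoted in this abstract are typically computed over an alignment of the two genomes. The sketch below shows one common convention, under stated assumptions: the sequences are already aligned, columns containing a gap in either sequence are excluded from the denominator, and the function name is hypothetical. Different tools treat gaps differently, so values from this sketch are not guaranteed to match those reported by any particular program.

```python
def percent_identity(aligned1, aligned2, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned1) != len(aligned2):
        raise ValueError("sequences must be aligned to equal length")
    compared = matches = 0
    for a, b in zip(aligned1.upper(), aligned2.upper()):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are not counted
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared
```

The complementary quantity, 100 minus the percent identity, is the uncorrected pairwise divergence (p-distance) expressed as a percentage.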
Evolution of Enzyme Superfamilies: Comprehensive Exploration of Sequence-Function Relationships.
Baier, F; Copp, J N; Tokuriki, N
2016-11-22
The sequence and functional diversity of enzyme superfamilies have expanded through billions of years of evolution from a common ancestor. Understanding how protein sequence and functional "space" have expanded, at both the evolutionary and molecular level, is central to biochemistry, molecular biology, and evolutionary biology. Integrative approaches that examine protein sequence, structure, and function have begun to provide comprehensive views of the functional diversity and evolutionary relationships within enzyme superfamilies. In this review, we outline the recent advances in our understanding of enzyme evolution and superfamily functional diversity. We describe the tools that have been used to comprehensively analyze sequence relationships and to characterize sequence and function relationships. We also highlight recent large-scale experimental approaches that systematically determine the activity profiles across enzyme superfamilies. We identify several intriguing insights from this recent body of work. First, promiscuous activities are prevalent among extant enzymes. Second, many divergent proteins retain "function connectivity" via enzyme promiscuity, which can be used to probe the evolutionary potential and history of enzyme superfamilies. Finally, we discuss open questions regarding the intricacies of enzyme divergence, as well as potential research directions that will deepen our understanding of enzyme superfamily evolution.
Diversity and phylogenetic relationships among Bartonella strains from Thai bats.
McKee, Clifton D; Kosoy, Michael Y; Bai, Ying; Osikowicz, Lynn M; Franka, Richard; Gilbert, Amy T; Boonmar, Sumalee; Rupprecht, Charles E; Peruski, Leonard F
2017-01-01
Bartonellae are phylogenetically diverse, intracellular bacteria commonly found in mammals. Previous studies have demonstrated that bats have a high prevalence and diversity of Bartonella infections globally. Isolates (n = 42) were obtained from five bat species in four provinces of Thailand and analyzed using sequences of the citrate synthase gene (gltA). Sequences clustered into seven distinct genogroups; four of these genogroups displayed similarity with Bartonella spp. sequences from other bats in Southeast Asia, Africa, and Eastern Europe. Thirty of the isolates representing these seven genogroups were further characterized by sequencing four additional loci (ftsZ, nuoG, rpoB, and ITS) to clarify their evolutionary relationships with other Bartonella species and to assess patterns of diversity among strains. Among the seven genogroups, there were differences in the number of sequence variants, ranging from 1-5, and the amount of nucleotide divergence, ranging from 0.035-3.9%. Overall, these seven genogroups meet the criteria for distinction as novel Bartonella species, with sequence divergence among genogroups ranging from 6.4-15.8%. Evidence of intra- and intercontinental phylogenetic relationships and instances of homologous recombination among Bartonella genogroups in related bat species were found in Thai bats.
Janes, Holly; Frahm, Nicole; DeCamp, Allan; Rolland, Morgane; Gabriel, Erin; Wolfson, Julian; Hertz, Tomer; Kallas, Esper; Goepfert, Paul; Friedrich, David P.; Corey, Lawrence; Mullins, James I.; McElrath, M. Juliana; Gilbert, Peter
2012-01-01
Background: The sieve analysis for the Step trial found evidence that breakthrough HIV-1 sequences for MRKAd5/HIV-1 Gag/Pol/Nef vaccine recipients were more divergent from the vaccine insert than placebo sequences in regions with predicted epitopes. We linked the viral sequence data with immune response and acute viral load data to explore mechanisms for and consequences of the observed sieve effect. Methods: Ninety-one male participants (37 placebo and 54 vaccine recipients) were included; viral sequences were obtained at the time of HIV-1 diagnosis. T-cell responses were measured 4 weeks post-second vaccination and at the first or second week post-diagnosis. Acute viral load was obtained at RNA-positive and antibody-negative visits. Findings: Vaccine recipients had a greater magnitude of post-infection CD8+ T cell response than placebo recipients (median 1.68% vs 1.18%; p = 0.04) and greater breadth of post-infection response (median 4.5 vs 2; p = 0.06). Viral sequences for vaccine recipients were marginally more divergent from the insert than placebo sequences in regions of Nef targeted by pre-infection immune responses (p = 0.04; Pol p = 0.13; Gag p = 0.89). Magnitude and breadth of pre-infection responses did not correlate with distance of the viral sequence to the insert (p > 0.50). Acute log viral load trended lower in vaccine versus placebo recipients (estimated mean 4.7 vs 5.1) but the difference was not significant (p = 0.27). Neither was acute viral load associated with distance of the viral sequence to the insert (p > 0.30). Interpretation: Despite evidence of anamnestic responses, the sieve effect was not well explained by available measures of T-cell immunogenicity. Sequence divergence from the vaccine was not significantly associated with acute viral load. While point estimates suggested weak vaccine suppression of viral load, the result was not significant and more viral load data would be needed to detect suppression. PMID:22952672
Vences, Miguel; Rasoloariniaina, Jean R; Riemann, Jana C
2018-02-08
The genus Typhleotris contains three poorly known blind fish species, inhabiting aquifers in the limestone plateau of south-western Madagascar. Until recently these species were known from only few localities, and their pattern of genetic differentiation remains poorly studied. In this study we analyse 122 Typhleotris tissue samples collected from 12 localities, spanning the entire known range of the genus, and use DNA sequences to assign these samples to the three species known. The phylogeny based on the mitochondrial marker cox1 revealed three main clades corresponding to the three species: Typhleotris madagascariensis, T. mararybe and T. pauliani, differing by uncorrected pairwise sequence divergences of 6.3-9.8%. The distribution ranges of the three species overlapped widely: T. mararybe was collected only in a southern group of localities, T. madagascariensis was found in both the southern and the central group of localities, and T. pauliani occurred from the northernmost site to the southern group of localities; yet the three species did not share haplotypes in two nuclear genes, except for three individuals that we hypothesize are hybrids of T. pauliani with T. madagascariensis and T. mararybe. This pattern of concordant mitochondrial and nuclear divergence despite sympatry strongly supports the status of all three taxa as separate species. Phylogeographic structure was obvious in T. madagascariensis, with two separate shallow mitochondrial clades occupying (1) the central vs. (2) the southern group of populations, and in T. pauliani, with separate mitochondrial clades for (1) the northern vs. (2) the central/southern populations. The widespread occurrence of these three cave fish species suggests that the aquifers in south-western Madagascar have at least in the past allowed episodic dispersal and gene flow of subterraneous organisms, whereas the phylogeographic pattern of T. madagascariensis and T. pauliani provides evidence for isolation and loss of connectivity in the more recent past.
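As an illustration of the "uncorrected pairwise sequence divergence" quoted in the abstract above, the following minimal Python sketch computes a p-distance between two aligned sequences. The function name, the decision to skip gaps and ambiguous bases, and the toy sequences are assumptions made for illustration; they are not details taken from the study.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected pairwise divergence: differing sites / compared sites.

    Assumes the two sequences are already aligned (same length).
    Sites where either sequence has a gap or ambiguous base are skipped
    (an illustrative choice, not necessarily the one used in the study).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    valid = set("ACGT")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in valid or b not in valid:
            continue  # skip gaps ('-') and ambiguity codes such as 'N'
        compared += 1
        if a != b:
            differences += 1

    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


if __name__ == "__main__":
    # Toy aligned sequences (hypothetical): 1 difference in 10 sites.
    print(p_distance("ACGTACGTAC", "ACGTACGTAA"))  # 0.1
```

Multiplying the result by 100 gives a percentage directly comparable to the 6.3-9.8% range reported for cox1.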
Hahn, Cassidy M; Iwanowicz, Luke R; Cornman, Robert S; Conway, Carla M; Winton, James R; Blazer, Vicki S
2015-12-01
The white sucker Catostomus commersonii is a freshwater teleost often utilized as a resident sentinel. Here, we sequenced the full genome of a hepatitis B-like virus that infects white suckers from the Great Lakes Region of the United States. Dideoxy sequencing confirmed that the white sucker hepatitis B virus (WSHBV) has a circular genome (3,542 bp) with the prototypical codon organization of hepadnaviruses. Electron microscopy demonstrated that complete virions of approximately 40 nm were present in the plasma of infected fish. Compared to avi- and orthohepadnaviruses, sequence conservation of the core, polymerase, and surface proteins was low and ranged from 16 to 27% at the amino acid level. An X protein homologue common to the orthohepadnaviruses was not present. The WSHBV genome included an atypical, presumptively noncoding region absent in previously described hepadnaviruses. Phylogenetic analyses confirmed WSHBV as distinct from previously documented hepadnaviruses. The level of divergence in protein sequences between WSHBV and other hepadnaviruses and the identification of an HBV-like sequence in an African cichlid provide evidence that a novel genus of the family Hepadnaviridae may need to be established that includes these hepatitis B-like viruses in fishes. Viral transcription was observed in 9.5% (16 of 169) of white suckers evaluated. The prevalence of hepatic tumors in these fish was 4.9%, and only 2.4% of fish were positive for both virus and hepatic tumors. These results are not sufficient to draw inferences regarding the association of WSHBV and carcinogenesis in white sucker. We report the first full-length genome of a hepadnavirus from fishes. Phylogenetic analysis of this genome indicates divergence from genomes of previously described hepadnaviruses from mammalian and avian hosts and supports the creation of a novel genus. 
The discovery of this novel virus may better our understanding of the evolutionary history of hepatitis B-like viruses of other hosts. In fishes, knowledge of this virus may provide insight regarding possible risk factors associated with hepatic neoplasia in the white sucker. This may also offer another model system for mechanistic research. Copyright © 2015 Hahn et al.
Hahn, Cassidy M.; Cornman, Robert S.; Conway, Carla M.; Winton, James R.; Blazer, Vicki S.
2015-01-01
ABSTRACT The white sucker Catostomus commersonii is a freshwater teleost often utilized as a resident sentinel. Here, we sequenced the full genome of a hepatitis B-like virus that infects white suckers from the Great Lakes Region of the United States. Dideoxy sequencing confirmed that the white sucker hepatitis B virus (WSHBV) has a circular genome (3,542 bp) with the prototypical codon organization of hepadnaviruses. Electron microscopy demonstrated that complete virions of approximately 40 nm were present in the plasma of infected fish. Compared to avi- and orthohepadnaviruses, sequence conservation of the core, polymerase, and surface proteins was low and ranged from 16 to 27% at the amino acid level. An X protein homologue common to the orthohepadnaviruses was not present. The WSHBV genome included an atypical, presumptively noncoding region absent in previously described hepadnaviruses. Phylogenetic analyses confirmed WSHBV as distinct from previously documented hepadnaviruses. The level of divergence in protein sequences between WSHBV and other hepadnaviruses and the identification of an HBV-like sequence in an African cichlid provide evidence that a novel genus of the family Hepadnaviridae may need to be established that includes these hepatitis B-like viruses in fishes. Viral transcription was observed in 9.5% (16 of 169) of white suckers evaluated. The prevalence of hepatic tumors in these fish was 4.9%, and only 2.4% of fish were positive for both virus and hepatic tumors. These results are not sufficient to draw inferences regarding the association of WSHBV and carcinogenesis in white sucker. IMPORTANCE We report the first full-length genome of a hepadnavirus from fishes. Phylogenetic analysis of this genome indicates divergence from genomes of previously described hepadnaviruses from mammalian and avian hosts and supports the creation of a novel genus. 
The discovery of this novel virus may better our understanding of the evolutionary history of hepatitis B-like viruses of other hosts. In fishes, knowledge of this virus may provide insight regarding possible risk factors associated with hepatic neoplasia in the white sucker. This may also offer another model system for mechanistic research. PMID:26378165
Evolutionary history of the HAP2/GCS1 gene and sexual reproduction in metazoans.
Steele, Robert E; Dana, Catherine E
2009-11-03
The HAP2/GCS1 gene first appeared in the common ancestor of plants, animals, and protists, and is required in the male gamete for fusion to the female gamete in the unicellular organisms Chlamydomonas and Plasmodium. We have identified a HAP2/GCS1 gene in the genome sequence of the sponge Amphimedon queenslandica. This finding provides a continuous evolutionary history of HAP2/GCS1 from unicellular organisms into the metazoan lineage. Divergent versions of the HAP2/GCS1 gene are also present in the genomes of some but not all arthropods. By examining the expression of the HAP2/GCS1 gene in the cnidarian Hydra, we have found the first evidence supporting the hypothesis that HAP2/GCS1 was used for male gamete fusion in the ancestor of extant metazoans and that it retains that function in modern cnidarians.
Diverse Applications of Environmental DNA Methods in Parasitology.
Bass, David; Stentiford, Grant D; Littlewood, D T J; Hartikainen, Hanna
2015-10-01
Nucleic acid extraction and sequencing of genes from organisms within environmental samples encompasses a variety of techniques collectively referred to as environmental DNA or 'eDNA'. The key advantages of eDNA analysis include the detection of cryptic or otherwise elusive organisms, large-scale sampling with fewer biases than specimen-based methods, and generation of data for molecular systematics. These are particularly relevant for parasitology because parasites can be difficult to locate and are morphologically intractable and genetically divergent. However, parasites have rarely been the focus of eDNA studies. Focusing on eukaryote parasites, we review the increasing diversity of the 'eDNA toolbox'. Combining eDNA methods with complementary tools offers much potential to understand parasite communities, disease risk, and parasite roles in broader ecosystem processes such as food web structuring and community assembly. Crown Copyright © 2015. Published by Elsevier Ltd. All rights reserved.
Hypervariable and highly divergent intron-exon organizations in the chordate Oikopleura dioica.
Edvardsen, Rolf B; Lerat, Emmanuelle; Maeland, Anne Dorthea; Flåt, Mette; Tewari, Rita; Jensen, Marit F; Lehrach, Hans; Reinhardt, Richard; Seo, Hee-Chan; Chourrout, Daniel
2004-10-01
Oikopleura dioica is a pelagic tunicate with a very small genome and a very short life cycle. In order to investigate the intron-exon organizations in Oikopleura, we have isolated and characterized ribosomal protein EF-1alpha, Hox, and alpha-tubulin genes. Their intron positions have been compared with those of the same genes from various invertebrates and vertebrates, including four species with entirely sequenced genomes. Oikopleura genes, like Caenorhabditis genes, have introns at a large number of nonconserved positions, which must originate from late insertions or intron sliding of ancient insertions. Both species exhibit hypervariable intron-exon organization within their alpha-tubulin gene family. This is due to localization of most nonconserved intron positions in single members of this gene family. The hypervariability and divergence of intron positions in Oikopleura and Caenorhabditis may be related to the predominance of short introns, the processing of which is not very dependent upon the exonic environment compared to large introns. Also, both species have an undermethylated genome, and the control of methylation-induced point mutations imposes a control on exon size, at least in vertebrate genes. That introns placed at such variable positions in Oikopleura or C. elegans may serve a specific purpose is not easy to infer from our current knowledge and hypotheses on intron functions. We propose that new introns are retained in species with very short life cycles, because illegitimate exchanges including gene conversion are repressed. We also speculate that introns placed at gene-specific positions may contribute to suppressing these exchanges and thereby favor their own persistence.
Molecular Analysis of Core Kinetochore Composition and Assembly in Drosophila melanogaster
Przewloka, Marcin R.; Archambault, Vincent; D'Avino, Pier Paolo; Lilley, Kathryn S.; Laue, Ernest D.; McAinsh, Andrew D.; Glover, David M.
2007-01-01
Background Kinetochores are large multiprotein complexes indispensable for proper chromosome segregation. Although Drosophila is a classical model organism for studies of chromosome segregation, little is known about the organization of its kinetochores. Methodology/Principal Findings We employed bioinformatics, proteomics and cell biology methods to identify and analyze the interaction network of Drosophila kinetochore proteins. We have shown that three Drosophila proteins highly diverged from human and yeast Ndc80, Nuf2 and Mis12 are indeed their orthologues. Affinity purification of these proteins from cultured Drosophila cells identified a further five interacting proteins with weak similarity to subunits of the SPC105/KNL-1, MIND/MIS12 and NDC80 kinetochore complexes together with known kinetochore associated proteins such as dynein/dynactin, spindle assembly checkpoint components and heterochromatin proteins. All eight kinetochore complex proteins were present at the kinetochore during mitosis and MIND/MIS12 complex proteins were also centromeric during interphase. Their down-regulation led to dramatic defects in chromosome congression/segregation frequently accompanied by mitotic spindle elongation. The systematic depletion of each individual protein allowed us to establish dependency relationships for their recruitment onto the kinetochore. This revealed the sequential recruitment of individual members of first, the MIND/MIS12 and then, NDC80 complex. Conclusions/Significance The Drosophila MIND/MIS12 and NDC80 complexes and the Spc105 protein, like their counterparts from other eukaryotic species, are essential for chromosome congression and segregation, but are highly diverged in sequence. Hierarchical dependence relationships of individual proteins regulate the assembly of Drosophila kinetochore complexes in a manner similar, but not identical, to other organisms. PMID:17534428
Phylogeny and divergence of the pinnipeds (Carnivora: Mammalia) assessed using a multigene dataset
Higdon, Jeff W; Bininda-Emonds, Olaf RP; Beck, Robin MD; Ferguson, Steven H
2007-01-01
Background Phylogenetic comparative methods are often improved by complete phylogenies with meaningful branch lengths (e.g., divergence dates). This study presents a dated molecular supertree for all 34 world pinniped species derived from a weighted matrix representation with parsimony (MRP) supertree analysis of 50 gene trees, each determined under a maximum likelihood (ML) framework. Divergence times were determined by mapping the same sequence data (plus two additional genes) on to the supertree topology and calibrating the ML branch lengths against a range of fossil calibrations. We assessed the sensitivity of our supertree topology in two ways: 1) a second supertree with all mtDNA genes combined into a single source tree, and 2) likelihood-based supermatrix analyses. Divergence dates were also calculated using a Bayesian relaxed molecular clock with rate autocorrelation to test the sensitivity of our supertree results further. Results The resulting phylogenies all agreed broadly with recent molecular studies, in particular supporting the monophyly of Phocidae, Otariidae, and the two phocid subfamilies, as well as an Odobenidae + Otariidae sister relationship; areas of disagreement were limited to four more poorly supported regions. Neither the supertree nor supermatrix analyses supported the monophyly of the two traditional otariid subfamilies, supporting suggestions for the need for taxonomic revision in this group. Phocid relationships were similar to other recent studies and deeper branches were generally well-resolved. Halichoerus grypus was nested within a paraphyletic Pusa, although relationships within Phocina tend to be poorly supported. Divergence date estimates for the supertree were in good agreement with other studies and the available fossil record; however, the Bayesian relaxed molecular clock divergence date estimates were significantly older. 
Conclusion Our results join other recent studies and highlight the need for a re-evaluation of pinniped taxonomy, especially as regards the subfamilial classification of otariids and the generic nomenclature of Phocina. Even with the recent publication of new sequence data, the available genetic sequence information for several species, particularly those in Arctocephalus, remains very limited, especially for nuclear markers. However, resolution of parts of the tree will probably remain difficult, even with additional data, due to apparent rapid radiations. Our study addresses the lack of a recent pinniped phylogeny that includes all species and robust divergence dates for all nodes, and will therefore prove indispensable to comparative and macroevolutionary studies of this group of carnivores. PMID:17996107
Waye, J S; Willard, H F
1986-09-01
The centromeric regions of all human chromosomes are characterized by distinct subsets of a diverse tandemly repeated DNA family, alpha satellite. On human chromosome 17, the predominant form of alpha satellite is a 2.7-kilobase-pair higher-order repeat unit consisting of 16 alphoid monomers. We present the complete nucleotide sequence of the 16-monomer repeat, which is present in 500 to 1,000 copies per chromosome 17, as well as that of a less abundant 15-monomer repeat, also from chromosome 17. These repeat units were approximately 98% identical in sequence, differing by the exclusion of precisely 1 monomer from the 15-monomer repeat. Homologous unequal crossing-over is suggested as a probable mechanism by which the different repeat lengths on chromosome 17 were generated, and the putative site of such a recombination event is identified. The monomer organization of the chromosome 17 higher-order repeat unit is based, in part, on tandemly repeated pentamers. A similar pentameric suborganization has been previously demonstrated for alpha satellite of the human X chromosome. Despite the organizational similarities, substantial sequence divergence distinguishes these subsets. Hybridization experiments indicate that the chromosome 17 and X subsets are more similar to each other than to the subsets found on several other human chromosomes. We suggest that the chromosome 17 and X alpha satellite subsets may be related components of a larger alphoid subfamily which have evolved from a common ancestral repeat into the contemporary chromosome-specific subsets.
Knief, Claudia
2015-01-01
Methane-oxidizing bacteria are characterized by their capability to grow on methane as sole source of carbon and energy. Cultivation-dependent and -independent methods have revealed that this functional guild of bacteria comprises a substantial diversity of organisms. In particular the use of cultivation-independent methods targeting a subunit of the particulate methane monooxygenase (pmoA) as functional marker for the detection of aerobic methanotrophs has resulted in thousands of sequences representing “unknown methanotrophic bacteria.” This limits data interpretation due to restricted information about these uncultured methanotrophs. A few groups of uncultivated methanotrophs are assumed to play important roles in methane oxidation in specific habitats, while the biology behind other sequence clusters remains still largely unknown. The discovery of evolutionary related monooxygenases in non-methanotrophic bacteria and of pmoA paralogs in methanotrophs requires that sequence clusters of uncultivated organisms have to be interpreted with care. This review article describes the present diversity of cultivated and uncultivated aerobic methanotrophic bacteria based on pmoA gene sequence diversity. It summarizes current knowledge about cultivated and major clusters of uncultivated methanotrophic bacteria and evaluates habitat specificity of these bacteria at different levels of taxonomic resolution. Habitat specificity exists for diverse lineages and at different taxonomic levels. Methanotrophic genera such as Methylocystis and Methylocaldum are identified as generalists, but they harbor habitat specific methanotrophs at species level. This finding implies that future studies should consider these diverging preferences at different taxonomic levels when analyzing methanotrophic communities. PMID:26696968
Metatranscriptomics of N2-fixing cyanobacteria in the Amazon River plume
Hilton, Jason A; Satinsky, Brandon M; Doherty, Mary; Zielinski, Brian; Zehr, Jonathan P
2015-01-01
Biological N2 fixation is an important nitrogen source for surface ocean microbial communities. However, nearly all information on the diversity and gene expression of organisms responsible for oceanic N2 fixation in the environment has come from targeted approaches that assay only a small number of genes and organisms. Using genomes of diazotrophic cyanobacteria to extract reads from extensive meta-genomic and -transcriptomic libraries, we examined diazotroph diversity and gene expression from the Amazon River plume, an area characterized by salinity and nutrient gradients. Diazotroph genome and transcript sequences were most abundant in the transitional waters compared with lower salinity or oceanic water masses. We were able to distinguish two genetically divergent phylotypes within the Hemiaulus-associated Richelia sequences, which were the most abundant diazotroph sequences in the data set. Photosystem (PS)-II transcripts in Richelia populations were much less abundant than those in Trichodesmium, and transcripts from several Richelia PS-II genes were absent, indicating a prominent role for cyclic electron transport in Richelia. In addition, there were several abundant regulatory transcripts, including one that targets a gene involved in PS-I cyclic electron transport in Richelia. High sequence coverage of the Richelia transcripts, as well as those from Trichodesmium populations, allowed us to identify expressed regions of the genomes that had been overlooked by genome annotations. High-coverage genomic and transcription analysis enabled the characterization of distinct phylotypes within diazotrophic populations, revealed a distinction in a core process between dominant populations and provided evidence for a prominent role for noncoding RNAs in microbial communities. PMID:25514535
Virus Identification in Unknown Tropical Febrile Illness Cases Using Deep Sequencing
Balmaseda, Angel; Harris, Eva; DeRisi, Joseph L.
2012-01-01
Dengue virus is an emerging infectious agent that infects an estimated 50–100 million people annually worldwide, yet current diagnostic practices cannot detect an etiologic pathogen in ∼40% of dengue-like illnesses. Metagenomic approaches to pathogen detection, such as viral microarrays and deep sequencing, are promising tools to address emerging and non-diagnosable disease challenges. In this study, we used the Virochip microarray and deep sequencing to characterize the spectrum of viruses present in human sera from 123 Nicaraguan patients presenting with dengue-like symptoms but testing negative for dengue virus. We utilized a barcoding strategy to simultaneously deep sequence multiple serum specimens, generating on average over 1 million reads per sample. We then implemented a stepwise bioinformatic filtering pipeline to remove the majority of human and low-quality sequences to improve the speed and accuracy of subsequent unbiased database searches. By deep sequencing, we were able to detect virus sequence in 37% (45/123) of previously negative cases. These included 13 cases with Human Herpesvirus 6 sequences. Other samples contained sequences with similarity to sequences from viruses in the Herpesviridae, Flaviviridae, Circoviridae, Anelloviridae, Asfarviridae, and Parvoviridae families. In some cases, the putative viral sequences were virtually identical to known viruses, and in others they diverged, suggesting that they may derive from novel viruses. These results demonstrate the utility of unbiased metagenomic approaches in the detection of known and divergent viruses in the study of tropical febrile illness. PMID:22347512
Coulthart, Michael B; Posada, David; Crandall, Keith A; Dekaban, Gregory A
2006-03-01
Recently, the putative finding of ancient human T cell leukemia virus type 1 (HTLV-1) long terminal repeat (LTR) DNA sequences in association with a 1500-year-old Chilean mummy has stirred vigorous debate. The debate is based partly on the inherent uncertainties associated with phylogenetic reconstruction when only short sequences of closely related genotypes are available. However, a full analysis of what phylogenetic information is present in the mummy data has not previously been published, leaving open the question of what precisely is the range of admissible interpretation. To fulfill this need, we re-analyzed the mummy data in a new way. We first performed phylogenetic analysis of 188 published LTR DNA sequences from extant strains belonging to the HTLV-1 Cosmopolitan clade, using the method of statistical parsimony which is designed both to optimize phylogenetic resolution among sequences with little evolutionary divergence, and to permit precise mapping of individual sequence mutations onto branches of a divergence network. We then deduced possible phylogenetic positions for the two main categories of published Chilean mummy sequences, based on their published 157-nucleotide LTR sequences. The possible phylogenetic placements for one of the mummy sequence categories are consistent with a modern origin. However, one of these placements for the other mummy sequence category falls very close to the root of the Cosmopolitan clade, consistent with an ancient origin for both this mummy sequence and the Cosmopolitan clade.
Navigating the tip of the genomic iceberg: Next-generation sequencing for plant systematics.
Straub, Shannon C K; Parks, Matthew; Weitemier, Kevin; Fishbein, Mark; Cronn, Richard C; Liston, Aaron
2012-02-01
Just as Sanger sequencing did more than 20 years ago, next-generation sequencing (NGS) is poised to revolutionize plant systematics. By combining multiplexing approaches with NGS throughput, systematists may no longer need to choose between more taxa or more characters. Here we describe a genome skimming (shallow sequencing) approach for plant systematics. Through simulations, we evaluated optimal sequencing depth and performance of single-end and paired-end short read sequences for assembly of nuclear ribosomal DNA (rDNA) and plastomes and addressed the effect of divergence on reference-guided plastome assembly. We also used simulations to identify potential phylogenetic markers from low-copy nuclear loci at different sequencing depths. We demonstrated the utility of genome skimming through phylogenetic analysis of the Sonoran Desert clade (SDC) of Asclepias (Apocynaceae). Paired-end reads performed better than single-end reads. Minimum sequencing depths for high quality rDNA and plastome assemblies were 40× and 30×, respectively. Divergence from the reference significantly affected plastome assembly, but relatively similar references are available for most seed plants. Deeper rDNA sequencing is necessary to characterize intragenomic polymorphism. The low-copy fraction of the nuclear genome was readily surveyed, even at low sequencing depths. Nearly 160000 bp of sequence from three organelles provided evidence of phylogenetic incongruence in the SDC. Adoption of NGS will facilitate progress in plant systematics, as whole plastome and rDNA cistrons, partial mitochondrial genomes, and low-copy nuclear markers can now be efficiently obtained for molecular phylogenetics studies.
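The depth thresholds in the abstract above (40× for rDNA, 30× for plastomes) can be related to read counts with the usual back-of-the-envelope relation, mean depth = number of reads × read length / target length. The sketch below is a simplification under stated assumptions: it treats every counted read as mapping to the target with uniform coverage, whereas in genome skimming only a fraction of all reads derive from the plastome or rDNA. The 150,000 bp target length and 100 bp read length are hypothetical values chosen for illustration, not figures from the study.

```python
import math


def mean_depth(n_reads: int, read_len: int, target_len: int) -> float:
    """Mean sequencing depth, assuming all n_reads map to the target."""
    return n_reads * read_len / target_len


def reads_needed(depth: float, read_len: int, target_len: int) -> int:
    """Smallest number of on-target reads that reaches the requested depth."""
    return math.ceil(depth * target_len / read_len)


if __name__ == "__main__":
    plastome_len = 150_000  # hypothetical plastome length in bp
    read_len = 100          # hypothetical read length in bp
    print(reads_needed(30, read_len, plastome_len))   # on-target reads for 30x
    print(mean_depth(45_000, read_len, plastome_len))  # depth from 45,000 reads
```

To estimate total reads per sample, the on-target count would be divided by the expected fraction of reads that originate from the target, which varies by species and tissue.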
The Evolution of Ribosomal DNA: Divergent Paralogues and Phylogenetic Implications
Buckler-IV, E. S.; Ippolito, A.; Holtsford, T. P.
1997-01-01
Although nuclear ribosomal DNA (rDNA) repeats evolve together through concerted evolution, some genomes contain a considerable diversity of paralogous rDNA. This diversity includes not only multiple functional loci but also putative pseudogenes and recombinants. We examined the occurrence of divergent paralogues and recombinants in Gossypium, Nicotiana, Tripsacum, Winteraceae, and Zea ribosomal internal transcribed spacer (ITS) sequences. Some of the divergent paralogues are probably rDNA pseudogenes, since they have low predicted secondary structure stability, high substitution rates, and many deamination-driven substitutions at methylation sites. Under standard PCR conditions, the low stability paralogues amplified well, while many high-stability paralogues amplified poorly. Under highly denaturing PCR conditions (i.e., with dimethylsulfoxide), both low- and high-stability paralogues amplified well. We also found recombination between divergent paralogues. For phylogenetics, divergent ribosomal paralogues can aid in reconstructing ancestral states and thus serve as good outgroups. Divergent paralogues can also provide companion rDNA phylogenies. However, phylogeneticists must discriminate among families of divergent paralogues and recombinants or suffer from muddled and inaccurate organismal phylogenies. PMID:9055091
Landry, C; Geyer, L B; Arakaki, Y; Uehara, T; Palumbi, Stephen R
2003-01-01
The rich species diversity of the marine Indo-West Pacific (IWP) has been explained largely on the basis of historical observation of large-scale diversity gradients. Careful study of divergence among closely related species can reveal important new information about the pace and mechanisms of their formation, and can illuminate the genesis of biogeographic patterns. Young species inhabiting the IWP include urchins of the genus Echinometra, which diverged over the past 1-5 Myr. Here, we report the most recent divergence of two cryptic species of Echinometra inhabiting this region. Mitochondrial cytochrome oxidase 1 (CO1) sequence data show that in Echinometra oblonga, species-level divergence in sperm morphology, gamete recognition proteins and gamete compatibility arose between central and western Pacific populations in the past 250 000 years. Divergence in sperm attachment proteins suggests rapid evolution of the fertilization system. Divergence of sperm morphology may be a common feature of free-spawning animals, and offers opportunities to simultaneously understand genetic divergence, changes in protein expression patterns and morphological evolution in traits directly related to reproductive isolation. PMID:12964987
Auguste, Albert J.; Liria, Jonathan; Forrester, Naomi L.; Giambalvo, Dileyvic; Moncada, Maria; Long, Kanya C.; Morón, Dulce; de Manzione, Nuris; Tesh, Robert B.; Halsey, Eric S.; Kochel, Tadeusz J.; Hernandez, Rosa; Navarro, Juan-Carlos
2015-01-01
In 2010, an outbreak of febrile illness with arthralgic manifestations was detected at La Estación village, Portuguesa State, Venezuela. The etiologic agent was determined to be Mayaro virus (MAYV), a reemerging South American alphavirus. A total of 77 cases was reported and 19 were confirmed as seropositive. MAYV was isolated from acute-phase serum samples from 6 symptomatic patients. We sequenced 27 complete genomes representing the full spectrum of MAYV genetic diversity, which facilitated detection of a new genotype, designated N. Phylogenetic analysis of genomic sequences indicated that etiologic strains from Venezuela belong to genotype D. Results indicate that MAYV is highly conserved genetically, showing ≈17% nucleotide divergence across all 3 genotypes and 4% among genotype D strains in the most variable genes. Coalescent analyses suggested genotypes D and L diverged ≈150 years ago and genotype N diverged ≈250 years ago. This virus commonly infects persons residing near enzootic transmission foci because of anthropogenic incursions. PMID:26401714
Auguste, Albert J; Liria, Jonathan; Forrester, Naomi L; Giambalvo, Dileyvic; Moncada, Maria; Long, Kanya C; Morón, Dulce; de Manzione, Nuris; Tesh, Robert B; Halsey, Eric S; Kochel, Tadeusz J; Hernandez, Rosa; Navarro, Juan-Carlos; Weaver, Scott C
2015-10-01
In 2010, an outbreak of febrile illness with arthralgic manifestations was detected at La Estación village, Portuguesa State, Venezuela. The etiologic agent was determined to be Mayaro virus (MAYV), a reemerging South American alphavirus. A total of 77 cases was reported and 19 were confirmed as seropositive. MAYV was isolated from acute-phase serum samples from 6 symptomatic patients. We sequenced 27 complete genomes representing the full spectrum of MAYV genetic diversity, which facilitated detection of a new genotype, designated N. Phylogenetic analysis of genomic sequences indicated that etiologic strains from Venezuela belong to genotype D. Results indicate that MAYV is highly conserved genetically, showing ≈17% nucleotide divergence across all 3 genotypes and 4% among genotype D strains in the most variable genes. Coalescent analyses suggested genotypes D and L diverged ≈150 years ago and genotype N diverged ≈250 years ago. This virus commonly infects persons residing near enzootic transmission foci because of anthropogenic incursions.
Park, D-S; Suh, S-J; Hebert, P D N; Oh, H-W; Hong, K-J
2011-08-01
Although DNA barcode coverage has grown rapidly for many insect orders, there are some groups, such as scale insects, where sequence recovery has been difficult. However, using a recently developed primer set, we recovered barcode records from 373 specimens, providing coverage for 75 species from 31 genera in two families. Overall success was >90% for mealybugs and >80% for armored scale species. The G·C content was very low in most species, averaging just 16.3%. Sequence divergences (K2P) between congeneric species averaged 10.7%, while intra-specific divergences averaged 0.97%. However, the latter value was inflated by high intra-specific divergence in nine taxa, cases that may indicate species overlooked by current taxonomic treatments. Our study establishes the feasibility of developing a comprehensive barcode library for scale insects and indicates that its construction will both create an effective system for identifying scale insects and reveal taxonomic situations worthy of deeper analysis.
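The K2P (Kimura two-parameter) divergences quoted in the abstract above can be illustrated with a minimal sketch. It assumes two pre-aligned sequences of equal length and skips any column containing a gap or ambiguity code; the function name `k2p_distance` is hypothetical and is not taken from the study, whose own software and settings are not stated here. The formula used is the standard one, d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitions and transversions.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences.

    Assumption: the sequences are already aligned and of equal length.
    Columns with anything other than A, C, G or T are ignored.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        same_class = (a in PURINES and b in PURINES) or (
            a in PYRIMIDINES and b in PYRIMIDINES
        )
        if same_class:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine<->pyrimidine

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError when the sequences are too divergent
    # (saturated) for the K2P correction to be defined.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

For closely related sequences the result is close to the uncorrected proportion of differing sites; the correction grows as divergence increases.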
Glinsky, Gennadi V.
2016-01-01
Thousands of candidate human-specific regulatory sequences (HSRS) have been identified, supporting the hypothesis that phenotypes unique to humans result from human-specific alterations of genomic regulatory networks. Collectively, a compendium of multiple diverse families of HSRS that are functionally and structurally divergent from Great Apes could be defined as the backbone of human-specific genomic regulatory networks. Here, the conservation patterns analysis of 18,364 candidate HSRS was carried out requiring that 100% of bases must remap during the alignments of human, chimpanzee, and bonobo sequences. A total of 5,535 candidate HSRS were identified that are: (i) highly conserved in Great Apes; (ii) evolved by the exaptation of highly conserved ancestral DNA; (iii) defined by either the acceleration of mutation rates on the human lineage or the functional divergence from non-human primates. The exaptation of highly conserved ancestral DNA pathway seems mechanistically distinct from the evolution of regulatory DNA segments driven by the species-specific expansion of transposable elements. Genome-wide proximity placement analysis of HSRS revealed that a small fraction of topologically associating domains (TADs) contain more than half of HSRS from four distinct families. TADs that are enriched for HSRS and termed rapidly evolving in humans TADs (revTADs) comprise 0.8–10.3% of 3,127 TADs in the hESC genome. RevTADs manifest distinct correlation patterns between placements of human accelerated regions, human-specific transcription factor-binding sites, and recombination rates. There is a significant enrichment within revTAD boundaries of hESC-enhancers, primate-specific CTCF-binding sites, human-specific RNAPII-binding sites, hCONDELs, and H3K4me3 peaks with human-specific enrichment at TSS in prefrontal cortex neurons (P < 0.0001 in all instances).
The present analysis supports the idea that phenotypic divergence of Homo sapiens is driven by the evolution of human-specific genomic regulatory networks via at least two mechanistically distinct pathways of creation of divergent sequences of regulatory DNA: (i) recombination-associated exaptation of the highly conserved ancestral regulatory DNA segments; (ii) human-specific insertions of transposable elements. PMID:27503290
Evolutionary distances in the twilight zone--a rational kernel approach.
Schwarz, Roland F; Fletcher, William; Förster, Frank; Merget, Benjamin; Wolf, Matthias; Schultz, Jörg; Markowetz, Florian
2010-12-31
Phylogenetic tree reconstruction is traditionally based on multiple sequence alignments (MSAs) and heavily depends on the validity of this information bottleneck. With increasing sequence divergence, the quality of MSAs decays quickly. Alignment-free methods, on the other hand, are based on abstract string comparisons and avoid potential alignment problems. However, in general they are not biologically motivated and ignore our knowledge about the evolution of sequences. Thus, it is still a major open question how to define an evolutionary distance metric between divergent sequences that makes use of indel information and known substitution models without the need for a multiple alignment. Here we propose a new evolutionary distance metric to close this gap. It uses finite-state transducers to create a biologically motivated similarity score which models substitutions and indels, and does not depend on a multiple sequence alignment. The sequence similarity score is defined in analogy to pairwise alignments and additionally has the positive semi-definite property. We describe its derivation and show in simulation studies and real-world examples that it is more accurate in reconstructing phylogenies than competing methods. The result is a new and accurate way of determining evolutionary distances in and beyond the twilight zone of sequence alignments that is suitable for large datasets.
Roessler, Christian G.; Hall, Branwen M.; Anderson, William J.; Ingram, Wendy M.; Roberts, Sue A.; Montfort, William R.; Cordes, Matthew H. J.
2008-01-01
Proteins that share common ancestry may differ in structure and function because of divergent evolution of their amino acid sequences. For a typical diverse protein superfamily, the properties of a few scattered members are known from experiment. A satisfying picture of functional and structural evolution in relation to sequence changes, however, may require characterization of a larger, well chosen subset. Here, we employ a “stepping-stone” method, based on transitive homology, to target sequences intermediate between two related proteins with known divergent properties. We apply the approach to the question of how new protein folds can evolve from preexisting folds and, in particular, to an evolutionary change in secondary structure and oligomeric state in the Cro family of bacteriophage transcription factors, initially identified by sequence-structure comparison of distant homologs from phages P22 and λ. We report crystal structures of two Cro proteins, Xfaso 1 and Pfl 6, with sequences intermediate between those of P22 and λ. The domains show 40% sequence identity but differ by switching of α-helix to β-sheet in a C-terminal region spanning ≈25 residues. Sedimentation analysis also suggests a correlation between helix-to-sheet conversion and strengthened dimerization. PMID:18227506
Exploring the temporal structure of heterochronous sequences using TempEst (formerly Path-O-Gen).
Rambaut, Andrew; Lam, Tommy T; Max Carvalho, Luiz; Pybus, Oliver G
2016-01-01
Gene sequences sampled at different points in time can be used to infer molecular phylogenies on a natural timescale of months or years, provided that the sequences in question undergo measurable amounts of evolutionary change between sampling times. Data sets with this property are termed heterochronous and have become increasingly common in several fields of biology, most notably the molecular epidemiology of rapidly evolving viruses. Here we introduce the cross-platform software tool, TempEst (formerly known as Path-O-Gen), for the visualization and analysis of temporally sampled sequence data. Given a molecular phylogeny and the dates of sampling for each sequence, TempEst uses an interactive regression approach to explore the association between genetic divergence through time and sampling dates. TempEst can be used to (1) assess whether there is sufficient temporal signal in the data to proceed with phylogenetic molecular clock analysis, and (2) identify sequences whose genetic divergence and sampling date are incongruent. Examination of the latter can help identify data quality problems, including errors in data annotation, sample contamination, sequence recombination, or alignment error. We recommend that all users of the molecular clock models implemented in BEAST first check their data using TempEst prior to analysis.
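The regression idea described in the abstract above can be sketched in a few lines. This is only an illustration under stated assumptions: root-to-tip distances are taken as given inputs (TempEst derives them from a phylogeny and can also search for a best-fitting root, which is not reproduced here), and the function name `root_to_tip_regression` and its return keys are hypothetical. The slope of an ordinary least-squares fit of distance on sampling date estimates the substitution rate, the x-intercept estimates the date of the root, and R² gives a rough indication of temporal signal.

```python
def root_to_tip_regression(dates, distances):
    """Ordinary least-squares fit of root-to-tip distance on sampling date.

    dates      -- sampling dates as numbers (e.g. decimal years)
    distances  -- root-to-tip genetic distances (substitutions per site)
    Assumption: distances were computed beforehand from a rooted tree.
    """
    n = len(dates)
    if n != len(distances) or n < 2:
        raise ValueError("need at least two matched (date, distance) pairs")

    mean_x = sum(dates) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in dates)
    syy = sum((y - mean_y) ** 2 for y in distances)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dates, distances))
    if sxx == 0:
        raise ValueError("all sampling dates are identical; no temporal spread")

    slope = sxy / sxx                      # substitutions/site per time unit
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 0.0
    # The x-intercept is only meaningful as a root date for a positive slope.
    root_date = -intercept / slope if slope > 0 else None

    return {
        "rate": slope,
        "root_date": root_date,
        "r_squared": r_squared,
    }
```

A low R² or a non-positive slope would suggest too little temporal signal for clock analysis, and sequences that lie far from the fitted line are the ones whose divergence and sampling date are incongruent.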
Identification of three duplicated Spin genes in medaka (Oryzias latipes).
Wang, Xiao-Lei; Mei, Jie; Sun, Min; Hong, Yun-Han; Gui, Jian-Fang
2005-05-09
Gene and genomic duplications are very important and frequent events in fish evolution, and the divergence of duplicated genes in sequences and functions is a focus of research on gene evolution. Here, we report the identification and characterization of three duplicated Spindlin (Spin) genes from medaka (Oryzias latipes): OlSpinA, OlSpinB, and OlSpinC. Molecular cloning, genomic DNA Blast analysis and phylogenetic relationship analysis indicated that the three OlSpin genes arose by gene duplication. Furthermore, Western blot analysis revealed significant expression differences of the three OlSpins among different tissues and during embryogenesis in medaka, and suggested that sequence and functional divergence might have occurred in evolution among them.
Guo, Li; Breakspear, Andrew; Zhao, Guoyi; Gao, Lixin; Kistler, H Corby; Xu, Jin-Rong; Ma, Li-Jun
2016-02-01
The cyclic adenosine monophosphate-protein kinase A (cAMP-PKA) pathway is a central signalling cascade that transmits extracellular stimuli and governs cell responses through the second messenger cAMP. The importance of cAMP signalling in fungal biology has been well documented and the key conserved components, adenylate cyclase (AC) and the catalytic subunit of PKA (CPKA), have been functionally characterized. However, other genes involved in this signalling pathway and their regulation are not well understood in filamentous fungi. Here, we performed a comparative transcriptomics analysis of AC and CPKA mutants in two closely related fungi: Fusarium graminearum (Fg) and F. verticillioides (Fv). Combining available Fg transcriptomics and phenomics data, we reconstructed the Fg cAMP signalling pathway. We developed a computational program that combines sequence conservation and patterns of orthologous gene expression to facilitate global transcriptomics comparisons between different organisms. We observed highly correlated expression patterns for most orthologues (80%) between Fg and Fv. We also identified a subset of 482 (6%) diverged orthologues, whose expression under all conditions was at least 50% higher in one genome than in the other. This enabled us to dissect the conserved and unique portions of the cAMP-PKA pathway. Although the conserved portions controlled essential functions, such as metabolism, the cell cycle, chromatin remodelling and the oxidative stress response, the diverged portions had species-specific roles, such as the production and detoxification of secondary metabolites unique to each species. The evolution of the cAMP-PKA signalling pathway seems to have contributed directly to fungal divergence and niche adaptation. © 2015 The Authors. Molecular Plant Pathology published by British Society for Plant Pathology and John Wiley & Sons Ltd.
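The criterion stated in the abstract above for a diverged orthologue (expression at least 50% higher in one genome than in the other under all conditions) can be written as a small filter. This is a sketch under assumptions, not the authors' program: the input layout, the function name `classify_orthologues`, and the output labels are hypothetical, and expression values are assumed to be already normalized and comparable between the two species.

```python
def classify_orthologues(expression, fold=1.5):
    """Label each orthologue pair by expression divergence.

    expression -- dict mapping an orthologue name to a list of
                  (fg_value, fv_value) pairs, one pair per condition
    fold       -- minimum ratio required in every condition
                  (1.5 corresponds to "at least 50% higher")
    """
    labels = {}
    for name, pairs in expression.items():
        if not pairs:
            raise ValueError(f"no conditions supplied for {name}")
        # The threshold must hold in every condition, not on average.
        fg_higher = all(fg >= fold * fv for fg, fv in pairs)
        fv_higher = all(fv >= fold * fg for fg, fv in pairs)
        if fg_higher and not fv_higher:
            labels[name] = "higher_in_Fg"
        elif fv_higher and not fg_higher:
            labels[name] = "higher_in_Fv"
        else:
            labels[name] = "not_diverged"
    return labels
```

Requiring the threshold in every condition is deliberately strict: a gene that exceeds it in most conditions but not all is left in the non-diverged set.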
Shields, E D
1996-01-01
The quantification of total tooth structure derived from X-rays of Vietnamese, Southern Chinese, Mongolians, Western Eskimos, and Peruvian pre-Inca (Huari Empire) populations was used to examine dental divergence and the morphogenetics of change. Multivariate derived distances between the samples helped identify a quasicontinuous web of ethnic groups with two binary clusters ensconced within the web. One cluster was composed of Mongolians, Western Eskimos, and pre-Inca, and the other group consisted of the Southern Chinese and Vietnamese. Mongolians entered the quasicontinuum from a divergent angle (externally influenced) from that of the Southeast Asians. The Chinese and pre-Inca formed the polar samples of the distance superstructure. The pre-Inca sample was the most isolated, its closest neighbor being the Western Eskimos. Univariate and multivariate analyses suggested that the pre-Inca, whose ancestors arrived in America perhaps approximately 30,000 years ago, was the least derived sample. Clearly, microevolutionary change occurred among the samples, but the dental phenotype was resistant to environmental developmental perturbations. An assessment of dental divergence and developmental biology suggested that the overall dental phenotype is a complex multigenic morphological character, and that the observed variation evolved through total genomic drift. The quantified dental phenotype is greater than its highly multigenic algorithm and its development homeostasis is tightly controlled, or canalized, by the deterministic organization of a complex nonlinear epigenetic milieu. The overall dental phenotype quantified here was selectively neutral and a good character to help reconstruct the sequence of human evolution, but if the outlying homeostatic threshold was or will be exceeded in antecedents and descendants, respectively, evolutionary saltation occurs.
Finding functional features in Saccharomyces genomes by phylogenetic footprinting.
Cliften, Paul; Sudarsanam, Priya; Desikan, Ashwin; Fulton, Lucinda; Fulton, Bob; Majors, John; Waterston, Robert; Cohen, Barak A; Johnston, Mark
2003-07-04
The sifting and winnowing of DNA sequence that occur during evolution cause nonfunctional sequences to diverge, leaving phylogenetic footprints of functional sequence elements in comparisons of genome sequences. We searched for such footprints among the genome sequences of six Saccharomyces species and identified potentially functional sequences. Comparison of these sequences allowed us to revise the catalog of yeast genes and identify sequence motifs that may be targets of transcriptional regulatory proteins. Some of these conserved sequence motifs reside upstream of genes with similar functional annotations or similar expression patterns or those bound by the same transcription factor and are thus good candidates for functional regulatory sequences.
Molecular diversity of early foraminifera
NASA Astrophysics Data System (ADS)
Holzmann, Maria; Pawlowski, Jan
2017-04-01
Monothalamid foraminifera are a diverse group characterized by a single-chambered agglutinated or organic test. They occur in all marine habitats and are also present in terrestrial and freshwater environments. Monothalamids branch at the base of the foraminiferal tree, as a paraphyletic group with some clades branching at the base of Globothalamea and Tubothalamea. We currently have more than 1500 sequences of monothalamids in our database, which can be divided into at least 20 clades, some of which are particularly well represented by sequence numbers and/or number of different species. These are clade BM, which contains Bathysiphon and Micrometula; clade C, which contains among others xenophyophorans, saccamminids, and a large variety of organic-walled or agglutinated genera; clade E, which contains the genera Psammophaga, Vellaria and Nellya; and four clades that contain freshwater foraminifera. In general, the monothalamid clades comprise both agglutinated and organic-walled genera. Some common genera, such as Crithionina, Saccammina, Hippocrepina, are polyphyletic. Our results clearly show that monothalamids are highly diverse and their molecular diversity by far surpasses their morphological variety. Based on phylogenomic studies, monothalamids evolved early in the evolution of eukaryotes, as a part of the supergroup of Rhizaria, comprising also radiolarians and other amoeboid protists. The monothalamids diverged from ancestral radiolarians, probably about 1000 million years ago, but the exact time is difficult to infer because of the uncertainties concerning a calibration of a eukaryotic phylogenomic tree.
Tobler, Michael; Dewitt, Thomas J; Schlupp, Ingo; García de León, Francisco J; Herrmann, Roger; Feulner, Philine G D; Tiedemann, Ralph; Plath, Martin
2008-10-01
Divergent natural selection drives evolutionary diversification. It creates phenotypic diversity by favoring developmental plasticity within populations or genetic differentiation and local adaptation among populations. We investigated phenotypic and genetic divergence in the livebearing fish Poecilia mexicana along two abiotic environmental gradients. These fish typically inhabit nonsulfidic surface rivers, but also colonized sulfidic and cave habitats. We assessed phenotypic variation among a factorial combination of habitat types using geometric and traditional morphometrics, and genetic divergence using quantitative and molecular genetic analyses. Fish in caves (sulfidic or not) exhibited reduced eyes and slender bodies. Fish from sulfidic habitats (surface or cave) exhibited larger heads and longer gill filaments. Common-garden rearing suggested that these morphological differences are partly heritable. Population genetic analyses using microsatellites as well as cytochrome b gene sequences indicate high population differentiation over small spatial scale and very low rates of gene flow, especially among different habitat types. This suggests that divergent environmental conditions constitute barriers to gene flow. Strong molecular divergence over short distances as well as phenotypic and quantitative genetic divergence across habitats in directions classic to fish ecomorphology suggest that divergent selection is structuring phenotypic variation in this system.
Genomic architecture of adaptive color pattern divergence and convergence in Heliconius butterflies
Supple, Megan A.; Hines, Heather M.; Dasmahapatra, Kanchon K.; Lewis, James J.; Nielsen, Dahlia M.; Lavoie, Christine; Ray, David A.; Salazar, Camilo; McMillan, W. Owen; Counterman, Brian A.
2013-01-01
Identifying the genetic changes driving adaptive variation in natural populations is key to understanding the origins of biodiversity. The mosaic of mimetic wing patterns in Heliconius butterflies makes an excellent system for exploring adaptive variation using next-generation sequencing. In this study, we use a combination of techniques to annotate the genomic interval modulating red color pattern variation, identify a narrow region responsible for adaptive divergence and convergence in Heliconius wing color patterns, and explore the evolutionary history of these adaptive alleles. We use whole genome resequencing from four hybrid zones between divergent color pattern races of Heliconius erato and two hybrid zones of the co-mimic Heliconius melpomene to examine genetic variation across 2.2 Mb of a partial reference sequence. In the intergenic region near optix, the gene previously shown to be responsible for the complex red pattern variation in Heliconius, population genetic analyses identify a shared 65-kb region of divergence that includes several sites perfectly associated with phenotype within each species. This region likely contains multiple cis-regulatory elements that control discrete expression domains of optix. The parallel signatures of genetic differentiation in H. erato and H. melpomene support a shared genetic architecture between the two distantly related co-mimics; however, phylogenetic analysis suggests mimetic patterns in each species evolved independently. Using a combination of next-generation sequencing analyses, we have refined our understanding of the genetic architecture of wing pattern variation in Heliconius and gained important insights into the evolution of novel adaptive phenotypes in natural populations. PMID:23674305
Deciphering amphibian diversity through DNA barcoding: chances and challenges.
Vences, Miguel; Thomas, Meike; Bonett, Ronald M; Vieites, David R
2005-10-29
Amphibians globally are in decline, yet there is still a tremendous amount of unrecognized diversity, calling for an acceleration of taxonomic exploration. This process will be greatly facilitated by a DNA barcoding system; however, the mitochondrial population structure of many amphibian species presents numerous challenges to such a standardized, single locus, approach. Here we analyse intra- and interspecific patterns of mitochondrial variation in two distantly related groups of amphibians, mantellid frogs and salamanders, to determine the promise of DNA barcoding with cytochrome oxidase subunit I (cox1) sequences in this taxon. High intraspecific cox1 divergences of 7-14% were observed (18% in one case) within the whole set of amphibian sequences analysed. These high values are not caused by particularly high substitution rates of this gene but by generally deep mitochondrial divergences within and among amphibian species. Despite these high divergences, cox1 sequences were able to correctly identify species including disparate geographic variants. The main problems with cox1 barcoding of amphibians are (i) the high variability of priming sites that hinder the application of universal primers to all species and (ii) the observed distinct overlap of intraspecific and interspecific divergence values, which implies difficulties in the definition of threshold values to identify candidate species. Common discordances between geographical signatures of mitochondrial and nuclear markers in amphibians indicate that a single-locus approach can be problematic when high accuracy of DNA barcoding is required. We suggest that a number of mitochondrial and nuclear genes may be used as DNA barcoding markers to complement cox1.
Genome Evolution in the Primary Endosymbiont of Whiteflies Sheds Light on Their Divergence
Santos-Garcia, Diego; Vargas-Chavez, Carlos; Moya, Andrés; Latorre, Amparo; Silva, Francisco J.
2015-01-01
Whiteflies are important agricultural insect pests, whose evolutionary success is related to a long-term association with a bacterial endosymbiont, Candidatus Portiera aleyrodidarum. To completely characterize this endosymbiont clade, we sequenced the genomes of three new Portiera strains covering the two extant whitefly subfamilies. Using endosymbiont and mitochondrial sequences we estimated the divergence dates in the clade and used these values to understand the molecular evolution of the endosymbiont coding sequences. Portiera genomes were maintained almost completely stable in gene order and gene content during more than 125 Myr of evolution, except in the Bemisia tabaci lineage. The ancestor had already lost the genetic information transfer autonomy but was able to participate in the synthesis of all essential amino acids and carotenoids. The time of divergence of the B. tabaci complex was much more recent than previous estimations. The recent divergence of biotypes B (MEAM1 species) and Q (MED species) suggests that they still could be considered strains of the same species. We have estimated the rates of evolution of Portiera genes, synonymous and nonsynonymous, and have detected significant differences among lineages, with most Portiera lineages evolving very slowly. Although the nonsynonymous rates were much smaller than the synonymous, the genomic dN/dS ratios were similar, discarding selection as the driver of among-lineage variation. We suggest variation in mutation rate and generation time as the responsible factors. In conclusion, the slow evolutionary rates of Portiera may have contributed to its long-term association with whiteflies, avoiding its replacement by a novel and more efficient endosymbiont. PMID:25716826
Phylogenetic position of avian nocturnal and diurnal raptors.
Mahmood, Muhammad Tariq; McLenachan, Patricia A; Gibb, Gillian C; Penny, David
2014-02-01
We report three new avian mitochondrial genomes, two from widely separated groups of owls and a falcon relative (the Secretarybird). We then report additional progress in resolving Neoavian relationships in that the two groups of owls do come together (it is not just long-branch attraction), and the Secretarybird is the deepest divergence on the Accipitridae lineage. This is now agreed between mitochondrial and nuclear sequences. There is no evidence for the monophyly of the combined three groups of raptors (owls, eagles, and falcons), and again this is agreed by nuclear and mitochondrial sequences. All three groups (owls, accipitrids [eagles], and falcons) do appear to be members of the "higher land birds," and though there may not yet be full "consilience" between mitochondrial and nuclear sequences for the precise order of divergences of the eagles, falcons, and the owls, there is good progress on their relationships.
Phylogenetic Position of Avian Nocturnal and Diurnal Raptors
Mahmood, Muhammad Tariq; McLenachan, Patricia A.; Gibb, Gillian C.; Penny, David
2014-01-01
We report three new avian mitochondrial genomes, two from widely separated groups of owls and a falcon relative (the Secretarybird). We then report additional progress in resolving Neoavian relationships in that the two groups of owls do come together (it is not just long-branch attraction), and the Secretarybird is the deepest divergence on the Accipitridae lineage. This is now agreed between mitochondrial and nuclear sequences. There is no evidence for the monophyly of the combined three groups of raptors (owls, eagles, and falcons), and again this is agreed by nuclear and mitochondrial sequences. All three groups (owls, accipitrids [eagles], and falcons) do appear to be members of the “higher land birds,” and though there may not yet be full “consilience” between mitochondrial and nuclear sequences for the precise order of divergences of the eagles, falcons, and the owls, there is good progress on their relationships. PMID:24448983
Evolution of the chalcone synthase gene family in the genus Ipomoea.
Durbin, M L; Learn, G H; Huttley, G A; Clegg, M T
1995-01-01
The evolution of the chalcone synthase [CHS; malonyl-CoA:4-coumaroyl-CoA malonyltransferase (cyclizing), EC 2.3.1.74] multigene family in the genus Ipomoea is explored. Thirteen CHS genes from seven Ipomoea species (family Convolvulaceae) were sequenced--three from genomic clones and the remainder from PCR amplification with primers designed from the 5' flanking region and the end of the 3' coding region of Ipomoea purpurea Roth. Analysis of the data indicates a duplication of CHS that predates the divergence of the Ipomoea species in this study. The Ipomoea CHS genes are among the most rapidly evolving of the CHS genes sequenced to date. The CHS genes in this study are most closely related to the Petunia CHS-B gene, which is also rapidly evolving and highly divergent from the rest of the Petunia CHS sequences. PMID:7724563
New Hepatitis B Virus of Cranes That Has an Unexpected Broad Host Range
Prassolov, Alexej; Hohenberg, Heinz; Kalinina, Tatyana; Schneider, Carola; Cova, Lucyna; Krone, Oliver; Frölich, Kai; Will, Hans; Sirma, Hüseyin
2003-01-01
All hepadnaviruses known so far have a very limited host range, restricted to their natural hosts and a few closely related species. This is thought to be due mainly to sequence divergence in the large envelope protein and species-specific differences in host components essential for virus propagation. Here we report an infection of cranes with a novel hepadnavirus, designated CHBV, that has an unexpectedly broad host range and is only distantly evolutionarily related to avihepadnaviruses of related hosts. Direct DNA sequencing of amplified CHBV DNA as well as sequencing of cloned viral genomes revealed that CHBV is most closely related to, although distinct from, Ross' goose hepatitis B virus (RGHBV) and slightly less closely related to duck hepatitis B virus (DHBV). Phylogenetically, cranes are very distant from geese and ducks and are most closely related to herons and storks. Naturally occurring hepadnaviruses in the last two species are highly divergent in sequence from RGHBV and DHBV and do not infect ducks or do so only marginally. In contrast, CHBV from crane sera and recombinant CHBV produced from LMH cells infected primary duck hepatocytes almost as efficiently as DHBV did. This is the first report of a rather broad host range of an avihepadnavirus. Our data imply either usage of similar or identical entry pathways and receptors by DHBV and CHBV, unusual host and virus adaptation mechanisms, or divergent evolution of the host genomes and cellular components required for virus propagation. PMID:12525630
New hepatitis B virus of cranes that has an unexpected broad host range.
Prassolov, Alexej; Hohenberg, Heinz; Kalinina, Tatyana; Schneider, Carola; Cova, Lucyna; Krone, Oliver; Frölich, Kai; Will, Hans; Sirma, Hüseyin
2003-02-01
All hepadnaviruses known so far have a very limited host range, restricted to their natural hosts and a few closely related species. This is thought to be due mainly to sequence divergence in the large envelope protein and species-specific differences in host components essential for virus propagation. Here we report an infection of cranes with a novel hepadnavirus, designated CHBV, that has an unexpectedly broad host range and is only distantly evolutionarily related to avihepadnaviruses of related hosts. Direct DNA sequencing of amplified CHBV DNA as well as sequencing of cloned viral genomes revealed that CHBV is most closely related to, although distinct from, Ross' goose hepatitis B virus (RGHBV) and slightly less closely related to duck hepatitis B virus (DHBV). Phylogenetically, cranes are very distant from geese and ducks and are most closely related to herons and storks. Naturally occurring hepadnaviruses in the last two species are highly divergent in sequence from RGHBV and DHBV and do not infect ducks or do so only marginally. In contrast, CHBV from crane sera and recombinant CHBV produced from LMH cells infected primary duck hepatocytes almost as efficiently as DHBV did. This is the first report of a rather broad host range of an avihepadnavirus. Our data imply either usage of similar or identical entry pathways and receptors by DHBV and CHBV, unusual host and virus adaptation mechanisms, or divergent evolution of the host genomes and cellular components required for virus propagation.
Divergence, differential methylation and interspersion of melon satellite DNA sequences.
Shmookler Reis, R; Timmis, J N; Ingle, J
1981-01-01
Melon (Cucumis melo) satellite DNA consists of two components, Q and S, each with a buoyant density in CsCl of 1.707 g/ml, but differing by 9 °C in "melting" temperature. These physical properties appear to be in contradiction, since both depend on G + C content. In order to resolve this anomaly, base compositions were directly determined for isolated fractions. The low-"melting" component S contains 41.8% G + C, with 6% of C present as 5-methylcytosine, whereas Q DNA contains 54% G + C, with 41% of C methylated. Analyses of restriction site loss agreed well with the direct determinations of methylation and divergence, and indicated some clustering of methylated sites in Q DNA. Analysis of restricted main-band DNA by hybridization with RNA complementary to Q satellite DNA ("Southern transfer") showed satellite Q tandem arrays interspersed in DNA of main-band density. Sequence divergence and extent of methylation did not appear to depend on whether a repeat array was present as satellite or interspersed in main-band DNA. Hybridization in situ indicated considerable heterogeneity in the genomic proportion of the Q-DNA sequences in melon fruit nuclei, implying over- and under-representation consistent with extensive unequal recombination in satellite Q tandem arrays. The cucumber, Cucumis sativus, contains less than 8% as much Q-homologous DNA per genome as the melon, suggesting rapid evolutionary gain or loss of these tandem repeat sequences. PMID:6172117
Testing the molecular clock using mechanistic models of fossil preservation and molecular evolution.
Warnock, Rachel C M; Yang, Ziheng; Donoghue, Philip C J
2017-06-28
Molecular sequence data provide information about relative times only, and fossil-based age constraints are the ultimate source of information about absolute times in molecular clock dating analyses. Thus, fossil calibrations are critical to molecular clock dating, but competing methods are difficult to evaluate empirically because the true evolutionary time scale is never known. Here, we combine mechanistic models of fossil preservation and sequence evolution in simulations to evaluate different approaches to constructing fossil calibrations and their impact on Bayesian molecular clock dating, and the relative impact of fossil versus molecular sampling. We show that divergence time estimation is impacted by the model of fossil preservation, sampling intensity and tree shape. The addition of sequence data may improve molecular clock estimates, but accuracy and precision are dominated by the quality of the fossil calibrations. Posterior means and medians are poor representatives of true divergence times; posterior intervals provide a much more accurate estimate of divergence times, though they may be wide and often do not have high coverage probability. Our results highlight the importance of increased fossil sampling and improved statistical approaches to generating calibrations, which should incorporate the non-uniform nature of ecological and temporal fossil species distributions. © 2017 The Authors.
Colihueque, Nelson; Gantz, Alberto; Rau, Jaime Ricardo; Parraguez, Margarita
2015-01-01
In this paper new mitochondrial COI sequences of Common Barn Owl Tyto alba (Scopoli, 1769) and Short-eared Owl Asio flammeus (Pontoppidan, 1763) from southern Chile are reported and compared with sequences from other parts of the world. The intraspecific genetic divergence (mean p-distance) was 4.6 to 5.5% for the Common Barn Owl in comparison with specimens from northern Europe and Australasia and 3.1% for the Short-eared Owl with respect to samples from North America, northern Europe and northern Asia. Phylogenetic analyses revealed three distinctive groups for the Common Barn Owl: (i) South America (Chile and Argentina) plus Central and North America, (ii) northern Europe and (iii) Australasia, and two distinctive groups for the Short-eared Owl: (i) South America (Chile and Argentina) and (ii) North America plus northern Europe and northern Asia. The level of genetic divergence observed in both species exceeds the upper limit of intraspecific comparisons reported previously for Strigiformes. Therefore, this suggests that further research is needed to assess the taxonomic status, particularly for the Chilean populations that, to date, have been identified as belonging to these species through traditional taxonomy. PMID:26668551
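The mean p-distance quoted in this abstract is the proportion of aligned sites at which two sequences differ, averaged over sequence pairs. The following is a minimal sketch of that calculation, not the authors' pipeline: the function names are hypothetical, the sequences are toy strings rather than real COI data, and skipping gapped or ambiguous columns is an assumption.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of compared sites at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Assumption: skip columns where either sequence has a gap or ambiguous base.
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a in "ACGT" and b in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    mismatches = sum(1 for a, b in pairs if a != b)
    return mismatches / len(pairs)


def mean_p_distance(group_a, group_b):
    """Mean p-distance over all pairs drawn from two groups of sequences."""
    dists = [p_distance(a, b) for a in group_a for b in group_b]
    return sum(dists) / len(dists)


# Hypothetical toy alignment, not real COI data.
chile = ["ACGTACGTAC", "ACGTACGTAT"]
europe = ["ACGTTCGTAC"]
print(f"{100 * mean_p_distance(chile, europe):.1f}%")
```

Real barcoding studies typically compute this over several hundred aligned sites per pair, so a single mismatch moves the value far less than in this ten-site toy example.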
Wang, Qian; Abbott, Richard J; Yu, Qiu-Shi; Lin, Kao; Liu, Jian-Quan
2013-07-01
Pleistocene climate change has had an important effect in shaping intraspecific genetic variation in many species; however, its role in driving speciation is less clear. We examined the possibility of a Pleistocene origin of the only two representatives of the genus Pugionium (Brassicaceae), Pugionium cornutum and Pugionium dolabratum, which occupy different desert habitats in northwest China. We surveyed sequence variation for internal transcribed spacer (ITS), three chloroplast (cp) DNA fragments, and eight low-copy nuclear genes among individuals sampled from 11 populations of each species across their geographic ranges. One ITS mutation distinguished the two species, whereas mutations in cpDNA and the eight low-copy nuclear gene sequences were not species-specific. Although interspecific divergence varied greatly among nuclear gene sequences, in each case divergence was estimated to have occurred within the Pleistocene when deserts expanded in northwest China. Our findings point to the importance of Pleistocene climate change, in this case an increase in aridity, as a cause of speciation in Pugionium as a result of divergence in different habitats that formed in association with the expansion of deserts in China. © 2013 The Authors. New Phytologist © 2013 New Phytologist Trust.
LCC demons with divergence term for liver MRI motion correction
NASA Astrophysics Data System (ADS)
Oh, Jihun; Martin, Diego; Skrinjar, Oskar
2010-03-01
Contrast-enhanced liver MR image sequences acquired at multiple times before and after contrast administration have been shown to be critically important for the diagnosis and monitoring of liver tumors and may be used for the quantification of liver inflammation and fibrosis. However, over multiple acquisitions, the liver moves and deforms due to patient and respiratory motion. In order to analyze contrast agent uptake one first needs to correct for liver motion. In this paper we present a method for the motion correction of dynamic contrast-enhanced liver MR images. For this purpose we use a modified version of the Local Correlation Coefficient (LCC) Demons non-rigid registration method. Since the liver is nearly incompressible, its displacement field has small divergence. For this reason we add a divergence term to the energy that is minimized in the LCC Demons method. We applied the method to four sequences of contrast-enhanced liver MR images. Each sequence had a pre-contrast scan and seven post-contrast scans. For each post-contrast scan we corrected for the liver motion relative to the pre-contrast scan. Quantitative evaluation showed that the proposed method improved the liver alignment relative to the non-corrected and translation-corrected scans, and visual inspection showed no visible misalignment of the motion-corrected contrast-enhanced scans and pre-contrast scan.
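To illustrate why a divergence term favours near-incompressible motion, the sketch below computes the divergence of a 2D displacement field by central differences and sums its square. The abstract does not give the exact form of the term added to the LCC Demons energy, so the squared-divergence penalty, the function names, and the toy fields here are all assumptions for illustration.

```python
def divergence(ux, uy, spacing=1.0):
    """Central-difference divergence d(ux)/dx + d(uy)/dy at interior grid points.

    ux[i][j] and uy[i][j] hold the displacement components at row i (y) and
    column j (x). The result is two cells smaller in each dimension.
    """
    rows, cols = len(ux), len(ux[0])
    div = []
    for i in range(1, rows - 1):
        row = []
        for j in range(1, cols - 1):
            dux_dx = (ux[i][j + 1] - ux[i][j - 1]) / (2 * spacing)
            duy_dy = (uy[i + 1][j] - uy[i - 1][j]) / (2 * spacing)
            row.append(dux_dx + duy_dy)
        div.append(row)
    return div


def divergence_penalty(ux, uy, spacing=1.0):
    """Sum of squared divergence: one plausible (assumed) form of the penalty."""
    return sum(d * d for row in divergence(ux, uy, spacing) for d in row)


n = 5
# Rotation-like field u = (-y, x): area preserving, so zero divergence.
rot_x = [[-float(i) for j in range(n)] for i in range(n)]
rot_y = [[float(j) for j in range(n)] for i in range(n)]
# Uniform expansion u = (0.1 x, 0.1 y): divergence 0.2 at every point.
exp_x = [[0.1 * j for j in range(n)] for i in range(n)]
exp_y = [[0.1 * i for j in range(n)] for i in range(n)]

print(divergence_penalty(rot_x, rot_y))
print(divergence_penalty(exp_x, exp_y))
```

The rotation incurs no penalty while the expansion does, which is the behaviour one would want from a regulariser for tissue that changes shape but not volume.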
Renner, S S; Grimm, Guido W; Kapli, Paschalia; Denk, Thomas
2016-07-19
The fossilized birth-death (FBD) model can make use of information contained in multiple fossils representing the same clade, and we here apply this model to infer divergence times in beeches (genus Fagus), using 53 fossils and nuclear sequences for all nine species. We also apply FBD dating to the fern clade Osmundaceae, with about 12 living species and 36 fossils. Fagus nuclear sequences cannot be aligned with those of other Fagaceae, and we therefore use Bayes factors to choose among alternative root positions. The crown group of Fagus is dated to 53 (62-43) Ma; divergence of the sole American species to 44 (51-39) Ma and divergence between Central European F. sylvatica and Eastern Mediterranean F. orientalis to 8.7 (20-1.8) Ma, unexpectedly old. The FBD model can accommodate fossils as sampled ancestors or as extinct or unobserved lineages; however, this makes its raw output, which shows all fossils on short or long branches, problematic to interpret. We use hand-drawn depictions and a bipartition network to illustrate the uncertain placements of fossils. Inferred speciation and extinction rates imply approximately 5× higher evolutionary turnover in Fagus than in Osmundaceae, fitting a hypothesized low turnover in plants adapted to low-nutrient conditions. This article is part of the themed issue 'Dating species divergences using rocks and clocks'. © 2016 The Author(s).
Kapli, Paschalia; Denk, Thomas
2016-01-01
The fossilized birth–death (FBD) model can make use of information contained in multiple fossils representing the same clade, and we here apply this model to infer divergence times in beeches (genus Fagus), using 53 fossils and nuclear sequences for all nine species. We also apply FBD dating to the fern clade Osmundaceae, with about 12 living species and 36 fossils. Fagus nuclear sequences cannot be aligned with those of other Fagaceae, and we therefore use Bayes factors to choose among alternative root positions. The crown group of Fagus is dated to 53 (62–43) Ma; divergence of the sole American species to 44 (51–39) Ma and divergence between Central European F. sylvatica and Eastern Mediterranean F. orientalis to 8.7 (20–1.8) Ma, unexpectedly old. The FBD model can accommodate fossils as sampled ancestors or as extinct or unobserved lineages; however, this makes its raw output, which shows all fossils on short or long branches, problematic to interpret. We use hand-drawn depictions and a bipartition network to illustrate the uncertain placements of fossils. Inferred speciation and extinction rates imply approximately 5× higher evolutionary turnover in Fagus than in Osmundaceae, fitting a hypothesized low turnover in plants adapted to low-nutrient conditions. This article is part of the themed issue ‘Dating species divergences using rocks and clocks’. PMID:27325832
Hyperexpansion of RNA Bacteriophage Diversity
Krishnamurthy, Siddharth R.; Janowski, Andrew B.; Zhao, Guoyan; Barouch, Dan; Wang, David
2016-01-01
Bacteriophage modulation of microbial populations impacts critical processes in ocean, soil, and animal ecosystems. However, the role of bacteriophages with RNA genomes (RNA bacteriophages) in these processes is poorly understood, in part because of the limited number of known RNA bacteriophage species. Here, we identify partial genome sequences of 122 RNA bacteriophage phylotypes that are highly divergent from each other and from previously described RNA bacteriophages. These novel RNA bacteriophage sequences were present in samples collected from a range of ecological niches worldwide, including invertebrates and extreme microbial sediment, demonstrating that they are more widely distributed than previously recognized. Genomic analyses of these novel bacteriophages yielded multiple novel genome organizations. Furthermore, one RNA bacteriophage was detected in the transcriptome of a pure culture of Streptomyces avermitilis, suggesting for the first time that the known tropism of RNA bacteriophages may include gram-positive bacteria. Finally, reverse transcription PCR (RT-PCR)-based screening for two specific RNA bacteriophages in stool samples from a longitudinal cohort of macaques suggested that they are generally acutely present rather than persistent. PMID:27010970
Park, Soo-Je; Park, Byoung-Joon; Pham, Vinh Hoa; Yoon, Dae-No; Kim, Si-Kwan; Rhee, Sung-Keun
2008-06-01
Molecular techniques, based on a clone library of the 18S rRNA gene, were employed to ascertain the diversity of microeukaryotic organisms in sediments from the East Sea. A total of 261 clones were recovered from surface sediments. Most of the clone sequences (90%) were affiliated with protists, dominated by Ciliates (18%) and Dinoflagellates (19%) of Alveolates, phototrophic Stramenopiles (11%), and Cercozoa (20%). Many of the clones were related to uncultivated eukaryote clones retrieved from anoxic environments, with several highly divergent 18S rRNA gene sequences. However, no clones were related to cultivated obligate anaerobic protists. Protistan communities in the subsurface layers at 1 and 9 cm shared 23% of total phylotypes, which comprised 64% of total clones retrieved. Analysis of diversity indices and rarefaction curves showed that the protistan community within the 1 cm layer exhibited higher diversity than that within the 9 cm layer. Our results imply that diverse protists remain to be uncovered within marine benthic environments.
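The abstract compares layers using diversity indices and rarefaction without naming the specific measures. As a hedged illustration, the sketch below uses two common choices, the Shannon index and Hurlbert's expected-richness rarefaction; both the choice of measures and the clone counts are assumptions, not values from the study.

```python
import math


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over phylotypes with count > 0."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def expected_richness(counts, sample_size):
    """Expected number of phylotypes in a random subsample (Hurlbert rarefaction)."""
    total = sum(counts)
    if sample_size > total:
        raise ValueError("subsample larger than library")
    all_draws = math.comb(total, sample_size)
    # For each phylotype: 1 - P(it is absent from the subsample).
    return sum(1 - math.comb(total - c, sample_size) / all_draws for c in counts)


# Hypothetical clone counts per phylotype, not data from the study.
layer_1cm = [5, 5, 5, 5]    # evenly distributed library
layer_9cm = [17, 1, 1, 1]   # library dominated by one phylotype

for name, counts in (("1 cm", layer_1cm), ("9 cm", layer_9cm)):
    print(name, round(shannon_index(counts), 3),
          round(expected_richness(counts, 10), 2))
```

Evaluating `expected_richness` over a range of subsample sizes traces out a rarefaction curve, which lets libraries of different sizes be compared at a common sampling depth.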
Yubuki, Naoji; Leander, Brian S; Silberman, Jeffrey D
2010-04-01
A novel free-living phagotrophic flagellate, Rictus lutensis gen. et sp. nov., with two heterodynamic flagella, a permanent cytostome and a cytopharynx was isolated from muddy, low oxygen coastal sediments in Cape Cod, MA, USA. We cultivated and characterized this flagellate with transmission electron microscopy, scanning electron microscopy and molecular phylogenetic analyses inferred from small subunit (SSU) rDNA sequences. These data demonstrated that this organism has the key ultrastructural characters of the Bicosoecida, including similar transitional zones and a similar overall flagellar apparatus consisting of an x fiber and an L-shape microtubular root 2 involved in food capture. Although the molecular phylogenetic analyses were concordant with the ultrastructural data in placing R. lutensis with the bicosoecid clade, the internal position of this relatively divergent sequence within the clade was not resolved. Therefore, we interpret R. lutensis gen. et sp. nov. as a novel bicosoecid incertae sedis. Copyright 2009 Elsevier GmbH. All rights reserved.
ExoLocator--an online view into genetic makeup of vertebrate proteins.
Khoo, Aik Aun; Ogrizek-Tomas, Mario; Bulovic, Ana; Korpar, Matija; Gürler, Ece; Slijepcevic, Ivan; Šikic, Mile; Mihalek, Ivana
2014-01-01
ExoLocator (http://exolocator.eopsf.org) collects in a single place information needed for comparative analysis of protein-coding exons from vertebrate species. The main source of data--the genomic sequences, and the existing exon and homology annotation--is the ENSEMBL database of completed vertebrate genomes. To these, ExoLocator adds the search for ostensibly missing exons in orthologous protein pairs across species, using an extensive computational pipeline to narrow down the search region for the candidate exons and find a suitable template in the other species, as well as state-of-the-art implementations of pairwise alignment algorithms. The resulting complements of exons are organized in a way currently unique to ExoLocator: multiple sequence alignments, both on the nucleotide and on the peptide levels, clearly indicating the exon boundaries. The alignments can be inspected in the web-embedded viewer, downloaded or used on the spot to produce an estimate of conservation within orthologous sets, or functional divergence across paralogues.
Soto, M; Requena, J M; Quijada, L; García, M; Guzman, F; Patarroyo, M E; Alonso, C
1995-12-01
Antibodies reacting against the H2A histone protein were frequently observed in the sera from dogs naturally infected with the protozoan parasite Leishmania infantum. Using synthetic peptides covering the complete sequence of the protein we have identified the amino terminal region, comprising amino acids 1 to 20, and the carboxyl terminal region, comprising amino acids 106 to 132, as forming the antigenic determinants of the protein. Those regions, exposed on the nucleosome surface, are highly divergent in sequence relative to the mammalian H2A histones. The anti-H2A histone antibodies present in the sera of these dogs specifically recognize the L. infantum H2A histone and they do not react with mammalian histones. The present data indicate that, in spite of the evolutionary conservation of the H2A histone protein among eukaryotic organisms, the humoral response against this protein during natural infection is specifically triggered by the parasite protein antigenic determinants.
Structural and immunologic characterization of bovine, horse, and rabbit serum albumins
Majorek, Karolina A.; Porebski, Przemyslaw J.; Dayal, Arjun; Zimmerman, Matthew D.; Jablonska, Kamila; Stewart, Alan J.; Chruszcz, Maksymilian; Minor, Wladek
2012-01-01
Serum albumin (SA) is the most abundant plasma protein in mammals. SA is a multifunctional protein with extraordinary ligand binding capacity, making it a transporter molecule for a diverse range of metabolites, drugs, nutrients, metals and other molecules. Due to their ligand binding properties, albumins have wide clinical, pharmaceutical, and biochemical applications. Albumins are also allergenic, and exhibit a high degree of cross-reactivity due to significant sequence and structure similarity of SAs from different organisms. Here we present crystal structures of albumins from cattle (BSA), horse (ESA) and rabbit (RSA) serums. The structural data are correlated with the results of immunological studies of SAs. We also analyze the conservation or divergence of structures and sequences of SAs in the context of their potential allergenicity and cross-reactivity. In addition, we identified a previously uncharacterized ligand binding site in the structure of RSA, and calcium binding sites in the structure of BSA, which is the first serum albumin structure to contain metal ions. PMID:22677715
Mukherjee, Nabanita; Beati, Lorenza; Sellers, Michael; Burton, Laquita; Adamson, Steven; Robbins, Richard G; Moore, Frank; Karim, Shahid
2014-03-01
Birds are capable of carrying ticks and, consequently, tick-transmitted microorganisms over long distances and across geographical barriers such as oceans and deserts. Ticks are hosts for several species of spotted fever group rickettsiae (SFGR), which can be transmitted to vertebrates during blood meals. In this study, the prevalence of this group of rickettsiae was examined in ticks infesting migratory songbirds by using polymerase chain reaction (PCR). During the 2009 and 2010 spring migration season, 2064 northward-migrating passerine songbirds were examined for ticks at Johnson Bayou, Louisiana. A total of 91 ticks was removed from 35 individual songbirds for tick species identification and spotted fever group rickettsia detection. Ticks were identified as Haemaphysalis juxtakochi (n=38, 42%), Amblyomma longirostre (n=22, 24%), Amblyomma nodosum (n=17, 19%), Amblyomma calcaratum (n=11, 12%), Amblyomma maculatum (n=2, 2%), and Haemaphysalis leporispalustris (n=1, 1%) by comparing their 12S rDNA gene sequence to homologous sequences in GenBank. Most of the identified ticks were exotic species originating outside of the United States. The phylogenetic analysis of the 71 ompA gene sequences of the rickettsial strains detected in the ticks revealed the occurrence of 6 distinct rickettsial genotypes. Two genotypes (corresponding to a total of 28 samples) were included in the Candidatus Rickettsia amblyommii clade (less than 1% divergence), 2 of them (corresponding to a total of 14 samples) clustered with Rickettsia sp. "Argentina" with less than 0.2% sequence divergence, and 2 of them (corresponding to a total of 27 samples), although closely related to the R. parkeri-R. africae lineage (2.50-3.41% divergence), exhibited sufficient genetic divergence from its members to possibly constitute a new rickettsial genotype. 
Overall, there does not seem to be a specific relationship between exotic tick species, the rickettsiae they harbor, or the reservoir competence of the corresponding bird species. Copyright © 2013 Elsevier GmbH. All rights reserved.
Zhang, Yinan; Samee, Md. Abul Hassan; Halfon, Marc S.; Sinha, Saurabh
2014-01-01
Many genes familiar from Drosophila development, such as the so-called gap, pair-rule, and segment polarity genes, play important roles in the development of other insects and in many cases appear to be deployed in a similar fashion, despite the fact that Drosophila-like “long germband” development is highly derived and confined to a subset of insect families. Whether or not these similarities extend to the regulatory level is unknown. Identification of regulatory regions beyond the well-studied Drosophila has been challenging as even within the Diptera (flies, including mosquitoes) regulatory sequences have diverged past the point of recognition by standard alignment methods. Here, we demonstrate that methods we previously developed for computational cis-regulatory module (CRM) discovery in Drosophila can be used effectively in highly diverged (250–350 Myr) insect species including Anopheles gambiae, Tribolium castaneum, Apis mellifera, and Nasonia vitripennis. In Drosophila, we have successfully used small sets of known CRMs as “training data” to guide the search for other CRMs with related function. We show here that although species-specific CRM training data do not exist, training sets from Drosophila can facilitate CRM discovery in diverged insects. We validate in vivo over a dozen new CRMs, roughly doubling the number of known CRMs in the four non-Drosophila species. Given the growing wealth of Drosophila CRM annotation, these results suggest that extensive regulatory sequence annotation will be possible in newly sequenced insects without recourse to costly and labor-intensive genome-scale experiments. We develop a new method, Regulus, which computes a probabilistic score of similarity based on binding site composition (despite the absence of nucleotide-level sequence alignment), and demonstrate similarity between functionally related CRMs from orthologous loci. 
Our work represents an important step toward being able to trace the evolutionary history of gene regulatory networks and defining the mechanisms underlying insect evolution. PMID:25173756
Kazemian, Majid; Suryamohan, Kushal; Chen, Jia-Yu; Zhang, Yinan; Samee, Md Abul Hassan; Halfon, Marc S; Sinha, Saurabh
2014-09-01
Many genes familiar from Drosophila development, such as the so-called gap, pair-rule, and segment polarity genes, play important roles in the development of other insects and in many cases appear to be deployed in a similar fashion, despite the fact that Drosophila-like "long germband" development is highly derived and confined to a subset of insect families. Whether or not these similarities extend to the regulatory level is unknown. Identification of regulatory regions beyond the well-studied Drosophila has been challenging as even within the Diptera (flies, including mosquitoes) regulatory sequences have diverged past the point of recognition by standard alignment methods. Here, we demonstrate that methods we previously developed for computational cis-regulatory module (CRM) discovery in Drosophila can be used effectively in highly diverged (250-350 Myr) insect species including Anopheles gambiae, Tribolium castaneum, Apis mellifera, and Nasonia vitripennis. In Drosophila, we have successfully used small sets of known CRMs as "training data" to guide the search for other CRMs with related function. We show here that although species-specific CRM training data do not exist, training sets from Drosophila can facilitate CRM discovery in diverged insects. We validate in vivo over a dozen new CRMs, roughly doubling the number of known CRMs in the four non-Drosophila species. Given the growing wealth of Drosophila CRM annotation, these results suggest that extensive regulatory sequence annotation will be possible in newly sequenced insects without recourse to costly and labor-intensive genome-scale experiments. We develop a new method, Regulus, which computes a probabilistic score of similarity based on binding site composition (despite the absence of nucleotide-level sequence alignment), and demonstrate similarity between functionally related CRMs from orthologous loci. 
Our work represents an important step toward being able to trace the evolutionary history of gene regulatory networks and defining the mechanisms underlying insect evolution. © The Author(s) 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Troggio, Michela; Šurbanovski, Nada; Bianco, Luca; Moretto, Marco; Giongo, Lara; Banchi, Elisa; Viola, Roberto; Fernández, Felicdad Fernández; Costa, Fabrizio; Velasco, Riccardo; Cestaro, Alessandro; Sargent, Daniel James
2013-01-01
High throughput arrays for the simultaneous genotyping of thousands of single-nucleotide polymorphisms (SNPs) have made the rapid genetic characterisation of plant genomes and the development of saturated linkage maps a realistic prospect for many plant species of agronomic importance. However, the correct calling of SNP genotypes in divergent polyploid genomes using array technology can be problematic due to paralogy, and to divergence in probe sequences causing changes in probe binding efficiencies. An Illumina Infinium II whole-genome genotyping array was recently developed for the cultivated apple and used to develop a molecular linkage map for an apple rootstock progeny (M432), but a large proportion of segregating SNPs were not mapped in the progeny, due to unexpected genotype clustering patterns. To investigate the causes of this unexpected clustering we performed BLAST analysis of all probe sequences against the ‘Golden Delicious’ genome sequence and discovered evidence for paralogous annealing sites and probe sequence divergence for a high proportion of probes contained on the array. Following visual re-evaluation of the genotyping data generated for 8,788 SNPs for the M432 progeny using the array, we manually re-scored genotypes at 818 loci and mapped a further 797 markers to the M432 linkage map. The newly mapped markers included the majority of those that could not be mapped previously, as well as loci that were previously scored as monomorphic, but which segregated due to divergence leading to heterozygosity in probe annealing sites. An evaluation of the 8,788 probes in a diverse collection of Malus germplasm showed that more than half the probes returned genotype clustering patterns that were difficult or impossible to interpret reliably, highlighting implications for the use of the array in genome-wide association studies. PMID:23826289
Simo Tchetgna, Huguette Dorine; Nakoune, Emmanuel; Selekon, Benjamin; Gessain, Antoine; Manuguerra, Jean-Claude; Kazanji, Mirdad; Berthet, Nicolas
2017-06-01
Rhabdoviridae is one of the most diversified families of RNA viruses whose members infect a wide range of plants, animals, and arthropods. The members of this family are classified into 13 genera and >150 unassigned viruses. Here, we sequenced the complete genome of a rhabdovirus belonging to the Hart Park serogroup, the Kamese virus (KAMV), isolated in 1977 from Culex pruina in the Central African Republic. The genomic sequence showed an organization typical of rhabdoviruses with additional genes in the P-M and G-L intergenic regions, as already reported for the Hart Park serogroup. Our Kamese strain (ArB9074) had 98% and 78.8% nucleotide sequence similarity with the prototypes of the KAMV and Mossuril virus isolated in Uganda and Mozambique in two different Culex species, respectively. Moreover, the protein sequences had 98-100% amino acid similarity with the prototype of the KAMV, except for an additional gene (U3) that showed a divergence of 6%. These molecular data show that our strain of the KAMV is genetically close to the Culex annuliorus strain that was circulating in Uganda in 1967. However, this study suggests the need to improve our knowledge of the KAMV to better understand its behavior, its life cycle, and its potential reservoirs.
Hermes Transposon Distribution and Structure in Musca domestica
Subramanian, Ramanand A.; Cathcart, Laura A.; Krafsur, Elliot S.; Atkinson, Peter W.
2009-01-01
Hermes are hAT transposons from Musca domestica that are very closely related to the hobo transposons from Drosophila melanogaster and are useful as gene vectors in a wide variety of organisms including insects, planaria, and yeast. hobo elements show distinct length variations in a rapidly evolving region of the transposase-coding region as a result of expansions and contractions of a simple repeat sequence encoding 3 amino acids threonine, proline, and glutamic acid (TPE). These variations in length may influence the function of the protein and the movement of hobo transposons in natural populations. Here, we determine the distribution of Hermes in populations of M. domestica as well as whether Hermes transposase has undergone similar sequence expansions and contractions during its evolution in this species. Hermes transposons were found in all M. domestica individuals sampled from 14 populations collected from 4 continents. All individuals with Hermes transposons had evidence for the presence of intact transposase open reading frames, and little sequence variation was observed among Hermes elements. A systematic analysis of the TPE-homologous region of the Hermes transposase-coding region revealed no evidence for length variation. The simple sequence repeat found in hobo elements is a feature of this transposon that evolved since the divergence of hobo and Hermes. PMID:19366812
Kayal, Ehsan; Bentlage, Bastian; Cartwright, Paulyn; Yanagihara, Angel A; Lindsay, Dhugal J; Hopcroft, Russell R; Collins, Allen G
2015-01-01
Hydrozoans display the most morphological diversity within the phylum Cnidaria. While recent molecular studies have provided some insights into their evolutionary history, sister group relationships remain mostly unresolved, particularly at mid-taxonomic levels. Specifically, within Hydroidolina, the most speciose hydrozoan subclass, the relationships and sometimes integrity of orders are highly unsettled. Here we obtained the near complete mitochondrial sequence of twenty-six hydroidolinan hydrozoan species from a range of sources (DNA and RNA-seq data, long-range PCR). Our analyses confirm previous inference of the evolution of mtDNA in Hydrozoa while introducing a novel genome organization. Using RNA-seq data, we propose a mechanism for the expression of mitochondrial mRNA in Hydroidolina that can be extrapolated to the other medusozoan taxa. Phylogenetic analyses using the full set of mitochondrial gene sequences provide some insights into the order-level relationships within Hydroidolina, including siphonophores as the first diverging clade, a well-supported clade comprised of Leptothecata-Filifera III-IV, and a second clade comprised of Aplanulata-Capitata s.s.-Filifera I-II. Finally, we describe our relatively inexpensive and accessible multiplexing strategy to sequence long-range PCR amplicons that can be adapted to most high-throughput sequencing platforms.
Bentlage, Bastian; Cartwright, Paulyn; Yanagihara, Angel A.; Lindsay, Dhugal J.; Hopcroft, Russell R.; Collins, Allen G.
2015-01-01
Hydrozoans display the most morphological diversity within the phylum Cnidaria. While recent molecular studies have provided some insights into their evolutionary history, sister group relationships remain mostly unresolved, particularly at mid-taxonomic levels. Specifically, within Hydroidolina, the most speciose hydrozoan subclass, the relationships and sometimes integrity of orders are highly unsettled. Here we obtained the near complete mitochondrial sequence of twenty-six hydroidolinan hydrozoan species from a range of sources (DNA and RNA-seq data, long-range PCR). Our analyses confirm previous inference of the evolution of mtDNA in Hydrozoa while introducing a novel genome organization. Using RNA-seq data, we propose a mechanism for the expression of mitochondrial mRNA in Hydroidolina that can be extrapolated to the other medusozoan taxa. Phylogenetic analyses using the full set of mitochondrial gene sequences provide some insights into the order-level relationships within Hydroidolina, including siphonophores as the first diverging clade, a well-supported clade comprised of Leptothecata-Filifera III–IV, and a second clade comprised of Aplanulata-Capitata s.s.-Filifera I–II. Finally, we describe our relatively inexpensive and accessible multiplexing strategy to sequence long-range PCR amplicons that can be adapted to most high-throughput sequencing platforms. PMID:26618080
Wen, B; Rikihisa, Y; Fuerst, P A; Chaichanasiriwithaya, W
1995-04-01
Ehrlichia risticii is the causative agent of Potomac horse fever. Variations among the major antigens of different local E. risticii strains have been detected previously. To further assess genetic variability in this species or species complex, the sequences of the 16S rRNA genes of several isolates obtained from sick horses diagnosed as having Potomac horse fever were determined. The sequences of six isolates obtained from Ohio and three isolates obtained from Kentucky were amplified by PCR. Three groups of sequences were identified. The sequences of five of the Ohio isolates were identical to the sequence of the type strain of E. risticii, the Illinois strain. The sequence of one Ohio isolate, isolate 081, was unique; this sequence differed in 10 nucleotides from the sequence of the type strain (level of similarity, 99.3%). The sequences of the three Kentucky isolates were identical to each other, but differed by five bases from the sequence of the type strain (level of similarity, 99.6%). The levels of sequence similarity of isolate 081, the Kentucky isolates, and the type strain to the next most closely related Ehrlichia sp., Ehrlichia sennetsu, were 99.3, 99.2, and 99.2%, respectively. On the basis of the distinct antigenic profiles and the levels of 16S rRNA sequence divergence, isolate 081 is as divergent from the type strain of E. risticii as E. sennetsu is. Therefore, we suggest that strain 081 and the Kentucky isolates may represent two new distinct Ehrlichia species.
Gruber, Karl; Schöning, Caspar; Otte, Marianne; Kinuthia, Wanja; Hasselmann, Martin
2013-09-01
Identifying the forces shaping intraspecific phenotypic and genotypic divergence is of key importance in evolutionary biology. Phenotypic divergence may result from local adaptation or, especially in species with strong gene flow, from pronounced phenotypic plasticity. Here, we examine morphological and genetic divergence among populations of the western honey bee Apis mellifera in the topographically heterogeneous East African region. The currently accepted "mountain refugia hypothesis" states that populations living in disjunct montane forests belong to a different lineage than those in savanna habitats surrounding these forests. We obtained microsatellite data, mitochondrial sequences, and morphometric data from worker honey bees collected from feral colonies in three montane forests and corresponding neighboring savanna regions in Kenya. Honey bee colonies from montane forests showed distinct worker morphology compared with colonies in savanna areas. Mitochondrial sequence data did not support the existence of the two currently accepted subspecies. Furthermore, analyses of the microsatellite data with a Bayesian clustering method did not support the existence of two source populations, as would be expected under the mountain refugia scenario. Our findings suggest that phenotypic plasticity rather than distinct ancestry is the leading cause behind the phenotypic divergence observed between montane forest and savanna honey bees. Our study thus corroborates the idea that high gene flow may select for increased plasticity.
Beet, Clare R; Hogg, Ian D; Collins, Gemma E; Cowan, Don A; Wall, Diana H; Adams, Byron J
2016-09-01
Climate changes are likely to have major influences on the distribution and abundance of Antarctic terrestrial biota. To assess arthropod distribution and diversity within the Ross Sea region, we examined mitochondrial DNA (COI) sequences for three currently recognized species of springtail (Collembola) collected from sites in the vicinity, and to the north of, the Mackay Glacier (77°S). This area acts as a transition between two biogeographic regions (northern and southern Victoria Land). We found populations of highly divergent individuals (5%-11.3% intraspecific sequence divergence) for each of the three putative springtail species, suggesting the possibility of cryptic diversity. Based on molecular clock estimates, these divergent lineages are likely to have been isolated for 3-5 million years. It was during this time that the Western Antarctic Ice Sheet (WAIS) was likely to have completely collapsed, potentially facilitating springtail dispersal via rafting on running waters and open seaways. The reformation of the WAIS would have isolated newly established populations, with subsequent dispersal restricted by glaciers and ice-covered areas. Given the currently limited distributions for these genetically divergent populations, any future changes in species' distributions can be easily tracked through the DNA barcoding of springtails from within the Mackay Glacier ecotone.
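The molecular clock estimate mentioned in this abstract amounts, in its simplest strict-clock form, to dividing pairwise sequence divergence by a pairwise substitution rate. The following is a minimal sketch under stated assumptions: the abstract does not report the rate or the clock model the authors used, so `ASSUMED_RATE` is a purely illustrative value and the printed times are not expected to reproduce the 3-5 million years quoted above.

```python
def divergence_time_myr(pairwise_divergence_pct: float,
                        pairwise_rate_pct_per_myr: float) -> float:
    """Time since two lineages split under a strict molecular clock.

    Both arguments are pairwise quantities: percent sequence divergence
    between two lineages, and percent divergence accumulated between two
    lineages per million years.
    """
    if pairwise_rate_pct_per_myr <= 0:
        raise ValueError("rate must be positive")
    return pairwise_divergence_pct / pairwise_rate_pct_per_myr


# Illustrative rate only (an assumption, not taken from the abstract).
ASSUMED_RATE = 2.3  # % pairwise divergence per million years

# The divergence range reported in the abstract (5%-11.3%).
for divergence in (5.0, 11.3):
    time = divergence_time_myr(divergence, ASSUMED_RATE)
    print(f"{divergence}% divergence -> {time:.1f} Myr at the assumed rate")
```

Because the result scales inversely with the rate, any uncertainty in the calibration carries straight through to the estimated isolation time.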
Stone, Anne C; Battistuzzi, Fabia U; Kubatko, Laura S; Perry, George H; Trudeau, Evan; Lin, Hsiuman; Kumar, Sudhir
2010-10-27
Here, we report the sequencing and analysis of eight complete mitochondrial genomes of chimpanzees (Pan troglodytes) from each of the three established subspecies (P. t. troglodytes, P. t. schweinfurthii and P. t. verus) and the proposed fourth subspecies (P. t. ellioti). Our population genetic analyses are consistent with neutral patterns of evolution that have been shaped by demography. The high levels of mtDNA diversity in western chimpanzees are unlike those seen at nuclear loci, which may reflect a demographic history of greater female to male effective population sizes possibly owing to the characteristics of the founding population. By using relaxed-clock methods, we have inferred a timetree of chimpanzee species and subspecies. The absolute divergence times vary based on the methods and calibration used, but relative divergence times show extensive uniformity. Overall, mtDNA produces consistently older times than those known from nuclear markers, a discrepancy that is reduced significantly by explicitly accounting for chimpanzee population structures in time estimation. Assuming the human-chimpanzee split to be between 7 and 5 Ma, chimpanzee time estimates are 2.1-1.5, 1.1-0.76 and 0.25-0.18 Ma for the chimpanzee/bonobo, western/(eastern + central) and eastern/central chimpanzee divergences, respectively.
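The statement that absolute times depend on the calibration while relative times stay uniform can be checked with simple arithmetic on the figures quoted in this abstract. The sketch below is only that arithmetic, not the relaxed-clock analysis itself; the dictionary and function names are hypothetical, and the pairing of each first value with the 7 Ma calibration and each second value with the 5 Ma calibration is an assumption based on the order in which the abstract lists them.

```python
# Node ages (Ma) as quoted in the abstract, listed as
# (age under the 7 Ma calibration, age under the 5 Ma calibration).
REPORTED_AGES = {
    "chimpanzee/bonobo": (2.1, 1.5),
    "western/(eastern + central)": (1.1, 0.76),
    "eastern/central": (0.25, 0.18),
}
CALIBRATIONS = (7.0, 5.0)  # assumed human-chimpanzee split ages (Ma)


def relative_age(node_age: float, calibration_age: float) -> float:
    """Node age expressed as a fraction of the calibration node's age."""
    return node_age / calibration_age


for split, ages in REPORTED_AGES.items():
    ratios = [relative_age(age, cal) for age, cal in zip(ages, CALIBRATIONS)]
    print(split, [round(r, 3) for r in ratios])
```

For each split the two ratios come out nearly equal, which is what uniform relative divergence times would look like when only the calibration age is changed.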
Wang, Sibao; Leclerque, Andreas; Pava-Ripoll, Monica; Fang, Weiguo; St Leger, Raymond J
2009-06-01
Many strains of Metarhizium anisopliae have broad host ranges, but others are specialists and adapted to particular hosts. Patterns of gene duplication, divergence, and deletion in three generalist and three specialist strains were investigated by heterologous hybridization of genomic DNA to genes from the generalist strain Ma2575. As expected, major life processes are highly conserved, presumably due to purifying selection. However, up to 7% of Ma2575 genes were highly divergent or absent in specialist strains. Many of these sequences are conserved in other fungal species, suggesting that there has been rapid evolution and loss in specialist Metarhizium genomes. Some poorly hybridizing genes in specialists were functionally coordinated, indicative of reductive evolution. These included several involved in toxin biosynthesis and sugar metabolism in root exudates, suggesting that specialists are losing genes required to live in alternative hosts or as saprophytes. Several components of mobile genetic elements were also highly divergent or lost in specialists. Exceptionally, the genome of the specialist cricket pathogen Ma443 contained extra insertion elements that might play a role in generating evolutionary novelty. This study throws light on the abundance of orphans in genomes, as 15% of orphan sequences were found to be rapidly evolving in the Ma2575 lineage.
Diehl, Adam G
2018-01-01
The mouse is widely used as a system to study human genetic mechanisms. However, extensive rewiring of transcriptional regulatory networks often confounds translation of findings between human and mouse. Site-specific gain and loss of individual transcription factor binding sites (TFBS) has caused functional divergence of orthologous regulatory loci, and so we must look beyond this positional conservation to understand common themes of regulatory control. Fortunately, transcription factor co-binding patterns shared across species often perform conserved regulatory functions. These can be compared to ‘regulatory sentences’ that retain the same meanings regardless of sequence and species context. By analyzing TFBS co-occupancy patterns observed in four human and mouse cell types, we learned a regulatory grammar: the rules by which TFBS are combined into meaningful regulatory sentences. Different parts of this grammar associate with specific sets of functional annotations regardless of sequence conservation and predict functional signatures more accurately than positional conservation. We further show that both species-specific and conserved portions of this grammar are involved in gene expression divergence and human disease risk. These findings expand our understanding of transcriptional regulatory mechanisms, suggesting that phenotypic divergence and disease risk are driven by a complex interplay between deeply conserved and species-specific transcriptional regulatory pathways. PMID:29361190
Roy, Scott William
2015-12-01
In the deadly human malaria parasite Plasmodium falciparum, several major merozoite surface proteins (MSPs) show a striking pattern of allelic diversity called allelic dimorphism (AD). In AD, the vast majority of observed alleles fall into two highly divergent allelic classes, with recombinant alleles being rare or not observed, presumably due to repression by natural selection (recombination suppression, or RS). The three AD loci, merozoite surface proteins (MSPs) 1, 2, and 6, along with MSP3, which also exhibits RS among four allelic classes, can be collectively called AD/RS. The causes of AD/RS and the evolutionary history of allelic diversity at these loci remain mysterious. The few available sequences from a single closely related chimpanzee parasite, P. reichenowi, have suggested that for 3/4 loci, AD/RS is an ancient state that has been retained in P. falciparum since well before the P. falciparum-P. reichenowi ancestor. On the other hand, based on comparative sequence analysis, we recently suggested that (i) AD/RS P. falciparum loci have undergone interallelic recombination over longer evolutionary times (on the timescale of recent speciation events), and thus (ii) AD/RS may be a recent phenomenon. The recent publication of genomic sequencing efforts for P. gaboni, an outgroup to P. falciparum and P. reichenowi, allows for improved reconstruction of the evolutionary history of these loci. In this work, I report genic sequence for P. gaboni for all four AD/RS P. falciparum loci (MSP1, 2, 3, and 6). Comparison of these sequences with available P. falciparum and P. reichenowi data strengthens the evidence for interallelic recombination over the evolutionary history of these species and also strengthens the case that AD/RS at these loci is ancient. Combined with previous results, these data provide evidence that AD/RS at different loci has evolved at several different times in the evolutionary history of P. falciparum: (i) before the P. gaboni-P. falciparum divergence, for much of MSP1 and MSP3; (ii) between the P. gaboni-P. falciparum and P. reichenowi-P. falciparum divergences, for the 5' end of the AD region of MSP6 and block 3 of MSP1; (iii) near the P. reichenowi-P. falciparum divergence, for the 3' end of the AD region of MSP6; and (iv) after the P. reichenowi-P. falciparum divergence, for MSP2. Based on these results, I suggest a new hypothesis for long-term evolutionary maintenance of AD/RS by recombination within allelic groups. Copyright © 2015 Elsevier B.V. All rights reserved.
Pombert, Jean-François; Lemieux, Claude; Turmel, Monique
2006-01-01
Background The phylum Chlorophyta contains the majority of the green algae and is divided into four classes. The basal position of the Prasinophyceae has been well documented, but the divergence order of the Ulvophyceae, Trebouxiophyceae and Chlorophyceae is currently debated. The four complete chloroplast DNA (cpDNA) sequences presently available for representatives of these classes have revealed extensive variability in overall structure, gene content, intron composition and gene order. The chloroplast genome of Pseudendoclonium (Ulvophyceae), in particular, is characterized by an atypical quadripartite architecture that deviates from the ancestral type by a large inverted repeat (IR) featuring an inverted rRNA operon and a small single-copy (SSC) region containing 14 genes normally found in the large single-copy (LSC) region. To gain insights into the nature of the events that led to the reorganization of the chloroplast genome in the Ulvophyceae, we have determined the complete cpDNA sequence of Oltmannsiellopsis viridis, a representative of a distinct, early diverging lineage. Results The 151,933 bp IR-containing genome of Oltmannsiellopsis differs considerably from Pseudendoclonium and other chlorophyte cpDNAs in intron content and gene order, but shares close similarities with its ulvophyte homologue at the levels of quadripartite architecture, gene content and gene density. Oltmannsiellopsis cpDNA encodes 105 genes, contains five group I introns, and features many short dispersed repeats. As in Pseudendoclonium cpDNA, the rRNA genes in the IR are transcribed toward the single copy region featuring the genes typically found in the ancestral LSC region, and the opposite single copy region harbours genes characteristic of both the ancestral SSC and LSC regions. The 52 genes that were transferred from the ancestral LSC to SSC region include 12 of those observed in Pseudendoclonium cpDNA. 
Surprisingly, the overall gene organization of Oltmannsiellopsis cpDNA more closely resembles that of Chlorella (Trebouxiophyceae) cpDNA. Conclusion The chloroplast genome of the last common ancestor of Oltmannsiellopsis and Pseudendoclonium contained a minimum of 108 genes, carried only a few group I introns, and featured a distinctive quadripartite architecture. Numerous changes were experienced by the chloroplast genome in the lineages leading to Oltmannsiellopsis and Pseudendoclonium. Our comparative analyses of chlorophyte cpDNAs support the notion that the Ulvophyceae is sister to the Chlorophyceae. PMID:16472375
The D1-D2 region of the large subunit ribosomal DNA as barcode for ciliates.
Stoeck, T; Przybos, E; Dunthorn, M
2014-05-01
Ciliates are a major evolutionary lineage within the alveolates, which are distributed in nearly all habitats on our planet and are an essential component for ecosystem function, processes and stability. Accurate identification of these unicellular eukaryotes through, for example, microscopy or mating type reactions is reserved to few specialists. To satisfy the demand for a DNA barcode for ciliates, which meets the standard criteria for DNA barcodes defined by the Consortium for the Barcode of Life (CBOL), we here evaluated the D1-D2 region of the ribosomal DNA large subunit (LSU-rDNA). Primer universality for the phylum Ciliophora was tested in silico with available database sequences as well as in the laboratory with 73 ciliate species, which represented nine of 12 ciliate classes. Primers tested in this study were successful for all tested classes. To test the ability of the D1-D2 region to resolve conspecific and congeneric sequence divergence, 63 Paramecium strains were sampled from 24 mating species. The average conspecific D1-D2 variation was 0.18%, whereas congeneric sequence divergence averaged 4.83%. In pairwise genetic distance analyses, we identified a D1-D2 sequence divergence of <0.6% as an ideal threshold to discriminate Paramecium species. Using this definition, only 3.8% of all conspecific and 3.9% of all congeneric sequence comparisons had the potential of false assignments. Neighbour-joining analyses inferred monophyly for all taxa but for two Paramecium octaurelia strains. Here, we present a protocol for easy DNA amplification of single cells and voucher deposition. In conclusion, the presented data pinpoint the D1-D2 region as an excellent candidate for an official CBOL barcode for ciliated protists. © 2013 John Wiley & Sons Ltd.
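The threshold rule described in this abstract (pairs of D1-D2 sequences diverging by less than 0.6% are treated as the same Paramecium species) can be sketched as follows. This is a sketch under assumptions: the abstract does not say which distance measure was used, so an uncorrected p-distance over pre-aligned sequences stands in here, and the function names and toy sequences are hypothetical.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Positions where either sequence has a gap ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


THRESHOLD = 0.006  # the <0.6% divergence threshold quoted in the abstract


def same_species(seq_a: str, seq_b: str, threshold: float = THRESHOLD) -> bool:
    """Apply the barcode threshold to one pair of aligned sequences."""
    return p_distance(seq_a, seq_b) < threshold


# Toy 500-site alignment (made-up sequences, not real D1-D2 data).
reference = "ACGT" * 125
one_change = "T" + reference[1:]        # 1 difference  -> 0.2%
four_changes = "TTTTT" + reference[5:]  # 4 differences -> 0.8%

print(same_species(reference, one_change))
print(same_species(reference, four_changes))
```

A fixed cut-off like this necessarily misassigns pairs that fall on the wrong side of it, which is the source of the small false-assignment percentages the abstract reports.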
Morrison, Cheryl L; Iwanowicz, Luke; Work, Thierry M; Fahsbender, Elizabeth; Breitbart, Mya; Adams, Cynthia; Iwanowicz, Deb; Sanders, Lakyn; Ackermann, Mathias; Cornman, Robert S
2018-01-01
Chelonid alphaherpesvirus 5 (ChHV5) is a herpesvirus associated with fibropapillomatosis (FP) in sea turtles worldwide. Single-locus typing has previously shown differentiation between Atlantic and Pacific strains of this virus, with low variation within each geographic clade. However, a lack of multi-locus genomic sequence data hinders understanding of the rate and mechanisms of ChHV5 evolutionary divergence, as well as how these genomic changes may contribute to differences in disease manifestation. To assess genomic variation in ChHV5 among five Hawaii and three Florida green sea turtles, we used high-throughput short-read sequencing of long-range PCR products amplified from tumor tissue using primers designed from the single available ChHV5 reference genome from a Hawaii green sea turtle. This strategy recovered sequence data from both geographic regions for approximately 75% of the predicted ChHV5 coding sequences. The average nucleotide divergence between geographic populations was 1.5%; most of the substitutions were fixed differences between regions. Protein divergence was generally low (average 0.08%), and ranged between 0 and 5.3%. Several atypical genes originally identified and annotated in the reference genome were confirmed in ChHV5 genomes from both geographic locations. Unambiguous recombination events between geographic regions were identified, and clustering of private alleles suggests the prevalence of recombination in the evolutionary history of ChHV5. This study significantly increased the amount of sequence data available from ChHV5 strains, enabling informed selection of loci for future population genetic and natural history studies, and suggesting the (possibly latent) co-infection of individuals by well-differentiated geographic variants.
Population Genomics of Paramecium Species.
Johri, Parul; Krenek, Sascha; Marinov, Georgi K; Doak, Thomas G; Berendonk, Thomas U; Lynch, Michael
2017-05-01
Population-genomic analyses are essential to understanding factors shaping genomic variation and lineage-specific sequence constraints. The dearth of such analyses for unicellular eukaryotes prompted us to assess genomic variation in Paramecium, one of the most well-studied ciliate genera. The Paramecium aurelia complex consists of ∼15 morphologically indistinguishable species that diverged subsequent to two rounds of whole-genome duplications (WGDs, as long as 320 MYA) and possess extremely streamlined genomes. We examine patterns of both nuclear and mitochondrial polymorphism, by sequencing whole genomes of 10-13 worldwide isolates of each of three species belonging to the P. aurelia complex: P. tetraurelia, P. biaurelia, P. sexaurelia, as well as two outgroup species that do not share the WGDs: P. caudatum and P. multimicronucleatum. An apparent absence of global geographic population structure suggests continuous or recent dispersal of Paramecium over long distances. Intergenic regions are highly constrained relative to coding sequences, especially in P. caudatum and P. multimicronucleatum that have shorter intergenic distances. Sequence diversity and divergence are reduced up to ∼100-150 bp both upstream and downstream of genes, suggesting strong constraints imposed by the presence of densely packed regulatory modules. In addition, comparison of sequence variation at non-synonymous and synonymous sites suggests similar recent selective pressures on paralogs within and orthologs across the deeply diverging species. This study presents the first genome-wide population-genomic analysis in ciliates and provides a valuable resource for future studies in evolutionary and functional genetics in Paramecium. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Iwanowicz, Luke; Work, Thierry M.; Fahsbender, Elizabeth; Breitbart, Mya; Adams, Cynthia; Iwanowicz, Deb; Sanders, Lakyn; Ackermann, Mathias; Cornman, Robert S.
2018-01-01
Chelonid alphaherpesvirus 5 (ChHV5) is a herpesvirus associated with fibropapillomatosis (FP) in sea turtles worldwide. Single-locus typing has previously shown differentiation between Atlantic and Pacific strains of this virus, with low variation within each geographic clade. However, a lack of multi-locus genomic sequence data hinders understanding of the rate and mechanisms of ChHV5 evolutionary divergence, as well as how these genomic changes may contribute to differences in disease manifestation. To assess genomic variation in ChHV5 among five Hawaii and three Florida green sea turtles, we used high-throughput short-read sequencing of long-range PCR products amplified from tumor tissue using primers designed from the single available ChHV5 reference genome from a Hawaii green sea turtle. This strategy recovered sequence data from both geographic regions for approximately 75% of the predicted ChHV5 coding sequences. The average nucleotide divergence between geographic populations was 1.5%; most of the substitutions were fixed differences between regions. Protein divergence was generally low (average 0.08%), and ranged between 0 and 5.3%. Several atypical genes originally identified and annotated in the reference genome were confirmed in ChHV5 genomes from both geographic locations. Unambiguous recombination events between geographic regions were identified, and clustering of private alleles suggests the prevalence of recombination in the evolutionary history of ChHV5. This study significantly increased the amount of sequence data available from ChHV5 strains, enabling informed selection of loci for future population genetic and natural history studies, and suggesting the (possibly latent) co-infection of individuals by well-differentiated geographic variants. PMID:29479497
Thompson, Owen A.; Snoek, L. Basten; Nijveen, Harm; Sterken, Mark G.; Volkers, Rita J. M.; Brenchley, Rachel; van’t Hof, Arjen; Bevers, Roel P. J.; Cossins, Andrew R.; Yanai, Itai; Hajnal, Alex; Schmid, Tobias; Perkins, Jaryn D.; Spencer, David; Kruglyak, Leonid; Andersen, Erik C.; Moerman, Donald G.; Hillier, LaDeana W.; Kammenga, Jan E.; Waterston, Robert H.
2015-01-01
The Hawaiian strain (CB4856) of Caenorhabditis elegans is one of the most divergent from the canonical laboratory strain N2 and has been widely used in developmental, population, and evolutionary studies. To enhance the utility of the strain, we have generated a draft sequence of the CB4856 genome, exploiting a variety of resources and strategies. When compared against the N2 reference, the CB4856 genome has 327,050 single nucleotide variants (SNVs) and 79,529 insertion–deletion events that result in a total of 3.3 Mb of N2 sequence missing from CB4856 and 1.4 Mb of sequence present in CB4856 but not present in N2. As previously reported, the density of SNVs varies along the chromosomes, with the arms of chromosomes showing greater average variation than the centers. In addition, we find 61 regions totaling 2.8 Mb, distributed across all six chromosomes, which have a greatly elevated SNV density, ranging from 2 to 16% SNVs. A survey of other wild isolates shows that the two alternative haplotypes for each region are widely distributed, suggesting they have been maintained by balancing selection over long evolutionary times. These divergent regions contain an abundance of genes from large rapidly evolving families encoding F-box, MATH, BATH, seven-transmembrane G-coupled receptors, and nuclear hormone receptors, suggesting that they provide selective advantages in natural environments. The draft sequence makes available a comprehensive catalog of sequence differences between the CB4856 and N2 strains that will facilitate the molecular dissection of their phenotypic differences. Our work also emphasizes the importance of going beyond simple alignment of reads to a reference genome when assessing differences between genomes. PMID:25995208
López-Alvarez, Diana; López-Herranz, Maria Luisa; Betekhtin, Alexander; Catalán, Pilar
2012-01-01
Background Brachypodium distachyon s. l. has been widely investigated across the world as a model plant for temperate cereals and biofuel grasses. However, this annual plant shows three cytotypes that have been recently recognized as three independent species, the diploids B. distachyon (2n = 10) and B. stacei (2n = 20) and their derived allotetraploid B. hybridum (2n = 30). Methodology/Principal Findings We propose a DNA barcoding approach that consists of a rapid, accurate and automatable species identification method using the standard DNA sequences of complementary plastid (trnLF) and nuclear (ITS, GI) loci. The highly homogenous but largely divergent B. distachyon and B. stacei diploids could be easily distinguished (100% identification success) using direct trnLF (2.4%), ITS (5.5%) or GI (3.8%) sequence divergence. By contrast, B. hybridum could only be unambiguously identified through the use of combined trnLF+ITS sequences (90% identification success) or by cloned GI sequences (96.7%) that showed 5.4% (ITS) and 4% (GI) rate divergence between the two parental sequences found in the allopolyploid. Conclusion/Significance Our data provide an unbiased and effective barcode to differentiate these three closely related species from one another. This procedure overcomes the taxonomic uncertainty generated from methods based on morphology or flow cytometry identifications that have resulted in some misclassifications of the model plant and its allies. Our study also demonstrates that the allotetraploid B. hybridum has resulted from bi-directional crosses of B. distachyon and B. stacei plants acting either as maternal or paternal parents. PMID:23240000
Genome-Wide Search Identifies 1.9 Mb from the Polar Bear Y Chromosome for Evolutionary Analyses.
Bidon, Tobias; Schreck, Nancy; Hailer, Frank; Nilsson, Maria A; Janke, Axel
2015-05-27
The male-inherited Y chromosome is the major haploid fraction of the mammalian genome, rendering Y-linked sequences an indispensable resource for evolutionary research. However, despite recent large-scale genome sequencing approaches, only a handful of Y chromosome sequences have been characterized to date, mainly in model organisms. Using polar bear (Ursus maritimus) genomes, we compare two different in silico approaches to identify Y-linked sequences: 1) similarity to known Y-linked genes and 2) difference in the average read depth of autosomal versus sex chromosomal scaffolds. Specifically, we mapped available genomic sequencing short reads from a male and a female polar bear against the reference genome and identified 112 Y-chromosomal scaffolds with a combined length of 1.9 Mb. We verified the in silico findings for the longer polar bear scaffolds by male-specific in vitro amplification, demonstrating the reliability of the average read depth approach. The obtained Y chromosome sequences contain protein-coding sequences, single nucleotide polymorphisms, microsatellites, and transposable elements that are useful for evolutionary studies. A high-resolution phylogeny of the polar bear patriline shows two highly divergent Y chromosome lineages, obtained from analysis of the identified Y scaffolds in 12 previously published male polar bear genomes. Moreover, we find evidence of gene conversion among ZFX and ZFY sequences in the giant panda lineage and in the ancestor of ursine and tremarctine bears. Thus, the identification of Y-linked scaffold sequences from unordered genome sequences yields valuable data to infer phylogenomic and population-genomic patterns in bears. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
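The average read depth approach described in this abstract rests on a simple expectation: a Y-linked scaffold should attract reads from the male sample but essentially none from the female sample. The following is a rough sketch of that idea only. The abstract gives no cut-offs, so the thresholds, the class and function names, and the toy depth values are all hypothetical, and per-scaffold depths are assumed to have been computed beforehand by a mapping pipeline.

```python
from dataclasses import dataclass


@dataclass
class ScaffoldDepth:
    name: str
    male: float    # mean read depth from the male sample
    female: float  # mean read depth from the female sample


def is_y_candidate(scaffold: ScaffoldDepth,
                   male_autosomal: float,
                   female_autosomal: float,
                   max_female_ratio: float = 0.1,   # hypothetical cut-off
                   min_male_ratio: float = 0.25) -> bool:  # hypothetical cut-off
    """Flag scaffolds covered in the male but (almost) not in the female.

    Depths are normalised by each sample's typical autosomal depth so the
    two samples are comparable even if sequenced to different coverage.
    """
    male_ratio = scaffold.male / male_autosomal
    female_ratio = scaffold.female / female_autosomal
    return female_ratio <= max_female_ratio and male_ratio >= min_male_ratio


# Toy values only; not taken from the polar bear data.
scaffolds = [
    ScaffoldDepth("toy_scaffold_autosome", male=31.0, female=29.0),
    ScaffoldDepth("toy_scaffold_X", male=15.0, female=30.0),
    ScaffoldDepth("toy_scaffold_Y", male=14.0, female=0.4),
    ScaffoldDepth("toy_scaffold_lowcov", male=1.0, female=0.0),
]

candidates = [s.name for s in scaffolds if is_y_candidate(s, 30.0, 30.0)]
print(candidates)
```

The lower bound on male depth is there to avoid flagging scaffolds that are poorly covered in both samples; candidates from such a filter would still need the kind of in vitro verification the abstract describes.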
Nadachowska-Brzyska, Krystyna; Burri, Reto; Olason, Pall I.; Kawakami, Takeshi; Smeds, Linnéa; Ellegren, Hans
2013-01-01
Profound knowledge of demographic history is a prerequisite for the understanding and inference of processes involved in the evolution of population differentiation and speciation. Together with new coalescent-based methods, the recent availability of genome-wide data enables investigation of differentiation and divergence processes at unprecedented depth. We combined two powerful approaches, full Approximate Bayesian Computation analysis (ABC) and pairwise sequentially Markovian coalescent modeling (PSMC), to reconstruct the demographic history of the split between two avian speciation model species, the pied flycatcher and collared flycatcher. Using whole-genome re-sequencing data from 20 individuals, we investigated 15 demographic models including different levels and patterns of gene flow, and changes in effective population size over time. ABC provided high support for recent (mode 0.3 my, range <0.7 my) species divergence, declines in effective population size of both species since their initial divergence, and unidirectional recent gene flow from pied flycatcher into collared flycatcher. The estimated divergence time and population size changes, supported by PSMC results, suggest that the ancestral species persisted through one of the glacial periods of middle Pleistocene and then split into two large populations that first increased in size before going through severe bottlenecks and expanding into their current ranges. Secondary contact appears to have been established after the last glacial maximum. The severity of the bottlenecks at the last glacial maximum is indicated by the discrepancy between current effective population sizes (20,000–80,000) and census sizes (5–50 million birds) of the two species. The recent divergence time challenges the supposition that avian speciation is a relatively slow process with extended times for intrinsic postzygotic reproductive barriers to evolve. 
Our study emphasizes the importance of using genome-wide data to unravel tangled demographic histories. Moreover, it constitutes one of the first examples of the inference of divergence history from genome-wide data in non-model species. PMID:24244198
Slatyer, Rachel A; Nash, Michael A; Miller, Adam D; Endo, Yoshinori; Umbers, Kate D L; Hoffmann, Ary A
2014-10-02
Mountain landscapes are topographically complex, creating discontinuous 'islands' of alpine and sub-alpine habitat with a dynamic history. Changing climatic conditions drive their expansion and contraction, leaving signatures on the genetic structure of their flora and fauna. Australia's high country covers a small, highly fragmented area. Although the area is thought to have experienced periods of relative continuity during Pleistocene glacial periods, small-scale studies suggest deep lineage divergence across low-elevation gaps. Using both DNA sequence data and microsatellite markers, we tested the hypothesis that genetic partitioning reflects observable geographic structuring across Australia's mainland high country, in the widespread alpine grasshopper Kosciuscola tristis (Sjösted). We found broadly congruent patterns of regional structure between the DNA sequence and microsatellite datasets, corresponding to strong divergence among isolated mountain regions. Small and isolated mountains in the south of the range were particularly distinct, with well-supported divergence corresponding to climate cycles during the late Pliocene and Pleistocene. We found mixed support, however, for divergence among other mountain regions. Interestingly, within areas of largely contiguous alpine and sub-alpine habitat around Mt Kosciuszko, microsatellite data suggested significant population structure, accompanied by a strong signature of isolation-by-distance. Consistent patterns of strong lineage divergence among different molecular datasets indicate genetic breaks between populations inhabiting geographically distinct mountain regions. Three primary phylogeographic groups were evident in the highly fragmented Victorian high country, while within-region structure detected with microsatellites may reflect more recent population isolation. 
Despite the small area of Australia's alpine and sub-alpine habitats, their low topographic relief and lack of extensive glaciation, divergence among populations was on the same scale as that detected in much more extensive Northern hemisphere mountain systems. The processes driving divergence in the Australian mountains might therefore differ from their Northern hemisphere counterparts.
Zienius, D; Lelešius, R; Kavaliauskis, H; Stankevičius, A; Šalomskas, A
2016-01-01
The aim of the present study was to detect canine parvovirus (CPV) in faecal samples from clinically ill domestic dogs by polymerase chain reaction (PCR), followed by partial sequencing of the VP2 gene and molecular characterization of the strains circulating in Lithuania. Eleven faecal samples from dogs that were clinically ill and antigen-test positive, collected during 2014-2015, were investigated by PCR. The phylogenetic investigations indicated that the Lithuanian CPV VP2 partial sequences (3025-3706 cds) were closely related and showed 99.0-99.9% identity. All Lithuanian sequences were associated with one phylogroup, but grouped in different clusters. Ten of the investigated Lithuanian CPV VP2 sequences were closely associated with the CPV-2a antigenic variant (99.4% nt identity). Five CPV VP2 sequences from Lithuania were related to CPV-2a, but were rather divergent (6.8 nt differences). Only one CPV VP2 sequence from Lithuania was associated (99.3% nt identity) with CPV-2b VP2 sequences from France, Italy, USA and Korea. Four of the eleven investigated Lithuanian dogs with symptoms of CPV infection had been vaccinated with a CPV-2 vaccine, but their VP2 sequences were only distantly related phylogenetically to the VP2 sequences of CPV vaccine strains (11.5-15.8 nt differences). Ten Lithuanian CPV VP2 sequences were monophyletic among the geographically close samples, but five of them were rather divergent (1.0% less sequence similarity). The one remaining Lithuanian CPV VP2 sequence was closely related to the CPV-2b antigenic variant. All the Lithuanian CPV VP2 partial sequences were conserved and only distantly related phylogenetically to the most commonly used CPV vaccine strains.
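The identity percentages and nucleotide-difference counts quoted in the abstract above are pairwise comparisons of aligned sequences. A minimal sketch of that kind of calculation follows; the function names and the short example sequences are hypothetical and are not taken from the study, and treating gap positions as excluded is an assumption of this sketch rather than the authors' stated method.

```python
def aligned_pairs(seq_a, seq_b):
    """Return the ungapped column pairs of two pre-aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Assumption of this sketch: columns with a gap ('-') in either sequence are skipped.
    return [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
            if a != "-" and b != "-"]


def nt_differences(seq_a, seq_b):
    """Count mismatching nucleotides over the ungapped columns."""
    return sum(a != b for a, b in aligned_pairs(seq_a, seq_b))


def percent_identity(seq_a, seq_b):
    """Percentage of ungapped columns where both sequences carry the same base."""
    pairs = aligned_pairs(seq_a, seq_b)
    if not pairs:
        raise ValueError("no ungapped positions to compare")
    matches = len(pairs) - nt_differences(seq_a, seq_b)
    return 100.0 * matches / len(pairs)


if __name__ == "__main__":
    # Hypothetical 10-nt fragments differing at the last position.
    print(percent_identity("ACGTACGTAC", "ACGTACGTAT"))
    print(nt_differences("ACGTACGTAC", "ACGTACGTAT"))
```

Published analyses typically rely on alignment and phylogenetics software rather than a hand-written comparison; the sketch only shows what the reported quantities measure.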
Resolution of ray-finned fish phylogeny and timing of diversification.
Near, Thomas J; Eytan, Ron I; Dornburg, Alex; Kuhn, Kristen L; Moore, Jon A; Davis, Matthew P; Wainwright, Peter C; Friedman, Matt; Smith, W Leo
2012-08-21
Ray-finned fishes make up half of all living vertebrate species. Nearly all ray-finned fishes are teleosts, which include most commercially important fish species, several model organisms for genomics and developmental biology, and the dominant component of marine and freshwater vertebrate faunas. Despite the economic and scientific importance of ray-finned fishes, the lack of a single comprehensive phylogeny with corresponding divergence-time estimates has limited our understanding of the evolution and diversification of this radiation. Our analyses, which use multiple nuclear gene sequences in conjunction with 36 fossil age constraints, result in a well-supported phylogeny of all major ray-finned fish lineages and molecular age estimates that are generally consistent with the fossil record. This phylogeny informs three long-standing problems: specifically identifying elopomorphs (eels and tarpons) as the sister lineage of all other teleosts, providing a unique hypothesis on the radiation of early euteleosts, and offering a promising strategy for resolution of the "bush at the top of the tree" that includes percomorphs and other spiny-finned teleosts. Contrasting our divergence time estimates with studies using a single nuclear gene or whole mitochondrial genomes, we find that the former underestimates ages of the oldest ray-finned fish divergences, but the latter dramatically overestimates ages for derived teleost lineages. Our time-calibrated phylogeny reveals that much of the diversification leading to extant groups of teleosts occurred between the late Mesozoic and early Cenozoic, identifying this period as the "Second Age of Fishes."
Roos, Jonas; Aggarwal, Ramesh K; Janke, Axel
2007-11-01
The mitochondrial genomes of the dwarf crocodile, Osteolaemus tetraspis, and two species of dwarf caimans, the smooth-fronted caiman, Paleosuchus trigonatus, and Cuvier's dwarf caiman, Paleosuchus palpebrosus, were sequenced and included in a mitogenomic phylogenetic study. The phylogenetic analyses, which included a total of ten crocodylian species, yielded strong support to a basal split between Crocodylidae and Alligatoridae. Osteolaemus fell within the Crocodylidae as the sister group to Crocodylus. Gavialis and Tomistoma, which joined on a common branch, constituted a sister group to Crocodylus/Osteolaemus. This suggests that extant crocodylians are organized in two families: Alligatoridae and Crocodylidae. Within the Alligatoridae there was a basal split between Alligator and a branch that contained Paleosuchus and Caiman. The analyses also provided molecular estimates of various divergences applying recently established crocodylian and outgroup fossil calibration points. Molecular estimates based on amino acid data placed the divergence between Crocodylidae and Alligatoridae at 97-103 million years ago and that between Alligator and Caiman/Paleosuchus at 65-72 million years ago. Other crocodilian divergences were placed after the Cretaceous-Tertiary boundary. Thus, according to the molecular estimates, three extant crocodylian lineages have their roots in the Cretaceous. Considering the crocodylian diversification in the Cretaceous the molecular datings suggest that the extinction of the dinosaurs was also to some extent paralleled in the crocodylian evolution. However, for whatever reason, some crocodylian lineages survived into the Tertiary.
Resolution of ray-finned fish phylogeny and timing of diversification
Near, Thomas J.; Eytan, Ron I.; Dornburg, Alex; Kuhn, Kristen L.; Moore, Jon A.; Davis, Matthew P.; Wainwright, Peter C.; Friedman, Matt; Smith, W. Leo
2012-01-01
Ray-finned fishes make up half of all living vertebrate species. Nearly all ray-finned fishes are teleosts, which include most commercially important fish species, several model organisms for genomics and developmental biology, and the dominant component of marine and freshwater vertebrate faunas. Despite the economic and scientific importance of ray-finned fishes, the lack of a single comprehensive phylogeny with corresponding divergence-time estimates has limited our understanding of the evolution and diversification of this radiation. Our analyses, which use multiple nuclear gene sequences in conjunction with 36 fossil age constraints, result in a well-supported phylogeny of all major ray-finned fish lineages and molecular age estimates that are generally consistent with the fossil record. This phylogeny informs three long-standing problems: specifically identifying elopomorphs (eels and tarpons) as the sister lineage of all other teleosts, providing a unique hypothesis on the radiation of early euteleosts, and offering a promising strategy for resolution of the “bush at the top of the tree” that includes percomorphs and other spiny-finned teleosts. Contrasting our divergence time estimates with studies using a single nuclear gene or whole mitochondrial genomes, we find that the former underestimates ages of the oldest ray-finned fish divergences, but the latter dramatically overestimates ages for derived teleost lineages. Our time-calibrated phylogeny reveals that much of the diversification leading to extant groups of teleosts occurred between the late Mesozoic and early Cenozoic, identifying this period as the “Second Age of Fishes.” PMID:22869754
Recursive sequences in first-year calculus
NASA Astrophysics Data System (ADS)
Krainer, Thomas
2016-02-01
This article provides ready-to-use supplementary material on recursive sequences for a second-semester calculus class. It equips first-year calculus students with a basic methodical procedure based on which they can conduct a rigorous convergence or divergence analysis of many simple recursive sequences on their own without the need to invoke inductive arguments as is typically required in calculus textbooks. The sequences that are accessible to this kind of analysis are predominantly (eventually) monotonic, but also certain recursive sequences that alternate around their limit point as they converge can be considered.
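As a numerical illustration of the kind of monotonic recursive sequence the abstract above refers to, the sketch below iterates a recursion and checks that the terms are non-decreasing and bounded. The recursion a_{n+1} = sqrt(2 + a_n) with a_0 = 0 and the helper name `iterate` are assumptions chosen for this example; they are not taken from the article, and a numerical check is not a substitute for the rigorous analysis the article describes.

```python
import math


def iterate(f, a0, n):
    """Return the terms a_0, a_1, ..., a_n of the recursion a_{k+1} = f(a_k)."""
    terms = [a0]
    for _ in range(n):
        terms.append(f(terms[-1]))
    return terms


# Hypothetical example recursion: a_{k+1} = sqrt(2 + a_k), starting from a_0 = 0.
terms = iterate(lambda a: math.sqrt(2 + a), 0.0, 30)

# A non-decreasing sequence that is bounded above converges; its limit L must
# satisfy the fixed-point equation L = sqrt(2 + L), whose non-negative solution is L = 2.
is_non_decreasing = all(x <= y for x, y in zip(terms, terms[1:]))
is_bounded_above = all(x <= 2.0 for x in terms)

if __name__ == "__main__":
    print(is_non_decreasing, is_bounded_above, terms[-1])
```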
New CRISPR-Cas systems from uncultivated microbes
NASA Astrophysics Data System (ADS)
Burstein, David; Harrington, Lucas B.; Strutt, Steven C.; Probst, Alexander J.; Anantharaman, Karthik; Thomas, Brian C.; Doudna, Jennifer A.; Banfield, Jillian F.
2017-02-01
CRISPR-Cas systems provide microbes with adaptive immunity by employing short DNA sequences, termed spacers, that guide Cas proteins to cleave foreign DNA. Class 2 CRISPR-Cas systems are streamlined versions, in which a single RNA-bound Cas protein recognizes and cleaves target sequences. The programmable nature of these minimal systems has enabled researchers to repurpose them into a versatile technology that is broadly revolutionizing biological and clinical research. However, current CRISPR-Cas technologies are based solely on systems from isolated bacteria, leaving the vast majority of enzymes from organisms that have not been cultured untapped. Metagenomics, the sequencing of DNA extracted directly from natural microbial communities, provides access to the genetic material of a huge array of uncultivated organisms. Here, using genome-resolved metagenomics, we identify a number of CRISPR-Cas systems, including the first reported Cas9 in the archaeal domain of life, to our knowledge. This divergent Cas9 protein was found in little-studied nanoarchaea as part of an active CRISPR-Cas system. In bacteria, we discovered two previously unknown systems, CRISPR-CasX and CRISPR-CasY, which are among the most compact systems yet discovered. Notably, all required functional components were identified by metagenomics, enabling validation of robust in vivo RNA-guided DNA interference activity in Escherichia coli. Interrogation of environmental microbial communities combined with in vivo experiments allows us to access an unprecedented diversity of genomes, the content of which will expand the repertoire of microbe-based biotechnologies.
New CRISPR–Cas systems from uncultivated microbes
Burstein, David; Harrington, Lucas B.; Strutt, Steven C.; ...
2016-12-22
We present that CRISPR-Cas systems provide microbes with adaptive immunity by employing short DNA sequences, termed spacers, that guide Cas proteins to cleave foreign DNA. Class 2 CRISPR-Cas systems are streamlined versions, in which a single RNA-bound Cas protein recognizes and cleaves target sequences. The programmable nature of these minimal systems has enabled researchers to repurpose them into a versatile technology that is broadly revolutionizing biological and clinical research. However, current CRISPR-Cas technologies are based solely on systems from isolated bacteria, leaving the vast majority of enzymes from organisms that have not been cultured untapped. Metagenomics, the sequencing of DNA extracted directly from natural microbial communities, provides access to the genetic material of a huge array of uncultivated organisms. Here, using genome-resolved metagenomics, we identify a number of CRISPR-Cas systems, including the first reported Cas9 in the archaeal domain of life, to our knowledge. This divergent Cas9 protein was found in little-studied nanoarchaea as part of an active CRISPR-Cas system. In bacteria, we discovered two previously unknown systems, CRISPR-CasX and CRISPR-CasY, which are among the most compact systems yet discovered. Notably, all required functional components were identified by metagenomics, enabling validation of robust in vivo RNA-guided DNA interference activity in Escherichia coli. Lastly, interrogation of environmental microbial communities combined with in vivo experiments allows us to access an unprecedented diversity of genomes, the content of which will expand the repertoire of microbe-based biotechnologies.
New CRISPR–Cas systems from uncultivated microbes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Burstein, David; Harrington, Lucas B.; Strutt, Steven C.
We present that CRISPR-Cas systems provide microbes with adaptive immunity by employing short DNA sequences, termed spacers, that guide Cas proteins to cleave foreign DNA. Class 2 CRISPR-Cas systems are streamlined versions, in which a single RNA-bound Cas protein recognizes and cleaves target sequences. The programmable nature of these minimal systems has enabled researchers to repurpose them into a versatile technology that is broadly revolutionizing biological and clinical research. However, current CRISPR-Cas technologies are based solely on systems from isolated bacteria, leaving the vast majority of enzymes from organisms that have not been cultured untapped. Metagenomics, the sequencing of DNA extracted directly from natural microbial communities, provides access to the genetic material of a huge array of uncultivated organisms. Here, using genome-resolved metagenomics, we identify a number of CRISPR-Cas systems, including the first reported Cas9 in the archaeal domain of life, to our knowledge. This divergent Cas9 protein was found in little-studied nanoarchaea as part of an active CRISPR-Cas system. In bacteria, we discovered two previously unknown systems, CRISPR-CasX and CRISPR-CasY, which are among the most compact systems yet discovered. Notably, all required functional components were identified by metagenomics, enabling validation of robust in vivo RNA-guided DNA interference activity in Escherichia coli. Lastly, interrogation of environmental microbial communities combined with in vivo experiments allows us to access an unprecedented diversity of genomes, the content of which will expand the repertoire of microbe-based biotechnologies.
Martin, David H.; Zozaya, Marcela; Lillis, Rebecca A.; Myers, Leann; Nsuami, M. Jacques; Ferris, Michael J.
2013-01-01
Background. The prevalence of Trichomonas vaginalis infection is highest in women with intermediate Nugent scores. We hypothesized that the vaginal microbiota in T. vaginalis–infected women differs from that in T. vaginalis–uninfected women. Methods. Vaginal samples from 30 T. vaginalis–infected women were matched by Nugent score to those from 30 T. vaginalis–uninfected women. Equal numbers of women with Nugent scores categorized as normal, intermediate, and bacterial vaginosis were included. The vaginal microbiota was assessed using 454 pyrosequencing analysis of polymerase chain reaction–amplified 16S ribosomal RNA gene sequences. The 16S ribosomal RNA gene sequence of an unknown organism was obtained by universal bacterial polymerase chain reaction amplification, cloning, and sequencing. Results. Principal coordinates analysis of the pyrosequencing data showed divergence of the vaginal microbiota in T. vaginalis–infected and T. vaginalis–uninfected patients among women with normal and those with intermediate Nugent scores but not among women with bacterial vaginosis. Cluster analysis revealed 2 unique groups of T. vaginalis–infected women. One had a high abundance of Mycoplasma hominis and the other had a high abundance of an unknown Mycoplasma species. Women in the former group had clinical evidence of enhanced vaginal inflammation. Conclusions. T. vaginalis may alter the vaginal microbiota in a manner that is favorable to its survival and/or transmissibility. An unknown Mycoplasma species plays a role in some of these transformations. In other cases, these changes may result in a heightened host inflammatory response. PMID:23482642
Martin, David H; Zozaya, Marcela; Lillis, Rebecca A; Myers, Leann; Nsuami, M Jacques; Ferris, Michael J
2013-06-15
The prevalence of Trichomonas vaginalis infection is highest in women with intermediate Nugent scores. We hypothesized that the vaginal microbiota in T. vaginalis-infected women differs from that in T. vaginalis-uninfected women. Vaginal samples from 30 T. vaginalis-infected women were matched by Nugent score to those from 30 T. vaginalis-uninfected women. Equal numbers of women with Nugent scores categorized as normal, intermediate, and bacterial vaginosis were included. The vaginal microbiota was assessed using 454 pyrosequencing analysis of polymerase chain reaction-amplified 16S ribosomal RNA gene sequences. The 16S ribosomal RNA gene sequence of an unknown organism was obtained by universal bacterial polymerase chain reaction amplification, cloning, and sequencing. Principal coordinates analysis of the pyrosequencing data showed divergence of the vaginal microbiota in T. vaginalis-infected and T. vaginalis-uninfected patients among women with normal and those with intermediate Nugent scores but not among women with bacterial vaginosis. Cluster analysis revealed 2 unique groups of T. vaginalis-infected women. One had a high abundance of Mycoplasma hominis and the other had a high abundance of an unknown Mycoplasma species. Women in the former group had clinical evidence of enhanced vaginal inflammation. T. vaginalis may alter the vaginal microbiota in a manner that is favorable to its survival and/or transmissibility. An unknown Mycoplasma species plays a role in some of these transformations. In other cases, these changes may result in a heightened host inflammatory response.
Jarvi, Susan I; Bianchi, Kiara R; Farias, Margaret Em; Txakeeyang, Ann; McFarland, Thomas; Belcaid, Mahdi; Asano, Ashley
2016-07-01
Hawaiian honeycreepers (Drepanidinae) have evolved in the absence of mosquitoes for over five million years. Through human activity, mosquitoes were introduced to the Hawaiian archipelago less than 200 years ago. Mosquito-vectored diseases such as avian malaria caused by Plasmodium relictum and Avipoxviruses have greatly impacted these vulnerable species. Susceptibility to these diseases is variable among and within species. Due to their function in adaptive immunity, the role of major histocompatibility complex genes (Mhc) in disease susceptibility is under investigation. In this study, we evaluate gene organization and levels of diversity of Mhc class II β chain genes (exon 2) in a captive-reared family of Hawaii 'amakihi (Hemignathus virens). A total of 233 sequences (173 bp) were obtained by PCR+1 amplification and cloning, and 5720 sequences were generated by Roche 454 pyrosequencing. We report a total of 17 alleles originating from a minimum of 14 distinct loci. We detected three linkage groups that appear to represent three distinct haplotypes. Phylogenetic analysis revealed one variable cluster resembling classical Mhc sequences (DAB) and one highly conserved, low variability cluster resembling non-classical Mhc sequences (DBB). High net evolutionary divergence values between DAB and DBB resemble that seen between chicken BLB system and YLB system genes. High amino acid identity among non-classical alleles from 12 species of passerines (DBB) and four species of Galliformes (YLB) was found, suggesting that these non-classical passerine sequences may be related to the Galliforme YLB sequences.
Gordon, Kacy L.; Arthur, Robert K.; Ruvinsky, Ilya
2015-01-01
Gene regulatory information guides development and shapes the course of evolution. To test conservation of gene regulation within the phylum Nematoda, we compared the functions of putative cis-regulatory sequences of four sets of orthologs (unc-47, unc-25, mec-3 and elt-2) from distantly-related nematode species. These species, Caenorhabditis elegans, its congeneric C. briggsae, and three parasitic species Meloidogyne hapla, Brugia malayi, and Trichinella spiralis, represent four of the five major clades in the phylum Nematoda. Despite the great phylogenetic distances sampled and the extensive sequence divergence of nematode genomes, all but one of the regulatory elements we tested are able to drive at least a subset of the expected gene expression patterns. We show that functionally conserved cis-regulatory elements have no more extended sequence similarity to their C. elegans orthologs than would be expected by chance, but they do harbor motifs that are important for proper expression of the C. elegans genes. These motifs are too short to be distinguished from the background level of sequence similarity, and while identical in sequence they are not conserved in orientation or position. Functional tests reveal that some of these motifs contribute to proper expression. Our results suggest that conserved regulatory circuitry can persist despite considerable turnover within cis elements. PMID:26020930
2014-01-01
Background. Horseshoe crabs are marine arthropods with a fossil record extending back approximately 450 million years. They exhibit remarkable morphological stability over their long evolutionary history, retaining a number of ancestral arthropod traits, and are often cited as examples of “living fossils.” As arthropods, they belong to the Ecdysozoa, an ancient super-phylum whose sequenced genomes (including insects and nematodes) have thus far shown more divergence from the ancestral pattern of eumetazoan genome organization than cnidarians, deuterostomes and lophotrochozoans. However, much of ecdysozoan diversity remains unrepresented in comparative genomic analyses. Results. Here we apply a new strategy of combined de novo assembly and genetic mapping to examine the chromosome-scale genome organization of the Atlantic horseshoe crab, Limulus polyphemus. We constructed a genetic linkage map of this 2.7 Gbp genome by sequencing the nuclear DNA of 34 wild-collected, full-sibling embryos and their parents at a mean redundancy of 1.1x per sample. The map includes 84,307 sequence markers grouped into 1,876 distinct genetic intervals and 5,775 candidate conserved protein coding genes. Conclusions. Comparison with other metazoan genomes shows that the L. polyphemus genome preserves ancestral bilaterian linkage groups, and that a common ancestor of modern horseshoe crabs underwent one or more ancient whole genome duplications 300 million years ago, followed by extensive chromosome fusion. These results provide a counter-example to the often noted correlation between whole genome duplication and evolutionary radiations. The new, low-cost genetic mapping method for obtaining a chromosome-scale view of non-model organism genomes that we demonstrate here does not require laboratory culture, and is potentially applicable to a broad range of other species. PMID:24987520
Badisco, Liesbeth; Huybrechts, Jurgen; Simonet, Gert; Verlinden, Heleen; Marchal, Elisabeth; Huybrechts, Roger; Schoofs, Liliane; De Loof, Arnold; Vanden Broeck, Jozef
2011-03-21
The desert locust (Schistocerca gregaria) displays a fascinating type of phenotypic plasticity, designated as 'phase polyphenism'. Depending on environmental conditions, one genome can be translated into two highly divergent phenotypes, termed the solitarious and gregarious (swarming) phase. Although many of the underlying molecular events remain elusive, the central nervous system (CNS) is expected to play a crucial role in the phase transition process. Locusts have also proven to be interesting model organisms in a physiological and neurobiological research context. However, molecular studies in locusts are hampered by the fact that genome/transcriptome sequence information available for this branch of insects is still limited. We have generated 34,672 raw expressed sequence tags (EST) from the CNS of desert locusts in both phases. These ESTs were assembled in 12,709 unique transcript sequences and nearly 4,000 sequences were functionally annotated. Moreover, the obtained S. gregaria EST information is highly complementary to the existing orthopteran transcriptomic data. Since many novel transcripts encode neuronal signaling and signal transduction components, this paper includes an overview of these sequences. Furthermore, several transcripts being differentially represented in solitarious and gregarious locusts were retrieved from this EST database. The findings highlight the involvement of the CNS in the phase transition process and indicate that this novel annotated database may also add to the emerging knowledge of concomitant neuronal signaling and neuroplasticity events. In summary, we met the need for novel sequence data from desert locust CNS. To our knowledge, we hereby also present the first insect EST database that is derived from the complete CNS. The obtained S. gregaria EST data constitute an important new source of information that will be instrumental in further unraveling the molecular principles of phase polyphenism, in further establishing locusts as valuable research model organisms and in molecular evolutionary and comparative entomology.
Monoparametric family of metrics derived from classical Jensen-Shannon divergence
NASA Astrophysics Data System (ADS)
Osán, Tristán M.; Bussandri, Diego G.; Lamberti, Pedro W.
2018-04-01
Jensen-Shannon divergence is a well known multi-purpose measure of dissimilarity between probability distributions. It has been proven that the square root of this quantity is a true metric in the sense that, in addition to the basic properties of a distance, it also satisfies the triangle inequality. In this work we extend this last result to prove that in fact it is possible to derive a monoparametric family of metrics from the classical Jensen-Shannon divergence. Motivated by our results, an application into the field of symbolic sequences segmentation is explored. Additionally, we analyze the possibility to extend this result into the quantum realm.
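The property cited in the abstract above, that the square root of the Jensen-Shannon divergence satisfies the triangle inequality, can be checked numerically on small examples. The sketch below uses the standard definition of the divergence in terms of the Kullback-Leibler divergence with base-2 logarithms; the function names and the three example distributions are hypothetical, and the monoparametric family of metrics itself is not reproduced here because the abstract does not give its form.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_divergence(p, q):
    """JSD(p, q) = 0.5 * D(p || m) + 0.5 * D(q || m), with m the midpoint distribution."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def js_distance(p, q):
    """Square root of the Jensen-Shannon divergence (clamped at 0 against rounding noise)."""
    return math.sqrt(max(jensen_shannon_divergence(p, q), 0.0))


if __name__ == "__main__":
    # Hypothetical probability distributions over three symbols.
    p = [0.7, 0.2, 0.1]
    q = [0.1, 0.8, 0.1]
    r = [0.3, 0.3, 0.4]
    # Triangle inequality: d(p, r) <= d(p, q) + d(q, r).
    print(js_distance(p, r), js_distance(p, q) + js_distance(q, r))
```

With base-2 logarithms the divergence lies between 0 and 1, reaching 1 for distributions with disjoint support.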
Jensen, Annette Bruun; Eilenberg, Jørgen; López Lastra, Claudia
2009-11-01
Three DNA regions (ITS 1, LSU rRNA and GPD) of isolates from the insect-pathogenic fungus genus Entomophthora originating from different fly (Diptera) and aphid (Hemiptera) host taxa were sequenced. The results documented a large genetic diversity among the fly-pathogenic Entomophthora and only minor differences among aphid-pathogenic Entomophthora. The evolutionary time of divergence of the fly and the aphid host taxa included cannot account for this difference. The host-driven divergence of Entomophthora, therefore, has been much greater in flies than in aphids. Host-range differences or a recent host shift to aphids are possible explanations.
Fatal Metacestode Infection in Bornean Orangutan Caused by Unknown Versteria Species
Gendron-Fitzpatrick, Annette; Deering, Kathleen M.; Wallace, Roberta S.; Clyde, Victoria L.; Lauck, Michael; Rosen, Gail E.; Bennett, Andrew J.; Greiner, Ellis C.; O’Connor, David H.
2014-01-01
A captive juvenile Bornean orangutan (Pongo pygmaeus) died from an unknown disseminated parasitic infection. Deep sequencing of DNA from infected tissues, followed by gene-specific PCR and sequencing, revealed a divergent species within the newly proposed genus Versteria (Cestoda: Taeniidae). Versteria may represent a previously unrecognized risk to primate health. PMID:24377497
USDA-ARS's Scientific Manuscript database
We report on the assembly of the 14,146 base pairs (bp) near complete mitochondrial sequencing of the legume pod borer (LPB), Maruca vitrata (Lepidoptera: Crambidae), which was used to estimate divergence and relationships within the lepidopteran lineage. Arrangement and orientation of 13 protein c...
Komatsu, Ken; Yamashita, Kazuo; Sugawara, Kota; Verbeek, Martin; Fujita, Naoko; Hanada, Kaoru; Uehara-Ichiki, Tamaki; Fuji, Shin-Ichi
2017-02-01
Plantago asiatica mosaic virus (PlAMV) is a member of the genus Potexvirus and has an exceptionally wide host range. It causes severe damage to lilies. Here we report on the complete nucleotide sequences of two new Japanese PlAMV isolates, one from the eudicot weed Viola grypoceras (PlAMV-Vi), and the other from the eudicot shrub Nandina domestica Thunb. (PlAMV-NJ). Their genomes contain five open reading frames (ORFs), which is characteristic of potexviruses. Surprisingly, the isolates showed only 76.0-78.0 % sequence identity with each other and with other PlAMV isolates, including isolates from Japanese lily and American nandina. Amino acid alignments of the replicase coding region encoded by ORF1 showed that the regions between the methyltransferase and helicase domains were less conserved than other regions, with several insertions and/or deletions. Phylogenetic analyses of the full-length nucleotide sequences revealed a moderate correlation between phylogenetic clustering and the original host plants of the PlAMV isolates. This study revealed the presence of two highly divergent PlAMV isolates in Japan.
Kim, Dae Hun; Ko, Kwan Soo
2015-07-01
To investigate pmrCAB sequence divergence in 5 species of Acinetobacter baumannii complex, a total of 80 isolates from a Korean hospital were explored. We evaluated nucleotide and amino acid polymorphisms of the pmrCAB operon, and phylogenetic trees were constructed for each gene of the pmrCAB operon. Colistin and polymyxin B susceptibility was determined for all isolates, and multilocus sequence typing was also performed for A. baumannii isolates. Our results showed that each species of A. baumannii complex has divergent pmrCAB operon sequences. We identified a distinct pmrCAB allele allied with Acinetobacter nosocomialis in gene trees. Different grouping in each gene tree suggests sporadic recombination or emergence of pmrCAB genes among Acinetobacter species. Sequence polymorphisms among Acinetobacter species might not be associated with colistin resistance. We revealed that a distinct pmrCAB allele may be widespread across the continents such as North America and Asia and that sporadic genetic recombination or emergence of pmrCAB genes might occur. Copyright © 2015 Elsevier Inc. All rights reserved.
Sarkar, Mohosin; Liu, Yun; Qi, Junpeng; Peng, Haiyong; Morimoto, Jumpei; Rader, Christoph; Chiorazzi, Nicholas; Kodadek, Thomas
2016-04-01
Chronic lymphocytic leukemia (CLL) is a disease in which a single B-cell clone proliferates relentlessly in peripheral lymphoid organs, bone marrow, and blood. DNA sequencing experiments have shown that about 30% of CLL patients have stereotyped antigen-specific B-cell receptors (BCRs) with a high level of sequence homology in the variable domains of the heavy and light chains. These include many of the most aggressive cases that have IGHV-unmutated BCRs whose sequences have not diverged significantly from the germ line. This suggests a personalized therapy strategy in which a toxin or immune effector function is delivered selectively to the pathogenic B-cells but not to healthy B-cells. To execute this strategy, serum-stable, drug-like compounds able to target the antigen-binding sites of most or all patients in a stereotyped subset are required. We demonstrate here the feasibility of this approach with the discovery of selective, high affinity ligands for CLL BCRs of the aggressive, stereotyped subset 7P that cross-react with the BCRs of several CLL patients in subset 7P, but not with BCRs from patients outside this subset. © 2016 by The American Society for Biochemistry and Molecular Biology, Inc.
Genomic organization of plant aminopropyl transferases.
Rodríguez-Kessler, Margarita; Delgado-Sánchez, Pablo; Rodríguez-Kessler, Gabriela Theresia; Moriguchi, Takaya; Jiménez-Bremont, Juan Francisco
2010-07-01
Aminopropyl transferases like spermidine synthase (SPDS; EC 2.5.1.16), spermine synthase and thermospermine synthase (SPMS, tSPMS; EC 2.5.1.22) belong to a class of widely distributed enzymes that use decarboxylated S-adenosylmethionine as an aminopropyl donor and putrescine or spermidine as an amino acceptor to form in that order spermidine, spermine or thermospermine. We describe the analysis of plant genomic sequences encoding SPDS, SPMS, tSPMS and PMT (putrescine N-methyltransferase; EC 2.1.1.53). Genome organization (including exon size, gain and loss, as well as intron number, size, loss, retention, placement and phase, and the presence of transposons) of plant aminopropyl transferase genes were compared between the genomic sequences of SPDS, SPMS and tSPMS from Zea mays, Oryza sativa, Malus x domestica, Populus trichocarpa, Arabidopsis thaliana and Physcomitrella patens. In addition, the genomic organization of plant PMT genes, proposed to be derived from SPDS during the evolution of alkaloid metabolism, is illustrated. Herein, a particular conservation and arrangement of exon and intron sequences between plant SPDS, SPMS and PMT genes that clearly differs with that of ACL5 genes, is shown. The possible acquisition of the plant SPMS exon II and, in particular exon XI in the monocot SPMS genes, is a remarkable feature that allows their differentiation from SPDS genes. In accordance with our in silico analysis, functional complementation experiments of the maize ZmSPMS1 enzyme (previously considered to be SPDS) in yeast demonstrated its spermine synthase activity. Another significant aspect is the conservation of intron sequences among SPDS and PMT paralogs. In addition the existence of microsynteny among some SPDS paralogs, especially in P. trichocarpa and A. thaliana, supports duplication events of plant SPDS genes. 
Based on our analysis, we hypothesize that SPMS genes appeared with the divergence of vascular plants by a process of gene duplication and the acquisition of unique exons of as-yet unknown origin. 2010 Elsevier Masson SAS. All rights reserved.