Krishnan, Neeraja M; Seligmann, Hervé; Stewart, Caro-Beth; De Koning, A P Jason; Pollock, David D
2004-10-01
Reconstruction of ancestral DNA and amino acid sequences is an important means of inferring information about past evolutionary events. Such reconstructions suggest changes in molecular function and evolutionary processes over the course of evolution and are used to infer adaptation and convergence. Maximum likelihood (ML) is generally thought to provide relatively accurate reconstructed sequences compared to parsimony, but both methods lead to the inference of multiple directional changes in nucleotide frequencies in primate mitochondrial DNA (mtDNA). To better understand this surprising result, as well as to better understand how parsimony and ML differ, we constructed a series of computationally simple "conditional pathway" methods that differed in the number of substitutions allowed per site along each branch, and we also evaluated the entire Bayesian posterior frequency distribution of reconstructed ancestral states. We analyzed primate mitochondrial cytochrome b (Cyt-b) and cytochrome oxidase subunit I (COI) genes and found that ML reconstructs ancestral frequencies that are often more different from tip sequences than are parsimony reconstructions. In contrast, frequency reconstructions based on the posterior ensemble more closely resemble extant nucleotide frequencies. Simulations indicate that these differences in ancestral sequence inference are probably due to deterministic bias caused by high uncertainty in the optimization-based ancestral reconstruction methods (parsimony, ML, Bayesian maximum a posteriori). In contrast, ancestral nucleotide frequencies based on an average of the Bayesian set of credible ancestral sequences are much less biased. 
The methods involving simpler conditional pathway calculations have slightly reduced likelihood values compared to full likelihood calculations, but they can provide fairly unbiased nucleotide reconstructions and may be useful in more complex phylogenetic analyses than considered here due to their speed and flexibility. To determine whether biased reconstructions using optimization methods might affect inferences of functional properties, ancestral primate mitochondrial tRNA sequences were inferred and helix-forming propensities for conserved pairs were evaluated in silico. For ambiguously reconstructed nucleotides at sites with high base composition variability, ancestral tRNA sequences from Bayesian analyses were more compatible with canonical base pairing than were those inferred by other methods. Thus, nucleotide bias in reconstructed sequences apparently can lead to serious bias and inaccuracies in functional predictions.
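The contrast drawn above between optimization-based reconstructions and frequencies averaged over the posterior can be illustrated with a toy calculation. The sketch below is not the authors' method. It assumes hypothetical per-site posterior probabilities for an ancestral node and compares the state frequencies obtained by picking the single most probable state at each site with those obtained by averaging the posterior probabilities.

```python
def map_frequencies(site_posteriors):
    """State frequencies when each site is assigned its single most probable state."""
    counts = {}
    for post in site_posteriors:
        best = max(post, key=post.get)
        counts[best] = counts.get(best, 0) + 1
    n = len(site_posteriors)
    return {state: c / n for state, c in counts.items()}


def posterior_mean_frequencies(site_posteriors):
    """State frequencies obtained by averaging posterior probabilities over sites."""
    totals = {}
    for post in site_posteriors:
        for state, p in post.items():
            totals[state] = totals.get(state, 0.0) + p
    n = len(site_posteriors)
    return {state: t / n for state, t in totals.items()}


# Hypothetical posteriors: ten uncertain sites, each slightly favouring A over G.
sites = [{"A": 0.6, "G": 0.4}] * 10
map_freq = map_frequencies(sites)               # every site is called A
mean_freq = posterior_mean_frequencies(sites)   # A and G keep their 0.6 / 0.4 weights
```

In this made-up case the optimized reconstruction reports the ancestor as all A, while the posterior average retains the uncertainty, which is the kind of systematic frequency shift the abstract describes.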
A phylogenetic study of Laeliinae (Orchidaceae) based on combined nuclear and plastid DNA sequences
van den Berg, Cássio; Higgins, Wesley E.; Dressler, Robert L.; Whitten, W. Mark; Soto-Arenas, Miguel A.; Chase, Mark W.
2009-01-01
Background and Aims Laeliinae are a neotropical orchid subtribe with approx. 1500 species in 50 genera. In this study, an attempt is made to assess generic alliances based on molecular phylogenetic analysis of DNA sequence data. Methods Six DNA datasets were gathered: plastid trnL intron, trnL-F spacer, matK gene and trnK introns upstream and downstream from matK and nuclear ITS rDNA. Data were analysed with maximum parsimony (MP) and Bayesian analysis with mixed models (BA). Key Results Although relationships between Laeliinae and outgroups are well supported, within the subtribe sequence variation is low considering the broad taxonomic range covered. Localized incongruence between the ITS and plastid trees was found. A combined tree followed the ITS trees more closely, but the levels of support obtained with MP were low. The Bayesian analysis recovered more well-supported nodes. The trees from combined MP and BA allowed eight generic alliances to be recognized within Laeliinae, all of which show trends in morphological characters but lack unambiguous synapomorphies. Conclusions By using combined plastid and nuclear DNA data in conjunction with mixed-models Bayesian inference, it is possible to delimit smaller groups within Laeliinae and discuss general patterns of pollination and hybridization compatibility. Furthermore, these small groups can now be used for further detailed studies to explain morphological evolution and diversification patterns within the subtribe. PMID:19423551
Romer, Katherine A.; Kayombya, Guy-Richard; Fraenkel, Ernest
2007-01-01
WebMOTIFS provides a web interface that facilitates the discovery and analysis of DNA-sequence motifs. Several studies have shown that the accuracy of motif discovery can be significantly improved by using multiple de novo motif discovery programs and using randomized control calculations to identify the most significant motifs or by using Bayesian approaches. WebMOTIFS makes it easy to apply these strategies. Using a single submission form, users can run several motif discovery programs and score, cluster and visualize the results. In addition, the Bayesian motif discovery program THEME can be used to determine the class of transcription factors that is most likely to regulate a set of sequences. Input can be provided as a list of gene or probe identifiers. Used with the default settings, WebMOTIFS accurately identifies biologically relevant motifs from diverse data in several species. WebMOTIFS is freely available at http://fraenkel.mit.edu/webmotifs. PMID:17584794
Posterior Predictive Bayesian Phylogenetic Model Selection
Lewis, Paul O.; Xie, Wangang; Chen, Ming-Hui; Fan, Yu; Kuo, Lynn
2014-01-01
We present two distinctly different posterior predictive approaches to Bayesian phylogenetic model selection and illustrate these methods using examples from green algal protein-coding cpDNA sequences and flowering plant rDNA sequences. The Gelfand–Ghosh (GG) approach allows dissection of an overall measure of model fit into components due to posterior predictive variance (GGp) and goodness-of-fit (GGg), which distinguishes this method from the posterior predictive P-value approach. The conditional predictive ordinate (CPO) method provides a site-specific measure of model fit useful for exploratory analyses and can be combined over sites yielding the log pseudomarginal likelihood (LPML) which is useful as an overall measure of model fit. CPO provides a useful cross-validation approach that is computationally efficient, requiring only a sample from the posterior distribution (no additional simulation is required). Both GG and CPO add new perspectives to Bayesian phylogenetic model selection based on the predictive abilities of models and complement the perspective provided by the marginal likelihood (including Bayes Factor comparisons) based solely on the fit of competing models to observed data. [Bayesian; conditional predictive ordinate; CPO; L-measure; LPML; model selection; phylogenetics; posterior predictive.] PMID:24193892
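The CPO and LPML quantities described above can be computed from a posterior sample alone. The sketch below uses the standard harmonic-mean estimator, CPO_i = 1 / mean_t[1 / p(y_i | theta_t)], evaluated in log space. The input layout (`site_logliks[t][i]`) and the function names are assumptions made for this illustration and are not taken from the authors' software.

```python
import math


def log_cpo(site_logliks):
    """Per-site log CPO from site log-likelihoods.

    site_logliks[t][i] is the log-likelihood of site i under posterior sample t.
    """
    n_samples = len(site_logliks)
    n_sites = len(site_logliks[0])
    result = []
    for i in range(n_sites):
        # log of the mean inverse likelihood, via a log-sum-exp for stability
        neg = [-sample[i] for sample in site_logliks]
        m = max(neg)
        log_mean_inv = m + math.log(sum(math.exp(v - m) for v in neg) / n_samples)
        result.append(-log_mean_inv)
    return result


def lpml(site_logliks):
    """Log pseudomarginal likelihood: the sum of the per-site log CPO values."""
    return sum(log_cpo(site_logliks))
```

Sites with unusually low log CPO are the ones the model predicts poorly, which is what makes the per-site values useful for exploratory analysis.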
Ancient DNA sequence revealed by error-correcting codes.
Brandão, Marcelo M; Spoladore, Larissa; Faria, Luzinete C B; Rocha, Andréa S L; Silva-Filho, Marcio C; Palazzo, Reginaldo
2015-07-10
A previously described DNA sequence generator algorithm (DNA-SGA) using error-correcting codes has been employed as a computational tool to address the evolutionary pathway of the genetic code. The code-generated sequence alignment demonstrated that a residue mutation revealed by the code can be found in the same position in sequences of distantly related taxa. Furthermore, the code-generated sequences do not promote amino acid changes in the deviant genomes through codon reassignment. A Bayesian evolutionary analysis of both code-generated and homologous sequences of the Arabidopsis thaliana malate dehydrogenase gene indicates an approximately 1 MYA divergence time from the MDH code-generated sequence node to its paralogous sequences. The DNA-SGA helps to determine the plesiomorphic state of DNA sequences because a single nucleotide alteration often occurs in distantly related taxa and can be found in the alternative codon patterns of noncanonical genetic codes. As a consequence, the algorithm may reveal an earlier stage of the evolution of the standard code. PMID:26159228
Nonparametric Bayesian clustering to detect bipolar methylated genomic loci.
Wu, Xiaowei; Sun, Ming-An; Zhu, Hongxiao; Xie, Hehuang
2015-01-16
With recent development in sequencing technology, a large number of genome-wide DNA methylation studies have generated massive amounts of bisulfite sequencing data. The analysis of DNA methylation patterns helps researchers understand epigenetic regulatory mechanisms. Highly variable methylation patterns reflect stochastic fluctuations in DNA methylation, whereas well-structured methylation patterns imply deterministic methylation events. Among these methylation patterns, bipolar patterns are important as they may originate from allele-specific methylation (ASM) or cell-specific methylation (CSM). Utilizing nonparametric Bayesian clustering followed by hypothesis testing, we have developed a novel statistical approach to identify bipolar methylated genomic regions in bisulfite sequencing data. Simulation studies demonstrate that the proposed method achieves good performance in terms of specificity and sensitivity. We used the method to analyze data from mouse brain and human blood methylomes. The bipolar methylated segments detected are found highly consistent with the differentially methylated regions identified by using purified cell subsets. Bipolar DNA methylation often indicates epigenetic heterogeneity caused by ASM or CSM. With allele-specific events filtered out or appropriately taken into account, our proposed approach sheds light on the identification of cell-specific genes/pathways under strong epigenetic control in a heterogeneous cell population.
Armstrong, Miles R; Husmeier, Dirk; Phillips, Mark S; Blok, Vivian C
2007-06-01
The discovery that the potato cyst nematode Globodera pallida has a multipartite mitochondrial DNA (mtDNA) composed, at least in part, of six small circular mtDNAs (scmtDNAs) raised a number of questions concerning the population-level processes that might act on such a complex genome. Here we report our observations on the distribution of some scmtDNAs among a sample of European and South American G. pallida populations. The occurrence of sequence variants of scmtDNA IV in population P4A from South America, and that particular sequence variants are common to the individuals within a single cyst, is described. Evidence for recombination of sequence variants of scmtDNA IV in P4A is also reported. The mosaic structure of P4A scmtDNA IV sequences was revealed using several detection methods and recombination breakpoints were independently detected by maximum likelihood and Bayesian MCMC methods.
A Bayesian deconvolution strategy for immunoprecipitation-based DNA methylome analysis
Down, Thomas A.; Rakyan, Vardhman K.; Turner, Daniel J.; Flicek, Paul; Li, Heng; Kulesha, Eugene; Gräf, Stefan; Johnson, Nathan; Herrero, Javier; Tomazou, Eleni M.; Thorne, Natalie P.; Bäckdahl, Liselotte; Herberth, Marlis; Howe, Kevin L.; Jackson, David K.; Miretti, Marcos M.; Marioni, John C.; Birney, Ewan; Hubbard, Tim J. P.; Durbin, Richard; Tavaré, Simon; Beck, Stephan
2009-01-01
DNA methylation is an indispensable epigenetic modification of mammalian genomes. Consequently there is great interest in strategies for genome-wide/whole-genome DNA methylation analysis, and immunoprecipitation-based methods have proven to be a powerful option. Such methods are rapidly shifting the bottleneck from data generation to data analysis, necessitating the development of better analytical tools. Until now, a major analytical difficulty associated with immunoprecipitation-based DNA methylation profiling has been the inability to estimate absolute methylation levels. Here we report the development of a novel cross-platform algorithm – Bayesian Tool for Methylation Analysis (Batman) – for analyzing Methylated DNA Immunoprecipitation (MeDIP) profiles generated using arrays (MeDIP-chip) or next-generation sequencing (MeDIP-seq). The latter is an approach we have developed to elucidate the first high-resolution whole-genome DNA methylation profile (DNA methylome) of any mammalian genome. MeDIP-seq/MeDIP-chip combined with Batman represent robust, quantitative, and cost-effective functional genomic strategies for elucidating the function of DNA methylation. PMID:18612301
New Stopping Criteria for Segmenting DNA Sequences
DOE Office of Scientific and Technical Information (OSTI.GOV)
Li, Wentian
2001-06-18
We propose a solution to the stopping-criterion problem in segmenting inhomogeneous DNA sequences with complex statistical patterns. This new stopping criterion is based on the Bayesian information criterion in the model selection framework. When this criterion was applied to a telomere of S. cerevisiae and the complete sequence of E. coli, borders of biologically meaningful units were identified, and a more reasonable number of domains was obtained. We also introduce a measure called segmentation strength, which can be used to control the delineation of large domains. The relationship between the average domain size and the threshold of segmentation strength is determined for several genome sequences.
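A BIC-based stopping rule of the kind described above can be sketched for a single split decision. The version below is an illustrative simplification, not the paper's exact formulation. It assumes an independent-base multinomial model for each segment and counts the extra parameters of a two-segment model as one more set of base frequencies plus the breakpoint. The function names are hypothetical.

```python
import math
from collections import Counter


def log_likelihood(seq):
    """Maximized multinomial log-likelihood of a segment with independent bases."""
    n = len(seq)
    return sum(c * math.log(c / n) for c in Counter(seq).values())


def best_split(seq):
    """Return (position, gain): the split that most increases the log-likelihood."""
    base = log_likelihood(seq)
    best_pos, best_gain = None, 0.0
    for pos in range(1, len(seq)):
        gain = log_likelihood(seq[:pos]) + log_likelihood(seq[pos:]) - base
        if gain > best_gain:
            best_pos, best_gain = pos, gain
    return best_pos, best_gain


def bic_accepts_split(seq, alphabet_size=4):
    """Accept the best split only if it lowers BIC = -2 log L + k log N."""
    pos, gain = best_split(seq)
    # Assumed parameter count: (alphabet_size - 1) extra frequencies + 1 breakpoint.
    extra_params = (alphabet_size - 1) + 1
    accept = pos is not None and 2.0 * gain > extra_params * math.log(len(seq))
    return accept, pos
```

Applied recursively to each accepted segment, a rule like this stops on its own once no further split pays for its added parameters. The brute-force search here is quadratic in sequence length and is meant only to show the criterion.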
Papasotiropoulos, Vasilis; Klossa-Kilia, Elena; Alahiotis, Stamatis N; Kilias, George
2007-08-01
Mitochondrial DNA sequence analysis has been used to explore genetic differentiation and phylogenetic relationships among five species of the Mugilidae family, Mugil cephalus, Chelon labrosus, Liza aurata, Liza ramada, and Liza saliens. DNA was isolated from samples originating from the Messolongi Lagoon in Greece. Three mtDNA segments (12S rRNA, 16S rRNA, and CO I) were PCR amplified and sequenced. Sequence analysis revealed the greatest genetic differentiation between M. cephalus and all the other species studied, while C. labrosus and L. aurata were the closest taxa. Dendrograms obtained by the neighbor-joining method and Bayesian inference analysis exhibited the same topology. According to this topology, M. cephalus is the most distinct species and the remaining taxa are clustered together, with C. labrosus and L. aurata forming a single group. The latter result brings into question the monophyletic origin of the genus Liza.
Schönberg, Anna; Theunert, Christoph; Li, Mingkun; Stoneking, Mark; Nasidze, Ivan
2011-09-01
To investigate the demographic history of human populations from the Caucasus and surrounding regions, we used high-throughput sequencing to generate 147 complete mtDNA genome sequences from random samples of individuals from three groups from the Caucasus (Armenians, Azeri and Georgians), and one group each from Iran and Turkey. Overall diversity is very high, with 144 different sequences that fall into 97 different haplogroups found among the 147 individuals. Bayesian skyline plots (BSPs) of population size change through time show a population expansion around 40-50 kya, followed by a constant population size, and then another expansion around 15-18 kya for the groups from the Caucasus and Iran. The BSP for Turkey differs the most from the others, with an increase from 35 to 50 kya followed by a prolonged period of constant population size, and no indication of a second period of growth. An approximate Bayesian computation approach was used to estimate divergence times between each pair of populations; the oldest divergence times were between Turkey and the other four groups from the South Caucasus and Iran (~400-600 generations), while the divergence time of the three Caucasus groups from each other was comparable to their divergence time from Iran (average of ~360 generations). These results illustrate the value of random sampling of complete mtDNA genome sequences that can be obtained with high-throughput sequencing platforms.
MPN estimation of qPCR target sequence recoveries from whole cell calibrator samples.
Sivaganesan, Mano; Siefring, Shawn; Varma, Manju; Haugland, Richard A
2011-12-01
DNA extracts from enumerated target organism cells (calibrator samples) have been used for estimating Enterococcus cell equivalent densities in surface waters by a comparative cycle threshold (Ct) qPCR analysis method. To compare surface water Enterococcus density estimates from different studies by this approach, either a consistent source of calibrator cells must be used or the estimates must account for any differences in target sequence recoveries from different sources of calibrator cells. In this report we describe two methods for estimating target sequence recoveries from whole cell calibrator samples based on qPCR analyses of their serially diluted DNA extracts and most probable number (MPN) calculation. The first method employed a traditional MPN calculation approach. The second method employed a Bayesian hierarchical statistical modeling approach and a Markov chain Monte Carlo (MCMC) simulation method to account for the uncertainty in these estimates associated with different individual samples of the cell preparations, different dilutions of the DNA extracts and different qPCR analytical runs. The two methods were applied to estimate mean target sequence recoveries per cell from two different lots of a commercially available source of enumerated Enterococcus cell preparations. The mean target sequence recovery estimates (and standard errors) per cell from Lot A and B cell preparations by the Bayesian method were 22.73 (3.4) and 11.76 (2.4), respectively, when the data were adjusted for potential false positive results. Means were similar for the traditional MPN approach, which cannot comparably assess uncertainty in the estimates. Cell numbers and estimates of recoverable target sequences in calibrator samples prepared from the two cell sources were also used to estimate cell equivalent and target sequence quantities recovered from surface water samples in a comparative Ct method. 
Our results illustrate the utility of the Bayesian method in accounting for uncertainty, the high degree of precision attainable by the MPN approach and the need to account for the differences in target sequence recoveries from different calibrator sample cell sources when they are used in the comparative Ct method. Published by Elsevier B.V.
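The traditional MPN calculation mentioned above can be sketched as a maximum-likelihood estimate from a dilution series. The code below is the textbook form, not the authors' implementation. It assumes that each replicate is positive with probability 1 - exp(-lambda * amount) and solves the score equation by bisection. The function names and the example counts are hypothetical.

```python
import math


def _inv_expm1(x):
    """1 / (exp(x) - 1), returning 0 for very large x to avoid overflow."""
    return 0.0 if x > 700.0 else 1.0 / math.expm1(x)


def mpn_estimate(amounts, n_reps, n_positive, iters=200):
    """Most probable number of targets per unit amount.

    amounts[j]    : amount of original sample in each replicate of dilution j
    n_reps[j]     : number of replicates at dilution j
    n_positive[j] : number of positive replicates at dilution j
    """
    if sum(n_positive) == 0:
        raise ValueError("no positive replicates: the estimate is zero")
    if all(p == n for p, n in zip(n_positive, n_reps)):
        raise ValueError("all replicates positive: the estimate is unbounded")

    def score(lam):
        # Derivative of the log-likelihood with respect to lam (decreasing in lam).
        return sum(p * a * _inv_expm1(lam * a) - (n - p) * a
                   for a, n, p in zip(amounts, n_reps, n_positive))

    lo, hi = 1e-12, 1e12
    for _ in range(iters):
        mid = math.sqrt(lo * hi)  # bisect on a logarithmic scale
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


# Hypothetical three-step tenfold dilution series with five replicates each.
estimate = mpn_estimate([1.0, 0.1, 0.01], [5, 5, 5], [5, 3, 0])
```

A point estimate like this carries no measure of run-to-run or sample-to-sample variability, which is the gap the hierarchical Bayesian approach in the abstract is meant to fill.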
Convergence among cave catfishes: long-branch attraction and a Bayesian relative rates test.
Wilcox, T P; García de León, F J; Hendrickson, D A; Hillis, D M
2004-06-01
Convergence has long been of interest to evolutionary biologists. Cave organisms appear to be ideal candidates for studying convergence in morphological, physiological, and developmental traits. Here we report apparent convergence in two cave-catfishes that were described on morphological grounds as congeners: Prietella phreatophila and Prietella lundbergi. We collected mitochondrial DNA sequence data from 10 species of catfishes, representing five of the seven genera in Ictaluridae, as well as seven species from a broad range of siluriform outgroups. Analysis of the sequence data under parsimony supports a monophyletic Prietella. However, both maximum-likelihood and Bayesian analyses support polyphyly of the genus, with P. lundbergi sister to Ictalurus and P. phreatophila sister to Ameiurus. The topological difference between parsimony and the other methods appears to result from long-branch attraction between the Prietella species. Similarly, the sequence data do not support several other relationships within Ictaluridae supported by morphology. We develop a new Bayesian method for examining variation in molecular rates of evolution across a phylogeny.
2014-01-01
Affinity capture of DNA methylation combined with high-throughput sequencing strikes a good balance between the high cost of whole genome bisulfite sequencing and the low coverage of methylation arrays. We present BayMeth, an empirical Bayes approach that uses a fully methylated control sample to transform observed read counts into regional methylation levels. In our model, inefficient capture can readily be distinguished from low methylation levels. BayMeth improves on existing methods, allows explicit modeling of copy number variation, and offers computationally efficient analytical mean and variance estimators. BayMeth is available in the Repitools Bioconductor package. PMID:24517713
Ned B. Klopfenstein; Jane E. Stewart; Yuko Ota; John W. Hanna; Bryce A. Richardson; Amy L. Ross-Davis; Ruben D. Elias-Roman; Kari Korhonen; Nenad Keca; Eugenia Iturritxa; Dionicio Alvarado-Rosales; Halvor Solheim; Nicholas J. Brazee; Piotr Lakomy; Michelle R. Cleary; Eri Hasegawa; Taisei Kikuchi; Fortunato Garza-Ocanas; Panaghiotis Tsopelas; Daniel Rigling; Simone Prospero; Tetyana Tsykun; Jean A. Berube; Franck O. P. Stefani; Saeideh Jafarpour; Vladimir Antonin; Michal Tomsovsky; Geral I. McDonald; Stephen Woodward; Mee-Sook Kim
2017-01-01
Armillaria possesses several intriguing characteristics that have inspired wide interest in understanding phylogenetic relationships within and among species of this genus. Nuclear ribosomal DNA sequence-based analyses of Armillaria provide only limited information for phylogenetic studies among widely divergent taxa. More recent studies have shown that translation...
Zhi-Bin Wen; Ming-Li Zhang; Ge-Lin Zhu; Stewart C. Sanderson
2010-01-01
To reconstruct phylogeny and verify the monophyly of major subgroups, a total of 52 species representing almost all species of Salsoleae s.l. in China were sampled, with analysis based on three molecular markers (nrDNA ITS, cpDNA psbB-psbH and rbcL), using maximum parsimony, maximum likelihood, and Bayesian inference methods. Our molecular evidence provides strong...
Validation of Pooled Whole-Genome Re-Sequencing in Arabidopsis lyrata.
Fracassetti, Marco; Griffin, Philippa C; Willi, Yvonne
2015-01-01
Sequencing pooled DNA of multiple individuals from a population instead of sequencing individuals separately has become popular due to its cost-effectiveness and simple wet-lab protocol, although some criticism of this approach remains. Here we validated a protocol for pooled whole-genome re-sequencing (Pool-seq) of Arabidopsis lyrata libraries prepared with low amounts of DNA (1.6 ng per individual). The validation was based on comparing single nucleotide polymorphism (SNP) frequencies obtained by pooling with those obtained by individual-based Genotyping By Sequencing (GBS). Furthermore, we investigated the effect of sample number, sequencing depth per individual and variant caller on population SNP frequency estimates. For Pool-seq data, we compared frequency estimates from two SNP callers, VarScan and Snape; the former employs a frequentist SNP calling approach while the latter uses a Bayesian approach. Results revealed concordance correlation coefficients well above 0.8, confirming that Pool-seq is a valid method for acquiring population-level SNP frequency data. Higher accuracy was achieved by pooling more samples (25 compared to 14) and working with higher sequencing depth (4.1× per individual compared to 1.4× per individual), which increased the concordance correlation coefficient to 0.955. The Bayesian-based SNP caller produced somewhat higher concordance correlation coefficients, particularly at low sequencing depth. We recommend pooling at least 25 individuals combined with sequencing at a depth of 100× to produce satisfactory frequency estimates for common SNPs (minor allele frequency above 0.05).
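The concordance correlation coefficient used above to compare pooled and individual-based allele frequencies can be computed as follows. This is a minimal sketch of Lin's coefficient using population (1/n) moments. The function name and the example frequencies are made up for illustration.

```python
def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient between two equal-length samples."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    # Agreement is penalized both for scatter and for a shift between the means.
    return 2.0 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)


# Hypothetical SNP frequencies from individual genotyping and from a pool.
individual_freqs = [0.1, 0.2, 0.3]
pooled_freqs = [0.2, 0.3, 0.4]
ccc = concordance_correlation(individual_freqs, pooled_freqs)
```

Unlike Pearson's correlation, which would be 1 for these perfectly parallel values, the concordance coefficient drops when one method is systematically offset from the other.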
Informative priors on fetal fraction increase power of the noninvasive prenatal screen.
Xu, Hanli; Wang, Shaowei; Ma, Lin-Lin; Huang, Shuai; Liang, Lin; Liu, Qian; Liu, Yang-Yang; Liu, Ke-Di; Tan, Ze-Min; Ban, Hao; Guan, Yongtao; Lu, Zuhong
2017-11-09
Purpose: Noninvasive prenatal screening (NIPS) sequences a mixture of the maternal and fetal cell-free DNA. Fetal trisomy can be detected by examining chromosomal dosages estimated from sequencing reads. The traditional method uses the Z-test, which compares a subject against a set of euploid controls, where the information of fetal fraction is not fully utilized. Here we present a Bayesian method that leverages informative priors on the fetal fraction. Method: Our Bayesian method combines the Z-test likelihood and informative priors of the fetal fraction, which are learned from the sex chromosomes, to compute Bayes factors. The Bayesian framework can account for nongenetic risk factors through the prior odds, and our method can report individual positive/negative predictive values. Results: Our Bayesian method has more power than the Z-test method. We analyzed 3,405 NIPS samples and spotted at least 9 (of 51) possible Z-test false positives. Conclusion: Bayesian NIPS is more powerful than the Z-test method, is able to account for nongenetic risk factors through prior odds, and can report individual positive/negative predictive values. Genetics in Medicine advance online publication, 9 November 2017; doi:10.1038/gim.2017.186.
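The way a Bayes factor and prior odds combine into an individual predictive value can be shown with a few lines of arithmetic. The sketch below covers only the general Bayes rule step (posterior odds = Bayes factor × prior odds). It does not reproduce how the authors compute their Bayes factors, and the numbers are invented for illustration.

```python
def posterior_probability(bayes_factor, prior_probability):
    """Posterior probability of trisomy from a Bayes factor and a prior probability."""
    prior_odds = prior_probability / (1.0 - prior_probability)
    posterior_odds = bayes_factor * prior_odds
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical case: a prior risk of 1 in 100 and a Bayes factor of 99 for trisomy.
individual_ppv = posterior_probability(bayes_factor=99.0, prior_probability=0.01)
```

Because the prior probability enters directly, the same sequencing evidence yields a different predictive value for subjects with different nongenetic risk, which is the point made in the abstract.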
Rediscovery of Good-Turing estimators via Bayesian nonparametrics.
Favaro, Stefano; Nipoti, Bernardo; Teh, Yee Whye
2016-03-01
The problem of estimating discovery probabilities originated in the context of statistical ecology, and in recent years it has become popular due to its frequent appearance in challenging applications arising in genetics, bioinformatics, linguistics, designs of experiments, machine learning, etc. A full range of statistical approaches, parametric and nonparametric as well as frequentist and Bayesian, has been proposed for estimating discovery probabilities. In this article, we investigate the relationships between the celebrated Good-Turing approach, which is a frequentist nonparametric approach developed in the 1940s, and a Bayesian nonparametric approach recently introduced in the literature. Specifically, under the assumption of a two parameter Poisson-Dirichlet prior, we show that Bayesian nonparametric estimators of discovery probabilities are asymptotically equivalent, for a large sample size, to suitably smoothed Good-Turing estimators. As a by-product of this result, we introduce and investigate a methodology for deriving exact and asymptotic credible intervals to be associated with the Bayesian nonparametric estimators of discovery probabilities. The proposed methodology is illustrated through a comprehensive simulation study and the analysis of Expressed Sequence Tags data generated by sequencing a benchmark complementary DNA library. © 2015, The International Biometric Society.
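For orientation, the classical (unsmoothed) Good-Turing estimates referred to above can be written in a few lines. This sketch is not the smoothed estimator or the Bayesian nonparametric estimator studied in the article. It uses the textbook formulas m_1 / n for the probability of a new species and (r + 1) m_{r+1} / n for the total probability of species seen r times, with hypothetical function names.

```python
from collections import Counter


def frequency_of_frequencies(sample):
    """Map r -> number of distinct species observed exactly r times."""
    species_counts = Counter(sample)
    return Counter(species_counts.values())


def good_turing_discovery(sample, r=0):
    """Estimated probability that the next observation is a species seen r times.

    r = 0 gives the probability of discovering a species not yet observed.
    """
    n = len(sample)
    m = frequency_of_frequencies(sample)
    return (r + 1) * m.get(r + 1, 0) / n


# Hypothetical sample of four reads from three species.
reads = ["a", "a", "b", "c"]
p_new = good_turing_discovery(reads, r=0)
p_seen_once = good_turing_discovery(reads, r=1)
```

The raw estimates become erratic when m_{r+1} is small or zero, which is why smoothed versions are used in practice.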
Winterton, Shaun L; Wiegmann, Brian M; Schlinger, Evert I
2007-06-01
The first formal analysis of phylogenetic relationships among small-headed flies (Acroceridae) is presented based on DNA sequence data from two ribosomal (16S and 28S) and two protein-encoding genes: carbamoylphosphate synthase (CPS) domain of CAD (i.e., rudimentary locus) and cytochrome oxidase I (COI). DNA sequences from 40 species in 22 genera of Acroceridae (representing all three subfamilies) were compared with outgroup exemplars from Nemestrinidae, Stratiomyidae, Tabanidae, and Xylophagidae. Parsimony and Bayesian simultaneous analyses of the full data set recover a well-resolved and strongly supported hypothesis of phylogenetic relationships for major lineages within the family. Molecular evidence supports the monophyly of traditionally recognised subfamilies Philopotinae and Panopinae, but Acrocerinae are polyphyletic. Panopinae, sometimes considered "primitive" based on morphology and host-use, are always placed in a more derived position in the current study. Furthermore, these data support emerging morphological evidence that the type genus Acrocera Meigen, and its sister genus Sphaerops, are atypical acrocerids, comprising a sister lineage to all other Acroceridae. Based on the phylogeny generated in the simultaneous analysis, historical divergence times were estimated using Bayesian methodology constrained with fossil data. These estimates indicate Acroceridae likely evolved during the late Triassic but did not diversify greatly until the Cretaceous.
Liu, Guo-Hua; Li, Chun; Li, Jia-Yuan; Zhou, Dong-Hui; Xiong, Rong-Chuan; Lin, Rui-Qing; Zou, Feng-Cai; Zhu, Xing-Quan
2012-01-01
Sparganosis, caused by the plerocercoid larvae of members of the genus Spirometra, can cause significant public health problems and considerable economic losses. In the present study, the complete mitochondrial DNA (mtDNA) sequence of Spirometra erinaceieuropaei from China was determined, characterized and compared with that of S. erinaceieuropaei from Japan. The gene arrangement in the mt genome sequences of S. erinaceieuropaei from China and Japan is identical. The identity of the mt genomes was 99.1% between S. erinaceieuropaei from China and Japan, and the complete mtDNA sequence of S. erinaceieuropaei from China is slightly shorter (2 bp) than that from Japan. Phylogenetic analysis of S. erinaceieuropaei with other representative cestodes using two different computational algorithms [Bayesian inference (BI) and maximum likelihood (ML)] based on concatenated amino acid sequences of 12 protein-coding genes revealed that S. erinaceieuropaei is closely related to Diphyllobothrium spp., supporting classification based on morphological features. The present study determined the complete mtDNA sequence of S. erinaceieuropaei from China, which provides novel genetic markers for studying the population genetics and molecular epidemiology of S. erinaceieuropaei in humans and animals. PMID:22553464
A novel gammaherpesvirus in a large flying fox (Pteropus vampyrus) with blepharitis.
Paige Brock, A; Cortés-Hinojosa, Galaxia; Plummer, Caryn E; Conway, Julia A; Roff, Shannon R; Childress, April L; Wellehan, James F X
2013-05-01
A novel gammaherpesvirus was identified in a large flying fox (Pteropus vampyrus) with conjunctivitis, blepharitis, and meibomianitis by nested polymerase chain reaction and sequencing. Polymerase chain reaction amplification and sequencing of 472 base pairs of the DNA-dependent DNA polymerase gene were used to identify a novel herpesvirus. Bayesian and maximum likelihood phylogenetic analyses indicated that the virus is a member of the genus Percavirus in the subfamily Gammaherpesvirinae. Additional research is needed regarding the association of this virus with conjunctivitis and other ocular pathology. This virus may be useful as a biomarker of stress and may be a useful model of virus recrudescence in Pteropus spp.
Phylogeny of sipunculan worms: A combined analysis of four gene regions and morphology.
Schulze, Anja; Cutler, Edward B; Giribet, Gonzalo
2007-01-01
The intra-phyletic relationships of sipunculan worms were analyzed based on DNA sequence data from four gene regions and 58 morphological characters. Initially we analyzed the data under direct optimization using parsimony as optimality criterion. An implied alignment resulting from the direct optimization analysis was subsequently utilized to perform a Bayesian analysis with mixed models for the different data partitions. For this we applied a doublet model for the stem regions of the 18S rRNA. Both analyses support monophyly of Sipuncula and most of the same clades within the phylum. The analyses differ with respect to the relationships among the major groups but whereas the deep nodes in the direct optimization analysis generally show low jackknife support, they are supported by 100% posterior probability in the Bayesian analysis. Direct optimization has been useful for handling sequences of unequal length and generating conservative phylogenetic hypotheses whereas the Bayesian analysis under mixed models provided high resolution in the basal nodes of the tree.
Haplogroup relationships between domestic and wild sheep resolved using a mitogenome panel.
Meadows, J R S; Hiendleder, S; Kijas, J W
2011-04-01
Five haplogroups have been identified in domestic sheep through global surveys of mitochondrial (mt) sequence variation; however, these group classifications are often based on small fragments of the complete mtDNA sequence, such as the partial control region or the cytochrome B gene. This study presents the complete mitogenome from representatives of each haplogroup identified in domestic sheep, plus a sample of their wild relatives. Comparison of the sequences successfully resolved the relationships between each haplogroup and provided insight into the relationship with wild sheep. The five haplogroups were characterised as branching independently, a radiation that shared a common ancestor 920,000 ± 190,000 years ago based on protein coding sequence. The utility of various mtDNA components to inform the true relationship between sheep was also examined with Bayesian, maximum likelihood and partitioned Bremer support analyses. The control region was found to be the mtDNA component that contributed the highest amount of support to the tree generated using the complete data set. This study provides the nucleus of a mtDNA mitogenome panel, which can be used to assess additional mitogenomes and serve as a reference set to evaluate small fragments of the mtDNA. PMID:20940734
Ni, Pan; Bhuiyan, Ali Akbar; Chen, Jian-Hai; Li, Jingjin; Zhang, Cheng; Zhao, Shuhong; Du, Xiaoyong; Li, Hua; Yu, Hui; Liu, Xiangdong; Li, Kui
2018-06-01
To date, the scarcity of publicly available complete mitochondrial sequences for European wild pigs has hampered a deeper understanding of the genetic changes following domestication. Here, we have assembled 26 de novo mtDNA sequences of European wild boars from next generation sequencing (NGS) data and downloaded 174 complete mtDNA sequences to assess the genetic relationship, nucleotide diversity, and selection. The Bayesian consensus tree reveals a clear divergence between the European and Asian clades and a very small portion (10 out of 200 samples) of maternal introgression. The overall nucleotide diversity of the mtDNA sequences has been reduced following domestication. Interestingly, the selection efficiencies in both European and Asian domestic pigs are reduced, probably because of changes in both selection constraints and maternal population size following domestication. This study suggests that de novo assembled mitogenomes can be a great boon for uncovering the genetic turnover following domestication. Further investigation is warranted to include more samples from the ever-increasing amounts of NGS data to help us better understand the process of domestication.
Stephen, Alexa A; Leone, Angelique M; Toplon, David E; Archer, Linda L; Wellehan, James F X
2016-12-01
A juvenile female bald eagle (Haliaeetus leucocephalus) was presented with emaciation and proliferative periocular lesions. The eagle did not respond to supportive therapy and was euthanatized. Histopathologic examination of the skin lesions revealed plaques of marked epidermal hyperplasia, parakeratosis, marked acanthosis and spongiosis, and eosinophilic intracytoplasmic inclusion bodies. Novel polymerase chain reaction (PCR) assays were done to amplify and sequence DNA polymerase and rpo147 genes. The 4b gene was also analyzed by a previously developed assay. Bayesian and maximum likelihood phylogenetic analyses of the obtained sequences found the virus to be a poxvirus of the genus Avipoxvirus that clustered with other raptor isolates. Better phylogenetic resolution was found in rpo147 than in the commonly used DNA polymerase. The novel consensus rpo147 PCR assay will create more accurate phylogenetic trees and allow better insight into poxvirus history.
Tomasello, Salvatore; Álvarez, Inés; Vargas, Pablo; Oberprieler, Christoph
2015-01-01
The present study provides results of multi-species coalescent species tree analyses of DNA sequences sampled from multiple nuclear and plastid regions to infer the phylogenetic relationships among the members of the subtribe Leucanthemopsidinae (Compositae, Anthemideae). Besides the annual Castrilanthemum debeauxii (Degen, Hervier & É.Rev.) Vogt & Oberp., one of the rarest flowering plant species of the Iberian Peninsula, the subtribe comprises two other unispecific genera (Hymenostemma, Prolongoa) and the polyploid complex of the genus Leucanthemopsis. Based on sequence information from two single- to low-copy nuclear regions (C16, D35, characterised by Chapman et al. (2007)), the multi-copy region of the nrDNA internal transcribed spacer regions ITS1 and ITS2, and two intergenic spacer regions of the cpDNA, gene trees were reconstructed using Bayesian inference methods. For the reconstruction of a multi-locus species tree we applied three different methods: (a) analysis of concatenated sequences using Bayesian inference (MrBayes), (b) a tree reconciliation approach by minimizing the number of deep coalescences (PhyloNet), and (c) a coalescent-based species-tree method in a Bayesian framework (*BEAST). All three species tree reconstruction methods unequivocally support the close relationship of the subtribe with the hitherto unclassified genus Phalacrocarpum, the sister-group relationship of Castrilanthemum with the three remaining genera of the subtribe, and the further sister-group relationship of the clade of Hymenostemma+Prolongoa with a monophyletic genus Leucanthemopsis.
Dating of the *BEAST phylogeny supports the long-lasting (Early Miocene, 15-22 Ma) taxonomic independence of the Castrilanthemum lineage and the assumed switch from the plesiomorphic perennial to the apomorphic annual life-form, which may have occurred no earlier than the Pliocene (3 Ma), when the establishment of a Mediterranean climate with summer droughts triggered evolution towards annuality.
McGowen, Michael R
2011-09-01
Oceanic dolphins (Delphinidae) are the product of a rapid radiation that yielded ∼36 extant species of small to medium-sized cetaceans that first emerged in the Late Miocene. Although they are a charismatic group of organisms that have become poster children for marine conservation, many phylogenetic relationships within Delphinidae remain elusive due to the slow molecular evolution of the group and the difficulty of resolving short branches from successive cladogenic events. Here I combine existing and newly generated sequences from four mitochondrial (mt) genes and 20 nuclear (nu) genes to reconstruct a well-supported phylogenetic hypothesis for Delphinidae. This study compares maximum-likelihood and Bayesian inference methods across several data sets including mtDNA, combined nuDNA, gene trees of individual nuDNA loci, and concatenated mtDNA+nuDNA. In addition, I contrast these standard phylogenetic analyses with the species tree reconstruction method of Bayesian concordance analysis (BCA). Despite finding discordance between mtDNA and individual nuDNA loci, the concatenated matrix recovers a completely resolved and robustly supported phylogeny that is also broadly congruent with BCA trees. This study strongly supports groupings such as Delphininae, Lissodelphininae, Globicephalinae, Sotalia+Delphininae, and Steno+Orcaella+Globicephalinae, and places Leucopleurus acutus, Lagenorhynchus albirostris, and Orcinus orca as basal delphinid taxa.
Callejón, Rocío; Robles, María Del Rosario; Panei, Carlos Javier; Cutillas, Cristina
2016-08-01
A molecular phylogenetic hypothesis is presented for the genus Trichuris based on sequence data from mitochondrial cytochrome c oxidase 1 (cox1) and cytochrome b (cob). The taxa consisted of nine populations of whipworm from five species of Sigmodontinae rodents from Argentina. Bayesian Inference, Maximum Parsimony, and Maximum Likelihood methods were used to infer phylogenies for each gene separately as well as for the combined mitochondrial data and the combined mitochondrial and nuclear dataset. Phylogenetic results based on cox1 and cob mitochondrial DNA (mtDNA) revealed three strongly resolved clades corresponding to three different species (Trichuris navonae, Trichuris bainae, and Trichuris pardinasi) showing phylogeographic variation, but relationships among Trichuris species were poorly resolved. Phylogenetic reconstruction based on concatenated sequences had greater phylogenetic resolution for delimiting species and intraspecific populations of Trichuris than reconstructions based on partitioned genes. Thus, populations of T. bainae and T. pardinasi could be affected by geographical factors and parasite-host co-divergence.
Attwood, Stephen W.; Fatih, Farrah A.; Upatham, E. Suchart
2008-01-01
Background: Schistosomiasis in humans along the lower Mekong River has proven a persistent public health problem in the region. The causative agent is the parasite Schistosoma mekongi (Trematoda: Digenea). A new transmission focus is reported, as well as the first study of genetic variation among S. mekongi populations. The aim is to confirm the identity of the species involved at each known focus of Mekong schistosomiasis transmission, to examine historical relationships among the populations and related taxa, and to provide data for use (a priori) in further studies of the origins, radiation, and future dispersal capabilities of S. mekongi. Methodology/Principal Findings: DNA sequence data are presented for four populations of S. mekongi from Cambodia and southern Laos, three of which were distinguishable at the COI (cox1) and 12S (rrnS) mitochondrial loci sampled. A phylogeny was estimated for these populations and the other members of the Schistosoma sinensium group. The study provides new DNA sequence data for three new populations and one new locus/population combination. A Bayesian approach is used to estimate divergence dates for events within the S. sinensium group and among the S. mekongi populations. Conclusions/Significance: The date estimates are consistent with phylogeographical hypotheses describing a Pliocene radiation of the S. sinensium group and a mid-Pleistocene invasion of Southeast Asia by S. mekongi. The date estimates also provide Bayesian priors for future work on the evolution of S. mekongi. The public health implications of S. mekongi transmission outside the lower Mekong River are also discussed. PMID:18350111
Feng, Hao; Conneely, Karen N.; Wu, Hao
2014-01-01
DNA methylation is an important epigenetic modification that has essential roles in cellular processes including gene regulation, development and disease and is widely dysregulated in most types of cancer. Recent advances in sequencing technology have enabled the measurement of DNA methylation at single nucleotide resolution through methods such as whole-genome bisulfite sequencing and reduced representation bisulfite sequencing. In DNA methylation studies, a key task is to identify differences under distinct biological contexts, for example, between tumor and normal tissue. A challenge in sequencing studies is that the number of biological replicates is often limited by the costs of sequencing. The small number of replicates leads to unstable variance estimation, which can reduce accuracy to detect differentially methylated loci (DML). Here we propose a novel statistical method to detect DML when comparing two treatment groups. The sequencing counts are described by a lognormal-beta-binomial hierarchical model, which provides a basis for information sharing across different CpG sites. A Wald test is developed for hypothesis testing at each CpG site. Simulation results show that the proposed method yields improved DML detection compared to existing methods, particularly when the number of replicates is low. The proposed method is implemented in the Bioconductor package DSS. PMID:24561809
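To make the idea of a per-site Wald test concrete, the following is a minimal sketch only. It is not the DSS implementation: it uses the plain sample variance of replicate methylation proportions, with none of the hierarchical shrinkage the abstract describes, and the function name and inputs are hypothetical.

```python
import math
from statistics import mean, variance


def naive_wald_test(meth_1, total_1, meth_2, total_2):
    """Two-sided Wald test for a difference in mean methylation at one CpG site.

    meth_* are methylated read counts per replicate, total_* are total read
    counts per replicate. Each group needs at least two replicates.
    Assumption: no variance shrinkage, unlike the method in the abstract.
    """
    props_1 = [m / n for m, n in zip(meth_1, total_1)]
    props_2 = [m / n for m, n in zip(meth_2, total_2)]
    # Standard error of the difference between the two group means.
    se = math.sqrt(variance(props_1) / len(props_1)
                   + variance(props_2) / len(props_2))
    if se == 0:
        raise ValueError("zero variance in both groups; statistic undefined")
    z = (mean(props_1) - mean(props_2)) / se
    # Two-sided p-value under a standard normal reference distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


z, p = naive_wald_test([80, 85, 90], [100, 100, 100],
                       [20, 25, 30], [100, 100, 100])
print(z, p)
```

With only two or three replicates the sample variance in the denominator is itself very noisy, which is the instability that motivates sharing information across CpG sites.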
A Bayesian mixture model for chromatin interaction data.
Niu, Liang; Lin, Shili
2015-02-01
Chromatin interactions mediated by a particular protein are of interest for studying gene regulation, especially the regulation of genes that are associated with, or known to be causative of, a disease. A recent molecular technique, Chromatin interaction analysis by paired-end tag sequencing (ChIA-PET), that uses chromatin immunoprecipitation (ChIP) and high throughput paired-end sequencing, is able to detect such chromatin interactions genomewide. However, ChIA-PET may generate noise (i.e., pairings of DNA fragments by random chance) in addition to true signal (i.e., pairings of DNA fragments by interactions). In this paper, we propose MC_DIST based on a mixture modeling framework to identify true chromatin interactions from ChIA-PET count data (counts of DNA fragment pairs). The model is cast into a Bayesian framework to take into account the dependency among the data and the available information on protein binding sites and gene promoters to reduce false positives. A simulation study showed that MC_DIST outperforms the previously proposed hypergeometric model in terms of both power and type I error rate. A real data study showed that MC_DIST may identify potential chromatin interactions between protein binding sites and gene promoters that may be missed by the hypergeometric model. An R package implementing the MC_DIST model is available at http://www.stat.osu.edu/~statgen/SOFTWARE/MDM.
Xu, Chang; Nezami Ranjbar, Mohammad R; Wu, Zhong; DiCarlo, John; Wang, Yexun
2017-01-03
Detection of DNA mutations at very low allele fractions with high accuracy will significantly improve the effectiveness of precision medicine for cancer patients. To achieve this goal through next generation sequencing, researchers need a detection method that 1) captures rare mutation-containing DNA fragments efficiently in the mix of abundant wild-type DNA; 2) sequences the DNA library extensively to deep coverage; and 3) distinguishes low level true variants from amplification and sequencing errors with high accuracy. Targeted enrichment using PCR primers provides researchers with a convenient way to achieve deep sequencing for a small, yet most relevant region using benchtop sequencers. Molecular barcoding (or indexing) provides a unique solution for reducing sequencing artifacts analytically. Although different molecular barcoding schemes have been reported in recent literature, most variant calling has been done on limited targets, using simple custom scripts. The analytical performance of barcode-aware variant calling can be significantly improved by incorporating advanced statistical models. We present here a highly efficient, simple and scalable enrichment protocol that integrates molecular barcodes in multiplex PCR amplification. In addition, we developed smCounter, an open source, generic, barcode-aware variant caller based on a Bayesian probabilistic model. smCounter was optimized and benchmarked on two independent read sets with SNVs and indels at 5 and 1% allele fractions. Variants were called with very good sensitivity and specificity within coding regions. We demonstrated that we can accurately detect somatic mutations with allele fractions as low as 1% in coding regions using our enrichment protocol and variant caller.
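As a rough illustration of why molecular barcodes help, the sketch below collapses reads that share a barcode into one consensus base per original molecule, then computes an allele fraction over molecules. This is a simple majority-vote scheme assumed for illustration; it is not smCounter's Bayesian model, and the function names, input layout, and 0.8 agreement threshold are hypothetical.

```python
from collections import Counter, defaultdict


def barcode_consensus(reads, min_agreement=0.8):
    """Collapse (barcode, base) observations at one site to one base per barcode.

    A barcode is kept only if its most common base makes up at least
    `min_agreement` of its reads; otherwise it is dropped as unreliable.
    """
    by_barcode = defaultdict(list)
    for barcode, base in reads:
        by_barcode[barcode].append(base)
    consensus = {}
    for barcode, bases in by_barcode.items():
        base, count = Counter(bases).most_common(1)[0]
        if count / len(bases) >= min_agreement:
            consensus[barcode] = base
    return consensus


def allele_fraction(consensus, allele):
    """Fraction of retained barcodes (molecules) supporting `allele`."""
    if not consensus:
        return 0.0
    return sum(1 for b in consensus.values() if b == allele) / len(consensus)


reads = ([("bc1", "C")] * 9 + [("bc1", "T")]   # one likely sequencing error
         + [("bc2", "T")] * 3                   # consistent variant molecule
         + [("bc3", "C"), ("bc3", "T")])        # ambiguous, dropped
molecules = barcode_consensus(reads)
print(molecules, allele_fraction(molecules, "T"))
```

Counting molecules rather than reads keeps a single amplification or sequencing error from being mistaken for a low-fraction variant.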
Sandoval-Castellanos, Edson; Palkopoulou, Eleftheria; Dalén, Love
2014-01-01
Inference of population demographic history has vastly improved in recent years due to a number of technological and theoretical advances including the use of ancient DNA. Approximate Bayesian computation (ABC) stands among the most promising methods due to its simple theoretical foundation and exceptional flexibility. However, the limited availability of user-friendly programs that perform ABC analysis makes it difficult to implement, and hence programming skills are frequently required. In addition, few programs are able to deal with heterochronous data. Here we present the software BaySICS: Bayesian Statistical Inference of Coalescent Simulations. BaySICS provides an integrated and user-friendly platform that performs ABC analyses by means of coalescent simulations from DNA sequence data. It estimates historical demographic population parameters and performs hypothesis testing by means of Bayes factors obtained from model comparisons. Although providing specific features that improve inference from datasets with heterochronous data, BaySICS also has several capabilities making it a suitable tool for analysing contemporary genetic datasets. Those capabilities include joint analysis of independent tables, a graphical interface and the implementation of Markov-chain Monte Carlo without likelihoods.
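The core of ABC can be shown with a rejection sampler: draw a parameter from the prior, simulate data, and keep the draw if a summary statistic is close to the observed one. The sketch below is a generic illustration, not BaySICS code. The simulator is a toy stand-in for a coalescent simulation, and the prior, tolerance, and all names are assumptions.

```python
import random
from statistics import mean


def abc_rejection(observed, simulate, sample_prior, tolerance, n_draws, rng):
    """Keep prior draws whose simulated summary statistic is within
    `tolerance` of the observed statistic (ABC rejection sampling)."""
    accepted = []
    for _ in range(n_draws):
        theta = sample_prior(rng)
        if abs(simulate(theta, rng) - observed) <= tolerance:
            accepted.append(theta)
    return accepted


def simulate_variable_sites(theta, rng, n_sites=200):
    """Toy simulator (hypothetical): number of variable sites among
    n_sites, each site variable with probability theta."""
    return sum(1 for _ in range(n_sites) if rng.random() < theta)


def uniform_prior(rng):
    """Assumed prior: theta uniform on [0, 0.5]."""
    return rng.uniform(0.0, 0.5)


rng = random.Random(1)
posterior = abc_rejection(observed=20, simulate=simulate_variable_sites,
                          sample_prior=uniform_prior, tolerance=2,
                          n_draws=5000, rng=rng)
print(len(posterior), round(mean(posterior), 3))
```

The accepted draws approximate the posterior without ever evaluating a likelihood. Shrinking the tolerance tightens the approximation but lowers the acceptance rate.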
Zeng, Xu; Yuan, Zhengrong; Tong, Xin; Li, Qiushi; Gao, Weiwei; Qin, Minjian; Liu, Zhihua
2012-05-01
Oryzoideae (Poaceae) plants have economic and ecological value. However, the phylogenetic position of some plants is not clear, such as Hygroryza aristata (Retz.) Nees. and Porteresia coarctata (Roxb.) Tateoka (syn. Oryza coarctata). Comprehensive molecular phylogenetic studies have been carried out on many genera in the Poaceae. Different DNA sequences, including nuclear and chloroplast sequences, have been extensively employed to determine relationships at both higher and lower taxonomic levels in the Poaceae. The chloroplast DNA ndhF gene and atpB-rbcL spacer were used to construct phylogenetic trees and estimate the divergence time of Oryzoideae, Bambusoideae, Panicoideae, Pooideae and other subfamilies. Complete sequences of atpB-rbcL and ndhF were generated for 17 species representing six species of the Oryzoideae and related subfamilies. Nicotiana tabacum L. was the outgroup species. The two DNA datasets were analyzed using Maximum Parsimony and Bayesian analysis methods. The molecular phylogeny revealed that H. aristata (Retz.) Nees was the sister to Chikusichloa aquatica Koidz. Moreover, P. coarctata (Roxb.) Tateoka was in the genus Oryza. Furthermore, the result of the evolution analysis, which was based on the ndhF marker, indicated that the time of origin of Oryzoideae might be 31 million years ago.
Matthias, Michael A.; Díaz, M. Mónica; Campos, Kalina J.; Calderon, Maritza; Willig, Michael R.; Pacheco, Victor; Gotuzzo, Eduardo; Gilman, Robert H.; Vinetz, Joseph M.
2008-01-01
The role of bats as potential sources of transmission to humans or as maintenance hosts of leptospires is poorly understood. We quantified the prevalence of leptospiral colonization in bats in the Peruvian Amazon in the vicinity of Iquitos, an area of high biologic diversity. Of 589 analyzed bats, culture (3 of 589) and molecular evidence (20 of 589) of leptospiral colonization was found in the kidneys, yielding an overall colonization rate of 3.4%. Infection rates differed with habitat and location, and among different bat species. Bayesian analysis was used to infer phylogenetic relationships of leptospiral 16S ribosomal DNA sequences. Tree topologies were consistent with groupings based on DNA-DNA hybridization studies. A diverse group of leptospires was found in peri-Iquitos bat populations including Leptospira interrogans (5 clones), L. kirschneri (1), L. borgpetersenii (4), L. fainei (1), and two previously undescribed leptospiral species (8). Although L. kirschneri and L. interrogans have been previously isolated from bats, this report is the first to describe L. borgpetersenii and L. fainei infection of bats. A wild animal reservoir of L. fainei has not been previously described. The detection in bats of the L. interrogans serovar Icterohemorrhagiae, a leptospire typically maintained by peridomestic rats, suggests a rodent-bat infection cycle. Bats in Iquitos maintain a genetically diverse group of leptospires. These results provide a solid basis for pursuing molecular epidemiologic studies of bat-associated Leptospira, a potentially new epidemiologic reservoir of transmission of leptospirosis to humans. PMID:16282313
Rosvold, Jørgen; Røed, Knut H; Hufthammer, Anne Karin; Andersen, Reidar; Stenøien, Hans K
2012-09-26
Red deer (Cervus elaphus) have been an important human resource for millennia, experiencing intensive human influence through habitat alterations, hunting and translocation of animals. In this study we investigate a time series of ancient and contemporary DNA from Norwegian red deer spanning about 7,000 years. Our main aim was to investigate how increasing agricultural land use, hunting pressure and possibly human mediated translocation of animals have affected the genetic diversity on a long-term scale. We obtained mtDNA (D-loop) sequences from 73 ancient specimens. These show higher genetic diversity in ancient compared to extant samples, with the highest diversity preceding the onset of agricultural intensification in the Early Iron Age. Using standard diversity indices, Bayesian skyline plot and approximate Bayesian computation, we detected a population reduction which was more prolonged than, but not as severe as, historic documents indicate. There are signs of substantial changes in haplotype frequencies primarily due to loss of haplotypes through genetic drift. There is no indication of human mediated translocations into the Norwegian population. All the Norwegian sequences show a western European origin, from which the Norwegian lineage diverged approximately 15,000 years ago. Our results provide direct insight into the effects of increasing habitat fragmentation and human hunting pressure on genetic diversity and structure of red deer populations. They also shed light on the northward post-glacial colonisation process of red deer in Europe and suggest increased precision in inferring past demographic events when including both ancient and contemporary DNA.
Xylopsora canopeorum (Umbilicariaceae), a new lichen species from the canopy of Sequoia sempervirens
Bendiksby, Mika; Næsborg, Rikke Reese; Timdal, Einar
2018-01-01
Xylopsora canopeorum Timdal, Reese Næsborg & Bendiksby is described as a new species occupying the crowns of large Sequoia sempervirens trees in California, USA. The new species is supported by morphology, anatomy, secondary chemistry and DNA sequence data. While similar in external appearance to X. friesii, it is distinguished by forming smaller, partly coralloid squamules, by the occurrence of soralia and, in some specimens, by the presence of thamnolic acid in addition to friesiic acid in the thallus. Molecular phylogenetic results are based on nuclear (ITS and LSU) as well as mitochondrial (SSU) ribosomal DNA sequence alignments. Phylogenetic hypotheses obtained using Bayesian Inference, Maximum Likelihood and Maximum Parsimony all support X. canopeorum as a distinct evolutionary lineage belonging to the X. caradocensis–X. friesii clade. PMID:29559828
Rylková, K; Tůmová, E; Brožová, A; Jankovská, I; Vadlejch, J; Čadková, Z; Frýdlová, J; Peřinková, P; Langrová, I; Chodová, D; Nechybová, S; Scháňková, Š
2015-11-01
Trichuris sp. individuals were collected from Myocastor coypus from fancy breeder farms in the Czech Republic. Using morphological and biometrical methods, 30 female and 30 male nematodes were identified as Trichuris myocastoris. This paper presents the first molecular description of this species. The ribosomal DNA (rDNA) region, consisting of internal transcribed spacer (ITS)-1, the 5.8S gene and ITS-2, was sequenced. Based on an analysis of 651 bp, T. myocastoris was found to be different from any other Trichuris species for which published sequencing of the ITS region is available. The phylogenetic relationships were estimated using maximum parsimony and Bayesian analyses. T. myocastoris was found to be more closely related to Trichuris of rodents than to those of ruminants.
BayesPI-BAR: a new biophysical model for characterization of regulatory sequence variations
Wang, Junbai; Batmanov, Kirill
2015-01-01
Sequence variations in regulatory DNA regions are known to cause functionally important consequences for gene expression. DNA sequence variations may have an essential role in determining phenotypes and may be linked to disease; however, their identification through analysis of massive genome-wide sequencing data is a great challenge. In this work, a new computational pipeline, a Bayesian method for protein–DNA interaction with binding affinity ranking (BayesPI-BAR), is proposed for quantifying the effect of sequence variations on protein binding. BayesPI-BAR uses biophysical modeling of protein–DNA interactions to predict single nucleotide polymorphisms (SNPs) that cause significant changes in the binding affinity of a regulatory region for transcription factors (TFs). The method includes two new parameters (TF chemical potentials or protein concentrations and direct TF binding targets) that are neglected by previous methods. The new method is verified on 67 known human regulatory SNPs, of which 47 (70%) have predicted true TFs ranked in the top 10. Importantly, the performance of BayesPI-BAR, which uses principal component analysis to integrate multiple predictions from various TF chemical potentials, is found to be better than that of existing programs, such as sTRAP and is-rSNP, when evaluated on the same SNPs. BayesPI-BAR is a publicly available tool and is able to carry out parallelized computation, which helps to investigate a large number of TFs or SNPs and to detect disease-associated regulatory sequence variations in the sea of genome-wide noncoding regions. PMID:26202972
Tanaka, Keiko; Tomita, Taketeru; Suzuki, Shingo; Hosomichi, Kazuyoshi; Sano, Kazumi; Doi, Hiroyuki; Kono, Azumi; Inoko, Hidetoshi; Kulski, Jerzy K.; Tanaka, Sho
2013-01-01
Hexanchiformes is regarded as a monophyletic taxon, but the morphological and genetic relationships between the five extant species within the order are still uncertain. In this study, we determined the whole mitochondrial DNA (mtDNA) sequences of seven sharks including representatives of the five Hexanchiformes, one squaliform, and one carcharhiniform and inferred the phylogenetic relationships among those species and 12 other Chondrichthyes (cartilaginous fishes) species for which the complete mitogenome is available. The monophyly of Hexanchiformes and its close relation with all other Squaliformes sharks were strongly supported by likelihood and Bayesian phylogenetic analysis of 13,749 aligned nucleotides of 13 protein coding genes and two rRNA genes that were derived from the whole mtDNA sequences of the 19 species. The phylogeny suggested that Hexanchiformes is in the superorder Squalomorphi, Chlamydoselachus anguineus (frilled shark) is the sister species to all other Hexanchiformes, and the relations within Hexanchiformes are well resolved as Chlamydoselachus, (Notorynchus, (Heptranchias, (Hexanchus griseus, H. nakamurai))). Based on our phylogeny, we discussed evolutionary scenarios of the jaw suspension mechanism and gill slit numbers that are significant features in the sharks. PMID:24089661
Ramírez, Juan C; Torres, Carolina; Curto, María de Los A; Schijman, Alejandro G
2017-12-01
Trypanosoma cruzi has been subdivided into seven Discrete Typing Units (DTUs), TcI-TcVI and Tcbat. Two major evolutionary models have been proposed to explain the origin of hybrid lineages, but while it is widely accepted that TcV and TcVI are the result of genetic exchange between TcII and TcIII strains, the origin of TcIII and TcIV is still a matter of debate. T. cruzi satellite DNA (SatDNA) is comprised of 195 bp units organized in tandem repeats. Both TcV and TcVI stocks were found to have SatDNA copies of types TcI and TcII, whereas contradictory results were observed for TcIII stocks and no TcIV sequence has been analyzed yet. Herein, we have gone deeper into this matter, analyzing 335 distinct SatDNA sequences from 19 T. cruzi stocks representative of DTUs TcI-TcVI for phylogenetic inference. The Bayesian phylogenetic tree showed that all sequences were grouped in three major clusters, which corresponded to sequences from DTUs TcI/III, TcII and TcIV, whereas TcV and TcVI stocks had two sets of sequences distributed into the TcI/III and TcII clusters. As expected, the lowest genetic distances were found between TcI and TcIII, and between TcV and TcVI sequences, whereas the highest ones were observed between TcII and TcI/III, and between TcIV sequences and those from the remaining DTUs. In addition, signature patterns associated with specific T. cruzi lineages were identified and new primers that improved SatDNA-based qPCR sensitivity were designed. Our findings support the theory that TcIII is not the result of a hybridization event between TcI and TcII, and that TcIV had an independent origin from the other DTUs, contributing to clarifying the evolutionary history of T. cruzi lineages. Moreover, this work opens the possibility of typing samples from Chagas disease patients with low parasitic loads and improving molecular diagnostic methods of T. cruzi infection based on SatDNA sequence amplification.
Wang, Yan; Liu, Guo-Hua; Li, Jia-Yuan; Xu, Min-Jun; Ye, Yong-Gang; Zhou, Dong-Hui; Song, Hui-Qun; Lin, Rui-Qing; Zhu, Xing-Quan
2013-02-01
This study examined sequence variation in three mitochondrial DNA (mtDNA) regions, namely cytochrome c oxidase subunit 1 (cox1), NADH dehydrogenase subunit 5 (nad5) and cytochrome b (cytb), among Trichuris ovis isolates from different hosts in Guangdong Province, China. A portion of the cox1 (pcox1), nad5 (pnad5) and cytb (pcytb) genes was amplified separately from individual whipworms by PCR, and was subjected to sequencing from both directions. The size of the sequences of pcox1, pnad5 and pcytb was 618, 240 and 464 bp, respectively. Although the intra-specific sequence variations within T. ovis were 0-0.8% for pcox1, 0-0.8% for pnad5 and 0-1.9% for pcytb, the inter-specific sequence differences among members of the genus Trichuris were significantly higher, being 24.3-26.5% for pcox1, 33.7-56.4% for pnad5 and 24.8-26.1% for pcytb, respectively. Phylogenetic analyses using combined sequences of pcox1, pnad5 and pcytb, with three different computational algorithms (maximum likelihood, maximum parsimony and Bayesian inference), indicated that all of the T. ovis isolates grouped together with high statistical support. These findings demonstrated the existence of intra-specific variation in mtDNA sequences among T. ovis isolates from different hosts, and have implications for studying molecular epidemiology and population genetics of T. ovis.
Algama, Manjula; Tasker, Edward; Williams, Caitlin; Parslow, Adam C; Bryson-Richardson, Robert J; Keith, Jonathan M
2017-03-27
Computational identification of non-coding RNAs (ncRNAs) is a challenging problem. We describe a genome-wide analysis using Bayesian segmentation to identify intronic elements highly conserved between three evolutionarily distant vertebrate species: human, mouse and zebrafish. We investigate the extent to which these elements include ncRNAs (or conserved domains of ncRNAs) and regulatory sequences. We identified 655 deeply conserved intronic sequences in a genome-wide analysis. We also performed a pathway-focussed analysis on genes involved in muscle development, detecting 27 intronic elements, of which 22 were not detected in the genome-wide analysis. At least 87% of the genome-wide and 70% of the pathway-focussed elements have existing annotations indicative of conserved RNA secondary structure. The expression of 26 of the pathway-focused elements was examined using RT-PCR, providing confirmation that they include expressed ncRNAs. Consistent with previous studies, these elements are significantly over-represented in the introns of transcription factors. This study demonstrates a novel, highly effective, Bayesian approach to identifying conserved non-coding sequences. Our results complement previous findings that these sequences are enriched in transcription factors. However, in contrast to previous studies which suggest the majority of conserved sequences are regulatory factor binding sites, the majority of conserved sequences identified using our approach contain evidence of conserved RNA secondary structures, and our laboratory results suggest most are expressed. Functional roles at DNA and RNA levels are not mutually exclusive, and many of our elements possess evidence of both. Moreover, ncRNAs play roles in transcriptional and post-transcriptional regulation, and this may contribute to the over-representation of these elements in introns of transcription factors. 
We attribute the higher sensitivity of the pathway-focussed analysis compared to the genome-wide analysis to improved alignment quality, suggesting that enhanced genomic alignments may reveal many more conserved intronic sequences.
Lexer, C; Wüest, R O; Mangili, S; Heuertz, M; Stölting, K N; Pearman, P B; Forest, F; Salamin, N; Zimmermann, N E; Bossolini, E
2014-09-01
Understanding the drivers of population divergence, speciation and species persistence is of great interest to molecular ecology, especially for species-rich radiations inhabiting the world's biodiversity hotspots. The toolbox of population genomics holds great promise for addressing these key issues, especially if genomic data are analysed within a spatially and ecologically explicit context. We have studied the earliest stages of the divergence continuum in the Restionaceae, a species-rich and ecologically important plant family of the Cape Floristic Region (CFR) of South Africa, using the widespread CFR endemic Restio capensis (L.) H.P. Linder & C.R. Hardy as an example. We studied diverging populations of this morphotaxon for plastid DNA sequences and >14 400 nuclear DNA polymorphisms from Restriction site Associated DNA (RAD) sequencing and analysed the results jointly with spatial, climatic and phytogeographic data, using a Bayesian generalized linear mixed modelling (GLMM) approach. The results indicate that population divergence across the extreme environmental mosaic of the CFR is mostly driven by isolation by environment (IBE) rather than isolation by distance (IBD) for both neutral and non-neutral markers, consistent with genome hitchhiking or coupling effects during early stages of divergence. Mixed modelling of plastid DNA and single divergent outlier loci from a Bayesian genome scan confirmed the predominant role of climate and pointed to additional drivers of divergence, such as drift and ecological agents of selection captured by phytogeographic zones. Our study demonstrates the usefulness of population genomics for disentangling the effects of IBD and IBE along the divergence continuum often found in species radiations across heterogeneous ecological landscapes. © 2014 John Wiley & Sons Ltd.
Kang, Seokha; Sultana, Tahera; Eom, Keeseon S; Park, Yung Chul; Soonthornpong, Nathan; Nadler, Steven A; Park, Joong-Ki
2009-01-15
The complete mitochondrial genome sequence was determined for the human pinworm Enterobius vermicularis (Oxyurida: Nematoda) and used to infer its phylogenetic relationship to other major groups of chromadorean nematodes. The E. vermicularis genome is a 14,010-bp circular DNA molecule that encodes 36 genes (12 proteins, 22 tRNAs, and 2 rRNAs). This mtDNA genome lacks atp8, as reported for almost all other nematode species investigated. Phylogenetic analyses (maximum parsimony, maximum likelihood, neighbor joining, and Bayesian inference) of nucleotide sequences for the 12 protein-coding genes of 25 nematode species placed E. vermicularis, a representative of the order Oxyurida, as sister to the main Ascaridida+Rhabditida group. Tree topology comparisons using statistical tests rejected an alternative hypothesis favoring a closer relationship among Ascaridida, Spirurida, and Oxyurida, which has been supported from most studies based on nuclear ribosomal DNA sequences. Unlike the relatively conserved gene arrangement found for most chromadorean taxa, E. vermicularis mtDNA gene order is very unique, not sharing similarity to any other nematode species reported to date. This lack of gene order similarity may represent idiosyncratic gene rearrangements unique to this specific lineage of the oxyurids. To more fully understand the extent of gene rearrangement and its evolutionary significance within the nematode phylogenetic framework, additional mitochondrial genomes representing a greater evolutionary diversity of species must be characterized.
Defining the Estimated Core Genome of Bacterial Populations Using a Bayesian Decision Model
van Tonder, Andries J.; Mistry, Shilan; Bray, James E.; Hill, Dorothea M. C.; Cody, Alison J.; Farmer, Chris L.; Klugman, Keith P.; von Gottberg, Anne; Bentley, Stephen D.; Parkhill, Julian; Jolley, Keith A.; Maiden, Martin C. J.; Brueggemann, Angela B.
2014-01-01
The bacterial core genome is of intense interest and the volume of whole genome sequence data in the public domain available to investigate it has increased dramatically. The aim of our study was to develop a model to estimate the bacterial core genome from next-generation whole genome sequencing data and use this model to identify novel genes associated with important biological functions. Five bacterial datasets were analysed, comprising 2096 genomes in total. We developed a Bayesian decision model to estimate the number of core genes, calculated pairwise evolutionary distances (p-distances) based on nucleotide sequence diversity, and plotted the median p-distance for each core gene relative to its genome location. We designed visually-informative genome diagrams to depict areas of interest in genomes. Case studies demonstrated how the model could identify areas for further study, e.g. 25% of the core genes with higher sequence diversity in the Campylobacter jejuni and Neisseria meningitidis genomes encoded hypothetical proteins. The core gene with the highest p-distance value in C. jejuni was annotated in the reference genome as a putative hydrolase, but further work revealed that it shared sequence homology with beta-lactamase/metallo-beta-lactamases (enzymes that provide resistance to a range of broad-spectrum antibiotics) and thioredoxin reductase genes (which reduce oxidative stress and are essential for DNA replication) in other C. jejuni genomes. Our Bayesian model of estimating the core genome is principled, easy to use and can be applied to large genome datasets. This study also highlighted the lack of knowledge currently available for many core genes in bacterial genomes of significant global public health importance. PMID:25144616
Asexual-sexual morph connection in the type species of Berkleasmium.
Tanney, Joey; Miller, Andrew N
2017-06-01
Berkleasmium is a polyphyletic genus comprising 37 dematiaceous hyphomycetous species. In this study, independent collections of the type species, B. concinnum, were made from Eastern North America. Nuclear internal transcribed spacer rDNA (ITS) and partial nuc 28S large subunit rDNA (LSU) sequences obtained from collections and subsequent cultures showed that Berkleasmium concinnum is the asexual morph of Neoacanthostigma septoconstrictum (Tubeufiaceae, Tubeufiales). Phylogenies inferred from Bayesian inference and maximum likelihood analyses of ITS-LSU sequence data confirmed this asexual-sexual morph connection and a re-examination of fungarium reference specimens also revealed the co-occurrence of N. septoconstrictum ascomata and B. concinnum sporodochia. Neoacanthostigma septoconstrictum is therefore synonymized under B. concinnum on the basis of priority. A specimen identified as N. septoconstrictum from Thailand is described as N. thailandicum sp. nov., based on morphological and genetic distinctiveness.
Khan, Haseeb A; Arif, Ibrahim A; Bahkali, Ali H; Al Farhan, Ahmad H; Al Homaidan, Ali A
2008-10-06
This investigation aimed to compare the antelope phylogenies inferred from the 16S rRNA, cytochrome-b (cyt-b) and d-loop segments of mitochondrial DNA using three different computational methods: Bayesian (BA), maximum parsimony (MP) and unweighted pair group method with arithmetic mean (UPGMA). The respective nucleotide sequences of three Oryx species (Oryx leucoryx, Oryx dammah and Oryx gazella) and an out-group (Addax nasomaculatus) were aligned and subjected to BA, MP and UPGMA analyses to compare the topologies of the respective phylogenetic trees. The 16S rRNA region possessed the highest frequency of conserved sequences (97.65%) followed by cyt-b (94.22%) and d-loop (87.29%). Across the four taxa, there were few transitions (2.35%) and no transversions in 16S rRNA, as compared to cyt-b (5.61% transitions and 0.17% transversions) and d-loop (11.57% transitions and 1.14% transversions). All three mitochondrial segments clearly differentiated the genus Addax from Oryx using the BA or UPGMA models. The topologies of all the gamma-corrected Bayesian trees were identical irrespective of the marker type. The UPGMA trees resulting from 16S rRNA and d-loop sequences were also identical to the Bayesian trees (Oryx dammah grouped with Oryx leucoryx), except that the UPGMA tree based on cyt-b showed a slightly different phylogeny (Oryx dammah grouped with Oryx gazella) with low bootstrap support. However, the MP model failed to differentiate the genus Addax from Oryx. These findings demonstrate the efficiency and robustness of BA and UPGMA methods for phylogenetic analysis of antelopes using mitochondrial markers. PMID:19204824
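The conserved, transition and transversion percentages quoted above come from classifying aligned sites. The following is a minimal sketch of that classification for a single pair of aligned sequences. The function name `classify_sites` and the toy sequences are hypothetical, and the abstract compares four taxa at once, so this pairwise version is an illustration rather than the authors' procedure.

```python
# Classify aligned sites as conserved, transition, or transversion.
# Assumption: a simple pairwise comparison; sites with gaps or ambiguity
# codes in either sequence are skipped.

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
VALID = PURINES | PYRIMIDINES


def classify_sites(seq_a, seq_b):
    """Return counts of conserved sites, transitions and transversions."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    counts = {"conserved": 0, "transitions": 0, "transversions": 0}
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in VALID or b not in VALID:
            continue  # skip gaps and ambiguous bases
        if a == b:
            counts["conserved"] += 1
        elif {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            counts["transitions"] += 1  # purine<->purine or pyrimidine<->pyrimidine
        else:
            counts["transversions"] += 1  # purine<->pyrimidine
    return counts


if __name__ == "__main__":
    result = classify_sites("AAGCT", "AGGTA")
    total = sum(result.values())
    for kind, n in result.items():
        print(f"{kind}: {n} ({100.0 * n / total:.2f}%)")
```

Dividing each count by the number of compared sites gives percentages of the kind reported in the abstract.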
Torres-Carvajal, Omar; Schulte, James A; Cadle, John E
2006-04-01
The South American iguanian lizard genus Stenocercus includes 54 species occurring mostly in the Andes and adjacent lowland areas from northern Venezuela and Colombia to central Argentina at elevations of 0-4000m. Small taxon or character sampling has characterized all phylogenetic analyses of Stenocercus, which has long been recognized as sister taxon to the Tropidurus Group. In this study, we use mtDNA sequence data to perform phylogenetic analyses that include 32 species of Stenocercus and 12 outgroup taxa. Monophyly of this genus is strongly supported by maximum parsimony and Bayesian analyses. Evolutionary relationships within Stenocercus are further analyzed with a Bayesian implementation of a general mixture model, which accommodates variability in the pattern of evolution across sites. These analyses indicate a basal split of Stenocercus into two clades, one of which receives very strong statistical support. In addition, we test previous hypotheses using non-parametric and parametric statistical methods, and provide a phylogenetic classification for Stenocercus.
Mahardika, G N K; Dibia, N; Budayanti, N S; Susilawathi, N M; Subrata, K; Darwinata, A E; Wignall, F S; Richt, J A; Valdivia-Granda, W A; Sudewi, A A R
2014-06-01
The emergence of human and animal rabies in Bali since November 2008 has attracted local, national and international interest. The potential origin and time of introduction of rabies virus to Bali is described. The nucleoprotein (N) gene of rabies virus from dog brain and human clinical specimens was sequenced using an automated DNA sequencer. Phylogenetic inference with Bayesian Markov Chain Monte Carlo (MCMC) analysis using the Bayesian Evolutionary Analysis by Sampling Trees (BEAST) v. 1.7.5 software confirmed that the outbreak of rabies in Bali was caused by an Indonesian lineage virus following a single introduction. The ancestor of Bali viruses was the descendant of a virus from Kalimantan. Contact tracing showed that the event most likely occurred in early 2008. The introduction of rabies into a large unvaccinated dog population in Bali clearly demonstrates the risk of disease transmission for government agencies and should lead to an increased preparedness and efforts for sustained risk reduction to prevent such events from occurring in future.
Turner, Barbara; Paun, Ovidiu; Munzinger, Jérôme; Chase, Mark W.; Samuel, Rosabelle
2016-01-01
Background and Aims Some plant groups, especially on islands, have been shaped by strong ancestral bottlenecks and rapid, recent radiation of phenotypic characters. Single molecular markers are often not informative enough for phylogenetic reconstruction in such plant groups. Whole plastid genomes and nuclear ribosomal DNA (nrDNA) are viewed by many researchers as sources of information for phylogenetic reconstruction of groups in which expected levels of divergence in standard markers are low. Here we evaluate the usefulness of these data types to resolve phylogenetic relationships among closely related Diospyros species. Methods Twenty-two closely related Diospyros species from New Caledonia were investigated using whole plastid genomes and nrDNA data from low-coverage next-generation sequencing (NGS). Phylogenetic trees were inferred using maximum parsimony, maximum likelihood and Bayesian inference on separate plastid and nrDNA and combined matrices. Key Results The plastid and nrDNA sequences were, singly and together, unable to provide well supported phylogenetic relationships among the closely related New Caledonian Diospyros species. In the nrDNA, a 6-fold greater percentage of parsimony-informative characters compared with plastid DNA was found, but the total number of informative sites was greater for the much larger plastid DNA genomes. Combining the plastid and nuclear data improved resolution. Plastid results showed a trend towards geographical clustering of accessions rather than following taxonomic species. Conclusions In plant groups in which multiple plastid markers are not sufficiently informative, an investigation at the level of the entire plastid genome may also not be sufficient for detailed phylogenetic reconstruction. 
Sequencing of complete plastid genomes and nrDNA repeats seems to clarify some relationships among the New Caledonian Diospyros species, but the higher percentage of parsimony-informative characters in nrDNA compared with plastid DNA did not help to resolve the phylogenetic tree because the total number of variable sites was much lower than in the entire plastid genome. The geographical clustering of the individuals against a background of overall low sequence divergence could indicate transfer of plastid genomes due to hybridization and introgression following secondary contact. PMID:27098088
Hailer, Frank; Kutschera, Verena E; Hallström, Björn M; Fain, Steven R; Leonard, Jennifer A; Arnason, Ulfur; Janke, Axel
2013-03-29
Nakagome et al. reanalyzed some of our data and assert that we cannot refute the mitochondrial DNA-based scenario for polar bear evolution. Their single-locus test statistic is strongly affected by introgression and incomplete lineage sorting, whereas our multilocus approaches are better suited to recover the true species relationships. Indeed, our sister-lineage model receives high support in a Bayesian model comparison.
Pan, Gaofeng; Jiang, Limin; Tang, Jijun; Guo, Fei
2018-02-08
DNA methylation is an important biochemical process, and it has a close connection with many types of cancer. Research about DNA methylation can help us to understand the regulation mechanism and epigenetic reprogramming. Therefore, it becomes very important to recognize the methylation sites in the DNA sequence. In the past several decades, many computational methods, especially machine learning methods, have been developed since high-throughput sequencing technology became widely used in research and industry. In order to accurately identify whether or not a nucleotide residue is methylated under the specific DNA sequence context, we propose a novel method that overcomes the shortcomings of previous methods for predicting methylation sites. We use k-gram, multivariate mutual information, discrete wavelet transform, and pseudo amino acid composition to extract features, and train a sparse Bayesian learning model to do DNA methylation prediction. Five criteria (area under the receiver operating characteristic curve (AUC), Matthews correlation coefficient (MCC), accuracy (ACC), sensitivity (SN), and specificity) are used to evaluate the prediction results of our method. On the benchmark dataset, we could reach 0.8632 on AUC, 0.8017 on ACC, 0.5558 on MCC, and 0.7268 on SN. Additionally, the best results on two scBS-seq profiled mouse embryonic stem cells datasets were 0.8896 and 0.9511 by AUC, respectively. When compared with other outstanding methods, our method surpassed them on the accuracy of prediction. The improvement of AUC by our method compared to other methods was at least 0.0399. For the convenience of other researchers, our code has been uploaded to a file hosting service, and can be downloaded from: https://figshare.com/s/0697b692d802861282d3.
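The five evaluation criteria named above have standard definitions, and the sketch below computes them from binary labels, hard predictions and continuous scores. The function names and the toy data are hypothetical, and the code is not taken from the authors' repository. AUC is computed here as the fraction of positive/negative pairs in which the positive example scores higher, with ties counted as one half.

```python
import math


def binary_metrics(y_true, y_pred):
    """ACC, SN (sensitivity), SP (specificity) and MCC from 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "ACC": (tp + tn) / (tp + tn + fp + fn),
        "SN": tp / (tp + fn) if (tp + fn) else 0.0,
        "SP": tn / (tn + fp) if (tn + fp) else 0.0,
        # MCC is conventionally set to 0 when any marginal count is zero.
        "MCC": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


def auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    y_true = [1, 1, 1, 0, 0, 0]
    scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]
    y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 cut-off
    print(binary_metrics(y_true, y_pred))
    print("AUC:", auc(y_true, scores))
```

The pairwise AUC is quadratic in the number of examples, which is fine for illustration; a rank-based formulation would be preferable on large benchmark sets.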
Kim, Min Jee; Choi, Sei-Woong; Kim, Iksoo
2015-04-10
Saturnia (Rinaca) jonasii Butler, 1877 is distributed in Japan, including Tsushima Island and Taiwan, whereas S. boisduvalii Eversmann, 1846 is distributed in northern areas, such as China, Russia, and South Korea. In the present study we found that the specimens from Mt. Hallasan on Jejudo, a southern remote offshore island, were S. jonasii, rather than S. boisduvalii based on morphology, DNA barcode, and nuclear elongation factor 1 alpha (EF-1α) sequences. The major morphological differences between the two species included the shape of wing pattern elements of fore- and hindwings and male and female genitalia. A DNA barcode analysis of the sequences of the Jejudo specimens and S. boisduvalii, along with those of Saturnia species obtained from a public database showed a minimum sequence divergence of 4.26% (28 bp). A phylogenetic analysis also showed clustering of the Jejudo specimens with S. jonasii, separating S. boisduvalii (Bayesian posterior probability = 0.99). The EF-1α-based sequence and phylogenetic analyses of the two species from Jejudo Island and the Korean mainland showed the uniqueness of the Jejudo specimens from S. boisduvalii collected on the Korean mainland, indicating distribution of S. jonasii on Jejudo Island in South Korea, instead of S. boisduvalii.
Hurtado, Luis A; Santamaria, Carlos A; Fitzgerald, Lee A
2014-05-06
The phylogenetic position of the critically endangered Saint Croix ground lizard Ameiva polops is presently unknown and several hypotheses have been proposed. We investigated the phylogenetic position of this species using molecular phylogenetic methods. We obtained sequences of DNA fragments of the mitochondrial ribosomal genes 12S rDNA and 16S rDNA for this species. We aligned these sequences with published sequences of other Ameiva species, which include most of the Ameiva species from the West Indies, three Ameiva species from Central America and South America, and one from the teiid lizard Tupinambis teguixin, which was used as outgroup. We conducted Maximum Likelihood and Bayesian phylogenetic analyses. The phylogenetic reconstructions among the different methods were very similar, supporting the monophyly of West Indian Ameiva and showing within this lineage, a basal polytomy of four clades that are separated geographically. Ameiva polops grouped in a cluster that included the other two Ameiva species found in the Puerto Rican Bank: A. wetmorei and A. exsul. A sister relationship between A. polops and A. wetmorei is suggested by our analyses. We compare our results with a previous study on molecular systematics of West Indian Ameiva.
Iftikhar, Romana; Ashfaq, Muhammad; Rasool, Akhtar; Hebert, Paul D N
2016-01-01
Although thrips are globally important crop pests and vectors of viral disease, species identifications are difficult because of their small size and inconspicuous morphological differences. Sequence variation in the mitochondrial COI-5' (DNA barcode) region has proven effective for the identification of species in many groups of insect pests. We analyzed barcode sequence variation among 471 thrips from various plant hosts in north-central Pakistan. The Barcode Index Number (BIN) system assigned these sequences to 55 BINs, while the Automatic Barcode Gap Discovery detected 56 partitions, a count that coincided with the number of monophyletic lineages recognized by Neighbor-Joining analysis and Bayesian inference. Congeneric species showed an average of 19% sequence divergence (range = 5.6% - 27%) at COI, while intraspecific distances averaged 0.6% (range = 0.0% - 7.6%). BIN analysis suggested that all intraspecific divergence >3.0% actually involved a species complex. In fact, sequences for three major pest species (Haplothrips reuteri, Thrips palmi, Thrips tabaci), and one predatory thrips (Aeolothrips intermedius) showed deep intraspecific divergences, providing evidence that each is a cryptic species complex. The study compiles the first barcode reference library for the thrips of Pakistan, and examines global haplotype diversity in four important pest thrips.
Haake, David A.; Suchard, Marc A.; Kelley, Melissa M.; Dundoo, Manjula; Alt, David P.; Zuerner, Richard L.
2004-01-01
Leptospires belong to a genus of parasitic bacterial spirochetes that have adapted to a broad range of mammalian hosts. Mechanisms of leptospiral molecular evolution were explored by sequence analysis of four genes shared by 38 strains belonging to the core group of pathogenic Leptospira species: L. interrogans, L. kirschneri, L. noguchii, L. borgpetersenii, L. santarosai, and L. weilii. The 16S rRNA and lipL32 genes were highly conserved, and the lipL41 and ompL1 genes were significantly more variable. Synonymous substitutions are distributed throughout the ompL1 gene, whereas nonsynonymous substitutions are clustered in four variable regions encoding surface loops. While phylogenetic trees for the 16S, lipL32, and lipL41 genes were relatively stable, 8 of 38 (20%) ompL1 sequences had mosaic compositions consistent with horizontal transfer of DNA between related bacterial species. A novel Bayesian multiple change point model was used to identify the most likely sites of recombination and to determine the phylogenetic relatedness of the segments of the mosaic ompL1 genes. Segments of the mosaic ompL1 genes encoding two of the surface-exposed loops were likely acquired by horizontal transfer from a peregrine allele of unknown ancestry. Identification of the most likely sites of recombination with the Bayesian multiple change point model, an approach which has not previously been applied to prokaryotic gene sequence analysis, serves as a model for future studies of recombination in molecular evolution of genes. PMID:15090524
Hofman, Sebastian; Pabijan, Maciej; Osikowski, Artur; Litvinchuk, Spartak N; Szymura, Jacek M
2016-09-01
We present the full-length mitogenome sequences of four European water frog species: Pelophylax cypriensis, P. epeiroticus, P. kurtmuelleri and P. shqipericus. The mtDNA size varied from 17,363 to 17,895 bp, and its organization with the LPTF tRNA gene cluster preceding the 12S rRNA gene displayed the typical Neobatrachian arrangement. Maximum likelihood and Bayesian inference revealed a well-resolved mtDNA phylogeny of seven European Pelophylax species. The uncorrected p-distance among Pelophylax mitogenomes was 9.6 (range 0.01-0.13). Most divergent was the P. shqipericus mitogenome, clustering with the "P. lessonae" group, in contrast to the other three new Pelophylax mitogenomes related to the "P. bedriagae/ridibundus" lineage. The new mitogenomes resolve ambiguities of the phylogenetic placement of P. cretensis and P. epeiroticus.
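The uncorrected p-distance mentioned above is the proportion of aligned sites at which two sequences differ, with no correction for multiple substitutions. The following is a minimal sketch of how a mean and range over all pairs might be computed. The function names and the toy alignment are hypothetical, and skipping gapped or ambiguous sites pairwise is an assumption, not necessarily what the authors did.

```python
from itertools import combinations

VALID = set("ACGT")


def p_distance(seq_a, seq_b):
    """Proportion of compared sites that differ (uncorrected p-distance)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    diffs = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in VALID or b not in VALID:
            continue  # pairwise deletion of gaps and ambiguous bases
        compared += 1
        if a != b:
            diffs += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return diffs / compared


def summarize(alignment):
    """Mean, minimum and maximum p-distance over all sequence pairs."""
    dists = [p_distance(alignment[x], alignment[y])
             for x, y in combinations(sorted(alignment), 2)]
    return sum(dists) / len(dists), min(dists), max(dists)


if __name__ == "__main__":
    toy = {"sp_a": "ACGT", "sp_b": "ACGA", "sp_c": "TCGA"}
    mean, low, high = summarize(toy)
    print(f"mean={mean:.3f} range={low:.2f}-{high:.2f}")
```

The result is a proportion between 0 and 1; multiplying by 100 gives the percentage form often used when reporting sequence divergence.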
Mammoth and Elephant Phylogenetic Relationships: Mammut Americanum, the Missing Outgroup
Orlando, Ludovic; Hänni, Catherine; Douady, Christophe J.
2007-01-01
At the morphological level, the woolly mammoth has most often been considered as the sister-species of Asian elephants, but at the DNA level, different studies have found support for proximity with African elephants. Recent reports have increased the available sequence data and apparently solved the discrepancy, finding mammoths to be most closely related to Asian elephants. However, we demonstrate here that the three competing topologies have similar likelihood, Bayesian and parsimony supports. The analysis further suggests the inadequacy of using Sirenia or Hyracoidea as outgroups. We therefore argue that orthologous sequences from the extinct American mastodon will be required to definitively solve this long-standing question. PMID:19430604
Salazar, Gerardo A.; Cabrera, Lidia I.; Madriñán, Santiago; Chase, Mark W.
2009-01-01
Background and Aims Phylogenetic relationships of subtribes Cranichidinae and Prescottiinae, two diverse groups of neotropical terrestrial orchids, are not satisfactorily understood. A previous molecular phylogenetic study supported monophyly for Cranichidinae, but Prescottiinae consisted of two clades not sister to one another. However, that analysis included only 11 species and eight genera of these subtribes. Here, plastid and nuclear DNA sequences are analysed for an enlarged sample of genera and species of Cranichidinae and Prescottiinae with the aim of clarifying their relationships, evaluating the phylogenetic position of the monospecific genera Exalaria, Ocampoa and Pseudocranichis and examining the value of various structural traits as taxonomic markers. Methods Approx. 6000 bp of nucleotide sequences from nuclear ribosomal (ITS) and plastid DNA (rbcL, matK-trnK and trnL-trnF) were analysed with cladistic parsimony and Bayesian inference for 45 species/14 genera of Cranichidinae and Prescottiinae (plus suitable outgroups). The utility of flower orientation, thickenings of velamen cell walls, hamular viscidium and pseudolabellum to mark clades recovered by the molecular analysis was assessed by tracing these characters on the molecular trees. Key Results Spiranthinae, Cranichidinae, paraphyletic Prescottia (with Pseudocranichis embedded), and a group of mainly Andean ‘prescottioid’ genera (the ‘Stenoptera clade’) were strongly supported. Relationships among these clades were unresolved by parsimony but the Bayesian tree provided moderately strong support for the resolution (Spiranthinae–(Stenoptera clade–(Prescottia/Pseudocranichis–Cranichidinae))). Three of the four structural characters mark clades on the molecular trees, but the possession of a pseudolabellum is variable in the polyphyletic Ponthieva. Conclusions No evidence was found for monophyly of Prescottiinae and the reinstatement of Cranichidinae s.l. (including the genera of ‘Prescottiinae’) is favoured. Cranichidinae s.l. are diagnosed by non-resupinate flowers. Lack of support from parsimony for relationships among the major clades of core spiranthids is suggestive of a rapid morphological radiation or a slow rate of molecular evolution. PMID:19136493
Matsumoto, Toshimi; Okumura, Naohiko; Uenishi, Hirohide; Hayashi, Takeshi; Hamasima, Noriyuki; Awata, Takashi
2012-01-01
We have collected more than 190000 porcine expressed sequence tags (ESTs) from full-length complementary DNA (cDNA) libraries and identified more than 2800 single nucleotide polymorphisms (SNPs). In this study, we tentatively chose 222 SNPs observed in assembled ESTs to study pigs of different breeds; 104 were selected by comparing the cDNA sequences of a Meishan pig and samples of three-way cross pigs (Landrace, Large White, and Duroc: LWD), and 118 were selected from LWD samples. To evaluate the genetic variation between the chosen SNPs from pig breeds, we determined the genotypes for 192 pig samples (11 pig groups) from our DNA reference panel with matrix-assisted laser desorption ionization time-of-flight mass spectrometry. Of the 222 reference SNPs, 186 were successfully genotyped. A neighbor-joining tree showed that the pig groups were classified into two large clusters, namely, Euro-American and East Asian pig populations. F-statistics and the analysis of molecular variance of Euro-American pig groups revealed that approximately 25% of the genetic variations occurred because of intergroup differences. As the F(IS) values were less than the F(ST) values, the clustering, based on the Bayesian inference, implied that there was strong genetic differentiation among pig groups and less divergence within the groups in our samples. © 2011 The Authors. Animal Science Journal © 2011 Japanese Society of Animal Science.
Scanning sequences after Gibbs sampling to find multiple occurrences of functional elements
Tharakaraman, Kannan; Mariño-Ramírez, Leonardo; Sheetlin, Sergey L; Landsman, David; Spouge, John L
2006-01-01
Background Many DNA regulatory elements occur as multiple instances within a target promoter. Gibbs sampling programs for finding DNA regulatory elements de novo can be prohibitively slow in locating all instances of such an element in a sequence set. Results We describe an improvement to the A-GLAM computer program, which predicts regulatory elements within DNA sequences with Gibbs sampling. The improvement adds an optional "scanning step" after Gibbs sampling. Gibbs sampling produces a position specific scoring matrix (PSSM). The new scanning step resembles an iterative PSI-BLAST search based on the PSSM. First, it assigns an "individual score" to each subsequence of appropriate length within the input sequences using the initial PSSM. Second, it computes an E-value from each individual score, to assess the agreement between the corresponding subsequence and the PSSM. Third, it permits subsequences with E-values falling below a threshold to contribute to the underlying PSSM, which is then updated using the Bayesian calculus. A-GLAM iterates its scanning step to convergence, at which point no new subsequences contribute to the PSSM. After convergence, A-GLAM reports predicted regulatory elements within each sequence in order of increasing E-values, so users have a statistical evaluation of the predicted elements in a convenient presentation. Thus, although the Gibbs sampling step in A-GLAM finds at most one regulatory element per input sequence, the scanning step can now rapidly locate further instances of the element in each sequence. Conclusion Datasets from experiments determining the binding sites of transcription factors were used to evaluate the improvement to A-GLAM. Typically, the datasets included several sequences containing multiple instances of a regulatory motif. The improvements to A-GLAM permitted it to predict the multiple instances. PMID:16961919
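The scanning step described in the abstract above can be sketched roughly as follows. This is a minimal illustration and not A-GLAM's code: a plain log-odds score threshold stands in for the E-value test, fixed pseudocounts stand in for the Bayesian update of the PSSM, and all names (`build_pssm`, `scan_to_convergence`, `threshold`) are hypothetical. Sequences are assumed to contain only A, C, G and T.

```python
import math

BASES = "ACGT"

def build_pssm(sites, pseudocount=1.0, background=0.25):
    """Log-odds PSSM from equal-length sites; pseudocounts act as a simple prior."""
    pssm = []
    for col in range(len(sites[0])):
        counts = {b: pseudocount for b in BASES}
        for site in sites:
            counts[site[col]] += 1
        total = sum(counts.values())
        pssm.append({b: math.log2(counts[b] / total / background) for b in BASES})
    return pssm

def score(pssm, window):
    """Individual score of one window: sum of per-column log-odds."""
    return sum(column[base] for column, base in zip(pssm, window))

def scan_to_convergence(sequences, seed_positions, width, threshold, max_rounds=50):
    """Grow the set of (sequence index, offset) hits until no new window qualifies."""
    hits = set(seed_positions)  # e.g. one site per sequence from a Gibbs sampler
    for _ in range(max_rounds):
        pssm = build_pssm([sequences[i][j:j + width] for i, j in sorted(hits)])
        new_hits = hits | {
            (i, j)
            for i, seq in enumerate(sequences)
            for j in range(len(seq) - width + 1)
            if score(pssm, seq[j:j + width]) >= threshold
        }
        if new_hits == hits:  # converged: no new subsequence contributes
            break
        hits = new_hits
    return sorted(hits)
```

Because the hit set can only grow and is bounded by the number of windows, the loop always terminates; a real implementation would rank the reported windows by E-value rather than by position.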
Evolution of helotialean fungi (Leotiomycetes, Pezizomycotina): a nuclear rDNA phylogeny.
Wang, Zheng; Binder, Manfred; Schoch, Conrad L; Johnston, Peter R; Spatafora, Joseph W; Hibbett, David S
2006-11-01
The highly divergent characters of morphology, ecology, and biology in the Helotiales make it one of the most problematic groups in traditional classification and molecular phylogeny. Sequences of three rDNA regions, SSU, LSU, and 5.8S rDNA, were generated for 50 helotialean fungi, representing 11 out of 13 families in the current classification. Data sets with different compositions were assembled, and parsimony and Bayesian analyses were performed. The phylogenetic distribution of lifestyle and ecological factors was assessed. Plant endophytism is distributed across multiple clades in the Leotiomycetes. Our results suggest that (1) the inclusion of LSU rDNA and a wider taxon sampling greatly improves resolution of the Helotiales phylogeny, however, the usefulness of rDNA in resolving the deep relationships within the Leotiomycetes is limited; (2) a new class Geoglossomycetes, including Geoglossum, Trichoglossum, and Sarcoleotia, is the basal lineage of the Leotiomyceta; (3) the Leotiomycetes, including the Helotiales, Erysiphales, Cyttariales, Rhytismatales, and Myxotrichaceae, is monophyletic; and (4) nine clades can be recognized within the Helotiales.
Bonello, Nicolas; Sampson, James; Burn, John; Wilson, Ian J; McGrown, Gail; Margison, Geoff P; Thorncroft, Mary; Crossbie, Philip; Povey, Andrew C; Santibanez-Koref, Mauro; Walters, Kevin
2013-11-07
We exploit model-based Bayesian inference methodologies to analyse lung tumour-derived methylation data from a CpG island in the O6-methylguanine-DNA methyltransferase (MGMT) promoter. Interest is in modelling the changes in methylation patterns in a CpG island in the first exon of the promoter during lung tumour development. We propose four competing models of methylation state propagation based on two mechanisms. The first is the location-dependence mechanism in which the probability of a gain or loss of methylation at a CpG within the promoter depends upon its location in the CpG sequence. The second mechanism is that of neighbour-dependence in which gain or loss of methylation at a CpG depends upon the methylation status of the immediately preceding CpG. Our data comprise the methylation status at 12 CpGs near the 5' end of the CpG island in two lung tumour samples for both alleles of a nearby polymorphism. We use approximate Bayesian computation, a computationally intensive rejection-sampling algorithm, to infer model parameters and compare models without the need to evaluate the likelihood function. We compare the four proposed models using two criteria: the approximate Bayes factors and the distribution of the Euclidean distance between the summary statistics of the observed and simulated datasets. Our model-based analysis demonstrates compelling evidence for both location and neighbour dependence in the process of aberrant DNA methylation of this MGMT promoter CpG island in lung tumours. We find equivocal evidence to support the hypothesis that the methylation patterns of the two alleles evolve independently. © 2013 Published by Elsevier Ltd. All rights reserved.
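The rejection-sampling idea behind approximate Bayesian computation, as used in the abstract above, can be illustrated with a deliberately simple sketch. The model here is a toy assumption and not one of the four models in the study: each of 12 CpGs is methylated independently with an unknown probability, the summary statistic is the number of methylated sites, and the prior is uniform. The names `simulate` and `abc_rejection` are hypothetical.

```python
import random

def simulate(p_gain, n_sites, rng):
    """Toy simulator (an assumption): number of sites methylated independently."""
    return sum(rng.random() < p_gain for _ in range(n_sites))

def abc_rejection(observed, n_sites, n_draws, tolerance, seed=0):
    """Keep prior draws whose simulated summary lies within `tolerance` of the data."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        p = rng.random()  # draw the parameter from a uniform prior on [0, 1)
        distance = abs(simulate(p, n_sites, rng) - observed)  # 1-D Euclidean distance
        if distance <= tolerance:
            accepted.append(p)
    return accepted
```

The accepted draws approximate the posterior without any likelihood evaluation. Running the same procedure under two competing simulators and comparing their acceptance rates gives a crude approximate Bayes factor, under equal prior model probabilities.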
Pereira, Sergio L; Johnson, Kevin P; Clayton, Dale H; Baker, Allan J
2007-08-01
Phylogenetic relationships among genera of pigeons and doves (Aves, Columbiformes) have not been fully resolved because of limited sampling of taxa and characters in previous studies. We therefore sequenced multiple nuclear and mitochondrial DNA genes totaling over 9000 bp from 33 of 41 genera plus 8 outgroup taxa, and, together with sequences from 5 other pigeon genera retrieved from GenBank, recovered a strong phylogenetic hypothesis for the Columbiformes. Three major clades were recovered with the combined data set, comprising the basally branching New World pigeons and allies (clade A) that are sister to Neotropical ground doves (clade B), and the Afro-Eurasian and Australasian taxa (clade C). None of these clades supports the monophyly of current families and subfamilies. The extinct, flightless dodo and solitaires (Raphidae) were embedded within pigeons and doves (Columbidae) in clade C, and monophyly of the subfamily Columbinae was refuted because the remaining subfamilies were nested within it. Divergence times estimated using a Bayesian framework suggest that Columbiformes diverged from outgroups such as Apodiformes and Caprimulgiformes in the Cretaceous before the mass extinction that marks the end of this period. Bayesian and maximum likelihood inferences of ancestral areas, accounting for phylogenetic uncertainty and divergence times, respectively, favor an ancient origin of Columbiformes in the Neotropical portion of what was then Gondwana. The radiation of modern genera of Columbiformes started in the Early Eocene to the Middle Miocene, as previously estimated for other avian groups such as ratites, tinamous, galliform birds, penguins, shorebirds, parrots, passerine birds, and toucans. Multiple dispersals of more derived Columbiformes between Australasian and Afro-Eurasian regions are required to explain current distributions.
A Bayesian Assessment of Seismic Semi-Periodicity Forecasts
NASA Astrophysics Data System (ADS)
Nava, F.; Quinteros, C.; Glowacka, E.; Frez, J.
2016-01-01
Among the schemes for earthquake forecasting, the search for semi-periodicity during large earthquakes in a given seismogenic region plays an important role. When considering earthquake forecasts based on semi-periodic sequence identification, the Bayesian formalism is a useful tool for: (1) assessing how well a given earthquake satisfies a previously made forecast; (2) re-evaluating the semi-periodic sequence probability; and (3) testing other prior estimations of the sequence probability. A comparison of Bayesian estimates with updated estimates of semi-periodic sequences that incorporate new data not used in the original estimates shows extremely good agreement, indicating that: (1) the probability that a semi-periodic sequence is not due to chance is an appropriate estimate for the prior sequence probability estimate; and (2) the Bayesian formalism does a very good job of estimating corrected semi-periodicity probabilities, using slightly less data than that used for updated estimates. The Bayesian approach is exemplified explicitly by its application to the Parkfield semi-periodic forecast, and results are given for its application to other forecasts in Japan and Venezuela.
Bayesian clustering of DNA sequences using Markov chains and a stochastic partition model.
Jääskinen, Väinö; Parkkinen, Ville; Cheng, Lu; Corander, Jukka
2014-02-01
In many biological applications it is necessary to cluster DNA sequences into groups that represent underlying organismal units, such as named species or genera. In metagenomics this grouping needs typically to be achieved on the basis of relatively short sequences which contain different types of errors, making the use of a statistical modeling approach desirable. Here we introduce a novel method for this purpose by developing a stochastic partition model that clusters Markov chains of a given order. The model is based on a Dirichlet process prior and we use conjugate priors for the Markov chain parameters which enables an analytical expression for comparing the marginal likelihoods of any two partitions. To find a good candidate for the posterior mode in the partition space, we use a hybrid computational approach which combines the EM-algorithm with a greedy search. This is demonstrated to be faster and yield highly accurate results compared to earlier suggested clustering methods for the metagenomics application. Our model is fairly generic and could also be used for clustering of other types of sequence data for which Markov chains provide a reasonable way to compress information, as illustrated by experiments on shotgun sequence type data from an Escherichia coli strain.
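With conjugate Dirichlet priors on the rows of a Markov chain transition matrix, the marginal likelihood of a cluster has a closed form, which is what makes comparing two partitions analytically tractable. The sketch below assumes a symmetric Dirichlet prior with a hypothetical hyperparameter `alpha`, ignores the initial-state distribution, and assumes sequences over A, C, G, T only; it is not the authors' implementation, and the function names are illustrative.

```python
from collections import Counter
from math import lgamma

ALPHABET = "ACGT"

def transition_counts(sequences, order=1):
    """Count (context, next base) pairs for a Markov chain of the given order."""
    counts = Counter()
    for seq in sequences:
        for i in range(order, len(seq)):
            counts[(seq[i - order:i], seq[i])] += 1
    return counts

def log_marginal_likelihood(counts, alpha=0.5):
    """Dirichlet-multinomial log marginal likelihood, summed over observed contexts."""
    k = len(ALPHABET)
    total = 0.0
    for ctx in {context for context, _ in counts}:
        row = [counts[(ctx, base)] for base in ALPHABET]
        total += lgamma(k * alpha) - lgamma(k * alpha + sum(row))
        total += sum(lgamma(alpha + n) - lgamma(alpha) for n in row)
    return total

def log_merge_gain(cluster_a, cluster_b, order=1, alpha=0.5):
    """Log marginal likelihood of the merged cluster minus that of the two kept apart."""
    a = transition_counts(cluster_a, order)
    b = transition_counts(cluster_b, order)
    return (log_marginal_likelihood(a + b, alpha)
            - log_marginal_likelihood(a, alpha)
            - log_marginal_likelihood(b, alpha))
```

A positive gain favours merging the two clusters and a negative gain favours keeping them separate; a greedy search over partitions could repeatedly apply the merge with the largest gain. The partition prior is omitted here for brevity.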
Wood, Dustin A; Fisher, Robert N; Reeder, Tod W
2008-02-01
Mitochondrial DNA (mtDNA) sequence variation was examined in 131 individuals of the Rosy Boa (Lichanura trivirgata) from across the species range in southwestern North America. Bayesian inference and nested clade phylogeographic analyses (NCPA) were used to estimate relationships and infer evolutionary processes. These patterns were evaluated as they relate to previously hypothesized vicariant events and new insights are provided into the biogeographic and evolutionary processes important in Baja California and surrounding North American deserts. Three major lineages (Lineages A, B, and C) are revealed with very little overlap. Lineage A and B are predominately separated along the Colorado River and are found primarily within California and Arizona (respectively), while Lineage C consists of disjunct groups distributed along the Baja California peninsula as well as south-central Arizona, southward along the coastal regions of Sonora, Mexico. Estimated divergence time points (using a Bayesian relaxed molecular clock) and geographic congruence with postulated vicariant events suggest early extensions of the Gulf of California and subsequent development of the Colorado River during the Late Miocene-Pliocene led to the formation of these mtDNA lineages. Our results also suggest that vicariance hypotheses alone do not fully explain patterns of genetic variation. Therefore, we highlight the importance of dispersal to explain these patterns and current distribution of populations. We also compare the mtDNA lineages with those based on morphological variation and evaluate their implications for taxonomy.
Bohling, Justin H; Waits, Lisette P
2011-05-01
Predicting spatial patterns of hybridization is important for evolutionary and conservation biology, yet it is hampered by poor understanding of how hybridizing species can interact. This is especially pertinent in contact zones where hybridizing populations are sympatric. In this study, we examined the extent of red wolf (Canis rufus) colonization and introgression where the species contacts a coyote (C. latrans) population in North Carolina, USA. We surveyed 22,000 km² in the winter of 2008 for scat and identified individual canids through genetic analysis. Of 614 collected scats, 250 were assigned to canids by mitochondrial DNA (mtDNA) sequencing. Canid samples were genotyped at 6-17 microsatellite loci (nDNA) and assigned to species using three admixture criteria implemented in two Bayesian clustering programs. We genotyped 82 individuals but none were identified as red wolves. Two individuals had red wolf mtDNA but no significant red wolf nDNA ancestry. One individual possessed significant red wolf nDNA ancestry (approximately 30%) using all criteria, although seven other individuals showed evidence of red wolf ancestry (11-21%) using the relaxed criterion. Overall, seven individuals were classified as hybrids using the conservative criteria and 37 using the relaxed criterion. We found evidence of dog (C. familiaris) and gray wolf (C. lupus) introgression into the coyote population. We compared the performance of different methods and criteria by analyzing known red wolves and hybrids. These results suggest that red wolf colonization and introgression in North Carolina is minimal and provide insights into the utility of Bayesian clustering methods to detect hybridization. © 2011 Blackwell Publishing Ltd.
TOWARD A MOLECULAR PHYLOGENY FOR PEROMYSCUS: EVIDENCE FROM MITOCHONDRIAL CYTOCHROME-b SEQUENCES
Bradley, Robert D.; Durish, Nevin D.; Rogers, Duke S.; Miller, Jacqueline R.; Engstrom, Mark D.; Kilpatrick, C. William
2009-01-01
One hundred DNA sequences from the mitochondrial cytochrome-b gene of 44 species of deer mice (Peromyscus (sensu stricto)), 1 of Habromys, 1 of Isthmomys, 2 of Megadontomys, and the monotypic genera Neotomodon, Osgoodomys, and Podomys were used to develop a molecular phylogeny for Peromyscus. Phylogenetic analyses (maximum parsimony, maximum likelihood, and Bayesian inference) were conducted to evaluate alternative hypotheses concerning taxonomic arrangements (sensu stricto versus sensu lato) of the genus. In all analyses, monophyletic clades were obtained that corresponded to species groups proposed by previous authors; however, relationships among species groups generally were poorly resolved. The concept of the genus Peromyscus based on molecular data differed significantly from the most current taxonomic arrangement. Maximum-likelihood and Bayesian trees depicted strong support for a clade placing Habromys, Megadontomys, Neotomodon, Osgoodomys, and Podomys within Peromyscus. If Habromys, Megadontomys, Neotomodon, Osgoodomys, and Podomys are regarded as genera, then several species groups within Peromyscus (sensu stricto) should be elevated to generic rank. Isthmomys was associated with the genus Reithrodontomys; in turn this clade was sister to Baiomys, indicating a distant relationship of Isthmomys to Peromyscus. A formal taxonomic revision awaits synthesis of additional sequence data from nuclear markers together with inclusion of available allozymic and karyotypic data. PMID:19924266
Porter, Teresita M; Gibson, Joel F; Shokralla, Shadi; Baird, Donald J; Golding, G Brian; Hajibabaei, Mehrdad
2014-01-01
Current methods to identify unknown insect (class Insecta) cytochrome c oxidase (COI barcode) sequences often rely on thresholds of distances that can be difficult to define, sequence similarity cut-offs, or monophyly. Some of the most commonly used metagenomic classification methods do not provide a measure of confidence for the taxonomic assignments they provide. The aim of this study was to use a naïve Bayesian classifier (Wang et al. Applied and Environmental Microbiology, 2007; 73: 5261) to automate taxonomic assignments for large batches of insect COI sequences such as data obtained from high-throughput environmental sequencing. This method provides rank-flexible taxonomic assignments with an associated bootstrap support value, and it is faster than the blast-based methods commonly used in environmental sequence surveys. We have developed and rigorously tested the performance of three different training sets using leave-one-out cross-validation, two field data sets, and targeted testing of Lepidoptera, Diptera and Mantodea sequences obtained from the Barcode of Life Data system. We found that type I error rates, incorrect taxonomic assignments with a high bootstrap support, were already relatively low but could be lowered further by ensuring that all query taxa are actually present in the reference database. Choosing bootstrap support cut-offs according to query length and summarizing taxonomic assignments to more inclusive ranks can also help to reduce error while retaining the maximum number of assignments. Additionally, we highlight gaps in the taxonomic and geographic representation of insects in public sequence databases that will require further work by taxonomists to improve the quality of assignments generated using any method.
Dor, Roi; Carling, Matthew D; Lovette, Irby J; Sheldon, Frederick H; Winkler, David W
2012-10-01
The New World swallow genus Tachycineta comprises nine species that collectively have a wide geographic distribution and remarkable variation both within- and among-species in ecologically important traits. Existing phylogenetic hypotheses for Tachycineta are based on mitochondrial DNA sequences, thus they provide estimates of a single gene tree. In this study we sequenced multiple individuals from each species at 16 nuclear intron loci. We used gene concatenated approaches (Bayesian and maximum likelihood) as well as coalescent-based species tree inference to reconstruct phylogenetic relationships of the genus. We examined the concordance and conflict between the nuclear and mitochondrial trees and between concatenated and coalescent-based inferences. Our results provide an alternative phylogenetic hypothesis to the existing mitochondrial DNA estimate of phylogeny. This new hypothesis provides a more accurate framework in which to explore trait evolution and examine the evolution of the mitochondrial genome in this group. Copyright © 2012 Elsevier Inc. All rights reserved.
Hochbach, Anne; Schneider, Julia; Röser, Martin
2015-06-01
To investigate phylogenetic relationships within the grass subfamily Pooideae we studied about 50 taxa covering all recognized tribes, using one plastid DNA (cpDNA) marker (matK gene-3'trnK exon) and, for the first time, four nuclear single copy gene loci. DNA sequence information from two parts of the nuclear genes topoisomerase 6 (Topo6) spanning the exons 8-13 and 17-19, the exons 9-13 encoding plastid acetyl-CoA-carboxylase (Acc1) and the partial exon 1 of phytochrome B (PhyB) was generated. Individual and nuclear combined data were evaluated using maximum parsimony, maximum likelihood and Bayesian methods. All of the phylogenetic results show Brachyelytrum and the tribe Nardeae as earliest diverging lineages within the subfamily. The 'core' Pooideae (Hordeeae and the Aveneae/Poeae tribe complex) are also strongly supported, as well as the monophyly of the tribes Brachypodieae, Meliceae and Stipeae (except PhyB). The beak grass tribe Diarrheneae and the tribe Duthieeae are not monophyletic in some of the analyses. However, the combined nuclear DNA (nDNA) tree yields the highest resolution and the best delimitation of the tribes, and provides the following evolutionary hypothesis for the tribes: Brachyelytrum, Nardeae, Duthieeae, Meliceae, Stipeae, Diarrheneae, Brachypodieae and the 'core' Pooideae. Within the individual datasets, the phylogenetic trees obtained from Topo6 exon 8-13 show the most interesting results. The divergent positions of some clone sequences of Ampelodesmos mauritanicus and Trikeraia pappiformis, for instance, may indicate a hybrid origin of these stipoid taxa. Copyright © 2015 Elsevier Inc. All rights reserved.
Near, Thomas J; Dornburg, Alex; Friedman, Matt
2014-11-01
The Gonorynchiformes are the sister lineage of the species-rich Otophysi and provide important insights into the diversification of ostariophysan fishes. Phylogenies of gonorynchiforms inferred using morphological characters and mtDNA gene sequences provide differing resolutions with regard to the sister lineage of all other gonorynchiforms (Chanos vs. Gonorynchus) and support for monophyly of the two miniaturized lineages Cromeria and Grasseichthys. In this study the phylogeny and divergence times of gonorynchiforms are investigated with DNA sequences sampled from nine nuclear genes and a published morphological character matrix. Bayesian phylogenetic analyses reveal substantial congruence among individual gene trees with inferences from eight genes placing Gonorynchus as the sister lineage to all other gonorynchiforms. Seven gene trees resolve Cromeria and Grasseichthys as a clade, supporting previous inferences using morphological characters. Phylogenies resulting from either concatenating the nuclear genes, performing a multispecies coalescent species tree analysis, or combining the morphological and nuclear gene DNA sequences resolve Gonorynchus as the living sister lineage of all other gonorynchiforms, strongly support the monophyly of Cromeria and Grasseichthys, and resolve a clade containing Parakneria, Cromeria, and Grasseichthys. The morphological dataset, which includes 13 gonorynchiform fossil taxa that range in age from Early Cretaceous to Eocene, was analyzed in combination with DNA sequences from the nine nuclear genes and a relaxed molecular clock to estimate times of evolutionary divergence. This "tip dating" strategy accommodates uncertainty in the phylogenetic resolution of fossil taxa that provide calibration information in the relaxed molecular clock analysis. 
The estimated age of the most recent common ancestor (MRCA) of living gonorynchiforms is slightly older than estimates from previous node dating efforts, but the molecular tip dating estimated ages of Kneriinae (Kneria, Parakneria, Cromeria, and Grasseichthys) and the two paedomorphic lineages, Cromeria and Grasseichthys, are considerably younger. Copyright © 2014 Elsevier Inc. All rights reserved.
2014-01-01
Background Nematodirus spp. are among the most common nematodes of ruminants worldwide. N. oiratianus and N. spathiger are distributed worldwide as highly prevalent gastrointestinal nematodes, which cause emerging health problems and economic losses. Accurate identification of Nematodirus species is essential to develop effective control strategies for Nematodirus infection in ruminants. Mitochondrial DNA (mtDNA) could provide powerful genetic markers for identifying these closely related species and resolving phylogenetic relationships at different taxonomic levels. Methods In the present study, the complete mitochondrial (mt) genomes of N. oiratianus and N. spathiger from small ruminants in China were obtained using Long-range PCR and sequencing. Results The complete mt genomes of N. oiratianus and N. spathiger were 13,765 bp and 13,519 bp in length, respectively. Both mt genomes were circular and consisted of 36 genes, including 12 genes encoding proteins, 2 genes encoding rRNA, and 22 genes encoding tRNA. Phylogenetic analyses based on the concatenated amino acid sequence data of all 12 protein-coding genes by Bayesian inference (BI), Maximum likelihood (ML) and Maximum parsimony (MP) showed that the two Nematodirus species (Molineidae) were closely related to Dictyocaulidae. Conclusions The availability of the complete mtDNA sequences of N. oiratianus and N. spathiger not only provides new mtDNA sources for a better understanding of nematode mt genomics and phylogeny, but also provides novel and useful genetic markers for studying diagnosis, population genetics and molecular epidemiology of Nematodirus spp. in small ruminants. PMID:25015379
Zhao, Guang-Hui; Jia, Yan-Qing; Cheng, Wen-Yu; Zhao, Wen; Bian, Qing-Qing; Liu, Guo-Hua
2014-07-11
Nematodirus spp. are among the most common nematodes of ruminants worldwide. N. oiratianus and N. spathiger are distributed worldwide as highly prevalent gastrointestinal nematodes, which cause emerging health problems and economic losses. Accurate identification of Nematodirus species is essential to develop effective control strategies for Nematodirus infection in ruminants. Mitochondrial DNA (mtDNA) could provide powerful genetic markers for identifying these closely related species and resolving phylogenetic relationships at different taxonomic levels. In the present study, the complete mitochondrial (mt) genomes of N. oiratianus and N. spathiger from small ruminants in China were obtained using Long-range PCR and sequencing. The complete mt genomes of N. oiratianus and N. spathiger were 13,765 bp and 13,519 bp in length, respectively. Both mt genomes were circular and consisted of 36 genes, including 12 genes encoding proteins, 2 genes encoding rRNA, and 22 genes encoding tRNA. Phylogenetic analyses based on the concatenated amino acid sequence data of all 12 protein-coding genes by Bayesian inference (BI), Maximum likelihood (ML) and Maximum parsimony (MP) showed that the two Nematodirus species (Molineidae) were closely related to Dictyocaulidae. The availability of the complete mtDNA sequences of N. oiratianus and N. spathiger not only provides new mtDNA sources for a better understanding of nematode mt genomics and phylogeny, but also provides novel and useful genetic markers for studying diagnosis, population genetics and molecular epidemiology of Nematodirus spp. in small ruminants.
Efficient Implementation of MrBayes on Multi-GPU
Zhou, Jianfu; Liu, Xiaoguang; Wang, Gang
2013-01-01
MrBayes, using Metropolis-coupled Markov chain Monte Carlo (MCMCMC or (MC)3), is a popular program for Bayesian inference. As a leading method of using DNA data to infer phylogeny, the (MC)3 Bayesian algorithm and its improved and parallel versions are now not fast enough for biologists to analyze massive real-world DNA data. Recently, graphics processor unit (GPU) has shown its power as a coprocessor (or rather, an accelerator) in many fields. This article describes an efficient implementation a(MC)3 (aMCMCMC) for MrBayes (MC)3 on compute unified device architecture. By dynamically adjusting the task granularity to adapt to input data size and hardware configuration, it makes full use of GPU cores with different data sets. An adaptive method is also developed to split and combine DNA sequences to make full use of a large number of GPU cards. Furthermore, a new “node-by-node” task scheduling strategy is developed to improve concurrency, and several optimizing methods are used to reduce extra overhead. Experimental results show that a(MC)3 achieves up to 63× speedup over serial MrBayes on a single machine with one GPU card, and up to 170× speedup with four GPU cards, and up to 478× speedup with a 32-node GPU cluster. a(MC)3 is dramatically faster than all the previous (MC)3 algorithms and scales well to large GPU clusters. PMID:23493260
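For readers unfamiliar with Metropolis-coupling, the core (MC)3 idea can be sketched on a one-dimensional toy problem. The code below is a serial illustration only: it is unrelated to the MrBayes source or to the GPU implementation described above, the bimodal target is an assumption chosen for demonstration, and the names `log_target`, `mc3`, `heat` and `step` are hypothetical. Chain 0 is the cold chain; heated chains sample a flattened target and occasionally swap states with other chains.

```python
import math
import random

def log_target(x):
    """Toy unnormalised log-density: equal mixture of N(-3, 1) and N(3, 1)."""
    a = -0.5 * (x + 3.0) ** 2
    b = -0.5 * (x - 3.0) ** 2
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))

def accept(log_ratio, rng):
    """Metropolis acceptance test for a log acceptance ratio."""
    return rng.random() < math.exp(min(0.0, log_ratio))

def mc3(n_iter, n_chains=4, heat=0.5, step=1.5, seed=1):
    rng = random.Random(seed)
    # Incremental heating: beta = 1 for the cold chain, smaller for hotter chains.
    betas = [1.0 / (1.0 + heat * i) for i in range(n_chains)]
    states = [0.0] * n_chains
    cold_samples = []
    for _ in range(n_iter):
        # Within-chain random-walk update against the tempered target pi(x)^beta.
        for i, beta in enumerate(betas):
            proposal = states[i] + rng.gauss(0.0, step)
            if accept(beta * (log_target(proposal) - log_target(states[i])), rng):
                states[i] = proposal
        # Propose swapping the states of two randomly chosen chains.
        i, j = rng.sample(range(n_chains), 2)
        log_swap = (betas[i] - betas[j]) * (log_target(states[j]) - log_target(states[i]))
        if accept(log_swap, rng):
            states[i], states[j] = states[j], states[i]
        cold_samples.append(states[0])
    return cold_samples
```

Only the cold chain's samples are kept for inference. In a phylogenetic setting the state would be a tree with model parameters and `log_target` the log posterior, whose likelihood evaluation is the expensive part that parallel implementations aim to accelerate.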
Efficient implementation of MrBayes on multi-GPU.
Bao, Jie; Xia, Hongju; Zhou, Jianfu; Liu, Xiaoguang; Wang, Gang
2013-06-01
MrBayes, using Metropolis-coupled Markov chain Monte Carlo (MCMCMC or (MC)(3)), is a popular program for Bayesian inference. As a leading method of using DNA data to infer phylogeny, the (MC)(3) Bayesian algorithm and its improved and parallel versions are now not fast enough for biologists to analyze massive real-world DNA data. Recently, graphics processor unit (GPU) has shown its power as a coprocessor (or rather, an accelerator) in many fields. This article describes an efficient implementation a(MC)(3) (aMCMCMC) for MrBayes (MC)(3) on compute unified device architecture. By dynamically adjusting the task granularity to adapt to input data size and hardware configuration, it makes full use of GPU cores with different data sets. An adaptive method is also developed to split and combine DNA sequences to make full use of a large number of GPU cards. Furthermore, a new "node-by-node" task scheduling strategy is developed to improve concurrency, and several optimizing methods are used to reduce extra overhead. Experimental results show that a(MC)(3) achieves up to 63× speedup over serial MrBayes on a single machine with one GPU card, and up to 170× speedup with four GPU cards, and up to 478× speedup with a 32-node GPU cluster. a(MC)(3) is dramatically faster than all the previous (MC)(3) algorithms and scales well to large GPU clusters.
Porter, Teresita M.; Golding, G. Brian
2012-01-01
Nuclear large subunit ribosomal DNA is widely used in fungal phylogenetics and, to an increasing extent, in amplicon-based environmental sequencing. The relatively short reads produced by next-generation sequencing, however, make primer choice and sequence error important variables for obtaining accurate taxonomic classifications. In this simulation study we tested the performance of three classification methods: 1) a similarity-based method (BLAST + Metagenomic Analyzer, MEGAN); 2) a composition-based method (Ribosomal Database Project naïve Bayesian classifier, NBC); and, 3) a phylogeny-based method (Statistical Assignment Package, SAP). We also tested the effects of sequence length, primer choice, and sequence error on classification accuracy and perceived community composition. Using a leave-one-out cross validation approach, results for classifications to the genus rank were as follows: BLAST + MEGAN had the lowest error rate and was particularly robust to sequence error; SAP accuracy was highest when long LSU query sequences were classified; and, NBC runs significantly faster than the other tested methods. All methods performed poorly with the shortest 50–100 bp sequences. Increasing simulated sequence error reduced classification accuracy. Community shifts were detected due to sequence error and primer selection even though there was no change in the underlying community composition. Short read datasets from individual primers, as well as pooled datasets, appear to only approximate the true community composition. We hope this work informs investigators of some of the factors that affect the quality and interpretation of their environmental gene surveys. PMID:22558215
Global diversity and oceanic divergence of humpback whales (Megaptera novaeangliae)
Jackson, Jennifer A.; Steel, Debbie J.; Beerli, P.; Congdon, Bradley C.; Olavarría, Carlos; Leslie, Matthew S.; Pomilla, Cristina; Rosenbaum, Howard; Baker, C. Scott
2014-01-01
Humpback whales (Megaptera novaeangliae) annually undertake the longest migrations between seasonal feeding and breeding grounds of any mammal. Despite this dispersal potential, discontinuous seasonal distributions and migratory patterns suggest that humpbacks form discrete regional populations within each ocean. To better understand the worldwide population history of humpbacks, and the interplay of this species with the oceanic environment through geological time, we assembled mitochondrial DNA control region sequences representing approximately 2700 individuals (465 bp, 219 haplotypes) and eight nuclear intronic sequences representing approximately 70 individuals (3700 bp, 140 alleles) from the North Pacific, North Atlantic and Southern Hemisphere. Bayesian divergence time reconstructions date the origin of humpback mtDNA lineages to the Pleistocene (880 ka, 95% posterior intervals 550–1320 ka) and estimate radiation of current Northern Hemisphere lineages between 50 and 200 ka, indicating colonization of the northern oceans prior to the Last Glacial Maximum. Coalescent analyses reveal restricted gene flow between ocean basins, with long-term migration rates (individual migrants per generation) of less than 3.3 for mtDNA and less than 2 for nuclear genomic DNA. Genetic evidence suggests that humpbacks in the North Pacific, North Atlantic and Southern Hemisphere are on independent evolutionary trajectories, supporting taxonomic revision of M. novaeangliae to three subspecies. PMID:24850919
DNA barcoding and the identification of tree frogs (Amphibia: Anura: Rhacophoridae).
Dang, Ning-Xin; Sun, Feng-Hui; Lv, Yun-Yun; Zhao, Bo-Han; Wang, Ji-Chao; Murphy, Robert W; Wang, Wen-Zhi; Li, Jia-Tang
2016-07-01
The DNA barcoding gene COI (cytochrome c oxidase subunit I) effectively identifies many species. Herein, we barcoded 172 individuals from 37 species belonging to nine genera in Rhacophoridae to test if the gene serves equally well to identify species of tree frogs. Phenetic neighbor joining and phylogenetic Bayesian inference were used to construct phylogenetic trees, which resolved all nine genera as monophyletic taxa except for Rhacophorus, two new matrilines for Liuixalus, and Polypedates leucomystax species complex. Intraspecific genetic distances ranged from 0.000 to 0.119 and interspecific genetic distances ranged from 0.015 to 0.334. Within Rhacophorus and Kurixalus, the intra- and interspecific genetic distances did not reveal an obvious barcode gap. Notwithstanding, we found that COI sequences unambiguously identified rhacophorid species and helped to discover likely new cryptic species via the synthesis of genealogical relationships and divergence patterns. Our results supported that COI is an effective DNA barcoding marker for Rhacophoridae.
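The "barcode gap" mentioned in the abstract above is commonly understood as the smallest interspecific distance exceeding the largest intraspecific distance. A minimal sketch of that check follows; the function name is hypothetical, and the only data used are the overall ranges reported in the abstract, not the per-genus values.

```python
def has_barcode_gap(intraspecific, interspecific):
    """Return True if every interspecific distance exceeds every intraspecific one."""
    return min(interspecific) > max(intraspecific)


if __name__ == "__main__":
    # Overall ranges reported in the abstract: intraspecific 0.000 to 0.119,
    # interspecific 0.015 to 0.334. The ranges overlap, so there is no gap.
    print(has_barcode_gap([0.000, 0.119], [0.015, 0.334]))
```

With the pooled ranges the check fails because 0.015 is smaller than 0.119, which is consistent with the abstract's remark that no obvious gap was seen within some genera.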
Hernández-León, Sergio; Gernandt, David S.; Pérez de la Rosa, Jorge A.; Jardón-Barbolla, Lev
2013-01-01
Recent diversification followed by secondary contact and hybridization may explain complex patterns of intra- and interspecific morphological and genetic variation in the North American hard pines (Pinus section Trifoliae), a group of approximately 49 tree species distributed in North and Central America and the Caribbean islands. We concatenated five plastid DNA markers for an average of 3.9 individuals per putative species and assessed the suitability of the five regions as DNA bar codes for species identification, species delimitation, and phylogenetic reconstruction. The ycf1 gene accounted for the greatest proportion of the alignment (46.9%), the greatest proportion of variable sites (74.9%), and the most unique sequences (75 haplotypes). Phylogenetic analysis recovered clades corresponding to subsections Australes, Contortae, and Ponderosae. Sequences for 23 of the 49 species were monophyletic and sequences for another 9 species were paraphyletic. Morphologically similar species within subsections usually grouped together, but there were exceptions consistent with incomplete lineage sorting or introgression. Bayesian relaxed molecular clock analyses indicated that all three subsections diversified relatively recently during the Miocene. The general mixed Yule-coalescent method gave a mixed model estimate of only 22 or 23 evolutionary entities for the plastid sequences, which corresponds to less than half the 49 species recognized based on morphological species assignments. Including more unique haplotypes per species may result in higher estimates, but low mutation rates, recent diversification, and large effective population sizes may limit the effectiveness of this method to detect evolutionary entities. PMID:23936218
Bayesian estimation of post-Messinian divergence times in Balearic Island lizards.
Brown, R P; Terrasa, B; Pérez-Mellado, V; Castro, J A; Hoskisson, P A; Picornell, A; Ramon, M M
2008-07-01
Phylogenetic relationships and timings of major cladogenesis events are investigated in the Balearic Island lizards Podarcis lilfordi and P. pityusensis using 2675 bp of mitochondrial and nuclear DNA sequences. Partitioned Bayesian and Maximum Parsimony analyses provided a well-resolved phylogeny with high node-support values. Bayesian MCMC estimation of node dates was investigated by comparing means of posterior distributions from different subsets of the sequence against the most robust analysis which used multiple partitions and allowed for rate heterogeneity among branches under a rate-drift model. Evolutionary rates were systematically underestimated and thus divergence times overestimated when sequences containing lower numbers of variable sites were used (based on ingroup node constraints). The following analyses allowed the best recovery of node times under the constant-rate (i.e., perfect clock) model: (i) all cytochrome b sequence (partitioned by codon position), (ii) cytochrome b (codon position 3 alone), (iii) NADH dehydrogenase (subunits 1 and 2; partitioned by codon position), (iv) cytochrome b and NADH dehydrogenase sequence together (six gene-codon partitions), (v) all unpartitioned sequence, (vi) a full multipartition analysis (nine partitions). Of these, only (iv) and (vi) performed well under the rate-drift model. These findings have significant implications for dating of recent divergence times in other taxa. The earliest P. lilfordi cladogenesis event (divergence of Menorcan populations), occurred before the end of the Pliocene, some 2.6 Ma. Subsequent events led to a West Mallorcan lineage (2.0 Ma ago), followed 1.2 Ma ago by divergence of populations from the southern part of the Cabrera archipelago from a widely-distributed group from north Cabrera, northern and southern Mallorcan islets. Divergence within P. pityusensis is more recent with the main Ibiza and Formentera clades sharing a common ancestor at about 1.0 Ma ago.
Climatic and sea level changes are likely to have initiated cladogenesis, with lineages making secondary contact during periodic landbridge formation. This oscillating cross-archipelago pattern in which ancient divergence is followed by repeated contact resembles that seen between East-West refugia populations from mainland Europe.
Inferring genome-wide interplay landscape between DNA methylation and transcriptional regulation.
Tang, Binhua; Wang, Xin
2015-01-01
DNA methylation and transcriptional regulation play important roles in cancer cell development and differentiation processes. Based on the currently available cell line profiling information from the ENCODE Consortium, we propose a Bayesian inference model to infer and construct a genome-wide interaction landscape between DNA methylation and transcriptional regulation, which sheds light on the underlying complex functional mechanisms important within the human cancer and disease context. For the first time, we select all the currently available cell lines (≥20) and transcription factors (≥80) profiling information from the ENCODE Consortium portal. Through the integration of those genome-wide profiling sources, our genome-wide analysis detects multiple functional loci of interest, and indicates that DNA methylation is cell- and region-specific, due to the interplay mechanisms with transcription regulatory activities. We validate our analysis results with the corresponding RNA-sequencing technique for those detected genomic loci. Our results provide novel and meaningful insights into the interplay mechanisms of transcriptional regulation and gene expression for human cancer and disease studies.
Phylogeny and temporal diversification of darters (Percidae: Etheostomatinae).
Near, Thomas J; Bossu, Christen M; Bradburd, Gideon S; Carlson, Rose L; Harrington, Richard C; Hollingsworth, Phillip R; Keck, Benjamin P; Etnier, David A
2011-10-01
Discussions aimed at resolution of the Tree of Life are most often focused on the interrelationships of major organismal lineages. In this study, we focus on the resolution of some of the most apical branches in the Tree of Life through exploration of the phylogenetic relationships of darters, a species-rich clade of North American freshwater fishes. With a near-complete taxon sampling of close to 250 species, we aim to investigate strategies for efficient multilocus data sampling and the estimation of divergence times using relaxed-clock methods when a clade lacks a fossil record. Our phylogenetic data set comprises a single mitochondrial DNA (mtDNA) gene and two nuclear genes sampled from 245 of the 248 darter species. This dense sampling allows us to determine if a modest amount of nuclear DNA sequence data can resolve relationships among closely related animal species. Darters lack a fossil record to provide age calibration priors in relaxed-clock analyses. Therefore, we use a near-complete species-sampled phylogeny of the perciform clade Centrarchidae, which has a rich fossil record, to assess two distinct strategies of external calibration in relaxed-clock divergence time estimates of darters: using ages inferred from the fossil record and molecular evolutionary rate estimates. Comparison of Bayesian phylogenies inferred from mtDNA and nuclear genes reveals that heterospecific mtDNA is present in approximately 12.5% of all darter species. 
We identify three patterns of mtDNA introgression in darters: proximal mtDNA transfer, which involves the transfer of mtDNA among extant and sympatric darter species, indeterminate introgression, which involves the transfer of mtDNA from a lineage that cannot be confidently identified because the introgressed haplotypes are not clearly referable to mtDNA haplotypes in any recognized species, and deep introgression, which is characterized by species diversification within a recipient clade subsequent to the transfer of heterospecific mtDNA. The results of our analyses indicate that DNA sequences sampled from single-copy nuclear genes can provide appreciable phylogenetic resolution for closely related animal species. A well-resolved near-complete species-sampled phylogeny of darters was estimated with Bayesian methods using a concatenated mtDNA and nuclear gene data set with all identified heterospecific mtDNA haplotypes treated as missing data. The relaxed-clock analyses resulted in very similar posterior age estimates across the three sampled genes and methods of calibration and therefore offer a viable strategy for estimating divergence times for clades that lack a fossil record. In addition, an informative rank-free clade-based classification of darters that preserves the rich history of nomenclature in the group and provides formal taxonomic communication of darter clades was constructed using the mtDNA and nuclear gene phylogeny. On the whole, the appeal of mtDNA for phylogeny inference among closely related animal species is diminished by the observations of extensive mtDNA introgression and by finding appreciable phylogenetic signal in a modest sampling of nuclear genes in our phylogenetic analyses of darters.
Chen, Weicai; Zhang, Wei; Zhou, Shichu; Li, Ning; Huang, Yong; Mo, Yunming
2013-01-01
Leptobrachium guangxiense Fei, Mo, Ye and Jiang, 2009 (Anura: Megophryidae), is presently thought to be endemic to Shangsi, Guangxi Province, China. We combined a molecular phylogenetic analysis with morphological data to gain insight into the phylogenetic position of this species. Maximum parsimony, maximum likelihood, and Bayesian inference methods were employed to reconstruct phylogenetic relationships, using 1914 bp of sequences from the mtDNA genes 12S rRNA, tRNAVal and 16S rRNA. Topologies revealed that L. guangxiense and the Tam Dao (Vietnam) L. chapaense lineage (3A) formed a well-supported monophyletic group. The uncorrected p-distance for the ~1.4 kb 16S rRNA data sets between the Tam Dao L. chapaense lineage (3A) and L. guangxiense is only 0.1%. Morphologically, L. guangxiense and the Tam Dao L. chapaense lineage (3A) share the same characters and are distinguishable from "true" L. chapaense from the type locality in Sa Pa, Vietnam. Based on morphological characters and mitochondrial DNA, we suggest that the Tam Dao lineages of L. chapaense are conspecific with L. guangxiense. This represents a range extension for L. guangxiense, and a new country record for Vietnam.
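An uncorrected p-distance, as cited in the abstract above, is the proportion of compared sites at which two aligned sequences differ. The sketch below shows the calculation under two assumptions that the abstract does not state: sites with gaps or ambiguity codes are skipped (pairwise deletion), and the function name and example sequences are hypothetical.

```python
def p_distance(seq_a, seq_b):
    """Uncorrected p-distance between two aligned nucleotide sequences.

    Only sites where both sequences carry A, C, G or T are compared
    (assumed pairwise deletion of gaps and ambiguity codes).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x in "ACGT" and y in "ACGT":
            compared += 1
            if x != y:
                differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


if __name__ == "__main__":
    # One difference in ten compared sites gives a p-distance of 0.1 (10%).
    print(p_distance("ACGTACGTAC", "ACGTACGTAT"))
```

For scale, a p-distance of 0.1% over roughly 1.4 kb corresponds to about one or two differing sites (0.001 × 1400 = 1.4).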
The Empirical Distribution of Singletons for Geographic Samples of DNA Sequences.
Cubry, Philippe; Vigouroux, Yves; François, Olivier
2017-01-01
Rare variants are important for drawing inference about past demographic events in a species' history. A singleton is a rare variant for which genetic variation is carried by a unique chromosome in a sample. How singletons are distributed across geographic space provides a local measure of genetic diversity that can be measured at the individual level. Here, we define the empirical distribution of singletons in a sample of chromosomes as the proportion of the total number of singletons that each chromosome carries, and we present a theoretical background for studying this distribution. Next, we use computer simulations to evaluate the potential for the empirical distribution of singletons to provide a description of genetic diversity across geographic space. In a Bayesian framework, we show that the empirical distribution of singletons leads to accurate estimates of the geographic origin of range expansions. We apply the Bayesian approach to estimating the origin of the cultivated plant species Pennisetum glaucum [L.] R. Br. (pearl millet) in Africa, and find support for range expansion having started from Northern Mali. Overall, we report that the empirical distribution of singletons is a useful measure to analyze results of sequencing projects based on large scale sampling of individuals across geographic space.
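The definition given in the abstract above (the proportion of all singletons that each chromosome carries) can be illustrated with a small sketch. The data layout is an assumption made for the example: each chromosome is a list of 0/1 values, where 1 marks the rare allele at a site, and a singleton is a site at which exactly one chromosome carries a 1. The function name is hypothetical.

```python
def singleton_distribution(haplotypes):
    """Proportion of the sample's singletons carried by each chromosome.

    haplotypes: list of equal-length lists of 0/1 values (rows are
    chromosomes, columns are sites). Returns one proportion per chromosome;
    all zeros if the sample contains no singletons.
    """
    n_chrom = len(haplotypes)
    n_sites = len(haplotypes[0])
    counts = [0] * n_chrom
    for site in range(n_sites):
        carriers = [i for i in range(n_chrom) if haplotypes[i][site] == 1]
        if len(carriers) == 1:  # the allele is carried by a unique chromosome
            counts[carriers[0]] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * n_chrom
    return [c / total for c in counts]


if __name__ == "__main__":
    sample = [
        [1, 0, 0, 1],
        [0, 1, 0, 1],
        [0, 0, 0, 0],
    ]
    # Sites 0 and 1 are singletons; site 3 is shared by two chromosomes.
    print(singleton_distribution(sample))
```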
Ait Kaci Azzou, Sadoune; Larribe, Fabrice; Froda, Sorana
2015-01-01
The effective population size over time (demographic history) can be retraced from a sample of contemporary DNA sequences. In this paper, we propose a novel methodology based on importance sampling (IS) for exploring such demographic histories. Our starting point is the generalized skyline plot, with the main difference being that our procedure, the skywis plot, uses a large number of genealogies. The information provided by these genealogies is combined according to the IS weights. Thus, we compute a weighted average of the effective population sizes on specific time intervals (epochs), where the genealogies that agree more with the data are given more weight. We illustrate by a simulation study that the skywis plot correctly reconstructs the recent demographic history under the scenarios most commonly considered in the literature. In particular, our method can capture a change point in the effective population size, and its overall performance is comparable to that of the Bayesian skyline plot. We also introduce the case of serially sampled sequences and illustrate that it is possible to improve the performance of the skywis plot in the case of an exponential expansion of the effective population size. PMID:26300910
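The weighted average described in the abstract above can be sketched as a self-normalized importance-sampling estimate. This is only an illustration of the averaging step, not the authors' procedure: how the genealogies are proposed and how their weights are computed is not given in the abstract, and the function name and numbers below are hypothetical.

```python
def weighted_epoch_estimates(estimates, weights):
    """Combine per-genealogy population-size estimates epoch by epoch.

    estimates: one list per genealogy, holding an effective population size
               for each epoch (all lists have the same length).
    weights:   one non-negative importance weight per genealogy.
    Returns the weight-normalized average for each epoch.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_epochs = len(estimates[0])
    return [
        sum(w * est[k] for w, est in zip(weights, estimates)) / total
        for k in range(n_epochs)
    ]


if __name__ == "__main__":
    # Two genealogies, two epochs; the first genealogy fits the data better
    # and therefore carries three times the weight of the second.
    print(weighted_epoch_estimates([[100, 200], [300, 400]], [3, 1]))
```

Dividing by the sum of the weights means the weights only need to be known up to a constant factor.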
Chen, Yuan
2017-01-01
In this study, we sequenced fragments of the cytochrome oxidase subunit 1 (CO1), internal transcribed spacer 1 (ITS1), and internal transcribed spacer 2 (ITS2) genes from 150 specimens belonging to 16 species of the ant genus Formica from China. Odontoponera transversa from Ponerinae and Polyergus samurai from Formicinae were added as distant relative and close relative outgroups, respectively. Neighbor-joining, maximum parsimony, and Bayesian inference methods were used to analyze their phylogenetic relationships based on the CO1 gene sequence as well as combined sequence data of CO1 + ITS1, CO1 + ITS2, and CO1 + ITS1 + ITS2. The results showed that nine Formica species (i.e., Formica sinensis, Formica manchu, Formica uralensis, Formica sanguinea, Formica gagatoides, Formica candida, Formica fusca, Formica glauca, and Formica sp.) formed monophyletic clades, which is in agreement with the results based on morphological taxonomy. By comparing the results of DNA barcoding and morphological taxonomy, we propose that Formica aquilonia may be a junior synonym of F. polyctena and that cryptic species likely exist in Formica sinae. Further studies on morphology, biology, and geography are needed to confirm this notion.
Ayyagari, Vijaya Sai; Sreerama, Krupanidhi
2017-08-01
Achatina fulica (Lissachatina fulica) is one of the most invasive species found across the globe, causing significant damage to crops, vegetables, and horticultural plants. This terrestrial snail is native to east Africa and has spread to different parts of the world by introductions. India, a hot spot for biodiversity of several endemic gastropods, has witnessed an outburst of this snail population in several parts of the country, posing a serious threat to crops and also to human health. With the objective of evaluating the genetic diversity of this snail, we sampled it from different parts of India and analyzed its haplotype diversity by means of 16S rDNA sequence information. In addition, we studied the phylogenetic relationships of the isolates sequenced in the present study in relation to other global populations by Bayesian and Maximum-likelihood approaches. Of the isolates sequenced, haplotype 'C' is the predominant one. A new haplotype 'S' from the state of Odisha was observed. The isolates sequenced in the present study clustered with their conspecifics from the Indian sub-continent. Haplotype network analyses were also carried out for studying the evolution of different haplotypes. It was observed that haplotype 'S' was associated with a Mauritius haplotype 'H', indicating the possibility of multiple introductions of A. fulica to India.
Nuclear genomic sequences reveal that polar bears are an old and distinct bear lineage.
Hailer, Frank; Kutschera, Verena E; Hallström, Björn M; Klassert, Denise; Fain, Steven R; Leonard, Jennifer A; Arnason, Ulfur; Janke, Axel
2012-04-20
Recent studies have shown that the polar bear matriline (mitochondrial DNA) evolved from a brown bear lineage since the late Pleistocene, potentially indicating rapid speciation and adaption to arctic conditions. Here, we present a high-resolution data set from multiple independent loci across the nuclear genomes of a broad sample of polar, brown, and black bears. Bayesian coalescent analyses place polar bears outside the brown bear clade and date the divergence much earlier, in the middle Pleistocene, about 600 (338 to 934) thousand years ago. This provides more time for polar bear evolution and confirms previous suggestions that polar bears carry introgressed brown bear mitochondrial DNA due to past hybridization. Our results highlight that multilocus genomic analyses are crucial for an accurate understanding of evolutionary history.
Pyrosequencing the Canine Faecal Microbiota: Breadth and Depth of Biodiversity
Hand, Daniel; Wallis, Corrin; Colyer, Alison; Penn, Charles W.
2013-01-01
Mammalian intestinal microbiota remain poorly understood despite decades of interest and investigation by culture-based and other long-established methodologies. Using high-throughput sequencing technology we now report a detailed analysis of canine faecal microbiota. The study group of animals comprised eleven healthy adult miniature Schnauzer dogs of mixed sex and age, some closely related and all housed in kennel and pen accommodation on the same premises with similar feeding and exercise regimes. DNA was extracted from faecal specimens and subjected to PCR amplification of 16S rDNA, followed by sequencing of the 5′ region that included variable regions V1 and V2. Barcoded amplicons were sequenced by Roche-454 FLX high-throughput pyrosequencing. Sequences were assigned to taxa using the Ribosomal Database Project Bayesian classifier and revealed dominance of Fusobacterium and Bacteroidetes phyla. Differences between animals in the proportions of different taxa, among 10,000 reads per animal, were clear and not supportive of the concept of a “core microbiota”. Despite this variability in prominent genera, littermates were shown to have a more similar faecal microbial composition than unrelated dogs. Diversity of the microbiota was also assessed by assignment of sequence reads into operational taxonomic units (OTUs) at the level of 97% sequence identity. The OTU data were then subjected to rarefaction analysis and determination of Chao1 richness estimates. The data indicated that faecal microbiota comprised possibly as many as 500 to 1500 OTUs. PMID:23382835
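The Chao1 richness estimate mentioned in the abstract above extrapolates total richness from the number of OTUs seen exactly once and exactly twice. The sketch below uses the bias-corrected form, S_obs + F1(F1 − 1) / (2(F2 + 1)); the abstract does not say which variant the authors used, so that choice, the function name and the example counts are assumptions.

```python
def chao1(otu_counts):
    """Bias-corrected Chao1 richness estimate from per-OTU read counts.

    otu_counts: iterable of non-negative integers, one count per OTU.
    OTUs with a count of zero are ignored.
    """
    counts = [c for c in otu_counts if c > 0]
    observed = len(counts)                          # S_obs
    singletons = sum(1 for c in counts if c == 1)   # F1
    doubletons = sum(1 for c in counts if c == 2)   # F2
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))


if __name__ == "__main__":
    # Seven observed OTUs, three seen once and two seen twice.
    print(chao1([10, 5, 2, 2, 1, 1, 1, 0]))
```

The bias-corrected form stays defined when no OTU is seen exactly twice, which the classic F1² / (2 F2) term does not.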
Klopfenstein, Ned B; Stewart, Jane E; Ota, Yuko; Hanna, John W; Richardson, Bryce A; Ross-Davis, Amy L; Elías-Román, Rubén D; Korhonen, Kari; Keča, Nenad; Iturritxa, Eugenia; Alvarado-Rosales, Dionicio; Solheim, Halvor; Brazee, Nicholas J; Łakomy, Piotr; Cleary, Michelle R; Hasegawa, Eri; Kikuchi, Taisei; Garza-Ocañas, Fortunato; Tsopelas, Panaghiotis; Rigling, Daniel; Prospero, Simone; Tsykun, Tetyana; Bérubé, Jean A; Stefani, Franck O P; Jafarpour, Saeideh; Antonín, Vladimír; Tomšovský, Michal; McDonald, Geral I; Woodward, Stephen; Kim, Mee-Sook
2017-01-01
Armillaria possesses several intriguing characteristics that have inspired wide interest in understanding phylogenetic relationships within and among species of this genus. Nuclear ribosomal DNA sequence-based analyses of Armillaria provide only limited information for phylogenetic studies among widely divergent taxa. More recent studies have shown that translation elongation factor 1-α (tef1) sequences are highly informative for phylogenetic analysis of Armillaria species within diverse global regions. This study used Neighbor-net and coalescence-based Bayesian analyses to examine phylogenetic relationships of newly determined and existing tef1 sequences derived from diverse Armillaria species from across the Northern Hemisphere, with Southern Hemisphere Armillaria species included for reference. Based on the Bayesian analysis of tef1 sequences, Armillaria species from the Northern Hemisphere are generally contained within the following four superclades, which are named according to the specific epithet of the most frequently cited species within the superclade: (i) Socialis/Tabescens (exannulate) superclade including Eurasian A. ectypa, North American A. socialis (A. tabescens), and Eurasian A. socialis (A. tabescens) clades; (ii) Mellea superclade including undescribed annulate North American Armillaria sp. (Mexico) and four separate clades of A. mellea (Europe and Iran, eastern Asia, and two groups from North America); (iii) Gallica superclade including Armillaria Nag E (Japan), multiple clades of A. gallica (Asia and Europe), A. calvescens (eastern North America), A. cepistipes (North America), A. altimontana (western USA), A. nabsnona (North America and Japan), and at least two A. gallica clades (North America); and (iv) Solidipes/Ostoyae superclade including two A. solidipes/ostoyae clades (North America), A. gemina (eastern USA), A. solidipes/ostoyae (Eurasia), A. cepistipes (Europe and Japan), A. sinapina (North America and Japan), and A. 
borealis (Eurasia) clade 2. Of note is that A. borealis (Eurasia) clade 1 appears basal to the Solidipes/Ostoyae and Gallica superclades. The Neighbor-net analysis showed similar phylogenetic relationships. This study further demonstrates the utility of tef1 for global phylogenetic studies of Armillaria species and provides critical insights into multiple taxonomic issues that warrant further study.
Bunawan, Hamidun; Yen, Choong Chee; Yaakop, Salmah; Noor, Normah Mohd
2017-01-26
The chloroplastic trnL intron and the nuclear internal transcribed spacer (ITS) region were sequenced for 11 Nepenthes species recorded in Peninsular Malaysia to examine their phylogenetic relationship and to evaluate the usage of trnL intron and ITS sequences for phylogenetic reconstruction of this genus. Phylogeny reconstruction was carried out using neighbor-joining, maximum parsimony and Bayesian analyses. All the trees revealed two major clusters, a lowland group consisting of N. ampullaria, N. mirabilis, N. gracilis and N. rafflesiana, and another containing both intermediately distributed species (N. albomarginata and N. benstonei) and four highland species (N. sanguinea, N. macfarlanei, N. ramispina and N. alba). The trnL intron and ITS sequences proved to provide phylogenetic informative characters for deriving a phylogeny of Nepenthes species in Peninsular Malaysia. To our knowledge, this is the first molecular phylogenetic study of Nepenthes species occurring along an altitudinal gradient in Peninsular Malaysia.
Remarkable convergent evolution in specialized parasitic Thecostraca (Crustacea)
Pérez-Losada, Marcos; Høeg, Jens T; Crandall, Keith A
2009-01-01
Background: The Thecostraca are arguably the most morphologically and biologically variable group within the Crustacea, including both suspension feeders (Cirripedia: Thoracica and Acrothoracica) and parasitic forms (Cirripedia: Rhizocephala, Ascothoracida and Facetotecta). Similarities between the metamorphosis found in the Facetotecta and Rhizocephala suggest a common evolutionary origin, but until now no comprehensive study has looked at the basic evolution of these thecostracan groups. Results: To this end, we collected DNA sequences from three nuclear genes [18S rRNA (2,305), 28S rRNA (2,402), Histone H3 (328)] and 41 larval characters in seven facetotectans, five ascothoracidans, three acrothoracicans, 25 rhizocephalans and 39 thoracicans (ingroup) and 12 Malacostraca and 10 Copepoda (outgroup). Maximum parsimony, maximum likelihood and Bayesian analyses showed the Facetotecta, Ascothoracida and Cirripedia each as monophyletic. The better resolved and highly supported DNA maximum likelihood and morphological-DNA Bayesian analysis trees depicted the main phylogenetic relationships within the Thecostraca as (Facetotecta, (Ascothoracida, (Acrothoracica, (Rhizocephala, Thoracica)))). Conclusion: Our analyses indicate a convergent evolution of the very similar and highly reduced slug-shaped stages found during metamorphosis of both the Rhizocephala and the Facetotecta. This provides a remarkable case of convergent evolution and implies that the advanced endoparasitic mode of life known from the Rhizocephala and strongly indicated for the Facetotecta had no common origin. Future analyses are needed to determine whether the most recent common ancestor of the Thecostraca was free-living or some primitive form of ectoparasite. PMID:19374762
Cao, Ya-Nan; Wang, Ian J; Chen, Lu-Yao; Ding, Yan-Qian; Liu, Lu-Xian; Qiu, Ying-Xiong
2018-04-17
The relative roles of geography, climate and ecology in driving population divergence and (incipient) speciation have so far been largely neglected in studies addressing the evolution of East Asia's island flora. Here, we employed chloroplast and ribosomal DNA sequences and restriction site-associated DNA sequencing (RADseq) loci to investigate the phylogeography and drivers of population divergence of Neolitsea sericea. These data sets support the subdivision of N. sericea populations into the Southern and Northern lineages across the 'Tokara gap'. Two distinct sublineages were further identified for the Northern lineage of N. sericea from the RADseq data. RADseq was also used along with approximate Bayesian computation to show that the current distribution and differentiation of N. sericea populations resulted from a combination of relatively ancient migration and successive vicariant events that likely occurred during the mid to late Pleistocene. Landscape genomic analyses showed that, apart from geographic barriers, potential local adaptation to different climatic conditions appears to be one of the major drivers for lineage diversification of N. sericea.
Baird, Amy B; Braun, Janet K; Engstrom, Mark D; Holbert, Ashlyn C; Huerta, Maritza G; Lim, Burton K; Mares, Michael A; Patton, John C; Bickham, John W
2017-01-01
Previous studies on genetics of hoary bats produced differing conclusions on the timing of their colonization of the Hawaiian Islands and whether or not North American (Aeorestes cinereus) and Hawaiian (A. semotus) hoary bats are distinct species. One study, using mtDNA COI and nuclear Rag2 and CMA1, concluded that hoary bats colonized the Hawaiian Islands no more than 10,000 years ago based on indications of population expansion at that time using Extended Bayesian Skyline Plots. The other study, using 3 mtDNA and 1 Y-chromosome locus, concluded that the Hawaiian Islands were colonized about 1 million years ago. To address the marked inconsistencies between those studies, we examined DNA sequences from 4 mitochondrial and 2 nuclear loci in lasiurine bats to investigate the timing of colonization of the Hawaiian Islands by hoary bats, test the hypothesis that Hawaiian and North American hoary bats belong to different species, and further investigate the generic level taxonomy within the tribe. Phylogenetic analysis and dating of the nodes of mtDNA haplotypes and of nuclear CMA1 alleles show that A. semotus invaded the Hawaiian Islands approximately 1.35 Ma and that multiple arrivals of A. cinereus occurred much more recently. Extended Bayesian Skyline plots show population expansion at about 20,000 years ago in the Hawaiian Islands, which we conclude does not represent the timing of colonization of the Hawaiian Islands given the high degree of genetic differentiation among A. cinereus and A. semotus (4.2% divergence at mtDNA Cytb) and the high degree of genetic diversity within A. semotus. Rather, population expansion 20,000 years ago could have resulted from colonization of additional islands, expansion after a bottleneck, or other factors. New genetic data also support the recognition of A. semotus and A. cinereus as distinct species, a finding consistent with previous morphological and behavioral studies. 
The phylogenetic analysis of CMA1 alleles shows the presence of 2 clades that are primarily associated with A. semotus mtDNA haplotypes, and are unique to the Hawaiian Islands. There is evidence for low levels of hybridization between A. semotus and A. cinereus on the Hawaiian Islands, but it is not extensive (<15% of individuals are of hybrid origin), and clearly each species is able to maintain its own genetic distinctiveness. Both mtDNA and nuclear DNA sequences show deep divergence between the 3 groups (genera) of lasiurine bats that correspond to the previously recognized morphological differences between them. We show that the Tribe Lasiurini contains the genera Aeorestes (hoary bats), Lasiurus (red bats), and Dasypterus (yellow bats). PMID:29020097
Cinelli, Mattia; Sun, Yuxin; Best, Katharine; Heather, James M; Reich-Zeliger, Shlomit; Shifrut, Eric; Friedman, Nir; Shawe-Taylor, John; Chain, Benny
2017-04-01
Somatic DNA recombination, the hallmark of vertebrate adaptive immunity, has the potential to generate a vast diversity of antigen receptor sequences. How this diversity captures antigen specificity remains incompletely understood. In this study we use high throughput sequencing to compare the global changes in T cell receptor β chain complementarity determining region 3 (CDR3β) sequences following immunization with ovalbumin administered with complete Freund's adjuvant (CFA) or CFA alone. The CDR3β sequences were deconstructed into short stretches of overlapping contiguous amino acids. The motifs were ranked according to a one-dimensional Bayesian classifier score comparing their frequency in the repertoires of the two immunization classes. The top ranking motifs were selected and used to create feature vectors which were used to train a support vector machine. The support vector machine achieved high classification scores in a leave-one-out validation test reaching >90% in some cases. The study describes a novel two-stage classification strategy combining a one-dimensional Bayesian classifier with a support vector machine. Using this approach we demonstrate that the frequency of a small number of linear motifs three amino acids in length can accurately identify a CD4 T cell response to ovalbumin against a background response to the complex mixture of antigens which characterize Complete Freund's Adjuvant. The sequence data is available at www.ncbi.nlm.nih.gov/sra/?term=SRP075893 . The Decombinator package is available at github.com/innate2adaptive/Decombinator . The R package e1071 is available at the CRAN repository https://cran.r-project.org/web/packages/e1071/index.html . Contact: b.chain@ucl.ac.uk. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press.
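The first step described in this abstract, deconstructing each CDR3β sequence into overlapping contiguous amino acid motifs and tallying their frequencies, can be sketched as follows. This is only an illustration under stated assumptions: the function names and the two example sequences are hypothetical, and the code is not taken from the Decombinator package or from the authors' pipeline. The Bayesian scoring and support vector machine stages are not shown.

```python
from collections import Counter


def overlapping_motifs(cdr3: str, k: int = 3) -> list[str]:
    """Return every contiguous stretch of k amino acids, in sequence order."""
    # A sequence shorter than k yields an empty list.
    return [cdr3[i:i + k] for i in range(len(cdr3) - k + 1)]


def motif_frequencies(repertoire: list[str], k: int = 3) -> Counter:
    """Count how often each k-long motif occurs across a whole repertoire."""
    counts: Counter = Counter()
    for cdr3 in repertoire:
        counts.update(overlapping_motifs(cdr3, k))
    return counts


# Two made-up CDR3-like strings, used purely for illustration.
repertoire = ["CASSLGQ", "CASSPGQ"]
freqs = motif_frequencies(repertoire)
print(freqs["CAS"], freqs["SSL"])
```

Per-repertoire counts of this kind are the sort of input from which motif frequencies in two immunization classes could be compared.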
Barcoding Neotropical birds: assessing the impact of nonmonophyly in a highly diverse group.
Chaves, Bárbara R N; Chaves, Anderson V; Nascimento, Augusto C A; Chevitarese, Juliana; Vasconcelos, Marcelo F; Santos, Fabrício R
2015-07-01
In this study, we verified the power of DNA barcodes to discriminate Neotropical birds using Bayesian tree reconstructions of a total of 7404 COI sequences from 1521 species, including 55 Brazilian species with no previous barcode data. We found that 10.4% of species were nonmonophyletic, most likely due to inaccurate taxonomy, incomplete lineage sorting or hybridization. At least 0.5% of the sequences (2.5% of the sampled species) retrieved from GenBank were associated with database errors (poor-quality sequences, NuMTs, misidentification or unnoticed hybridization). Paraphyletic species (5.8% of the total) can be related to rapid speciation events leading to nonreciprocal monophyly between recently diverged sister species, or to absence of synapomorphies in the small COI region analysed. We also performed two series of genetic distance calculations under the K2P model for intraspecific and interspecific comparisons: the first included all COI sequences, and the second included only monophyletic taxa observed in the Bayesian trees. As expected, the mean and median pairwise distances were smaller for intraspecific than for interspecific comparisons. However, there was no precise 'barcode gap', which was shown to be larger in the monophyletic taxon data set than for the data from all species, as expected. Our results indicated that although database errors may explain some of the difficulties in the species discrimination of Neotropical birds, distance-based barcode assignment may also be compromised because of the high diversity of bird species and more complex speciation events in the Neotropics. © 2014 John Wiley & Sons Ltd.
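For readers unfamiliar with the K2P (Kimura two-parameter) model mentioned above, the distance between two aligned sequences is d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of sites differing by a transition and by a transversion. The sketch below is a minimal illustration with a hypothetical function name and made-up sequences. It is not the software used in the study, and it skips any site that is not an unambiguous A, C, G or T.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    # (saturated) for the formula to be defined.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# One transition (A/G) and one transversion (A/C) in ten sites.
print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAC"), 4))
```

Intraspecific and interspecific comparisons then amount to applying such a function to pairs of sequences within and between species.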
Johnson, Leigh A; Chan, Lauren M; Weese, Terri L; Busby, Lisa D; McMurry, Samuel
2008-09-01
Members of the phlox family (Polemoniaceae) serve as useful models for studying various evolutionary and biological processes. Despite its biological importance, no family-wide phylogenetic estimate based on multiple DNA regions with complete generic sampling is available. Here, we analyze one nuclear and five chloroplast DNA sequence regions (nuclear ITS, chloroplast matK, trnL intron plus trnL-trnF intergenic spacer, and the trnS-trnG, trnD-trnT, and psbM-trnD intergenic spacers) using parsimony and Bayesian methods, as well as assessments of congruence and long branch attraction, to explore phylogenetic relationships among 84 ingroup species representing all currently recognized Polemoniaceae genera. Relationships inferred from the ITS and concatenated chloroplast regions are similar overall. A combined analysis provides strong support for the monophyly of Polemoniaceae and subfamilies Acanthogilioideae, Cobaeoideae, and Polemonioideae. Relationships among subfamilies, and thus for the precise root of Polemoniaceae, remain poorly supported. Within the largest subfamily, Polemonioideae, four clades corresponding to tribes Polemonieae, Phlocideae, Gilieae, and Loeselieae receive strong support. The monogeneric Polemonieae appears sister to Phlocideae. Relationships within Polemonieae, Phlocideae, and Gilieae are mostly consistent between analyses and data permutations. Many relationships within Loeselieae remain uncertain. Overall, inferred phylogenetic relationships support a higher-level classification for Polemoniaceae proposed in 2000.
Liu, Guo-Hua; Wang, Yan; Xu, Min-Jun; Zhou, Dong-Hui; Ye, Yong-Gang; Li, Jia-Yuan; Song, Hui-Qun; Lin, Rui-Qing; Zhu, Xing-Quan
2012-12-01
For many years, whipworms (Trichuris spp.) have been described with a relatively narrow range of both morphological and biometrical features. Moreover, there has been insufficient discrimination between congeners (or closely related species). In the present study, we determined the complete mitochondrial (mt) genomes of two whipworms, Trichuris ovis and Trichuris discolor, compared them and then tested the hypothesis that T. ovis and T. discolor are distinct species by phylogenetic analyses (using Bayesian inference, maximum likelihood and maximum parsimony) based on the deduced amino acid sequences of the mt protein-coding genes. The complete mt genomes of T. ovis and T. discolor were 13,946 bp and 13,904 bp in size, respectively. Both mt genomes are circular, and consist of 37 genes, including 13 genes coding for proteins, 2 genes for rRNA, and 22 genes for tRNA. The gene content and arrangement are identical to those of the human and pig whipworms Trichuris trichiura and Trichuris suis. Taken together, these analyses showed genetic distinctiveness and strongly supported the recent proposal, based on nuclear ribosomal DNA and a portion of the mtDNA sequence dataset, that T. ovis and T. discolor are distinct species. The availability of the complete mtDNA sequences of T. ovis and T. discolor provides novel genetic markers for studying the population genetics, diagnostics and molecular epidemiology of T. ovis and T. discolor. Copyright © 2012 Elsevier B.V. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
McLoughlin, Kevin
2016-01-11
This report describes the design and implementation of an algorithm for estimating relative microbial abundances, together with confidence limits, using data from metagenomic DNA sequencing. For the background behind this project and a detailed discussion of our modeling approach for metagenomic data, we refer the reader to our earlier technical report, dated March 4, 2014. Briefly, we described a fully Bayesian generative model for paired-end sequence read data, incorporating the effects of the relative abundances, the distribution of sequence fragment lengths, fragment position bias, sequencing errors and variations between the sampled genomes and the nearest reference genomes. A distinctive feature of our modeling approach is the use of a Chinese restaurant process (CRP) to describe the selection of genomes to be sampled, and thus the relative abundances. The CRP component is desirable for fitting abundances to reads that may map ambiguously to multiple targets, because it naturally leads to sparse solutions that select the best representative from each set of nearly equivalent genomes.
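The sparsity property attributed to the Chinese restaurant process can be seen in a generic simulation of the process itself. The sketch below is a textbook CRP sampler, not the report's algorithm: the function name, the concentration parameter `alpha`, and the reading of "tables" as candidate genomes are assumptions made for illustration only.

```python
import random


def chinese_restaurant_process(n_customers: int, alpha: float, seed=None) -> list[int]:
    """Seat customers one at a time and return the resulting table sizes.

    A customer joins an existing table with probability proportional to its
    current size, or opens a new table with probability proportional to alpha
    (alpha must be positive).
    """
    rng = random.Random(seed)
    table_sizes: list[int] = []
    for _ in range(n_customers):
        # The last weight stands for "open a new table".
        weights = table_sizes + [alpha]
        choice = rng.choices(range(len(weights)), weights=weights)[0]
        if choice == len(table_sizes):
            table_sizes.append(1)
        else:
            table_sizes[choice] += 1
    return table_sizes


sizes = chinese_restaurant_process(n_customers=1000, alpha=1.0, seed=0)
print(len(sizes), sorted(sizes, reverse=True)[:3])
```

With a small `alpha`, most customers end up at a handful of large tables, which is the "rich get richer" behaviour that favours a few representatives over many nearly equivalent alternatives.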
Yu, Teng-Lang; Lin, Hung-Du; Weng, Ching-Feng
2014-01-01
Aim To understand the phylogeographic patterns of genetic variation in anurans on Taiwan Island, this study examined (1) the existence of geological barriers (Central Mountain Ranges, CMRs); and (2) the genetic variation of Bufo bankorensis, using mtDNA sequences, among populations located in different regions of Taiwan that are characterized by different climates and exist under extreme conditions, compared with available sequences of the related species B. gargarizans from mainland China. Methodology/Principal Findings Phylogenetic analyses of the mitochondrial DNA (mtDNA) D-loop dataset (348 bp) recovered a close relationship between B. bankorensis and B. gargarizans and identified three distinct lineages. Furthermore, the network of mtDNA D-loop sequences (564 bp) amplified from Taiwan Island (279 individuals, 27 localities) indicated three divergent clades within B. bankorensis (Clades W, E and S) corresponding to geography, thereby verifying the importance of the CMRs and the Kaoping River drainage as major biogeographic barriers. Mismatch distribution analysis, neutrality tests and Bayesian skyline plots revealed that a significant population expansion occurred for the total population and Clade W, with horizons dated to approximately 0.08 and 0.07 Mya, respectively. These results suggest that the population expansion of the Taiwan Island species B. bankorensis might have resulted from the release of available habitat in post-glacial periods, with the genetic variation in mtDNA reflecting habitat selection, subsequent population dispersal, and co-distribution among clades. Conclusions The multiple origins (different clades) of B. bankorensis mtDNA sequences are shown for the first time in this study. The divergent genetic clades found within B. bankorensis could reflect independent colonization by previously diverged lineages, implying that B. bankorensis originated from B. gargarizans of mainland China and then dispersed, followed by isolation within Taiwan Island.
The highly divergent W and E clades of B. bankorensis imply that the CMRs serve as a genetic barrier and separate the whole island into western and eastern phylogroups. PMID:24853679
2011-01-01
Background DNA target enrichment by micro-array capture combined with high throughput sequencing technologies provides the possibility to obtain large amounts of sequence data (e.g. whole mitochondrial DNA genomes) from multiple individuals at relatively low costs. Previously, whole mitochondrial genome data for domestic horses (Equus caballus) were limited to only a few specimens and only short parts of the mtDNA genome (especially the hypervariable region) were investigated for larger sample sets. Results In this study we investigated whole mitochondrial genomes of 59 domestic horses from 44 breeds and a single Przewalski horse (Equus przewalski) using a recently described multiplex micro-array capture approach. We found 473 variable positions within the domestic horses, 292 of which are parsimony-informative, providing a well resolved phylogenetic tree. Our divergence time estimate suggests that the mitochondrial genomes of modern horse breeds shared a common ancestor around 93,000 years ago and no later than 38,000 years ago. A Bayesian skyline plot (BSP) reveals a significant population expansion beginning 6,000-8,000 years ago with an ongoing exponential growth until the present, similar to other domestic animal species. Our data further suggest that a large sample of wild horse diversity was incorporated into the domestic population; specifically, at least 46 of the mtDNA lineages observed in domestic horses (73%) already existed before the beginning of domestication about 5,000 years ago. Conclusions Our study provides a window into the maternal origins of extant domestic horses and confirms that modern domestic breeds present a wide sample of the mtDNA diversity found in ancestral, now extinct, wild horse populations. 
The data obtained allow us to detect a population expansion event coinciding with the beginning of domestication and to estimate both the minimum number of female horses incorporated into the domestic gene pool and the time depth of the domestic horse mtDNA gene pool. PMID:22082251
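The distinction drawn above between variable and parsimony-informative positions follows the usual definition: a site is parsimony-informative when at least two different character states each occur in at least two sequences. The sketch below counts both kinds of site in a toy alignment. The function names and the four short sequences are hypothetical, and the code is not the software used in the study.

```python
from collections import Counter


def is_parsimony_informative(column) -> bool:
    """True if at least two states each occur in at least two sequences."""
    counts = Counter(base for base in column if base in "ACGT")
    return sum(1 for n in counts.values() if n >= 2) >= 2


def count_sites(alignment: list[str]) -> tuple[int, int]:
    """Return (variable sites, parsimony-informative sites) for an alignment."""
    variable = informative = 0
    for column in zip(*alignment):
        states = {base for base in column if base in "ACGT"}
        if len(states) > 1:
            variable += 1
        if is_parsimony_informative(column):
            informative += 1
    return variable, informative


# Toy alignment: column 1 is constant, columns 2 and 3 are informative,
# column 4 is variable but carries only a singleton difference.
alignment = ["AAGT", "AAGC", "ATCT", "ATCT"]
print(count_sites(alignment))
```

Singleton differences, as in the last column of the toy alignment, add to the variable-site count but do not help discriminate between tree topologies under parsimony.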
The complete mitochondrial genomes of five Eimeria species infecting domestic rabbits.
Liu, Guo-Hua; Tian, Si-Qin; Cui, Ping; Fang, Su-Fang; Wang, Chun-Ren; Zhu, Xing-Quan
2015-12-01
Rabbit coccidiosis caused by members of the genus Eimeria can have an enormous economic impact worldwide, but the genetics, epidemiology and biology of these parasites remain poorly understood. In the present study, we sequenced and annotated the complete mitochondrial (mt) genomes of five Eimeria species that commonly infect domestic rabbits. The complete mt genomes of Eimeria intestinalis, Eimeria flavescens, Eimeria media, Eimeria vejdovskyi and Eimeria irresidua were 6261 bp, 6258 bp, 6168 bp, 6254 bp and 6259 bp in length, respectively. All of the mt genomes consist of 3 genes for proteins (cytb, cox1, and cox3), 14 gene fragments for the large subunit (LSU) rRNA and 11 gene fragments for the small subunit (SSU) rRNA, but no transfer RNA (tRNA) genes. The gene order of the mt genomes is similar to that of Plasmodium, but distinct from Haemosporida and Theileria. Phylogenetic analyses based on full nucleotide sequences using Bayesian analysis revealed that the monophyly of the Eimeria of rabbits was strongly supported by Bayesian posterior probabilities. These data provide novel mtDNA markers for studying the population genetics and molecular epidemiology of the Eimeria species, and should have implications for the molecular diagnosis, prevention and control of coccidiosis in rabbits. Copyright © 2015 Elsevier Inc. All rights reserved.
How many novel eukaryotic 'kingdoms'? Pitfalls and limitations of environmental DNA surveys
Berney, Cédric; Fahrni, José; Pawlowski, Jan
2004-01-01
Background Over the past few years, the use of molecular techniques to detect cultivation-independent, eukaryotic diversity has proven to be a powerful approach. Based on small-subunit ribosomal RNA (SSU rRNA) gene analyses, these studies have revealed the existence of an unexpected variety of new phylotypes. Some of them represent novel diversity in known eukaryotic groups, mainly stramenopiles and alveolates. Others do not seem to be related to any molecularly described lineage, and have been proposed to represent novel eukaryotic kingdoms. In order to review the evolutionary importance of this novel high-level eukaryotic diversity critically, and to test the potential technical and analytical pitfalls and limitations of eukaryotic environmental DNA surveys (EES), we analysed 484 environmental SSU rRNA gene sequences, including 81 new sequences from sediments of the small river, the Seymaz (Geneva, Switzerland). Results Based on a detailed screening of an exhaustive alignment of eukaryotic SSU rRNA gene sequences and the phylogenetic re-analysis of previously published environmental sequences using Bayesian methods, our results suggest that the number of novel higher-level taxa revealed by previously published EES was overestimated. Three main sources of errors are responsible for this situation: (1) the presence of undetected chimeric sequences; (2) the misplacement of several fast-evolving sequences; and (3) the incomplete sampling of described, but yet unsequenced eukaryotes. Additionally, EES give a biased view of the diversity present in a given biotope because of the difficult amplification of SSU rRNA genes in some taxonomic groups. Conclusions Environmental DNA surveys undoubtedly contribute to reveal many novel eukaryotic lineages, but there is no clear evidence for a spectacular increase of the diversity at the kingdom level. 
After re-analysis of previously published data, we found only five candidate lineages of possible novel high-level eukaryotic taxa, two of which comprise several phylotypes that were found independently in different studies. To ascertain their taxonomic status, however, the organisms themselves have now to be identified. PMID:15176975
Gruwell, Matthew E; Morse, Geoffrey E; Normark, Benjamin B
2007-07-01
Insects in the sap-sucking hemipteran suborder Sternorrhyncha typically harbor maternally transmitted bacteria housed in a specialized organ, the bacteriome. In three of the four superfamilies of Sternorrhyncha (Aphidoidea, Aleyrodoidea, Psylloidea), the bacteriome-associated (primary) bacterial lineage is from the class Gammaproteobacteria (phylum Proteobacteria). The fourth superfamily, Coccoidea (scale insects), has a diverse array of bacterial endosymbionts whose affinities are largely unexplored. We have amplified fragments of two bacterial ribosomal genes from each of 68 species of armored scale insects (Diaspididae). In spite of initially using primers designed for Gammaproteobacteria, we consistently amplified sequences from a different bacterial phylum: Bacteroidetes. We use these sequences (16S and 23S, 2105 total base pairs), along with previously published sequences from the armored scale hosts (elongation factor 1alpha and 28S rDNA) to investigate phylogenetic congruence between the two clades. The Bayesian tree for the bacteria is roughly congruent with that of the hosts, with 67% of nodes identical. Partition homogeneity tests found no significant difference between the host and bacterial data sets. Of thirteen Shimodaira-Hasegawa tests, comparing the original Bayesian bacterial tree to bacterial trees with incongruent clades forced to match the host tree, 12 found no significant difference. A significant difference in topology was found only when the entire host tree was compared with the entire bacterial tree. For the bacterial data set, the treelengths of the most parsimonious host trees are only 1.8-2.4% longer than that of the most parsimonious bacterial trees. The high level of congruence between the topologies indicates that these Bacteroidetes are the primary endosymbionts of armored scale insects. 
To investigate the phylogenetic affinities of these endosymbionts, we aligned some of their 16S rDNA sequences with other known Bacteroidetes endosymbionts and with other similar sequences identified by BLAST searches. Although the endosymbionts of armored scales are only distantly related to the endosymbionts of the other sternorrhynchan insects, they are closely related to bacteria associated with eriococcid and margarodid scale insects, to cockroach and auchenorrhynchan endosymbionts (Blattabacterium and Sulcia), and to male-killing endosymbionts of ladybird beetles. We propose the name "Candidatus Uzinura diaspidicola" for the primary endosymbionts of armored scale insects.
Budde, K B; González-Martínez, S C; Hardy, O J; Heuertz, M
2013-07-01
Understanding the history of forests and their species' demographic responses to past disturbances is important for predicting impacts of future environmental changes. Tropical rainforests of the Guineo-Congolian region in Central Africa are believed to have survived the Pleistocene glacial periods in a few major refugia, essentially centred on mountainous regions close to the Atlantic Ocean. We tested this hypothesis by investigating the phylogeographic structure of a widespread, ancient rainforest tree species, Symphonia globulifera L. f. (Clusiaceae), using plastid DNA sequences (chloroplast DNA [cpDNA], psbA-trnH intergenic spacer) and nuclear microsatellites (simple sequence repeats, SSRs). SSRs identified four gene pools located in Benin, West Cameroon, South Cameroon and Gabon, and São Tomé. This structure was also apparent at cpDNA. Approximate Bayesian Computation detected recent bottlenecks approximately dated to the last glacial maximum in Benin, West Cameroon and São Tomé, and an older bottleneck in South Cameroon and Gabon, suggesting a genetic effect of Pleistocene cycles of forest contraction. CpDNA haplotype distribution indicated wide-ranging long-term persistence of S. globulifera both inside and outside of postulated forest refugia. Pollen flow was four times greater than that of seed in South Cameroon and Gabon, which probably enabled rapid population recovery after bottlenecks. Furthermore, our study suggested ecotypic differentiation (coastal or swamp vs terra firme) in S. globulifera. Comparison with other tree phylogeographic studies in Central Africa highlighted the relevance of species-specific responses to environmental change in forest trees. PMID:23572126
Lukoschek, Vimoksalehi; Scott Keogh, J; Avise, John C
2012-01-01
Evolutionary and biogeographic studies increasingly rely on calibrated molecular clocks to date key events. Although there has been significant recent progress in development of the techniques used for molecular dating, many issues remain. In particular, controversies abound over the appropriate use and placement of fossils for calibrating molecular clocks. Several methods have been proposed for evaluating candidate fossils; however, few studies have compared the results obtained by different approaches. Moreover, no previous study has incorporated the effects of nucleotide saturation from different data types in the evaluation of candidate fossils. In order to address these issues, we compared three approaches for evaluating fossil calibrations: the single-fossil cross-validation method of Near, Meylan, and Shaffer (2005. Assessing concordance of fossil calibration points in molecular clock studies: an example using turtles. Am. Nat. 165:137-146), the empirical fossil coverage method of Marshall (2008. A simple method for bracketing absolute divergence times on molecular phylogenies using multiple fossil calibration points. Am. Nat. 171:726-742), and the Bayesian multicalibration method of Sanders and Lee (2007. Evaluating molecular clock calibrations using Bayesian analyses with soft and hard bounds. Biol. Lett. 3:275-279) and explicitly incorporate the effects of data type (nuclear vs. mitochondrial DNA) for identifying the most reliable or congruent fossil calibrations. We used advanced (Caenophidian) snakes as a case study; however, our results are applicable to any taxonomic group with multiple candidate fossils, provided appropriate taxon sampling and sufficient molecular sequence data are available. We found that data type strongly influenced which fossil calibrations were identified as outliers, regardless of which method was used. 
Despite the use of complex partitioned models of sequence evolution and multiple calibrations throughout the tree, saturation severely compressed basal branch lengths obtained from mitochondrial DNA compared with nuclear DNA. The effects of mitochondrial saturation were not ameliorated by analyzing a combined nuclear and mitochondrial data set. Although removing the third codon positions from the mitochondrial coding regions did not ameliorate saturation effects in the single-fossil cross-validations, it did in the Bayesian multicalibration analyses. Saturation significantly influenced the fossils that were selected as most reliable for all three methods evaluated. Our findings highlight the need to critically evaluate the fossils selected by data with different rates of nucleotide substitution and how data with different evolutionary rates affect the results of each method for evaluating fossils. Our empirical evaluation demonstrates that the advantages of using multiple independent fossil calibrations significantly outweigh any disadvantages.
eDNAoccupancy: An R package for multi-scale occupancy modeling of environmental DNA data
Dorazio, Robert; Erickson, Richard A.
2017-01-01
In this article we describe eDNAoccupancy, an R package for fitting Bayesian, multi-scale occupancy models. These models are appropriate for occupancy surveys that include three, nested levels of sampling: primary sample units within a study area, secondary sample units collected from each primary unit, and replicates of each secondary sample unit. This design is commonly used in occupancy surveys of environmental DNA (eDNA). eDNAoccupancy allows users to specify and fit multi-scale occupancy models with or without covariates, to estimate posterior summaries of occurrence and detection probabilities, and to compare different models using Bayesian model-selection criteria. We illustrate these features by analyzing two published data sets: eDNA surveys of a fungal pathogen of amphibians and eDNA surveys of an endangered fish species.
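The three nested sampling levels described above imply a simple calculation for the chance of recording at least one detection at a primary unit. The sketch below uses the symbols psi (occurrence at the primary unit), theta (occurrence in a secondary sample, given occurrence at the unit) and p (detection in a single replicate, given occurrence in the sample). This notation is assumed here as the convention for multi-scale occupancy models. The function is hypothetical Python for illustration and is not part of the eDNAoccupancy R package.

```python
def detection_probability(psi: float, theta: float, p: float,
                          n_samples: int, n_replicates: int) -> float:
    """Probability of at least one positive replicate at a primary unit.

    psi   : probability the target occurs at the primary unit
    theta : probability it occurs in a secondary sample, given occurrence
    p     : probability a single replicate detects it, given it is in the sample
    """
    # A sample yields a detection if the target is in it and at least one
    # of its replicates is positive.
    p_sample = theta * (1 - (1 - p) ** n_replicates)
    # An occupied unit yields a detection if at least one sample does.
    p_unit_given_occupied = 1 - (1 - p_sample) ** n_samples
    return psi * p_unit_given_occupied


# Illustrative values only: 3 samples per unit, 4 replicates per sample.
print(round(detection_probability(0.6, 0.5, 0.3, 3, 4), 3))
```

A calculation of this kind shows how the numbers of secondary samples and replicates trade off against each other when a survey is designed.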
Javadi, Firouzeh; Tun, Ye Tun; Kawase, Makoto; Guan, Kaiyun; Yamaguchi, Hirofumi
2011-08-01
The subgenus Ceratotropis in the genus Vigna is widely distributed from the Himalayan highlands to South, Southeast and East Asia. However, the interspecific and geographical relationships of its members are poorly understood. This study investigates the phylogeny and biogeography of the subgenus Ceratotropis using chloroplast DNA sequence data. Sequence data from four intergenic spacer regions (petA-psbJ, psbD-trnT, trnT-trnE and trnT-trnL) of chloroplast DNA, alone and in combination, were analysed using Bayesian and parsimony methods. Divergence times for major clades were estimated with penalized likelihood. Character evolution was examined by means of parsimony optimization and MacClade. Parsimony and Bayesian phylogenetic analyses on the combined data demonstrated well-resolved species relationships in which 18 Vigna species were divided into two major geographical clades: the East Asia-Southeast Asian clade and the Indian subcontinent clade. Within these two clades, three well-supported eco-geographical groups, temperate and subtropical (the East Asia-Southeast Asian clade) and tropical (the Indian subcontinent clade), are recognized. The temperate group consists of V. minima, V. nepalensis and V. angularis. The subtropical group comprises the V. nakashimae-V. riukiuensis-V. minima subgroup and the V. hirtella-V. exilis-V. umbellata subgroup. The tropical group contains two subgroups: the V. trinervia-V. reflexo-pilosa-V. trilobata subgroup and the V. mungo-V. grandiflora subgroup. An evolutionary rate analysis estimated the divergence time between the East Asia-Southeast Asia clade and the Indian subcontinent clade as 3·62 ± 0·3 million years, and that between the temperate and subtropical groups as 2·0 ± 0·2 million years. The findings provide an improved understanding of the interspecific relationships, and ecological and geographical phylogenetic structure of the subgenus Ceratotropis. 
The Quaternary diversification of the subgenus Ceratotropis implicates its geographical dispersal in the south-eastern part of Asia, involving adaptation to climatic conditions after the collision of the Indian subcontinent with the Asian plate. The phylogenetic results indicate that epigeal germination is plesiomorphic, and that the germination type evolved independently multiple times in this subgenus, implying its limited taxonomic utility.
Goldsmith, Elizabeth W.; Renshaw, Benjamin; Clement, Christopher J.; Himschoot, Elizabeth A.; Hundertmark, Kris J.; Hueffer, Karsten
2015-01-01
For pathogens that infect multiple species, the distinction between reservoir hosts and spillover hosts is often difficult. In Alaska, three variants of the arctic rabies virus exist with distinct spatial distributions. We test the hypothesis that rabies virus variant distribution corresponds to the population structure of the primary rabies hosts in Alaska, arctic foxes (Vulpes lagopus) and red foxes (V. vulpes), in order to possibly distinguish reservoir and spillover hosts. We used mitochondrial DNA (mtDNA) sequence and nine microsatellites to assess population structure in those two species. mtDNA structure did not correspond to rabies virus variant structure in either species. Microsatellite analyses gave varying results. Bayesian clustering found 2 groups of arctic foxes in the coastal tundra region, but for red foxes it identified tundra and boreal types. Spatial Bayesian clustering and spatial principal components analysis identified 3 and 4 groups of arctic foxes, respectively, closely matching the distribution of rabies virus variants in the state. Red foxes, conversely, showed eight clusters comprising 2 regions (boreal and tundra) with much admixture. These results run contrary to previous beliefs that arctic fox show no fine-scale spatial population structure. While we cannot rule out that the red fox is part of the maintenance host community for rabies in Alaska, the distribution of virus variants appears to be driven primarily by the arctic fox. Therefore, we show that host population genetics can be utilized to distinguish between maintenance and spillover hosts when used in conjunction with other approaches. PMID:26661691
Goldsmith, Elizabeth W; Renshaw, Benjamin; Clement, Christopher J; Himschoot, Elizabeth A; Hundertmark, Kris J; Hueffer, Karsten
2016-02-01
For pathogens that infect multiple species, the distinction between reservoir hosts and spillover hosts is often difficult. In Alaska, three variants of the arctic rabies virus exist with distinct spatial distributions. We tested the hypothesis that rabies virus variant distribution corresponds to the population structure of the primary rabies hosts in Alaska, arctic foxes (Vulpes lagopus) and red foxes (Vulpes vulpes) to possibly distinguish reservoir and spillover hosts. We used mitochondrial DNA (mtDNA) sequence and nine microsatellites to assess population structure in those two species. mtDNA structure did not correspond to rabies virus variant structure in either species. Microsatellite analyses gave varying results. Bayesian clustering found two groups of arctic foxes in the coastal tundra region, but for red foxes it identified tundra and boreal types. Spatial Bayesian clustering and spatial principal components analysis identified 3 and 4 groups of arctic foxes, respectively, closely matching the distribution of rabies virus variants in the state. Red foxes, conversely, showed eight clusters comprising two regions (boreal and tundra) with much admixture. These results run contrary to previous beliefs that arctic fox show no fine-scale spatial population structure. While we cannot rule out that the red fox is part of the maintenance host community for rabies in Alaska, the distribution of virus variants appears to be driven primarily by the arctic fox. Therefore, we show that host population genetics can be utilized to distinguish between maintenance and spillover hosts when used in conjunction with other approaches. © 2015 John Wiley & Sons Ltd.
In real-time quantitative PCR studies using absolute plasmid DNA standards, a calibration curve is developed to estimate an unknown DNA concentration. However, potential differences in the amplification performance of plasmid DNA compared to genomic DNA standards are often ignore...
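The calibration-curve step mentioned above is commonly done by regressing the quantification cycle (Ct) on the base-10 logarithm of the standard concentration and inverting the fitted line for an unknown. The following is a minimal sketch under that assumption; the function names and the example values are hypothetical and are not taken from the study.

```python
import math


def fit_standard_curve(concentrations, ct_values):
    """Ordinary least-squares fit of Ct = slope * log10(concentration) + intercept."""
    xs = [math.log10(c) for c in concentrations]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_concentration(ct, slope, intercept):
    """Invert the fitted line to estimate an unknown concentration from its Ct."""
    return 10 ** ((ct - intercept) / slope)


def amplification_efficiency(slope):
    """Standard conversion of the curve slope to efficiency (1.0 means perfect doubling)."""
    return 10 ** (-1.0 / slope) - 1.0
```

Fitting separate curves for plasmid and genomic standards and comparing their slopes and intercepts is one simple way to look for the kind of difference in amplification performance the abstract raises.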
Dutra Vieira, Thainá; Pegoraro de Macedo, Marcia Raquel; Fedatto Bernardon, Fabiana; Müller, Gertrud
2017-10-01
The nematode Diplotriaena bargusinica is a bird air sac parasite, and its taxonomy is based mainly on morphological and morphometric characteristics. Increasing knowledge of genetic variability has spurred the use of DNA markers in conjunction with morphological data for inferring phylogenetic relationships in different taxa. Considering the potential of molecular biology in taxonomy, this study presents the morphological and molecular characterization of D. bargusinica, and establishes the phylogenetic position of the nematode in Spirurina. Twenty partial sequences of the 18S region of D. bargusinica rDNA were generated. Phylogenetic trees were obtained using Maximum Likelihood and Bayesian Inference methods, both of which yielded similar topologies. The group Diplotriaenoidea is monophyletic, and the topologies generated corroborate previous phylogenetic studies based on traditional and molecular taxonomy. This study is the first to generate molecular data associated with the morphology of the species. Copyright © 2017 Elsevier B.V. All rights reserved.
Chaves, Guilherme M; Terçarioli, Gisela R; Padovan, Ana Carolina B; Rosas, Robert C; Ferreira, Renata C; Melo, Analy S A; Colombo, Arnaldo L
2013-04-01
Candida rugosa is a yeast species that is emerging as a causative agent of invasive infection, particularly in Latin America. Recently, C. pseudorugosa was proposed as a new species closely related to C. rugosa. We evaluated in this investigation the genetic heterogeneity within the C. rugosa species complex. All clinical isolates used in this study were identified phenotypically as C. rugosa but were genotypically different from the C. rugosa type, ATCC 10571. RAPD marker analysis revealed less than 83% similarity between our clinical isolates and the C. rugosa type strain. The D1/D2 region sequences of our clinical isolates showed 98% identity with C. rugosa but only 94-95% identity with C. pseudorugosa. The ITS rDNA sequences of the Brazilian isolates showed 91% identity with the C. rugosa ATCC 10571 ITS sequence. Network and Bayesian analyses of ITS and housekeeping gene sequences separated our clinical isolates into different branches from C. rugosa type strain. These differences are sufficient to reassign our isolates to a distinct species, named C. mesorugosa.
Stenøien, H K; Shaw, A J; Stengrundet, K; Flatberg, K I
2011-01-01
It is commonly found that individual hybrid, polyploid species originate recurrently and that many polyploid species originated relatively recently. It has been previously hypothesized that the extremely rare allopolyploid peat moss Sphagnum troendelagicum has originated multiple times, possibly after the last glacial maximum in Scandinavia. This conclusion was based on low linkage disequilibrium in anonymous genetic markers within natural populations, in which sexual reproduction has never been observed. Here we employ microsatellite markers and chloroplast DNA (cpDNA)-encoded trnG sequence data to test hypotheses concerning the origin and evolution of this species. We find that S. tenellum is the maternal progenitor and S. balticum is the paternal progenitor of S. troendelagicum. Using various Bayesian approaches, we estimate that S. troendelagicum originated before the Holocene but not before c. 80 000 years ago (median expected time since speciation 40 000 years before present). The observed lack of complete linkage disequilibrium in the genome of this species suggests cryptic sexual reproduction and recombination. Several lines of evidence suggest multiple origins for S. troendelagicum, but a single origin is supported by approximate Bayesian computation analyses. We hypothesize that S. troendelagicum originated in a peat-dominated refugium before last glacial maximum, and subsequently immigrated to central Norway by means of spore flow during the last thousands of years. PMID:20717162
Schield, Drew R; Adams, Richard H; Card, Daren C; Corbin, Andrew B; Jezkova, Tereza; Hales, Nicole R; Meik, Jesse M; Perry, Blair W; Spencer, Carol L; Smith, Lydia L; García, Gustavo Campillo; Bouzid, Nassima M; Strickland, Jason L; Parkinson, Christopher L; Borja, Miguel; Castañeda-Gaytán, Gamaliel; Bryson, Robert W; Flores-Villela, Oscar A; Mackessy, Stephen P; Castoe, Todd A
2018-06-15
The Mojave rattlesnake (Crotalus scutulatus) inhabits deserts and arid grasslands of the western United States and Mexico. Despite considerable interest in its highly toxic venom and the recognition of two subspecies, no molecular studies have characterized range-wide genetic diversity and population structure or tested species limits within C. scutulatus. We used mitochondrial DNA and thousands of nuclear loci from double-digest restriction site associated DNA sequencing to infer population genetic structure throughout the range of C. scutulatus, and to evaluate divergence times and gene flow between populations. We find strong support for several divergent mitochondrial and nuclear clades of C. scutulatus, including splits coincident with two major phylogeographic barriers: the Continental Divide and the elevational increase associated with the Central Mexican Plateau. We apply Bayesian clustering, phylogenetic inference, and coalescent-based species delimitation to our nuclear genetic data to test hypotheses of population structure. We also performed demographic analyses to test hypotheses relating to population divergence and gene flow. Collectively, our results support the existence of four distinct lineages within C. scutulatus, and genetically defined populations do not correspond with currently recognized subspecies ranges. Finally, we use approximate Bayesian computation to test hypotheses of divergence among multiple rattlesnake species groups distributed across the Continental Divide, and find evidence for co-divergence at this boundary during the mid-Pleistocene. Copyright © 2018 Elsevier Inc. All rights reserved.
Johnston, Iain G; Burgstaller, Joerg P; Havlicek, Vitezslav; Kolbe, Thomas; Rülicke, Thomas; Brem, Gottfried; Poulton, Jo; Jones, Nick S
2015-01-01
Dangerous damage to mitochondrial DNA (mtDNA) can be ameliorated during mammalian development through a highly debated mechanism called the mtDNA bottleneck. Uncertainty surrounding this process limits our ability to address inherited mtDNA diseases. We produce a new, physically motivated, generalisable theoretical model for mtDNA populations during development, allowing the first statistical comparison of proposed bottleneck mechanisms. Using approximate Bayesian computation and mouse data, we find most statistical support for a combination of binomial partitioning of mtDNAs at cell divisions and random mtDNA turnover, meaning that the debated exact magnitude of mtDNA copy number depletion is flexible. New experimental measurements from a wild-derived mtDNA pairing in mice confirm the theoretical predictions of this model. We analytically solve a mathematical description of this mechanism, computing probabilities of mtDNA disease onset, efficacy of clinical sampling strategies, and effects of potential dynamic interventions, thus developing a quantitative and experimentally-supported stochastic theory of the bottleneck. DOI: http://dx.doi.org/10.7554/eLife.07464.001 PMID:26035426
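Binomial partitioning of mtDNAs at cell divisions, one of the two mechanisms named above, can be illustrated with a small simulation. This is a minimal sketch and not the authors' model: it assumes every mtDNA is duplicated before each division and then passes to the tracked daughter cell with probability 0.5, and it leaves out random turnover entirely. All function names are hypothetical.

```python
import random


def binomial(n, prob, rng):
    """Draw from Binomial(n, prob) by summing n Bernoulli trials."""
    return sum(1 for _ in range(n) if rng.random() < prob)


def divide(mutant, wild_type, rng):
    """Duplicate every mtDNA, then partition the copies binomially.

    Returns the (mutant, wild_type) counts inherited by one daughter cell.
    """
    return binomial(2 * mutant, 0.5, rng), binomial(2 * wild_type, 0.5, rng)


def heteroplasmy_after(divisions, mutant, wild_type, seed=0):
    """Follow one lineage through several divisions and return its mutant fraction."""
    rng = random.Random(seed)
    for _ in range(divisions):
        mutant, wild_type = divide(mutant, wild_type, rng)
    total = mutant + wild_type
    return mutant / total if total else float("nan")
```

Running `heteroplasmy_after` with many different seeds shows how partitioning noise alone spreads the mutant fraction across lineages, and the spread grows as the starting copy number is reduced.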
Boykin, L M; Shatters, R G; Hall, D G; Burns, R E; Franqui, R A
2006-10-01
Anastrepha suspensa (Loew) is an economically important pest, restricted to the Greater Antilles and southern Florida. It infests a wide variety of hosts and is of quarantine importance in citrus, a multi-million dollar industry in Florida. The observed recent increase in citrus infested with A. suspensa in Florida has raised questions regarding host-specificity of certain populations and genetic diversity of the pest throughout its geographical distribution. Cytochrome oxidase I (COI) DNA sequence data were used to characterize the genetic diversity of A. suspensa from Florida and Caribbean populations reared from different host plants. Maximum likelihood and Bayesian phylogenetic methods were used to analyse COI data. Sequence variation among mitochondrial COI genes from 107 A. suspensa samples collected throughout Florida and the Caribbean ranged between 0 and 10% and placed all A. suspensa in a monophyletic clade sister to a Central American group of the A. fraterculus paraphyletic species complex. The most likely tree of the COI locus indicated that COI sequence variation was too low to provide resolution at the subspecies level; therefore, monophyletic groups based on host-plant use, geography (Florida, Jamaica, Cayman Islands, Puerto Rico or Dominican Republic) or population sampled are not supported. This result indicates either that no population segregation has occurred based on these biological or geographical distinctions and that this is a generalist, polyphagous invasive genotype, or that, if populations are distinct, the segregation event was more recent than can be distinguished based on COI sequence variation.
Vidal-Martínez, Victor M.
2017-01-01
The phylogenetic position of three taxa from two trematode genera, belonging to the subfamily Acanthostominae (Opisthorchioidea: Cryptogonimidae), were analysed using partial 28S ribosomal DNA (Domains 1–2) and internal transcribed spacers (ITS1–5.8S–ITS2). Bayesian inference and Maximum likelihood analyses of combined 28S rDNA and ITS1 + 5.8S + ITS2 sequences indicated the monophyly of the genus Acanthostomum (A. cf. americanum and A. burminis) and paraphyly of the Acanthostominae. These phylogenetic relationships were consistent in analyses of 28S alone and concatenated 28S + ITS1 + 5.8S + ITS2 sequences analyses. Based on molecular phylogenetic analyses, the subfamily Acanthostominae is therefore a paraphyletic taxon, in contrast with previous classifications based on morphological data. Phylogenetic patterns of host specificity inferred from adult stages of other cryptogonimid taxa are also well supported. However, analyses using additional genera and species are necessary to support the phylogenetic inferences from this study. Our molecular phylogenetic reconstruction linked two larval stages of A. cf. americanum cercariae and metacercariae. Here, we present the evolutionary and ecological implications of parasitic infections in freshwater and brackish environments. PMID:29250471
Martínez-Aquino, Andrés; Vidal-Martínez, Victor M; Aguirre-Macedo, M Leopoldina
2017-01-01
The phylogenetic position of three taxa from two trematode genera, belonging to the subfamily Acanthostominae (Opisthorchioidea: Cryptogonimidae), were analysed using partial 28S ribosomal DNA (Domains 1-2) and internal transcribed spacers (ITS1-5.8S-ITS2). Bayesian inference and Maximum likelihood analyses of combined 28S rDNA and ITS1 + 5.8S + ITS2 sequences indicated the monophyly of the genus Acanthostomum (A. cf. americanum and A. burminis) and paraphyly of the Acanthostominae. These phylogenetic relationships were consistent in analyses of 28S alone and concatenated 28S + ITS1 + 5.8S + ITS2 sequences analyses. Based on molecular phylogenetic analyses, the subfamily Acanthostominae is therefore a paraphyletic taxon, in contrast with previous classifications based on morphological data. Phylogenetic patterns of host specificity inferred from adult stages of other cryptogonimid taxa are also well supported. However, analyses using additional genera and species are necessary to support the phylogenetic inferences from this study. Our molecular phylogenetic reconstruction linked two larval stages of A. cf. americanum cercariae and metacercariae. Here, we present the evolutionary and ecological implications of parasitic infections in freshwater and brackish environments.
Gleeson, Ricky; Adlard, Robert
2011-10-01
Three new species of Ceratomyxa Thélohan, 1892 are described from the gall-bladders of two species of carcharhinid sharks collected off Heron and Lizard Islands on the Great Barrier Reef, Australia. Ceratomyxa carcharhini n. sp. and C. melanopteri n. sp. are described from Carcharhinus melanopterus (Quoy & Gaimard), and Ceratomyxa negaprioni n. sp. is described from Negaprion acutidens (Rüppell). These species are the first ceratomyxids reported from Australian elasmobranchs, and this is the first paper to formally characterise a novel Ceratomyxa species from an elasmobranch using both morphology and small subunit ribosomal DNA sequence data. Maximum parsimony and Bayesian inference analyses of the SSU rDNA dataset revealed that ceratomyxids from elasmobranchs form a sister clade to that of species infecting marine teleosts and Palliatus indecorus Schulman, Kovaleva & Dubina, 1979. Furthermore, the only sequenced freshwater ceratomyxid, Ceratomyxa shasta Noble, 1950, fell outside the overall marine ceratomyxid clade. These data show that Ceratomyxa, as currently recognised, is polyphyletic and ignites discussion on whether Ceratomyxa should be split. However, further taxon sampling, particularly in freshwater systems, is required to establish relevant biological divisions within the genus.
Soares, André E R; Novak, Ben J; Haile, James; Heupink, Tim H; Fjeldså, Jon; Gilbert, M Thomas P; Poinar, Hendrik; Church, George M; Shapiro, Beth
2016-10-26
Pigeons and doves (Columbiformes) are one of the oldest and most diverse extant lineages of birds. However, the nature and timing of the group's evolutionary radiation remains poorly resolved, despite recent advances in DNA sequencing and assembly and the growing database of pigeon mitochondrial genomes. One challenge has been to generate comparative data from the large number of extinct pigeon lineages, some of which are morphologically unique and therefore difficult to place in a phylogenetic context. We used ancient DNA and next generation sequencing approaches to assemble complete mitochondrial genomes for eleven pigeons, including the extinct Ryukyu wood pigeon (Columba jouyi), the thick-billed ground dove (Alopecoenas salamonis), the spotted green pigeon (Caloenas maculata), the Rodrigues solitaire (Pezophaps solitaria), and the dodo (Raphus cucullatus). We used a Bayesian approach to infer the evolutionary relationships among 24 species of living and extinct pigeons and doves. Our analyses indicate that the earliest radiation of the Columbidae crown group most likely occurred during the Oligocene, with continued divergence of major clades into the Miocene, suggesting that diversification within the Columbidae occurred more recently than has been reported previously.
Andersen, Heidi L; Ekman, Stefan
2005-01-01
The phylogeny of the family Micareaceae and the genus Micarea was studied using mitochondrial small subunit ribosomal DNA sequences. Phylogenetic reconstructions were performed using Bayesian MCMC tree sampling and a maximum likelihood approach. The Micareaceae in its current sense is highly heterogeneous, and Helocarpon, Psilolechia, and Scutula, all thought to be close relatives of Micarea, are shown to be only distantly related. The genus Micarea is paraphyletic unless the entire Pilocarpaceae and Ectolechiaceae are included, as also indicated by an expected likelihood weights test. It is suggested that the Micareaceae is reduced to synonymy with the Pilocarpaceae, which also includes the Ectolechiaceae, and that Micarea may have to be divided into a series of smaller genera in the future. Micarea species with a 'non-micareoid' photobiont group with Psora and the Ramalinaceae, whereas Micarea intrusa appears to belong in Scoliciosporum. Three species fall inside the paraphyletic Micarea: Szczawinskia tsugae, Catillaria contristans, and Fellhaneropsis vezdae. Tropical foliicolous taxa are nested within groups of mainly temperate and arctic-alpine distribution. A 'micareoid' photobiont appears to be plesiomorphic in the Pilocarpaceae but has been lost a few times.
Wang, Baosheng; Khalili Mahani, Marjan; Ng, Wei Lun; Kusumi, Junko; Phi, Hai Hong; Inomata, Nobuyuki; Wang, Xiao-Ru; Szmidt, Alfred E
2014-01-01
Pinus krempfii Lecomte is a morphologically and ecologically unique pine, endemic to Vietnam. It is regarded as a vulnerable species with distribution limited to just two provinces: Khanh Hoa and Lam Dong. Although a few phylogenetic studies have included this species, almost nothing is known about its genetic features. In particular, there are no studies addressing the levels and patterns of genetic variation in natural populations of P. krempfii. In this study, we sampled 57 individuals from six natural populations of P. krempfii and analyzed their sequence variation in ten nuclear gene regions (approximately 9 kb) and 14 mitochondrial (mt) DNA regions (approximately 10 kb). We also analyzed variation at seven chloroplast (cp) microsatellite (SSR) loci. We found very low haplotype and nucleotide diversity at nuclear loci compared with other pine species. Furthermore, all investigated populations were monomorphic across all mitochondrial DNA (mtDNA) regions included in our study, which are polymorphic in other pine species. Population differentiation at nuclear loci was low (5.2%) but significant. However, structure analysis of nuclear loci did not detect genetically differentiated groups of populations. Approximate Bayesian computation (ABC) using nuclear sequence data and mismatch distribution analysis for cpSSR loci suggested recent expansion of the species. The implications of these findings for the management and conservation of P. krempfii genetic resources are discussed. PMID:25360263
Atkinson, Quentin D; Gray, Russell D; Drummond, Alexei J
2008-02-01
The relative timing and size of regional human population growth following our expansion from Africa remain unknown. Human mitochondrial DNA (mtDNA) diversity carries a legacy of our population history. Given a set of sequences, we can use coalescent theory to estimate past population size through time and draw inferences about human population history. However, recent work has challenged the validity of using mtDNA diversity to infer species population sizes. Here we use Bayesian coalescent inference methods, together with a global data set of 357 human mtDNA coding-region sequences, to infer human population sizes through time across 8 major geographic regions. Our estimates of relative population sizes show remarkable concordance with the contemporary regional distribution of humans across Africa, Eurasia, and the Americas, indicating that mtDNA diversity is a good predictor of population size in humans. Plots of population size through time show slow growth in sub-Saharan Africa beginning 143-193 kya, followed by a rapid expansion into Eurasia after the emergence of the first non-African mtDNA lineages 50-70 kya. Outside Africa, the earliest and fastest growth is inferred in Southern Asia approximately 52 kya, followed by a succession of growth phases in Northern and Central Asia (approximately 49 kya), Australia (approximately 48 kya), Europe (approximately 42 kya), the Middle East and North Africa (approximately 40 kya), New Guinea (approximately 39 kya), the Americas (approximately 18 kya), and a second expansion in Europe (approximately 10-15 kya). Comparisons of relative regional population sizes through time suggest that between approximately 45 and 20 kya most of humanity lived in Southern Asia. These findings not only support the use of mtDNA data for estimating human population size but also provide a unique picture of human prehistory and demonstrate the importance of Southern Asia to our recent evolutionary past.
Makowsky, Robert; Cox, Christian L; Roelke, Corey; Chippindale, Paul T
2010-11-01
Determining the appropriate gene for phylogeny reconstruction can be a difficult process. Rapidly evolving genes tend to resolve recent relationships, but suffer from alignment issues and increased homoplasy among distantly related species. Conversely, slowly evolving genes generally perform best for deeper relationships, but lack sufficient variation to resolve recent relationships. We determine the relationship between sequence divergence and Bayesian phylogenetic reconstruction ability using both natural and simulated datasets. The natural data are based on 28 well-supported relationships within the subphylum Vertebrata. Sequences of 12 genes were acquired and Bayesian analyses were used to determine phylogenetic support for correct relationships. Simulated datasets were designed to determine whether an optimal range of sequence divergence exists across extreme phylogenetic conditions. Across all genes we found that an optimal range of divergence for resolving the correct relationships does exist, although this level of divergence expectedly depends on the distance metric. Simulated datasets show that an optimal range of sequence divergence exists across diverse topologies and models of evolution. We determine that a simple to measure property of genetic sequences (genetic distance) is related to phylogenic reconstruction ability in Bayesian analyses. This information should be useful for selecting the most informative gene to resolve any relationships, especially those that are difficult to resolve, as well as minimizing both cost and confounding information during project design. Copyright © 2010. Published by Elsevier Inc.
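The abstract notes that the optimal level of divergence depends on the distance metric. Two standard metrics make the point: the uncorrected p-distance and the Jukes-Cantor corrected distance, d = -(3/4) ln(1 - 4p/3). The sketch below is illustrative only; it assumes pre-aligned, equal-length sequences without gaps, and the abstract does not state which metrics the study used.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    differences = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return differences / len(seq_a)


def jukes_cantor_distance(seq_a, seq_b):
    """Jukes-Cantor correction of the p-distance for multiple hits at a site."""
    p = p_distance(seq_a, seq_b)
    if p >= 0.75:
        # The correction is undefined once sequences look saturated.
        raise ValueError("p-distance too large for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)
```

The corrected distance is always at least as large as the p-distance, and the gap widens as divergence increases, so a divergence range that looks optimal under one metric maps to a different numeric range under the other.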
2012-01-01
Background The Nymphaeales (water lily and relatives) lineage has diverged as the second branch of basal angiosperms and comprises two families: Cabombaceae and Nymphaeaceae. The classification of Nymphaeales and phylogeny within the flowering plants are quite intriguing as several systems (Thorne system, Dahlgren system, Cronquist system, Takhtajan system and APG III system (Angiosperm Phylogeny Group III system)) have attempted to redefine the Nymphaeales taxonomy. There are also fossil records consisting especially of seeds, pollen, stems, leaves and flowers as early as the lower Cretaceous. Here we present an in silico study of the order Nymphaeales taking maturaseK (matK) and internal transcribed spacer (ITS2) as biomarkers for phylogeny reconstruction (using character-based methods and Bayesian approach) and identification of motifs for DNA barcoding. Results The Maximum Likelihood (ML) and Bayesian approach yielded congruent fully resolved and well-supported trees using a concatenated (ITS2+ matK) supermatrix aligned dataset. The taxon sampling corroborates the monophyly of Cabombaceae. Nuphar emerges as a monophyletic clade in the family Nymphaeaceae while there are slight discrepancies in the monophyletic nature of the genus Nymphaea owing to Victoria-Euryale and Ondinea grouping in the same node of Nymphaeaceae. ITS2 secondary structure alignment corroborates the primary sequence analysis. Hydatellaceae emerged as a sister clade to Nymphaeaceae and had a basal lineage amongst the water lily clades. Species from Cycas and Ginkgo were taken as outgroups and were rooted in the overall tree topology from various methods. Conclusions MatK genes are fast evolving highly variant regions of plant chloroplast DNA that can serve as potential biomarkers for DNA barcoding and also in generating primers for angiosperms with identification of unique motif regions. 
We have reported unique genus specific motif regions in the order Nymphaeales from the matK dataset which can be further validated for barcoding and designing of PCR primers. Our analysis using a novel approach of sequence-structure alignment and phylogenetic reconstruction using molecular morphometrics is congruent with the current placement of Hydatellaceae within the early-divergent angiosperm order Nymphaeales. The results underscore the fact that more diverse genera, if not fully resolved to be monophyletic, should be represented by all major lineages. PMID:23282079
Bayesian Correlation Analysis for Sequence Count Data
Lau, Nelson; Perkins, Theodore J.
2016-01-01
Evaluating the similarity of different measured variables is a fundamental task of statistics, and a key part of many bioinformatics algorithms. Here we propose a Bayesian scheme for estimating the correlation between different entities’ measurements based on high-throughput sequencing data. These entities could be different genes or miRNAs whose expression is measured by RNA-seq, different transcription factors or histone marks whose expression is measured by ChIP-seq, or even combinations of different types of entities. Our Bayesian formulation accounts for both measured signal levels and uncertainty in those levels, due to varying sequencing depth in different experiments and to varying absolute levels of individual entities, both of which affect the precision of the measurements. In comparison with a traditional Pearson correlation analysis, we show that our Bayesian correlation analysis retains high correlations when measurement confidence is high, but suppresses correlations when measurement confidence is low—especially for entities with low signal levels. In addition, we consider the influence of priors on the Bayesian correlation estimate. Perhaps surprisingly, we show that naive, uniform priors on entities’ signal levels can lead to highly biased correlation estimates, particularly when different experiments have widely varying sequencing depths. However, we propose two alternative priors that provably mitigate this problem. We also prove that, like traditional Pearson correlation, our Bayesian correlation calculation constitutes a kernel in the machine learning sense, and thus can be used as a similarity measure in any kernel-based machine learning algorithm. We demonstrate our approach on two RNA-seq datasets and one miRNA-seq dataset. PMID:27701449
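The idea that uncertainty in measured levels should pull a correlation toward zero can be sketched as follows. This is one possible realization under stated assumptions, not necessarily the authors' exact formulation: each entity's level in an experiment gets a Beta posterior from its read count and the sequencing depth, covariance is taken between posterior means, and each variance adds the average posterior variance. The function names and the default `alpha = beta = 1` prior are assumptions for illustration; the abstract itself warns that uniform priors can bias estimates when depths vary widely, so the prior parameters are left adjustable.

```python
import math


def posterior_moments(count, depth, alpha, beta):
    """Mean and variance of a Beta(alpha + count, beta + depth - count) posterior."""
    total = depth + alpha + beta
    mean = (count + alpha) / total
    variance = mean * (1.0 - mean) / (total + 1.0)
    return mean, variance


def bayesian_correlation(counts_a, counts_b, depths, alpha=1.0, beta=1.0):
    """Correlation between two entities that accounts for posterior uncertainty."""
    moments_a = [posterior_moments(c, n, alpha, beta) for c, n in zip(counts_a, depths)]
    moments_b = [posterior_moments(c, n, alpha, beta) for c, n in zip(counts_b, depths)]
    k = len(depths)
    mean_a = sum(m for m, _ in moments_a) / k
    mean_b = sum(m for m, _ in moments_b) / k
    # Covariance uses posterior means only (levels assumed independent given the experiment).
    cov = sum((ma - mean_a) * (mb - mean_b)
              for (ma, _), (mb, _) in zip(moments_a, moments_b)) / k
    # Each variance is spread of the means plus the average posterior variance.
    var_a = sum((m - mean_a) ** 2 + v for m, v in moments_a) / k
    var_b = sum((m - mean_b) ** 2 + v for m, v in moments_b) / k
    return cov / math.sqrt(var_a * var_b)
```

With identical proportions, shallow experiments (a few reads out of ten) give a much smaller value than deep ones (thousands of reads out of ten thousand), which mirrors the suppression of low-confidence correlations described in the abstract.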
A polyphasic taxonomic approach in isolated strains of Cyanobacteria from thermal springs of Greece.
Bravakos, Panos; Kotoulas, Georgios; Skaraki, Katerina; Pantazidou, Adriani; Economou-Amilli, Athena
2016-05-01
Strains of Cyanobacteria isolated from mats of 9 thermal springs of Greece have been studied for their taxonomic evaluation. A polyphasic taxonomic approach was employed which included: morphological observations by light microscopy and scanning electron microscopy, maximum parsimony, maximum likelihood and Bayesian analysis of 16S rDNA sequences, secondary structural comparisons of 16S-23S rRNA Internal Transcribed Spacer sequences, and finally environmental data. The 17 cyanobacterial isolates formed a diverse group that contained filamentous, coccoid and heterocytous strains. These included representatives of the polyphyletic genera of Synechococcus and Phormidium, and the orders Oscillatoriales, Spirulinales, Chroococcales and Nostocales. After analysis, at least 6 new taxa at the genus level provide new evidence in the taxonomy of Cyanobacteria and highlight the abundant diversity of thermal spring environments with many potential endemic species or ecotypes. Copyright © 2016 Elsevier Inc. All rights reserved.
Conn, Jan E.; Moreno, Marta; Saavedra, Marlon; Bickersmith, Sara A.; Knoll, Elisabeth; Fernandez, Roberto; Vera, Hubert; Burrus, Roxanne G.; Lescano, Andres G.; Sanchez, Juan Francisco; Rivera, Esteban; Vinetz, Joseph M.
2013-01-01
Anopheline specimens were collected in 2011 by human landing catch, Shannon and CDC traps from the malaria endemic localities of Santa Rosa and San Pedro in Madre de Dios Department, Peru. Most specimens were either Anopheles (Nyssorhynchus) benarrochi B or An. (Nys.) rangeli, confirmed by polymerase chain reaction-restriction fragment length polymorphism-internal transcribed spacer 2 (PCR-RFLP-ITS2) and, for selected individuals, ITS2 sequences. A few specimens from Lupuna, Loreto Department, northern Amazonian Peru, were also identified as An. benarrochi B. A statistical parsimony network using ITS2 sequences confirmed that all Peruvian An. benarrochi B analyzed were identical to those in GenBank from Putumayo, southern Colombia. Sequences of the mtDNA COI BOLD region of specimens from all three Peruvian localities were connected using a statistical parsimony network, although there were multiple mutation steps between northern and southern Peruvian sequences. A Bayesian inference of concatenated Peruvian sequences of ITS2+COI detected a single clade with very high support for all An. benarrochi B except one individual from Lupuna that was excluded. No samples were positive for Plasmodium by CytB-PCR. PMID:23243107
Salvi, Daniele; Macali, Armando; Mariottini, Paolo
2014-01-01
The bivalve family Ostreidae has a worldwide distribution and includes species of high economic importance. Phylogenetics and systematics of oysters based on morphology have proved difficult because of their high phenotypic plasticity. In this study we explore the phylogenetic information of the DNA sequence and secondary structure of the nuclear, fast-evolving, ITS2 rRNA and the mitochondrial 16S rRNA genes from the Ostreidae and we implemented a multi-locus framework based on four loci for oyster phylogenetics and systematics. Sequence-structure rRNA models aided sequence alignment and improved accuracy and nodal support of phylogenetic trees. In agreement with previous molecular studies, our phylogenetic results indicate that none of the currently recognized subfamilies, Crassostreinae, Ostreinae, and Lophinae, is monophyletic. Single gene trees based on Maximum likelihood (ML) and Bayesian (BA) methods and on sequence-structure ML were congruent with multilocus trees based on concatenated (ML and BA) and coalescent-based (BA) approaches and consistently supported three main clades: (i) Crassostrea, (ii) Saccostrea, and (iii) an Ostreinae-Lophinae lineage. Therefore, the subfamily Crassostreinae (including Crassostrea), Saccostreinae subfam. nov. (including Saccostrea and tentatively Striostrea) and Ostreinae (including Ostreinae and Lophinae taxa) are recognized. Based on phylogenetic and biogeographical evidence the Asian species of Crassostrea from the Pacific Ocean are assigned to Magallana gen. nov., whereas an integrative taxonomic revision is required for the genera Ostrea and Dendostrea. This study pointed out the suitability of the ITS2 marker for DNA barcoding of oysters and the relevance of using sequence-structure rRNA models and features of the ITS2 folding in molecular phylogenetics and taxonomy. The multilocus approach allowed inferring a robust phylogeny of Ostreidae providing a broad molecular perspective on their systematics. PMID:25250663
Xu, Shengyong; Song, Na; Lu, Zhichuang; Wang, Jun; Cai, Shanshan; Gao, Tianxiang
2014-06-01
Scaly hair-fin anchovy (Setipinna tenuifilis) is a small, pelagic and economically important species that is widely distributed in Chinese coastal waters. However, resources of S. tenuifilis have been reduced due to overfishing. For better fishery management, it is necessary to understand the biogeographic pattern of S. tenuifilis. Genetic analyses were conducted to detect population genetic variation. A total of 153 individuals from 7 locations (Dongying, Yantai, Qingdao, Nantong, Wenzhou, Xiamen and Beibu Bay) were sequenced at the 5' end of the mtDNA control region. A 39-bp tandem repeat sequence was found at the 5' end of the segment, and polymorphism of the tandem repeat was detected among the 7 populations. Both mismatch distribution analysis and neutrality tests showed that S. tenuifilis had experienced a recent population expansion. The topologies of the neighbor-joining tree and the Bayesian evolutionary tree showed no significant genealogical branches or clusters of samples corresponding to sampling locality. Hierarchical analysis of molecular variance and conventional pairwise population Fst values at the group hierarchical level implied that there might be genetic divergence between a southern group (populations WZ, XM and BB) and a northern group (populations DY, YT, QD and NT). We concluded that there might be three different fishery management groups of S. tenuifilis and that the late Pleistocene glacial event might have had a crucial effect on the present-day demography of S. tenuifilis in this region.
Jonniaux, Pierre; Kumazawa, Yoshinori
2008-01-15
Mitochondrial DNA sequences of approximately 2.3 kbp including the complete NADH dehydrogenase subunit 2 gene and its flanking genes, as well as parts of 12S and 16S rRNA genes were determined from major species of the eyelid gecko family Eublepharidae sensu [Kluge, A.G. 1987. Cladistic relationships in the Gekkonoidea (Squamata, Sauria). Misc. Publ. Mus. Zool. Univ. Michigan 173, 1-54.]. In contrast to previous morphological studies, phylogenetic analyses based on these sequences supported that Eublepharidae and Gekkonidae form a sister group with Pygopodidae, raising the possibility of homoplasious character change in some key features of geckos, such as reduction of movable eyelids and innovation of climbing toe pads. The phylogenetic analyses also provided a well-resolved tree for relationships between the eublepharid species. The Bayesian estimation of divergence times without assuming the molecular clock suggested the Jurassic divergence of Eublepharidae from Gekkonidae and radiations of most eublepharid genera around the Cretaceous. These dating results appeared to be robust against some conditional changes for time estimation, such as gene regions used, taxon representation, and data partitioning. Taken together with geological evidence, these results support the vicariant divergence of Eublepharidae and Gekkonidae by the breakup of Pangea into Laurasia and Gondwanaland, and recent dispersal of two African eublepharid genera from Eurasia to Africa after these landmasses were connected in the Early Miocene.
2012-01-01
Background: ChIP-seq provides new opportunities to study allele-specific protein-DNA binding (ASB). However, detecting allelic imbalance from a single ChIP-seq dataset often has low statistical power since only sequence reads mapped to heterozygote SNPs are informative for discriminating two alleles. Results: We develop a new method, iASeq, to address this issue by jointly analyzing multiple ChIP-seq datasets. iASeq uses a Bayesian hierarchical mixture model to learn correlation patterns of allele-specificity among multiple proteins. Using the discovered correlation patterns, the model allows one to borrow information across datasets to improve detection of allelic imbalance. Application of iASeq to 77 ChIP-seq samples from 40 ENCODE datasets and 1 genomic DNA sample in GM12878 cells reveals that the allele-specificities of multiple proteins are highly correlated, and demonstrates the ability of iASeq to improve allelic inference compared to analyzing each individual dataset separately. Conclusions: iASeq illustrates the value of integrating multiple datasets in allele-specificity inference and offers a new tool to better analyze ASB. PMID:23194258
Bobo-Pinilla, Javier; Barrios de León, Sara B; Seguí Colomar, Jaume; Fenu, Giuseppe; Bacchetta, Gianluigi; Peñas de Giles, Julio; Martínez-Ortega, María Montserrat
2016-01-01
Although it has been traditionally accepted that Arenaria balearica (Caryophyllaceae) could be a relict Tertiary plant species, this has never been experimentally tested. Nor have the palaeohistorical reasons underlying the highly fragmented distribution of the species in the Western Mediterranean region been investigated. We have analysed AFLP data (213) and plastid DNA sequences (226) from a total of 250 plants from 29 populations sampled throughout the entire distribution range of the species in Majorca, Corsica, Sardinia, and the Tuscan Archipelago. The AFLP data analyses indicate very low geographic structure and population differentiation. Based on plastid DNA data, six alternative phylogeographic hypotheses were tested using Approximate Bayesian Computation (ABC). These analyses revealed ancient area fragmentation as the most probable scenario, which is in accordance with the star-like topology of the parsimony network that suggests a pattern of long term survival and subsequent in situ differentiation. Overall low levels of genetic diversity and plastid DNA variation were found, reflecting evolutionary stasis of a species preserved in locally long-term stable habitats.
Rybarczyk-Mydłowska, Katarzyna; Maboreke, Hazel Ruvimbo; van Megen, Hanny; van den Elsen, Sven; Mooyman, Paul; Smant, Geert; Bakker, Jaap; Helder, Johannes
2012-11-21
Plant parasitic nematodes are unusual Metazoans as they are equipped with genes that allow for symbiont-independent degradation of plant cell walls. Among the cell wall-degrading enzymes, glycoside hydrolase family 5 (GHF5) cellulases are relatively well characterized, especially for high impact parasites such as root-knot and cyst nematodes. Interestingly, ancestors of extant nematodes most likely acquired these GHF5 cellulases from a prokaryote donor by one or multiple lateral gene transfer events. To obtain insight into the origin of GHF5 cellulases among evolutionary advanced members of the order Tylenchida, cellulase biodiversity data from less distal family members were collected and analyzed. Single nematodes were used to obtain (partial) genomic sequences of cellulases from representatives of the genera Meloidogyne, Pratylenchus, Hirschmanniella and Globodera. Combined Bayesian analysis of ≈ 100 cellulase sequences revealed three types of catalytic domains (A, B, and C). Represented by 84 sequences, type B is numerically dominant, and the overall topology of the catalytic domain type shows remarkable resemblance with trees based on neutral (= pathogenicity-unrelated) small subunit ribosomal DNA sequences. Bayesian analysis further suggested a sister relationship between the lesion nematode Pratylenchus thornei and all type B cellulases from root-knot nematodes. Yet, the relationship between the three catalytic domain types remained unclear. Superposition of intron data onto the cellulase tree suggests that types B and C are related, and together distinct from type A that is characterized by two unique introns. All Tylenchida members investigated here harbored one or multiple GHF5 cellulases. Three types of catalytic domains are distinguished, and the presence of at least two types is relatively common among plant parasitic Tylenchida. 
Analysis of coding sequences of cellulases suggests that root-knot and cyst nematodes did not acquire this gene directly by lateral gene transfer. More likely, these genes were passed on by ancestors of a family nowadays known as the Pratylenchidae.
Cinelli, Mattia; Sun, Yuxin; Best, Katharine; Heather, James M.; Reich-Zeliger, Shlomit; Shifrut, Eric; Friedman, Nir; Shawe-Taylor, John; Chain, Benny
2017-01-01
Motivation: Somatic DNA recombination, the hallmark of vertebrate adaptive immunity, has the potential to generate a vast diversity of antigen receptor sequences. How this diversity captures antigen specificity remains incompletely understood. In this study we use high throughput sequencing to compare the global changes in T cell receptor β chain complementarity determining region 3 (CDR3β) sequences following immunization with ovalbumin administered with complete Freund’s adjuvant (CFA) or CFA alone. Results: The CDR3β sequences were deconstructed into short stretches of overlapping contiguous amino acids. The motifs were ranked according to a one-dimensional Bayesian classifier score comparing their frequency in the repertoires of the two immunization classes. The top ranking motifs were selected and used to create feature vectors which were used to train a support vector machine. The support vector machine achieved high classification scores in a leave-one-out validation test reaching >90% in some cases. Summary: The study describes a novel two-stage classification strategy combining a one-dimensional Bayesian classifier with a support vector machine. Using this approach we demonstrate that the frequency of a small number of linear motifs three amino acids in length can accurately identify a CD4 T cell response to ovalbumin against a background response to the complex mixture of antigens which characterize Complete Freund’s Adjuvant. Availability and implementation: The sequence data is available at www.ncbi.nlm.nih.gov/sra/?term=SRP075893. The Decombinator package is available at github.com/innate2adaptive/Decombinator. The R package e1071 is available at the CRAN repository https://cran.r-project.org/web/packages/e1071/index.html. Contact: b.chain@ucl.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online. PMID:28073756
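The first step described in the abstract above, deconstructing each CDR3β sequence into overlapping contiguous stretches of amino acids, can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `overlapping_motifs` and `motif_frequencies` and the example sequences are hypothetical, and the Bayesian classifier score and support vector machine stages are deliberately left out because the abstract does not specify them.

```python
from collections import Counter


def overlapping_motifs(cdr3, k=3):
    """Return every contiguous stretch of k residues, sliding one residue at a time.

    Hypothetical helper; k=3 matches the three-amino-acid motifs mentioned above.
    """
    return [cdr3[i:i + k] for i in range(len(cdr3) - k + 1)]


def motif_frequencies(repertoire, k=3):
    """Relative frequency of each k-residue motif across all sequences in a repertoire."""
    counts = Counter(
        motif for seq in repertoire for motif in overlapping_motifs(seq, k)
    )
    total = sum(counts.values())
    return {motif: n / total for motif, n in counts.items()}


# Made-up example sequences, for illustration only.
example_repertoire = ["CASS", "ASSL"]
example_frequencies = motif_frequencies(example_repertoire)
```

Per-motif frequencies of this kind, computed separately for each immunization class, are the sort of input a ranking score could then compare.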
Gussarova, Galina; Allen, Geraldine A; Mikhaylova, Yulia; McCormick, Laurie J; Mirré, Virginia; Marr, Kendrick L; Hebda, Richard J; Brochmann, Christian
2015-10-01
Many arctic-alpine species have vast geographic ranges, but these may encompass substantial gaps whose origins are poorly understood. Here we address the phylogeographic history of Silene acaulis, a perennial cushion plant with a circumpolar distribution except for a large gap in Siberia. We assessed genetic variation in a range-wide sample of 103 populations using plastid DNA (pDNA) sequences and AFLPs (amplified fragment length polymorphisms). We constructed a haplotype network and performed Bayesian phylogenetic analyses based on plastid sequences. We visualized AFLP patterns using principal coordinate analysis, identified genetic groups using the program structure, and estimated genetic diversity and rarity indices by geographic region. The history of the main pDNA lineages was estimated to span several glaciations. AFLP data revealed a distinct division between Beringia/North America and Europe/East Greenland. These two regions shared only one of 17 pDNA haplotypes. Populations on opposite sides of the Siberian range gap (Ural Mountains and Chukotka) were genetically distinct and appear to have resulted from postglacial leading-edge colonizations. We inferred two refugia in North America (Beringia and the southern Rocky Mountains) and two in Europe (central-southern Europe and northern Europe/East Greenland). Patterns in the East Atlantic region suggested transoceanic long-distance dispersal events. Silene acaulis has a highly dynamic history characterized by vicariance, regional extinction, and recolonization, with persistence in at least four refugia. Long-distance dispersal explains patterns across the Atlantic Ocean, but we found no evidence of dispersal across the Siberian range gap. © 2015 Botanical Society of America.
Yu, Fang; Chen, Ming-Hui; Kuo, Lynn; Talbott, Heather; Davis, John S
2015-08-07
Recently, the Bayesian method has become more popular for analyzing high dimensional gene expression data as it allows us to borrow information across different genes and provides powerful estimators for evaluating gene expression levels. It is crucial to develop a simple but efficient gene selection algorithm for detecting differentially expressed (DE) genes based on the Bayesian estimators. In this paper, by extending the two-criterion idea of Chen et al. (Chen M-H, Ibrahim JG, Chi Y-Y. A new class of mixture models for differential gene expression in DNA microarray data. J Stat Plan Inference. 2008;138:387-404), we propose two new gene selection algorithms for general Bayesian models and name these new methods the confident difference criterion methods. One is based on the standardized differences between two mean expression values among genes; the other adds the differences between two variances to it. The proposed confident difference criterion methods first evaluate the posterior probability of a gene having different gene expressions between competitive samples and then declare a gene to be DE if the posterior probability is large. The theoretical connection between the proposed first method based on the means and the Bayes factor approach proposed by Yu et al. (Yu F, Chen M-H, Kuo L. Detecting differentially expressed genes using calibrated Bayes factors. Statistica Sinica. 2008;18:783-802) is established under the normal-normal model with equal variances between two samples. The empirical performance of the proposed methods is examined and compared to those of several existing methods via several simulations. The results from these simulation studies show that the proposed confident difference criterion methods outperform the existing methods when comparing gene expressions across different conditions for both microarray studies and sequence-based high-throughput studies. A real dataset is used to further demonstrate the proposed methodology. 
In the real data application, the confident difference criterion methods successfully identified more clinically important DE genes than the other methods. The confident difference criterion method proposed in this paper provides a new efficient approach for both microarray studies and sequence-based high-throughput studies to identify differentially expressed genes.
Chee, S Y
2015-05-25
The mitochondrial DNA (mtDNA) cytochrome oxidase I (COI) gene has been universally and successfully utilized as a barcoding gene, mainly because it can be amplified easily, applied across a wide range of taxa, and results can be obtained cheaply and quickly. However, in rare cases, the gene can fail to distinguish between species, particularly when exposed to highly sensitive methods of data analysis, such as the Bayesian method, or when taxa have undergone introgressive hybridization, over-splitting, or incomplete lineage sorting. Such cases require the use of alternative markers, and nuclear DNA markers are commonly used. In this study, a dendrogram produced by Bayesian analysis of an mtDNA COI dataset was compared with that of a nuclear DNA ATPS-α dataset, in order to evaluate the efficiency of COI in barcoding Malaysian nerites (Neritidae). In the COI dendrogram, most of the species were in individual clusters, except for two species: Nerita chamaeleon and N. histrio. These two species were placed in the same subcluster, whereas in the ATPS-α dendrogram they were in their own subclusters. Analysis of the ATPS-α gene also placed the two genera of nerites (Nerita and Neritina) in separate clusters, whereas COI gene analysis placed both genera in the same cluster. Therefore, in the case of the Neritidae, the ATPS-α gene is a better barcoding gene than the COI gene.
Determining geographical spread pattern of MERS-CoV by distance method using Kimura model
NASA Astrophysics Data System (ADS)
Amiroch, Siti; Rohmatullah, Arif
2017-03-01
Middle East Respiratory Syndrome Coronavirus (MERS-CoV) causes a respiratory disease syndrome that attacks the respiratory tract, ranging from mild to severe and acute, with symptoms of fever, cough and shortness of breath. The reported cases relate to countries in the Arabian Peninsula (Middle East), and 356 deaths have been reported due to the spread of the MERS epidemic. The data used in this study are DNA sequences taken from GenBank, the United States online database that stores the results of molecular biological experiments from all over the world (http://www.ncbi.nlm.nih.gov). Here bioinformatics plays an important role in reading DNA sequences and genetic information, relying mainly on software supported by the availability of the Internet, while the analysis is carried out and verified with mathematical methods. In similar research conducted by molecular biologists and physicians, the processing of DNA sequences is done with available software such as BLAST, and the geographical distribution pattern of MERS in the Arabian Peninsula is determined with programs such as Clustal W, Bayesian tools, Phylip, etc. In this study, the authors use Matlab simulation for all processes, from sequence alignment, through counting the number of transition and transversion substitutions for each sequence and its location, to constructing a phylogenetic tree that describes the spread pattern of the MERS epidemic. The mathematical analysis consists of deriving the formula for the Kimura evolutionary model and constructing the phylogenetic tree (the distribution pattern of the MERS epidemic) with the neighbor-joining algorithm. 
The result is a geographical spread pattern with six groups of the MERS epidemic, which shows that nearly all MERS viruses spread across the Arabian Peninsula are almost identical to the virus sequence found in al-Hasa.
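As an illustration of the distance step in the abstract above, the following Python sketch counts transition and transversion differences between two aligned sequences and converts them to an evolutionary distance. It assumes that "Kimura model" refers to the standard Kimura two-parameter (K80) formula, d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitions and transversions; the abstract does not state which Kimura variant was used, and the study itself used Matlab rather than Python. The function names and example sequences are hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_substitutions(seq1, seq2):
    """Count transitions and transversions between two aligned DNA sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    Returns (transitions, transversions, sites_compared).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        # Purine<->purine or pyrimidine<->pyrimidine changes are transitions.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions, compared


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance (assumed model, see the note above).

    math.log raises ValueError when the sequences are too divergent for the
    formula to be defined (its argument becomes zero or negative).
    """
    ts, tv, n = count_substitutions(seq1, seq2)
    if n == 0:
        raise ValueError("no comparable sites")
    p = ts / n  # proportion of transitions
    q = tv / n  # proportion of transversions
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

A matrix of such pairwise distances over all aligned sequences is the usual input to a neighbor-joining tree construction.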
Scaglione, Davide; Lanteri, Sergio; Acquadro, Alberto; Lai, Zhao; Knapp, Steven J; Rieseberg, Loren; Portis, Ezio
2012-10-01
Cynara cardunculus (2n = 2× = 34) is a member of the Asteraceae family that contributes significantly to the agricultural economy of the Mediterranean basin. The species includes two cultivated varieties, globe artichoke and cardoon, which are grown mainly for food. Cynara cardunculus is an orphan crop species whose genome/transcriptome has been relatively unexplored, especially in comparison to other Asteraceae crops. Hence, there is a significant need to improve its genomic resources through the identification of novel genes and sequence-based markers, to design new breeding schemes aimed at increasing quality and crop productivity. We report the outcome of cDNA sequencing and assembly for eleven accessions of C. cardunculus. Sequencing of three mapping parental genotypes using Roche 454-Titanium technology generated 1.7 × 10⁶ reads, which were assembled into 38,726 reference transcripts covering 32 Mbp. Putative enzyme-encoding genes were annotated using the KEGG-database. Transcription factors and candidate resistance genes were surveyed as well. Paired-end sequencing was done for cDNA libraries of eight other representative C. cardunculus accessions on an Illumina Genome Analyzer IIx, generating 46 × 10⁶ reads. Alignment of the IGA and 454 reads to reference transcripts led to the identification of 195,400 SNPs with a Bayesian probability exceeding 95%; a validation rate of 90% was obtained by Sanger-sequencing of a subset of contigs. These results demonstrate that the integration of data from different NGS platforms enables large-scale transcriptome characterization, along with massive SNP discovery. This information will contribute to the dissection of key agricultural traits in C. cardunculus and facilitate the implementation of marker-assisted selection programs. © 2012 The Authors. Plant Biotechnology Journal © 2012 Society for Experimental Biology, Association of Applied Biologists and Blackwell Publishing Ltd.
Renner, Susanne S; Zhang, Li-Bing
2004-06-01
Pistia stratiotes (water lettuce) and Lemna (duckweeds) are the only free-floating aquatic Araceae. The geographic origin and phylogenetic placement of these unrelated aroids present long-standing problems because of their highly modified reproductive structures and wide geographical distributions. We sampled chloroplast (trnL-trnF and rpl20-rps12 spacers, trnL intron) and mitochondrial sequences (nad1 b/c intron) for all genera implicated as close relatives of Pistia by morphological, restriction site, and sequencing data, and present a hypothesis about its geographic origin based on the consensus of trees obtained from the combined data, using Bayesian, maximum likelihood, parsimony, and distance analyses. Of the 14 genera closest to Pistia, only Alocasia, Arisaema, and Typhonium are species-rich, and the latter two were studied previously, facilitating the choice of representatives that span the roots of these genera. Results indicate that Pistia and the Seychelles endemic Protarum sechellarum are the basalmost branches in a grade comprising the tribes Colocasieae (Ariopsis, Steudnera, Remusatia, Alocasia, Colocasia), Arisaemateae (Arisaema, Pinellia), and Areae (Arum, Biarum, Dracunculus, Eminium, Helicodiceros, Theriophonum, Typhonium). Unexpectedly, all Areae genera are embedded in Typhonium, which throws new light on the geographic history of Areae. A Bayesian analysis of divergence times that explores the effects of multiple fossil and geological calibration points indicates that the Pistia lineage is 90 to 76 million years (my) old. The oldest fossils of the Pistia clade, though not Pistia itself, are 45-my-old leaves from Germany; the closest outgroup, Peltandreae (comprising a few species in Florida, the Mediterranean, and Madagascar), is known from 60-my-old leaves from Europe, Kazakhstan, North Dakota, and Tennessee. 
Based on the geographic ranges of close relatives, Pistia likely originated in the Tethys region, with Protarum then surviving on the Seychelles, which became isolated from Madagascar and India in the Late Cretaceous (85 my ago). Pistia and Protarum provide striking examples of ancient lineages that appear to have survived in unique or isolated habitats.
Liu, Qing; Triplett, Jimmy K.; Wen, Jun; Peterson, Paul M.
2011-01-01
Background and Aims: Eleusine (Poaceae) is a small genus of the subfamily Chloridoideae exhibiting considerable morphological and ecological diversity in East Africa and the Americas. The interspecific phylogenetic relationships of Eleusine are investigated in order to identify its allotetraploid origin, and a chronogram is estimated to infer temporal relationships between palaeoenvironment changes and divergence of Eleusine in East Africa. Methods: Two low-copy nuclear (LCN) markers, Pepc4 and EF-1α, were analysed using parsimony, likelihood and Bayesian approaches. A chronogram of Eleusine was inferred from a combined data set of six plastid DNA markers (ndhA intron, ndhF, rps16-trnK, rps16 intron, rps3, and rpl32-trnL) using the Bayesian dating method. Key Results: The monophyly of Eleusine is strongly supported by sequence data from two LCN markers. In the cpDNA phylogeny, three tetraploid species (E. africana, E. coracana and E. kigeziensis) share a common ancestor with the E. indica–E. tristachya clade, which is considered a source of maternal parents for allotetraploids. Two homoeologous loci are isolated from three tetraploid species in the Pepc4 phylogeny, and the maternal parents receive further support. The A-type EF-1α sequences possess three characters, i.e. a large number of variations of intron 2; clade E-A distantly diverged from clade E-B and other diploid species; and seven deletions in intron 2, implying a possible derivation through a gene duplication event. The crown age of Eleusine and the allotetraploid lineage are 3·89 million years ago (mya) and 1·40 mya, respectively. Conclusions: The molecular data support independent allotetraploid origins for E. kigeziensis and the E. africana–E. coracana clade. Both events may have involved diploids E. indica and E. tristachya as the maternal parents, but the paternal parents remain unidentified. The habitat-specific hypothesis is proposed to explain the divergence of Eleusine and its allotetraploid lineage. PMID:21880659
Phylogeny of economically important insect pests that infesting several crops species in Malaysia
NASA Astrophysics Data System (ADS)
Ghazali, Siti Zafirah; Zain, Badrul Munir Md.; Yaakop, Salmah
2014-09-01
This paper reports molecular data on insect pests of commercial crops in Peninsular Malaysia. Fifteen insect pest species were sampled: pests of nine crops (oil palm, coconut, paddy, cocoa, starfruit, angled loofah, guava, chili and mustard), namely Metisa plana, Calliteara horsefeldii, Cotesia vestalis, Bactrocera papayae, Bactrocera carambolae, Bactrocera latifrons, Conopomorpha cramella, Sesamia inferens, Chilo polychrysa, Rhynchophorus vulneratus and Rhynchophorus ferrugineus, together with four species comprising a fern pest (Herpetogramma platycapna) and storage and rice pests (Tribolium castaneum, Oryzaephilus surinamensis and Cadra cautella). The phylogeny presented summarizes an initial phylogenetic hypothesis for these economically important insect pests. Phylogenetic relationships among 39 individuals of 15 species, belonging to 12 genera in three orders, were inferred from DNA sequences of a mitochondrial marker, cytochrome oxidase subunit I (COI), and a nuclear marker, the ribosomal DNA 28S D2 region. The phylogenies resulting from analyses of the two genes are broadly similar but differ in the inferred sequence of evolution. Interestingly, Bayesian inference analysis of the COI sequence data yielded a better-resolved phylogeny than the 28S sequences, one that corroborates traditional hypotheses of holometabolan relationships and most recent molecular studies. These findings provide information on the relationships among pest species infesting several crops in Malaysia, as well as an estimate of the relationships among holometabolan orders. Supported by the phylogenetic tree, the larval stages of insect pests could be identified accurately without waiting for the emergence of adults.
Molecular insights into the colonization and chromosomal diversification of Madeiran house mice.
Förster, D W; Gündüz, I; Nunes, A C; Gabriel, S; Ramalhinho, M G; Mathias, M L; Britton-Davidian, J; Searle, J B
2009-11-01
The colonization history of Madeiran house mice was investigated by analysing the complete mitochondrial (mt) D-loop sequences of 156 mice from the island of Madeira and mainland Portugal, extending on previous studies. The numbers of mtDNA haplotypes from Madeira and mainland Portugal were substantially increased (17 and 14 new haplotypes respectively), and phylogenetic analysis confirmed the previously reported link between the Madeiran archipelago and northern Europe. Sequence analysis revealed the presence of four mtDNA lineages in mainland Portugal, of which one was particularly common and widespread (termed the 'Portugal Main Clade'). There was no support for population bottlenecks during the formation of the six Robertsonian chromosome races on the island of Madeira, and D-loop sequence variation was not found to be structured according to karyotype. The colonization time of the Madeiran archipelago by Mus musculus domesticus was approached using two molecular dating methods (mismatch distribution and Bayesian skyline plot). Time estimates based on D-loop sequence variation at mainland sites (including previously published data from France and Turkey) were evaluated in the context of the zooarchaeological record of M. m. domesticus. A range of values for mutation rate (mu) and number of mouse generations per year was considered in these analyses because of the uncertainty surrounding these two parameters. The colonization of Portugal and Madeira by house mice is discussed in the context of the best-supported parameter values. In keeping with recent studies, our results suggest that mutation rate estimates based on interspecific divergence lead to gross overestimates concerning the timing of recent within-species events.
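The sensitivity of mismatch-distribution dates to the assumed mutation rate and generation time can be illustrated with a minimal sketch. It assumes the standard relation tau = 2ut, where u is the mutation rate per sequence per generation and t is the time since expansion in generations. The function name and all input values below are hypothetical and are not taken from the study.

```python
def expansion_time_years(tau, mu_per_site_per_generation, sequence_length,
                         generations_per_year):
    """Convert a mismatch-distribution tau into years using tau = 2 * u * t."""
    # u: expected mutations per sequence per generation
    u = mu_per_site_per_generation * sequence_length
    generations = tau / (2.0 * u)
    return generations / generations_per_year

# Hypothetical inputs chosen only to show how the estimate scales.
for gens_per_year in (1, 2, 4):
    years = expansion_time_years(2.0, 1e-7, 900, gens_per_year)
    print(gens_per_year, round(years))
```

Doubling either the mutation rate or the number of generations per year halves the estimated time, which is why a range of values for both parameters matters for recent within-species events.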
Dyková, Iva; Nowak, Barbara; Pecková, Hana; Fiala, Ivan; Crosbie, Philip; Dvoráková, Helena
2007-02-08
We characterised 9 strains selected from primary isolates referable to Paramoeba/Neoparamoeba spp. Based on ultrastructural study, 5 strains isolated from fish (amoebic gill disease [AGD]-affected Atlantic salmon and dead southern bluefin tuna), 1 strain from netting of a floating sea cage and 3 strains isolated from invertebrates (sea urchins and crab) were assigned to the genus Neoparamoeba Page, 1987. Phylogenetic analyses based on SSU rDNA sequences revealed affiliations of newly introduced and previously analysed Neoparamoeba strains. Three strains from the invertebrates and 2 out of 3 strains from gills of southern bluefin tunas were members of the N. branchiphila clade, while the remaining, fish-isolated strains, as well as the fish cage strain, clustered within the clade of N. pemaquidensis. These findings and previous reports point to the possibility that N. pemaquidensis and N. branchiphila can affect both fish and invertebrates. A new potential fish host, southern bluefin tuna, was included in the list of farmed fish endangered by N. branchiphila. The sequence of P. eilhardi (Culture Collection of Algae and Protozoa [CCAP] strain 1560/2) appeared in all analyses among sequences of strain representatives of Neoparamoeba species, in a position well supported by bootstrap value, Bremer index and Bayesian posterior probability. Our research shows that isolation of additional strains from invertebrates and further analyses of relations between molecular data and morphological characters of the genera Paramoeba and Neoparamoeba are required. This complexity needs to be considered when attempting to define molecular markers for identification of Paramoeba/Neoparamoeba species in tissues of fish and invertebrates.
Parkin, Derek B; Archer, Linda L; Childress, April L; Wellehan, James F X
2009-07-01
Bearded dragons (Pogona vitticeps) are popular pets in the United States. Agamid Adenovirus 1 (AgAdV1) is an important infectious agent of bearded dragons. The only AgAdV1 sequences available to date are from a highly conserved region of the DNA polymerase gene. Degenerate primers were designed to amplify a variable region of the AgAdV1 hexon gene for sequencing. Genetic differences were identified within the hexon gene of 17 bearded dragons from 4 collections. Much less diversity was present in the polymerase gene. Bayesian analysis of the hexon nucleotide alignment identified two larger groups and two isolates that did not tightly cluster with these two groups. Multiple genotypes were identified within collections, and individual genotypes were seen in different collections. Three bearded dragons appeared to be infected by multiple strains. These findings show that this hexon region is useful for AgAdV1 genotyping, which can be used epidemiologically as well as in future investigations of AgAdV1 evolution and clinical implications of strain differences.
Almathen, Faisal; Charruau, Pauline; Mohandesan, Elmira; Mwacharo, Joram M.; Orozco-terWengel, Pablo; Pitt, Daniel; Abdussamad, Abdussamad M.; Uerpmann, Margarethe; Uerpmann, Hans-Peter; De Cupere, Bea; Magee, Peter; Alnaqeeb, Majed A.; Salim, Bashir; Raziq, Abdul; Dessie, Tadelle; Abdelhadi, Omer M.; Banabazi, Mohammad H.; Al-Eknah, Marzook; Walzer, Chris; Faye, Bernard; Hofreiter, Michael; Peters, Joris; Hanotte, Olivier
2016-01-01
Dromedaries have been fundamental to the development of human societies in arid landscapes and for long-distance trade across hostile hot terrains for 3,000 y. Today they continue to be an important livestock resource in marginal agro-ecological zones. However, the history of dromedary domestication and the influence of ancient trading networks on their genetic structure have remained elusive. We combined ancient DNA sequences of wild and early-domesticated dromedary samples from arid regions with nuclear microsatellite and mitochondrial genotype information from 1,083 extant animals collected across the species’ range. We observe little phylogeographic signal in the modern population, indicative of extensive gene flow affecting virtually all regions except East Africa, where dromedary populations have remained relatively isolated. In agreement with archaeological findings, we identify wild dromedaries from the southeast Arabian Peninsula among the founders of the domestic dromedary gene pool. Approximate Bayesian computations further support the “restocking from the wild” hypothesis, with an initial domestication followed by introgression from individuals from wild, now-extinct populations. Compared with other livestock, which show a long history of gene flow with their wild ancestors, we find a high initial diversity relative to the native distribution of the wild ancestor on the Arabian Peninsula and to the brief coexistence of early-domesticated and wild individuals. This study also demonstrates the potential to retrieve ancient DNA sequences from osseous remains excavated in hot and dry desert environments. PMID:27162355
Novel Insights into the Genetic Diversity of Balantidium and Balantidium-like Cyst-forming Ciliates
Pomajbíková, Kateřina; Oborník, Miroslav; Horák, Aleš; Petrželková, Klára J.; Grim, J. Norman; Levecke, Bruno; Todd, Angelique; Mulama, Martin; Kiyang, John; Modrý, David
2013-01-01
Balantidiasis is considered a neglected zoonotic disease with pigs serving as reservoir hosts. However, Balantidium coli has been recorded in many other mammalian species, including primates. Here, we evaluated the genetic diversity of B. coli in non-human primates using two gene markers (SSrDNA and ITS1-5.8SDNA-ITS2). We analyzed 49 isolates of ciliates from fecal samples originating from 11 species of captive and wild primates, domestic pigs and wild boar. The phylogenetic trees were computed using Bayesian inference and Maximum likelihood. Balantidium entozoon from edible frog and Buxtonella sulcata from cattle were included in the analyses as the closest relatives of B. coli, as well as reference sequences of vestibuliferids. The SSrDNA tree showed the same phylogenetic diversification of B. coli at genus level as the tree constructed based on the ITS region. Based on the polymorphism of SSrDNA sequences, the type species of the genus, namely B. entozoon, appeared to be phylogenetically distinct from B. coli. Thus, we propose a new genus Neobalantidium for the homeothermic clade. Moreover, several isolates from both captive and wild primates (excluding great apes) clustered with B. sulcata with high support, suggesting the existence of a new species within this genus. The cysts of Buxtonella and Neobalantidium are morphologically indistinguishable and the presence of Buxtonella-like ciliates in primates opens the question about possible occurrence of these pathogens in humans. PMID:23556024
Jiménez, Rosa Alicia
2016-01-01
The influence of geologic and Pleistocene glacial cycles might result in complex morphological and genetic scenarios in the biota of the Mesoamerican region. We tested whether berylline, blue-tailed and steely-blue hummingbirds, Amazilia beryllina, Amazilia cyanura and Amazilia saucerottei, show evidence of historical or current introgression as their plumage colour variation might suggest. We also analysed the role of past and present climatic events in promoting genetic introgression and species diversification. We collected mitochondrial DNA (mtDNA) sequence data and microsatellite loci scores for populations throughout the range of the three Amazilia species, as well as morphological and ecological data. Haplotype network, Bayesian phylogenetic and divergence time inference, historical demography, palaeodistribution modelling, and niche divergence tests were used to reconstruct the evolutionary history of this Amazilia species complex. An isolation-with-migration coalescent model and Bayesian assignment analysis were used to determine historical introgression and current genetic admixture. mtDNA haplotypes were geographically unstructured, with haplotypes from disparate areas interspersed on a shallow tree and an unresolved haplotype network. Assignment analysis of the nuclear genome (nuDNA) supported three genetic groups with signs of genetic admixture, corresponding to: (1) A. beryllina populations located west of the Isthmus of Tehuantepec; (2) A. cyanura populations between the Isthmus of Tehuantepec and the Nicaraguan Depression (Nuclear Central America); and (3) A. saucerottei populations southeast of the Nicaraguan Depression. Gene flow and divergence time estimates, and demographic and palaeodistribution patterns suggest an evolutionary history of introgression mediated by Quaternary climatic fluctuations. 
High levels of gene flow were indicated by mtDNA and asymmetrical isolation-with-migration, whereas the microsatellite analyses found evidence for three genetic clusters with distributions corresponding to isolation by the Isthmus of Tehuantepec and the Nicaraguan Depression and signs of admixture. Historical levels of migration between genetically distinct groups estimated using microsatellites were higher than contemporary levels of migration. These results support the scenario of secondary contact and range contact during the glacial periods of the Pleistocene and strongly imply that the high levels of structure currently observed are a consequence of the limited dispersal of these hummingbirds across the isthmus and depression barriers. PMID:26788433
Mitogenomic analysis of the genus Panthera.
Wei, Lei; Wu, Xiaobing; Zhu, Lixin; Jiang, Zhigang
2011-10-01
The complete sequences of the mitochondrial DNA genomes of Panthera tigris, Panthera pardus, and Panthera uncia were determined using the polymerase chain reaction method. The lengths of the complete mitochondrial DNA sequences of the three species were 16990, 16964, and 16773 bp, respectively. Each of the three mitochondrial DNA genomes included 13 protein-coding genes, 22 tRNA, two rRNA, one O(L)R, and one control region. The structures of the genomes were highly similar to those of Felis catus, Acinonyx jubatus, and Neofelis nebulosa. The phylogenies of the genus Panthera were inferred from two combined mitochondrial sequence data sets and the complete mitochondrial genome sequences, by MP (maximum parsimony), ML (maximum likelihood), and Bayesian analysis. The results showed that Panthera was composed of Panthera leo, P. uncia, P. pardus, Panthera onca, P. tigris, and N. nebulosa, which was included as the most basal member. The inferred phylogeny was (N. nebulosa, (P. tigris, (P. onca, (P. pardus, (P. leo, P. uncia))))). The divergence times for the genus Panthera were estimated based on the ML branch lengths and four well-established calibration points. The results showed that at about 11.3 MYA, the genus Panthera separated from other felid species and then diverged into its several species. In detail, N. nebulosa was estimated to have originated about 8.66 MYA, P. tigris about 6.55 MYA, P. uncia about 4.63 MYA, and P. pardus about 4.35 MYA. All these estimated times were older than those estimated from the fossil record. The divergence, evolutionary process, speciation, and distribution pattern of P. uncia, a species endemic to central Asia with core habitats on the Qinghai-Tibetan Plateau and surrounding highlands, mostly correlated with the geological tectonic events and intensive climate shifts that happened at 8, 3.6, 2.5, and 1.7 MYA on the plateau during the late Cenozoic period.
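The branching order reported for Panthera can be written in Newick notation and queried programmatically. The sketch below is illustrative only: the Newick string is an assumed encoding of the topology given in the abstract (no branch lengths or support values), and `parse_newick`, `leaves` and `sister_of` are hypothetical helper names rather than functions from any phylogenetics library.

```python
# Assumed Newick encoding of the topology reported in the abstract.
NEWICK = "(N_nebulosa,(P_tigris,(P_onca,(P_pardus,(P_leo,P_uncia)))));"

def parse_newick(text):
    """Parse a label-only Newick string into nested tuples of leaf names."""
    pos = 0

    def node():
        nonlocal pos
        if text[pos] == "(":
            pos += 1  # skip "("
            children = [node()]
            while text[pos] == ",":
                pos += 1  # skip ","
                children.append(node())
            pos += 1  # skip ")"
            return tuple(children)
        start = pos
        while text[pos] not in ",();":
            pos += 1
        return text[start:pos]

    return node()

def leaves(tree):
    """List leaf names in left-to-right order."""
    if isinstance(tree, str):
        return [tree]
    return [leaf for child in tree for leaf in leaves(child)]

def sister_of(tree, taxon):
    """Return the sibling subtree of `taxon` in a bifurcating tree, or None."""
    if isinstance(tree, str):
        return None
    if taxon in tree and len(tree) == 2:
        return tree[1] if tree[0] == taxon else tree[0]
    for child in tree:
        found = sister_of(child, taxon)
        if found is not None:
            return found
    return None

tree = parse_newick(NEWICK)
print(leaves(tree))
print(sister_of(tree, "P_leo"))
```

Under this encoding, P. leo and P. uncia are sister taxa, and N. nebulosa is sister to the clade containing the five Panthera species.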
Zheng, Chenfei; Nie, Liuwang; Wang, Jue; Zhou, Huaxing; Hou, Huazhen; Wang, Hao; Liu, Juanjuan
2013-01-01
Complete mitochondrial (mt) genome sequences with duplicate control regions (CRs) have been detected in various animal species. In Testudines, duplicate mtCRs have been reported in the mtDNA of the Asian big-headed turtle, Platysternon megacephalum, which has three living subspecies. However, the evolutionary pattern of these CRs remains unclear. In this study, we report the completed sequences of duplicate CRs from 20 individuals belonging to three subspecies of this turtle and discuss the micro-evolutionary analysis of the evolution of duplicate CRs. Genetic distances calculated with MEGA 4.1 using the complete duplicate CR sequences revealed that within turtle subspecies, genetic distances between orthologous copies from different individuals were 0.63% for CR1 and 1.2% for CR2, respectively, and the average distance between paralogous copies of CR1 and CR2 was 4.8%. Phylogenetic relationships were reconstructed from the CR sequences, excluding the variable number of tandem repeats (VNTRs) at the 3′ end using three methods: neighbor-joining, maximum likelihood algorithm, and Bayesian inference. These data show that any two CRs within individuals were more genetically distant from orthologous genes in different individuals within the same subspecies. This suggests independent evolution of the two mtCRs within each P. megacephalum subspecies. Reconstruction of separate phylogenetic trees using different CR components (TAS, CD, CSB, and VNTRs) suggested the role of recombination in the evolution of duplicate CRs. Consequently, recombination events were detected using RDP software with break points at ≈290 bp and ≈1,080 bp. Based on these results, we hypothesize that duplicate CRs in P. megacephalum originated from heterological ancestral recombination of mtDNA. Subsequent recombination could have resulted in homogenization during independent evolutionary events, thus maintaining the functions of duplicate CRs in the mtDNA of P. megacephalum. PMID:24367563
Genetic diversity and structure of Capparis spinosa L. in Iran as revealed by ISSR markers.
Ahmadi, Maryam; Saeidi, Hojjatollah
2018-05-01
Capparis spinosa L. (caper bush) is an economically and ecologically important perennial shrub that grows across different regions of Iran. In this study, the genetic diversity and population structure of the Iranian gene pool of C. spinosa was evaluated using Inter Simple Sequence Repeat (ISSR) markers. Using 10 ISSR primers, 387 DNA fragments (bands) were amplified from the genomic DNA of 92 individuals belonging to twenty-one populations of C. spinosa, of which 378 (97.7%) were polymorphic. A high level of genetic diversity (percentage of polymorphic loci = 98.2%, h = 0.1382, I = 0.243), high genetic differentiation (Gst = 0.5234) and low gene flow (Nm = 0.4553) among populations were observed. Caper bush populations were divided into 4 groups in the dendrogram, PCoA plot and Bayesian clustering results, which mostly corresponded to their geographic regions. The results showed that there is value in sampling Iranian caper bush populations to look for valuable alleles for use in plant breeding programs.
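The reported gene flow estimate is numerically consistent with the commonly used relation Nm = 0.5(1 − Gst)/Gst. The abstract does not state which estimator was applied, so treating this formula as the one used is an assumption. A minimal sketch, with `gene_flow_from_gst` as a hypothetical helper name, also checks the stated proportion of polymorphic bands:

```python
def gene_flow_from_gst(gst):
    """Estimate Nm from Gst using Nm = 0.5 * (1 - Gst) / Gst (assumed estimator)."""
    if not 0 < gst <= 1:
        raise ValueError("Gst must lie in (0, 1]")
    return 0.5 * (1 - gst) / gst

# Values taken from the abstract.
polymorphic_bands, total_bands = 378, 387
print(round(100 * polymorphic_bands / total_bands, 1))  # percentage of polymorphic bands
print(round(gene_flow_from_gst(0.5234), 4))              # Nm implied by Gst
```

With the values above, the two printed figures are 97.7 and 0.4553, matching the abstract. An Nm below 1 is conventionally read as gene flow too low to counteract differentiation by drift.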
Choudhury, Anindo; Nadler, Steven A
2016-02-01
The phylogenetic relationships of Cucullanidae were explored using near-complete sequences of the 18S rDNA (rRNA gene). Sequences (1,750-1,760 bp) were obtained from 7 species of Cucullanidae belonging to 3 genera, Cucullanus (2 spp.), Dichelyne (2 spp.), Truttaedacnitis (3 spp.), and 1 species of Quimperiidae (Paraseuratum sp.). These sequences were aligned with those of 128 other nematode species available in GenBank, including 3 other cucullanids (Dichelyne mexicanus, Cucullanus robustus, and Cucullanus baylisi) and 2 non-cucullanid seuratoids (Paraquimperia africana and Linstowinema sp.). Bayesian (BPP) and maximum likelihood (ML) analyses of 2 different datasets strongly supported a monophyletic Cucullanidae. Bayesian analysis placed this family as the sister group to a clade containing species of Diplogasterida, Strongylida, Rhabditida, and Tylenchida with very strong support. Neither BPP nor ML analyses recovered a close relationship of Cucullanidae to Ascaridida. None of the 3 non-cucullanid seuratoid species were sister to Cucullanidae, nor did they form a monophyletic group of their own, which questions the monophyly of Seuratoidea and the relationships among species within this superfamily. The 3 genera of cucullanids were also not monophyletic, although morphologically similar species such as the 2 species of Cucullanus from Neotropical catfishes and 2 species of Dichelyne from Nearctic ictalurid catfishes were sister taxa with strong support. The results were ambiguous with respect to the relationship of 2 Truttaedacnitis spp. in Nearctic freshwater fishes but do not support Truttaedacnitis heterodonti, a parasite of heterodontid sharks, as belonging to this genus. The study shows that all aspects of the conventional classification of Seuratoidea and its taxa should be scrutinized by even more extensive sampling across hosts and habitats.
Phylogeography above the species level for perennial species in a composite genus
Tremetsberger, Karin; Ortiz, María Ángeles; Terrab, Anass; Balao, Francisco; Casimiro-Soriguer, Ramón; Talavera, María; Talavera, Salvador
2016-01-01
In phylogeography, DNA sequence and fingerprint data at the population level are used to infer evolutionary histories of species. Phylogeography above the species level is concerned with the genealogical aspects of divergent lineages. Here, we present a phylogeographic study to examine the evolutionary history of a western Mediterranean composite, focusing on the perennial species of Helminthotheca (Asteraceae, Cichorieae). We used molecular markers (amplified fragment length polymorphism (AFLP), internal transcribed spacer and plastid DNA sequences) to infer relationships among populations throughout the distributional range of the group. Interpretation is aided by biogeographic and molecular clock analyses. Four coherent entities are revealed by Bayesian mixture clustering of AFLP data, which correspond to taxa previously recognized at the rank of subspecies. The origin of the group was in western North Africa, from where it expanded across the Strait of Gibraltar to the Iberian Peninsula and across the Strait of Sicily to Sicily. Pleistocene lineage divergence is inferred within western North Africa as well as within the western Iberian region. The existence of the four entities as discrete evolutionary lineages suggests that they should be elevated to the rank of species, yielding H. aculeata, H. comosa, H. maroccana and H. spinosa, whereby the latter two necessitate new combinations. PMID:26644340
Huang, Jie; Chen, Zigui; Song, Weibo; Berger, Helmut
2014-01-01
Classifications of the Urostyloidea have been based mainly on morphology and morphogenesis. Because molecular phylogenies have largely relied on limited sampling and mostly single-gene information, incongruences between morphological data and gene sequences have arisen. In this work, three-gene data (SSU-rDNA, ITS1-5.8S-ITS2 and LSU-rDNA) comprising 12 genera in the “core urostyloids” were sequenced, and the phylogenies based on these different markers were compared using maximum-likelihood and Bayesian algorithms and tested by unconstrained and constrained analyses. The molecular phylogeny supports the following conclusions: (1) the monophyly of the core group of Urostyloidea is well supported while the whole Urostyloidea is not monophyletic; (2) Thigmokeronopsis and Apokeronopsis are clearly separated from the pseudokeronopsids in analyses of all three gene markers, supporting their exclusion from the Pseudokeronopsidae and the inclusion in the Urostylidae; (3) Diaxonella and Apobakuella should be assigned to the Urostylidae; (4) Bergeriella, Monocoronella and Neourostylopsis flavicana share a most recent common ancestor; (5) all molecular trees support the transfer of Metaurostylopsis flavicana to the recently proposed genus Neourostylopsis; (6) all molecular phylogenies fail to separate the morphologically well-defined genera Uroleptopsis and Pseudokeronopsis; and (7) Arcuseries gen. nov. containing three distinctly deviating Anteholosticha species is established. PMID:24140978
Wilcox, Thomas P; Zwickl, Derrick J; Heath, Tracy A; Hillis, David M
2002-11-01
Four New World genera of dwarf boas (Exiliboa, Trachyboa, Tropidophis, and Ungaliophis) have been placed by many systematists in a single group (traditionally called Tropidophiidae). However, the monophyly of this group has been questioned in several studies. Moreover, the overall relationships among basal snake lineages, including the placement of the dwarf boas, are poorly understood. We obtained mtDNA sequence data for 12S, 16S, and intervening tRNA-val genes from 23 species of snakes representing most major snake lineages, including all four genera of New World dwarf boas. We then examined the phylogenetic position of these species by estimating the phylogeny of the basal snakes. Our phylogenetic analysis suggests that New World dwarf boas are not monophyletic. Instead, we find Exiliboa and Ungaliophis to be most closely related to sand boas (Erycinae), boas (Boinae), and advanced snakes (Caenophidea), whereas Tropidophis and Trachyboa form an independent clade that separated relatively early in snake radiation. Our estimate of snake phylogeny differs significantly in other ways from some previous estimates of snake phylogeny. For instance, pythons do not cluster with boas and sand boas, but instead show a strong relationship with Loxocemus and Xenopeltis. Additionally, uropeltids cluster strongly with Cylindrophis, and together are embedded in what has previously been considered the macrostomatan radiation. These relationships are supported by both bootstrapping (parametric and nonparametric approaches) and Bayesian analysis, although Bayesian support values are consistently higher than those obtained from nonparametric bootstrapping. Simulations show that Bayesian support values represent much better estimates of phylogenetic accuracy than do nonparametric bootstrap support values, at least under the conditions of our study. Copyright 2002 Elsevier Science (USA)
NASA Astrophysics Data System (ADS)
Titus, Benjamin M.; Daly, Marymegan
2017-03-01
Specialist and generalist life histories are expected to result in contrasting levels of genetic diversity at the population level, and symbioses are expected to lead to patterns that reflect a shared biogeographic history and co-diversification. We test these assumptions using mtDNA sequencing and a comparative phylogeographic approach for six co-occurring crustacean species that are symbiotic with sea anemones on western Atlantic coral reefs, yet vary in their host specificities: four are host specialists and two are host generalists. We first conducted species discovery analyses to delimit cryptic lineages, followed by classic population genetic diversity analyses for each delimited taxon, and then reconstructed the demographic history for each taxon using traditional summary statistics, Bayesian skyline plots, and approximate Bayesian computation to test for signatures of recent and concerted population expansion. The genetic diversity values recovered here contravene the expectations of the specialist-generalist variation hypothesis and classic population genetics theory; all specialist lineages had greater genetic diversity than generalists. Demography suggests recent population expansions in all taxa, although Bayesian skyline plots and approximate Bayesian computation suggest the timing and magnitude of these events were idiosyncratic. These results do not meet the a priori expectation of concordance among symbiotic taxa and suggest that intrinsic aspects of species biology may contribute more to phylogeographic history than extrinsic forces that shape whole communities. The recovery of two cryptic specialist lineages adds an additional layer of biodiversity to this symbiosis and contributes to an emerging pattern of cryptic speciation in the specialist taxa. Our results underscore the differences in the evolutionary processes acting on marine systems from the terrestrial processes that often drive theory. 
Finally, we continue to highlight the Florida Reef Tract as an important biodiversity hotspot.
Scliar, Marilia O; Gouveia, Mateus H; Benazzo, Andrea; Ghirotto, Silvia; Fagundes, Nelson J R; Leal, Thiago P; Magalhães, Wagner C S; Pereira, Latife; Rodrigues, Maira R; Soares-Souza, Giordano B; Cabrera, Lilia; Berg, Douglas E; Gilman, Robert H; Bertorelle, Giorgio; Tarazona-Santos, Eduardo
2014-09-30
Archaeology reports millenary cultural contacts between Peruvian Coast-Andes and the Amazon Yunga, a rainforest transitional region between Andes and Lower Amazonia. To clarify the relationships between cultural and biological evolution of these populations, in particular between Amazon Yungas and Andeans, we used DNA-sequence data, a model-based Bayesian approach and several statistical validations to infer a set of demographic parameters. We found that the genetic diversity of the Shimaa (an Amazon Yunga population) is a subset of that of Quechuas from Central-Andes. Using the Isolation-with-Migration population genetics model, we inferred that the Shimaa ancestors were a small subgroup that split less than 5300 years ago (after the development of complex societies) from an ancestral Andean population. After the split, the most plausible scenario compatible with our results is that the ancestors of Shimaas moved toward the Peruvian Amazon Yunga and incorporated the culture and language of some of their neighbors, but not a substantial amount of their genes. We validated our results using Approximate Bayesian Computations, posterior predictive tests and the analysis of pseudo-observed datasets. We present a case study in which model-based Bayesian approaches, combined with necessary statistical validations, shed light on the prehistoric demographic relationship between Andeans and a population from the Amazon Yunga. Our results offer a testable model for the peopling of this large transitional environmental region between the Andes and the Lower Amazonia. However, studies on larger samples and involving more populations of these regions are necessary to confirm whether the predominant Andean biological origin of the Shimaas is the rule, and not the exception.
Cywinska, A; Hannan, M A; Kevan, P G; Roughley, R E; Iranpour, M; Hunter, F F
2010-12-01
This paper reports the first tests of the suitability of the standardized mitochondrial cytochrome c oxidase subunit I (COI) barcoding system for the identification of Canadian deerflies and horseflies. Two additional mitochondrial molecular markers were used to determine whether unambiguous species recognition in tabanids can be achieved. Our 332 Canadian tabanid samples yielded 650 sequences from five genera and 42 species. Standard COI barcodes demonstrated a strong A + T bias (mean 68.1%), especially at third codon positions (mean 93.0%). Our preliminary test of this system showed that the standard COI barcode worked well for Canadian Tabanidae: the target DNA can be easily recovered from small amounts of insect tissue and aligned for all tabanid taxa. Each tabanid species possessed distinctive sets of COI haplotypes which discriminated well among species. Average conspecific Kimura two-parameter (K2P) divergence (0.49%) was 12 times lower than the average divergence within species. Both the neighbour-joining and the Bayesian methods produced trees with identical monophyletic species groups. Two species, Chrysops dawsoni Philip and Chrysops montanus Osten Sacken (Diptera: Tabanidae), showed relatively deep intraspecific sequence divergences (∼ 10 times the average) for all three mitochondrial gene regions analysed. We suggest provisional differentiation of Ch. montanus into two haplotypes, namely, Ch. montanus haplomorph 1 and Ch. montanus haplomorph 2, both defined by their molecular sequences and by newly discovered differences in structural features near their ocelli. © 2010 Brock University. Medical and Veterinary Entomology © 2010 The Royal Entomological Society.
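The K2P divergences quoted in the abstract above come from a standard pairwise distance. As a minimal sketch (not the authors' pipeline), the Kimura two-parameter distance between two aligned sequences can be computed from the proportion of transitions P and transversions Q as d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)). The function name and the choice to skip gaps and ambiguity codes are assumptions made for illustration.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences.

    Hypothetical helper for illustration; sites where either sequence
    has a gap or an ambiguity code are skipped.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; everything else is a transversion
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    # for the K2P correction to be defined
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

Averaging this quantity over all pairs within a species, and separately over pairs from different species, would give the kind of within-group and between-group means that barcoding studies compare.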
Mucheka, Vimbai T; Lamb, Jennifer M; Pfukenyi, Davies M; Mukaratirwa, Samson
2015-11-30
The aim of this study was to identify and determine the genetic diversity of Fasciola species in cattle from Zimbabwe, the KwaZulu-Natal and Mpumalanga provinces of South Africa and selected wildlife hosts from Zimbabwe. This was based on analysis of DNA sequences of the nuclear ribosomal internal transcribed spacer (ITS1 and 2) and mitochondrial cytochrome oxidase 1 (CO1) regions. The sample of 120 flukes was collected from livers of 57 cattle at 4 abattoirs in Zimbabwe and 47 cattle at 6 abattoirs in South Africa; it also included three alcohol-preserved duiker, antelope and eland samples from Zimbabwe. Aligned sequences (ITS 506 base pairs and CO1 381 base pairs) were analyzed by neighbour-joining, maximum parsimony and Bayesian inference methods. Phylogenetic trees revealed the presence of Fasciola gigantica in cattle from Zimbabwe and F. gigantica and Fasciola hepatica in the samples from South Africa. F. hepatica was more prevalent (64%) in South Africa than F. gigantica. In Zimbabwe, F. gigantica was present in 99% of the samples; F. hepatica was found in only one cattle sample, an antelope (Hippotragus niger) and a duiker (Sylvicapra grimmia). This is the first molecular confirmation of the identity of Fasciola species in Zimbabwe and South Africa. Knowledge of the identity and distribution of these liver flukes at the molecular level will allow disease surveillance and control in the studied areas. Copyright © 2015 Elsevier B.V. All rights reserved.
González-Andrade, Fabricio; Sánchez, Dora
2005-10-01
We present efforts to identify skeletal remains, and to match them to relatives of missing persons, following an explosion that took place inside one of the munitions recesses of the Armoured Brigade of the Galapagos Armoured Cavalry, in the city of Riobamba, Ecuador, on Wednesday, November 20, 2002. Nineteen samples of bone remains and two tissue samples (a blood stain on a piece of fabric) from ground zero were analysed. DNA was extracted using phenol-chloroform-isoamyl alcohol and proteinase K. In some cases we increased the number of PCR cycles to 35 in order to amplify DNA from bones. An ABI 310 sequencer was used. Fragment sizes and allelic designations at the different loci were determined by comparison with the allelic ladders of the PowerPlex 16 kit, using the GeneScan Analysis Software. Five possible family groups were established and compared with the profiles found. Classical Bayesian methods were used to calculate the likelihood ratio, and it was possible to identify five different genetic profiles. This work was a novel experience for our forensic services: it was the first time they had used DNA as an identification method in a disaster, and the Ecuadorian justice system validated it as a very effective method.
NASA Astrophysics Data System (ADS)
Xu, Kuipeng; Tang, Xianghai; Bi, Guiqi; Cao, Min; Wang, Lu; Mao, Yunxiang
2017-08-01
Pyropia species grow in the intertidal zone and are cold-water adapted. To date, most of the information about the whole plastid and mitochondrial genomes (ptDNA and mtDNA) of this genus is limited to Northern Hemisphere species. Here, we report the sequencing of the ptDNA and mtDNA of the Antarctic red alga Pyropia endiviifolia using the Illumina platform. The plastid genome (195 784 bp, 33.28% GC content) contains 210 protein-coding genes, 37 tRNA genes and 6 rRNA genes. The mitochondrial genome (34 603 bp, 30.5% GC content) contains 26 protein-coding genes, 25 tRNA genes and 2 rRNA genes. Our results suggest that the organellar genomes of Py. endiviifolia have a compact organization. Although the collinearity of these genomes is conserved compared with other Pyropia species, the genome sizes show significant differences, mainly because of the different copy numbers of rDNA operons in the ptDNA and group II introns in the mtDNA. The other Pyropia species have 2–3 distinct intronic ORFs in their cox1 genes, but Py. endiviifolia has no introns in its cox1 gene. This has led to a smaller mtDNA than in other Pyropia species. The phylogenetic relationships within Pyropia were examined using concatenated gene sets from most of the available organellar genomes with both the maximum likelihood and Bayesian methods. The analysis revealed a sister taxa affiliation between the Antarctic species Py. endiviifolia and the North American species Py. kanakaensis.
Zhang, Yi; Lu, Yongfang; Yindee, Marnoch; Li, Kuan-Yi; Kuo, Hsiao-Yun; Ju, Yu-Ten; Ye, Shaohui; Faruque, Md Omar; Li, Qiang; Wang, Yachun; Cuong, Vu Chi; Pham, Lan Doan; Bouahom, Bounthong; Yang, Bingzhuang; Liang, Xianwei; Cai, Zhihua; Vankan, Dianne; Manatchaiworakul, Wallaya; Kowlim, Nonglid; Duangchantrasiri, Somphot; Wajjwalku, Worawidh; Colenbrander, Ben; Zhang, Yuan; Beerli, Peter; Lenstra, Johannes A; Barker, J Stuart F
2016-04-01
The swamp type of the Asian water buffalo is assumed to have been domesticated by about 4000 years BP, following the introduction of rice cultivation. Previous localizations of the domestication site were based on mitochondrial DNA (mtDNA) variation within China, accounting only for the maternal lineage. We carried out a comprehensive sampling of China, Taiwan, Vietnam, Laos, Thailand, Nepal and Bangladesh and sequenced the mtDNA Cytochrome b gene and control region and the Y-chromosomal ZFY, SRY and DBY sequences. Swamp buffalo has a higher diversity of both maternal and paternal lineages than river buffalo, with also a remarkable contrast between a weak phylogeographic structure of river buffalo and a strong geographic differentiation of swamp buffalo. The highest diversity of the swamp buffalo maternal lineages was found in south China and north Indochina on both banks of the Mekong River, while the highest diversity in paternal lineages was in the China/Indochina border region. We propose that domestication in this region was later followed by introgressive capture of wild cows west of the Mekong. Migration to the north followed the Yangtze valley as well as a more eastern route, but also involved translocations of both cows and bulls over large distances with a minor influence of river buffaloes in recent decades. Bayesian analyses of various migration models also supported domestication in the China/Indochina border region. Coalescence analysis yielded consistent estimates for the expansion of the major swamp buffalo haplogroups with a credibility interval of 900 to 3900 years BP. The spatial differentiation of mtDNA and Y-chromosomal haplotype distributions indicates a lack of gene flow between established populations that is unprecedented in livestock. © 2015 John Wiley & Sons Ltd.
Numerical study on the sequential Bayesian approach for radioactive materials detection
NASA Astrophysics Data System (ADS)
Qingpei, Xiang; Dongfeng, Tian; Jianyu, Zhu; Fanhua, Hao; Ge, Ding; Jun, Zeng
2013-01-01
A new detection method, based on the sequential Bayesian approach proposed by Candy et al., offers new horizons for research on radioactive materials detection. Compared with commonly adopted detection methods based on statistical theory, the sequential Bayesian approach offers shorter verification times when analysing spectra that contain low total counts, especially for complex mixtures of radionuclides. In this paper, a simulation experiment platform implementing the sequential Bayesian approach was developed. To study the performance of the approach, event sequences of γ-rays corresponding to the true parameters of a LaBr3(Ce) detector were obtained from an event sequence generator based on Monte Carlo sampling. The numerical experimental results are in accordance with those of Candy. Moreover, the relationship between the detection model and the event generator, represented respectively by the expected detection rate (Am) and the tested detection rate (Gm), is investigated. To achieve optimal performance of this processor, the interval of the tested detection rate as a function of the expected detection rate is also presented.
Zhao, Yu-Juan; Gong, Xun
2015-07-08
Leucomeris decora and Nouelia insignis (Asteraceae) are narrowly and allopatrically distributed species, separated by the Tanaka Line, an important biogeographic boundary in Southwest China. Previous morphological, cytogenetic and molecular studies suggested that L. decora is sister to N. insignis. However, it is less clear how the two species diverged: in full isolation, or with gene flow across the Tanaka Line. Here, we performed a population-level molecular study to characterize genetic differentiation and decipher the phylogeographic history of these two closely related species, based on variation in plastid and nuclear DNA and using a coalescent-based approach. These morphologically distinct species share plastid DNA (cpDNA) haplotypes. In contrast, Bayesian analysis of nuclear DNA (nDNA) uncovered two distinct clusters corresponding to L. decora and N. insignis. In the IMa analysis, no strong indication of migration was detected in either the cpDNA or the nDNA sequences. The molecular data pointed to a major west-east split in nuclear DNA between the two species, corresponding with the Tanaka Line. The coalescent time estimate for all cpDNA haplotypes dated to the Mid-Late Pleistocene. The estimated demographic parameters showed that the population size of L. decora was similar to that of N. insignis and that both experienced limited recent demographic fluctuations. The study revealed comprehensive species divergence and phylogeographic histories of N. insignis and L. decora divided by the Tanaka Line. The phylogeographic pattern inferred from cpDNA reflected ancestrally shared polymorphisms without post-divergence gene flow between species. The marked genealogical lineage divergence in nDNA provided some indication of the Tanaka Line's role as a barrier to plant dispersal, and lent support to its importance in promoting strong population structure and allopatric divergence.
Lukoschek, V; Waycott, M; Keogh, J S
2008-07-01
Polymorphic microsatellites are widely considered more powerful for resolving population structure than mitochondrial DNA (mtDNA) markers, particularly for recently diverged lineages or geographically proximate populations. Weaker population subdivision for biparentally inherited nuclear markers than maternally inherited mtDNA may signal male-biased dispersal but can also be attributed to marker-specific evolutionary characteristics and sampling properties. We discriminated between these competing explanations with a population genetic study on olive sea snakes, Aipysurus laevis. A previous mtDNA study revealed strong regional population structure for A. laevis around northern Australia, where Pleistocene sea-level fluctuations have influenced the genetic signatures of shallow-water marine species. Divergences among phylogroups dated to the Late Pleistocene, suggesting recent range expansions by previously isolated matrilines. Fine-scale population structure within regions was, however, poorly resolved for mtDNA. In order to improve estimates of fine-scale genetic divergence and to compare population structure between nuclear and mtDNA, 354 olive sea snakes (previously sequenced for mtDNA) were genotyped for five microsatellite loci. F statistics and Bayesian multilocus genotype clustering analyses found similar regional population structure as mtDNA and, after standardizing microsatellite F statistics for high heterozygosities, regional divergence estimates were quantitatively congruent between marker classes. Over small spatial scales, however, microsatellites recovered almost no genetic structure and standardized F statistics were orders of magnitude smaller than for mtDNA. Three tests for male-biased dispersal were not significant, suggesting that recent demographic expansions to the typically large population sizes of A. laevis have prevented microsatellites from reaching mutation-drift equilibrium and local populations may still be diverging.
Interspecific Introgression in Cetaceans: DNA Markers Reveal Post-F1 Status of a Pilot Whale
Miralles, Laura; Lens, Santiago; Rodríguez-Folgar, Antonio; Carrillo, Manuel; Martín, Vidal; Mikkelsen, Bjarni; Garcia-Vazquez, Eva
2013-01-01
Visual species identification of cetacean strandings is difficult, especially when dead specimens are degraded and/or species are morphologically similar. The two recognised pilot whale species (Globicephala melas and Globicephala macrorhynchus) are sympatric in the North Atlantic Ocean. These species are very similar in external appearance and their morphometric characteristics partially overlap; thus visual identification is not always reliable. Genetic species identification ensures correct identification of specimens. Here we have employed one mitochondrial (D-Loop region) and eight nuclear loci (microsatellites) as genetic markers to identify six stranded pilot whales found in Galicia (Northwest Spain), one of them of ambiguous phenotype. DNA analyses yielded positive amplification of all loci and enabled species identification. Nuclear microsatellite DNA genotypes revealed mixed ancestry for one individual, identified as a post-F1 interspecific hybrid employing two different Bayesian methods. From the mitochondrial sequence the maternal species was Globicephala melas. This is the first hybrid documented between Globicephala melas and G. macrorhynchus, and the first post-F1 hybrid genetically identified between cetaceans, revealing interspecific genetic introgression in marine mammals. We propose to add nuclear loci to genetic databases for cetacean species identification in order to detect hybrid individuals. PMID:23990883
Robles, María del Rosario; Cutillas, Cristina; Panei, Carlos Javier; Callejón, Rocío
2014-01-01
Populations of Trichuris spp. isolated from six species of sigmodontine rodents from Argentina were analyzed based on morphological characteristics and ITS2 (rDNA) region sequences. Molecular data provided an opportunity to discuss the phylogenetic relationships among the Trichuris spp. from North and South America (mainly from Argentina). Trichuris specimens were identified morphologically as Trichuris pardinasi, T. navonae, Trichuris sp. and a new Trichuris species, described in this paper. Sequences analyzed by Maximum Parsimony, Maximum Likelihood and Bayesian inference methods showed four main clades corresponding with the four different species regardless of geographical origin and host species. These four species from sigmodontine rodents clustered together and separated from Trichuris species isolated from murine and arvicoline rodents (outgroup). Different genetic lineages were observed among Trichuris species from sigmodontine rodents, which supported the proposal of a new species. Moreover, host distribution showed correspondence with the different tribes within the subfamily Sigmodontinae. PMID:25393618
Modeling Information Content Via Dirichlet-Multinomial Regression Analysis.
Ferrari, Alberto
2017-01-01
Shannon entropy is being increasingly used in biomedical research as an index of complexity and information content in sequences of symbols, e.g. languages, amino acid sequences, DNA methylation patterns and animal vocalizations. Yet, distributional properties of information entropy as a random variable have seldom been the object of study, leading to researchers mainly using linear models or simulation-based analytical approach to assess differences in information content, when entropy is measured repeatedly in different experimental conditions. Here a method to perform inference on entropy in such conditions is proposed. Building on results coming from studies in the field of Bayesian entropy estimation, a symmetric Dirichlet-multinomial regression model, able to deal efficiently with the issue of mean entropy estimation, is formulated. Through a simulation study the model is shown to outperform linear modeling in a vast range of scenarios and to have promising statistical properties. As a practical example, the method is applied to a data set coming from a real experiment on animal communication.
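For orientation, the quantity that the abstract above treats as a random variable can be sketched with the simple plug-in estimator of Shannon entropy. This is not the Dirichlet-multinomial regression model the paper proposes; the function name and the `base` parameter are assumptions for illustration. The plug-in estimate is known to be biased downward for short sequences, which is one motivation for Bayesian entropy estimation.

```python
import math
from collections import Counter

def shannon_entropy(symbols, base=2.0):
    """Plug-in Shannon entropy of a sequence of hashable symbols.

    Uses observed relative frequencies as probabilities:
    H = -sum(p_i * log(p_i)). With base=2 the result is in bits.
    """
    counts = Counter(symbols)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("cannot compute the entropy of an empty sequence")
    entropy = 0.0
    for count in counts.values():
        p = count / total
        entropy -= p * math.log(p, base)
    return entropy
```

For example, a sequence that uses four symbols equally often has an entropy of log2(4) bits, while a sequence made of a single repeated symbol has zero entropy.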
Moody, Michael L; Rieseberg, Loren H
2012-07-01
The annual sunflowers (Helianthus sect. Helianthus) present a formidable challenge for phylogenetic inference because of ancient hybrid speciation, recent introgression, and suspected issues with deep coalescence. Here we analyze sequence data from 11 nuclear DNA (nDNA) genes for multiple genotypes of species within the section to (1) reconstruct the phylogeny of this group, (2) explore the utility of nDNA gene trees for detecting hybrid speciation and introgression; and (3) test an empirical method of hybrid identification based on the phylogenetic congruence of nDNA gene trees from tightly linked genes. We uncovered considerable topological heterogeneity among gene trees with or without three previously identified hybrid species included in the analyses, as well as a general lack of reciprocal monophyly of species. Nonetheless, partitioned Bayesian analyses provided strong support for the reciprocal monophyly of all species except H. annuus (0.89 PP), the most widespread and abundant annual sunflower. Previous hypotheses of relationships among taxa were generally strongly supported (1.0 PP), except among taxa typically associated with H. annuus, apparently due to the paraphyly of the latter in all gene trees. While the individual nDNA gene trees provided a useful means for detecting recent hybridization, identification of ancient hybridization was problematic for all ancient hybrid species, even when linkage was considered. We discuss biological factors that affect the efficacy of phylogenetic methods for hybrid identification.
Cyber-T web server: differential analysis of high-throughput data.
Kayala, Matthew A; Baldi, Pierre
2012-07-01
The Bayesian regularization method for high-throughput differential analysis, described in Baldi and Long (A Bayesian framework for the analysis of microarray expression data: regularized t-test and statistical inferences of gene changes. Bioinformatics 2001: 17: 509-519) and implemented in the Cyber-T web server, is one of the most widely validated. Cyber-T implements a t-test using a Bayesian framework to compute a regularized variance of the measurements associated with each probe under each condition. This regularized estimate is derived by flexibly combining the empirical measurements with a prior, or background, derived from pooling measurements associated with probes in the same neighborhood. This approach flexibly addresses problems associated with low replication levels and technology biases, not only for DNA microarrays, but also for other technologies, such as protein arrays, quantitative mass spectrometry and next-generation sequencing (RNA-seq). Here we present an update to the Cyber-T web server, incorporating several useful new additions and improvements. Several preprocessing data normalization options, including logarithmic and Variance Stabilizing Normalization (VSN) transforms, are included. To augment two-sample t-tests, a one-way analysis of variance is implemented. Several methods for multiple-testing correction, including standard frequentist methods and a probabilistic mixture model treatment, are available. Diagnostic plots allow visual assessment of the results. The web server provides comprehensive documentation and example data sets. The Cyber-T web server, with R source code and data sets, is publicly available at http://cybert.ics.uci.edu/.
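The idea of combining an empirical variance with a background variance can be illustrated with a short sketch. The abstract does not give the exact weighting used by Cyber-T, so the formula below, (ν0·σ0² + (n − 1)·s²) / (ν0 + n − 2), is an assumption for illustration, as are the function and parameter names; it is not the server's R code.

```python
import math
import statistics

def regularized_variance(values, prior_var, prior_df):
    """Shrink the sample variance toward a background (prior) variance.

    prior_var: background variance, e.g. pooled from neighbouring probes.
    prior_df:  weight given to the background, in pseudo-observations.
    The weighting formula is an assumption made for this sketch.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two replicates")
    if prior_df <= 0:
        raise ValueError("prior_df must be positive")
    s2 = statistics.variance(values)  # unbiased sample variance
    return (prior_df * prior_var + (n - 1) * s2) / (prior_df + n - 2)

def regularized_t(group_a, group_b, prior_var, prior_df):
    """Two-sample t statistic using regularized variances per group."""
    var_a = regularized_variance(group_a, prior_var, prior_df)
    var_b = regularized_variance(group_b, prior_var, prior_df)
    std_err = math.sqrt(var_a / len(group_a) + var_b / len(group_b))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / std_err
```

With few replicates the background term dominates, which stabilizes the statistic for probes whose sample variance happens to be very small; with many replicates the estimate approaches the ordinary sample variance.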
Yu, Farong; Yu, Fahong; Pang, Junfeng; Kilpatrick, C William; McGuire, Peter M; Wang, Yingxiang; Lu, Shunqing; Woods, Charles A
2006-03-01
With modified DNA extraction and purification protocols, the complete cytochrome b gene sequences (1140 bp) were determined from degraded museum specimens. Molecular analysis and morphological examination of cranial characteristics of the giant flying squirrels of Petaurista philippensis complex (P. grandis, P. hainana, and P. yunanensis) and other Petaurista species yielded new insights into long-standing controversies in the Petaurista systematics. Patterns of genetic variations and morphological differences observed in this study indicate that P. hainana, P. albiventer, and P. yunanensis can be recognized as distinct species, and P. grandis and P. petaurista are conspecific populations. Phylogenetic relationships reconstructed by using parsimony, likelihood, and Bayesian methods reveal that, with P. leucogenys as the basal branch, all Petaurista groups formed two distinct clades. Petaurista philippensis, P. hainana, P. yunanensis, and P. albiventer are clustered in the same clade, while P. grandis shows a close relationship to P. petaurista. Deduced divergence times based on Bayesian analysis and the transversional substitution at the third codon suggest that the retreating of glaciers and upheavals or movements of tectonic plates in the Pliocene-Pleistocene were the major factors responsible for the present geographical distributions of Petaurista groups.
A Bayesian analysis for identifying DNA copy number variations using a compound Poisson process.
Chen, Jie; Yiğiter, Ayten; Wang, Yu-Ping; Deng, Hong-Wen
2010-01-01
To study chromosomal aberrations that may lead to cancer formation or genetic diseases, the array-based Comparative Genomic Hybridization (aCGH) technique is often used for detecting DNA copy number variants (CNVs). Various methods have been developed for gaining CNVs information based on aCGH data. However, most of these methods make use of the log-intensity ratios in aCGH data without taking advantage of other information such as the DNA probe (e.g., biomarker) positions/distances contained in the data. Motivated by the specific features of aCGH data, we developed a novel method that takes into account the estimation of a change point or locus of the CNV in aCGH data with its associated biomarker position on the chromosome using a compound Poisson process. We used a Bayesian approach to derive the posterior probability for the estimation of the CNV locus. To detect loci of multiple CNVs in the data, a sliding window process combined with our derived Bayesian posterior probability was proposed. To evaluate the performance of the method in the estimation of the CNV locus, we first performed simulation studies. Finally, we applied our approach to real data from aCGH experiments, demonstrating its applicability.
The DNA database search controversy revisited: bridging the Bayesian-frequentist gap.
Storvik, Geir; Egeland, Thore
2007-09-01
Two different quantities have been suggested for quantification of evidence in cases where a suspect is found by a search through a database of DNA profiles. The likelihood ratio, typically motivated from a Bayesian setting, is preferred by most experts in the field. The so-called np rule has been motivated by frequentist arguments and was suggested by the American National Research Council and Stockmarr (1999, Biometrics 55, 671-677). The two quantities differ substantially and have given rise to the DNA database search controversy. Although several authors have criticized the different approaches, a full explanation of why these differences appear is still lacking. In this article we show that a P-value in a frequentist hypothesis setting is approximately equal to the result of the np rule. We argue, however, that a more reasonable procedure in this case is to use conditional testing, in which case a P-value directly related to posterior probabilities and the likelihood ratio is obtained. This way of viewing the problem bridges the gap between the Bayesian and frequentist approaches. At the same time it indicates that the np rule should not be used to quantify evidence.
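The approximate equality between a frequentist P-value and the np rule mentioned above can be checked numerically. The sketch below assumes the usual notation, which the abstract does not spell out: p is the random match probability of the profile and n is the number of profiles in the database, with independent comparisons. The function names are hypothetical, and this is not the conditional test the authors recommend.

```python
import math

def p_value_at_least_one_match(p, n):
    """Probability of at least one chance match among n independent
    database profiles, each matching with probability p: 1 - (1 - p)**n.

    Computed with log1p/expm1 to stay accurate for very small p.
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    return -math.expm1(n * math.log1p(-p))

def np_rule(p, n):
    """The np rule: match probability scaled by database size."""
    return n * p

if __name__ == "__main__":
    p, n = 1e-6, 10_000  # illustrative values only
    print(p_value_at_least_one_match(p, n), np_rule(p, n))
```

For small values of n times p the two numbers nearly coincide, and n times p is always an upper bound on the exact probability, which is the sense in which the np rule approximates this unconditional P-value.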
New Rickettsia species in soft ticks Ornithodoros hasei collected from bats in French Guiana.
Tahir, Djamel; Socolovschi, Cristina; Marié, Jean-Lou; Ganay, Gautier; Berenger, Jean-Michel; Bompar, Jean-Michel; Blanchet, Denis; Cheuret, Marie; Mediannikov, Oleg; Raoult, Didier; Davoust, Bernard; Parola, Philippe
2016-10-01
In French Guiana, located on the northeastern coast of South America, bats of different species are very numerous. The infection of bats and their ticks with zoonotic bacteria, especially Rickettsia species, is so far unknown. In order to improve knowledge of these zoonotic pathogens in this French overseas department, the presence and diversity of tick-borne bacteria was investigated with molecular tools in bat ticks. In the beginning of 2013, 32 bats were caught in Saint-Jean-du-Maroni, an area close to the coast of French Guiana, and the ticks of these animals were collected. A total of 354 larvae of Argasidae soft ticks (Ornithodoros hasei) from 12 bats (Noctilio albiventris) were collected and 107 of them were analysed. DNA was extracted from the samples and quantitative real-time PCR was carried out to detect Rickettsia spp., Bartonella spp., Borrelia spp. and Coxiella burnetii. All tested samples were negative for Bartonella spp., Borrelia spp. and Coxiella burnetii. Rickettsia DNA was detected in 31 (28.9%) ticks. An almost entire (1118 base pairs long) sequence of the gltA gene was obtained after the amplification of some positive samples on conventional PCR and sequencing. A Bayesian tree was constructed using concatenated rrs, gltA, ompA, ompB, and gene D sequences. The study of characteristic sequences shows that this Rickettsia species is very close (98.3-99.8%) genetically to R. peacockii. Nevertheless, the comparative analysis of sequences obtained from gltA, ompA, ompB, rrs and gene D fragments demonstrated that this Rickettsia is different from the other members of the spotted fever group. The sequences of this new species were deposited in GenBank as Candidatus Rickettsia wissemanii. This is the first report showing the presence of nucleic acid of Rickettsia in Ornithodoros hasei ticks from South American bats. Copyright © 2016 Elsevier GmbH. All rights reserved.
Szkuta, Bianca; Ballantyne, Kaye N; Kokshoorn, Bas; van Oorschot, Roland A H
2018-03-01
Questions relating to how DNA from an individual got to where it was recovered from and the activities associated with its pickup, retention and deposition are increasingly relevant to criminal investigations and judicial considerations. To address activity level propositions, investigators are typically required to assess the likelihood that DNA was transferred indirectly and not deposited through direct contact with an item or surface. By constructing a series of Bayesian networks, we demonstrate their use in assessing activity level propositions derived from a recent legal case involving the alleged secondary transfer of DNA to a surface following a handshaking event. In the absence of data required to perform the assessment, a set of handshaking simulations were performed to obtain probabilities on the persistence of non-self DNA on the hands following a 40min, 5h or 8h delay between the handshake and contact with the final surface (an axe handle). Variables such as time elapsed, and the activities performed and objects contacted between the handshake and contact with the axe handle, were also considered when assessing the DNA results. DNA from a known contributor was transferred to the right hand of an opposing hand-shaker (as a depositor), and could be subsequently transferred to, and detected on, a surface contacted by the depositor 40min to 5h post-handshake. No non-self DNA from the known contributor was detected in deposits made 8h post-handshake. DNA from the depositor was generally detected as the major or only contributor in the profiles generated. Contributions from the known contributor were minor, decreasing in presence and in the strength of support for inclusion as the time between the handshake and transfer event increased. The construction of a series of Bayesian networks based on the case circumstances provided empirical estimations of the likelihood of direct or indirect deposition. 
The analyses and conclusions presented demonstrate both the complexity of activity level assessments concerning DNA evidence, and the power of Bayesian networks to visualise and explore the issues of interest for a given case. Copyright © 2017 Elsevier B.V. All rights reserved.
Vongvanrungruang, A; Mongkolsiriwatana, C; Boonkaew, T; Sawatdichaikul, O; Srikulnath, K; Peyachoknagul, S
2016-09-19
The fragrance gene, betaine aldehyde dehydrogenase 2 (Badh2), has been well studied in many plant species. The objectives of this study were to clone Badh2 and compare the sequences between aromatic and non-aromatic coconuts. The complete coding region was cloned from cDNA of both aromatic and non-aromatic coconuts. The nucleotide sequences were highly homologous to Badh2 genes of other plants. Badh2 consisted of a 1512-bp open reading frame encoding 503 amino acids. A single nucleotide difference between aromatic and non-aromatic coconuts resulted in the conversion of alanine (non-aromatic) to proline (aromatic) at position 442, which was the substrate binding site of BADH2. The ring side chain of proline could destabilize the structure leading to a non-functional enzyme. Badh2 genomic DNA was cloned from exon 1 to 4, and from exon 5 to 15 from the two coconut types, except for intron 4 that was very long. The intron sequences of the two coconut groups were highly homologous. No differences in Badh2 expression were found among the tissues of aromatic coconut or between aromatic and non-aromatic coconuts. The amino acid sequences of BADH2 from coconut and other plants were compared and the genetic relationship was analyzed using MEGA 7.0. The phylogenetic tree reconstructed by the Bayesian information criterion consisted of two distinct groups of monocots and dicots. Among the monocots, coconut (Cocos nucifera) and oil palm (Elaeis guineensis) were the most closely related species. A marker for coconut differentiation was developed from one-base substitution site and could be successfully used.
Han, Bang-Xing; Yuan, Yuan; Huang, Lu-Qi; Zhao, Qun; Tan, Ling-Ling; Song, Xiang-Wen; He, Xiao-Mei; Xu, Tao; Liu, Feng; Wang, Jian
2017-01-01
The traditional Chinese medicines (TCM) Qianhu and Zihuaqianhu are the dried roots of Peucedanum praeruptorum and Angelica decursiva, respectively. The plant sources of Qianhu and Zihuaqianhu are complex, the chemical compositions of P. praeruptorum and A. decursiva are significantly different, and many adulterants exist because of differences in traditional understanding and medication habits. Rapid and accurate identification methods are therefore required. The aim was to study the feasibility of using DNA barcoding to distinguish between Qianhu (P. praeruptorum), Zihuaqianhu (A. decursiva), and common adulterants, based on internal transcribed spacer (ITS) sequences, as well as specific PCR identification to distinguish P. praeruptorum from A. decursiva. The ITS sequences of P. praeruptorum, A. decursiva, and adulterants were studied, and a phylogenetic tree was constructed. Based on the ITS barcode, the specific PCR primer pairs QH-CP19s/QH-CP19a and ZHQH-CP3s/ZHQH-CP3a were designed for P. praeruptorum and A. decursiva, respectively. The amplification conditions were optimized, and specific PCR products were obtained. The phylogenetic trees constructed using the BI and MP methods were consistent, and the P. praeruptorum and A. decursiva sequence haplotypes each formed a monophyletic group. In the PCR products, the target bands appeared in the genuine drug and not in the adulterants, which suggests high specificity of the two primer pairs. The ITS sequence was an ideal DNA barcode to identify P. praeruptorum, A. decursiva, and adulterants. Specific PCR is a quick and effective method to distinguish between P. praeruptorum and A. decursiva. Abbreviations used: TCM: Traditional Chinese medicine, P.: Peucedanum, A.: Angelica, ITS: Internal transcribed spacer, PCR: Polymerase chain reaction, NCBI: National Center for Biotechnology Information, NI: Number of individuals, HN: Haplotype number, GAN: GenBank accession numbers, L.: Ligusticum, O.: Ostericum, P.: Pimpinella, BI: Bayesian inference, MP: Maximum parsimony, AIC: Akaike Information Criterion, MCMC: Markov chain Monte Carlo, TBR: Tree bisection-reconnection, LPP: Length of PCR product, PRP: PCR reaction procedure, SNP: Single nucleotide polymorphisms, PP: Posterior probability, BS: Bootstrap.
Deep phylogeographic divergence and cytonuclear discordance in the grasshopper Oedaleus decorus.
Kindler, Eveline; Arlettaz, Raphaël; Heckel, Gerald
2012-11-01
The grasshopper Oedaleus decorus is a thermophilic insect with a large, mostly south-Palaearctic distribution range, stretching from the Mediterranean regions in Europe to Central Asia and China. In this study, we analyzed the extent of phylogenetic divergence and the recent evolutionary history of the species based on 274 specimens from 26 localities across the distribution range in Europe. Phylogenetic relationships were determined using sequences of two mitochondrial loci (ctr, ND2) with neighbour-joining and Bayesian methods. Additionally, genetic differentiation was analyzed based on mitochondrial DNA and 11 microsatellite markers using F-statistics, model-free multivariate and model-based Bayesian clustering approaches. Phylogenetic analyses consistently detected two highly divergent, allopatrically distributed lineages within O. decorus. The divergence between these Western and Eastern lineages, which meet in the region of the Alps, was similar to the divergence of each lineage from the sister species O. asiaticus. Genetic differentiation for ctr was extremely high between Western and Eastern grasshopper populations (F(ct)=0.95). Microsatellite markers detected much lower but nevertheless very significant genetic structure among population samples. The nuclear data also demonstrated a case of cytonuclear discordance because the affiliation with mitochondrial lineages was incongruent in Northern Italy. Taken together, these results provide evidence of an ancient separation within Oedaleus and either historical introgression of mtDNA among lineages and/or ongoing sex-specific gene flow in this grasshopper. Our study stresses the importance of multilocus approaches for unravelling the history and status of taxa of uncertain evolutionary divergence. Copyright © 2012 Elsevier Inc. All rights reserved.
Larridon, Isabel; Walter, Helmut E; Guerrero, Pablo C; Duarte, Milén; Cisternas, Mauricio A; Hernández, Carol Peña; Bauters, Kenneth; Asselman, Pieter; Goetghebeur, Paul; Samain, Marie-Stéphanie
2015-09-01
Species of the endemic Chilean cactus genus Copiapoa have cylindrical or (sub)globose stems that are solitary or form (large) clusters and typically yellow flowers. Many species are threatened with extinction. Despite being icons of the Atacama Desert and well loved by cactus enthusiasts, the evolution and diversity of Copiapoa has not yet been studied using a molecular approach. Sequence data of three plastid DNA markers (rpl32-trnL, trnH-psbA, ycf1) of 39 Copiapoa taxa were analyzed using maximum likelihood and Bayesian inference approaches. Species distributions were modeled based on geo-referenced localities and climatic data. Evolution of character states of four characters (root morphology, stem branching, stem shape, and stem diameter) as well as ancestral areas were reconstructed using a Bayesian and maximum likelihood framework, respectively. Clades of species are revealed. Though 32 morphologically defined species can be recognized, genetic diversity between some species and infraspecific taxa is too low to delimit their boundaries using plastid DNA markers. Recovered relationships are often supported by morphological and biogeographical patterns. The origin of Copiapoa likely lies between southern Peru and the extreme north of Chile. The Copiapó Valley limited colonization between two biogeographical areas. Copiapoa is here defined to include 32 species and five heterotypic subspecies. Thirty species are classified into four sections and two subsections, while two species remain unplaced. A better understanding of evolution and diversity of Copiapoa will allow allocating conservation resources to the most threatened lineages and focusing conservation action on real biodiversity. © 2015 Botanical Society of America.
Takamiya, Tomoko; Wongsawad, Pheravut; Sathapattayanon, Apirada; Tajima, Natsuko; Suzuki, Shunichiro; Kitamura, Saki; Shioda, Nao; Handa, Takashi; Kitanaka, Susumu; Iijima, Hiroshi; Yukawa, Tomohisa
2014-01-01
It is always difficult to construct coherent classification systems for plant lineages having diverse morphological characters. The genus Dendrobium, one of the largest genera in the Orchidaceae, includes ∼1100 species, and enormous morphological diversification has hindered the establishment of consistent classification systems covering all major groups of this genus. Given the particular importance of species in Dendrobium section Dendrobium and allied groups as floriculture and crude drug genetic resources, there is an urgent need to establish a stable classification system. To clarify phylogenetic relationships in Dendrobium section Dendrobium and allied groups, we analysed the macromolecular characters of the group. Phylogenetic analyses of 210 taxa of Dendrobium were conducted on DNA sequences of internal transcribed spacer (ITS) regions of 18S–26S nuclear ribosomal DNA and the maturase-coding gene (matK) located in an intron of the plastid gene trnK using maximum parsimony and Bayesian methods. The parsimony and Bayesian analyses revealed 13 distinct clades in the group comprising section Dendrobium and its allied groups. Results also showed paraphyly or polyphyly of sections Amblyanthus, Aporum, Breviflores, Calcarifera, Crumenata, Dendrobium, Densiflora, Distichophyllae, Dolichocentrum, Holochrysa, Oxyglossum and Pedilonum. On the other hand, the monophyly of section Stachyobium was well supported. It was found that many of the morphological characters that have been believed to reflect phylogenetic relationships are, in fact, the result of convergence. As such, many of the sections that have been recognized up to this point were found to not be monophyletic, so recircumscription of sections is required. PMID:25107672
Kolleck, Jakob; Yang, Mouyu; Zinner, Dietmar; Roos, Christian
2013-01-01
To evaluate the conservation status of a species or population it is necessary to gain insight into its ecological requirements, reproduction, genetic population structure, and overall genetic diversity. In our study we examined the genetic diversity of Rhinopithecus brelichi by analyzing microsatellite data and compared them with already existing data derived from mitochondrial DNA, which revealed that R. brelichi exhibits the lowest mitochondrial diversity of all Rhinopithecus species studied so far. In contrast, the genetic diversity of nuclear DNA is high and comparable to other Rhinopithecus species, i.e. the examined microsatellite loci are as highly polymorphic as in other species of the genus. An explanation for these differences in mitochondrial and nuclear genetic diversity could be male-biased dispersal. Females most likely stay within their natal band and males migrate between bands; thus mitochondrial DNA is not exchanged between bands, whereas nuclear DNA is exchanged via males. A Bayesian Skyline Plot based on mitochondrial DNA sequences shows a strong decrease of the female effective population size (Nef) starting about 3,500 to 4,000 years ago, which concurs with the increasing human population in the area and respective expansion of agriculture. Given that we found no indication of a loss of nuclear DNA diversity in R. brelichi it seems that this factor does not represent the most prominent conservation threat for the long-term survival of the species. Conservation efforts should therefore focus more on immediate threats such as development of tourism and habitat destruction. PMID:24009761
A novel Bayesian change-point algorithm for genome-wide analysis of diverse ChIPseq data types.
Xing, Haipeng; Liao, Willey; Mo, Yifan; Zhang, Michael Q
2012-12-10
ChIPseq is a widely used technique for investigating protein-DNA interactions. Read density profiles are generated by using next-generation sequencing of protein-bound DNA and aligning the short reads to a reference genome. Enriched regions are revealed as peaks, which often differ dramatically in shape, depending on the target protein(1). For example, transcription factors often bind in a site- and sequence-specific manner and tend to produce punctate peaks, while histone modifications are more pervasive and are characterized by broad, diffuse islands of enrichment(2). Reliably identifying these regions was the focus of our work. Algorithms for analyzing ChIPseq data have employed various methodologies, from heuristics(3-5) to more rigorous statistical models, e.g. Hidden Markov Models (HMMs)(6-8). We sought a solution that minimized the necessity for difficult-to-define, ad hoc parameters that often compromise resolution and lessen the intuitive usability of the tool. With respect to HMM-based methods, we aimed to curtail parameter estimation procedures and simple, finite state classifications that are often utilized. Additionally, conventional ChIPseq data analysis involves categorization of the expected read density profiles as either punctate or diffuse followed by subsequent application of the appropriate tool. We further aimed to replace the need for these two distinct models with a single, more versatile model, which can capably address the entire spectrum of data types. To meet these objectives, we first constructed a statistical framework that naturally modeled ChIPseq data structures using a cutting edge advance in HMMs(9), which utilizes only explicit formulas, an innovation crucial to its performance advantages. More sophisticated than heuristic models, our HMM accommodates infinite hidden states through a Bayesian model. We applied it to identifying reasonable change points in read density, which further define segments of enrichment. 
Our analysis revealed that our Bayesian Change Point (BCP) algorithm had a reduced computational complexity, evidenced by an abridged run time and memory footprint. The BCP algorithm was successfully applied to both punctate peak and diffuse island identification with robust accuracy and limited user-defined parameters. This illustrated both its versatility and ease of use. Consequently, we believe it can be implemented readily across broad ranges of data types and end users in a manner that is easily compared and contrasted, making it a great tool for ChIPseq data analysis that can aid in collaboration and corroboration between research groups. Here, we demonstrate the application of BCP to existing transcription factor(10,11) and epigenetic data(12) to illustrate its usefulness.
Barnett, Ross; Yamaguchi, Nobuyuki; Shapiro, Beth; Ho, Simon Y W; Barnes, Ian; Sabin, Richard; Werdelin, Lars; Cuisin, Jacques; Larson, Greger
2014-04-02
Understanding the demographic history of a population is critical to conservation and to our broader understanding of evolutionary processes. For many tropical large mammals, however, this aim is confounded by the absence of fossil material and by the misleading signal obtained from genetic data of recently fragmented and isolated populations. This is particularly true for the lion which as a consequence of millennia of human persecution, has large gaps in its natural distribution and several recently extinct populations. We sequenced mitochondrial DNA from museum-preserved individuals, including the extinct Barbary lion (Panthera leo leo) and Iranian lion (P. l. persica), as well as lions from West and Central Africa. We added these to a broader sample of lion sequences, resulting in a data set spanning the historical range of lions. Our Bayesian phylogeographical analyses provide evidence for highly supported, reciprocally monophyletic lion clades. Using a molecular clock, we estimated that recent lion lineages began to diverge in the Late Pleistocene. Expanding equatorial rainforest probably separated lions in South and East Africa from other populations. West African lions then expanded into Central Africa during periods of rainforest contraction. Lastly, we found evidence of two separate incursions into Asia from North Africa, first into India and later into the Middle East. We have identified deep, well-supported splits within the mitochondrial phylogeny of African lions, arguing for recognition of some regional populations as worthy of independent conservation. More morphological and nuclear DNA data are now needed to test these subdivisions. PMID:24690312
Pedraza-Lara, Carlos; Barrientos-Lozano, Ludivina; Rocha-Sánchez, Aurora Y; Zaldívar-Riverón, Alejandro
2015-03-01
The genus Sphenarium (Pyrgomorphidae) is a small group of grasshoppers endemic to México and Guatemala that are economically and culturally important both as a food source and as agricultural pests. However, its taxonomy has been largely neglected mainly due to its conserved interspecific external morphology and the considerable intraspecific variation in colour pattern of some taxa. Here we examined morphological as well as mitochondrial and nuclear DNA sequence data to assess the species boundaries and evolutionary history in Sphenarium. Our morphological identification and DNA sequence-based species delimitation, carried out with three different approaches (DNA barcoding, general mixed Yule-coalescent model, Bayesian species delimitation), all recovered a higher number of putative species of Sphenarium than previously recognised. We unambiguously delimit seven species, and between five and ten additional species depending on the data/method analysed. Phylogenetic relationships within the genus strongly support two main clades, one exclusively montane, the other coastal. Divergence time estimates suggest late Miocene to Pliocene ages for the origin and most of the early diversification events in the genus, which were probably influenced by the formation of the Trans-Mexican Volcanic Belt. A series of Pleistocene events could have led to the current species diversification in both montane and coastal regions. This study not only reveals an overlooked species richness for the most popular edible insect in Mexico, but also highlights the influence of the dynamic geological and climatic history of the region in shaping its current diversity. Copyright © 2015 Elsevier Inc. All rights reserved.
Mikaeili, F; Mirhendi, H; Mohebali, M; Hosseini, M; Sharbatkhori, M; Zarei, Z; Kia, E B
2015-07-01
The study was conducted to determine the sequence variation in two mitochondrial genes, namely cytochrome c oxidase 1 (pcox1) and NADH dehydrogenase 1 (pnad1) within and among isolates of Toxocara cati, Toxocara canis and Toxascaris leonina. Genomic DNA was extracted from 32 isolates of T. cati, 9 isolates of T. canis and 19 isolates of T. leonina collected from cats and dogs in different geographical areas of Iran. Mitochondrial genes were amplified by polymerase chain reaction (PCR) and sequenced. Sequence data were aligned using the BioEdit software and compared with published sequences in GenBank. Phylogenetic analysis was performed using Bayesian inference and maximum likelihood methods. Based on pairwise comparison, intra-species genetic diversity within Iranian isolates of T. cati, T. canis and T. leonina amounted to 0-2.3%, 0-1.3% and 0-1.0% for pcox1 and 0-2.0%, 0-1.7% and 0-2.6% for pnad1, respectively. Inter-species sequence variation among the three ascaridoid nematodes was significantly higher, being 9.5-16.6% for pcox1 and 11.9-26.7% for pnad1. Sequence and phylogenetic analysis of the pcox1 and pnad1 genes indicated that there is significant genetic diversity within and among isolates of T. cati, T. canis and T. leonina from different areas of Iran, and these genes can be used for studying genetic variation of ascaridoid nematodes.
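The pairwise comparison described in this abstract can be illustrated with a minimal Python sketch. It assumes the sequences are already aligned to equal length and that sites containing a gap or ambiguity code in either sequence are skipped; the function names and the choice of ignored characters are hypothetical and are not taken from the study or from BioEdit.

```python
from itertools import combinations


def percent_difference(seq_a, seq_b, ignored="-?N"):
    """Uncorrected pairwise difference (%) between two aligned sequences.

    Sites where either sequence has a gap or ambiguity code are skipped
    (an assumption of this sketch, not a rule taken from the study).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in ignored or b in ignored:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return 100.0 * differing / compared


def diversity_range(aligned_sequences):
    """Smallest and largest pairwise difference (%) within a set of isolates."""
    values = [percent_difference(a, b)
              for a, b in combinations(aligned_sequences, 2)]
    return min(values), max(values)
```

Calling `diversity_range` on the aligned isolates of one species gives a range of the same form as the "0-2.3%" figures quoted above; calling `percent_difference` on isolates of different species gives the inter-species values.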
Bettenbühl, Mario; Rusconi, Marco; Engbert, Ralf; Holschneider, Matthias
2012-01-01
Complex biological dynamics often generate sequences of discrete events which can be described as a Markov process. The order of the underlying Markovian stochastic process is fundamental for characterizing statistical dependencies within sequences. As an example for this class of biological systems, we investigate the Markov order of sequences of microsaccadic eye movements from human observers. We calculate the integrated likelihood of a given sequence for various orders of the Markov process and use this in a Bayesian framework for statistical inference on the Markov order. Our analysis shows that data from most participants are best explained by a first-order Markov process. This is compatible with recent findings of a statistical coupling of subsequent microsaccade orientations. Our method might prove to be useful for a broad class of biological systems.
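The idea of comparing Markov orders through an integrated likelihood can be sketched in Python as follows. This is only an illustration under stated assumptions: each context gets an independent symmetric Dirichlet prior with parameter `alpha`, all orders are scored on the same symbols (those after the first `max_order` positions), and the prior over orders is uniform. None of these choices, nor the function names, are taken from the paper.

```python
import random
from collections import Counter
from math import exp, lgamma


def log_evidence(seq, order, alphabet, alpha=1.0, skip=None):
    """Log integrated likelihood of seq[skip:] under a Markov chain of `order`.

    Transition probabilities for every context are integrated out against a
    symmetric Dirichlet(alpha) prior (an assumption of this sketch).
    """
    skip = order if skip is None else skip
    counts = Counter()   # (context, next symbol) -> count
    totals = Counter()   # context -> count
    for i in range(skip, len(seq)):
        context = tuple(seq[i - order:i])
        counts[(context, seq[i])] += 1
        totals[context] += 1
    a = len(alphabet)
    logp = 0.0
    for context, n_c in totals.items():
        logp += lgamma(a * alpha) - lgamma(a * alpha + n_c)
        for s in alphabet:
            logp += lgamma(alpha + counts[(context, s)]) - lgamma(alpha)
    return logp


def order_posterior(seq, max_order, alphabet, alpha=1.0):
    """Posterior probability of each order 0..max_order under a uniform prior."""
    logs = [log_evidence(seq, k, alphabet, alpha, skip=max_order)
            for k in range(max_order + 1)]
    m = max(logs)
    weights = [exp(v - m) for v in logs]   # subtract the max for stability
    z = sum(weights)
    return [w / z for w in weights]


def simulate_sticky_chain(n, stay=0.8, seed=0):
    """Toy first-order chain on {'a', 'b'} that keeps its state with prob. `stay`."""
    rng = random.Random(seed)
    state, out = "a", []
    for _ in range(n):
        out.append(state)
        if rng.random() > stay:
            state = "b" if state == "a" else "a"
    return "".join(out)
```

For example, `order_posterior(simulate_sticky_chain(3000), 3, "ab")` returns four probabilities; for data generated by a first-order chain one would expect most of the mass on order 1, since higher orders pay for their extra contexts in the integrated likelihood.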
Accurate, Rapid Taxonomic Classification of Fungal Large-Subunit rRNA Genes
Liu, Kuan-Liang; Porras-Alfaro, Andrea; Eichorst, Stephanie A.
2012-01-01
Taxonomic and phylogenetic fingerprinting based on sequence analysis of gene fragments from the large-subunit rRNA (LSU) gene or the internal transcribed spacer (ITS) region is becoming an integral part of fungal classification. The lack of an accurate and robust classification tool trained by a validated sequence database for taxonomic placement of fungal LSU genes is a severe limitation in taxonomic analysis of fungal isolates or large data sets obtained from environmental surveys. Using a hand-curated set of 8,506 fungal LSU gene fragments, we determined the performance characteristics of a naïve Bayesian classifier across multiple taxonomic levels and compared the classifier performance to that of a sequence similarity-based (BLASTN) approach. The naïve Bayesian classifier was computationally more rapid (>460-fold with our system) than the BLASTN approach, and it provided equal or superior classification accuracy. Classifier accuracies were compared using sequence fragments of 100 bp and 400 bp and two different PCR primer anchor points to mimic sequence read lengths commonly obtained using current high-throughput sequencing technologies. Accuracy was higher with 400-bp sequence reads than with 100-bp reads. It was also significantly affected by sequence location across the 1,400-bp test region. The highest accuracy was obtained across either the D1 or D2 variable region. The naïve Bayesian classifier provides an effective and rapid means to classify fungal LSU sequences from large environmental surveys. The training set and tool are publicly available through the Ribosomal Database Project (http://rdp.cme.msu.edu/classifier/classifier.jsp). PMID:22194300
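A naïve Bayesian classifier over sequence words can be sketched in a few lines of Python. The version below is a generic multinomial model over overlapping k-mers with add-one smoothing and class priors estimated from the training set; it is a toy illustration, not the classifier distributed by the Ribosomal Database Project, and the class name, the word length `k=3`, and the smoothing scheme are assumptions made here.

```python
from collections import Counter, defaultdict
from math import log


def kmers(seq, k):
    """All overlapping words of length k in seq."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


class KmerNaiveBayes:
    """Toy multinomial naive Bayes over k-mers (hypothetical helper class)."""

    def __init__(self, k=3, pseudocount=1.0):
        self.k = k
        self.pseudocount = pseudocount
        self.word_counts = defaultdict(Counter)  # label -> k-mer counts
        self.class_sizes = Counter()             # label -> training sequences

    def fit(self, sequences, labels):
        for seq, label in zip(sequences, labels):
            self.word_counts[label].update(kmers(seq.upper(), self.k))
            self.class_sizes[label] += 1
        return self

    def log_scores(self, seq):
        """Unnormalised log posterior of every label for one query sequence."""
        n_train = sum(self.class_sizes.values())
        vocabulary = 4 ** self.k  # assumes an A/C/G/T alphabet
        words = kmers(seq.upper(), self.k)
        scores = {}
        for label, counts in self.word_counts.items():
            denom = sum(counts.values()) + self.pseudocount * vocabulary
            score = log(self.class_sizes[label] / n_train)
            for w in words:
                score += log((counts[w] + self.pseudocount) / denom)
            scores[label] = score
        return scores

    def predict(self, seq):
        scores = self.log_scores(seq)
        return max(scores, key=scores.get)
```

Training is a single pass that counts words and scoring is a sum of log probabilities per label, so no alignment or database search is needed at query time. That is the usual reason such classifiers run faster than similarity searches, although the speed-up reported in the abstract refers to the authors' own tool and system.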
Helping to distinguish primary from secondary transfer events for trace DNA.
Taylor, Duncan; Biedermann, Alex; Samie, Lydie; Pun, Ka-Man; Hicks, Tacha; Champod, Christophe
2017-05-01
DNA is routinely recovered in criminal investigations. The sensitivity of laboratory equipment and DNA profiling kits means that it is possible to generate DNA profiles from very small amounts of cellular material. As a consequence, it has been shown that DNA we detect may not have arisen from a direct contact with an item, but rather through one or more intermediaries. Naturally the questions arising in court, particularly when considering trace DNA, are of how DNA may have come to be on an item. While scientists cannot directly answer this question, forensic biological results can help in discriminating between alleged activities. Much experimental research has been published showing the transfer and persistence of DNA under varying conditions, but as of yet the results of these studies have not been combined to deal with broad questions about transfer mechanisms. In this work we use published data and Bayesian networks to develop a statistical logical framework by which questions of transfer mechanism can be approached probabilistically. We also identify a number of areas where further work could be carried out in order to improve our knowledge base when helping to address questions about transfer mechanisms. Finally, we apply the constructed Bayesian network to ground truth known data to determine if, with current knowledge, there is any power in DNA quantities to distinguish primary and secondary transfer events. Copyright © 2017 Elsevier B.V. All rights reserved.
Bekele, Endashaw; Tesfaye, Kassahun; Ben Slimen, Hichem; Valqui, Juan; Getahun, Abebe; Hartl, Günther B.; Suchentrunk, Franz
2017-01-01
For hares (Lepus spp., Leporidae, Lagomorpha, Mammalia) from Ethiopia no conclusive molecular phylogenetic data are available. To provide a first molecular phylogenetic model for the Abyssinian Hare (Lepus habessinicus), the Ethiopian Hare (L. fagani), and the Ethiopian Highland Hare (L. starcki) and their evolutionary relationships to hares from Africa, Eurasia, and North America, we phylogenetically analysed mitochondrial ATPase subunit 6 (ATP6; n = 153 / 416 bp) and nuclear transferrin (TF; n = 155 / 434 bp) sequences of phenotypically determined individuals. For the hares from Ethiopia, genotype composition at twelve microsatellite loci (n = 107) was used to explore both interspecific gene pool separation and levels of current hybridization, as has been observed in some other Lepus species. For phylogenetic analyses ATP6 and TF sequences of Lepus species from South and North Africa (L. capensis, L. saxatilis), the Anatolian peninsula and Europe (L. europaeus, L. timidus) were also produced and additional TF sequences of 18 Lepus species retrieved from GenBank were included as well. Median joining networks, neighbour joining, maximum likelihood analyses, as well as Bayesian inference resulted in similar models of evolution of the three species from Ethiopia for the ATP6 and TF sequences, respectively. The Ethiopian species are, however, not monophyletic, with signatures of contemporary uni- and bidirectional mitochondrial introgression and/or shared ancestral polymorphism. Lepus habessinicus carries mtDNA distinct from South African L. capensis and North African L. capensis sensu lato; that finding is not in line with earlier suggestions of its conspecificity with L. capensis. Lepus starcki has mtDNA distinct from L. capensis and L. europaeus, which is not in line with earlier suggestions to include it either in L. capensis or L. europaeus. 
Lepus fagani shares mitochondrial haplotypes with the other two species from Ethiopia, despite its distinct phenotypic and microsatellite differences; moreover, it is not represented by a species-specific mitochondrial haplogroup, suggesting considerable mitochondrial capture by the other species from Ethiopia or species from other parts of Africa. Both mitochondrial and nuclear sequences indicate close phylogenetic relationships among all three Lepus species from Ethiopia, with L. fagani being surprisingly tightly connected to L. habessinicus. TF sequences suggest close evolutionary relationships between the three Ethiopian species and Cape hares from South and North Africa; they further suggest that hares from Ethiopia hold a position ancestral to many Eurasian and North American species. PMID:28767659
Effective Online Bayesian Phylogenetics via Sequential Monte Carlo with Guided Proposals
Fourment, Mathieu; Claywell, Brian C; Dinh, Vu; McCoy, Connor; Matsen IV, Frederick A; Darling, Aaron E
2018-01-01
Modern infectious disease outbreak surveillance produces continuous streams of sequence data which require phylogenetic analysis as data arrives. Current software packages for Bayesian phylogenetic inference are unable to quickly incorporate new sequences as they become available, making them less useful for dynamically unfolding evolutionary stories. This limitation can be addressed by applying a class of Bayesian statistical inference algorithms called sequential Monte Carlo (SMC) to conduct online inference, wherein new data can be continuously incorporated to update the estimate of the posterior probability distribution. In this article, we describe and evaluate several different online phylogenetic sequential Monte Carlo (OPSMC) algorithms. We show that proposing new phylogenies with a density similar to the Bayesian prior suffers from poor performance, and we develop “guided” proposals that better match the proposal density to the posterior. Furthermore, we show that the simplest guided proposals can exhibit pathological behavior in some situations, leading to poor results, and that the situation can be resolved by heating the proposal density. The results demonstrate that relative to the widely used MCMC-based algorithm implemented in MrBayes, the total time required to compute a series of phylogenetic posteriors as sequences arrive can be significantly reduced by the use of OPSMC, without incurring a significant loss in accuracy. PMID:29186587
Najafi, Nargess; Akmali, Vahid; Sharifi, Mozafar
2018-04-26
Molecular phylogeography and species distribution modelling (SDM) suggest that late Quaternary glacial cycles have played a significant role in structuring current population genetic structure and diversity. Based on phylogenetic relationships using Bayesian inference and maximum likelihood of 535 bp mtDNA (D-loop) and 745 bp mtDNA (Cytb) in 62 individuals of the Mediterranean Horseshoe Bat, Rhinolophus euryale, from 13 different localities in Iran, we identified two subspecific populations with differing population genetic structure distributed in the southern Zagros Mts. and northern Elburz Mts. Analysis of molecular variance (AMOVA) obtained from D-loop sequences indicates that 21.18% of sequence variation is distributed among populations and 10.84% within them. Moreover, a degree of genetic subdivision, mainly attributable to the existence of significant variance among the two regions, is shown (θCT = 0.68, p = .005). A positive and significant correlation between geographic and genetic distances (R² = 0.28, r = 0.529, p = .000) was obtained after controlling for environmental distance. The spatial distribution of haplotypes indicates that marginal populations in the southern part of the species' range occupied this area as a glacial refugium. However, this genetic variation, in conjunction with the results of the SDM, shows a massive postglacial range expansion for R. euryale towards higher latitudes in Iran.
Ren, Guangpeng; Mateo, Rubén G; Liu, Jianquan; Suchan, Tomasz; Alvarez, Nadir; Guisan, Antoine; Conti, Elena; Salamin, Nicolas
2017-02-01
The effects of Quaternary climatic oscillations on the demography of organisms vary across regions and continents. In taxa distributed in Europe and North America, several paradigms regarding the distribution of refugia have been identified. By contrast, less is known about the processes that shaped the species' spatial genetic structure in areas such as the Himalayas, which is considered a biodiversity hotspot. Here, we investigated the phylogeographic structure and population dynamics of Primula tibetica by combining genomic phylogeography and species distribution models (SDMs). Genomic data were obtained for 293 samples of P. tibetica using restriction site-associated DNA sequencing (RADseq). Ensemble SDMs were carried out to predict potential present and past distribution ranges. Four distinct lineages were identified. Approximate Bayesian computation analyses showed that each of them has experienced both expansions and bottlenecks since their divergence, which occurred during or across the Quaternary glacial cycles. The two lineages at both edges of the distribution were found to be more vulnerable and responded in different ways to past climatic changes. These results illustrate how past climatic changes affected the demographic history of Himalayan organisms. Our findings highlight the significance of combining genomic approaches with environmental data when evaluating the effects of past climatic changes. © 2016 The Authors. New Phytologist © 2016 New Phytologist Trust.
Huang, Jie; Chen, Zigui; Song, Weibo; Berger, Helmut
2014-01-01
Classifications of the Urostyloidea were mainly based on morphology and morphogenesis. Because molecular phylogenies have largely relied on limited sampling and mostly single-gene information, incongruences between morphological data and gene sequences have arisen. In this work, the three-gene data (SSU-rDNA, ITS1-5.8S-ITS2 and LSU-rDNA) comprising 12 genera in the "core urostyloids" are sequenced, and the phylogenies based on these different markers are compared using maximum-likelihood and Bayesian algorithms and tested by unconstrained and constrained analyses. The molecular phylogeny supports the following conclusions: (1) the monophyly of the core group of Urostyloidea is well supported while the whole Urostyloidea is not monophyletic; (2) Thigmokeronopsis and Apokeronopsis are clearly separated from the pseudokeronopsids in analyses of all three gene markers, supporting their exclusion from the Pseudokeronopsidae and the inclusion in the Urostylidae; (3) Diaxonella and Apobakuella should be assigned to the Urostylidae; (4) Bergeriella, Monocoronella and Neourostylopsis flavicana share a most recent common ancestor; (5) all molecular trees support the transfer of Metaurostylopsis flavicana to the recently proposed genus Neourostylopsis; (6) all molecular phylogenies fail to separate the morphologically well-defined genera Uroleptopsis and Pseudokeronopsis; and (7) Arcuseries gen. nov. containing three distinctly deviating Anteholosticha species is established. Copyright © 2013 The Authors. Published by Elsevier Inc. All rights reserved.
Zheng, Qi; Grice, Elizabeth A
2016-10-01
Accurate mapping of next-generation sequencing (NGS) reads to reference genomes is crucial for almost all NGS applications and downstream analyses. Various repetitive elements in human and other higher eukaryotic genomes contribute in large part to ambiguously (non-uniquely) mapped reads. Most available NGS aligners attempt to address this by either removing all non-uniquely mapping reads, or reporting one random or "best" hit based on simple heuristics. Accurate estimation of the mapping quality of NGS reads is therefore critical albeit completely lacking at present. Here we developed a generalized software toolkit "AlignerBoost", which utilizes a Bayesian-based framework to accurately estimate mapping quality of ambiguously mapped NGS reads. We tested AlignerBoost with both simulated and real DNA-seq and RNA-seq datasets at various thresholds. In most cases, but especially for reads falling within repetitive regions, AlignerBoost dramatically increases the mapping precision of modern NGS aligners without significantly compromising the sensitivity even without mapping quality filters. When using higher mapping quality cutoffs, AlignerBoost achieves a much lower false mapping rate while exhibiting comparable or higher sensitivity compared to the aligner default modes, therefore significantly boosting the detection power of NGS aligners even using extreme thresholds. AlignerBoost is also SNP-aware, and higher quality alignments can be achieved if provided with known SNPs. AlignerBoost's algorithm is computationally efficient, and can process one million alignments within 30 seconds on a typical desktop computer. AlignerBoost is implemented as a uniform Java application and is freely available at https://github.com/Grice-Lab/AlignerBoost.
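The general idea of a Bayesian mapping quality for an ambiguously mapped read can be illustrated with a small sketch. This is not AlignerBoost's algorithm or API: the function name, the uniform prior over candidate hits, and the per-hit log10 likelihoods are all assumptions made for illustration. The sketch normalizes the likelihoods of the candidate hits into posterior probabilities and Phred-scales the error probability of the best hit.

```python
import math


def posterior_mapq(log10_likelihoods, max_q=60):
    """Return (best_index, posterior, mapq) for the candidate hits of one read.

    log10_likelihoods: hypothetical log10 P(read | hit) for each candidate hit.
    A uniform prior over hits is assumed here purely for illustration.
    """
    if not log10_likelihoods:
        raise ValueError("need at least one candidate hit")
    top = max(log10_likelihoods)
    # Subtract the maximum before exponentiating for numerical stability.
    weights = [10 ** (ll - top) for ll in log10_likelihoods]
    total = sum(weights)
    best = max(range(len(weights)), key=weights.__getitem__)
    posterior = weights[best] / total
    error = 1.0 - posterior
    # Phred scale: Q = -10 * log10(P(error)), capped at max_q.
    mapq = max_q if error <= 0 else min(max_q, round(-10 * math.log10(error)))
    return best, posterior, mapq


if __name__ == "__main__":
    # Two equally good hits in a repeat: the posterior is split, so MAPQ is low.
    print(posterior_mapq([-2.0, -2.0]))
    # One hit is 1000 times more likely than the other: MAPQ is high.
    print(posterior_mapq([-1.0, -4.0]))
```

Under these assumptions, a read with two equally likely hits gets a posterior of 0.5 for either placement, which is why reads in repetitive regions receive low mapping quality unless extra evidence separates the candidates.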
Palma, R. Eduardo; Boric-Bargetto, Dusan; Torres-Pérez, Fernando; Hernández, Cristián E.; Yates, Terry L.
2012-01-01
The long-tailed pygmy rice rat Oligoryzomys longicaudatus (Sigmodontinae), the major reservoir of Hantavirus in Chile and Patagonian Argentina, is widely distributed in the Mediterranean, Temperate and Patagonian Forests of Chile, as well as in adjacent areas in southern Argentina. We used molecular data to evaluate the effects of the last glacial event on the phylogeographic structure of this species. We examined if historical Pleistocene events had affected genetic variation and spatial distribution of this species along its distributional range. We sampled 223 individuals representing 47 localities along the species range, and sequenced the hypervariable domain I of the mtDNA control region. Aligned sequences were analyzed using haplotype network, Bayesian population structure and demographic analyses. Analysis of population structure and the haplotype network inferred three genetic clusters along the distribution of O. longicaudatus that mostly agreed with the three major ecogeographic regions in Chile: Mediterranean, Temperate Forests and Patagonian Forests. Bayesian Skyline Plots showed constant population sizes through time in all three clusters followed by an increase after and during the Last Glacial Maximum (LGM; between 26,000–13,000 years ago). Neutrality tests and the “g” parameter also suggest that populations of O. longicaudatus experienced demographic expansion across the species entire range. Past climate shifts have influenced population structure and lineage variation of O. longicaudatus. This species remained in refugia areas during Pleistocene times in southern Temperate Forests (and adjacent areas in Patagonia). From these refugia, O. longicaudatus experienced demographic expansions into Patagonian Forests and central Mediterranean Chile using glacial retreats. PMID:22396751
Inferring genealogical processes from patterns of Bronze-Age and modern DNA variation in Sardinia.
Ghirotto, Silvia; Mona, Stefano; Benazzo, Andrea; Paparazzo, Francesco; Caramelli, David; Barbujani, Guido
2010-04-01
The ancient inhabitants of a region are often regarded as ancestral, and hence genetically related, to the modern dwellers (for instance, in studies of admixture), but so far, this assumption has not been tested empirically using ancient DNA data. We studied mitochondrial DNA (mtDNA) variation in Sardinia, across a time span of 2,500 years, comparing 23 Bronze-Age (nuragic) mtDNA sequences with those of 254 modern individuals from two regions, Ogliastra (a likely genetic isolate) and Gallura, and considering the possible impact of gene flow from mainland Italy. To understand the genealogical relationships between past and present populations, we developed seven explicit demographic models; we tested whether these models can account for the levels and patterns of genetic diversity in the data and which one does it best. Extensive simulation based on a serial coalescent algorithm allowed us to compare the posterior probability of each model and estimate the relevant evolutionary (mutation and migration rates) and demographic (effective population sizes, times since population splits) parameters, by approximate Bayesian computations. We then validated the analyses by investigating how well parameters estimated from the simulated data can reproduce the observed data set. We show that a direct genealogical continuity between Bronze-Age Sardinians and the current people of Ogliastra, but not Gallura, has a much higher probability than any alternative scenarios and that genetic diversity in Gallura evolved largely independently, owing in part to gene flow from the mainland.
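Approximate Bayesian computation, as used above, replaces an explicit likelihood with simulation: parameters are drawn from the prior, data are simulated, and draws whose simulated summary statistic falls close to the observed one are kept. The sketch below shows only the basic rejection step on a toy binomial model. It is not the serial coalescent simulation or the model comparison used in the study, and every name, prior and number in it is hypothetical.

```python
import random


def abc_rejection(observed, draw_prior, simulate, distance, tolerance,
                  n_draws, seed=0):
    """Basic ABC rejection sampler: keep prior draws whose simulated
    summary statistic lies within `tolerance` of the observed one."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        theta = draw_prior(rng)
        simulated = simulate(theta, rng)
        if distance(simulated, observed) <= tolerance:
            accepted.append(theta)
    return accepted


# Toy model (hypothetical): theta is a per-site probability of being variable,
# and the summary statistic is the number of variable sites out of 200.
def draw_prior(rng):
    return rng.uniform(0.0, 0.2)


def simulate(theta, rng, n_sites=200):
    return sum(rng.random() < theta for _ in range(n_sites))


accepted = abc_rejection(
    observed=20,
    draw_prior=draw_prior,
    simulate=simulate,
    distance=lambda a, b: abs(a - b),
    tolerance=2,
    n_draws=5000,
)

if __name__ == "__main__":
    mean = sum(accepted) / len(accepted)
    print(f"accepted {len(accepted)} draws, posterior mean ~ {mean:.3f}")
```

The accepted draws approximate the posterior distribution of the parameter. Comparing competing demographic models works on the same principle, with the model index treated as one more quantity drawn from the prior.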
Pearce, J.M.; Talbot, S.L.; Petersen, M.R.; Rearick, J.R.
2005-01-01
Due to declines in the Alaska breeding population, the Steller's eider (Polysticta stelleri) was listed as threatened in North America in 1997. Periodic non-breeding in Russia and Alaska has hampered field-based assessments of behavioral patterns critical to recovery plans, such as levels of breeding site fidelity and movements among three regional populations: Atlantic-Russia, Pacific-Russia and Alaska. Therefore, we analyzed samples from across the species range with seven nuclear microsatellite DNA loci and cytochrome b mitochondrial (mt)DNA sequence data to infer levels of interchange among sampling areas and patterns of site fidelity. Results demonstrated low levels of population differentiation within Atlantic and Pacific nesting areas, with higher levels observed between these regions, but only for mtDNA. Bayesian analysis of microsatellite data from wintering and molting birds showed no signs of sub-population structure, even though band-recovery data suggests multiple breeding areas are present. We observed higher estimates of F-statistics for female mtDNA data versus male data, suggesting female-biased natal site fidelity. Summary statistics for mtDNA were consistent with models of historic population expansion. Lack of spatial structure in Steller's eiders may result largely from insufficient time since historic population expansions for behaviors, such as natal site fidelity, to isolate breeding areas genetically. However, other behaviors such as the periodic non-breeding observed in Steller's eiders may also play a more contemporary role in genetic homogeneity, especially for microsatellite loci.
Hawlitschek, Oliver; Nagy, Zoltán T.; Berger, Johannes; Glaw, Frank
2013-01-01
In the past decade, DNA barcoding became increasingly common as a method for species identification in biodiversity inventories and related studies. However, mainly due to technical obstacles, squamate reptiles have been the target of few barcoding studies. In this article, we present the results of a DNA barcoding study of squamates of the Comoros archipelago, a poorly studied group of oceanic islands close to and mostly colonized from Madagascar. The barcoding dataset presented here includes 27 of the 29 currently recognized squamate species of the Comoros, including 17 of the 18 endemic species. Some species considered endemic to the Comoros according to current taxonomy were found to cluster with non-Comoran lineages, probably due to poorly resolved taxonomy. All other species for which more than one barcode was obtained corresponded to distinct clusters useful for species identification by barcoding. In most species, even island populations could be distinguished using barcoding. Two cryptic species were identified using the DNA barcoding approach. The obtained barcoding topology, a Bayesian tree based on COI sequences of 5 genera, was compared with available multigene topologies, and in 3 cases, major incongruences between the two topologies became evident. Three of the multigene studies were initiated after initial screening of a preliminary version of the barcoding dataset presented here. We conclude that in the case of the squamates of the Comoros Islands, DNA barcoding has proven a very useful and efficient way of detecting isolated populations and promising starting points for subsequent research. PMID:24069192
The Scirtothrips dorsalis Species Complex: Endemism and Invasion in a Global Pest
Dickey, Aaron M.; Kumar, Vivek; Hoddle, Mark S.; Funderburk, Joe E.; Morgan, J. Kent; Jara-Cavieres, Antonella; Shatters, Robert G. Jr.; Osborne, Lance S.; McKenzie, Cindy L.
2015-01-01
Invasive arthropods pose unique management challenges in various environments, the first of which is correct identification. This apparently mundane task is particularly difficult if multiple species are morphologically indistinguishable but accurate identification can be determined with DNA barcoding provided an adequate reference set is available. Scirtothrips dorsalis is a highly polyphagous plant pest with a rapidly expanding global distribution and this species, as currently recognized, may be comprised of cryptic species. Here we report the development of a comprehensive DNA barcode library for S. dorsalis and seven nuclear markers via next-generation sequencing for identification use within the complex. We also report the delimitation of nine cryptic species and two morphologically distinguishable species comprising the S. dorsalis species complex using histogram analysis of DNA barcodes, Bayesian phylogenetics, and the multi-species coalescent. One member of the complex, here designated the South Asia 1 cryptic species, is highly invasive, polyphagous, and likely the species implicated in tospovirus transmission. Two other species, South Asia 2, and East Asia 1 are also highly polyphagous and appear to be at an earlier stage of global invasion. The remaining members of the complex are regionally endemic, varying in their pest status and degree of polyphagy. In addition to patterns of invasion and endemism, our results provide a framework both for identifying members of the complex based on their DNA barcode, and for future species delimiting efforts. PMID:25893251
Object-oriented Bayesian networks for paternity cases with allelic dependencies
Hepler, Amanda B.; Weir, Bruce S.
2008-01-01
This study extends the current use of Bayesian networks by incorporating the effects of allelic dependencies in paternity calculations. The use of object-oriented networks greatly simplifies the process of building and interpreting forensic identification models, allowing researchers to solve new, more complex problems. We explore two paternity examples: the most common scenario, where DNA evidence is available from the alleged father, the mother and the child; and a more complex case where DNA is not available from the alleged father, but is available from the alleged father’s brother. Object-oriented networks are built, using HUGIN, for each example which incorporate the effects of allelic dependence caused by evolutionary relatedness. PMID:19079769
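One common way to express allelic dependence caused by shared ancestry is the Balding–Nichols sampling formula with a coancestry coefficient θ. The abstract does not state that the networks use exactly this form, so the sketch below is an assumption-laden illustration rather than the authors' model. The genotypes, allele frequency, prior and the choice of which alleles to condition on are all hypothetical.

```python
def allele_prob_given_sample(p_a, n_a, n, theta):
    """Balding-Nichols sampling formula.

    Probability that the next sampled allele is of type a, given that n alleles
    have already been observed in the population and n_a of them are type a.
    p_a is the population frequency of a; theta is the coancestry coefficient.
    """
    if not 0.0 <= theta < 1.0:
        raise ValueError("theta must be in [0, 1)")
    return (n_a * theta + (1.0 - theta) * p_a) / (1.0 + (n - 1) * theta)


def posterior_paternity(paternity_index, prior=0.5):
    """Posterior probability of paternity from a likelihood ratio and a prior."""
    return paternity_index * prior / (paternity_index * prior + 1.0 - prior)


# Hypothetical trio: mother AA, child AB, alleged father BB, so the paternal
# allele is B. If the alleged father is the father, he transmits B with
# probability 1. Otherwise B comes from a random man, conditioned (by
# assumption) on the four alleles already seen in mother and alleged father.
p_b = 0.1
pi_independent = 1.0 / allele_prob_given_sample(p_b, n_a=2, n=4, theta=0.0)
pi_dependent = 1.0 / allele_prob_given_sample(p_b, n_a=2, n=4, theta=0.03)

if __name__ == "__main__":
    print(pi_independent, posterior_paternity(pi_independent))
    print(pi_dependent, posterior_paternity(pi_dependent))
```

In this toy case, allowing for dependence (θ = 0.03) makes the shared allele less surprising in a random man, so the paternity index falls below the value obtained under independence.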
Dobosz, Marina; Bocci, Chiara; Bonuglia, Margherita; Grasso, Cinzia; Merigioli, Sara; Russo, Alessandra; De Iuliis, Paolo
2010-01-01
Microsatellites have been used for parentage testing and individual identification in forensic science because they are highly polymorphic and show abundant sequences dispersed throughout most eukaryotic nuclear genomes. At present, genetic testing based on DNA technology is used for most domesticated animals, including horses, to confirm identity, to determine parentage, and to validate registration certificates. But if genetic data of one of the putative parents are missing, verifying a genealogy could be questionable. The aim of this paper is to illustrate a new approach to analyze complex cases of disputed relationship with microsatellite markers. These cases were solved by analyzing the genotypes of the offspring and other horses' genotypes in the pedigrees of the putative dam/sire with probabilistic expert systems (PESs). PES was especially efficient in supplying reliable, error-free Bayesian probabilities in complex cases with missing pedigree data. One of these systems was developed for forensic purposes (FINEX program) and is particularly valuable in human analyses. We applied this program to parentage analysis in horses, and we will illustrate how different cases have been successfully worked out.
Eid, Mohammed Mansour Abbas; Shimoda, Mayuko; Singh, Shailendra Kumar; Almofty, Sarah Ameen; Pham, Phuong; Goodman, Myron F.; Maeda, Kazuhiko; Sakaguchi, Nobuo
2017-01-01
Immunoglobulin affinity maturation depends on somatic hypermutation (SHM) in immunoglobulin variable (IgV) regions initiated by activation-induced cytidine deaminase (AID). AID induces transition mutations by C→U deamination on both strands, causing C:G→T:A. Error-prone repairs of U by base excision and mismatch repairs (MMRs) create transversion mutations at C/G and mutations at A/T sites. In Neuberger’s model, it remained to be clarified how transition/transversion repair is regulated. We investigate the role of AID-interacting GANP (germinal center-associated nuclear protein) in the IgV SHM profile. GANP enhances transition mutation of the non-transcribed strand G and reduces mutation at A, restricted to GYW of the AID hotspot motif. It reduces DNA polymerase η hotspot mutations associated with MMRs followed by uracil-DNA glycosylase. Mutation comparison between IgV complementary and framework regions (FWRs) by Bayesian statistical estimation demonstrates that GANP supports the preservation of IgV FWR genomic sequences. GANP works to maintain antibody structure by reducing drastic changes in the IgV FWR in affinity maturation. PMID:28541550
Molecular and morphologic data reveal multiple species in Peromyscus pectoralis
Bradley, Robert D.; Schmidly, David J.; Amman, Brian R.; Platt, Roy N.; Neumann, Kathy M.; Huynh, Howard M.; Muñiz-Martínez, Raúl; López-González, Celia; Ordóñez-Garza, Nicté
2015-01-01
DNA sequence and morphometric data were used to re-evaluate the taxonomy and systematics of Peromyscus pectoralis. Phylogenetic analyses (maximum likelihood and Bayesian inference) of DNA sequences from the mitochondrial cytochrome-b gene in 44 samples of P. pectoralis indicated 2 well-supported monophyletic clades. The 1st clade contained specimens from Texas historically assigned to P. p. laceianus; the 2nd was comprised of specimens previously referable to P. p. collinus, P. p. laceianus, and P. p. pectoralis obtained from northern and eastern Mexico. Levels of genetic variation (~7%) between these 2 clades indicated that the genetic divergence typically exceeded that reported for other species of Peromyscus. Samples of P. p. laceianus north and south of the Río Grande were not monophyletic. In addition, samples representing P. p. collinus and P. p. pectoralis formed 2 clades that differed genetically by 7.14%. Multivariate analyses of external and cranial measurements from 63 populations of P. pectoralis revealed 4 morpho-groups consistent with clades in the DNA sequence analysis: 1 from Texas and New Mexico assignable to P. p. laceianus; a 2nd from western and southern Mexico assignable to P. p. pectoralis; a 3rd from northern and central Mexico previously assigned to P. p. pectoralis but herein shown to represent an undescribed taxon; and a 4th from southeastern Mexico assignable to P. p. collinus. Based on the concordance of these results, populations from the United States are referred to as P. laceianus, whereas populations from Mexico are referred to as P. pectoralis (including some samples historically assigned to P. p. collinus, P. p. laceianus, and P. p. pectoralis). A new subspecies is described to represent populations south of the Río Grande in northern and central Mexico. Additional research is needed to discern if P. p. collinus warrants species recognition. PMID:26937045
2013-01-01
Background: Hypodontus macropi is a common intestinal nematode of a range of kangaroos and wallabies (macropodid marsupials). Based on previous multilocus enzyme electrophoresis (MEE) and nuclear ribosomal DNA sequence data sets, H. macropi has been proposed to be a complex of species. To test this proposal using independent molecular data, we sequenced the whole mitochondrial (mt) genomes of individuals of H. macropi from three different species of hosts (Macropus robustus robustus, Thylogale billardierii and Macropus [Wallabia] bicolor) as well as that of Macropicola ocydromi (a related nematode), and undertook a comparative analysis of the amino acid sequence datasets derived from these genomes. Results: The mt genomes sequenced by next-generation (454) technology from H. macropi from the three host species varied from 13,634 bp to 13,699 bp in size. Pairwise comparisons of the amino acid sequences predicted from these three mt genomes revealed differences of 5.8% to 18%. Phylogenetic analysis of the amino acid sequence data sets using Bayesian Inference (BI) showed that H. macropi from the three different host species formed distinct, well-supported clades. In addition, sliding window analysis of the mt genomes defined variable regions for future population genetic studies of H. macropi in different macropodid hosts and geographical regions around Australia. Conclusions: The present analyses of inferred mt protein sequence datasets clearly supported the hypothesis that H. macropi from M. robustus robustus, M. bicolor and T. billardierii represent distinct species. PMID:24261823
A supermatrix analysis of genomic, morphological, and paleontological data from crown Cetacea
Geisler, Jonathan H; McGowen, Michael R; Yang, Guang; Gatesy, John
2011-04-25
Cetacea (dolphins, porpoises, and whales) is a clade of aquatic species that includes the most massive, deepest diving, and largest brained mammals. Understanding the temporal pattern of diversification in the group as well as the evolution of cetacean anatomy and behavior requires a robust and well-resolved phylogenetic hypothesis. Although a large body of molecular data has accumulated over the past 20 years, DNA sequences of cetaceans have not been directly integrated with the rich, cetacean fossil record to reconcile discrepancies among molecular and morphological characters. We combined new nuclear DNA sequences, including segments of six genes (~2800 basepairs) from the functionally extinct Yangtze River dolphin, with an expanded morphological matrix and published genomic data. Diverse analyses of these data resolved the relationships of 74 taxa that represent all extant families and 11 extinct families of Cetacea. The resulting supermatrix (61,155 characters) and its sub-partitions were analyzed using parsimony methods. Bayesian and maximum likelihood (ML) searches were conducted on the molecular partition, and a molecular scaffold obtained from these searches was used to constrain a parsimony search of the morphological partition. Based on analysis of the supermatrix and model-based analyses of the molecular partition, we found overwhelming support for 15 extant clades. When extinct taxa are included, we recovered trees that are significantly correlated with the fossil record. These trees were used to reconstruct the timing of cetacean diversification and the evolution of characters shared by "river dolphins," a non-monophyletic set of species according to all of our phylogenetic analyses. The parsimony analysis of the supermatrix and the analysis of morphology constrained to fit the ML/Bayesian molecular tree yielded broadly congruent phylogenetic hypotheses. 
In trees from both analyses, all Oligocene taxa included in our study fell outside crown Mysticeti and crown Odontoceti, suggesting that these two clades radiated in the late Oligocene or later, contra some recent molecular clock studies. Our trees also imply that many character states shared by river dolphins evolved in their oceanic ancestors, contradicting the hypothesis that these characters are convergent adaptations to fluvial habitats. PMID:21518443
Cosacov, Andrea; Ferreiro, Gabriela; Johnson, Leigh A.; Sérsic, Alicia N.
2017-01-01
Effects of Pleistocene climatic oscillations on plant phylogeographic patterns are relatively well studied in forest, savanna and grassland biomes, but such impacts remain less explored in desert regions of the world, especially in South America. Here, we performed a phylogeographical study of Monttea aphylla, an endemic species of the Monte Desert, to understand the evolutionary history of vegetation communities inhabiting the South American Arid Diagonal. We obtained sequences of three chloroplast (trnS–trnfM, trnH–psbA and trnQ–rps16) and one nuclear (ITS) intergenic spacers from 272 individuals of 34 localities throughout the range of the species. Population genetic and Bayesian coalescent analyses were performed to infer genealogical relationships among haplotypes, population genetic structure, and demographic history of the study species. Timing of demographic events was inferred using Bayesian Skyline Plot and the spatio-temporal patterns of lineage diversification were reconstructed using Bayesian relaxed diffusion models. Palaeo-distribution models (PDM) were performed through three different timescales to validate phylogeographical patterns. Twenty-five and 22 haplotypes were identified in the cpDNA and nDNA data, respectively, which clustered into two main genealogical lineages following a latitudinal pattern, the northern and the southern Monte (south of 35° S). The northern Monte showed two lineages of high genetic structure and a relatively more stable demography than the southern Monte, which retrieved three groups with little phylogenetic structure and a strong signal of demographic expansion that would have started during the Last Interglacial period (ca. 120 Ka). The PDM and diffusion model analyses agreed in the southeast direction of the range expansion. Differential effect of climatic oscillations across the Monte phytogeographic province was observed in Monttea aphylla lineages. 
In northern Monte, greater genetic structure and a relatively more stable demography resulted from a more stable climate than in the southern Monte. Pleistocene glaciations drastically decreased the species area in the southern Monte, which expanded in a southeastern direction to the new available areas during the interglacial periods. PMID:28582433
Crawford, Andrew J; Smith, Eric N
2005-06-01
We report the first phylogenetic analysis of DNA sequence data for the Central American component of the genus Eleutherodactylus (Anura: Leptodactylidae: Eleutherodactylinae), one of the most ubiquitous, diverse, and abundant components of the Neotropical amphibian fauna. We obtained DNA sequence data from 55 specimens representing 45 species. Sampling was focused on Central America, but also included Bolivia, Brazil, Jamaica, and the USA. We sequenced 1460 contiguous base pairs (bp) of the mitochondrial genome containing ND2 and five neighboring tRNA genes, plus 1300 bp of the c-myc nuclear gene. The resulting phylogenetic inferences were broadly concordant between data sets and among analytical methods. The subgenus Craugastor is monophyletic and its initial radiation was potentially rapid and adaptive. Within Craugastor, the earliest splits separate three northern Central American species groups, milesi, augusti, and alfredi, from a clade comprising the rest of Craugastor. Within the latter clade, the rhodopis group as formerly recognized comprises three deeply divergent clades that do not form a monophyletic group; we therefore restrict the content of the rhodopis group to one of two northern clades, and use new names for the other northern (mexicanus group) and one southern clade (bransfordii group). The new rhodopis and bransfordii groups together form the sister taxon to a clade comprising the biporcatus, fitzingeri, mexicanus, and rugulosus groups. We used a Bayesian MCMC approach together with geological and biogeographic assumptions to estimate divergence times from the combined DNA sequence data. 
Our results corroborated three independent dispersal events for the origins of Central American Eleutherodactylus: (1) an ancestor of Craugastor entered northern Central America from South America in the early Paleocene, (2) an ancestor of the subgenus Syrrhophus entered northern Central America from the Caribbean at the end of the Eocene, and (3) a wave of independent dispersal events from South America coincided with formation of the Isthmus of Panama during the Pliocene. We elevate the subgenus Craugastor to the genus rank.
Patterns of population structure for inshore bottlenose dolphins along the eastern United States.
Richards, Vincent P; Greig, Thomas W; Fair, Patricia A; McCulloch, Stephen D; Politz, Christine; Natoli, Ada; Driscoll, Carlos A; Hoelzel, A Rus; David, Victor; Bossart, Gregory D; Lopez, Jose V
2013-01-01
Globally distributed, the bottlenose dolphin (Tursiops truncatus) is found in a range of offshore and coastal habitats. Using 15 microsatellite loci and mtDNA control region sequences, we investigated patterns of genetic differentiation among putative populations along the eastern US shoreline (the Indian River Lagoon, Florida, and Charleston Harbor, South Carolina) (microsatellite analyses: n = 125, mtDNA analyses: n = 132). We further utilized the mtDNA to compare these populations with those from the Northwest Atlantic, Gulf of Mexico, and Caribbean. Results showed strong differentiation among inshore, alongshore, and offshore habitats (ФST = 0.744). In addition, Bayesian clustering analyses revealed the presence of 2 genetic clusters (populations) within the 250 km Indian River Lagoon. Habitat heterogeneity is likely an important force diversifying bottlenose dolphin populations through its influence on social behavior and foraging strategy. We propose that the spatial pattern of genetic variation within the lagoon reflects both its steep longitudinal transition of climate and also its historical discontinuity and recent connection as part of Intracoastal Waterway development. These findings have important management implications as they emphasize the role of habitat and the consequence of its modification in shaping bottlenose dolphin population structure and highlight the possibility of multiple management units existing in discrete inshore habitats along the entire eastern US shoreline.
PMID:24129993
A comprehensive molecular phylogeny for the hornbills (Aves: Bucerotidae).
Gonzalez, Juan-Carlos T; Sheldon, Ben C; Collar, Nigel J; Tobias, Joseph A
2013-05-01
The hornbills comprise a group of morphologically and behaviorally distinct Palaeotropical bird species that feature prominently in studies of ecology and conservation biology. Although the monophyly of hornbills is well established, previous phylogenetic hypotheses were based solely on mtDNA and limited sampling of species diversity. We used parsimony, maximum likelihood and Bayesian methods to reconstruct relationships among all 61 extant hornbill species, based on nuclear and mtDNA gene sequences extracted largely from historical samples. The resulting phylogenetic trees closely match vocal variation across the family but conflict with current taxonomic treatments. In particular, they highlight a new arrangement for the six major clades of hornbills and reveal that three groups traditionally treated as genera (Tockus, Aceros, Penelopides) are non-monophyletic. In addition, two other genera (Anthracoceros, Ocyceros) were non-monophyletic in the mtDNA gene tree. Our findings resolve some longstanding problems in hornbill systematics, including the placement of 'Penelopides exharatus' (embedded in Aceros) and 'Tockus hartlaubi' (sister to Tropicranus albocristatus). We also confirm that an Asiatic lineage (Berenicornis) is sister to a trio of Afrotropical genera (Tropicranus [including 'Tockus hartlaubi'], Ceratogymna, Bycanistes). We present a summary phylogeny as a robust basis for further studies of hornbill ecology, evolution and historical biogeography. Copyright © 2013. Published by Elsevier Inc.
Scar-less multi-part DNA assembly design automation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hillson, Nathan J.
The present invention provides a method of designing an implementation of a DNA assembly. In an exemplary embodiment, the method includes (1) receiving a list of DNA sequence fragments to be assembled together and an order in which to assemble the DNA sequence fragments, (2) designing DNA oligonucleotides (oligos) for each of the DNA sequence fragments, and (3) creating a plan for adding flanking homology sequences to each of the DNA oligos. In another exemplary embodiment, the method includes (1) receiving a list of DNA sequence fragments to be assembled together and an order in which to assemble the DNA sequence fragments, (2) designing DNA oligonucleotides (oligos) for each of the DNA sequence fragments, and (3) creating a plan for adding optimized overhang sequences to each of the DNA oligos.
Ramachandran, Parameswaran; Sánchez-Taltavull, Daniel; Perkins, Theodore J
2017-01-01
Co-expression networks have long been used as a tool for investigating the molecular circuitry governing biological systems. However, most algorithms for constructing co-expression networks were developed in the microarray era, before high-throughput sequencing, with its unique statistical properties, became the norm for expression measurement. Here we develop Bayesian Relevance Networks, an algorithm that uses Bayesian reasoning about expression levels to account for the differing levels of uncertainty in expression measurements between highly- and lowly-expressed entities, and between samples with different sequencing depths. It combines data from groups of samples (e.g., replicates) to estimate group expression levels and confidence ranges. It then computes uncertainty-moderated estimates of cross-group correlations between entities, and uses permutation testing to assess their statistical significance. Using large scale miRNA data from The Cancer Genome Atlas, we show that our Bayesian update of the classical Relevance Networks algorithm provides improved reproducibility in co-expression estimates and lower false discovery rates in the resulting co-expression networks. Software is available at www.perkinslab.ca.
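The permutation-testing step mentioned in this abstract can be illustrated with a minimal sketch. This is not the Bayesian Relevance Networks algorithm: it uses a plain Pearson correlation rather than the uncertainty-moderated estimate the authors describe, and the function names, the number of permutations, and the example values are all assumptions made for illustration.

```python
import math
import random

def pearson(x, y):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def permutation_pvalue(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the correlation of x and y.

    The labels of y are shuffled to break any association with x; the
    p-value is the fraction of shuffles whose |r| is at least as large
    as the observed |r| (with the usual +1 correction).
    """
    rng = random.Random(seed)
    observed = abs(pearson(x, y))
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if abs(pearson(x, y_perm)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Hypothetical expression levels of two entities across eight sample groups
expr_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
expr_b = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1]
r_obs = pearson(expr_a, expr_b)
p_val = permutation_pvalue(expr_a, expr_b)
print(f"r = {r_obs:.3f}, permutation p = {p_val:.4f}")
```

In a network setting the same test would be repeated for every pair of entities, which is why the abstract also reports false discovery rates.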
An improved approximate-Bayesian model-choice method for estimating shared evolutionary history
2014-01-01
Background: To understand biological diversification, it is important to account for large-scale processes that affect the evolutionary history of groups of co-distributed populations of organisms. Such events predict temporally clustered divergence times, a pattern that can be estimated using genetic data from co-distributed species. I introduce a new approximate-Bayesian method for comparative phylogeographical model-choice that estimates the temporal distribution of divergences across taxa from multi-locus DNA sequence data. The model is an extension of that implemented in msBayes. Results: By reparameterizing the model, introducing more flexible priors on demographic and divergence-time parameters, and implementing a non-parametric Dirichlet-process prior over divergence models, I improved the robustness, accuracy, and power of the method for estimating shared evolutionary history across taxa. Conclusions: The results demonstrate the improved performance of the new method is due to (1) more appropriate priors on divergence-time and demographic parameters that avoid prohibitively small marginal likelihoods for models with more divergence events, and (2) the Dirichlet-process providing a flexible prior on divergence histories that does not strongly disfavor models with intermediate numbers of divergence events. The new method yields more robust estimates of posterior uncertainty, and thus greatly reduces the tendency to incorrectly estimate models of shared evolutionary history with strong support. PMID:24992937
High endemism at cave entrances: a case study of spiders of the genus Uthina
Yao, Zhiyuan; Dong, Tingting; Zheng, Guo; Fu, Jinzhong; Li, Shuqiang
2016-01-01
Endemism, which is typically high on islands and in caves, has rarely been studied in the cave entrance ecotone. We investigated the endemism of the spider genus Uthina at cave entrances. In total, 212 spiders were sampled from 46 localities, from Seychelles across Southeast Asia to Fiji. They mostly occur at cave entrances but occasionally appear in various epigean environments. Phylogenetic analysis of DNA sequence data from COI and 28S genes suggested that Uthina was grouped into 13 well-supported clades. We used three methods, the Bayesian Poisson Tree Processes (bPTP) model, the Bayesian Phylogenetics and Phylogeography (BPP) method, and the general mixed Yule coalescent (GMYC) model, to investigate species boundaries. Both bPTP and BPP identified the 13 clades as 13 separate species, while GMYC identified 19 species. Furthermore, our results revealed high endemism at cave entrances. Of the 13 provisional species, twelve (one known and eleven new) are endemic to one or a cluster of caves, and all of them occurred only at cave entrances except for one population of one species. The only widely distributed species, U. luzonica, mostly occurred in epigean environments while three populations were found at cave entrances. Additionally, eleven new species of the genus are described. PMID:27775081
Posada, David; Buckley, Thomas R
2004-10-01
Model selection is a topic of special relevance in molecular phylogenetics that affects many, if not all, stages of phylogenetic inference. Here we discuss some fundamental concepts and techniques of model selection in the context of phylogenetics. We start by reviewing different aspects of the selection of substitution models in phylogenetics from a theoretical, philosophical and practical point of view, and summarize this comparison in table format. We argue that the most commonly implemented model selection approach, the hierarchical likelihood ratio test, is not the optimal strategy for model selection in phylogenetics, and that approaches like the Akaike Information Criterion (AIC) and Bayesian methods offer important advantages. In particular, the latter two methods are able to simultaneously compare multiple nested or nonnested models, assess model selection uncertainty, and allow for the estimation of phylogenies and model parameters using all available models (model-averaged inference or multimodel inference). We also describe how the relative importance of the different parameters included in substitution models can be depicted. To illustrate some of these points, we have applied AIC-based model averaging to 37 mitochondrial DNA sequences from the subgenus Ohomopterus (genus Carabus) ground beetles described by Sota and Vogler (2001).
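The AIC comparison and the Akaike weights that underlie model-averaged inference can be sketched as follows. The model names are standard substitution models, but the log-likelihoods and free-parameter counts are hypothetical values chosen for illustration, not results from the study. AIC is computed as 2k - 2 lnL, and each model's weight is proportional to exp(-delta/2), where delta is its AIC difference from the best model.

```python
import math

def akaike_weights(log_likelihoods, num_params):
    """Return (aic, delta, weights) lists for a set of candidate models."""
    # AIC = 2k - 2 lnL; lower is better
    aic = [2 * k - 2 * lnl for lnl, k in zip(log_likelihoods, num_params)]
    best = min(aic)
    # Differences from the best-scoring model
    delta = [a - best for a in aic]
    # Relative likelihood of each model, normalised to sum to 1
    rel = [math.exp(-0.5 * d) for d in delta]
    total = sum(rel)
    weights = [r / total for r in rel]
    return aic, delta, weights

def model_average(estimates, weights):
    """Weighted average of one parameter estimated under every model."""
    return sum(e * w for e, w in zip(estimates, weights))

# Hypothetical fits: names, log-likelihoods, free substitution parameters
models = ["JC69", "HKY85", "GTR"]
lnl = [-2450.0, -2400.0, -2398.5]
n_params = [0, 4, 8]

aic, delta, weights = akaike_weights(lnl, n_params)
for name, a, d, w in zip(models, aic, delta, weights):
    print(f"{name}: AIC={a:.1f} delta={d:.1f} weight={w:.3f}")
```

A model-averaged estimate of a shared parameter then weights each model's estimate by its Akaike weight, as in `model_average`, which is one way to carry model selection uncertainty into the final inference.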
Genetic Evidence of Geographical Groups among Neanderthals
Fabre, Virginie; Condemi, Silvana; Degioanni, Anna
2009-01-01
The Neanderthals are a well-distinguished Middle Pleistocene population which inhabited a vast geographical area extending from Europe to western Asia and the Middle East. Since the 1950s paleoanthropological studies have suggested variability in this group. Different sub-groups have been identified in western Europe, in southern Europe and in the Middle East. On the other hand, since 1997, research has been published in paleogenetics, carried out on 15 mtDNA sequences from 12 Neanderthals. In this paper we used a new methodology derived from different bioinformatic models based on data from genetics, demography and paleoanthropology. The adequacy of each model was measured by comparisons between simulated results (obtained by BayesianSSC software) and those estimated from nucleotide sequences (obtained by DNAsp4 software). The conclusions of this study are consistent with existing paleoanthropological research and show that Neanderthals can be divided into at least three groups: one in western Europe, a second in the Southern area and a third in western Asia. Moreover, it seems from our results that the size of the Neanderthal population was not constant and that some migration occurred among the demes. PMID:19367332
Strong, Ellen E; Bouchet, Philippe
2018-01-01
A new genus, Limatium gen. n., and two new species, L. pagodula sp. n. and L. aureum sp. n., are described, found on outer slopes of barrier reefs and fringing reefs in the South Pacific. They are rare for cerithiids, which typically occur in large populations. The two new species are represented by 108 specimens sampled over a period of 30 years, only 16 of which were collected alive. Three subadults from the Philippines and Vanuatu likely represent a third species. In addition to their rarity, Limatium species are atypical for cerithiids in their smooth, polished, honey to golden brown shells with distinctive white fascioles extending suture to suture. The radula presents a unique morphology that does not readily suggest an affinity to any of the cerithiid subfamilies. Two live-collected specimens, one of each species and designated as holotypes, were preserved in 95% ethanol and sequenced. Bayesian analysis of partial COI and 16S rDNA sequences demonstrates a placement in the Bittiinae, further extending our morphological concept of the subfamily.
NASA Astrophysics Data System (ADS)
Alaniz Rodrigues, Marcos; Dumont, Luiz Felipe Cestari; dos Santos, Cléverson Rannieri Meira; D'Incao, Fernando; Weiss, Steven; Froufe, Elsa
2017-10-01
For the first time, a molecular approach was used to evaluate the phylogenetic structure of the disjunct native American distribution of the blue crab Callinectes sapidus. Population structure was investigated by sequencing 648 bp of the Cytochrome oxidase subunit 1 (COI), in a total of 138 sequences stemming from individual samples from both the northern and southern hemispheres of the Western Atlantic distribution of the species. A Bayesian approach was used to construct a phylogenetic tree for all samples, and a 95% confidence parsimony network was created to depict the relationship among haplotypes. Results revealed two highly distinct lineages, one containing all samples from the United States and some from Brazil (lineage 1) and the second restricted to Brazil (lineage 2). In addition, gene flow (at least for females) was detected among estuaries at local scales and there is evidence for shared haplotypes in the south. Furthermore, the findings of this investigation support the contemporary introduction of haplotypes that have apparently spread from the south to the north Atlantic.
Skelly, Daniel A.; Johansson, Marnie; Madeoy, Jennifer; Wakefield, Jon; Akey, Joshua M.
2011-01-01
Variation in gene expression is thought to make a significant contribution to phenotypic diversity among individuals within populations. Although high-throughput cDNA sequencing offers a unique opportunity to delineate the genome-wide architecture of regulatory variation, new statistical methods need to be developed to capitalize on the wealth of information contained in RNA-seq data sets. To this end, we developed a powerful and flexible hierarchical Bayesian model that combines information across loci to allow both global and locus-specific inferences about allele-specific expression (ASE). We applied our methodology to a large RNA-seq data set obtained in a diploid hybrid of two diverse Saccharomyces cerevisiae strains, as well as to RNA-seq data from an individual human genome. Our statistical framework accurately quantifies levels of ASE with specified false-discovery rates, achieving high reproducibility between independent sequencing platforms. We pinpoint loci that show unusual and biologically interesting patterns of ASE, including allele-specific alternative splicing and transcription termination sites. Our methodology provides a rigorous, quantitative, and high-resolution tool for profiling ASE across whole genomes. PMID:21873452
Yu, Yi-Kuo; Capra, John A.; Stojmirović, Aleksandar; Landsman, David; Altschul, Stephen F.
2015-01-01
Motivation: DNA and protein patterns are usefully represented by sequence logos. However, the methods for logo generation in common use lack a proper statistical basis, and are non-optimal for recognizing functionally relevant alignment columns. Results: We redefine the information at a logo position as a per-observation multiple alignment log-odds score. Such scores are positive or negative, depending on whether a column’s observations are better explained as arising from relatedness or chance. Within this framework, we propose distinct normalized maximum likelihood and Bayesian measures of column information. We illustrate these measures on High Mobility Group B (HMGB) box proteins and a dataset of enzyme alignments. Particularly in the context of protein alignments, our measures improve the discrimination of biologically relevant positions. Availability and implementation: Our new measures are implemented in an open-source Web-based logo generation program, which is available at http://www.ncbi.nlm.nih.gov/CBBresearch/Yu/logoddslogo/index.html. A stand-alone version of the program is also available from this site. Contact: altschul@ncbi.nlm.nih.gov Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25294922
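For context, the conventional per-column information used by many logo generators can be sketched as below. This is the classical Shannon measure (log2 of the alphabet size minus the column entropy), shown only as the baseline that this abstract argues against; it is not the log-odds measure the authors propose. The function name is an assumption, and the sketch omits small-sample corrections and gap handling.

```python
import math
from collections import Counter

def column_information(column, alphabet_size=4):
    """Classical information content (bits) of one alignment column.

    column: string of residues observed at one alignment position.
    alphabet_size: 4 for DNA, 20 for protein.
    """
    counts = Counter(column)
    n = len(column)
    # Shannon entropy of the observed residue frequencies
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    # Maximum possible entropy minus observed entropy
    return math.log2(alphabet_size) - entropy

# Toy DNA columns: fully conserved, two residues, all four residues
for col in ["AAAA", "AACC", "ACGT"]:
    print(col, round(column_information(col), 3))
```

Under this measure a column can never score below zero, whereas the log-odds scores described in the abstract can be negative when a column is better explained by chance than by relatedness.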
Andersen, Jeremy C; Wu, Jin; Gruwell, Matthew E; Gwiazdowski, Rodger; Santana, Sharlene E; Feliciano, Natalie M; Morse, Geoffrey E; Normark, Benjamin B
2010-12-01
Armored scale insects (Hemiptera: Diaspididae) are among the most invasive insects in the world. They have unusual genetic systems, including diverse types of paternal genome elimination (PGE) and parthenogenesis. Intimate relationships with their host plants and bacterial endosymbionts make them potentially important subjects for the study of co-evolution. Here, we expand upon recent phylogenetic work (Morse and Normark, 2006) by analyzing armored scale and endosymbiont DNA sequences from 125 species of armored scale insect, represented by 253 samples and eight outgroup species. We used fragments of four different gene regions: the nuclear protein-coding gene Elongation Factor 1α (EF1α), the large ribosomal subunit (28S) rDNA, a mitochondrial region spanning parts of cytochrome oxidase I (COI) and cytochrome oxidase II (COII), and the small ribosomal subunit (16S) rDNA from the primary bacterial endosymbiont Uzinura diaspidicola. Maximum likelihood and Bayesian analyses were performed, producing highly congruent topological results. A comparison of two datasets, one with and one without missing data, found that missing data had little effect on topology. Our results broadly corroborate several major features of the existing classification, although we do not find any of the subfamilies, tribes or subtribes to be monophyletic as currently constituted. Using ancestral state reconstruction we estimate that the ancestral armored scale had the late PGE sex system, and it may as well have been pupillarial, though results differed between reconstruction methods. These results highlight the need for a complete revision of this family, and provide the groundwork for future taxonomic work in armored scale insects. Copyright © 2010 Elsevier Inc. All rights reserved.
Sheng, Gui-Lian; Barlow, Axel; Cooper, Alan; Hou, Xin-Dong; Ji, Xue-Ping; Jablonski, Nina G; Zhong, Bo-Jian; Liu, Hong; Flynn, Lawrence J; Yuan, Jun-Xia; Wang, Li-Rui; Basler, Nikolas; Westbury, Michael V; Hofreiter, Michael; Lai, Xu-Long
2018-04-06
The giant panda was widely distributed in China and south-eastern Asia during the middle to late Pleistocene, prior to its habitat becoming rapidly reduced in the Holocene. While conservation reserves have been established and population numbers of the giant panda have recently increased, the interpretation of its genetic diversity remains controversial. Previous analyses, surprisingly, have indicated relatively high levels of genetic diversity raising issues concerning the efficiency and usefulness of reintroducing individuals from captive populations. However, due to a lack of DNA data from fossil specimens, it is unknown whether genetic diversity was even higher prior to the most recent population decline. We amplified complete cytb and 12s rRNA, partial 16s rRNA and ND1, and control region sequences from the mitochondrial genomes of two Holocene panda specimens. We estimated genetic diversity and population demography by analyzing the ancient mitochondrial DNA sequences alongside those from modern giant pandas, as well as from other members of the bear family (Ursidae). Phylogenetic analyses show that one of the ancient haplotypes is sister to all sampled modern pandas and the second ancient individual is nested among the modern haplotypes, suggesting that genetic diversity may indeed have been higher earlier during the Holocene. Bayesian skyline plot analysis supports this view and indicates a slight decline in female effective population size starting around 6000 years B.P., followed by a recovery around 2000 years ago. Therefore, while the genetic diversity of the giant panda has been affected by recent habitat contraction, it still harbors substantial genetic diversity. Moreover, while its still low population numbers require continued conservation efforts, there seem to be no immediate threats from the perspective of genetic evolutionary potential. PMID:29642393
Daniels, Savel R
2011-11-01
The endemic, monotypic freshwater crab species Seychellum alluaudi was used as a template to examine the initial colonisation and evolutionary history among the major islands in the Seychelles Archipelago. Five of the "inner" islands in the Seychelles Archipelago including Mahé, Praslin, Silhouette, La Digue and Frégate were sampled. Two partial mtDNA fragments, 16S rRNA and cytochrome oxidase subunit I (COI), were sequenced for 83 specimens of S. alluaudi. Evolutionary relationships between populations were inferred from the combined mtDNA dataset using maximum parsimony, maximum likelihood and Bayesian inferences. Analyses of molecular variance (AMOVA) were used to examine genetic variation among and within clades. A haplotype network was constructed using TCS while BEAST was employed to date the colonisation and divergence of lineages on the islands. Phylogenetic analyses of the combined mtDNA data set of 1103 base pairs retrieved a monophyletic S. alluaudi group comprising three statistically well-supported monophyletic clades. Clade one was exclusive to Silhouette; clade two included samples from Praslin sister to La Digue, while clade three comprised samples from Mahé sister to Frégate. The haplotype network corresponded to the three clades. Within Mahé, substantial phylogeographic substructure was evident. AMOVA results revealed limited genetic variation within localities with most variation occurring among localities. Divergence time estimations predated the Holocene sea level regressions and indicated a Pliocene/Pleistocene divergence between the three clades evident within S. alluaudi. The monophyly of each clade suggests that transoceanic dispersal is rare. The absence of shared haplotypes between the three clades, coupled with marked sequence divergence values, suggests the presence of three allospecies within S. alluaudi. Copyright © 2011 Elsevier Inc. All rights reserved.
Sequential addition of short DNA oligos in DNA-polymerase-based synthesis reactions
Gardner, Shea N; Mariella, Jr., Raymond P; Christian, Allen T; Young, Jennifer A; Clague, David S
2013-06-25
A method of preselecting a multiplicity of DNA sequence segments that will comprise the DNA molecule of user-defined sequence, separating the DNA sequence segments temporally, and combining the multiplicity of DNA sequence segments with at least one polymerase enzyme wherein the multiplicity of DNA sequence segments join to produce the DNA molecule of user-defined sequence. Sequence segments may be of length n, where n is an odd integer. In one embodiment the length of desired hybridizing overlap is specified by the user and the sequences and the protocol for combining them are guided by computational (bioinformatics) predictions. In one embodiment sequence segments are combined from multiple reading frames to span the same region of a sequence, so that multiple desired hybridizations may occur with different overlap lengths.
Sequential addition of short DNA oligos in DNA-polymerase-based synthesis reactions
Gardner, Shea N [San Leandro, CA]; Mariella, Jr., Raymond P.; Christian, Allen T [Tracy, CA]; Young, Jennifer A [Berkeley, CA]; Clague, David S [Livermore, CA]
2011-01-18
A method of fabricating a DNA molecule of user-defined sequence. The method comprises the steps of preselecting a multiplicity of DNA sequence segments that will comprise the DNA molecule of user-defined sequence, separating the DNA sequence segments temporally, and combining the multiplicity of DNA sequence segments with at least one polymerase enzyme wherein the multiplicity of DNA sequence segments join to produce the DNA molecule of user-defined sequence. Sequence segments may be of length n, where n is an even or odd integer. In one embodiment the length of desired hybridizing overlap is specified by the user and the sequences and the protocol for combining them are guided by computational (bioinformatics) predictions. In one embodiment sequence segments are combined from multiple reading frames to span the same region of a sequence, so that multiple desired hybridizations may occur with different overlap lengths. In one embodiment starting sequence fragments are of different lengths, n, n+1, n+2, etc.
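The idea of preselecting segments with a user-specified hybridizing overlap can be illustrated with a toy sketch. It tiles a target sequence on one strand into segments of length n whose ends overlap by a fixed number of bases, then checks that joining them on the overlaps recovers the target. The function names and the example sequence are assumptions; the patented method additionally involves temporal separation, polymerase extension, and computational prediction of hybridization, none of which is modelled here.

```python
def tile_segments(sequence, seg_len, overlap):
    """Split sequence into segments of seg_len sharing `overlap` bases.

    The final segment may be shorter than seg_len, but it is always
    longer than the overlap, so every junction remains well defined.
    """
    if not 0 < overlap < seg_len:
        raise ValueError("overlap must be positive and shorter than seg_len")
    step = seg_len - overlap
    segments = []
    start = 0
    while True:
        segments.append(sequence[start:start + seg_len])
        if start + seg_len >= len(sequence):
            break
        start += step
    return segments

def reassemble(segments, overlap):
    """Join segments by matching each junction's shared overlap."""
    result = segments[0]
    for seg in segments[1:]:
        if result[-overlap:] != seg[:overlap]:
            raise ValueError("adjacent segments do not share the overlap")
        result += seg[overlap:]
    return result

# Hypothetical 24-base target, segments of length 10, overlap of 4
target = "ATGGCTAGCTAGGATCCGATTACA"
segs = tile_segments(target, seg_len=10, overlap=4)
rebuilt = reassemble(segs, overlap=4)
print(segs)
print(rebuilt == target)
```

Varying `seg_len` per segment (n, n+1, n+2, and so on) or tiling from several offsets would be natural extensions toward the embodiments the abstract lists.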
Chacón, Juliana; Madriñán, Santiago; Debouck, Daniel; Rodriguez, Fausto; Tohme, Joe
2008-10-01
From a phylogenetic perspective, the genus Manihot can be considered as an orphan group of plants, and the scientific knowledge acquired has been mainly related to cassava, one of the most important crops in poor tropical countries. The goal of the majority of evolutionary studies in the genus has been to decipher the domestication process and identify the closest relatives of cassava. Few investigations have focused on wild Manihot species, and the phylogeny of the genus is still unclear. In this study the DNA sequence variation from two chloroplast regions, the nuclear DNA gene G3pdh and two nuclear sequences derived from the 3'-end of two cassava ESTs, were used in order to infer the phylogenetic relationships among a subset of wild Manihot species, including two species from Cnidoscolus as out-groups. Maximum parsimony and Bayesian analyses were conducted for each data set and for a combined matrix due to the low variation of each region when analyzed independently. A penalized likelihood analysis of the chloroplast region trnL-trnF, calibrated with various age estimates for genera in the Euphorbiaceae extracted from the literature, was used to determine the ages of origin and diversification of the genus. The two Mesoamerican species sampled form a well-defined clade. The South American species can be grouped into clades of varying size, but the relationships amongst them cannot be established with the data available. The age of the crown node of Manihot was estimated at 6.6 million years ago. Manihot esculenta varieties do not form a monophyletic group, which is consistent with the possibility of multiple introgressions of genes from other wild species. The low levels of variation observed in the DNA regions sampled suggest a recent and explosive diversification of the genus, which is confirmed by our age estimates.
Moscoso del Prado Martín, Fermín
2013-12-01
I introduce the Bayesian assessment of scaling (BAS), a simple but powerful Bayesian hypothesis contrast methodology that can be used to test hypotheses on the scaling regime exhibited by a sequence of behavioral data. Rather than comparing parametric models, as typically done in previous approaches, the BAS offers a direct, nonparametric way to test whether a time series exhibits fractal scaling. The BAS provides a simpler and faster test than do previous methods, and the code for making the required computations is provided. The method also enables testing of finely specified hypotheses on the scaling indices, something that was not possible with the previously available methods. I then present 4 simulation studies showing that the BAS methodology outperforms the other methods used in the psychological literature. I conclude with a discussion of methodological issues on fractal analyses in experimental psychology. PsycINFO Database Record (c) 2014 APA, all rights reserved.
Zhang, Xi; Shi, Ya Li; Han, Lu Lu; Xiong, Chen; Yi, Shi Qi; Jiang, Peng; Wang, Zeng Xian; Shen, Ji Long; Cui, Jing; Wang, Zhong Quan
2018-01-01
Thelazia callipaeda is the causative agent of thelaziasis in canids, felids and humans. However, the population genetic structure regarding this parasite remains unclear. In this study, we first explored the genetic variation of 32 T. callipaeda clinical isolates using the following multi-molecular markers: cox1, cytb, 12S rDNA, ITS1 and 18S rDNA. The isolates were collected from 13 patients from 11 geographical locations in China. Next, the population structure of T. callipaeda from Europe and other Asian countries was analyzed using the cox1 sequences collected during this study and from the GenBank database. In general, the Chinese clinical isolates of T. callipaeda expressed high genetic diversity. Based on the cox1 gene, a total of 21 haplotypes were identified. One only circulated in European countries (Hap1), while the other 20 haplotypes were dispersed in Korea, Japan and China. There were five nucleotide positions in the cox1 sequences that were confirmed as invariable among individuals from Europe and Asia, but the sequences were distinct between these two regions. Population differences between Europe and Asian countries were greater than those among China, Korea and Japan. The T. callipaeda populations from Europe and Asia should be divided into two separate sub-populations. These two groups started to diverge during the middle Pleistocene. Neutrality tests, mismatch distribution and Bayesian skyline plot (BSP) analysis all rejected possible population expansion of T. callipaeda. The Asian population of T. callipaeda has a high level of genetic diversity, but further studies should be performed to explore the biology, ecology and epidemiology of T. callipaeda.
Park, Mi-Jeong; Choi, Young-Joon; Hong, Seung-Beom; Shin, Hyeon-Dong
2010-01-01
The Ampelomyces quisqualis complex is well known as the most common and widespread hyperparasite of the family Erysiphaceae, the cause of powdery mildew diseases. Formulated as commercial biopesticide products, it is widely used to control the disease in the field and in plastic houses. Although genetic diversity within Ampelomyces isolates has been previously recognized, a single name A. quisqualis is still applied to all pycnidial intracellular hyperparasites of powdery mildew fungi. In this study, the phylogenetic relationships among Ampelomyces isolates originating from various powdery mildew fungi in Korea were inferred from Bayesian and maximum parsimony analyses of the sequences of ITS rDNA region and actin gene. In the phylogenetic trees, the Ampelomyces isolates could be divided into four distinct groups with high sequence divergences in both regions. The largest group, Clade 1, mostly accommodated Ampelomyces isolates originating from the mycohost Podosphaera spp. (sect. Sphaerotheca). Clade 2 comprised isolates from several genera of powdery mildews, Golovinomyces, Erysiphe (sect. Erysiphe), Arthrocladiella, and Phyllactinia, and was further divided into two subclades. An isolate obtained from Podosphaera (sect. Sphaerotheca) pannosa was clustered into Clade 3, with those from powdery mildews infecting rosaceous hosts. The mycohosts of Ampelomyces isolates in Clade 4 mostly consisted of species of Erysiphe (sect. Erysiphe, sect. Microsphaera, and sect. Uncinula). The present phylogenetic study demonstrates that the Ampelomyces hyperparasite is indeed an assemblage of several distinct lineages rather than a sole species. Although the correlation between Ampelomyces isolates and their mycohosts is not entirely clear, the isolates show not only some degree of host specialization but also adaptation to their mycohosts during the evolution of the hyperparasite. Copyright © 2010 The British Mycological Society. Published by Elsevier Ltd. All rights reserved.
Zhang, Xi; Shi, Ya Li; Han, Lu Lu; Xiong, Chen; Yi, Shi Qi; Jiang, Peng; Wang, Zeng Xian; Shen, Ji Long; Wang, Zhong Quan
2018-01-01
Background: Thelazia callipaeda is the causative agent of thelaziasis in canids, felids and humans. However, the population genetic structure regarding this parasite remains unclear. Methodology/principal findings: In this study, we first explored the genetic variation of 32 T. callipaeda clinical isolates using the following multi-molecular markers: cox1, cytb, 12S rDNA, ITS1 and 18S rDNA. The isolates were collected from 13 patients from 11 geographical locations in China. Next, the population structure of T. callipaeda from Europe and other Asian countries was analyzed using the cox1 sequences collected during this study and from the GenBank database. In general, the Chinese clinical isolates of T. callipaeda expressed high genetic diversity. Based on the cox1 gene, a total of 21 haplotypes were identified. One only circulated in European countries (Hap1), while the other 20 haplotypes were dispersed in Korea, Japan and China. There were five nucleotide positions in the cox1 sequences that were confirmed as invariable among individuals from Europe and Asia, but the sequences were distinct between these two regions. Population differences between Europe and Asian countries were greater than those among China, Korea and Japan. The T. callipaeda populations from Europe and Asia should be divided into two separate sub-populations. These two groups started to diverge during the middle Pleistocene. Neutrality tests, mismatch distribution and Bayesian skyline plot (BSP) analysis all rejected possible population expansion of T. callipaeda. Conclusions: The Asian population of T. callipaeda has a high level of genetic diversity, but further studies should be performed to explore the biology, ecology and epidemiology of T. callipaeda. PMID:29324738
Norman, Janette A.; Blackmore, Caroline J.; Rourke, Meaghan; Christidis, Les
2014-01-01
Mitochondrial sequence data is often used to reconstruct the demographic history of Pleistocene populations in an effort to understand how species have responded to past climate change events. However, departures from neutral equilibrium conditions can confound evolutionary inference in species with structured populations or those that have experienced periods of population expansion or decline. Selection can affect patterns of mitochondrial DNA variation and variable mutation rates among mitochondrial genes can compromise inferences drawn from single markers. We investigated the contribution of these factors to patterns of mitochondrial variation and estimates of time to most recent common ancestor (TMRCA) for two clades in a co-operatively breeding avian species, the white-browed babbler Pomatostomus superciliosus. Both the protein-coding ND3 gene and hypervariable domain I control region sequences showed departures from neutral expectations within the superciliosus clade, and a two-fold difference in TMRCA estimates. Bayesian phylogenetic analysis provided evidence of departure from a strict clock model of molecular evolution in domain I, leading to an over-estimation of TMRCA for the superciliosus clade at this marker. Our results suggest mitochondrial studies that attempt to reconstruct Pleistocene demographic histories should rigorously evaluate data for departures from neutral equilibrium expectations, including variation in evolutionary rates across multiple markers. Failure to do so can lead to serious errors in the estimation of evolutionary parameters and subsequent demographic inferences concerning the role of climate as a driver of evolutionary change. These effects may be especially pronounced in species with complex social structures occupying heterogeneous environments. 
We propose that environmentally driven differences in social structure may explain observed differences in evolutionary rate of domain I sequences, resulting from longer than expected retention times for matriarchal lineages in the superciliosus clade. PMID:25181547
Phylogeny of the Asian spiny frog tribe Paini (Family Dicroglossidae) sensu Dubois.
Che, Jing; Hu, Jian-sheng; Zhou, Wei-wei; Murphy, Robert W; Papenfuss, Theodore J; Chen, Ming-yong; Rao, Ding-qi; Li, Pi-peng; Zhang, Ya-ping
2009-01-01
The anuran tribe Paini, family Dicroglossidae, is known only from Asia. The phylogenetic relationships and often the taxonomic recognition of species are controversial. In order to stabilize the classification, we used approximately 2100 bp of nuclear (rhodopsin, tyrosinase) and mitochondrial (12S, 16S rRNA) DNA sequence data to infer the phylogenetic relationships of these frogs. Phylogenetic trees reconstructed using Bayesian inference and maximum parsimony methods supported a monophyletic tribe Paini. Two distinct groups (I, II) were recovered with the mtDNA alone and the total concatenated data (mtDNA+nuDNA). The recognition of two genera, Quasipaa and Nanorana, was supported. Group I, Quasipaa, is widespread east of the Hengduan Mountain Ranges and consists of taxa from relatively low elevations in southern China, Vietnam and Laos. Group II, Nanorana, contains a mix of species occurring from high to low elevation predominantly in the Qinghai-Tibetan Plateau and Hengduan Mountain Ranges. The occurrence of frogs at high elevations appears to be a derived ecological condition. The composition of some major species groups based on morphological characteristics strongly conflicts with the molecular analysis. Some possible cryptic species are indicated by the molecular analyses. The incorporation of genetic data from type localities helped to resolve some of the taxonomic problems, although further combined analyses of morphological data from type specimens are required. The two nuDNA gene segments proved to be very informative for resolving higher phylogenetic relationships, and more nuclear data should be explored to increase confidence in the relationships.
Liu, Jian; Zhang, Shouzhou; Nagalingum, Nathalie S; Chiang, Yu-Chung; Lindstrom, Anders J; Gong, Xun
2018-05-18
The gymnosperm genus Cycas is the sole member of Cycadaceae, and is the largest genus of extant cycads. There are about 115 accepted Cycas species mainly distributed in the paleotropics. Based on morphology, the genus has been divided into six sections and eight subsections, but this taxonomy has not yet been tested in a molecular phylogenetic framework. Although the monophyly of Cycas is broadly accepted, the intrageneric relationships inferred from previous molecular phylogenetic analyses are unclear due to insufficient sampling or uninformative DNA sequence data. In this study, we reconstructed a phylogeny of Cycas using four chloroplast intergenic spacers and seven low-copy nuclear genes and sampling 90% of extant Cycas species. The maximum likelihood and Bayesian inference phylogenies suggest: (1) matrices of either concatenated cpDNA markers or of concatenated nDNA lack sufficient informative sites to resolve the phylogeny alone, however, the phylogeny from the combined cpDNA-nDNA dataset suggests the genus can be roughly divided into 13 clades and six sections that are in agreement with the current classification of the genus; (2) although with partial support, a clade combining sections Panzhihuaenses + Asiorientales is resolved as the earliest diverging branch; (3) section Stangerioides is not monophyletic because the species resolve as a grade; (4) section Indosinenses is not monophyletic as it includes Cycas macrocarpa and C. pranburiensis from section Cycas; (5) section Cycas is the most derived group and its subgroups correspond with geography. Copyright © 2018 Elsevier Inc. All rights reserved.
Thomson, Vicki A.; Lebrasseur, Ophélie; Austin, Jeremy J.; Hunt, Terry L.; Burney, David A.; Denham, Tim; Rawlence, Nicolas J.; Wood, Jamie R.; Gongora, Jaime; Girdland Flink, Linus; Linderholm, Anna; Dobney, Keith; Larson, Greger; Cooper, Alan
2014-01-01
The human colonization of Remote Oceania remains one of the great feats of exploration in history, proceeding east from Asia across the vast expanse of the Pacific Ocean. Human commensal and domesticated species were widely transported as part of this diaspora, possibly as far as South America. We sequenced mitochondrial control region DNA from 122 modern and 22 ancient chicken specimens from Polynesia and Island Southeast Asia and used these together with Bayesian modeling methods to examine the human dispersal of chickens across this area. We show that specific techniques are essential to remove contaminating modern DNA from experiments, which appear to have impacted previous studies of Pacific chickens. In contrast to previous reports, we find that all ancient specimens and a high proportion of the modern chickens possess a group of unique, closely related haplotypes found only in the Pacific. This group of haplotypes appears to represent the authentic founding mitochondrial DNA chicken lineages transported across the Pacific, and allows the early dispersal of chickens across Micronesia and Polynesia to be modeled. Importantly, chickens carrying this genetic signature persist on several Pacific islands at high frequencies, suggesting that the original Polynesian chicken lineages may still survive. No early South American chicken samples have been detected with the diagnostic Polynesian mtDNA haplotypes, arguing against reports that chickens provide evidence of Polynesian contact with pre-European South America. Two modern specimens from the Philippines carry haplotypes similar to the ancient Pacific samples, providing clues about a potential homeland for the Polynesian chicken. PMID:24639505
NASA Astrophysics Data System (ADS)
Lestari, D.; Bustamam, A.; Novianti, T.; Ardaneswari, G.
2017-07-01
A DNA sequence can be defined as a succession of letters representing the order of nucleotides within DNA, drawn from the four base codes adenine (A), guanine (G), cytosine (C), and thymine (T). The precise sequence is determined using DNA sequencing methods, which have been developed since the 1970s and have matured into advanced, high-throughput technologies. DNA sequencing has greatly accelerated biological and medical research and discovery. In some cases, however, sequencing produces ambiguous results, making it difficult to determine whether a given code is A, T, G, or C. To address this problem, we introduce another representation of DNA codes, the quaternion Q = (PA, PT, PG, PC), where PA, PT, PG, and PC are the probabilities of the bases A, T, G, and C appearing in Q, and PA + PT + PG + PC = 1. Using quaternion representations, we construct an improved scoring matrix for global sequence alignment by applying a dot product method. This scoring matrix yields higher-quality match and mismatch scores between two DNA base codes. We implemented the Needleman-Wunsch global sequence alignment algorithm in Octave to analyze a target sequence that contains some ambiguous sequence data. The subject sequences are DNA sequences of Streptococcus pneumoniae families obtained from GenBank, while the target DNA sequence was received from our collaborator's database. We found that the quaternion representation improves the quality of the sequence alignment score, and we conclude that the target DNA sequence has maximum similarity with Streptococcus pneumoniae.
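The quaternion scoring idea can be illustrated with a minimal Python sketch (the study itself used Octave). Everything named here is an assumption made for illustration: the IUPAC lookup table, the uniform probabilities assigned to ambiguous symbols, the way the dot product is turned into a score (an expected match/mismatch value), and the linear gap penalty. The abstract does not specify any of these, so this should not be read as the authors' scoring matrix.

```python
# Order of components follows Q = (PA, PT, PG, PC) from the abstract.
BASES = "ATGC"

# Hypothetical lookup: ambiguous IUPAC symbols get uniform probabilities
# over the bases they may stand for (an assumption, not from the source).
IUPAC = {"A": "A", "T": "T", "G": "G", "C": "C",
         "R": "AG", "Y": "CT", "N": "ATGC"}


def to_quaternion(symbol):
    """Return the probability 4-vector (PA, PT, PG, PC) for one symbol."""
    allowed = IUPAC[symbol]
    return [1.0 / len(allowed) if base in allowed else 0.0 for base in BASES]


def score(a, b, match=1.0, mismatch=-1.0):
    """Expected match/mismatch score; the dot product is the probability
    that two independently drawn bases are identical."""
    identity = sum(x * y for x, y in zip(to_quaternion(a), to_quaternion(b)))
    return identity * match + (1.0 - identity) * mismatch


def needleman_wunsch_score(s, t, gap=-1.0):
    """Global alignment score with a linear gap penalty."""
    rows, cols = len(s) + 1, len(t) + 1
    table = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        table[i][0] = i * gap
    for j in range(1, cols):
        table[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            table[i][j] = max(
                table[i - 1][j - 1] + score(s[i - 1], t[j - 1]),  # align pair
                table[i - 1][j] + gap,                            # gap in t
                table[i][j - 1] + gap,                            # gap in s
            )
    return table[-1][-1]
```

With these assumed parameters, an unresolved `N` aligned against a definite base scores -0.5 rather than a full mismatch of -1, so ambiguous positions are penalized in proportion to their uncertainty.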
Zheng, Qi; Grice, Elizabeth A.
2016-01-01
Accurate mapping of next-generation sequencing (NGS) reads to reference genomes is crucial for almost all NGS applications and downstream analyses. Various repetitive elements in human and other higher eukaryotic genomes contribute in large part to ambiguously (non-uniquely) mapped reads. Most available NGS aligners attempt to address this by either removing all non-uniquely mapping reads, or reporting one random or "best" hit based on simple heuristics. Accurate estimation of the mapping quality of NGS reads is therefore critical albeit completely lacking at present. Here we developed a generalized software toolkit "AlignerBoost", which utilizes a Bayesian-based framework to accurately estimate mapping quality of ambiguously mapped NGS reads. We tested AlignerBoost with both simulated and real DNA-seq and RNA-seq datasets at various thresholds. In most cases, but especially for reads falling within repetitive regions, AlignerBoost dramatically increases the mapping precision of modern NGS aligners without significantly compromising the sensitivity even without mapping quality filters. When using higher mapping quality cutoffs, AlignerBoost achieves a much lower false mapping rate while exhibiting comparable or higher sensitivity compared to the aligner default modes, therefore significantly boosting the detection power of NGS aligners even using extreme thresholds. AlignerBoost is also SNP-aware, and higher quality alignments can be achieved if provided with known SNPs. AlignerBoost’s algorithm is computationally efficient, and can process one million alignments within 30 seconds on a typical desktop computer. AlignerBoost is implemented as a uniform Java application and is freely available at https://github.com/Grice-Lab/AlignerBoost. PMID:27706155
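A generic Bayesian treatment of an ambiguously mapped read can be sketched as follows. This is not AlignerBoost's actual algorithm, which the abstract does not describe; it only assumes that each candidate locus has a log-likelihood (for example, derived from its alignment score) and an optional prior, and the function names are hypothetical. The Phred scaling of the error probability follows the usual mapping-quality convention.

```python
import math


def mapping_posteriors(log_likelihoods, priors=None):
    """Posterior probability that each candidate locus is the read's origin.

    log_likelihoods: one log P(read | locus) per candidate hit (assumed input).
    priors: optional prior per locus; uniform if omitted.
    """
    n = len(log_likelihoods)
    if priors is None:
        priors = [1.0 / n] * n
    logs = [ll + math.log(p) for ll, p in zip(log_likelihoods, priors)]
    top = max(logs)  # subtract the maximum for numerical stability
    weights = [math.exp(x - top) for x in logs]
    total = sum(weights)
    return [w / total for w in weights]


def phred_mapq(posterior, cap=60):
    """Phred-scale the probability that the chosen locus is wrong."""
    err = 1.0 - posterior
    if err <= 0.0:
        return cap
    return min(cap, int(round(-10.0 * math.log10(err))))
```

Under this sketch, a read with two equally good hits gets a posterior of 0.5 for each and a mapping quality of about 3, whereas a hit with posterior 0.99 gets a quality of 20, so a quality cutoff filters reads by how ambiguous their placement is rather than discarding all multi-mapped reads.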
Vuataz, Laurent; Sartori, Michel; Wagner, André; Monaghan, Michael T.
2011-01-01
Aquatic larvae of many Rhithrogena mayflies (Ephemeroptera) inhabit sensitive Alpine environments. A number of species are on the IUCN Red List and many recognized species have restricted distributions and are of conservation interest. Despite their ecological and conservation importance, ambiguous morphological differences among closely related species suggest that the current taxonomy may not accurately reflect the evolutionary diversity of the group. Here we examined the species status of nearly 50% of European Rhithrogena diversity using a widespread sampling scheme of Alpine species that included 22 type localities, general mixed Yule-coalescent (GMYC) model analysis of one standard mtDNA marker and one newly developed nDNA marker, and morphological identification where possible. Using sequences from 533 individuals from 144 sampling localities, we observed significant clustering of the mitochondrial (cox1) marker into 31 GMYC species. Twenty-one of these could be identified based on the presence of topotypes (expertly identified specimens from the species' type locality) or unambiguous morphology. These results strongly suggest the presence of both cryptic diversity and taxonomic oversplitting in Rhithrogena. Significant clustering was not detected with protein-coding nuclear PEPCK, although nine GMYC species were congruent with well supported terminal clusters of nDNA. Lack of greater congruence in the two data sets may be the result of incomplete sorting of ancestral polymorphism. Bayesian phylogenetic analyses of both gene regions recovered four of the six recognized Rhithrogena species groups in our samples as monophyletic. Future development of more nuclear markers would facilitate multi-locus analysis of unresolved, closely related species pairs. The DNA taxonomy developed here lays the groundwork for a future revision of the important but cryptic Rhithrogena genus in Europe. PMID:21611178
Large-Scale Concatenation cDNA Sequencing
Yu, Wei; Andersson, Björn; Worley, Kim C.; Muzny, Donna M.; Ding, Yan; Liu, Wen; Ricafrente, Jennifer Y.; Wentland, Meredith A.; Lennon, Greg; Gibbs, Richard A.
1997-01-01
A total of 100 kb of DNA derived from 69 individual human brain cDNA clones of 0.7–2.0 kb were sequenced by concatenated cDNA sequencing (CCS), whereby multiple individual DNA fragments are sequenced simultaneously in a single shotgun library. The method yielded accurate sequences and a similar efficiency compared with other shotgun libraries constructed from single DNA fragments (>20 kb). Computer analyses were carried out on 65 cDNA clone sequences and their corresponding end sequences to examine both nucleic acid and amino acid sequence similarities in the databases. Thirty-seven clones revealed no DNA database matches, 12 clones generated exact matches (≥98% identity), and 16 clones generated nonexact matches (57%–97% identity) to either known human or other species genes. Of those 28 matched clones, 8 had corresponding end sequences that failed to identify similarities. In a protein similarity search, 27 clone sequences displayed significant matches, whereas only 20 of the end sequences had matches to known protein sequences. Our data indicate that full-length cDNA insert sequences provide significantly more nucleic acid and protein sequence similarity matches than expressed sequence tags (ESTs) for database searching. [All 65 cDNA clone sequences described in this paper have been submitted to the GenBank data library under accession nos. U79240–U79304.] PMID:9110174
Mariella, Jr., Raymond P.
2008-11-18
A method of synthesizing a desired double-stranded DNA of a predetermined length and of a predetermined sequence. Preselected sequence segments that will complete the desired double-stranded DNA are determined. Preselected segment sequences of DNA that will be used to complete the desired double-stranded DNA are provided. The preselected segment sequences of DNA are assembled to produce the desired double-stranded DNA.
Dolz, Roser; Valle, Rosa; Perera, Carmen L.; Bertran, Kateri; Frías, Maria T.; Majó, Natàlia; Ganges, Llilianne; Pérez, Lester J.
2013-01-01
Background: Infectious bursal disease is a highly contagious and acute viral disease caused by the infectious bursal disease virus (IBDV); it affects all major poultry producing areas of the world. The current study was designed to rigorously measure the global phylogeographic dynamics of IBDV strains to gain insight into viral population expansion as well as the emergence, spread and pattern of the geographical structure of very virulent IBDV (vvIBDV) strains. Methodology/Principal Findings: Sequences of the hyper-variable region of the VP2 (HVR-VP2) gene from IBDV strains isolated from diverse geographic locations were obtained from the GenBank database; Cuban sequences were obtained in the current work. All sequences were analysed by Bayesian phylogeographic analysis, implemented in the Bayesian Evolutionary Analysis Sampling Trees (BEAST), Bayesian Tip-association Significance testing (BaTS) and Spatial Phylogenetic Reconstruction of Evolutionary Dynamics (SPREAD) software packages. Selection pressure on the HVR-VP2 was also assessed. The phylogeographic association-trait analysis showed that viruses sampled from individual countries tend to cluster together, suggesting a geographic pattern for IBDV strains. Spatial analysis from this study revealed that strains carrying sequences that were linked to increased virulence of IBDV appeared in Iran in 1981 and spread to Western Europe (Belgium) in 1987, Africa (Egypt) around 1990, East Asia (China and Japan) in 1993, the Caribbean Region (Cuba) by 1995 and South America (Brazil) around 2000. Selection pressure analysis showed that several codons in the HVR-VP2 region were under purifying selection. Conclusions/Significance: To our knowledge, this work is the first study applying the Bayesian phylogeographic reconstruction approach to analyse the emergence and spread of vvIBDV strains worldwide. PMID:23805195
Alfonso-Morales, Abdulahi; Martínez-Pérez, Orlando; Dolz, Roser; Valle, Rosa; Perera, Carmen L; Bertran, Kateri; Frías, Maria T; Majó, Natàlia; Ganges, Llilianne; Pérez, Lester J
2013-01-01
Infectious bursal disease is a highly contagious and acute viral disease caused by the infectious bursal disease virus (IBDV); it affects all major poultry producing areas of the world. The current study was designed to rigorously measure the global phylogeographic dynamics of IBDV strains to gain insight into viral population expansion as well as the emergence, spread and pattern of the geographical structure of very virulent IBDV (vvIBDV) strains. Sequences of the hyper-variable region of the VP2 (HVR-VP2) gene from IBDV strains isolated from diverse geographic locations were obtained from the GenBank database; Cuban sequences were obtained in the current work. All sequences were analysed by Bayesian phylogeographic analysis, implemented in the Bayesian Evolutionary Analysis Sampling Trees (BEAST), Bayesian Tip-association Significance testing (BaTS) and Spatial Phylogenetic Reconstruction of Evolutionary Dynamics (SPREAD) software packages. Selection pressure on the HVR-VP2 was also assessed. The phylogeographic association-trait analysis showed that viruses sampled from individual countries tend to cluster together, suggesting a geographic pattern for IBDV strains. Spatial analysis from this study revealed that strains carrying sequences that were linked to increased virulence of IBDV appeared in Iran in 1981 and spread to Western Europe (Belgium) in 1987, Africa (Egypt) around 1990, East Asia (China and Japan) in 1993, the Caribbean Region (Cuba) by 1995 and South America (Brazil) around 2000. Selection pressure analysis showed that several codons in the HVR-VP2 region were under purifying selection. To our knowledge, this work is the first study applying the Bayesian phylogeographic reconstruction approach to analyse the emergence and spread of vvIBDV strains worldwide.
Nanopore Technology: A Simple, Inexpensive, Futuristic Technology for DNA Sequencing.
Gupta, P D
2016-10-01
In health care, the importance of DNA sequencing has been fully established. Sanger's capillary electrophoresis DNA sequencing methodology is time consuming and cumbersome, and hence expensive. Because of its versatility, DNA sequencing has lately become a household name, and there is therefore an urgent need for a simple, fast, inexpensive DNA sequencing technology. Efforts made at the beginning of this century led to the development of nanopore DNA sequencing technology; it is still in its infancy, but it is nevertheless the futuristic technology.
Scarpassa, Vera Margarete; Conn, Jan E.
2011-01-01
Cryptic species and lineages characterize Anopheles nuneztovari s.l. Gabaldón, an important malaria vector in South America. We investigated the phylogeographic structure across the range of this species with cytochrome oxidase subunit I (COI) mitochondrial DNA sequences to estimate the number of clades and levels of divergence. Bayesian and maximum-likelihood phylogenetic analyses detected four groups distributed in two major monophyletic clades (I and II). Samples from the Amazon Basin were clustered in clade I, as were subclades II-A and II-B, whereas those from Bolivia/Colombia/Venezuela were restricted to one basal subclade (II-C). These data, together with a statistical parsimony network, confirm results of previous studies that An. nuneztovari is a species complex consisting of at least two cryptic taxa, one occurring in Colombia and Venezuela and the other occurring in the Amazon Basin. These data also suggest that additional incipient species may exist in the Amazon Basin. Divergence time and expansion tests suggested that these groups separated and expanded in the Pleistocene Epoch. In addition, the COI sequences clearly separated An. nuneztovari s.l. from the closely related species An. dunhami Causey, and three new records are reported for An. dunhami in Amazonian Brazil. These findings are relevant for vector control programs in areas where both species occur. Our analyses support dynamic geologic and landscape changes in northern South America, and infer particularly active divergence during the Pleistocene Epoch for New World anophelines. PMID:22049039
Hill, Kristina M; Stokes, Nancy A; Webb, Stephen C; Hine, P Mike; Kroeck, Marina A; Moore, James D; Morley, Margaret S; Reece, Kimberly S; Burreson, Eugene M; Carnegie, Ryan B
2014-07-24
The genus Bonamia (Haplosporidia) includes economically significant oyster parasites. Described species were thought to have fairly circumscribed host and geographic ranges: B. ostreae infecting Ostrea edulis in Europe and North America, B. exitiosa infecting O. chilensis in New Zealand, and B. roughleyi infecting Saccostrea glomerata in Australia. The discovery of B. exitiosa-like parasites in new locations and the observation of a novel species, B. perspora, in non-commercial O. stentina altered this perception and prompted our wider evaluation of the global diversity of Bonamia parasites. Samples of 13 oyster species from 21 locations were screened for Bonamia spp. by PCR, and small subunit and internal transcribed spacer regions of Bonamia sp. ribosomal DNA were sequenced from PCR-positive individuals. Infections were confirmed histologically. Phylogenetic analyses using parsimony and Bayesian methods revealed one species, B. exitiosa, to be widely distributed, infecting 7 oyster species from Australia, New Zealand, Argentina, eastern and western USA, and Tunisia. More limited host and geographic distributions of B. ostreae and B. perspora were confirmed, but nothing genetically identifiable as B. roughleyi was found in Australia or elsewhere. Newly discovered diversity included a Bonamia sp. in Dendostrea sandvicensis from Hawaii, USA, that is basal to the other Bonamia species and a Bonamia sp. in O. edulis from Tomales Bay, California, USA, that is closely related to both B. exitiosa and the previously observed Bonamia sp. from O. chilensis in Chile.
Cavallero, Serena; De Liberato, Claudio; Friedrich, Klaus G; Di Cave, David; Masella, Valentina; D'Amelio, Stefano; Berrilli, Federica
2015-08-01
Nematodes of the genus Trichuris, known as whipworms, are recognized to infect numerous mammalian species including humans and non-human primates. Several Trichuris spp. have been described and species designation/identification is traditionally based on host-affiliation, although cross-infection and hybridization events may complicate species boundaries. The main aims of the present study were to genetically characterize adult Trichuris specimens from captive Japanese macaques (Macaca fuscata) and grivets (Chlorocebus aethiops), using ribosomal DNA (ITS) as a molecular marker, and to investigate the phylogeny and the extent of genetic variation by comparison with data on isolates from humans, other non-human primates and other hosts. The phylogenetic analysis of Trichuris sequences from M. fuscata and C. aethiops provided evidence of distinct clades and subclades, thus advocating the existence of additional separate taxa. Neighbor-joining and Bayesian trees suggest that specimens from M. fuscata may be distinct from, but related to, Trichuris trichiura, while a close relationship is suggested between the subclade formed by the specimens from C. aethiops and the subclade formed by T. suis. The tendency to associate Trichuris sp. with host species can lead to misleading taxonomic interpretations (i.e. whipworms found in primates are identified as T. trichiura). The results obtained here confirm previous evidence suggesting the existence of Trichuris spp. other than T. trichiura infecting living non-human primates. Copyright © 2015 Elsevier B.V. All rights reserved.
Rosas-Valdez, Rogelio; Morrone, Juan J; García-Varela, Martín
2012-08-01
Species of Floridosentis (Acanthocephala) are common parasites of mullets (Mugil spp., Mugilidae) found in tropical marine and brackish water in the Americas. Floridosentis includes 2 species distributed in Mexico, i.e., Floridosentis pacifica, restricted to the Pacific Ocean near Salina Cruz, Oaxaca, and Floridosentis mugilis, distributed along the coast of the Pacific Ocean and the Gulf of Mexico. We sampled 18 populations of F. mugilis and F. pacifica (12 from the Pacific and 6 from the Gulf of Mexico) and sequenced a fragment of the rDNA large subunit to evaluate phylogenetic relationships of populations of Floridosentis spp. from Mexico. Species identification of museum specimens of F. mugilis from the Pacific Ocean was confirmed by examination of morphological traits. Phylogenetic trees inferred with maximum parsimony, maximum likelihood, and Bayesian inference indicate that Floridosentis is monophyletic, comprising 2 major well-supported clades, the first clade corresponding to F. mugilis from the Gulf of Mexico, and the second to F. pacifica from the Pacific Ocean. Genetic divergence between species ranged from 7.68 to 8.60%. Intraspecific divergence ranged from 0.14 to 0.86% for F. mugilis and from 1.72 to 4.49% for F. pacifica. Data obtained from diagnostic characters indicate that specimens from the Pacific Ocean in Mexico have differences in some traits among locations. These results are consistent with the phylogenetic hypothesis, indicating that F. pacifica is distributed in the Pacific Ocean in Mexico with 3 major lineages.
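Percent divergence figures of this kind can be computed from aligned sequences in several ways. The abstract does not state which distance measure was used, so the following Python sketch shows only the simplest option, the uncorrected pairwise distance (p-distance), with a hypothetical function name and the assumption that gapped or ambiguous sites are skipped.

```python
def p_distance(seq_a, seq_b):
    """Uncorrected pairwise divergence, in percent, between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous symbol are skipped
    (pairwise deletion), which is an assumption of this sketch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in valid and b in valid:
            compared += 1
            if a != b:
                differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return 100.0 * differences / compared
```

For example, two aligned 10-base fragments that differ at one site give a divergence of 10%. Model-corrected distances would generally be somewhat larger than this uncorrected value at the divergence levels reported above.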
Guidugli, Lucia; Shimelis, Hermela; Masica, David L; Pankratz, Vernon S; Lipton, Gary B; Singh, Namit; Hu, Chunling; Monteiro, Alvaro N A; Lindor, Noralane M; Goldgar, David E; Karchin, Rachel; Iversen, Edwin S; Couch, Fergus J
2018-01-17
Many variants of uncertain significance (VUS) have been identified in BRCA2 through clinical genetic testing. VUS pose a significant clinical challenge because the contribution of these variants to cancer risk has not been determined. We conducted a comprehensive assessment of VUS in the BRCA2 C-terminal DNA binding domain (DBD) by using a validated functional assay of BRCA2 homologous recombination (HR) DNA-repair activity and defined a classifier of variant pathogenicity. Among 139 variants evaluated, 54 had ≥99% probability of pathogenicity, and 73 had ≥95% probability of neutrality. Functional assay results were compared with predictions of variant pathogenicity from the Align-GVGD protein-sequence-based prediction algorithm, which has been used for variant classification. Relative to the HR assay, Align-GVGD significantly (p < 0.05) over-predicted pathogenic variants. We subsequently combined functional and Align-GVGD prediction results in a Bayesian hierarchical model (VarCall) to estimate the overall probability of pathogenicity for each VUS. In addition, to predict the effects of all other BRCA2 DBD variants and to prioritize variants for functional studies, we used the endoPhenotype-Optimized Sequence Ensemble (ePOSE) algorithm to train classifiers for BRCA2 variants by using data from the HR functional assay. Together, the results show that systematic functional assays in combination with in silico predictors of pathogenicity provide robust tools for clinical annotation of BRCA2 VUS. Copyright © 2017 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
Xu, Zhe; Zhang, Ming-Li
2015-01-01
Climatic fluctuations during the Pleistocene are usually considered a significant factor in shaping intraspecific genetic variation and influencing demographic histories. To better understand these processes in desert northwest China, we selected the arid-adapted Atraphaxis frutescens as the study species. Two cpDNA regions (psbK-psbI, psbB-psbH) were sequenced in 272 individuals from 33 natural populations across the range of this shrub, and 10 haplotypes were identified. The species was found to contain high levels of total gene diversity (HT = 0.858) and low levels of within-population diversity (HS = 0.092). Analysis of molecular variance (AMOVA) indicates that genetic differentiation primarily occurs among groups of populations. Based on BEAST (Bayesian Evolutionary Analysis Sampling Trees) analysis, we suggest that intraspecific differentiation of the species, resulting from isolated populations, accompanied enhanced desertification during the middle and late Pleistocene. The expansion of the Gurbantunggut and Kumtag deserts in this area appears to have triggered divergence among populations of the western, central, and eastern portions of the region and shaped genetic differentiation among them. Two possible independent glacial refugia were predicted, the Ili Valley and the northern Junggar Basin. Extensive development of arid habitats (desert margin and arid piedmont grassland) coupled with a more equable climate since the early Holocene are factors likely to have generated recent expansion of A. frutescens. © The American Genetic Association 2014. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
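The two diversity values quoted above can be related through Nei's coefficient of gene differentiation, GST = (HT - HS) / HT. The abstract does not report GST, so the short Python sketch below is only an illustration derived from the two quoted numbers; the function name is hypothetical, and the estimators used in the study may differ from this simple ratio.

```python
def gst(h_t, h_s):
    """Nei's coefficient of gene differentiation, (HT - HS) / HT."""
    if not 0.0 < h_t <= 1.0:
        raise ValueError("total diversity HT must be in (0, 1]")
    if not 0.0 <= h_s <= h_t:
        raise ValueError("within-population diversity HS must be in [0, HT]")
    return (h_t - h_s) / h_t


# Values quoted in the abstract: HT = 0.858, HS = 0.092.
differentiation = gst(0.858, 0.092)
```

With the quoted values, roughly 89% of the total gene diversity lies among populations rather than within them, which is in line with the AMOVA result that differentiation occurs primarily among groups of populations.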
Massardo, Darli; Fornel, Rodrigo; Kronforst, Marcus; Gonçalves, Gislene Lopes; Moreira, Gilson Rudinei Pires
2015-01-01
The tribe Heliconiini (Lepidoptera: Nymphalidae) is a diverse group of butterflies distributed throughout the Neotropics, which has been studied extensively, in particular the genus Heliconius. However, most of the other lineages, such as Dione, which are less diverse and considered basal within the group, have received little attention. Basic information, such as species limits and geographical distributions, remains uncertain for this genus. Here we used multilocus DNA sequence data and geographical distribution analysis across the entire range of Dione in the Neotropical region in order to make inferences on the evolutionary history of this poorly explored lineage. Bayesian time-tree reconstruction allows inferring two major diversification events in this tribe around 25 mya. Lineages thought to be ancient, such as Dione and Agraulis, are as recent as Heliconius. Dione formed a monophyletic clade, sister to the genus Agraulis. Dione juno, D. glycera and D. moneta were reciprocally monophyletic and formed genetic clusters, with the first two more closely related to each other than to the third. Divergence time estimates support the hypothesis that speciation in Dione coincided with both the rise of Passifloraceae (the host plants) and the uplift of the Andes. Since the sister species D. glycera and D. moneta are specialized feeders on passion-vine lineages that are endemic to areas located either within or adjacent to the Andes, we inferred that they co-speciated with their host plants during this vicariant event. Copyright © 2014 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Zhang, Haibin; Johnson, Shannon B.; Flores, Vanessa R.; Vrijenhoek, Robert C.
2015-11-01
We describe a broad zone of intergradation between genetically differentiated, northern and southern lineages of the hydrothermal vent tubeworm, Tevnia jerichonana. DNA sequences from four genes, nuclear HSP and ATPsα and mitochondrial COI and Cytb were examined in samples from eastern Pacific vent localities between 13°N and 38°S latitude. Allelic frequencies at these loci exhibited concordant latitudinal clines, and genetic differentiation (pairwise ΦST's) increased with geographical distances between sample localities. Though this pattern of differentiation suggested isolation-by-distance (IBD), it appeared to result from hierarchical population structure. Genotypic assignment tests identified two population clusters comprised of samples from the northern East Pacific Rise (NEPR: 9-13°N) and an extension of the Pacific-Antarctic Ridge (PAR: 31-32°S) with a zone of intergradation along the southern East Pacific Rise (SEPR: 7-17°S). The overall degrees of DNA sequence divergence between the NEPR and PAR populations were slight and not indicative of lengthy isolation. Bayesian assignment methods suggested that the SEPR populations constitute intergrades that connect the NEPR and PAR populations. Though it typically is difficult to distinguish between primary and secondary intergradation, our results were consistent with parallel studies of vent-restricted species that suggest a high degree of demographic instability along the superfast-spreading SEPR axis. Frequent local extinctions and immigration from NEPR and PAR refugia probably shaped the observed pattern of intergradation.
García-Vásquez, Adriana; Pinacho-Pinacho, Carlos Daniel; Martínez-Ramírez, Emilio; Rubio-Godoy, Miguel
2018-08-01
In the present study, two new species of Gyrodactylus are described from Profundulus oaxacae, a fish endemic to the Pacific slope of Oaxaca State, Mexico. Fishes were collected within their distribution range in 5 localities in the Atoyac-Verde River. Gyrodactylus montealbani n. sp. and G. zapoteco n. sp. were erected and characterized morphologically (sclerites of the attachment apparatus and the male copulatory organ) and molecularly (sequences of the Internal Transcribed Spacer region of rDNA). The haptoral sclerites of the new species are similar to those of Gyrodactylus iunuri and Gyrodactylus tepari, both recently described from the goodeid fish Goodea atripinnis, from the Mexican States of Jalisco and Querétaro, respectively; and to Gyrodactylus xtachuna described from the poeciliid Poeciliopsis gracilis in Veracruz State, Mexico - nonetheless, these species can all be discriminated based on their marginal hook morphology. Specimens of G. montealbani n. sp. and G. zapoteco n. sp. were sequenced, and were aligned with sequences of 25 other Gyrodactylus spp. Both Maximum likelihood and Bayesian inference analyses indicated that the two new species are members of independent, well-supported lineages - these are the first Gyrodactylus species described from Profundulus oaxacae. Copyright © 2018 Elsevier B.V. All rights reserved.
The genome-wide DNA sequence specificity of the anti-tumour drug bleomycin in human cells.
Murray, Vincent; Chen, Jon K; Tanaka, Mark M
2016-07-01
The cancer chemotherapeutic agent, bleomycin, cleaves DNA at specific sites. For the first time, the genome-wide DNA sequence specificity of bleomycin breakage was determined in human cells. Utilising Illumina next-generation DNA sequencing techniques, over 200 million bleomycin cleavage sites were examined to elucidate the bleomycin genome-wide DNA selectivity. The genome-wide bleomycin cleavage data were analysed by four different methods to determine the cellular DNA sequence specificity of bleomycin strand breakage. For the most highly cleaved DNA sequences, the preferred site of bleomycin breakage was at 5'-GT* dinucleotide sequences (where the asterisk indicates the bleomycin cleavage site), with lesser cleavage at 5'-GC* dinucleotides. This investigation also determined longer bleomycin cleavage sequences, with preferred cleavage at 5'-GT*A and 5'-TGT* trinucleotide sequences, and 5'-TGT*A tetranucleotides. For cellular DNA, the hexanucleotide DNA sequence 5'-RTGT*AY (where R is a purine and Y is a pyrimidine) was the most highly cleaved DNA sequence. It was striking that alternating purine-pyrimidine sequences were highly cleaved by bleomycin. The highest intensity cleavage sites in cellular and purified DNA were very similar although there were some minor differences. Statistical nucleotide frequency analysis indicated a G nucleotide was present at the -3 position (relative to the cleavage site) in cellular DNA but was absent in purified DNA.
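As an illustration of the consensus notation used in this abstract, the following minimal sketch scans a sequence for the hexanucleotide 5'-RTGT*AY. It is not the authors' analysis pipeline; the function names `motif_to_regex` and `find_sites` and the `cut_offset` parameter are hypothetical, and the only facts taken from the abstract are the motif itself and the meaning of R, Y, and the asterisk.

```python
import re

# IUPAC codes used in the abstract: R = purine (A/G), Y = pyrimidine (C/T).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]", "Y": "[CT]"}


def motif_to_regex(motif):
    """Translate an IUPAC motif such as 'RTGTAY' into a compiled regex.

    The pattern is wrapped in a lookahead so overlapping matches are found.
    """
    body = "".join(IUPAC[base] for base in motif)
    return re.compile("(?=(" + body + "))")


def find_sites(seq, motif="RTGTAY", cut_offset=4):
    """Return the 0-based index of the base that follows the asterisk.

    cut_offset=4 places the mark after 'RTGT', i.e. 5'-RTGT*AY
    (hypothetical convention chosen for this sketch).
    """
    pattern = motif_to_regex(motif)
    return [m.start() + cut_offset for m in pattern.finditer(seq.upper())]


# Two motif occurrences: ATGTAC (start 2) and GTGTAT (start 10).
sites = find_sites("CCATGTACGGGTGTATTT")
print(sites)  # [6, 14]
```

Only the forward strand is scanned here; a genome-wide analysis would also need to consider the reverse complement.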
Distribution of hepatitis B virus subgenotype F2a in São Paulo, Brazil.
Alvarado-Mora, Mónica V; Botelho-Lima, Livia S; Santana, Rubia A; Sitnik, Roberta; Ferreira, Paulo Abrão; do Amaral Mello, Francisco; Mangueira, Cristovão P; Carrilho, Flair J; Rebello Pinho, João R
2013-10-21
HBV genotype F is primarily found in indigenous populations from South America and is classified in four subgenotypes (F1 to F4). Subgenotype F2a is the most common in Brazil among genotype F cases. The aim of this study was to characterize HBV genotype F2a circulating in 16 patients from São Paulo, Brazil. Samples were collected between 2006 and 2012 and sent to Hospital Israelita Albert Einstein. A fragment of 1306 bp partially comprising HBsAg and DNA polymerase coding regions was amplified and sequenced. Viral sequences were genotyped by phylogenetic analysis using reference sequences from GenBank (n=198), including 80 classified as subgenotype F2a. Bayesian Markov chain Monte Carlo simulation implemented in BEAST v.1.5.4 was applied to obtain the best possible estimates using the model of nucleotide substitutions GTR+G+I. Three groups of sequences of subgenotype F2a were identified: 1) 10 sequences from São Paulo state; 2) 3 sequences from Rio de Janeiro and one from São Paulo states; 3) 8 sequences from the West Amazon Basin. These results show for the first time the distribution of F2a subgenotype in Brazil. Understanding the spread and dynamics of subgenotype F2a in Brazil requires the study of a larger number of samples from different regions, as it is found in almost all Brazilian populations studied so far. We cannot infer with certainty the origin of these different groups due to the lack of available sequences. Nevertheless, our data suggest that the common origin of these groups probably occurred a long time ago.
Population Genetic Structure of the Tropical Two-Wing Flyingfish (Exocoetus volitans)
Lewallen, Eric A.; Bohonak, Andrew J.; Bonin, Carolina A.; van Wijnen, Andre J.; Pitman, Robert L.; Lovejoy, Nathan R.
2016-01-01
Delineating populations of pantropical marine fish is a difficult process, due to widespread geographic ranges and complex life history traits in most species. Exocoetus volitans, a species of two-winged flyingfish, is a good model for understanding large-scale patterns of epipelagic fish population structure because it has a circumtropical geographic range and completes its entire life cycle in the epipelagic zone. Buoyant pelagic eggs should dictate high local dispersal capacity in this species, although a brief larval phase, small body size, and short lifespan may limit the dispersal of individuals over large spatial scales. Based on these biological features, we hypothesized that E. volitans would exhibit statistically and biologically significant population structure defined by recognized oceanographic barriers. We tested this hypothesis by analyzing cytochrome b mtDNA sequence data (1106 bps) from specimens collected in the Pacific, Atlantic and Indian oceans (n = 266). AMOVA, Bayesian, and coalescent analytical approaches were used to assess and interpret population-level genetic variability. A parsimony-based haplotype network did not reveal population subdivision among ocean basins, but AMOVA revealed limited, statistically significant population structure between the Pacific and Atlantic Oceans (ΦST = 0.035, p<0.001). A spatially-unbiased Bayesian approach identified two circumtropical population clusters north and south of the Equator (ΦST = 0.026, p<0.001), a previously unknown dispersal barrier for an epipelagic fish. Bayesian demographic modeling suggested the effective population size of this species increased by at least an order of magnitude ~150,000 years ago, to more than 1 billion individuals currently. Thus, high levels of genetic similarity observed in E. volitans can be explained by high rates of gene flow, a dramatic and recent population expansion, as well as extensive and consistent dispersal throughout the geographic range of the species. 
PMID:27736863
The influence of ignoring secondary structure on divergence time estimates from ribosomal RNA genes.
Dohrmann, Martin
2014-02-01
Genes coding for ribosomal RNA molecules (rDNA) are among the most popular markers in molecular phylogenetics and evolution. However, coevolution of sites that code for pairing regions (stems) in the RNA secondary structure can make it challenging to obtain accurate results from such loci. While the influence of ignoring secondary structure on multiple sequence alignment and tree topology has been investigated in numerous studies, its effect on molecular divergence time estimates is still poorly known. Here, I investigate this issue in Bayesian Markov Chain Monte Carlo (BMCMC) and penalized likelihood (PL) frameworks, using empirical datasets from dragonflies (Odonata: Anisoptera) and glass sponges (Porifera: Hexactinellida). My results indicate that highly biased inferences under substitution models that ignore secondary structure only occur if maximum-likelihood estimates of branch lengths are used as input to PL dating, whereas in a BMCMC framework and in PL dating based on Bayesian consensus branch lengths, the effect is far less severe. I conclude that accounting for coevolution of paired sites in molecular dating studies is not as important as previously suggested, as long as the estimates are based on Bayesian consensus branch lengths instead of ML point estimates. This finding is especially relevant for studies where computational limitations do not allow the use of secondary-structure specific substitution models, or where accurate consensus structures cannot be predicted. I also found that the magnitude and direction (over- vs. underestimating node ages) of bias in age estimates when secondary structure is ignored was not distributed randomly across the nodes of the phylogenies, a phenomenon that requires further investigation. Copyright © 2013 Elsevier Inc. All rights reserved.
Sequence and Structure Dependent DNA-DNA Interactions
NASA Astrophysics Data System (ADS)
Kopchick, Benjamin; Qiu, Xiangyun
Molecular forces between dsDNA strands are largely dominated by electrostatics and have been extensively studied. Quantitative knowledge has been accumulated on how DNA-DNA interactions are modulated by varied biological constituents such as ions, cationic ligands, and proteins. Despite its central role in biology, the sequence of DNA has not received substantial attention and "random" DNA sequences are typically used in biophysical studies. However, ~50% of the human genome is composed of non-random-sequence DNAs, particularly repetitive sequences. Furthermore, covalent modifications of DNA such as methylation play key roles in gene functions. Such DNAs with specific sequences or modifications often take on structures other than the canonical B-form. Here we present a series of quantitative measurements of DNA-DNA forces with the osmotic stress method on different DNA sequences, from short repeats to the most frequent sequences in the genome, and on modifications such as bromination and methylation. We observe peculiar behaviors that appear to be strongly correlated with the incurred structural changes. We speculate on the causes in terms of differences in hydration shell and DNA surface structures.
Vieira, Leila do Nascimento; Dos Anjos, Karina Goulart; Faoro, Helisson; Fraga, Hugo Pacheco de Freitas; Greco, Thiago Machado; Pedrosa, Fábio de Oliveira; de Souza, Emanuel Maltempi; Rogalski, Marcelo; de Souza, Robson Francisco; Guerra, Miguel Pedro
2016-05-01
Complete plastome sequencing is an efficient option for increasing phylogenetic resolution in evolutionary studies and may greatly facilitate the use of plastid DNA markers in plant population genetic studies. Merostachys and Guadua stand out as the most common bamboos indigenous to Brazil and those with the highest potential for utilization. Here, we sequenced the complete plastomes of the Brazilian Guadua chacoensis and Merostachys sp. to perform full plastome phylogeny and characterize the occurrence, type, and distribution of SSRs using 20 Bambuseae species. The determined plastome sequences of Merostachys sp. and G. chacoensis are 136,334 and 135,403 bp in size, respectively, with an identical gene content and typical quadripartite structure consisting of a pair of IRs separated by the LSC and SSC regions. The Maximum Likelihood and Bayesian Inference analyses produced phylogenomic trees identical in topology. These trees supported monophyly of the Paleotropical and Neotropical bamboo clades. The Neotropical bamboos segregated into three well-supported lineages, Chusqueinae, Guaduinae, and Arthrostylidiinae, with the last two forming a well-supported sister relationship. Paleotropical bamboos segregated into two well-supported lineages, Hickeliinae and Bambusinae + Melocanninae. We identified 141.8 cpSSRs in Bambuseae plastomes and a lower value (38.15) for plastome coding sequences. Among them, we identified 16 polymorphic SSR loci, with the number of alleles varying from 3 to 10. These 16 polymorphic cpSSR loci in the Bambuseae plastome can be assessed for the intraspecific level of polymorphism, leading to innovative, highly sensitive phylogeographic and population genetics studies for this tribe.
Bloor, P; Kemp, S J; Brown, R P
2008-02-01
The phylogeography of the lacertid lizard Gallotia atlantica from the small volcanic island of Lanzarote (Canary Islands) was analysed based on 1075 bp of mitochondrial DNA (mtDNA) sequence (partial cytochrome b and ND2) for 157 individuals from 27 sites (including three sites from neighbouring islets). Levels of sequence divergence were generally low, with the most distant haplotypes separated by only 14 mutational steps. MtDNA divergence appears to coincide with formation of the middle Pleistocene lowland that united formerly separate ancient islands to form the current island of Lanzarote, allowing rejection of a two-island model of phylogeographical structure. There was evidence of large-scale population expansion after island unification, consistent with the colonization of new areas. A nested clade phylogeographical analysis (NCPA) revealed significant phylogeographical structuring. Two-step and higher-level clades each had disjunct distributions, being found to the east and west of a common area with a north-south orientation that extends between coasts in the centre-east of the island (El Jable). Other clades were almost entirely restricted to the El Jable region alone. Bayesian Markov chain Monte Carlo analyses were used to separate ongoing gene flow from historical associations. These supported the NCPA by indicating recent (75,000-150,000 years ago) east-west vicariance across the El Jable region. Lava flows covered El Jable and other parts of the central lowland at this time and likely led to population extinctions and temporary dispersal barriers, although present-day evidence suggests some populations would have survived in small refugia. Expansion of the latter appears to explain the presence of a clade located between the eastern and western components of the disjunct clades. 
Direct relationships between mtDNA lineages and morphology were not found, although one of two morphological forms on the island has a disjunct distribution that is broadly concordant with east-west components of the phylogeographical pattern. This work demonstrates how recent volcanic activity can cause population fragmentation and thus shape genetic diversity on microgeographical scales.
BM-Map: Bayesian Mapping of Multireads for Next-Generation Sequencing Data
Ji, Yuan; Xu, Yanxun; Zhang, Qiong; Tsui, Kam-Wah; Yuan, Yuan; Norris, Clift; Liang, Shoudan; Liang, Han
2011-01-01
Summary Next-generation sequencing (NGS) technology generates millions of short reads, which provide valuable information for various aspects of cellular activities and biological functions. A key step in NGS applications (e.g., RNA-Seq) is to map short reads to correct genomic locations within the source genome. While most reads are mapped to a unique location, a significant proportion of reads align to multiple genomic locations with equal or similar numbers of mismatches; these are called multireads. The ambiguity in mapping the multireads may lead to bias in downstream analyses. Currently, most practitioners discard the multireads in their analysis, resulting in a loss of valuable information, especially for the genes with similar sequences. To refine the read mapping, we develop a Bayesian model that computes the posterior probability of mapping a multiread to each competing location. The probabilities are used for downstream analyses, such as the quantification of gene expression. We show through simulation studies and RNA-Seq analysis of real life data that the Bayesian method yields better mapping than the current leading methods. We provide a C++ program for downloading that is being packaged into a user-friendly software. PMID:21517792
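The idea of assigning a multiread a posterior probability for each competing location can be sketched with Bayes' rule. The code below is a deliberately simplified illustration and not the BM-Map model: it assumes an independent per-base error rate, a likelihood that depends only on the mismatch count, and a uniform prior unless one is supplied. The function name `multiread_posteriors` and all parameters are hypothetical.

```python
def multiread_posteriors(mismatches, read_len, error_rate=0.01, priors=None):
    """Posterior probability of each candidate location for one multiread.

    mismatches : list of mismatch counts, one per candidate location
    read_len   : read length in bases
    error_rate : assumed independent per-base error probability (assumption)
    priors     : optional prior weight per location; uniform if omitted
    """
    n = len(mismatches)
    if priors is None:
        priors = [1.0 / n] * n
    # Likelihood of the read given a location with m mismatches:
    # error_rate**m * (1 - error_rate)**(read_len - m)
    weights = [
        p * (error_rate ** m) * ((1.0 - error_rate) ** (read_len - m))
        for p, m in zip(priors, mismatches)
    ]
    total = sum(weights)
    return [w / total for w in weights]


# A read aligning to two locations with 0 and 1 mismatches respectively.
print(multiread_posteriors([0, 1], read_len=36))
```

Under these assumptions the posterior odds between two locations that differ by one mismatch equal (1 - error_rate) / error_rate, so a downstream step such as expression quantification could weight the read fractionally instead of discarding it.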
BASiCS: Bayesian Analysis of Single-Cell Sequencing Data
Vallejos, Catalina A.; Marioni, John C.; Richardson, Sylvia
2015-01-01
Single-cell mRNA sequencing can uncover novel cell-to-cell heterogeneity in gene expression levels in seemingly homogeneous populations of cells. However, these experiments are prone to high levels of unexplained technical noise, creating new challenges for identifying genes that show genuine heterogeneous expression within the population of cells under study. BASiCS (Bayesian Analysis of Single-Cell Sequencing data) is an integrated Bayesian hierarchical model where: (i) cell-specific normalisation constants are estimated as part of the model parameters, (ii) technical variability is quantified based on spike-in genes that are artificially introduced to each analysed cell’s lysate and (iii) the total variability of the expression counts is decomposed into technical and biological components. BASiCS also provides an intuitive detection criterion for highly (or lowly) variable genes within the population of cells under study. This is formalised by means of tail posterior probabilities associated to high (or low) biological cell-to-cell variance contributions, quantities that can be easily interpreted by users. We demonstrate our method using gene expression measurements from mouse Embryonic Stem Cells. Cross-validation and meaningful enrichment of gene ontology categories within genes classified as highly (or lowly) variable supports the efficacy of our approach. PMID:26107944
A High-Throughput Process for the Solid-Phase Purification of Synthetic DNA Sequences
Grajkowski, Andrzej; Cieślak, Jacek; Beaucage, Serge L.
2017-01-01
An efficient process for the purification of synthetic phosphorothioate and native DNA sequences is presented. The process is based on the use of an aminopropylated silica gel support functionalized with aminooxyalkyl functions to enable capture of DNA sequences through an oximation reaction with the keto function of a linker conjugated to the 5′-terminus of DNA sequences. Deoxyribonucleoside phosphoramidites carrying this linker, as a 5′-hydroxyl protecting group, have been synthesized for incorporation into DNA sequences during the last coupling step of a standard solid-phase synthesis protocol executed on a controlled pore glass (CPG) support. Solid-phase capture of the nucleobase- and phosphate-deprotected DNA sequences released from the CPG support is demonstrated to proceed near quantitatively. Shorter than full-length DNA sequences are first washed away from the capture support; the solid-phase purified DNA sequences are then released from this support upon reaction with tetra-n-butylammonium fluoride in dry dimethylsulfoxide (DMSO) and precipitated in tetrahydrofuran (THF). The purity of solid-phase-purified DNA sequences exceeds 98%. The simulated high-throughput and scalability features of the solid-phase purification process are demonstrated without sacrificing purity of the DNA sequences. PMID:28628204
An improved model for whole genome phylogenetic analysis by Fourier transform.
Yin, Changchuan; Yau, Stephen S-T
2015-10-07
DNA sequence similarity comparison is one of the major steps in computational phylogenetic studies. The sequence comparison of closely related DNA sequences and genomes is usually performed by multiple sequence alignments (MSA). While the MSA method is accurate for some types of sequences, it may produce incorrect results when DNA sequences have undergone rearrangements, as in many bacterial and viral genomes. It is also limited by its computational complexity for comparing large volumes of data. Previously, we proposed an alignment-free method that exploits the full information contents of DNA sequences by Discrete Fourier Transform (DFT), but still with some limitations. Here, we present a significantly improved method for the similarity comparison of DNA sequences by DFT. In this method, we map DNA sequences into 2-dimensional (2D) numerical sequences and then apply DFT to transform the 2D numerical sequences into the frequency domain. In the 2D mapping, the nucleotide composition of a DNA sequence is a determinant factor, and the 2D mapping reduces the nucleotide composition bias in the distance measure, thus improving the similarity measure of DNA sequences. To compare the DFT power spectra of DNA sequences with different lengths, we propose an improved even scaling algorithm to extend shorter DFT power spectra to the longest length of the underlying sequences. After the DFT power spectra are evenly scaled, the spectra are in the same dimensionality of the Fourier frequency space, and the Euclidean distances of the full Fourier power spectra of the DNA sequences are then used as the dissimilarity metrics. The improved DFT method, with increased computational performance by 2D numerical representation, can be applicable to any DNA sequences of different length ranges. We assess the accuracy of the improved DFT similarity measure in hierarchical clustering of different DNA sequences including simulated and real datasets. 
The method yields accurate and reliable phylogenetic trees and demonstrates that the improved DFT dissimilarity measure is an efficient and effective similarity measure of DNA sequences. Due to its high efficiency and accuracy, the proposed DFT similarity measure is successfully applied on phylogenetic analysis for individual genes and large whole bacterial genomes. Copyright © 2015 Elsevier Ltd. All rights reserved.
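The pipeline described in this abstract (2D numerical mapping, DFT power spectrum, scaling to a common length, Euclidean distance) can be sketched as follows. This is an illustration under stated assumptions, not the published method: the nucleotide-to-plane mapping in `MAPPING` is a placeholder, and the linear-interpolation `stretch` function stands in for the paper's even scaling algorithm, whose exact formula is not given in the abstract. All names are hypothetical.

```python
import cmath
import math

# Placeholder 2D mapping (assumption): each nucleotide is a point in the plane,
# encoded as a complex number. The paper's actual mapping may differ.
MAPPING = {"A": complex(1, 0), "T": complex(-1, 0),
           "C": complex(0, 1), "G": complex(0, -1)}


def power_spectrum(seq):
    """Naive O(N^2) DFT power spectrum of the 2D-mapped sequence."""
    x = [MAPPING[b] for b in seq.upper()]
    n = len(x)
    spectrum = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(s) ** 2)
    return spectrum


def stretch(spectrum, m):
    """Extend a spectrum to length m by linear interpolation.

    Stand-in for the even scaling step (assumption).
    """
    n = len(spectrum)
    if n == m:
        return list(spectrum)
    out = []
    for k in range(m):
        q = k * (n - 1) / (m - 1)   # fractional position in the short spectrum
        lo = int(math.floor(q))
        hi = min(lo + 1, n - 1)
        frac = q - lo
        out.append((1 - frac) * spectrum[lo] + frac * spectrum[hi])
    return out


def dft_distance(seq_a, seq_b):
    """Euclidean distance between length-matched power spectra."""
    pa, pb = power_spectrum(seq_a), power_spectrum(seq_b)
    m = max(len(pa), len(pb))
    pa, pb = stretch(pa, m), stretch(pb, m)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(pa, pb)))


print(dft_distance("ACGTACGT", "AAAAAAAA"))
print(dft_distance("ACGTACGT", "ACGTACGTAC"))
```

A pairwise matrix of such distances could then be passed to any hierarchical clustering routine to build a tree; for genome-scale inputs an FFT implementation would replace the naive DFT used here.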
Ribosomal RNA Genes Contribute to the Formation of Pseudogenes and Junk DNA in the Human Genome.
Robicheau, Brent M; Susko, Edward; Harrigan, Amye M; Snyder, Marlene
2017-02-01
Approximately 35% of the human genome can be identified as sequence devoid of a selected-effect function, and not derived from transposable elements or repeated sequences. We provide evidence supporting a known origin for a fraction of this sequence. We show that: 1) highly degraded, but near full length, ribosomal DNA (rDNA) units, including both 45S and Intergenic Spacer (IGS), can be found at multiple sites in the human genome on chromosomes without rDNA arrays, 2) that these rDNA sequences have a propensity for being centromere proximal, and 3) that sequence at all human functional rDNA array ends is divergent from canonical rDNA to the point that it is pseudogenic. We also show that small sequence strings of rDNA (from 45S + IGS) can be found distributed throughout the genome and are identifiable as an "rDNA-like signal", representing 0.26% of the q-arm of HSA21 and ∼2% of the total sequence of other regions tested. The size of sequence strings found in the rDNA-like signal intergrade into the size of sequence strings that make up the full-length degrading rDNA units found scattered throughout the genome. We conclude that the displaced and degrading rDNA sequences are likely of a similar origin but represent different stages in their evolution towards random sequence. Collectively, our data suggests that over vast evolutionary time, rDNA arrays contribute to the production of junk DNA. The concept that the production of rDNA pseudogenes is a by-product of concerted evolution represents a previously under-appreciated process; we demonstrate here its importance. © The Author(s) 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Yin, Changchuan
2015-04-01
To apply digital signal processing (DSP) methods to analyze DNA sequences, the sequences first must be specially mapped into numerical sequences. Thus, effective numerical mappings of DNA sequences play key roles in the effectiveness of DSP-based methods such as exon prediction. Despite numerous mappings of symbolic DNA sequences to numerical series, the existing mapping methods do not include the genetic coding features of DNA sequences. We present a novel numerical representation of DNA sequences using genetic codon context (GCC) in which the numerical values are optimized by simulated annealing to maximize the 3-periodicity signal to noise ratio (SNR). The optimized GCC representation is then applied in exon and intron prediction by the Short-Time Fourier Transform (STFT) approach. The results show the GCC method enhances the SNR values of exon sequences and thus increases the accuracy of predicting protein coding regions in genomes compared with the commonly used 4D binary representation. In addition, this study offers a novel way to reveal specific features of DNA sequences by optimizing numerical mappings of symbolic DNA sequences.
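To make the 3-periodicity SNR concrete, here is a minimal sketch that computes it with the 4D binary (indicator) representation that the abstract names as the common baseline. It does not implement the GCC mapping or its simulated-annealing optimization. The SNR definition used here, power at frequency N/3 divided by the mean power over frequencies 1 to N-1, is an assumption for illustration, and the function names are hypothetical.

```python
import cmath
import math


def total_power(seq, k):
    """Sum over A, C, G, T of |DFT of the binary indicator sequence|^2 at k."""
    n = len(seq)
    power = 0.0
    for base in "ACGT":
        s = sum(cmath.exp(-2j * math.pi * k * t / n)
                for t, b in enumerate(seq) if b == base)
        power += abs(s) ** 2
    return power


def periodicity3_snr(seq):
    """3-periodicity SNR: power at N/3 over mean power for k = 1..N-1.

    The sequence is trimmed to a multiple of 3 so that N/3 is an integer.
    """
    seq = seq.upper()
    seq = seq[: len(seq) - len(seq) % 3]
    n = len(seq)
    spectrum = [total_power(seq, k) for k in range(1, n)]
    mean_power = sum(spectrum) / len(spectrum)
    return spectrum[n // 3 - 1] / mean_power  # spectrum[0] corresponds to k = 1


print(periodicity3_snr("ATG" * 20))   # strong period-3 signal
print(periodicity3_snr("AT" * 30))    # period-2 signal, no peak at N/3
```

Applying the same calculation inside a sliding window gives a short-time version of the measure, which is the general idea behind STFT-based exon prediction.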
"New turns from old STaRs": enhancing the capabilities of forensic short tandem repeat analysis.
Phillips, Christopher; Gelabert-Besada, Miguel; Fernandez-Formoso, Luis; García-Magariños, Manuel; Santos, Carla; Fondevila, Manuel; Ballard, David; Syndercombe Court, Denise; Carracedo, Angel; Lareu, Maria Victoria
2014-11-01
The field of research and development of forensic STR genotyping remains active, innovative, and focused on continuous improvements. A series of recent developments including the introduction of a sixth dye have brought expanded STR multiplex sizes while maintaining sensitivity to typical forensic DNA. New supplementary kits complementing the core STRs have also helped improve analysis of challenging identification cases such as distant pairwise relationships in deficient pedigrees. This article gives an overview of several recent key developments in forensic STR analysis: availability of expanded core STR kits and supplementary STRs, short-amplicon mini-STRs offering practical options for highly degraded DNA, Y-STR enhancements made from the identification of rapidly mutating loci, and enhanced analysis of genetic ancestry by analyzing 32-STR profiles with a Bayesian forensic classifier originally developed for SNP population data. As well as providing scope for genotyping larger numbers of STRs optimized for forensic applications, the launch of compact next-generation sequencing systems provides considerable potential for genotyping the sizeable proportion of nucleotide variation existing in forensic STRs, which currently escapes detection with CE. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Laakso, Into; Stenroos, Soili
2017-01-01
Heterocephalacria bachmannii is a lichenicolous fungus that parasitizes numerous lichen species of the genus Cladonia. In the present study we analyze whether geographical distance, host species or host secondary metabolites determine the genetic structure of this parasite. To address the question, populations mainly from Southern Europe, Southern Finland and the Azores were sampled. The specimens were collected from 20 different host species representing ten chemotypes. Three loci, ITS rDNA, LSU rDNA and mtSSU, were sequenced. The genetic structure was assessed by AMOVA, redundancy analyses and Bayesian clustering methods. The results indicated that the host species and the host secondary metabolites are the most influential factors in the genetic structure of this lichenicolous fungus. In addition, the genetic structure of H. bachmannii was compared with that of one of its hosts, Cladonia rangiformis. The population structures of parasite and host were discordant. The contents of phenolic compounds and fatty acids in C. rangiformis were quantified in order to test whether they had some influence on the genetic structure of the species, but no correlation was found with the genetic clusters of H. bachmannii. PMID:29253026
Edwards, Ceiridwen J.; Soulsbury, Carl D.; Statham, Mark J.; Ho, Simon Y. W.; Wall, Dave; Dolf, Gaudenz; Iossa, Graziella; Baker, Phillip J.; Harris, Stephen; Sacks, Benjamin N.; Bradley, Daniel G.
2012-12-01
Quaternary climatic fluctuations have had profound effects on the phylogeographic structure of many species. Classically, species were thought to have become isolated in peninsular refugia, but there is limited evidence that large, non-polar species survived outside traditional refugial areas. We examined the phylogeographic structure of the red fox (Vulpes vulpes), a species that shows high ecological adaptability in the western Palaearctic region. We compared mitochondrial DNA sequences (cytochrome b and control region) from 399 modern and 31 ancient individuals from across Europe. Our objective was to test whether red foxes colonised the British Isles from mainland Europe in the late Pleistocene, or whether there is evidence that they persisted in the region through the Last Glacial Maximum. We found red foxes to show a high degree of phylogeographic structuring across Europe and, consistent with palaeontological and ancient DNA evidence, confirmed via phylogenetic indicators that red foxes were persistent in areas outside peninsular refugia during the last ice age. Bayesian analyses and tests of neutrality indicated population expansion. We conclude that there is evidence that red foxes from the British Isles derived from central European populations that became isolated after the closure of the landbridge with Europe.
Edwards, Ceiridwen J; Soulsbury, Carl D; Statham, Mark J; Ho, Simon Y W; Wall, Dave; Dolf, Gaudenz; Iossa, Graziella; Baker, Phillip J; Harris, Stephen; Sacks, Benjamin N; Bradley, Daniel G
2012-12-04
Quaternary climatic fluctuations have had profound effects on the phylogeographic structure of many species. Classically, species were thought to have become isolated in peninsular refugia, but there is limited evidence that large, non-polar species survived outside traditional refugial areas. We examined the phylogeographic structure of the red fox ( Vulpes vulpes ), a species that shows high ecological adaptability in the western Palaearctic region. We compared mitochondrial DNA sequences (cytochrome b and control region) from 399 modern and 31 ancient individuals from across Europe. Our objective was to test whether red foxes colonised the British Isles from mainland Europe in the late Pleistocene, or whether there is evidence that they persisted in the region through the Last Glacial Maximum. We found red foxes to show a high degree of phylogeographic structuring across Europe and, consistent with palaeontological and ancient DNA evidence, confirmed via phylogenetic indicators that red foxes were persistent in areas outside peninsular refugia during the last ice age. Bayesian analyses and tests of neutrality indicated population expansion. We conclude that there is evidence that red foxes from the British Isles derived from central European populations that became isolated after the closure of the landbridge with Europe.
Eid, Mohammed Mansour Abbas; Shimoda, Mayuko; Singh, Shailendra Kumar; Almofty, Sarah Ameen; Pham, Phuong; Goodman, Myron F; Maeda, Kazuhiko; Sakaguchi, Nobuo
2017-05-01
Immunoglobulin affinity maturation depends on somatic hypermutation (SHM) in immunoglobulin variable (IgV) regions initiated by activation-induced cytidine deaminase (AID). AID induces transition mutations by C→U deamination on both strands, causing C:G→T:A. Error-prone repairs of U by base excision and mismatch repairs (MMRs) create transversion mutations at C/G and mutations at A/T sites. In Neuberger's model, it remained to be clarified how transition/transversion repair is regulated. We investigate the role of AID-interacting GANP (germinal center-associated nuclear protein) in the IgV SHM profile. GANP enhances transition mutation of the non-transcribed strand G and reduces mutation at A, restricted to GYW of the AID hotspot motif. It reduces DNA polymerase η hotspot mutations associated with MMRs followed by uracil-DNA glycosylase. Mutation comparison between IgV complementary and framework regions (FWRs) by Bayesian statistical estimation demonstrates that GANP supports the preservation of IgV FWR genomic sequences. GANP works to maintain antibody structure by reducing drastic changes in the IgV FWR in affinity maturation. © The Author 2017. Published by Oxford University Press on behalf of The Japanese Society for Immunology.
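The distinction between transition and transversion mutations used in this abstract can be made concrete with a short sketch. A transition swaps a purine for a purine or a pyrimidine for a pyrimidine (for example the C→T change left by C→U deamination), while a transversion swaps between the two classes. The helper names `classify_substitution` and `complement` are hypothetical and are not taken from the study.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(base: str) -> str:
    """Return the Watson-Crick partner of a DNA base."""
    return COMPLEMENT[base.upper()]


def classify_substitution(ref: str, alt: str) -> str:
    """Label a single-base change as 'transition' or 'transversion'."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in COMPLEMENT or alt not in COMPLEMENT:
        raise ValueError("bases must be one of A, C, G, T")
    if ref == alt:
        raise ValueError("ref and alt are identical; not a substitution")
    pair = {ref, alt}
    # Both purines or both pyrimidines: same chemical class, so a transition.
    same_class = pair <= PURINES or pair <= PYRIMIDINES
    return "transition" if same_class else "transversion"


if __name__ == "__main__":
    # C->T on one strand is read as G->A on the complementary strand;
    # both are transitions, matching the C:G->T:A notation in the abstract.
    print(classify_substitution("C", "T"))
    print(classify_substitution(complement("C"), complement("T")))
    print(classify_substitution("C", "G"))
```

Because complementing both bases never changes the chemical class relationship, a change keeps the same label regardless of which strand it is reported on.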
Barik, Tapan K; Swain, Surya N; Sahu, Bijayalaxmi; Tripathy, Bibarani; Acharya, Usha R
2018-05-01
Identification of fish species has so far been carried out mostly by classical morpho-taxonomy. In the present study, an attempt was made to identify two fish species of the order Perciformes, Ulua mentalis and Pinjalo pinjalo, recorded for the first time on the Odisha coast, Bay of Bengal, India, during 2015, using the DNA barcoding technique to reconfirm conventional morpho-taxonomy. In the recent past, the study of the molecular-taxonomic profile of mitochondrial DNA in general, and of the Cytochrome Oxidase subunit I (COI) gene in particular, has gained enormous importance for accurate identification of species. In the present study, partial COI sequences of Ulua mentalis and Pinjalo pinjalo were generated. Analysis using the COI gene produced phylogenetic trees in concurrence with other multi-gene studies, and identical phylogenetic relationships were recovered in the Neighbor-Joining and Maximum Likelihood trees. Moreover, the molecular data set was further tested in a Bayesian framework to reevaluate the exact taxonomic groupings within the family. Surprisingly, Ulua mentalis and Pinjalo pinjalo seem to be closely related to their sister taxa.
Mitochondrial DNA phylogeny of camel spiders (Arachnida: Solifugae) from Iran.
Maddahi, Hassan; Khazanehdari, Mahsa; Aliabadian, Mansour; Kami, Haji Gholi; Mirshamsi, Amin; Mirshamsi, Omid
2017-11-01
In the present study, the mitochondrial DNA phylogeny of five solifuge families of Iran is presented using phylogenetic analysis of mitochondrial cytochrome c oxidase subunit 1 (COI) sequence data. Moreover, we included available representatives from seven families from GenBank to examine the genetic distance between Old and New World taxa and to test the phylogenetic relationships among more solifuge families. Phylogenetic relationships were reconstructed using two probabilistic methods, Maximum Likelihood (ML) and Bayesian inference (BI). The resulting topologies demonstrated the monophyly of the families Daesiidae, Eremobatidae, Galeodidae, Karschiidae and Rhagodidae, whereas the monophyly of the families Ammotrechidae and Gylippidae was not supported. Also, within the family Eremobatidae, the subfamilies Eremobatinae and Therobatinae and the genus Hemerotrecha were paraphyletic or polyphyletic. According to the resulting topologies, the taxonomic placements of Trichotoma michaelseni (Gylippidae) and Nothopuga sp. 1 (Ammotrechidae) remain in question and their revision might be appropriate. According to the results of this study, within the family Galeodidae, the validity of the genus Galeodopsis is supported, while the validity of the genus Paragaleodes remains uncertain. Moreover, our results revealed that the species Galeodes bacillatus and Rhagodes melanochaetus are junior synonyms of G. caspius and R. eylandti, respectively.
Phylogenetic study of Class Armophorea (Alveolata, Ciliophora) based on 18S-rDNA data.
da Silva Paiva, Thiago; do Nascimento Borges, Bárbara; da Silva-Neto, Inácio Domingos
2013-12-01
The 18S rDNA phylogeny of Class Armophorea, a group of anaerobic ciliates, is proposed based on an analysis of 44 sequences (out of 195) retrieved from the NCBI/GenBank database. Emphasis was placed on the use of two nucleotide alignment criteria that involved variation in the gap-opening and gap-extension parameters and the use of rRNA secondary structure to orientate multiple-alignment. A sensitivity analysis of 76 data sets was run to assess the effect of variations in indel parameters on tree topologies. Bayesian inference, maximum likelihood and maximum parsimony phylogenetic analyses were used to explore how different analytic frameworks influenced the resulting hypotheses. A sensitivity analysis revealed that the relationships among higher taxa of the Intramacronucleata were dependent upon how indels were determined during multiple-alignment of nucleotides. The phylogenetic analyses rejected the monophyly of the Armophorea most of the time and consistently indicated that the Metopidae and Nyctotheridae were related to the Litostomatea. There was no consensus on the placement of the Caenomorphidae, which could be a sister group of the Metopidae + Nyctotheridae, or could have diverged at the base of the Spirotrichea branch or the Intramacronucleata tree.
Phylogenetic study of Class Armophorea (Alveolata, Ciliophora) based on 18S-rDNA data
da Silva Paiva, Thiago; do Nascimento Borges, Bárbara; da Silva-Neto, Inácio Domingos
2013-01-01
The 18S rDNA phylogeny of Class Armophorea, a group of anaerobic ciliates, is proposed based on an analysis of 44 sequences (out of 195) retrieved from the NCBI/GenBank database. Emphasis was placed on the use of two nucleotide alignment criteria that involved variation in the gap-opening and gap-extension parameters and the use of rRNA secondary structure to orientate multiple-alignment. A sensitivity analysis of 76 data sets was run to assess the effect of variations in indel parameters on tree topologies. Bayesian inference, maximum likelihood and maximum parsimony phylogenetic analyses were used to explore how different analytic frameworks influenced the resulting hypotheses. A sensitivity analysis revealed that the relationships among higher taxa of the Intramacronucleata were dependent upon how indels were determined during multiple-alignment of nucleotides. The phylogenetic analyses rejected the monophyly of the Armophorea most of the time and consistently indicated that the Metopidae and Nyctotheridae were related to the Litostomatea. There was no consensus on the placement of the Caenomorphidae, which could be a sister group of the Metopidae + Nyctotheridae, or could have diverged at the base of the Spirotrichea branch or the Intramacronucleata tree. PMID:24385862
Single-cell genomic sequencing using Multiple Displacement Amplification.
Lasken, Roger S
2007-10-01
Single microbial cells can now be sequenced using DNA amplified by the Multiple Displacement Amplification (MDA) reaction. The few femtograms of DNA in a bacterium are amplified into micrograms of high molecular weight DNA suitable for DNA library construction and Sanger sequencing. The MDA-generated DNA also performs well when used directly as template for pyrosequencing by the 454 Life Sciences method. While MDA from single cells loses some of the genomic sequence, this approach will greatly accelerate the pace of sequencing from uncultured microbes. The genetically linked sequences from single cells are also a powerful tool to be used in guiding genomic assembly of shotgun sequences of multiple organisms from environmental DNA extracts (metagenomic sequences).
Arora, Natasha; Nater, Alexander; van Schaik, Carel P.; Willems, Erik P.; van Noordwijk, Maria A.; Goossens, Benoit; Morf, Nadja; Bastian, Meredith; Knott, Cheryl; Morrogh-Bernard, Helen; Kuze, Noko; Kanamori, Tomoko; Pamungkas, Joko; Perwitasari-Farajallah, Dyah; Verschoor, Ernst; Warren, Kristin; Krützen, Michael
2010-01-01
Sundaland, a tropical hotspot of biodiversity comprising Borneo and Sumatra among other islands, the Malay Peninsula, and a shallow sea, has been subject to dramatic environmental processes. Thus, it presents an ideal opportunity to investigate the role of environmental mechanisms in shaping species distribution and diversity. We investigated the population structure and underlying mechanisms of an insular endemic, the Bornean orangutan (Pongo pygmaeus). Phylogenetic reconstructions based on mtDNA sequences from 211 wild orangutans covering the entire range of the species indicate an unexpectedly recent common ancestor of Bornean orangutans 176 ka (95% highest posterior density, 72–322 ka), pointing to a Pleistocene refugium. High mtDNA differentiation among populations and rare haplotype sharing is consistent with a pattern of strong female philopatry. This is corroborated by isolation by distance tests, which show a significant correlation between mtDNA divergence and distance and a strong effect of rivers as barriers for female movement. Both frequency-based and Bayesian clustering analyses using as many as 25 nuclear microsatellite loci revealed a significant separation among all populations, as well as a small degree of male-mediated gene flow. This study highlights the unique effects of environmental and biological features on the evolutionary history of Bornean orangutans, a highly endangered species particularly vulnerable to future climate and anthropogenic change as an insular endemic. PMID:21098261
Arora, Natasha; Nater, Alexander; van Schaik, Carel P; Willems, Erik P; van Noordwijk, Maria A; Goossens, Benoit; Morf, Nadja; Bastian, Meredith; Knott, Cheryl; Morrogh-Bernard, Helen; Kuze, Noko; Kanamori, Tomoko; Pamungkas, Joko; Perwitasari-Farajallah, Dyah; Verschoor, Ernst; Warren, Kristin; Krützen, Michael
2010-12-14
Sundaland, a tropical hotspot of biodiversity comprising Borneo and Sumatra among other islands, the Malay Peninsula, and a shallow sea, has been subject to dramatic environmental processes. Thus, it presents an ideal opportunity to investigate the role of environmental mechanisms in shaping species distribution and diversity. We investigated the population structure and underlying mechanisms of an insular endemic, the Bornean orangutan (Pongo pygmaeus). Phylogenetic reconstructions based on mtDNA sequences from 211 wild orangutans covering the entire range of the species indicate an unexpectedly recent common ancestor of Bornean orangutans 176 ka (95% highest posterior density, 72-322 ka), pointing to a Pleistocene refugium. High mtDNA differentiation among populations and rare haplotype sharing is consistent with a pattern of strong female philopatry. This is corroborated by isolation by distance tests, which show a significant correlation between mtDNA divergence and distance and a strong effect of rivers as barriers for female movement. Both frequency-based and Bayesian clustering analyses using as many as 25 nuclear microsatellite loci revealed a significant separation among all populations, as well as a small degree of male-mediated gene flow. This study highlights the unique effects of environmental and biological features on the evolutionary history of Bornean orangutans, a highly endangered species particularly vulnerable to future climate and anthropogenic change as an insular endemic.
Combining MLC and SVM Classifiers for Learning Based Decision Making: Analysis and Evaluations
Zhang, Yi; Ren, Jinchang; Jiang, Jianmin
2015-01-01
Maximum likelihood classifier (MLC) and support vector machines (SVM) are two commonly used approaches in machine learning. MLC is based on Bayesian theory in estimating parameters of a probabilistic model, whilst SVM is an optimization-based nonparametric method in this context. It has recently been found that SVM is in some cases equivalent to MLC in probabilistically modeling the learning process. In this paper, MLC and SVM are combined in learning and classification, which helps to yield probabilistic output for SVM and facilitates soft decision making. In total, four groups of data are used for evaluation, covering sonar, vehicle, breast cancer, and DNA sequences. The data samples are characterized as Gaussian or non-Gaussian distributed and as balanced or unbalanced, and are then used for performance assessment in comparing the SVM and the combined SVM-MLC classifier. Results are reported that indicate how the combined classifier may work under various conditions. PMID:26089862
Zanatta, David T; Murphy, Robert W
2006-10-01
Most freshwater mussels (Bivalvia: Unionoida) require a host, usually a fish, to complete their life cycle. Most species of mussels show adaptations that increase the chances of glochidia larvae contacting a host. We investigated the evolutionary relationships of the freshwater mussel tribe Lampsilini, sampling 49 of the approximately 100 extant species and 21 of the 24 recognized genera. Mitochondrial DNA sequence data (COI, 16S, and ND1) were used to create a molecular phylogeny for these species. Parsimony and Bayesian likelihood topologies revealed that the use of an active lure arose early in the evolution of the Lampsiline mussels. The mantle flap lure appears to have been the first to evolve, with other lure types being derived from this condition. Apparently, lures were lost independently in several clades. Hypotheses are discussed as to how some of these lure strategies may have evolved in response to host fish prey preferences.
Hu, Chao; Tian, Huaizhen; Li, Hongqing; Hu, Aiqun; Xing, Fuwu; Bhattacharjee, Avishek; Hsu, Tianchuan; Kumar, Pankaj; Chung, Shihwen
2016-01-01
A molecular phylogeny of Asiatic species of Goodyera (Orchidaceae, Cranichideae, Goodyerinae) based on the nuclear ribosomal internal transcribed spacer (ITS) region and two chloroplast loci (matK and trnL-F) is presented. Thirty-five species represented by 132 samples of Goodyera were analyzed, along with 48 species from 27 other genera, using Pterostylis longifolia and Chloraea gaudichaudii as outgroups. Bayesian inference, maximum parsimony and maximum likelihood methods were used to reveal the intrageneric relationships of Goodyera and its intergeneric relationships to related genera. The results indicate that: 1) Goodyera is not monophyletic; 2) Goodyera could be divided into four sections, viz., Goodyera, Otosepalum, Reticulum and a new section; 3) sect. Reticulum can be further divided into two subsections, viz., Reticulum and Foliosum, whereas sect. Goodyera can in turn be divided into subsection Goodyera and a new subsection. PMID:26927946
Mitochondrial diversity and phylogeography of Acrossocheilus paradoxus (Teleostei: Cyprinidae).
Ju, Yu-Min; Hsu, Kui-Ching; Yang, Jin-Quan; Wu, Jui-Hsien; Li, Shan; Wang, Wei-Kuang; Ding, Fang; Li, Jun; Lin, Hung-Du
2018-01-31
Mitochondrial DNA cytochrome b sequences (1141 bp) from 229 specimens of Acrossocheilus paradoxus from 26 populations were grouped into four lineages. The pairwise genetic distances among these four lineages ranged from 1.57 to 2.37% (mean = 2.00%). Statistical dispersal-vicariance analysis suggests that the ancestral populations were distributed over mainland China and Northern and Western Taiwan. Approximate Bayesian computation approaches show that the three lineages in Taiwan originated from the lineage in mainland China through three colonization routes during two glaciations. The results indicated that the Taiwan Strait was alternately exposed and submerged during glacial and interglacial periods, which contributed to the dispersal and differentiation of populations. Furthermore, the populations of A. paradoxus colonized Taiwan through a land bridge to the north of the Formosa Bank, and the Miaoli Plateau in Taiwan was an important barrier that limited gene exchange between populations on either side.
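A range and mean of pairwise genetic distances, as reported in this abstract, can be sketched as follows. The example uses the uncorrected p-distance (proportion of differing sites between two aligned sequences), which is an assumption: the abstract does not say which distance measure or substitution model the authors used. The toy sequences and the helper names `p_distance` and `pairwise_summary` are hypothetical.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Uncorrected proportion of differing sites between aligned sequences.

    Sites where either sequence has a gap or ambiguity code are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [
        (x, y)
        for x, y in zip(a.upper(), b.upper())
        if x in "ACGT" and y in "ACGT"
    ]
    if not compared:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in compared) / len(compared)


def pairwise_summary(seqs: dict) -> tuple:
    """Return (min, max, mean) p-distance over all pairs of named sequences."""
    values = [
        p_distance(seqs[i], seqs[j]) for i, j in combinations(sorted(seqs), 2)
    ]
    return min(values), max(values), sum(values) / len(values)


if __name__ == "__main__":
    # Toy 10-bp representatives of three hypothetical lineages.
    lineages = {
        "L1": "ACGTACGTAC",
        "L2": "ACGTACGTAA",
        "L3": "ACGAACGTAA",
    }
    lo, hi, mean = pairwise_summary(lineages)
    print(f"range {lo:.2%} to {hi:.2%}, mean {mean:.2%}")
```

Published distances are often corrected with a substitution model (for example Kimura 2-parameter), which gives slightly larger values than the raw proportion computed here at the same level of divergence.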
Li, Min; Tian, Ying; Zhao, Ying; Bu, Wenjun
2012-01-01
Heteroptera, or true bugs, are the largest group of insects with incomplete metamorphosis, and are morphologically diverse and economically important. However, the phylogenetic relationships within Heteroptera are still in dispute, and most previous studies were based on morphological characters or on a single gene (partial or whole 18S rDNA). Moreover, divergence time estimates for Heteroptera have so far relied entirely on the fossil record, and no studies have been performed on molecular divergence rates. Here, for the first time, we used maximum parsimony (MP), maximum likelihood (ML) and Bayesian inference (BI) with multiple genes (18S rDNA, 28S rDNA, 16S rDNA and COI) to estimate phylogenetic relationships among the infraorders, and the Penalized Likelihood (r8s) and Bayesian (BEAST) molecular dating methods were employed to estimate divergence times of higher taxa of this suborder. Major results of the present study included: Nepomorpha was placed as the most basal clade in all six trees (MP trees, ML trees and Bayesian trees of nuclear gene data and four-gene combined data, respectively) with full support values. The sister-group relationship of Cimicomorpha and Pentatomomorpha was also strongly supported. Nepomorpha originated in the early Triassic and the other six infraorders originated in a very short period of time in the middle Triassic. Cimicomorpha and Pentatomomorpha underwent a radiation at family level in the Cretaceous, paralleling the proliferation of the flowering plants. Our results indicated that the higher-group radiations within hemimetabolous Heteroptera were simultaneous with those of holometabolous Coleoptera and Diptera, which took place in the Triassic. While the aquatic habitat was colonized by Nepomorpha already in the Triassic, the Gerromorpha independently adapted to the semi-aquatic habitat in the Early Jurassic.
Zhao, Ying; Bu, Wenjun
2012-01-01
Heteroptera, or true bugs, are the largest group of insects with incomplete metamorphosis, and are morphologically diverse and economically important. However, the phylogenetic relationships within Heteroptera are still in dispute, and most previous studies were based on morphological characters or on a single gene (partial or whole 18S rDNA). Moreover, divergence time estimates for Heteroptera have so far relied entirely on the fossil record, and no studies have been performed on molecular divergence rates. Here, for the first time, we used maximum parsimony (MP), maximum likelihood (ML) and Bayesian inference (BI) with multiple genes (18S rDNA, 28S rDNA, 16S rDNA and COI) to estimate phylogenetic relationships among the infraorders, and the Penalized Likelihood (r8s) and Bayesian (BEAST) molecular dating methods were employed to estimate divergence times of higher taxa of this suborder. Major results of the present study included: Nepomorpha was placed as the most basal clade in all six trees (MP trees, ML trees and Bayesian trees of nuclear gene data and four-gene combined data, respectively) with full support values. The sister-group relationship of Cimicomorpha and Pentatomomorpha was also strongly supported. Nepomorpha originated in the early Triassic and the other six infraorders originated in a very short period of time in the middle Triassic. Cimicomorpha and Pentatomomorpha underwent a radiation at family level in the Cretaceous, paralleling the proliferation of the flowering plants. Our results indicated that the higher-group radiations within hemimetabolous Heteroptera were simultaneous with those of holometabolous Coleoptera and Diptera, which took place in the Triassic. While the aquatic habitat was colonized by Nepomorpha already in the Triassic, the Gerromorpha independently adapted to the semi-aquatic habitat in the Early Jurassic. PMID:22384163
Duminil, Jerome; Brown, Richard P; Ewédjè, Eben-Ezer B K; Mardulyn, Patrick; Doucet, Jean-Louis; Hardy, Olivier J
2013-09-12
The evolutionary events that have shaped biodiversity patterns in the African rainforests are still poorly documented. Past forest fragmentation and ecological gradients have been advocated as important drivers of genetic differentiation but their respective roles remain unclear. Using nuclear microsatellites (nSSRs) and chloroplast non-coding sequences (pDNA), we characterised the spatial genetic structure of Erythrophleum (Fabaceae) forest trees in West and Central Africa (Guinea Region, GR). This widespread genus displays a wide ecological amplitude and taxonomists recognize two forest tree species, E. ivorense and E. suaveolens, which are difficult to distinguish in the field and often confused. Bayesian-clustering applied on nSSRs of a blind sample of 648 specimens identified three major gene pools showing no or very limited introgression. They present parapatric distributions correlated to rainfall gradients and forest types. One gene pool is restricted to coastal evergreen forests and corresponds to E. ivorense; a second one is found in gallery forests from the dry forest zone of West Africa and North-West Cameroon and corresponds to West-African E. suaveolens; the third gene pool occurs in semi-evergreen forests and corresponds to Central African E. suaveolens. These gene pools have mostly unique pDNA haplotypes but they do not form reciprocally monophyletic clades. Nevertheless, pDNA molecular dating indicates that the divergence between E. ivorense and Central African E. suaveolens predates the Pleistocene. Further Bayesian-clustering applied within each major gene pool identified diffuse genetic discontinuities (minor gene pools displaying substantial introgression) at a latitude between 0 and 2°N in Central Africa for both species, and at a longitude between 5° and 8°E for E. ivorense. Moreover, we detected evidence of past population declines which are consistent with historical habitat fragmentation induced by Pleistocene climate changes. 
Overall, deep genetic differentiation (major gene pools) follows ecological gradients that may be at the origin of speciation, while diffuse differentiation (minor gene pools) is tentatively interpreted as the signature of past forest fragmentation induced by past climate changes.
2013-01-01
Background The evolutionary events that have shaped biodiversity patterns in the African rainforests are still poorly documented. Past forest fragmentation and ecological gradients have been advocated as important drivers of genetic differentiation but their respective roles remain unclear. Using nuclear microsatellites (nSSRs) and chloroplast non-coding sequences (pDNA), we characterised the spatial genetic structure of Erythrophleum (Fabaceae) forest trees in West and Central Africa (Guinea Region, GR). This widespread genus displays a wide ecological amplitude and taxonomists recognize two forest tree species, E. ivorense and E. suaveolens, which are difficult to distinguish in the field and often confused. Results Bayesian-clustering applied on nSSRs of a blind sample of 648 specimens identified three major gene pools showing no or very limited introgression. They present parapatric distributions correlated to rainfall gradients and forest types. One gene pool is restricted to coastal evergreen forests and corresponds to E. ivorense; a second one is found in gallery forests from the dry forest zone of West Africa and North-West Cameroon and corresponds to West-African E. suaveolens; the third gene pool occurs in semi-evergreen forests and corresponds to Central African E. suaveolens. These gene pools have mostly unique pDNA haplotypes but they do not form reciprocally monophyletic clades. Nevertheless, pDNA molecular dating indicates that the divergence between E. ivorense and Central African E. suaveolens predates the Pleistocene. Further Bayesian-clustering applied within each major gene pool identified diffuse genetic discontinuities (minor gene pools displaying substantial introgression) at a latitude between 0 and 2°N in Central Africa for both species, and at a longitude between 5° and 8°E for E. ivorense. 
Moreover, we detected evidence of past population declines, which are consistent with historical habitat fragmentation induced by Pleistocene climate changes. Conclusions Overall, deep genetic differentiation (major gene pools) follows ecological gradients that may be at the origin of speciation, while diffuse differentiation (minor gene pools) is tentatively interpreted as the signature of past forest fragmentation induced by past climate changes. PMID:24028582
Acquisition of New DNA Sequences After Infection of Chicken Cells with Avian Myeloblastosis Virus
Shoyab, M.; Baluda, M. A.; Evans, R.
1974-01-01
DNA-RNA hybridization studies between 70S RNA from avian myeloblastosis virus (AMV) and an excess of DNA from (i) AMV-induced leukemic chicken myeloblasts or (ii) a mixture of normal and of congenitally infected K-137 chicken embryos producing avian leukosis viruses revealed the presence of fast- and slow-hybridizing virus-specific DNA sequences. However, the leukemic cells contained twice the level of AMV-specific DNA sequences observed in normal chicken embryonic cells. The fast-reacting sequences were two to three times more numerous in leukemic DNA than in DNA from the mixed embryos. The slow-reacting sequences had a reiteration frequency of approximately 9 and 6 in the two respective systems. Both the fast- and the slow-reacting DNA sequences in leukemic cells exhibited a higher Tm (by 2 °C) than the respective DNA sequences in normal cells. In normal and leukemic cells the slow hybrid sequences appeared to have a Tm which was 2 °C higher than that of the fast hybrid sequences. Individual non-virus-producing chicken embryos, either group-specific antigen positive or negative, contained 40 to 100 copies of the fast sequences and 2 to 6 copies of the slowly hybridizing sequences per cell genome. Normal rat cells did not contain DNA that hybridized with AMV RNA, whereas non-virus-producing rat cells transformed by B-77 avian sarcoma virus contained only the slowly reacting sequences. The results demonstrate that leukemic cells transformed by AMV contain new AMV-specific DNA sequences which were not present before infection. PMID:16789139
Screening for SNPs with Allele-Specific Methylation based on Next-Generation Sequencing Data.
Hu, Bo; Ji, Yuan; Xu, Yaomin; Ting, Angela H
2013-05-01
Allele-specific methylation (ASM) has long been studied but mainly documented in the context of genomic imprinting and X chromosome inactivation. Taking advantage of the next-generation sequencing technology, we conduct a high-throughput sequencing experiment with four prostate cell lines to survey the whole genome and identify single nucleotide polymorphisms (SNPs) with ASM. A Bayesian approach is proposed to model the counts of short reads for each SNP conditional on its genotypes of multiple subjects, leading to a posterior probability of ASM. We flag SNPs with high posterior probabilities of ASM by accounting for multiple comparisons based on posterior false discovery rates. Applying the Bayesian approach to the in-house prostate cell line data, we identify 269 SNPs as candidates of ASM. A simulation study is carried out to demonstrate the quantitative performance of the proposed approach.
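The flagging step described in the abstract above (selecting SNPs by posterior probability while controlling a posterior false discovery rate) can be illustrated with a minimal sketch. The abstract does not spell out the exact rule, so the code below assumes a common choice: rank items by posterior probability and keep the largest top set whose average posterior error, the mean of (1 - posterior), stays at or below a target level. The function name `flag_by_posterior_fdr` and the example numbers are hypothetical.

```python
def flag_by_posterior_fdr(posteriors, alpha=0.05):
    """Return indices flagged so that the expected (posterior) FDR is <= alpha.

    Assumed rule, not taken from the abstract: the expected FDR of a flagged
    set is the mean of (1 - posterior probability) over the items in the set.
    """
    # Rank items from most to least probable.
    order = sorted(range(len(posteriors)), key=lambda i: posteriors[i], reverse=True)
    flagged = []
    error_sum = 0.0  # running sum of (1 - posterior) = expected false discoveries
    for rank, idx in enumerate(order, start=1):
        error_sum += 1.0 - posteriors[idx]
        # The running mean never decreases on a descending ranking,
        # so we can stop at the first item that breaks the bound.
        if error_sum / rank > alpha:
            break
        flagged.append(idx)
    return flagged


# Hypothetical posterior probabilities of ASM for five SNPs.
example = [0.99, 0.20, 0.97, 0.60, 0.95]
print(flag_by_posterior_fdr(example, alpha=0.05))
```

With the hypothetical values above, the three SNPs with posteriors 0.99, 0.97 and 0.95 are flagged; adding the fourth (0.60) would push the expected FDR above 0.05.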
Matsuda, M; Tazumi, A; Kagawa, S; Sekizuka, T; Murayama, O; Moore, JE; Millar, BC
2006-01-01
Background At present, six accessible sequences of 16S rDNA from Taylorella equigenitalis (T. equigenitalis) are available, whose sequence differences occur at a few nucleotide positions. Thus it is important to determine these sequences from additional strains in other countries, if possible, in order to clarify any anomalies regarding 16S rDNA sequence heterogeneity. Here, we clone and sequence the approximate full-length 16S rDNA from additional strains of T. equigenitalis isolated in Japan, Australia and France and compare these sequences to the existing published sequences. Results Clarification of any anomalies regarding 16S rDNA sequence heterogeneity of T. equigenitalis was carried out. Cloning, sequencing and comparison of the approximate full-length 16S rDNA from 17 strains of T. equigenitalis isolated in Japan, Australia and France demonstrated nucleotide sequence differences at six loci in the 1,469-nucleotide sequence. Moreover, 12 polymorphic sites occurred among 23 sequences of the 16S rDNA, including the six reference sequences. Conclusion High sequence similarity (99.5% or more) was observed throughout, except from nucleotide positions 138 to 501 where substitutions and deletions were noted. PMID:16398935
McCutchen-Maloney, Sandra L.
2002-01-01
DNA mutation binding proteins alone and as chimeric proteins with nucleases are used with solid supports to detect DNA sequence variations, DNA mutations and single nucleotide polymorphisms. The solid supports may be flow cytometry beads, DNA chips, glass slides or DNA dipsticks. DNA molecules are coupled to solid supports to form DNA-support complexes. Labeled DNA is used with unlabeled DNA mutation binding proteins such as TthMutS to detect DNA sequence variations, DNA mutations and single nucleotide length polymorphisms by binding, which gives an increase in signal. Unlabeled DNA is utilized with labeled chimeras to detect DNA sequence variations, DNA mutations and single nucleotide length polymorphisms by nuclease activity of the chimera, which gives a decrease in signal.
The genetic diversity of hepatitis A genotype I in Bulgaria
Cella, Eleonora; Golkocheva-Markova, Elitsa N.; Trandeva-Bankova, Diljana; Gregori, Giulia; Bruni, Roberto; Taffon, Stefania; Equestre, Michele; Costantino, Angela; Spoto, Silvia; Curtis, Melissa; Ciccaglione, Anna Rita; Ciccozzi, Massimo; Angeletti, Silvia
2018-01-01
The purpose of this study was to analyze sequences of hepatitis A virus (HAV) Ia and Ib genotypes from Bulgarian patients to investigate the molecular epidemiology of HAV genotype I during the years 2012 to 2014. Around 105 serum samples were collected by the Department of Virology of the National Center of Infectious and Parasitic Diseases in Bulgaria. The sequenced region encompassed the VP1/2A region of the HAV genome. A total of 103 sequences were obtained from the samples. For the phylogenetic analyses, 5 datasets were built to investigate the viral gene in/out flow among distinct HAV subpopulations in different geographic areas and to build a Bayesian dated tree; Bayesian phylogenetic and migration pattern analyses were performed. HAV Ib Bulgarian sequences mostly grouped into a single clade. This indicates that the Bulgarian epidemic is partially compartmentalized. It originated from a limited number of viruses and then spread through fecal-oral local transmission. HAV Ia Bulgarian sequences were intermixed with European sequences, suggesting that an Ia epidemic is not restricted to Bulgaria but can affect other European countries. The time-scaled phylogeny reconstruction showed the root of the tree dating to 2008 for genotype Ib and to 1999 for genotype Ia, with a second epidemic entrance in 2003. The Bayesian skyline plot for genotype Ib showed a slow but continuous growth, sustained by fecal-oral route transmission. For genotype Ia, there was an exponential growth followed by a plateau, which suggests better infection control. Bidirectional viral flow for Ib genotype, involving different Bulgarian areas, was observed, whereas a unidirectional flow from Sofia to Ihtiman for genotype Ia was highlighted, suggesting the fecal-oral transmission route for Ia. PMID:29504993
Methylation patterns of repetitive DNA sequences in germ cells of Mus musculus.
Sanford, J; Forrester, L; Chapman, V; Chandley, A; Hastie, N
1984-03-26
The major and the minor satellite sequences of Mus musculus were undermethylated in both sperm and oocyte DNAs relative to the amount of undermethylation observed in adult somatic tissue DNA. This hypomethylation was specific for satellite sequences in sperm DNA. Dispersed repetitive and low copy sequences show a high degree of methylation in sperm DNA; however, a dispersed repetitive sequence was undermethylated in oocyte DNA. This finding suggests a difference in the amount of total genomic DNA methylation between sperm and oocyte DNA. The methylation levels of the minor satellite sequences did not change during spermiogenesis, and were not associated with the onset of meiosis or a specific stage in sperm development.
Predicting ICU mortality: a comparison of stationary and nonstationary temporal models.
Kayaalp, M.; Cooper, G. F.; Clermont, G.
2000-01-01
OBJECTIVE: This study evaluates the effectiveness of the stationarity assumption in predicting the mortality of intensive care unit (ICU) patients at the ICU discharge. DESIGN: This is a comparative study. A stationary temporal Bayesian network learned from data was compared to a set of (33) nonstationary temporal Bayesian networks learned from data. A process observed as a sequence of events is stationary if its stochastic properties stay the same when the sequence is shifted in a positive or negative direction by a constant time parameter. The temporal Bayesian networks forecast mortalities of patients, where each patient has one record per day. The predictive performance of the stationary model is compared with nonstationary models using the area under the receiver operating characteristics (ROC) curves. RESULTS: The stationary model usually performed best. However, one nonstationary model using large data sets performed significantly better than the stationary model. CONCLUSION: Results suggest that using a combination of stationary and nonstationary models may predict better than using either alone. PMID:11079917
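The verbal definition of stationarity given in the abstract above can be written compactly. The following is the standard textbook statement of strict stationarity, offered only as a gloss; the symbols $X_t$, $t_i$ and $\tau$ are notation introduced here and do not appear in the abstract.

```latex
% A process \{X_t\} is (strictly) stationary if every finite-dimensional
% joint distribution is unchanged by a constant time shift \tau:
P\left(X_{t_1}, X_{t_2}, \dots, X_{t_n}\right)
  = P\left(X_{t_1+\tau}, X_{t_2+\tau}, \dots, X_{t_n+\tau}\right)
\quad \text{for all } n,\ t_1, \dots, t_n,\ \text{and all shifts } \tau .
```

For a temporal Bayesian network with one record per patient per day, this amounts to assuming the same conditional probabilities apply on every day of the ICU stay, whereas a nonstationary model lets them vary with the day.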
Process of labeling specific chromosomes using recombinant repetitive DNA
Moyzis, R.K.; Meyne, J.
1988-02-12
Chromosome preferential nucleotide sequences are first determined from a library of recombinant DNA clones having families of repetitive sequences. Library clones are identified with a low homology with a sequence of repetitive DNA families to which the first clones respectively belong and variant sequences are then identified by selecting clones having a pattern of hybridization with genomic DNA dissimilar to the hybridization pattern shown by the respective families. In another embodiment, variant sequences are selected from a sequence of a known repetitive DNA family. The selected variant sequence is classified as chromosome specific, chromosome preferential, or chromosome nonspecific. Sequences which are classified as chromosome preferential are further sequenced and regions are identified having a low homology with other regions of the chromosome preferential sequence or with known sequences of other family members and consensus sequences of the repetitive DNA families for the chromosome preferential sequences. The selected low homology regions are then hybridized with chromosomes to determine those low homology regions hybridized with a specific chromosome under normal stringency conditions.
Ancient mitochondrial DNA provides high-resolution time scale of the peopling of the Americas.
Llamas, Bastien; Fehren-Schmitz, Lars; Valverde, Guido; Soubrier, Julien; Mallick, Swapan; Rohland, Nadin; Nordenfelt, Susanne; Valdiosera, Cristina; Richards, Stephen M; Rohrlach, Adam; Romero, Maria Inés Barreto; Espinoza, Isabel Flores; Cagigao, Elsa Tomasto; Jiménez, Lucía Watson; Makowski, Krzysztof; Reyna, Ilán Santiago Leboreiro; Lory, Josefina Mansilla; Torrez, Julio Alejandro Ballivián; Rivera, Mario A; Burger, Richard L; Ceruti, Maria Constanza; Reinhard, Johan; Wells, R Spencer; Politis, Gustavo; Santoro, Calogero M; Standen, Vivien G; Smith, Colin; Reich, David; Ho, Simon Y W; Cooper, Alan; Haak, Wolfgang
2016-04-01
The exact timing, route, and process of the initial peopling of the Americas remains uncertain despite much research. Archaeological evidence indicates the presence of humans as far as southern Chile by 14.6 thousand years ago (ka), shortly after the Pleistocene ice sheets blocking access from eastern Beringia began to retreat. Genetic estimates of the timing and route of entry have been constrained by the lack of suitable calibration points and low genetic diversity of Native Americans. We sequenced 92 whole mitochondrial genomes from pre-Columbian South American skeletons dating from 8.6 to 0.5 ka, allowing a detailed, temporally calibrated reconstruction of the peopling of the Americas in a Bayesian coalescent analysis. The data suggest that a small population entered the Americas via a coastal route around 16.0 ka, following previous isolation in eastern Beringia for ~2.4 to 9 thousand years after separation from eastern Siberian populations. Following a rapid movement throughout the Americas, limited gene flow in South America resulted in a marked phylogeographic structure of populations, which persisted through time. All of the ancient mitochondrial lineages detected in this study were absent from modern data sets, suggesting a high extinction rate. To investigate this further, we applied a novel principal components multiple logistic regression test to Bayesian serial coalescent simulations. The analysis supported a scenario in which European colonization caused a substantial loss of pre-Columbian lineages.
Wiens, John J; Kuczynski, Caitlin A; Townsend, Ted; Reeder, Tod W; Mulcahy, Daniel G; Sites, Jack W
2010-12-01
Molecular data offer great potential to resolve the phylogeny of living taxa but can molecular data improve our understanding of relationships of fossil taxa? Simulations suggest that this is possible, but few empirical examples have demonstrated the ability of molecular data to change the placement of fossil taxa. We offer such an example here. We analyze the placement of snakes among squamate reptiles, combining published morphological data (363 characters) and new DNA sequence data (15,794 characters, 22 nuclear loci) for 45 living and 19 fossil taxa. We find several intriguing results. First, some fossil taxa undergo major changes in their phylogenetic position when molecular data are added. Second, most fossil taxa are placed with strong support in the expected clades by the combined data Bayesian analyses, despite each having >98% missing cells and despite recent suggestions that extensive missing data are problematic for Bayesian phylogenetics. Third, morphological data can change the placement of living taxa in combined analyses, even when there is an overwhelming majority of molecular characters. Finally, we find strong but apparently misleading signal in the morphological data, seemingly associated with a burrowing lifestyle in snakes, amphisbaenians, and dibamids. Overall, our results suggest promise for an integrated and comprehensive Tree of Life by combining molecular and morphological data for living and fossil taxa.
Phylogenetic analysis of honey bee behavioral evolution.
Raffiudin, Rika; Crozier, Ross H
2007-05-01
DNA sequences from three mitochondrial (rrnL, cox2, nad2) and one nuclear gene (itpr) from all 9 known honey bee species (Apis), a 10th possible species, Apis dorsata binghami, and three outgroup species (Bombus terrestris, Melipona bicolor and Trigona fimbriata) were used to infer Apis phylogenetic relationships using Bayesian analysis. The dwarf honey bees were confirmed as basal, and the giant and cavity-nesting species to be monophyletic. All nodes were strongly supported except that grouping Apis cerana with A. nigrocincta. Two thousand post-burnin trees from the phylogenetic analysis were used in a Bayesian comparative analysis to explore the evolution of dance type, nest structure, comb structure and dance sound within Apis. The ancestral honey bee species was inferred with high support to have nested in the open, and to have more likely than not had a silent vertical waggle dance and a single comb. The common ancestor of the giant and cavity-dwelling bees is strongly inferred to have had a buzzing vertical directional dance. All pairwise combinations of characters showed strong association, but the multiple comparisons problem reduces the ability to infer associations between states between characters. Nevertheless, a buzzing dance is significantly associated with cavity-nesting, several vertical combs, and dancing vertically, a horizontal dance is significantly associated with a nest with a single comb wrapped around the support, and open nesting with a single pendant comb and a silent waggle dance.
Fehren-Schmitz, Lars; Haak, Wolfgang; Mächtle, Bertil; Masch, Florian; Llamas, Bastien; Cagigao, Elsa Tomasto; Sossna, Volker; Schittek, Karsten; Isla Cuadrado, Johny; Eitel, Bernhard; Reindel, Markus
2014-07-01
Several archaeological studies in the Central Andes have pointed at the temporal coincidence of climatic fluctuations (both long- and short-term) and episodes of cultural transition and changes of socioeconomic structures throughout the pre-Columbian period. Although most scholars explain the connection between environmental and cultural changes by the impact of climatic alterations on the capacities of the ecosystems inhabited by pre-Columbian cultures, direct evidence for assumed demographic consequences is missing so far. In this study, we address directly the impact of climatic changes on the spatial population dynamics of the Central Andes. We use a large dataset of pre-Columbian mitochondrial DNA sequences from the northern Rio Grande de Nasca drainage (RGND) in southern Peru, dating from ∼840 BC to 1450 AD. Alternative demographic scenarios are tested using Bayesian serial coalescent simulations in an approximate Bayesian computational framework. Our results indicate migrations from the lower coastal valleys of southern Peru into the Andean highlands coincident with increasing climate variability at the end of the Nasca culture at ∼640 AD. We also find support for a back-migration from the highlands to the coast coincident with droughts in the southeastern Andean highlands and improvement of climatic conditions on the coast after the decline of the Wari and Tiwanaku empires (∼1200 AD), leading to a genetic homogenization in the RGND and probably southern Peru as a whole. PMID:24979787
Spatio-Temporal History of HIV-1 CRF35_AD in Afghanistan and Iran
Eybpoosh, Sana; Bahrampour, Abbas; Karamouzian, Mohammad; Azadmanesh, Kayhan; Jahanbakhsh, Fatemeh; Mostafavi, Ehsan; Zolala, Farzaneh; Haghdoost, Ali Akbar
2016-01-01
HIV-1 Circulating Recombinant Form 35_AD (CRF35_AD) has an important position in the epidemiological profile of Afghanistan and Iran. Despite the presence of this clade in Afghanistan and Iran for over a decade, our understanding of its origin and dissemination patterns is limited. In this study, we performed a Bayesian phylogeographic analysis to reconstruct the spatio-temporal dispersion pattern of this clade using eligible CRF35_AD gag and pol sequences available in the Los Alamos HIV database (432 sequences available from Iran, 16 sequences available from Afghanistan, and a single CRF35_AD-like pol sequence available from USA). Bayesian Markov Chain Monte Carlo algorithm was implemented in BEAST v1.8.1. Between-country dispersion rates were tested with Bayesian stochastic search variable selection method and were considered significant where Bayes factor values were greater than three. The findings suggested that CRF35_AD sequences were genetically similar to parental sequences from Kenya and Uganda, and to a set of subtype A1 sequences available from Afghan refugees living in Pakistan. Our results also showed that across all phylogenies, Afghan and Iranian CRF35_AD sequences formed a monophyletic cluster (posterior clade credibility> 0.7). The divergence date of this cluster was estimated to be between 1990 and 1992. Within this cluster, a bidirectional dispersion of the virus was observed across Afghanistan and Iran. We could not clearly identify if Afghanistan or Iran first established or received this epidemic, as the root location of this cluster could not be robustly estimated. Three CRF35_AD sequences from Afghan refugees living in Pakistan nested among Afghan and Iranian CRF35_AD branches. However, the CRF35_AD-like sequence available from USA diverged independently from Kenyan subtype A1 sequences, suggesting it not to be a true CRF35_AD lineage. 
Potential factors contributing to viral exchange between Afghanistan and Iran could be injection drug networks and mass migration of Afghan refugees and labourers to Iran, which calls for extensive preventive efforts. PMID:27280293
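The Bayes factor criterion mentioned in the abstract above (dispersion rates considered significant when the Bayes factor exceeds three) can be illustrated with a small sketch. It assumes the usual definition of a Bayes factor for an inclusion indicator, namely posterior odds divided by prior odds, with the posterior inclusion probability estimated as the fraction of MCMC samples in which the rate indicator equals 1. The function name, the sample values and the prior inclusion probability are hypothetical; the prior actually used in the study is not stated in the abstract.

```python
def bssvs_bayes_factor(indicator_samples, prior_prob):
    """Bayes factor for including a rate, from 0/1 indicator samples.

    indicator_samples: per-sample 0/1 values of the rate indicator (hypothetical data).
    prior_prob: prior probability that the indicator equals 1 (an assumption
                supplied by the caller, strictly between 0 and 1).
    """
    if not indicator_samples:
        raise ValueError("need at least one MCMC sample")
    if not 0.0 < prior_prob < 1.0:
        raise ValueError("prior_prob must lie strictly between 0 and 1")
    # Posterior inclusion probability = fraction of samples with indicator 1.
    post = sum(indicator_samples) / len(indicator_samples)
    if post >= 1.0:
        return float("inf")  # rate included in every sample
    posterior_odds = post / (1.0 - post)
    prior_odds = prior_prob / (1.0 - prior_prob)
    return posterior_odds / prior_odds


# Hypothetical example: rate included in 3 of 4 samples, prior inclusion probability 0.25.
bf = bssvs_bayes_factor([1, 1, 1, 0], prior_prob=0.25)
print(bf, bf > 3)
```

In practice the indicator samples would come from the post-burn-in portion of the MCMC log, and with a finite chain a rate included in every sample yields an infinite estimate, which is usually reported as a lower bound.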
Sequence-based prediction of protein-binding sites in DNA: comparative study of two SVM models.
Park, Byungkyu; Im, Jinyong; Tuvshinjargal, Narankhuu; Lee, Wook; Han, Kyungsook
2014-11-01
As many structures of protein-DNA complexes have been known in the past years, several computational methods have been developed to predict DNA-binding sites in proteins. However, its inverse problem (i.e., predicting protein-binding sites in DNA) has received much less attention. One of the reasons is that the differences between the interaction propensities of nucleotides are much smaller than those between amino acids. Another reason is that DNA exhibits less diverse sequence patterns than protein. Therefore, predicting protein-binding DNA nucleotides is much harder than predicting DNA-binding amino acids. We computed the interaction propensity (IP) of nucleotide triplets with amino acids using an extensive dataset of protein-DNA complexes, and developed two support vector machine (SVM) models that predict protein-binding nucleotides from sequence data alone. One SVM model predicts protein-binding nucleotides using DNA sequence data alone, and the other SVM model predicts protein-binding nucleotides using both DNA and protein sequences. In a 10-fold cross-validation with 1519 DNA sequences, the SVM model that uses DNA sequence data only predicted protein-binding nucleotides with an accuracy of 67.0%, an F-measure of 67.1%, and a Matthews correlation coefficient (MCC) of 0.340. With an independent dataset of 181 DNAs that were not used in training, it achieved an accuracy of 66.2%, an F-measure 66.3% and a MCC of 0.324. Another SVM model that uses both DNA and protein sequences achieved an accuracy of 69.6%, an F-measure of 69.6%, and a MCC of 0.383 in a 10-fold cross-validation with 1519 DNA sequences and 859 protein sequences. With an independent dataset of 181 DNAs and 143 proteins, it showed an accuracy of 67.3%, an F-measure of 66.5% and a MCC of 0.329. Both in cross-validation and independent testing, the second SVM model that used both DNA and protein sequence data showed better performance than the first model that used DNA sequence data. 
To the best of our knowledge, this is the first attempt to predict protein-binding nucleotides in a given DNA sequence from the sequence data alone. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
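The three performance measures reported in the abstract above (accuracy, F-measure and Matthews correlation coefficient) follow from a binary confusion matrix. The sketch below uses the standard definitions; the function name and the counts in the example are hypothetical and are not the study's data.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return (accuracy, F-measure, MCC) from binary confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure: harmonic mean of precision and recall.
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    # MCC: correlation between predicted and true labels, in [-1, 1].
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, f_measure, mcc


# Hypothetical counts: 40 true positives, 10 false positives,
# 35 true negatives, 15 false negatives.
acc, f1, mcc = binary_metrics(tp=40, fp=10, tn=35, fn=15)
print(f"accuracy={acc:.3f} F-measure={f1:.3f} MCC={mcc:.3f}")
```

MCC uses all four cells of the confusion matrix, which is one reason it is often reported alongside accuracy when binding and non-binding nucleotides are unevenly represented.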
Fernández, Martina; Ezcurra, Cecilia; Calviño, Carolina I
2017-03-01
Azorella, Laretia and Mulinum are taxonomically complex, and good candidates to study evolutionary radiations in the Andes and the importance of hybridizations. Previous phylogenetic studies of subfamily Azorelloideae agree that Azorella and Mulinum as currently conceived are not monophyletic, and hence a revision of their circumscription is necessary. However, these phylogenies were based only on chloroplast DNA sequence data. Here, phylogenetic relationships within Azorelloideae were inferred using sequence data from five chloroplast DNA regions (rps16 intron, trnQ-rps16, rps16-trnK(UUU) 5'-exon, trnG(GCC)-trnS(GCU) and rpL32-trnL(UAG)), and from nuclear rDNA ITS regions to assess the monophyly of Azorella and Mulinum and discuss generic re-circumscriptions, determine hybridization and radiation events, identify and characterize important lineages, and propose hypotheses on evolution of key morphological characters. In total, 121 accessions of Azorelloideae were analyzed. Phylogenetic analyses of the different genomes were conducted separately and combined, with and without indels, using maximum parsimony, maximum likelihood, and Bayesian methods. To analyze the incongruence between plastid and nuclear-derived trees, a consensus network from strongly supported nodes from cpDNA and ITS trees was constructed. Internode certainty values were calculated to evaluate the reliability of the relationships estimated from the individual cpDNA and ITS data sets and to examine the degree of conflict within the total evidence data set. Azorella and Mulinum were confirmed as not monophyletic. Except for three Azorella species, the remaining azorellas, all species of Mulinum, and Laretia form a monophyletic group, designated here as Andean-Patagonian. The three species of Azorella that are not part of the Andean-Patagonian lineage are grouped together with Huanaca and Schizeilema in another lineage, designated here as Austral. 
Within the Andean-Patagonian clade, three major lineages can be recognized: Diversifolia, Trifurcata, and Spinosum. Each of these lineages has a different leaf morpho-anatomy, Diversifolia species being more mesomorphic compared to species of Trifurcata, and species of Spinosum being the most xeromorphic. Hybridizations have been important in the evolution of the group, especially within Diversifolia, with at least six reticulation events resulting in putative homoploid and allopolyploid hybrid species. Evidence from branch lengths and low sequence divergences suggests a rapid radiation in the Spinosum group, probably associated with the acquisition of wings in the fruits. Copyright © 2017 Elsevier Inc. All rights reserved.
Unexpected diversity and new species in the sponge-Parazoanthidae association in southern Japan.
Montenegro, Javier; Sinniger, Frederic; Reimer, James Davis
2015-08-01
Currently the genera Parazoanthus (family Parazoanthidae) and Epizoanthus (family Epizoanthidae) are the only sponge-associated zoantharians (Cnidaria, Anthozoa). The Parazoanthidae-sponge associations are widely distributed in tropical and subtropical waters from the intertidal to the deep sea in the Atlantic and Indo-Pacific Oceans. However, the taxonomic identification of both parties is often confused due to variable morphology and wide ecological ranges. In particular, Parazoanthidae species diversity remains poorly understood in the Indo-Pacific. In the present study, the diversity of the sponge-zoanthid association in the Indo-Pacific was investigated with 71 Parazoanthidae specimens collected from 29 different locations in Japan (n=22), Australia (n=6) and Florida, USA (n=1). For all specimens morphological analyses were performed and total DNA was extracted and amplified for four DNA markers (COI-mtDNA, mt 16S-rDNA, ITS-rDNA and ALG11-nuDNA). The combined data demonstrate that the specimens of this study are clearly different from those of all described Parazoanthus species, and lead us to erect Umimayanthus gen. n., within family Parazoanthidae, containing the three newly described species U. chanpuru sp. n., U. miyabi sp. n., U. nakama sp. n. The new genus also includes the previously described species U. parasiticus (Duchassaing and Michelotti, 1860; comb. nov.), previously belonging to the genus Parazoanthus. Neighbor joining, maximum likelihood and Bayesian posterior probability phylogenetic trees clearly demonstrate the monophyly of Umimayanthus gen. n. to the exclusion of all outgroup sequences. The phylogenetic results were also compared to morphological features, and polyp sizes, amount of sand content in tissues, types of connections between polyps, and cnidae data, in particular holotrichs-1, were useful in distinguishing the different species within this new genus. 
This new genus can be distinguished from all other Zoantharia by a unique and conserved 9 bp insertion and a 14 bp deletion in the mt 16S-rDNA region. Additionally, compared to Parazoanthus sensu stricto (i.e. P. axinellae [Schmidt, 1862]), Umimayanthus spp. are only found associated with sponges, and have a coenenchyme much less developed than Parazoanthus sensu stricto. Each new species can be distinguished from other congeners by a unique DNA sequence, number of tentacles, maximum sizes of holotrichs, associated sponge morphology, and colony morphology. The identification of the host sponge species is the next logical step in this research as this may also aid in the distinction of Umimayanthus species. Copyright © 2015 Elsevier Inc. All rights reserved.
Peddayelachagiri, Bhavani V.; Paul, Soumya; Nagaraj, Sowmya; Gogoi, Madhurjya; Sripathy, Murali H.; Batra, Harsh V.
2016-01-01
Accurate identification of pathogens with biowarfare importance requires detection tools that specifically differentiate them from near-neighbor species. Burkholderia pseudomallei, the causative agent of the fatal disease melioidosis, is one such biothreat agent whose differentiation from its near-neighbor species is always a challenge. This is because of its phenotypic similarity with other Burkholderia species, which have a widespread geographical distribution with shared environmental niches. Melioidosis is a major public health concern in endemic regions including Southeast Asia and northern Australia. In India, the disease is still considered to be emerging. Prevalence surveys of this saprophytic bacterium in the environment are under-reported in the country. A major challenge in this case is the specific identification and differentiation of B. pseudomallei from the growing list of species of the Burkholderia genus. The objectives of this study included examining the prevalence of B. pseudomallei and near-neighbor species in the coastal region of South India and development of a novel detection tool for specific identification and differentiation of Burkholderia species. Briefly, we analyzed soil and water samples collected from the Malabar coastal region of Kerala, South India for prevalence of B. pseudomallei. The presumptive Burkholderia isolates were identified using a recA PCR assay. The recA PCR assay identified 22 of the total 40 presumptive isolates as Burkholderia strains (22.72% and 77.27% B. pseudomallei and non-pseudomallei Burkholderia respectively). In order to identify each isolate screened, we performed recA and 16S rDNA sequencing. Sequencing of these two genes revealed that the presumptive isolates included B. pseudomallei, non-pseudomallei Burkholderia as well as non-Burkholderia strains. Furthermore, a gene termed D-beta hydroxybutyrate dehydrogenase (bdha) was studied both in silico and in vitro for accurate detection of the Burkholderia genus. 
The optimized bdha-based PCR assay, when evaluated on the Burkholderia isolates of this study, was found to be highly specific (100%), and a detection sensitivity of 10 pg/μl of purified gDNA was recorded. Nucleotide sequence variation of bdha between species, as per in silico analysis, ranged from 8 to 29% within the target stretch of 730 bp, highlighting the potential utility of the bdha sequencing method in specific detection of Burkholderia species. Further, sequencing of the 730 bp bdha PCR amplicon of each Burkholderia strain isolated could differentiate the species, and the data were comparable with recA sequence data of the strains. All sequencing results obtained were submitted to the NCBI database. Bayesian phylogenetic analysis of bdha in comparison with recA and 16S rDNA showed that the bdha gene provided comparable identification of Burkholderia species. PMID:27632353
Enlightenment of Yeast Mitochondrial Homoplasmy: Diversified Roles of Gene Conversion
Ling, Feng; Mikawa, Tsutomu; Shibata, Takehiko
2011-01-01
Mitochondria have their own genomic DNA. Unlike the nuclear genome, each cell contains hundreds to thousands of copies of mitochondrial DNA (mtDNA). The copies of mtDNA tend to have heterogeneous sequences, due to the high frequency of mutagenesis, but are quickly homogenized within a cell (“homoplasmy”) during vegetative cell growth or through a few sexual generations. Heteroplasmy is strongly associated with mitochondrial diseases, diabetes and aging. Recent studies revealed that the yeast cell has the machinery to homogenize mtDNA, using a common DNA processing pathway with gene conversion; i.e., both genetic events are initiated by a double-stranded break, which is processed into 3′ single-stranded tails. One of the tails is base-paired with the complementary sequence of the recipient double-stranded DNA to form a D-loop (homologous pairing), in which repair DNA synthesis is initiated to restore the sequence lost by the breakage. Gene conversion generates sequence diversity, depending on the divergence between the donor and recipient sequences, especially when it occurs among a number of copies of a DNA sequence family with some sequence variations, such as in immunoglobulin diversification in chicken. MtDNA can be regarded as a sequence family, in which the members tend to be diversified by a high frequency of spontaneous mutagenesis. Thus, it would be interesting to determine why and how double-stranded breakage and D-loop formation induce sequence homogenization in mitochondria and sequence diversification in nuclear DNA. We will review the mechanisms and roles of mtDNA homoplasmy, in contrast to nuclear gene conversion, which diversifies gene and genome sequences, to provide clues toward understanding how the common DNA processing pathway results in such divergent outcomes. PMID:24710143
"First generation" automated DNA sequencing technology.
Slatko, Barton E; Kieleczawa, Jan; Ju, Jingyue; Gardner, Andrew F; Hendrickson, Cynthia L; Ausubel, Frederick M
2011-10-01
Beginning in the 1980s, automation of DNA sequencing has greatly increased throughput, reduced costs, and enabled large projects to be completed more easily. The development of automation technology paralleled the development of other aspects of DNA sequencing: better enzymes and chemistry, separation and imaging technology, sequencing protocols, robotics, and computational advancements (including base-calling algorithms with quality scores, database developments, and sequence analysis programs). Despite the emergence of high-throughput sequencing platforms, automated Sanger sequencing technology remains useful for many applications. This unit provides background and a description of the "First-Generation" automated DNA sequencing technology. It also includes protocols for using the current Applied Biosystems (ABI) automated DNA sequencing machines. © 2011 by John Wiley & Sons, Inc.
Influence of DNA sequence on the structure of minicircles under torsional stress
Wang, Qian; Irobalieva, Rossitza N.; Chiu, Wah; Schmid, Michael F.; Fogg, Jonathan M.; Zechiedrich, Lynn
2017-01-01
The sequence dependence of the conformational distribution of DNA under various levels of torsional stress is an important unsolved problem. Combining theory and coarse-grained simulations shows that the DNA sequence and a structural correlation due to topology constraints of a circle are the main factors that dictate the 3D structure of a 336 bp DNA minicircle under torsional stress. We found that DNA minicircle topoisomers can have multiple bend locations under high torsional stress and that the positions of these sharp bends are determined by the sequence, and by a positive mechanical correlation along the sequence. We showed that simulations and theory are able to provide sequence-specific information about individual DNA minicircles observed by cryo-electron tomography (cryo-ET). We provided a sequence-specific cryo-ET tomogram fitting of DNA minicircles, registering the sequence within the geometric features. Our results indicate that the conformational distribution of minicircles under torsional stress can be designed, which has important implications for using minicircle DNA for gene therapy. PMID:28609782
Analysis of DNA Sequences by an Optical Time-Integrating Correlator: Proof-of-Concept Experiments.
1992-05-01
DNA ANALYSIS STRATEGY 4 2.1 Representation of DNA Bases 4 2.2 DNA Analysis Strategy 6 3.0 CUSTOM GENERATORS FOR DNA SEQUENCES 10 3.1 Hardware Design 10...of the DNA bases where each base is represented by a 7-bits long pseudorandom sequence. 5 Figure 4: Coarse analysis of a DNA sequence. 7 Figure 5: Fine...a 20-bases long database. 32 xiii LIST OF TABLES PAGE Table 1: Short representations of the DNA bases where each base is represented by 7-bits long
Campagna, Leonardo; Van Coeverden de Groot, Peter J; Saunders, Brenda L; Atkinson, Stephen N; Weber, Diana S; Dyck, Markus G; Boag, Peter T; Lougheed, Stephen C
2013-09-01
As global warming accelerates the melting of Arctic sea ice, polar bears (Ursus maritimus) must adapt to a rapidly changing landscape. This process will necessarily alter the species distribution together with population dynamics and structure. Detailed knowledge of these changes is crucial to delineating conservation priorities. Here, we sampled 361 polar bears from across the center of the Canadian Arctic Archipelago spanning the Gulf of Boothia (GB) and M'Clintock Channel (MC). We use DNA microsatellites and mitochondrial control region sequences to quantify genetic differentiation, estimate gene flow, and infer population history. Two populations, roughly coincident with GB and MC, are significantly differentiated at both nuclear (FST = 0.01) and mitochondrial (ΦST = 0.47; FST = 0.29) loci, allowing Bayesian clustering analyses to assign individuals to either group. Our data imply that the causes of the mitochondrial and nuclear genetic patterns differ. Analysis of mtDNA reveals that the matrilineal structure dates at least to the Holocene, and is common to individuals throughout the species' range. These mtDNA differences probably reflect both genetic drift and historical colonization dynamics. In contrast, the differentiation inferred from microsatellites is only on the scale of hundreds of years, possibly reflecting contemporary impediments to gene flow. Taken together, our data suggest that gene flow is insufficient to homogenize the GB and MC populations and support the designation of GB and MC as separate polar bear conservation units. Our study also provides a striking example of how nuclear DNA and mtDNA capture different aspects of a species' demographic history.
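To make the differentiation statistic quoted above concrete, the following is a minimal sketch of Wright's FST for a single biallelic locus in two populations. It assumes equal population sizes and known allele frequencies; the function name `fst_two_pops` is hypothetical, and this is not the estimator used in the study, which worked with multilocus microsatellite and mitochondrial data.

```python
def fst_two_pops(p1: float, p2: float) -> float:
    """Wright's F_ST at one biallelic locus for two equally sized populations.

    p1 and p2 are the frequencies of the same allele in each population.
    Illustrative only: real analyses use multilocus, sample-size-corrected
    estimators.
    """
    p_bar = (p1 + p2) / 2.0
    # Expected heterozygosity if both populations were pooled.
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    # Mean expected heterozygosity within each population.
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    if h_total == 0.0:
        # The locus is fixed for the same allele everywhere: no differentiation.
        return 0.0
    return (h_total - h_within) / h_total


if __name__ == "__main__":
    # Hypothetical allele frequencies, not taken from the polar bear data.
    print(round(fst_two_pops(0.6, 0.4), 4))
```

Values near 0 indicate nearly identical allele frequencies, and values near 1 indicate populations fixed for different alleles, which is why a nuclear FST of 0.01 and a mitochondrial FST of 0.29 describe very different degrees of structure.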
Zeng, Yan-Fei; Zhang, Jian-Guo; Abuduhamiti, Bawerjan; Wang, Wen-Ting; Jia, Zhi-Qing
2018-05-25
The effects of historical geology and climatic events on the evolution of plants around the Qinghai-Tibetan Plateau region have been at the center of debate for years. To identify the influence of the uplift of the Tianshan Mountains and/or climatic oscillations on the evolution of plants in arid northwest China, we investigated the phylogeography of the Euphrates poplar (Populus euphratica) using chloroplast DNA (cpDNA) sequences and nuclear microsatellites, and estimated its historical distribution using Ecological Niche Modeling (ENM). We found that the Euphrates poplar differed from another desert poplar, P. pruinosa, in both nuclear and chloroplast DNA. The low clonal diversity in both populations reflected the low regeneration rate by seed/seedlings in many locations. Both cpDNA and nuclear markers demonstrated a clear divergence between the Euphrates poplar populations from northern and southern Xinjiang regions. The divergence time was estimated to be early Pleistocene based on cpDNA, and late Pleistocene using an Approximate Bayesian Computation analysis based on microsatellites. Estimated gene flow was low between these two regions, and the limited gene flow occurred mainly via dispersal from eastern regions. ENM analysis supported a wider distribution of the Euphrates poplar at 3 Ma, but a more constricted distribution during both the glacial period and the interglacial period. These results indicate that the deformation of the Tianshan Mountains has impeded gene flow of the Euphrates poplar populations from northern and southern Xinjiang, and the distribution constriction due to climatic oscillations further accelerated the divergence of populations from these regions. To protect the desert poplars, more effort is needed to encourage seed germination and seedling establishment, and to conserve endemic gene resources in the northern Xinjiang region.
Zha, Hong-Guang; Milne, Richard I.; Sun, Hang
2010-01-01
Background and Aims Rhododendron (Ericaceae) is a large woody genus in which hybridization is thought to play an important role in evolution and speciation, particularly in the Sino-Himalaya region where many interfertile species often occur sympatrically. Rhododendron agastum, a putative hybrid species, occurs in China, western Yunnan Province, in mixed populations with R. irroratum and R. delavayi. Methods Material of these taxa from two sites 400 km apart (ZhuJianYuan, ZJY and HuaDianBa, HDB) was examined using cpDNA and internal transcribed spacer (ITS) sequences, and amplified fragment length polymorphism (AFLP) loci, to test the possibility that R. agastum was in fact a hybrid between two of the other species. Chloroplast trnL-F and trnS-trnG sequences together distinguished R. irroratum, R. delavayi and some material of R. decorum, which is also considered a putative parent of R. agastum. Key Results All 14 R. agastum plants from the HDB site had the delavayi cpDNA haplotype, whereas at the ZJY site 17 R. agastum plants had this haplotype and four had the R. irroratum haplotype. R. irroratum and R. delavayi are distinguished by five unequivocal point mutations in their ITS sequences; every R. agastum accession had an additive pattern (double peaks) at each of these sites. Data from AFLP loci were acquired for between ten and 21 plants of each taxon from each site, and were analysed using a Bayesian approach implemented by the program NewHybrids. The program confirmed the identity of all accessions of R. delavayi, and all R. irroratum except one, which was probably a backcross. All R. agastum from HDB and 19 of 21 from ZJY were classified as F1 hybrids; the other two could not be assigned a class. Conclusions Rhododendron agastum represents populations of hybrids between R. irroratum and R. delavayi, which comprise mostly or only F1s, at the two sites examined. 
The sites differ in that at HDB there was no detected variation in cpDNA type or hybrid class, whereas at ZJY there was variation in both. PMID:19887474
Laser mass spectrometry for DNA sequencing, disease diagnosis, and fingerprinting
NASA Astrophysics Data System (ADS)
Chen, C. H. Winston; Taranenko, N. I.; Zhu, Y. F.; Chung, C. N.; Allman, S. L.
1997-05-01
Since laser mass spectrometry has the potential for achieving very fast DNA analysis, we recently applied it to DNA sequencing, DNA typing for fingerprinting, and DNA screening for disease diagnosis. Two different approaches for sequencing DNA have been successfully demonstrated. One is to sequence DNA with DNA ladders produced from Sanger's enzymatic method. The other is to do direct sequencing without DNA ladders. The need for quick DNA typing for identification purposes is critical for forensic application. Our preliminary results indicate laser mass spectrometry can possibly be used for rapid DNA fingerprinting applications at a much lower cost than gel electrophoresis. Population screening for certain genetic diseases can be a very efficient step to reducing medical costs through prevention. Since laser mass spectrometry can provide very fast DNA analysis, we applied laser mass spectrometry to disease diagnosis. Clinical samples with both base deletion and point mutation have been tested with complete success.
Colombo, M M; Swanton, M T; Donini, P; Prescott, D M
1984-01-01
Oxytricha nova is a hypotrichous ciliate with micronuclei and macronuclei. Micronuclei, which contain large, chromosomal-sized DNA, are genetically inert but undergo meiosis and exchange during cell mating. Macronuclei, which contain only small, gene-sized DNA molecules, provide all of the nuclear RNA needed to run the cell. After cell mating the macronucleus is derived from a micronucleus, a derivation that includes excision of the genes from chromosomes and elimination of the remaining DNA. The eliminated DNA includes all of the repetitious sequences and approximately 95% of the unique sequences. We cloned large restriction fragments from the micronucleus that confer replication ability on a replication-deficient plasmid in Saccharomyces cerevisiae. Sequences that confer replication ability are called autonomously replicating sequences. The frequency and effectiveness of autonomously replicating sequences in micronuclear DNA are similar to those reported for DNAs of other organisms introduced into yeast cells. Of the 12 micronuclear fragments with autonomously replicating sequence activity, 9 also showed homology to macronuclear DNA, indicating that they contain a macronuclear gene sequence. We conclude from this that autonomously replicating sequence activity is nonrandomly distributed throughout micronuclear DNA and is preferentially associated with those regions of micronuclear DNA that contain genes. PMID:6092934
DNA sequence-dependent mechanics and protein-assisted bending in repressor-mediated loop formation
Boedicker, James Q.; Garcia, Hernan G.; Johnson, Stephanie; Phillips, Rob
2014-01-01
As the chief informational molecule of life, DNA is subject to extensive physical manipulations. The energy required to deform double-helical DNA depends on sequence, and this mechanical code of DNA influences gene regulation, such as through nucleosome positioning. Here we examine the sequence-dependent flexibility of DNA in bacterial transcription factor-mediated looping, a context for which the role of sequence remains poorly understood. Using a suite of synthetic constructs repressed by the Lac repressor and two well-known sequences that show large flexibility differences in vitro, we make precise statistical mechanical predictions as to how DNA sequence influences loop formation and test these predictions using in vivo transcription and in vitro single-molecule assays. Surprisingly, sequence-dependent flexibility does not affect in vivo gene regulation. By theoretically and experimentally quantifying the relative contributions of sequence and the DNA-bending protein HU to DNA mechanical properties, we reveal that bending by HU dominates DNA mechanics and masks intrinsic sequence-dependent flexibility. Such a quantitative understanding of how mechanical regulatory information is encoded in the genome will be a key step towards a predictive understanding of gene regulation at single-base pair resolution. PMID:24231252
Lee, Lobin A; Arvai, Kevin J; Jones, Dan
2015-07-01
As DNA sequencing of multigene panels becomes routine for cancer samples in the clinical laboratory, an efficient process for classifying variants has become more critical. Determining which germline variants are significant for cancer disposition and which somatic mutations are integral to cancer development or therapy response remains difficult, even for well-studied genes such as BRCA1 and TP53. We compare and contrast the general principles and lines of evidence commonly used to distinguish the significance of cancer-associated germline and somatic genetic variants. The factors important in each step of the analysis pipeline are reviewed, as are some of the publicly available annotation tools. Given the range of indications and uses of cancer sequencing assays, including diagnosis, staging, prognostication, theranostics, and residual disease detection, the need for flexible methods for scoring of variants is discussed. The usefulness of protein prediction tools and multimodal risk-based or Bayesian approaches is highlighted. Using TET2 variants encountered in hematologic neoplasms, several examples of this multifactorial approach to classifying sequence variants of unknown significance are presented. Although there are still significant gaps in the publicly available data for many cancer genes that limit the broad application of explicit algorithms for variant scoring, the elements of a more rigorous model are outlined. Copyright © 2015 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.
El-Sherry, Shiem; Ogedengbe, Mosun E; Hafeez, Mian A; Barta, John R
2013-07-01
Multiple 18S rDNA sequences were obtained from two single-oocyst-derived lines of each of Eimeria meleagrimitis and Eimeria adenoeides. After analysing the 15 new 18S rDNA sequences from two lines of E. meleagrimitis and 17 new sequences from two lines of E. adenoeides, there were clear indications that divergent, paralogous 18S rDNA copies existed within the nuclear genome of E. meleagrimitis. In contrast, mitochondrial cytochrome c oxidase subunit I (COI) partial sequences from all lines of a particular Eimeria sp. were identical and, in phylogenetic analyses, COI sequences clustered unambiguously in monophyletic and highly-supported clades specific to individual Eimeria sp. Phylogenetic analysis of the new 18S rDNA sequences from E. meleagrimitis showed that they formed two distinct clades: Type A with four new sequences; and Type B with nine new sequences; both Types A and B sequences were obtained from each of the single-oocyst-derived lines of E. meleagrimitis. Together these rDNA types formed a well-supported E. meleagrimitis clade. Types A and B 18S rDNA sequences from E. meleagrimitis had a mean sequence identity of only 97.4% whereas mean sequence identity within types was 99.1-99.3%. The observed intraspecific sequence divergence among E. meleagrimitis 18S rDNA sequence types was even higher (approximately 2.6%) than the interspecific sequence divergence present between some well-recognized species such as Eimeria tenella and Eimeria necatrix (1.1%). Our observations suggest that, unlike COI sequences, 18S rDNA sequences are not reliable molecular markers to be used alone for species identification with coccidia, although 18S rDNA sequences have clear utility for phylogenetic reconstruction of apicomplexan parasites at the genus and higher taxonomic ranks. Copyright © 2013. Published by Elsevier Ltd.
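The identity and divergence figures in the abstract above are complementary (97.4% identity corresponds to roughly 2.6% divergence). A minimal sketch of how such a pairwise value can be computed from two aligned sequences follows. The function name `pairwise_identity` is hypothetical, the input sequences are toy strings rather than Eimeria data, and skipping gap columns is an assumption, since the abstract does not say how gaps were treated.

```python
def pairwise_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Fraction of identical positions between two aligned sequences.

    Columns where either sequence has a gap are skipped (an assumption;
    other conventions count gaps as mismatches).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return matches / compared


if __name__ == "__main__":
    # Toy alignment, not real 18S rDNA data.
    identity = pairwise_identity("ACGTACGT", "ACGTACGA")
    print(f"identity = {identity:.3f}, divergence = {1 - identity:.3f}")
```

A mean identity between or within sequence types, as reported in the abstract, would be the average of this quantity over all relevant sequence pairs.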
Bayesian models: A statistical primer for ecologists
Hobbs, N. Thompson; Hooten, Mevin B.
2015-01-01
Bayesian modeling has become an indispensable tool for ecological research because it is uniquely suited to deal with complexity in a statistically coherent way. This textbook provides a comprehensive and accessible introduction to the latest Bayesian methods, in language ecologists can understand. Unlike other books on the subject, this one emphasizes the principles behind the computations, giving ecologists a big-picture understanding of how to implement this powerful statistical approach. Bayesian Models is an essential primer for non-statisticians. It begins with a definition of probability and develops a step-by-step sequence of connected ideas, including basic distribution theory, network diagrams, hierarchical models, Markov chain Monte Carlo, and inference from single and multiple models. This unique book places less emphasis on computer coding, favoring instead a concise presentation of the mathematical statistics needed to understand how and why Bayesian analysis works. It also explains how to write out properly formulated hierarchical Bayesian models and use them in computing, research papers, and proposals. This primer enables ecologists to understand the statistical principles behind Bayesian modeling and apply them to research, teaching, policy, and management. Presents the mathematical and statistical foundations of Bayesian modeling in language accessible to non-statisticians. Covers basic distribution theory, network diagrams, hierarchical models, Markov chain Monte Carlo, and more. Deemphasizes computer coding in favor of basic principles. Explains how to write out properly factored statistical expressions representing Bayesian models.
Abdul-Latiff, Muhammad Abu Bakar; Ruslin, Farhani; Fui, Vun Vui; Abu, Mohd-Hashim; Rovie-Ryan, Jeffrine Japning; Abdul-Patah, Pazil; Lakim, Maklarin; Roos, Christian; Yaakop, Salmah; Md-Zain, Badrul Munir
2014-01-01
Phylogenetic relationships among Malaysia’s long-tailed macaques have yet to be established, despite abundant genetic studies of the species worldwide. The aims of this study are to examine the phylogenetic relationships of Macaca fascicularis in Malaysia and to test its classification as a morphological subspecies. A total of 25 genetic samples of M. fascicularis yielding 383 bp of Cytochrome b (Cyt b) sequences were used in phylogenetic analysis along with one sample each of M. nemestrina and M. arctoides used as outgroups. Sequence character analysis reveals that Cyt b locus is a highly conserved region with only 23% parsimony informative characters detected among ingroups. Further analysis indicates a clear separation between populations originating from different regions; the Malay Peninsula versus Borneo Insular, the East Coast versus West Coast of the Malay Peninsula, and the island versus mainland Malay Peninsula populations. Phylogenetic trees (NJ, MP and Bayesian) portray a consistent clustering paradigm as Borneo’s population was distinguished from Peninsula’s population (99% and 100% bootstrap value in NJ and MP respectively and 1.00 posterior probability in Bayesian trees). The East coast population was separated from other Peninsula populations (64% in NJ, 66% in MP and 0.53 posterior probability in Bayesian). West coast populations were divided into 2 clades: the North-South (47%/54% in NJ, 26/26% in MP and 1.00/0.80 posterior probability in Bayesian) and Island-Mainland (93% in NJ, 90% in MP and 1.00 posterior probability in Bayesian). The results confirm the previous morphological assignment of 2 subspecies, M. f. fascicularis and M. f. argentimembris, in the Malay Peninsula. These populations should be treated as separate genetic entities in order to conserve the genetic diversity of Malaysia’s M. fascicularis. These findings are crucial in aiding the conservation management and translocation process of M. fascicularis populations in Malaysia. PMID:24899832
Song, Ha Yeun; Mabuchi, Kohji; Satoh, Takashi P; Moore, Jon A; Yamanoue, Yusuke; Miya, Masaki; Nishida, Mutsumi
2014-06-01
Percomorpha, comprising about 60% of modern teleost fishes, has been described as the "(unresolved) bush at the top" of the tree, with its intrarelationships still being ambiguous owing to huge diversity (>15,000 species). Recent molecular phylogenetic studies based on extensive taxon and character sampling, however, have revealed a number of unexpected clades of Percomorpha, one of which is composed of Syngnathoidei (seahorses, pipefishes, and their relatives) plus several groups distributed across three different orders. To circumscribe the clade more definitely, we sampled several candidate taxa with reference to the previous studies and newly determined whole mitochondrial genome (mitogenome) sequences for 16 percomorph species across syngnathoids, dactylopterids, and their putatively closely-related fishes (Mullidae, Callionymoidei, Malacanthidae). Unambiguously aligned sequences (13,872 bp) from those 16 species plus 78 percomorphs and two outgroups (total 96 species) were subjected to partitioned Bayesian and maximum likelihood analyses. The resulting trees revealed a highly supported clade comprising seven families in Syngnathoidei (Gasterosteiformes), Dactylopteridae (Scorpaeniformes), Mullidae in Percoidei and two families in Callionymoidei (Perciformes). We herein propose to call this clade "Syngnathiformes" following the latest nuclear DNA studies with some revisions on the included families. Copyright © 2014 Elsevier B.V. All rights reserved.
Phylogeographic History and Gene Flow Among Giant Galápagos Tortoises on Southern Isabela Island
Ciofi, Claudio; Wilson, Gregory A.; Beheregaray, Luciano B.; Marquez, Cruz; Gibbs, James P.; Tapia, Washington; Snell, Howard L.; Caccone, Adalgisa; Powell, Jeffrey R.
2006-01-01
Volcanic islands represent excellent models with which to study the effect of vicariance on colonization and dispersal, particularly when the evolution of genetic diversity mirrors the sequence of geological events that led to island formation. Phylogeographic inference, however, can be particularly challenging for recent dispersal events within islands, where the antagonistic effects of land bridge formation and vicariance can affect movements of organisms with limited dispersal ability. We investigated levels of genetic divergence and recovered signatures of dispersal events for 631 Galápagos giant tortoises across the volcanoes of Sierra Negra and Cerro Azul on the island of Isabela. These volcanoes are among the most recent formations in the Galápagos (<0.7 million years), and previous studies based on genetic and morphological data could not recover a consistent pattern of lineage sorting. We integrated nested clade analysis of mitochondrial DNA control region sequences, to infer historical patterns of colonization, and a novel Bayesian multilocus genotyping method for recovering evidence of recent migration across volcanoes using eleven microsatellite loci. These genetic studies illuminate taxonomic distinctions as well as provide guidance to possible repatriation programs aimed at countering the rapid population declines of these spectacular animals. PMID:16387883
Detecting Recombination Hotspots from Patterns of Linkage Disequilibrium.
Wall, Jeffrey D; Stevison, Laurie S
2016-08-09
With recent advances in DNA sequencing technologies, it has become increasingly easy to use whole-genome sequencing of unrelated individuals to assay patterns of linkage disequilibrium (LD) across the genome. One type of analysis that is commonly performed is to estimate local recombination rates and identify recombination hotspots from patterns of LD. One method for detecting recombination hotspots, LDhot, has been used in a handful of species to further our understanding of the basic biology of recombination. For the most part, the effectiveness of this method (e.g., power and false positive rate) is unknown. In this study, we run extensive simulations to compare the effectiveness of three different implementations of LDhot. We find large differences in the power and false positive rates of these different approaches, as well as a strong sensitivity to the window size used (with smaller window sizes leading to more accurate estimation of hotspot locations). We also compared our LDhot simulation results with comparable simulation results obtained from a Bayesian maximum-likelihood approach for identifying hotspots. Surprisingly, we found that the latter computationally intensive approach had substantially lower power over the parameter values considered in our simulations. Copyright © 2016 Wall and Stevison.
Park, Eunji; Hwang, Dae-Sik; Lee, Jae-Seong; Song, Jun-Im; Seo, Tae-Kun; Won, Yong-Jin
2012-01-01
The phylum Cnidaria is comprised of remarkably diverse and ecologically significant taxa, such as the reef-forming corals, and occupies a basal position in metazoan evolution. The origin of this phylum and the most recent common ancestors (MRCAs) of its modern classes remain mostly unknown, although scattered fossil evidence provides some insights on this topic. Here, we investigate the molecular divergence times of the major taxonomic groups of Cnidaria (27 Hexacorallia, 16 Octocorallia, and 5 Medusozoa) on the basis of mitochondrial DNA sequences of 13 protein-coding genes. For this analysis, the complete mitochondrial genomes of seven octocoral and two scyphozoan species were newly sequenced and combined with all available mitogenomic data from GenBank. Five reliable fossil dates were used to calibrate the Bayesian estimates of divergence times. The molecular evidence suggests that cnidarians originated 741 million years ago (Ma) (95% credible region of 686-819), and the major taxa diversified prior to the Cambrian (543 Ma). The Octocorallia and Scleractinia may have originated from radiations of survivors of the Permian-Triassic mass extinction, which matches their fossil record well. Copyright © 2011 Elsevier Inc. All rights reserved.
Shah, Kushani; Thomas, Shelby; Stein, Arnold
2013-01-01
In this report, we describe a 5-week laboratory exercise for undergraduate biology and biochemistry students in which students learn to sequence DNA and to genotype their DNA for selected single nucleotide polymorphisms (SNPs). Students use miniaturized DNA sequencing gels that require approximately 8 min to run. The students perform G, A, T, C Sanger sequencing reactions. They prepare and run the gels, perform Southern blots (which require only 10 min), and detect sequencing ladders using a colorimetric detection system. Students enlarge their sequencing ladders from digital images of their small nylon membranes, and read the sequence manually. They compare their reads with the actual DNA sequence using BLAST2. After mastering the DNA sequencing system, students prepare their own DNA from a cheek swab, polymerase chain reaction-amplify a region of their DNA that encompasses a SNP of interest, and perform sequencing to determine their genotype at the SNP position. A family pedigree can also be constructed. The SNP chosen by the instructor was rs17822931, which is in the ABCC11 gene and is the determinant of human earwax type. Genotypes at the rs17822931 site vary in different ethnic populations. © 2013 by The International Union of Biochemistry and Molecular Biology.
Kröber, Magdalena; Bekel, Thomas; Diaz, Naryttza N; Goesmann, Alexander; Jaenicke, Sebastian; Krause, Lutz; Miller, Dimitri; Runte, Kai J; Viehöver, Prisca; Pühler, Alfred; Schlüter, Andreas
2009-06-01
The phylogenetic structure of the microbial community residing in a fermentation sample from a production-scale biogas plant fed with maize silage, green rye and liquid manure was analysed by an integrated approach using clone library sequences and metagenome sequence data obtained by 454-pyrosequencing. Sequencing of 109 clones from a bacterial and an archaeal 16S-rDNA amplicon library revealed that the obtained nucleotide sequences are similar but not identical to 16S-rDNA database sequences derived from different anaerobic environments including digestors and bioreactors. Most of the bacterial 16S-rDNA sequences could be assigned to the phylum Firmicutes with the most abundant class Clostridia and to the class Bacteroidetes, whereas most archaeal 16S-rDNA sequences cluster close to the methanogen Methanoculleus bourgensis. Further sequences of the archaeal library most probably represent so far non-characterised species within the genus Methanoculleus. A similar result derived from phylogenetic analysis of mcrA clone sequences. The mcrA gene product encodes the alpha-subunit of methyl-coenzyme-M reductase involved in the final step of methanogenesis. BLASTn analysis applying stringent settings resulted in assignment of 16S-rDNA metagenome sequence reads to 62 16S-rDNA amplicon sequences thus enabling frequency of abundance estimations for 16S-rDNA clone library sequences. Ribosomal Database Project (RDP) Classifier processing of metagenome 16S-rDNA reads revealed abundance of the phyla Firmicutes, Bacteroidetes and Euryarchaeota and the orders Clostridiales, Bacteroidales and Methanomicrobiales. Moreover, a large fraction of 16S-rDNA metagenome reads could not be assigned to lower taxonomic ranks, demonstrating that numerous microorganisms in the analysed fermentation sample of the biogas plant are still unclassified or unknown.
Bayesian Recurrent Neural Network for Language Modeling.
Chien, Jen-Tzung; Ku, Yuan-Chu
2016-02-01
A language model (LM) is calculated as the probability of a word sequence that provides the solution to word prediction for a variety of information systems. A recurrent neural network (RNN) is a powerful tool for learning the large-span dynamics of a word sequence in the continuous space. However, the training of the RNN-LM is an ill-posed problem because of too many parameters from a large dictionary size and a high-dimensional hidden layer. This paper presents a Bayesian approach to regularize the RNN-LM and apply it to continuous speech recognition. We aim to penalize an overly complicated RNN-LM by compensating for the uncertainty of the estimated model parameters, which is represented by a Gaussian prior. The objective function in a Bayesian classification network is formed as the regularized cross-entropy error function. The regularized model is constructed not only by calculating the regularized parameters according to the maximum a posteriori criterion but also by estimating the Gaussian hyperparameter by maximizing the marginal likelihood. A rapid approximation to a Hessian matrix is developed to implement the Bayesian RNN-LM (BRNN-LM) by selecting a small set of salient outer-products. The proposed BRNN-LM achieves a sparser model than the RNN-LM. Experiments on different corpora show the robustness of system performance by applying the rapid BRNN-LM under different conditions.
2013-01-01
Background: Mitochondrial DNA (mtDNA) typing can be a useful aid for identifying people from compromised samples when nuclear DNA is too damaged, degraded or below detection thresholds for routine short tandem repeat (STR)-based analysis. Standard mtDNA typing, focused on PCR amplicon sequencing of the control region (HVS I and HVS II), is limited by the resolving power of this short sequence, which misses up to 70% of the variation present in the mtDNA genome. Methods: We used in-solution hybridisation-based DNA capture (using DNA capture probes prepared from modern human mtDNA) to recover mtDNA from post-mortem human remains in which the majority of DNA is both highly fragmented (<100 base pairs in length) and chemically damaged. The method ‘immortalises’ the finite quantities of DNA in valuable extracts as DNA libraries, which is followed by the targeted enrichment of endogenous mtDNA sequences and characterisation by next-generation sequencing (NGS). Results: We sequenced whole mitochondrial genomes for human identification from samples where standard nuclear STR typing produced only partial profiles or demonstrably failed and/or where standard mtDNA hypervariable region sequences lacked resolving power. Multiple rounds of enrichment can substantially improve coverage and sequencing depth of mtDNA genomes from highly degraded samples. The application of this method has led to the reliable mitochondrial sequencing of human skeletal remains from unidentified World War Two (WWII) casualties approximately 70 years old and from archaeological remains (up to 2,500 years old). Conclusions: This approach has potential applications in forensic science, historical human identification cases, archived medical samples, kinship analysis and population studies. In particular the methodology can be applied to any case, involving human or non-human species, where whole mitochondrial genome sequences are required to provide the highest level of maternal lineage discrimination. Multiple rounds of in-solution hybridisation-based DNA capture can retrieve whole mitochondrial genome sequences from even the most challenging samples. PMID:24289217
Craig, Erica H.; Adams, Jennifer R.; Waits, Lisette P.; Fuller, Mark R.; Whittington, Diana M.
2016-01-01
Understanding the genetics of a population is a critical component of developing conservation strategies. We used archived tissue samples from golden eagles (Aquila chrysaetos canadensis) in three geographic regions of western North America to conduct a preliminary study of the genetics of the North American subspecies, and to provide data for United States Fish and Wildlife Service (USFWS) decision-making for golden eagle management. We used a combination of mitochondrial DNA (mtDNA) D-loop sequences and 16 nuclear DNA (nDNA) microsatellite loci to investigate the extent of gene flow among our sampling areas in Idaho, California and Alaska and to determine if we could distinguish birds from the different geographic regions based on their genetic profiles. Our results indicate high genetic diversity, low genetic structure and high connectivity. Nuclear DNA Fst values between Idaho and California were low but significantly different from zero (0.026). Bayesian clustering methods indicated a single population, and we were unable to distinguish summer breeding residents from different regions. Results of the mtDNA AMOVA showed that most of the haplotype variation (97%) was within the geographic populations while 3% variation was partitioned among them. One haplotype was common to all three areas. One region-specific haplotype was detected in California and one in Idaho, but additional sampling is required to determine if these haplotypes are unique to those geographic areas or a sampling artifact. We discuss potential sources of the high gene flow for this species including natal and breeding dispersal, floaters, and changes in migratory behavior as a result of environmental factors such as climate change and habitat alteration. Our preliminary findings can help inform the USFWS in development of golden eagle management strategies and provide a basis for additional research into the complex dynamics of the North American subspecies. PMID:27783687
RDNAnalyzer: A tool for DNA secondary structure prediction and sequence analysis.
Afzal, Muhammad; Shahid, Ahmad Ali; Shehzadi, Abida; Nadeem, Shahid; Husnain, Tayyab
2012-01-01
RDNAnalyzer is an innovative computer-based tool designed for DNA secondary structure prediction and sequence analysis. It can randomly generate a DNA sequence, or users can upload sequences of their own interest in RAW format. It uses and extends the Nussinov dynamic programming algorithm and has various applications in sequence analysis. It predicts the DNA secondary structure and base pairings. It also provides tools for sequence analyses routinely performed by biological scientists, such as DNA replication, reverse complement generation, transcription, translation, sequence-specific information (total number of nucleotide bases and A, T, G, C base contents along with their respective percentages) and a sequence cleaner. RDNAnalyzer is a unique tool developed in Microsoft Visual Studio 2008 using Microsoft Visual C# and Windows Presentation Foundation and provides a user-friendly environment for sequence analysis. It is freely available at http://www.cemb.edu.pk/sw.html. Abbreviations: RDNAnalyzer - Random DNA Analyser, GUI - Graphical user interface, XAML - Extensible Application Markup Language.
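As a rough illustration of the Nussinov dynamic programming algorithm named in this abstract, the following Python sketch computes the maximum number of nested Watson-Crick pairs in a DNA sequence. It is the textbook recurrence only, not RDNAnalyzer's extended version (which is written in C#); the function names and the `min_loop` parameter are assumptions introduced here for illustration.

```python
# Textbook Nussinov recurrence (illustrative sketch, not RDNAnalyzer's code).
# dp[i][j] = maximum number of nested base pairs in seq[i..j].

PAIRS = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}


def can_pair(a, b):
    """Return True if bases a and b form a Watson-Crick DNA pair."""
    return (a, b) in PAIRS


def nussinov_max_pairs(seq, min_loop=3):
    """Maximum number of nested pairs; paired positions i < j need j - i > min_loop.

    min_loop is a hypothetical hairpin-loop constraint chosen for this sketch.
    """
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = max(dp[i + 1][j], dp[i][j - 1])  # leave i or j unpaired
            if can_pair(seq[i], seq[j]):
                best = max(best, dp[i + 1][j - 1] + 1)  # pair i with j
            for k in range(i + 1, j):  # split into two independent intervals
                best = max(best, dp[i][k] + dp[k + 1][j])
            dp[i][j] = best
    return dp[0][n - 1]
```

For example, `nussinov_max_pairs("GGGAAACCC")` finds the three G-C pairs of a simple hairpin. The cubic-time split loop is kept for clarity rather than speed.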
Direct Detection and Sequencing of Damaged DNA Bases
2011-01-01
Products of various forms of DNA damage have been implicated in a variety of important biological processes, such as aging, neurodegenerative diseases, and cancer. Therefore, there exists great interest to develop methods for interrogating damaged DNA in the context of sequencing. Here, we demonstrate that single-molecule, real-time (SMRT®) DNA sequencing can directly detect damaged DNA bases in the DNA template - as a by-product of the sequencing method - through an analysis of the DNA polymerase kinetics that are altered by the presence of a modified base. We demonstrate the sequencing of several DNA templates containing products of DNA damage, including 8-oxoguanine, 8-oxoadenine, O6-methylguanine, 1-methyladenine, O4-methylthymine, 5-hydroxycytosine, 5-hydroxyuracil, 5-hydroxymethyluracil, or thymine dimers, and show that these base modifications can be readily detected with single-modification resolution and DNA strand specificity. We characterize the distinct kinetic signatures generated by these DNA base modifications. PMID:22185597
Direct detection and sequencing of damaged DNA bases.
Clark, Tyson A; Spittle, Kristi E; Turner, Stephen W; Korlach, Jonas
2011-12-20
Products of various forms of DNA damage have been implicated in a variety of important biological processes, such as aging, neurodegenerative diseases, and cancer. Therefore, there exists great interest to develop methods for interrogating damaged DNA in the context of sequencing. Here, we demonstrate that single-molecule, real-time (SMRT®) DNA sequencing can directly detect damaged DNA bases in the DNA template - as a by-product of the sequencing method - through an analysis of the DNA polymerase kinetics that are altered by the presence of a modified base. We demonstrate the sequencing of several DNA templates containing products of DNA damage, including 8-oxoguanine, 8-oxoadenine, O6-methylguanine, 1-methyladenine, O4-methylthymine, 5-hydroxycytosine, 5-hydroxyuracil, 5-hydroxymethyluracil, or thymine dimers, and show that these base modifications can be readily detected with single-modification resolution and DNA strand specificity. We characterize the distinct kinetic signatures generated by these DNA base modifications.
A comprehensive list of cloned human DNA sequences
Schmidtke, Jörg; Cooper, David N.
1987-01-01
A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:3575113
A comprehensive list of cloned human DNA sequences
Schmidtke, Jörg; Cooper, David N.
1990-01-01
A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:2333227
A comprehensive list of cloned human DNA sequences
Schmidtke, Jörg; Cooper, David N.
1988-01-01
A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:3368330
A comprehensive list of cloned human DNA sequences
Schmidtke, Jörg; Cooper, David N.
1989-01-01
A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:2654889
Kilo-sequencing: an ordered strategy for rapid DNA sequence data acquisition.
Barnes, W M; Bevan, M
1983-01-01
A strategy for rapid DNA sequence acquisition in an ordered, nonrandom manner, while retaining all of the conveniences of the dideoxy method with M13 transducing phage DNA template, is described. Target DNA 3 to 14 kb in size can be stably carried by our M13 vectors. Suitable targets are stretches of DNA which lack an enzyme recognition site which is unique on our cloning vectors and adjacent to the sequencing primer; current sites that are so useful when lacking are Pst, Xba, HindIII, BglII, EcoRI. By an in vitro procedure, we cut RF DNA once randomly and once specifically, to create thousands of deletions which start at the unique restriction site adjacent to the dideoxy sequencing primer and extend various distances across the target DNA. Phage carrying a desired size of deletions, whose DNA as template will give rise to DNA sequence data in a desired location along the target DNA, may be purified by electrophoresis alive on agarose gels. Phage running in the same location on the agarose gel thus conveniently give rise to nucleotide sequence data from the same kilobase of target DNA. PMID:6298723
Silicene nanoribbon as a new DNA sequencing device
NASA Astrophysics Data System (ADS)
Alesheikh, Sara; Shahtahmassebi, Nasser; Roknabadi, Mahmood Rezaee; Pilevar Shahri, Raheleh
2018-02-01
The importance of DNA sequencing in many fields has driven the search for fast and inexpensive methods. Nanotechnology supports this development by introducing nanostructures that can be used for DNA sequencing. In this work we study the interaction between a zigzag silicene nanoribbon and DNA nucleobases using DFT and the non-equilibrium Green's function approach, to investigate the possibility of using zigzag silicene nanoribbons as biosensors for DNA sequencing.
Isolation and characterization of target sequences of the chicken CdxA homeobox gene.
Margalit, Y; Yarus, S; Shapira, E; Gruenbaum, Y; Fainsod, A
1993-01-01
The DNA binding specificity of the chicken homeodomain protein CDXA was studied. Using a CDXA-glutathione-S-transferase fusion protein, DNA fragments containing the binding site for this protein were isolated. The sources of DNA were oligonucleotides with random sequence and chicken genomic DNA. The DNA fragments isolated were sequenced and tested in DNA binding assays. Sequencing revealed that most DNA fragments are AT rich which is a common feature of homeodomain binding sites. By electrophoretic mobility shift assays it was shown that the different target sequences isolated bind to the CDXA protein with different affinities. The specific sequences bound by the CDXA protein in the genomic fragments isolated, were determined by DNase I footprinting. From the footprinted sequences, the CDXA consensus binding site was determined. The CDXA protein binds the consensus sequence A, A/T, T, A/T, A, T, A/G. The CAUDAL binding site in the ftz promoter is also included in this consensus sequence. When tested, some of the genomic target sequences were capable of enhancing the transcriptional activity of reporter plasmids when introduced into CDXA expressing cells. This study determined the DNA sequence specificity of the CDXA protein and it also shows that this protein can further activate transcription in cells in culture. PMID:7909943
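The consensus A, A/T, T, A/T, A, T, A/G reported in this abstract can be written as the pattern `A[AT]T[AT]AT[AG]`. The Python sketch below scans one strand of a sequence for exact matches, including overlapping ones. The function name is hypothetical, and exact pattern matching is a simplification: the abstract reports that real target sequences bind with different affinities.

```python
import re

# CDXA consensus from the abstract: A, A/T, T, A/T, A, T, A/G.
# A lookahead is used so that overlapping matches are also reported.
CDXA_CONSENSUS = re.compile(r"(?=(A[AT]T[AT]AT[AG]))")


def find_consensus_sites(seq):
    """Return (0-based start, matched 7-mer) pairs on the given strand only."""
    return [(m.start(), m.group(1)) for m in CDXA_CONSENSUS.finditer(seq.upper())]
```

A fuller scan would also search the reverse complement, which is left out here to keep the example short.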
Sequence periodicity in nucleosomal DNA and intrinsic curvature.
Nair, T Murlidharan
2010-05-17
Most eukaryotic DNA contained in the nucleus is packaged by wrapping DNA around histone octamers. Histones are ubiquitous and bind most regions of chromosomal DNA. In order to achieve smooth wrapping of the DNA around the histone octamer, the DNA duplex should be able to deform and should possess intrinsic curvature. The deformability of DNA is a result of the non-parallelness of base pair stacks. The stacking interaction between base pairs is sequence dependent. The higher the stacking energy the more rigid the DNA helix, thus it is natural to expect that sequences that are involved in wrapping around the histone octamer should be unstacked and possess intrinsic curvature. Intrinsic curvature has been shown to be dictated by the periodic recurrence of certain dinucleotides. Several genome-wide studies directed towards mapping of nucleosome positions have revealed periodicity associated with certain stretches of sequences. In the current study, these sequences have been analyzed with a view to understand their sequence-dependent structures. Higher order DNA structures and the distribution of molecular bend loci associated with 146 base nucleosome core DNA sequence from C. elegans and chicken have been analyzed using the theoretical model for DNA curvature. The curvature dispersion calculated by cyclically permuting the sequences revealed that the molecular bend loci were delocalized throughout the nucleosome core region and had varying degrees of intrinsic curvature. The higher order structures associated with nucleosomes of C. elegans and chicken calculated from the sequences revealed heterogeneity with respect to the deviation of the DNA axis. The results point to the possibility that context-dependent curvature of varying degrees is associated with nucleosomal DNA.
Le Bras, Ronan J; Kuzma, Heidi; Sucic, Victor; Bokelmann, Götz
2016-05-01
A notable sequence of calls was encountered, spanning several days in January 2003, in the central part of the Indian Ocean on a hydrophone triplet recording acoustic data at a 250 Hz sampling rate. This paper presents signal processing methods applied to the waveform data to detect and group the recorded signals and to extract amplitude and bearing estimates for them. An approximate location for the source of the sequence of calls is inferred from extracting the features from the waveform. As the source approaches the hydrophone triplet, the source level (SL) of the calls is estimated at 187 ± 6 dB re 1 μPa at 1 m in the 15-60 Hz frequency range. The calls are attributed to a subgroup of blue whales, Balaenoptera musculus, with a characteristic acoustic signature. A Bayesian location method using probabilistic models for bearing and amplitude is demonstrated on the call sequence. The method is applied to the case of detection at a single triad of hydrophones and results in a probability distribution map for the origin of the calls. It can be extended to detections at multiple triads and, because of the Bayesian formulation, additional modeling complexity can be built in as needed.
Assessing the Fidelity of Ancient DNA Sequences Amplified From Nuclear Genes
Binladen, Jonas; Wiuf, Carsten; Gilbert, M. Thomas P.; Bunce, Michael; Barnett, Ross; Larson, Greger; Greenwood, Alex D.; Haile, James; Ho, Simon Y. W.; Hansen, Anders J.; Willerslev, Eske
2006-01-01
To date, the field of ancient DNA has relied almost exclusively on mitochondrial DNA (mtDNA) sequences. However, a number of recent studies have reported the successful recovery of ancient nuclear DNA (nuDNA) sequences, thereby allowing the characterization of genetic loci directly involved in phenotypic traits of extinct taxa. It is well documented that postmortem damage in ancient mtDNA can lead to the generation of artifactual sequences. However, as yet no one has thoroughly investigated the damage spectrum in ancient nuDNA. By comparing clone sequences from 23 fossil specimens, recovered from environments ranging from permafrost to desert, we demonstrate the presence of miscoding lesion damage in both the mtDNA and nuDNA, resulting in insertion of erroneous bases during amplification. Interestingly, no significant differences in the frequency of miscoding lesion damage are recorded between mtDNA and nuDNA despite great differences in cellular copy numbers. For both mtDNA and nuDNA, we find significant positive correlations between total sequence heterogeneity and the rates of type 1 transitions (adenine → guanine and thymine → cytosine) and type 2 transitions (cytosine → thymine and guanine → adenine), respectively. Type 2 transitions are by far the most dominant and increase relative to those of type 1 with damage load. The results suggest that the deamination of cytosine (and 5-methyl cytosine) to uracil (and thymine) is the main cause of miscoding lesions in both ancient mtDNA and nuDNA sequences. We argue that the problems presented by postmortem damage, as well as problems with contamination from exogenous sources of conserved nuclear genes, allelic variation, and the reliance on single nucleotide polymorphisms, call for great caution in studies relying on ancient nuDNA sequences. PMID:16299392
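The type 1 and type 2 transition classes defined in this abstract can be tallied from a pair of aligned sequences. The Python sketch below is a minimal illustration; the function name, the equal-length ungapped inputs, and the "other" bucket are assumptions made here and are not part of the published analysis.

```python
# Transition classes as defined in the abstract:
#   type 1: A->G and T->C;  type 2: C->T and G->A.
TYPE1 = {("A", "G"), ("T", "C")}
TYPE2 = {("C", "T"), ("G", "A")}


def count_transition_types(reference, clone):
    """Tally substitutions between two aligned, equal-length sequences."""
    if len(reference) != len(clone):
        raise ValueError("sequences must be aligned to the same length")
    counts = {"type1": 0, "type2": 0, "other": 0}
    for ref_base, obs_base in zip(reference.upper(), clone.upper()):
        if ref_base == obs_base:
            continue  # identical positions carry no substitution
        change = (ref_base, obs_base)
        if change in TYPE1:
            counts["type1"] += 1
        elif change in TYPE2:
            counts["type2"] += 1
        else:
            counts["other"] += 1  # transversions and anything else
    return counts
```

Comparing each clone against a consensus in this way gives per-specimen counts that could then be related to total sequence heterogeneity.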
[Current applications of high-throughput DNA sequencing technology in antibody drug research].
Yu, Xin; Liu, Qi-Gang; Wang, Ming-Rong
2012-03-01
Since the publication in 2005 of a high-throughput DNA sequencing technology based on PCR carried out in oil emulsions, high-throughput DNA sequencing platforms have evolved into a robust technology for sequencing genomes and diverse DNA libraries. Antibody libraries with vast numbers of members currently serve as a foundation for discovering novel antibody drugs, and high-throughput DNA sequencing technology makes it possible to rapidly identify functional antibody variants with desired properties. Herein we review current applications of high-throughput DNA sequencing technology in the analysis of antibody library diversity, sequencing of CDR3 regions, identification of potent antibodies based on sequence frequency, discovery of functional genes, and combination with various display technologies, so as to provide an alternative approach to the discovery and development of antibody drugs.
Screening for SNPs with Allele-Specific Methylation based on Next-Generation Sequencing Data
Hu, Bo; Xu, Yaomin
2013-01-01
Allele-specific methylation (ASM) has long been studied but mainly documented in the context of genomic imprinting and X chromosome inactivation. Taking advantage of the next-generation sequencing technology, we conduct a high-throughput sequencing experiment with four prostate cell lines to survey the whole genome and identify single nucleotide polymorphisms (SNPs) with ASM. A Bayesian approach is proposed to model the counts of short reads for each SNP conditional on its genotypes of multiple subjects, leading to a posterior probability of ASM. We flag SNPs with high posterior probabilities of ASM by accounting for multiple comparisons based on posterior false discovery rates. Applying the Bayesian approach to the in-house prostate cell line data, we identify 269 SNPs as candidates of ASM. A simulation study is carried out to demonstrate the quantitative performance of the proposed approach. PMID:23710259
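The abstract does not spell out how the posterior false discovery rate is applied. One common convention, shown here purely as an assumption, is to rank SNPs by posterior probability and flag the largest set whose average posterior error (1 minus the posterior probability) stays at or below a chosen level. The function name, the threshold `alpha`, and the example probabilities are hypothetical.

```python
def flag_by_posterior_fdr(posteriors, alpha=0.05):
    """Return indices of flagged items, most probable first.

    Items are added in order of decreasing posterior probability while
    the mean of (1 - posterior) over the flagged set stays <= alpha.
    """
    order = sorted(range(len(posteriors)),
                   key=lambda i: posteriors[i], reverse=True)
    flagged = []
    error_sum = 0.0
    for i in order:
        error_sum += 1.0 - posteriors[i]
        # The running mean can only grow, so the first violation ends the scan.
        if error_sum / (len(flagged) + 1) > alpha:
            break
        flagged.append(i)
    return flagged


# Hypothetical posterior probabilities of ASM for five SNPs.
flagged = flag_by_posterior_fdr([0.99, 0.5, 0.97, 0.2, 0.95], alpha=0.05)
```

With these made-up values the three SNPs with posteriors 0.99, 0.97 and 0.95 are flagged, because adding the next candidate would push the estimated false discovery rate above 0.05.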
Mona, Stefano; Catalano, Giulio; Lari, Martina; Larson, Greger; Boscato, Paolo; Casoli, Antonella; Sineo, Luca; Di Patti, Carolina; Pecchioli, Elena; Caramelli, David; Bertorelle, Giorgio
2010-03-26
Background The aurochs (Bos primigenius) was a large bovine that ranged over almost the entirety of the Eurasian continent and North Africa. It is the wild ancestor of the modern cattle (Bos taurus), and went extinct in 1627 probably as a consequence of human hunting and the progressive reduction of its habitat. To investigate in detail the genetic history of this species and to compare the population dynamics in different European areas, we analysed Bos primigenius remains from various sites across Italy. Results Fourteen samples provided ancient DNA fragments from the mitochondrial hypervariable region. Our data, jointly analysed with previously published sequences, support the view that Italian aurochsen were genetically similar to modern bovine breeds, but very different from northern/central European aurochsen. Bayesian analyses and coalescent simulations indicate that the genetic variation pattern in both Italian and northern/central European aurochsen is compatible with demographic stability after the last glaciation. We provide evidence that signatures of population expansion can erroneously arise in stable aurochsen populations when the different ages of the samples are not taken into account. Conclusions Distinct groups of aurochsen probably inhabited Italy and northern/central Europe after the last glaciation, respectively. On the contrary, Italian and Fertile Crescent aurochsen likely shared several mtDNA sequences, now common in modern breeds. We argue that a certain level of genetic homogeneity characterized aurochs populations in Southern Europe and the Middle East, and also that post-glacial recolonization of northern and central Europe advanced, without major demographic expansions, from eastern, and not southern, refugia. PMID:20346116
Wilson, Wade D; Turner, Thomas F
2009-08-01
The genus Oncorhynchus includes Pacific salmon and trout (anadromous and land-locked) species of the western United States and Mexico. All species and subspecies in this group are threatened, endangered, sensitive, or species of conservation concern in portions of their native ranges. To examine the relationships of the species within Oncorhynchus we sequenced a 768 bp fragment of the protein-encoding ND4 mtDNA region. We included all six recognized subspecies of O. clarki (cutthroat trout), O. gilae gilae (Gila trout) and O. g. apache (Apache trout). Gene trees from likelihood and Bayesian phylogenetic analyses revealed that Salvelinus was the sister group to Oncorhynchus, and as expected based on previous studies, O. clarki was sister to a clade that consisted of O. mykiss plus O. g. gilae and O. g. apache. Within the cutthroat clade (O. clarki), the coastal form O. c. clarki was basal with the Rio Grande cutthroat (O. c. virginalis) most derived. Divergence dating based on a fossil calibration molecular clock showed the oldest clade (mean node age) was O. masou ssp., which diverged roughly 7.6 MYA. Highest probability density intervals for divergence of O. masou overlapped with divergence (6.3 MYA) of Pacific salmon clades ((O. gorbuscha + O. nerka) and (O. tshawytscha + O. kisutch)). The Pacific trout clade ((O. mykiss + O. gilae ssp.) + (O. clarki ssp.)) diverged from the Pacific salmon around 6.3 MYA, with most of the diversification within the O. clarki clade occurring in the last 1 MY.
A tree of life based on ninety-eight expressed genes conserved across diverse eukaryotic species
Jayaswal, Pawan Kumar; Dogra, Vivek; Shanker, Asheesh; Sharma, Tilak Raj
2017-01-01
Rapid advances in DNA sequencing technologies have resulted in the accumulation of large data sets in the public domain, facilitating comparative studies to provide novel insights into the evolution of life. Phylogenetic studies across the eukaryotic taxa have been reported but on the basis of a limited number of genes. Here we present a genome-wide analysis across different plant, fungal, protist, and animal species, with reference to the 36,002 expressed genes of the rice genome. Our analysis revealed 9831 genes unique to rice and 98 genes conserved across all 49 eukaryotic species analysed. The 98 genes conserved across diverse eukaryotes mostly exhibited binding and catalytic activities and shared common sequence motifs, and hence appeared to have a common origin. The 98 conserved genes belonged to 22 functional gene families including 26S protease, actin, ADP–ribosylation factor, ATP synthase, casein kinase, DEAD-box protein, DnaK, elongation factor 2, glyceraldehyde 3-phosphate, phosphatase 2A, ras-related protein, Ser/Thr protein phosphatase family protein, tubulin, ubiquitin and others. The consensus Bayesian eukaryotic tree of life developed in this study demonstrated widely separated clades of plants, fungi, and animals. Musa acuminata provided an evolutionary link between monocotyledons and dicotyledons, and Salpingoeca rosetta provided an evolutionary link between fungi and animals, indicating that protozoan species are close relatives of fungi and animals. The divergence times for 1176 species pairs were estimated accurately by integrating fossil information with synonymous substitution rates in the comprehensive set of 98 genes. The present study provides valuable insight into the evolution of eukaryotes. PMID:28922368
Phylogenetic estimates of diversification rate are affected by molecular rate variation.
Duchêne, D A; Hua, X; Bromham, L
2017-10-01
Molecular phylogenies are increasingly being used to investigate the patterns and mechanisms of macroevolution. In particular, node heights in a phylogeny can be used to detect changes in rates of diversification over time. Such analyses rest on the assumption that node heights in a phylogeny represent the timing of diversification events, which in turn rests on the assumption that evolutionary time can be accurately predicted from DNA sequence divergence. But there are many influences on the rate of molecular evolution, which might also influence node heights in molecular phylogenies, and thus affect estimates of diversification rate. In particular, a growing number of studies have revealed an association between the net diversification rate estimated from phylogenies and the rate of molecular evolution. Such an association might, by influencing the relative position of node heights, systematically bias estimates of diversification time. We simulated the evolution of DNA sequences under several scenarios where rates of diversification and molecular evolution vary through time, including models where diversification and molecular evolutionary rates are linked. We show that commonly used methods, including metric-based, likelihood and Bayesian approaches, can have a low power to identify changes in diversification rate when molecular substitution rates vary. Furthermore, the association between the rates of speciation and molecular evolution rate can cause the signature of a slowdown or speedup in speciation rates to be lost or misidentified. These results suggest that the multiple sources of variation in molecular evolutionary rates need to be considered when inferring macroevolutionary processes from phylogenies. © 2017 European Society For Evolutionary Biology. Journal of Evolutionary Biology © 2017 European Society For Evolutionary Biology.
DNA fingerprinting, DNA barcoding, and next generation sequencing technology in plants.
Sucher, Nikolaus J; Hennell, James R; Carles, Maria C
2012-01-01
DNA fingerprinting of plants has become an invaluable tool in forensic, scientific, and industrial laboratories all over the world. PCR has become part of virtually every variation of the plethora of approaches used for DNA fingerprinting today. DNA sequencing is increasingly used either in combination with or as a replacement for traditional DNA fingerprinting techniques. A prime example is the use of short, standardized regions of the genome as taxon barcodes for biological identification of plants. Rapid advances in "next generation sequencing" (NGS) technology are driving down the cost of sequencing and bringing large-scale sequencing projects into the reach of individual investigators. We present an overview of recent publications that demonstrate the use of "NGS" technology for DNA fingerprinting and DNA barcoding applications.
Mammalian DNA enriched for replication origins is enriched for snap-back sequences.
Zannis-Hadjopoulos, M; Kaufmann, G; Martin, R G
1984-11-15
Using the instability of replication loops as a method for the isolation of double-stranded nascent DNA, extruded DNA enriched for replication origins was obtained and denatured. Snap-back DNA, single-stranded DNA with inverted repeats (palindromic sequences), reassociates rapidly into stem-loop structures with zero-order kinetics when conditions are changed from denaturing to renaturing, and can be assayed by chromatography on hydroxyapatite. Origin-enriched nascent DNA strands from mouse, rat and monkey cells growing either synchronously or asynchronously were purified and assayed for the presence of snap-back sequences. The results show that origin-enriched DNA is also enriched for snap-back sequences, implying that some origins for mammalian DNA replication contain or lie near palindromic sequences.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hancock, Stephen P.; Stella, Stefano; Cascio, Duilio; ...
2016-03-09
The abundant Fis nucleoid protein selectively binds poorly related DNA sequences with high affinities to regulate diverse DNA reactions. Fis binds DNA primarily through DNA backbone contacts and selects target sites by reading conformational properties of DNA sequences, most prominently intrinsic minor groove widths. High-affinity binding requires Fis-stabilized DNA conformational changes that vary depending on DNA sequence. In order to better understand the molecular basis for high affinity site recognition, we analyzed the effects of DNA sequence within and flanking the core Fis binding site on binding affinity and DNA structure. X-ray crystal structures of Fis-DNA complexes containing variable sequences in the noncontacted center of the binding site or variations within the major groove interfaces show that the DNA can adapt to the Fis dimer surface asymmetrically. We show that the presence and position of pyrimidine-purine base steps within the major groove interfaces affect both local DNA bending and minor groove compression to modulate affinities and lifetimes of Fis-DNA complexes. Sequences flanking the core binding site also modulate complex affinities, lifetimes, and the degree of local and global Fis-induced DNA bending. In particular, a G immediately upstream of the 15 bp core sequence inhibits binding and bending, and A-tracts within the flanking base pairs increase both complex lifetimes and global DNA curvatures. Taken together, our observations support a revised DNA motif specifying high-affinity Fis binding and highlight the range of conformations that Fis-bound DNA can adopt. Lastly, the affinities and DNA conformations of individual Fis-DNA complexes are likely to be tailored to their context-specific biological functions.
Specific minor groove solvation is a crucial determinant of DNA binding site recognition
Harris, Lydia-Ann; Williams, Loren Dean; Koudelka, Gerald B.
2014-01-01
The DNA sequence preferences of nearly all sequence specific DNA binding proteins are influenced by the identities of bases that are not directly contacted by protein. Discrimination between non-contacted base sequences is commonly based on the differential abilities of DNA sequences to allow narrowing of the DNA minor groove. However, the factors that govern the propensity of minor groove narrowing are not completely understood. Here we show that the differential abilities of various DNA sequences to support formation of a highly ordered and stable minor groove solvation network are a key determinant of non-contacted base recognition by a sequence-specific binding protein. In addition, disrupting the solvent network in the non-contacted region of the binding site alters the protein's ability to recognize contacted base sequences at positions 5–6 bases away. This observation suggests that DNA solvent interactions link contacted and non-contacted base recognition by the protein. PMID:25429976
A Method for Preparing DNA Sequencing Templates Using a DNA-Binding Microplate
Yang, Yu; Hebron, Haroun R.; Hang, Jun
2009-01-01
A DNA-binding matrix was immobilized on the surface of a 96-well microplate and used for plasmid DNA preparation for DNA sequencing. The same DNA-binding plate was used for bacterial growth, cell lysis, DNA purification, and storage. In a single step using one buffer, bacterial cells were lysed by enzymes, and released DNA was captured on the plate simultaneously. After two wash steps, DNA was eluted and stored in the same plate. Inclusion of phosphates in the culture medium was found to enhance the yield of plasmid significantly. Purified DNA samples were used successfully in DNA sequencing with high consistency and reproducibility. Eleven vectors and nine libraries were tested using this method. In 10 μl sequencing reactions using 3 μl sample and 0.25 μl BigDye Terminator v3.1, the results from a 3730xl sequencer gave a success rate of 90–95% and read-lengths of 700 bases or more. The method is fully automatable and convenient for manual operation as well. It enables reproducible, high-throughput, rapid production of DNA with purity and yields sufficient for high-quality DNA sequencing at a substantially reduced cost. PMID:19568455
Dendritic Cell-Based Immunotherapy of Breast Cancer: Modulation by CpG DNA
2005-09-01
tumor-associated antigens and bacterial DNA oligodeoxynucleotides containing unmethylated CpG sequences (CpG DNA) further augment the immune priming...associated antigens by cytotoxic T lymphocytes, and bacterial DNA oligodeoxynucleotides containing unmethylated CpG sequences (CpG DNA) can further...further amplify their immunostimulatory capacity and bacterial DNA oligodeoxynucleotides (ODN) containing unmethylated CpG sequences (CpG DNA) provide such
Montoya-Ruiz, Carolina; Cajimat, Maria N B; Milazzo, Mary Louise; Diaz, Francisco J; Rodas, Juan David; Valbuena, Gustavo; Fulhorst, Charles F
2015-07-01
The results of a previous study suggested that Cherrie's cane rat (Zygodontomys cherriei) is the principal host of Necoclí virus (family Bunyaviridae, genus Hantavirus) in Colombia. Bayesian analyses of complete nucleocapsid protein gene sequences and complete glycoprotein precursor gene sequences in this study confirmed that Necoclí virus is phylogenetically closely related to Maporal virus, which is principally associated with the delicate pygmy rice rat (Oligoryzomys delicatus) in western Venezuela. In pairwise comparisons, nonidentities between the complete amino acid sequence of the nucleocapsid protein of Necoclí virus and the complete amino acid sequences of the nucleocapsid proteins of other hantaviruses were ≥8.7%. Likewise, nonidentities between the complete amino acid sequence of the glycoprotein precursor of Necoclí virus and the complete amino acid sequences of the glycoprotein precursors of other hantaviruses were ≥11.7%. Collectively, the unique association of Necoclí virus with Z. cherriei in Colombia, results of the Bayesian analyses of complete nucleocapsid protein gene sequences and complete glycoprotein precursor gene sequences, and results of the pairwise comparisons of amino acid sequences strongly support the notion that Necoclí virus represents a novel species in the genus Hantavirus. Further work is needed to determine whether Calabazo virus (a hantavirus associated with Z. brevicauda cherriei in Panama) and Necoclí virus are conspecific.
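The pairwise comparisons reported here are expressed as percent nonidentity between amino acid sequences. A minimal sketch of such a calculation follows, assuming the two sequences are already aligned and that positions with a gap in either sequence are skipped. The gap handling is an assumption, since the abstract does not say how gaps were treated, and the example peptides are hypothetical.

```python
def percent_nonidentity(seq_a, seq_b, gap="-"):
    """Percent of compared aligned positions at which two sequences differ.

    Positions where either sequence has a gap are skipped (an assumption).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(a, b) for a, b in zip(seq_a, seq_b)
                if a != gap and b != gap]
    if not compared:
        raise ValueError("no gap-free positions to compare")
    differences = sum(1 for a, b in compared if a != b)
    return 100.0 * differences / len(compared)


# Hypothetical five-residue peptides differing at one position.
example = percent_nonidentity("MKTAY", "MKSAY")
```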
Morozumi, Takeya; Toki, Daisuke; Eguchi-Ogawa, Tomoko; Uenishi, Hirohide
2011-09-01
Large-scale cDNA-sequencing projects require an efficient strategy for mass sequencing. Here we describe a method for sequencing pooled cDNA clones using a combination of transposon insertion and Gateway technology. Our method reduces the number of shotgun clones that are unsuitable for reconstruction of cDNA sequences, and has the advantage of reducing the total costs of the sequencing project.
Biological sequence compression algorithms.
Matsumoto, T; Sadakane, K; Imai, H
2000-01-01
Today, more and more DNA sequences are becoming available. Information about DNA sequences is stored in molecular biology databases. The size and importance of these databases will keep growing, so this information must be stored and communicated efficiently. Furthermore, sequence compression can be used to define similarities between biological sequences. Standard compression algorithms such as gzip or compress cannot compress DNA sequences; they only expand them in size. On the other hand, CTW (Context Tree Weighting Method) can compress DNA sequences to less than two bits per symbol. These algorithms do not use the special structures of biological sequences. Two characteristic structures of DNA sequences are known: palindromes (reverse complements) and approximate repeats. Several DNA-specific algorithms that use these structures can compress sequences to less than two bits per symbol. In this paper, we improve CTW so that it can exploit these characteristic structures of DNA sequences. Before encoding the next symbol, the algorithm searches for an approximate repeat or palindrome using hashing and dynamic programming. If a palindrome or an approximate repeat of sufficient length is found, our algorithm represents it by its length and distance. With this preprocessing, the new program achieves a slightly higher compression ratio than existing DNA-oriented compression algorithms. We also describe a new compression algorithm for protein sequences.
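Two ideas in this abstract are easy to illustrate: the two-bits-per-symbol baseline that any DNA compressor must beat, and the reverse-complement ("palindrome") structure that DNA-specific compressors exploit. The sketch below is only an illustration of those two ideas and is not the CTW-based algorithm of the paper; the function names are hypothetical, and input is assumed to contain only the letters A, C, G and T.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def pack_two_bits(seq):
    """Pack an A/C/G/T string into bytes, four bases per byte.

    The last byte is zero-padded, so the sequence length must be stored
    separately to decode unambiguously.
    """
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODE[base]
        byte <<= 2 * (4 - len(chunk))  # left-align a short final chunk
        out.append(byte)
    return bytes(out)


packed = pack_two_bits("ACGTA")       # 5 bases fit in 2 bytes
site = "GAATTC"                       # equals its own reverse complement
is_palindrome = reverse_complement(site) == site
```

A compressor that finds such a reverse-complement match can, as the abstract describes, encode it as a length and a distance instead of spending two bits on every base.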
Kim, Jiyeon; Kern, Elizabeth; Kim, Taeho; Sim, Mikang; Kim, Jaebum; Kim, Yuseob; Park, Chungoo; Nadler, Steven A; Park, Joong-Ki
2017-02-01
Plectida is an important nematode order with species that occupy many different biological niches. The order includes free-living aquatic and soil-dwelling species, but its phylogenetic position has remained uncertain. We sequenced the complete mitochondrial genomes of two members of this order, Plectus acuminatus and Plectus aquatilis and compared them with those of other major nematode clades. The genome size and base composition of these species are similar to other nematodes; 14,831 and 14,372 bp, respectively, with AT contents of 71.0% and 70.1%. Gene content was also similar to other nematodes, but gene order and coding direction of Plectus mtDNAs were dissimilar from other chromadorean species. P. acuminatus and P. aquatilis are the first chromadorean species found to contain a gene inversion. We reconstructed mitochondrial genome phylogenetic trees using nucleotide and amino acid datasets from 87 nematodes that represent major nematode clades, including the Plectus sequences. Trees from phylogenetic analyses using maximum likelihood and Bayesian methods depicted Plectida as the sister group to other sequenced chromadorean nematodes. This finding is consistent with several phylogenetic results based on SSU rDNA, but disagrees with a classification based on morphology. Mitogenomes representing other basal chromadorean groups (Araeolaimida, Monhysterida, Desmodorida, Chromadorida) are needed to confirm their phylogenetic relationships. Copyright © 2016 Elsevier Inc. All rights reserved.
Detection of DNA Methylation by Whole-Genome Bisulfite Sequencing.
Li, Qing; Hermanson, Peter J; Springer, Nathan M
2018-01-01
DNA methylation plays an important role in the regulation of the expression of transposons and genes. Various methods have been developed to assay DNA methylation levels. Bisulfite sequencing is considered to be the "gold standard" for single-base resolution measurement of DNA methylation levels. Coupled with next-generation sequencing, whole-genome bisulfite sequencing (WGBS) allows DNA methylation to be evaluated at a genome-wide scale. Here, we describe a protocol for WGBS in plant species with large genomes. This protocol has been successfully applied to assay genome-wide DNA methylation levels in maize and barley. It has also been successfully coupled with sequence capture technology to assay DNA methylation levels in a targeted set of genomic regions.
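Single-base methylation levels from bisulfite data rest on a simple count: bisulfite treatment converts unmethylated cytosines so that they are read as T, while methylated cytosines are still read as C. The following sketch estimates the level at each reference cytosine as C / (C + T). It assumes forward-strand reads that are gap-free and already aligned to the full length of a short reference; the function name and the toy data are hypothetical and are not part of the protocol described above.

```python
def methylation_levels(reference, reads):
    """Estimate methylation at each reference cytosine as C / (C + T).

    `reads` are forward-strand strings of the same length as `reference`.
    Positions with no informative (C or T) calls are omitted.
    """
    levels = {}
    for pos, ref_base in enumerate(reference):
        if ref_base != "C":
            continue
        c_calls = sum(1 for read in reads if read[pos] == "C")
        t_calls = sum(1 for read in reads if read[pos] == "T")
        if c_calls + t_calls:
            levels[pos] = c_calls / (c_calls + t_calls)
    return levels


# Toy reference with cytosines at positions 1 and 4, covered by four reads.
levels = methylation_levels("ACGTC", ["ACGTT", "ATGTT", "ACGTT", "ACGTC"])
```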
Single-Molecule Electrical Random Resequencing of DNA and RNA
NASA Astrophysics Data System (ADS)
Ohshiro, Takahito; Matsubara, Kazuki; Tsutsui, Makusu; Furuhashi, Masayuki; Taniguchi, Masateru; Kawai, Tomoji
2012-07-01
Two paradigm shifts in DNA sequencing technologies--from bulk to single molecules and from optical to electrical detection--are expected to realize label-free, low-cost DNA sequencing that does not require PCR amplification. It will lead to development of high-throughput third-generation sequencing technologies for personalized medicine. Although nanopore devices have been proposed as third-generation DNA-sequencing devices, a significant milestone in these technologies has been attained by demonstrating a novel technique for resequencing DNA using electrical signals. Here we report single-molecule electrical resequencing of DNA and RNA using a hybrid method of identifying single-base molecules via tunneling currents and random sequencing. Our method reads sequences of nine types of DNA oligomers. The complete sequence of 5'-UGAGGUA-3' from the let-7 microRNA family was also identified by creating a composite of overlapping fragment sequences, which was randomly determined using tunneling current conducted by single-base molecules as they passed between a pair of nanoelectrodes.
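The abstract reports that the let-7 sequence 5'-UGAGGUA-3' was recovered as a composite of overlapping fragment reads. How the authors built that composite is not described here, so the sketch below uses a generic greedy overlap merge as a stand-in. The fragment strings, the minimum overlap of two bases, and the function names are all hypothetical.

```python
def overlap(a, b, min_len=2):
    """Length of the longest suffix of `a` that is a prefix of `b`."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def assemble(fragments, min_len=2):
    """Greedily merge fragments by largest suffix-prefix overlap.

    Returns the list of sequences left when no overlap of at least
    `min_len` remains (a single sequence if everything merges).
    """
    # Drop duplicates and fragments fully contained in another fragment.
    frags = [f for f in dict.fromkeys(fragments)
             if not any(f != g and f in g for g in fragments)]
    while len(frags) > 1:
        best = (0, None, None)
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best[0]:
                        best = (k, i, j)
        k, i, j = best
        if k == 0:
            break  # nothing left to merge
        merged = frags[i] + frags[j][k:]
        frags = [f for idx, f in enumerate(frags) if idx not in (i, j)]
        frags.append(merged)
    return frags


# Hypothetical overlapping reads covering the 7-base target.
composite = assemble(["UGAG", "GAGGU", "GGUA"])
```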
DNA/RNA hybrid substrates modulate the catalytic activity of purified AID.
Abdouni, Hala S; King, Justin J; Ghorbani, Atefeh; Fifield, Heather; Berghuis, Lesley; Larijani, Mani
2018-01-01
Activation-induced cytidine deaminase (AID) converts cytidine to uridine at Immunoglobulin (Ig) loci, initiating somatic hypermutation and class switching of antibodies. In vitro, AID acts on single stranded DNA (ssDNA), but neither double-stranded DNA (dsDNA) oligonucleotides nor RNA, and it is believed that transcription is the in vivo generator of ssDNA targeted by AID. It is also known that the Ig loci, particularly the switch (S) regions targeted by AID are rich in transcription-generated DNA/RNA hybrids. Here, we examined the binding and catalytic behavior of purified AID on DNA/RNA hybrid substrates bearing either random sequences or GC-rich sequences simulating Ig S regions. If substrates were made up of a random sequence, AID preferred substrates composed entirely of DNA over DNA/RNA hybrids. In contrast, if substrates were composed of S region sequences, AID preferred to mutate DNA/RNA hybrids over substrates composed entirely of DNA. Accordingly, AID exhibited a significantly higher affinity for binding DNA/RNA hybrid substrates composed specifically of S region sequences, than any other substrates composed of DNA. Thus, in the absence of any other cellular processes or factors, AID itself favors binding and mutating DNA/RNA hybrids composed of S region sequences. AID:DNA/RNA complex formation and supporting mutational analyses suggest that recognition of DNA/RNA hybrids is an inherent structural property of AID. Copyright © 2017 Elsevier Ltd. All rights reserved.
Characterization of the repetitive DNA elements in the genome of fish lymphocystis disease viruses.
Schnitzler, P; Darai, G
1989-09-01
The complete DNA nucleotide sequence of the repetitive DNA elements in the genome of fish lymphocystis disease virus (FLDV) isolated from two different species (flounder and dab) was determined. The size of these repetitive DNA elements was found to be 1413 bp which corresponds to the DNA sequences of the 5' terminus of the EcoRI DNA fragment B (0.034 to 0.052 m.u.) and to the EcoRI DNA fragment M (0.718 to 0.736 m.u.) of the FLDV genome causing lymphocystis disease in flounder and plaice. The degree of DNA nucleotide homology between both regions was found to be 99%. The repetitive DNA element in the genome of FLDV isolated from other fish species (dab) was identified and is located within the EcoRI DNA fragment B and J of the viral genome. The DNA nucleotide sequence of one duplicate of this repetition (EcoRI DNA fragment J) was determined (1410 bp) and compared to the DNA nucleotide sequences of the repetitive DNA elements of the genome of FLDV isolated from flounder. It was found that the repetitive DNA elements of the genome of FLDV derived from two different fish species are highly conserved and possess a degree of DNA sequence homology of 94%. The DNA sequences of each strand of the individual repetitive element possess one open reading frame.
Long-range correlations and charge transport properties of DNA sequences
NASA Astrophysics Data System (ADS)
Liu, Xiao-liang; Ren, Yi; Xie, Qiong-tao; Deng, Chao-sheng; Xu, Hui
2010-04-01
By using Hurst's analysis and transfer approach, the rescaled range functions and Hurst exponents of human chromosome 22 and enterobacteria phage lambda DNA sequences are investigated and the transmission coefficients, Landauer resistances and Lyapunov coefficients of finite segments based on above genomic DNA sequences are calculated. In a comparison with quasiperiodic and random artificial DNA sequences, we find that λ-DNA exhibits anticorrelation behavior characterized by a Hurst exponent 0.5
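The rescaled range (R/S) statistic behind a Hurst analysis can be sketched in a few lines. The version below assumes a purine/pyrimidine mapping of the sequence to +1/-1 steps, which is one common convention and may differ from the mapping used in the paper. It estimates the Hurst exponent as the least-squares slope of log(R/S) against log(window size) over non-overlapping windows. The function names and window sizes are hypothetical.

```python
import math


def to_steps(seq):
    """Map purines (A, G) to +1 and pyrimidines (C, T) to -1 (assumed mapping)."""
    return [1 if base in "AG" else -1 for base in seq.upper()]


def rescaled_range(values):
    """R/S of one window: range of cumulative deviations over the std dev."""
    n = len(values)
    mean = sum(values) / n
    cumulative, low, high = 0.0, 0.0, 0.0
    for v in values:
        cumulative += v - mean
        low, high = min(low, cumulative), max(high, cumulative)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return (high - low) / std if std > 0 else 0.0


def hurst_exponent(values, window_sizes):
    """Slope of log(mean R/S) versus log(window size)."""
    xs, ys = [], []
    for n in window_sizes:
        ratios = [rescaled_range(values[s:s + n])
                  for s in range(0, len(values) - n + 1, n)]
        ratios = [r for r in ratios if r > 0]  # skip constant windows
        if ratios:
            xs.append(math.log(n))
            ys.append(math.log(sum(ratios) / len(ratios)))
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope_num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return slope_num / sum((x - mx) ** 2 for x in xs)


# A strictly alternating purine/pyrimidine sequence is maximally anticorrelated.
steps = to_steps("ACGT" * 16)
h = hurst_exponent(steps, [4, 8, 16])
```

For this artificial alternating sequence R/S stays at 1 for every window size, so the fitted slope is 0. Values below 0.5 indicate anticorrelation, 0.5 corresponds to an uncorrelated random sequence, and values above 0.5 indicate long-range positive correlation.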
[Whole Genome Sequencing of Human mtDNA Based on Ion Torrent PGM™ Platform].
Cao, Y; Zou, K N; Huang, J P; Ma, K; Ping, Y
2017-08-01
To analyze and detect the whole genome sequence of human mitochondrial DNA (mtDNA) using the Ion Torrent PGM™ platform and to study differences in mtDNA sequence among tissues. Samples were collected from 6 unrelated individuals by forensic postmortem examination, including chest blood, hair, costal cartilage, nail, skeletal muscle and oral epithelium. The whole mtDNA genome was amplified with 4 pairs of primers. Libraries were constructed with the Ion Shear™ Plus Reagents kit and the Ion Plus Fragment Library kit. Whole genome sequencing of mtDNA was performed on the Ion Torrent PGM™ platform. Sanger sequencing was used to determine the heteroplasmy positions and the mutation positions in the HVⅠ region. The whole mtDNA genome sequences of all samples were amplified successfully. The six unrelated individuals belonged to 6 different haplotypes. Different tissues from the same individual showed differences in heteroplasmy. The heteroplasmy positions and the mutation positions in the HVⅠ region were verified by Sanger sequencing. After a consistency check by the Kappa method, the mtDNA sequencing results were found to be highly consistent across tissues. The method used in the present study for sequencing the whole human mtDNA genome can detect heteroplasmy differences among tissues and shows good consistency. The results provide guidance for further applications of mtDNA in forensic science. Copyright© by the Editorial Department of Journal of Forensic Medicine
Sequence periodicity in nucleosomal DNA and intrinsic curvature
2010-01-01
Background Most eukaryotic DNA contained in the nucleus is packaged by wrapping DNA around histone octamers. Histones are ubiquitous and bind most regions of chromosomal DNA. In order to achieve smooth wrapping of the DNA around the histone octamer, the DNA duplex should be able to deform and should possess intrinsic curvature. The deformability of DNA is a result of the non-parallelness of base pair stacks. The stacking interaction between base pairs is sequence dependent. The higher the stacking energy, the more rigid the DNA helix; thus it is natural to expect that sequences that are involved in wrapping around the histone octamer should be unstacked and possess intrinsic curvature. Intrinsic curvature has been shown to be dictated by the periodic recurrence of certain dinucleotides. Several genome-wide studies directed towards mapping of nucleosome positions have revealed periodicity associated with certain stretches of sequences. In the current study, these sequences have been analyzed with a view to understanding their sequence-dependent structures. Results Higher order DNA structures and the distribution of molecular bend loci associated with 146 base nucleosome core DNA sequences from C. elegans and chicken have been analyzed using the theoretical model for DNA curvature. The curvature dispersion calculated by cyclically permuting the sequences revealed that the molecular bend loci were delocalized throughout the nucleosome core region and had varying degrees of intrinsic curvature. Conclusions The higher order structures associated with nucleosomes of C. elegans and chicken calculated from the sequences revealed heterogeneity with respect to the deviation of the DNA axis. The results point to the possibility that context-dependent curvature of varying degrees is associated with nucleosomal DNA. PMID:20487515
Murray, V
1999-01-01
This article reviews the literature concerning the sequence specificity of DNA-damaging agents. DNA-damaging agents are widely used in cancer chemotherapy. It is important to understand fully the determinants of DNA sequence specificity so that more effective DNA-damaging agents can be developed as antitumor drugs. There are five main methods of DNA sequence specificity analysis: cleavage of end-labeled fragments, linear amplification with Taq DNA polymerase, ligation-mediated polymerase chain reaction (PCR), single-strand ligation PCR, and footprinting. The DNA sequence specificity in purified DNA and in intact mammalian cells is reviewed for several classes of DNA-damaging agent. These include agents that form covalent adducts with DNA, free radical generators, topoisomerase inhibitors, intercalators and minor groove binders, enzymes, and electromagnetic radiation. The main sites of adduct formation are at the N-7 of guanine in the major groove of DNA and the N-3 of adenine in the minor groove, whereas free radical generators abstract hydrogen from the deoxyribose sugar and topoisomerase inhibitors cause enzyme-DNA cross-links to form. Several issues involved in the determination of the DNA sequence specificity are discussed. The future directions of the field, with respect to cancer chemotherapy, are also examined.
Deciphering the genomic targets of alkylating polyamide conjugates using high-throughput sequencing
Chandran, Anandhakumar; Syed, Junetha; Taylor, Rhys D.; Kashiwazaki, Gengo; Sato, Shinsuke; Hashiya, Kaori; Bando, Toshikazu; Sugiyama, Hiroshi
2016-01-01
Chemically engineered small molecules targeting specific genomic sequences play an important role in drug development research. Pyrrole-imidazole polyamides (PIPs) are a group of molecules that can bind to the DNA minor-groove and can be engineered to target specific sequences. Their biological effects rely primarily on their selective DNA binding. However, the binding mechanism of PIPs at the chromatinized genome level is poorly understood. Herein, we report a method using high-throughput sequencing to identify the DNA-alkylating sites of PIP-indole-seco-CBI conjugates. High-throughput sequencing analysis of conjugate 2 showed highly similar DNA-alkylating sites on synthetic oligos (histone-free DNA) and on human genomes (chromatinized DNA context). To our knowledge, this is the first report identifying alkylation sites across genomic DNA by alkylating PIP conjugates using high-throughput sequencing. PMID:27098039
A Case Study into Microbial Genome Assembly Gap Sequences and Finishing Strategies.
Utturkar, Sagar M; Klingeman, Dawn M; Hurt, Richard A; Brown, Steven D
2017-01-01
This study characterized regions of DNA which remained unassembled by either PacBio or Illumina sequencing technologies for seven bacterial genomes. Two genomes were manually finished using bioinformatics and PCR/Sanger sequencing approaches and regions not assembled by automated software were analyzed. Gaps present within Illumina assemblies mostly correspond to repetitive DNA regions such as multiple rRNA operon sequences. PacBio gap sequences were evaluated for several properties such as GC content, read coverage, gap length, ability to form strong secondary structures, and corresponding annotations. Our hypothesis that strong secondary DNA structures blocked DNA polymerases and contributed to gap sequences was not accepted. PacBio assemblies had few limitations overall and gaps were explained as the cumulative effect of lower than average sequence coverage and repetitive sequences at contig termini. An important aspect of the present study is the compilation of biological features that interfered with assembly, which included active transposons, multiple plasmid sequences, phage DNA integration, and large sequence duplication. Our targeted genome finishing approach and systematic evaluation of the unassembled DNA will be useful for others looking to close, finish, and polish microbial genome sequences.
Scalable whole-exome sequencing of cell-free DNA reveals high concordance with metastatic tumors.
Adalsteinsson, Viktor A; Ha, Gavin; Freeman, Samuel S; Choudhury, Atish D; Stover, Daniel G; Parsons, Heather A; Gydush, Gregory; Reed, Sarah C; Rotem, Denisse; Rhoades, Justin; Loginov, Denis; Livitz, Dimitri; Rosebrock, Daniel; Leshchiner, Ignaty; Kim, Jaegil; Stewart, Chip; Rosenberg, Mara; Francis, Joshua M; Zhang, Cheng-Zhong; Cohen, Ofir; Oh, Coyin; Ding, Huiming; Polak, Paz; Lloyd, Max; Mahmud, Sairah; Helvie, Karla; Merrill, Margaret S; Santiago, Rebecca A; O'Connor, Edward P; Jeong, Seong H; Leeson, Rachel; Barry, Rachel M; Kramkowski, Joseph F; Zhang, Zhenwei; Polacek, Laura; Lohr, Jens G; Schleicher, Molly; Lipscomb, Emily; Saltzman, Andrea; Oliver, Nelly M; Marini, Lori; Waks, Adrienne G; Harshman, Lauren C; Tolaney, Sara M; Van Allen, Eliezer M; Winer, Eric P; Lin, Nancy U; Nakabayashi, Mari; Taplin, Mary-Ellen; Johannessen, Cory M; Garraway, Levi A; Golub, Todd R; Boehm, Jesse S; Wagle, Nikhil; Getz, Gad; Love, J Christopher; Meyerson, Matthew
2017-11-06
Whole-exome sequencing of cell-free DNA (cfDNA) could enable comprehensive profiling of tumors from blood but the genome-wide concordance between cfDNA and tumor biopsies is uncertain. Here we report ichorCNA, software that quantifies tumor content in cfDNA from 0.1× coverage whole-genome sequencing data without prior knowledge of tumor mutations. We apply ichorCNA to 1439 blood samples from 520 patients with metastatic prostate or breast cancers. In the earliest tested sample for each patient, 34% of patients have ≥10% tumor-derived cfDNA, sufficient for standard coverage whole-exome sequencing. Using whole-exome sequencing, we validate the concordance of clonal somatic mutations (88%), copy number alterations (80%), mutational signatures, and neoantigens between cfDNA and matched tumor biopsies from 41 patients with ≥10% cfDNA tumor content. In summary, we provide methods to identify patients eligible for comprehensive cfDNA profiling, revealing its applicability to many patients, and demonstrate high concordance of cfDNA and metastatic tumor whole-exome sequencing.
An evolution based biosensor receptor DNA sequence generation algorithm.
Kim, Eungyeong; Lee, Malrey; Gatton, Thomas M; Lee, Jaewan; Zang, Yupeng
2010-01-01
A biosensor is composed of a bioreceptor, an associated recognition molecule, and a signal transducer that can selectively detect target substances for analysis. DNA based biosensors utilize receptor molecules that allow hybridization with the target analyte. However, most DNA biosensor research uses oligonucleotides as the target analytes and does not address the potential problems of real samples. The identification of recognition molecules suitable for real target analyte samples is an important step towards further development of DNA biosensors. This study examines the characteristics of DNA used as bioreceptors and proposes a hybrid evolution-based DNA sequence generating algorithm, based on DNA computing, to identify suitable DNA bioreceptor recognition molecules for stable hybridization with real target substances. The Traveling Salesman Problem (TSP) approach is applied in the proposed algorithm to evaluate the safety and fitness of the generated DNA sequences. This approach improves efficiency and stability for enhanced and variable-length DNA sequence generation and allows extension to generation of variable-length DNA sequences with diverse receptor recognition requirements.
RDNAnalyzer: A tool for DNA secondary structure prediction and sequence analysis
Afzal, Muhammad; Shahid, Ahmad Ali; Shehzadi, Abida; Nadeem, Shahid; Husnain, Tayyab
2012-01-01
RDNAnalyzer is an innovative computer based tool designed for DNA secondary structure prediction and sequence analysis. It can randomly generate a DNA sequence, or users can upload sequences of their own interest in RAW format. It uses and extends the Nussinov dynamic programming algorithm and has various applications in sequence analysis. It predicts the DNA secondary structure and base pairings. It also provides tools for sequence analyses routinely performed by biological scientists, such as DNA replication, reverse complement generation, transcription, translation, sequence-specific information such as the total number of nucleotide bases and ATGC base contents with their respective percentages, and a sequence cleaner. RDNAnalyzer is a unique tool developed in Microsoft Visual Studio 2008 using Microsoft Visual C# and Windows Presentation Foundation and provides a user-friendly environment for sequence analysis. It is freely available. Availability http://www.cemb.edu.pk/sw.html Abbreviations RDNAnalyzer - Random DNA Analyser, GUI - Graphical user interface, XAML - Extensible Application Markup Language. PMID:23055611
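As an illustration of the kind of computation described above, the following is a minimal Python sketch of the basic Nussinov recursion (maximizing the number of nested base pairs) together with reverse complement generation. It is not RDNAnalyzer's code, which is written in C# and extends the algorithm in ways the abstract does not specify; the function names and the `min_loop` parameter are assumptions made for this example.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string (A, C, G, T only)."""
    complement = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(complement[base] for base in reversed(seq))


def nussinov_max_pairs(seq: str, min_loop: int = 3) -> int:
    """Maximum number of nested Watson-Crick pairs in seq (basic Nussinov recursion).

    min_loop is the minimum number of unpaired bases enclosed by any pair
    (a hypothetical default chosen for this sketch).
    """
    pairs = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] holds the maximum number of pairs within seq[i..j], inclusive.
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            # Case 1: base j is left unpaired.
            score = best[i][j - 1]
            # Case 2: base j pairs with some k, leaving at least min_loop bases between.
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in pairs:
                    left = best[i][k - 1] if k > i else 0
                    inner = best[k + 1][j - 1]
                    score = max(score, left + 1 + inner)
            best[i][j] = score
    return best[0][n - 1]
```

For example, `nussinov_max_pairs("GGGAAACCC")` finds the three G-C pairs of a simple hairpin with an AAA loop. The recursion only counts pairs; recovering the actual pairing would need a traceback step that is omitted here.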
Structural and Thermodynamic Signatures of DNA Recognition by Mycobacterium tuberculosis DnaA
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tsodikov, Oleg V.; Biswas, Tapan
An essential protein, DnaA, binds to 9-bp DNA sites within the origin of replication oriC. These binding events are prerequisite to forming an enigmatic nucleoprotein scaffold that initiates replication. The number, sequences, positions, and orientations of these short DNA sites, or DnaA boxes, within the oriCs of different bacteria vary considerably. To investigate features of DnaA boxes that are important for binding Mycobacterium tuberculosis DnaA (MtDnaA), we have determined the crystal structures of the DNA binding domain (DBD) of MtDnaA bound to a cognate MtDnaA-box (at 2.0 Å resolution) and to a consensus Escherichia coli DnaA-box (at 2.3 Å). These structures, complemented by calorimetric equilibrium binding studies of MtDnaA DBD in a series of DnaA-box variants, reveal the main determinants of DNA recognition and establish the [T/C][T/A][G/A]TCCACA sequence as a high-affinity MtDnaA-box. Bioinformatic and calorimetric analyses indicate that DnaA-box sequences in mycobacterial oriCs generally differ from the optimal binding sequence. This sequence variation occurs commonly at the first 2 bp, making an in vivo mycobacterial DnaA-box effectively a 7-mer and not a 9-mer. We demonstrate that the decrease in the affinity of these MtDnaA-box variants for MtDnaA DBD relative to that of the highest-affinity box TTGTCCACA is less than 10-fold. The understanding of DnaA-box recognition by MtDnaA and E. coli DnaA enables one to map DnaA-box sequences in the genomes of M. tuberculosis and other eubacteria.
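The mapping of DnaA-box sequences mentioned at the end of this abstract can be sketched as a pattern search. The Python example below translates the reported high-affinity motif [T/C][T/A][G/A]TCCACA into a regular expression and scans both strands; the function names and the output format are assumptions for illustration, and a real analysis would also need to tolerate the variation at the first 2 bp that the abstract describes.

```python
import re

# The 9-bp high-affinity MtDnaA-box from the abstract, written as a regular expression.
# The lookahead lets overlapping sites be reported.
DNAA_BOX = re.compile(r"(?=([TC][TA][GA]TCCACA))")
BOX_LENGTH = 9


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_dnaa_boxes(seq: str):
    """Return sorted (start, strand, site) tuples for matches on either strand.

    start is the 0-based position of the site's leftmost base on the forward strand;
    site is always written 5'->3' on the strand where it matched.
    """
    seq = seq.upper()
    hits = [(m.start(), "+", m.group(1)) for m in DNAA_BOX.finditer(seq)]
    for m in DNAA_BOX.finditer(reverse_complement(seq)):
        forward_start = len(seq) - m.start() - BOX_LENGTH
        hits.append((forward_start, "-", m.group(1)))
    return sorted(hits)
```

Calling `find_dnaa_boxes("GGTTGTCCACAGG")` reports the highest-affinity box TTGTCCACA on the forward strand at position 2.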
Reiss, Rebecca A; Guerra, Peter; Makhnin, Oleg
2016-01-01
Chlorinated solvent contamination of potable water supplies is a serious problem worldwide. Biostimulation protocols can successfully remediate chlorinated solvent contamination through enhanced reductive dechlorination pathways; however, the process is poorly understood and sometimes stalls, creating a more serious problem. Whole metagenome techniques have the potential to reveal details of microbial community changes induced by biostimulation. Here we compare the metagenome of a tetrachloroethene contaminated Environmental Protection Agency Superfund Site before and after the application of biostimulation protocols. Environmental DNA was extracted from uncultured microbes that were harvested by on-site filtration of groundwater one month prior to and five months after the injection of emulsified vegetable oil, nutrients, and hydrogen gas bioamendments. Paired-end libraries were prepared for high-throughput DNA sequencing and 90 basepairs from both ends of randomly fragmented 400 basepair DNA fragments were sequenced. Over 31 million reads were annotated with Metagenome Rapid Annotation using Subsystem Technology, representing 32 prokaryotic phyla, 869 genera, and 3,181 species. A 3.6 log2-fold increase in biomass as measured by DNA yield per mL water was measured, but there was a 9% decrease in the number of genera detected post-remediation. We apply Bayesian statistical methods to assign false discovery rates to fold-change abundance data and use Zipf's power law to filter genera with low read counts. Plotting the log-rank against the log-fold-change facilitates the visualization of the changes in the community in response to the enhanced reductive dechlorination protocol. Members of the Archaea domain increased 4.7 log2-fold, dominated by methanogens. Prior to remediation, classes Alphaproteobacteria and Betaproteobacteria dominated the community but exhibit significant decreases five months after biostimulation. 
Geobacter and Sulfurospirillum replace "Sideroxydans" and Burkholderia as the most abundant genera. As a result of biostimulation, Deltaproteobacteria and Epsilonproteobacteria capable of dehalogenation, iron and sulfate reduction, and sulfur oxidation increase. Matches to thermophilic, haloalkane respiring archaea are evidence for additional species involved in biodegradation of chlorinated solvents. Additionally, potentially pathogenic bacteria increase, indicating that there may be unintended consequences of bioremediation.
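To make the log2 fold-change figures in this abstract concrete: a 3.6 log2-fold increase corresponds to roughly a 12-fold change (2^3.6), and 4.7 log2-fold to roughly 26-fold. The Python sketch below shows that conversion and one simple way to build the (log-rank, log-fold-change) pairs used for plotting. The pseudocount, the `min_reads` cutoff (a crude stand-in for the Zipf's-law filter, whose details the abstract does not give), and ranking by combined read count are all assumptions of this example, not the authors' procedure.

```python
import math


def fold_from_log2(log2_fold: float) -> float:
    """Convert a log2 fold change back to a plain fold change."""
    return 2.0 ** log2_fold


def log2_fold_change(before: float, after: float, pseudocount: float = 1.0) -> float:
    """log2 ratio of after to before; the pseudocount (assumed) avoids division by zero."""
    return math.log2((after + pseudocount) / (before + pseudocount))


def rank_and_fold(before: dict, after: dict, min_reads: int = 10):
    """Return (genus, log10 rank, log2 fold change) rows, most abundant genus first.

    Genera whose combined read count is below min_reads are dropped.
    """
    genera = set(before) | set(after)
    total = {g: before.get(g, 0) + after.get(g, 0) for g in genera}
    kept = sorted((g for g in genera if total[g] >= min_reads),
                  key=lambda g: (-total[g], g))
    return [(g, math.log10(rank), log2_fold_change(before.get(g, 0), after.get(g, 0)))
            for rank, g in enumerate(kept, start=1)]
```

With hypothetical counts such as `rank_and_fold({"genus_a": 100, "genus_b": 1}, {"genus_a": 50, "genus_b": 2, "genus_c": 300})`, the low-count genus is filtered out and the remaining rows can be passed directly to a scatter plot.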
deWaard, Jeremy R; Mitchell, Andrew; Keena, Melody A; Gopurenko, David; Boykin, Laura M; Armstrong, Karen F; Pogue, Michael G; Lima, Joao; Floyd, Robin; Hanner, Robert H; Humble, Leland M
2010-12-09
Detecting and controlling the movements of invasive species, such as insect pests, relies upon rapid and accurate species identification in order to initiate containment procedures by the appropriate authorities. Many species in the tussock moth genus Lymantria are significant forestry pests, including the gypsy moth Lymantria dispar L., and consequently have been a focus for the development of molecular diagnostic tools to assist in identifying species and source populations. In this study we expand the taxonomic and geographic coverage of the DNA barcode reference library, and further test the utility of this diagnostic method, both for species/subspecies assignment and for determination of geographic provenance of populations. Cytochrome oxidase I (COI) barcodes were obtained from 518 individuals and 36 species of Lymantria, including sequences assembled and generated from previous studies, vouchered material in public collections, and intercepted specimens obtained from surveillance programs in Canada. A maximum likelihood tree was constructed, revealing high bootstrap support for 90% of species clusters. Bayesian species assignment was also tested, and resulted in correct assignment to species and subspecies in all instances. The performance of barcoding was also compared against the commonly employed NB restriction digest system (also based on COI); while the latter is informative for discriminating gypsy moth subspecies, COI barcode sequences provide greater resolution and generality by encompassing a greater number of haplotypes across all Lymantria species, none shared between species. This study demonstrates the efficacy of DNA barcodes for diagnosing species of Lymantria and reinforces the view that the approach is an under-utilized resource with substantial potential for biosecurity and surveillance. 
Biomonitoring agencies currently employing the NB restriction digest system would gather more information by transitioning to the use of DNA barcoding, a change which could be made relatively seamlessly as the same gene region underlies both protocols.
Réblová, Martina; Hubka, Vit; Thureborn, Olle; Lundberg, Johannes; Sallstedt, Therese; Wedin, Mats; Ivarsson, Magnus
2016-01-01
Rock-inhabiting fungi harbour species-rich, poorly differentiated, extremophilic taxa of polyphyletic origin. Their closest relatives are often well-known species from various biotopes with significant pathogenic potential. Speleothems represent a unique rock-dwelling habitat, whose mycobiota are largely unexplored. Isolation of fungi from speleothem biofilm covering bare granite walls in the Kungsträdgården metro station in Stockholm yielded axenic cultures of two distinct black yeast morphotypes. Phylogenetic analyses of DNA sequences from six nuclear loci, ITS, nuc18S and nuc28S rDNA, rpb1, rpb2 and β-tubulin, support their placement in the Chaetothyriales (Ascomycota). They are described as a new genus Bacillicladium with the type species B. lobatum, and a new species Bradymyces graniticola. Bacillicladium is distantly related to the known five chaetothyrialean families and is unique in the Chaetothyriales by variable morphology showing hyphal, meristematic and yeast-like growth in vitro. The nearest relatives of Bacillicladium are recruited among fungi isolated from cardboard-like construction material produced by arboricolous non-attine ants. Their sister relationship is weakly supported by the Maximum likelihood analysis, but strongly supported by Bayesian inference. The genus Bradymyces is placed amidst members of the Trichomeriaceae and is ecologically undefined; it includes an opportunistic animal pathogen while two other species inhabit rock surfaces. ITS rDNA sequences of three species accepted in Bradymyces and other undescribed species and environmental samples were subjected to phylogenetic analysis and in-depth comparative analysis of ITS1 and ITS2 secondary structures in order to study their intraspecific variability. Compensatory base change criterion in the ITS2 secondary structure supported delimitation of species in Bradymyces, which manifest a limited number of phenotypic features useful for species recognition. 
The role of fungi in the speleothem biofilm and relationships of Bacillicladium and Bradymyces with other members of the Chaetothyriales are discussed. PMID:27732675
A Bayesian nonparametric method for prediction in EST analysis
Lijoi, Antonio; Mena, Ramsés H; Prünster, Igor
2007-01-01
Background Expressed sequence tags (ESTs) analyses are a fundamental tool for gene identification in organisms. Given a preliminary EST sample from a certain library, several statistical prediction problems arise. In particular, it is of interest to estimate how many new genes can be detected in a future EST sample of given size and also to determine the gene discovery rate: these estimates represent the basis for deciding whether to proceed sequencing the library and, in case of a positive decision, a guideline for selecting the size of the new sample. Such information is also useful for establishing sequencing efficiency in experimental design and for measuring the degree of redundancy of an EST library. Results In this work we propose a Bayesian nonparametric approach for tackling statistical problems related to EST surveys. In particular, we provide estimates for: a) the coverage, defined as the proportion of unique genes in the library represented in the given sample of reads; b) the number of new unique genes to be observed in a future sample; c) the discovery rate of new genes as a function of the future sample size. The Bayesian nonparametric model we adopt conveys, in a statistically rigorous way, the available information into prediction. Our proposal has appealing properties over frequentist nonparametric methods, which become unstable when prediction is required for large future samples. EST libraries, previously studied with frequentist methods, are analyzed in detail. Conclusion The Bayesian nonparametric approach we undertake yields valuable tools for gene capture and prediction in EST libraries. The estimators we obtain do not feature the kind of drawbacks associated with frequentist estimators and are reliable for any size of the additional sample. PMID:17868445
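The abstract does not name the prior behind its estimators. As a hedged illustration only, the Python sketch below assumes a two-parameter Poisson-Dirichlet (Pitman-Yor) model with parameters `sigma` and `theta`, under which the probability that the next read reveals a new gene, after n reads containing k distinct genes, is (theta + k*sigma) / (theta + n). The function names are hypothetical, and fitting `sigma` and `theta` to an actual EST sample is not shown.

```python
def discovery_probability(n: int, k: int, sigma: float, theta: float) -> float:
    """P(read n+1 is a new gene | n reads, k distinct genes) under the assumed model.

    Requires 0 <= sigma < 1 and theta > -sigma.
    """
    return (theta + k * sigma) / (theta + n)


def expected_new_genes(n: int, k: int, m: int, sigma: float, theta: float) -> float:
    """Expected number of new distinct genes in m further reads.

    Because the predictive probability is linear in the current number of genes,
    iterating it on the expected count gives the exact expectation.
    """
    expected_k = float(k)
    for i in range(m):
        expected_k += (theta + sigma * expected_k) / (theta + n + i)
    return expected_k - k


def discovery_rate_curve(n: int, k: int, m: int, sigma: float, theta: float):
    """Expected discovery probability after each of 0..m additional reads."""
    curve = []
    for extra in range(m + 1):
        k_expected = k + expected_new_genes(n, k, extra, sigma, theta)
        curve.append((theta + sigma * k_expected) / (theta + n + extra))
    return curve
```

The three functions correspond loosely to the quantities listed in the abstract: the chance that the next read is new, the number of new genes expected in a future sample, and the discovery rate as a function of future sample size.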
DNA barcode goes two-dimensions: DNA QR code web server.
Liu, Chang; Shi, Linchun; Xu, Xiaolan; Li, Huan; Xing, Hang; Liang, Dong; Jiang, Kun; Pang, Xiaohui; Song, Jingyuan; Chen, Shilin
2012-01-01
The DNA barcoding technology uses a standard region of DNA sequence for species identification and discovery. At present, "DNA barcode" actually refers to DNA sequences, which are not amenable to information storage, recognition, and retrieval. Our aim is to identify the best symbology that can represent DNA barcode sequences in practical applications. A comprehensive set of sequences for five DNA barcode markers, ITS2, rbcL, matK, psbA-trnH, and CO1, was used as the test data. Fifty-three different types of one-dimensional and ten two-dimensional barcode symbologies were compared based on different criteria, such as coding capacity, compression efficiency, and error detection ability. The quick response (QR) code was found to have the largest coding capacity and a relatively high compression ratio. To facilitate the further usage of QR code-based DNA barcodes, a web server was developed and is accessible at http://qrfordna.dnsalias.org. The web server allows users to retrieve the QR code for a species of interest, convert a DNA sequence to and from a QR code, and perform species identification based on local and global sequence similarities. In summary, the first comprehensive evaluation of various barcode symbologies has been carried out. The QR code has been found to be the most appropriate symbology for DNA barcode sequences. A web server has also been constructed to allow biologists to utilize QR codes in practical DNA barcoding applications.
TaxI: a software tool for DNA barcoding using distance methods
Steinke, Dirk; Vences, Miguel; Salzburger, Walter; Meyer, Axel
2005-01-01
DNA barcoding is a promising approach to the diagnosis of biological diversity in which DNA sequences serve as the primary key for information retrieval. Most existing software for evolutionary analysis of DNA sequences was designed for phylogenetic analyses and, hence, those algorithms do not offer appropriate solutions for the rapid, but precise analyses needed for DNA barcoding, and are also unable to process the often large comparative datasets. We developed a flexible software tool for DNA taxonomy, named TaxI. This program calculates sequence divergences between a query sequence (taxon to be barcoded) and each sequence of a dataset of reference sequences defined by the user. Because the analysis is based on separate pairwise alignments this software is also able to work with sequences characterized by multiple insertions and deletions that are difficult to align in large sequence sets (i.e. thousands of sequences) by multiple alignment algorithms because of computational restrictions. Here, we demonstrate the utility of this approach with two datasets of fish larvae and juveniles from Lake Constance and juvenile land snails under different models of sequence evolution. Sets of ribosomal 16S rRNA sequences, characterized by multiple indels, performed as good as or better than cox1 sequence sets in assigning sequences to species, demonstrating the suitability of rRNA genes for DNA barcoding. PMID:16214755
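The approach described above, comparing a query against each reference through separate pairwise alignments, can be sketched in Python as follows. This is not TaxI's implementation: the Needleman-Wunsch scoring values, the uncorrected p-distance (the abstract mentions several models of sequence evolution), and all function names are assumptions chosen to keep the example short.

```python
def global_align(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -2):
    """Needleman-Wunsch global alignment; returns the two gapped strings."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + (
                match if a[i - 1] == b[j - 1] else mismatch):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def p_distance(aligned_a: str, aligned_b: str) -> float:
    """Proportion of differing sites, ignoring columns that contain a gap."""
    compared = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != "-" and y != "-"]
    if not compared:
        raise ValueError("no ungapped columns to compare")
    return sum(x != y for x, y in compared) / len(compared)


def rank_references(query: str, references: dict):
    """Return (distance, name) pairs sorted from the closest reference to the farthest."""
    return sorted((p_distance(*global_align(query, seq)), name)
                  for name, seq in references.items())
```

Because each reference is aligned to the query independently, indel-rich markers such as 16S rRNA never require a multiple alignment; the cost is one quadratic-time alignment per reference sequence.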
Tabor, Stanley; Richardson, Charles C.
1995-04-25
A method for sequencing a strand of DNA, including the steps of: providing the strand of DNA; annealing the strand with a primer able to hybridize to the strand to give an annealed mixture; incubating the mixture with four deoxyribonucleoside triphosphates, a DNA polymerase, and at least three deoxyribonucleoside triphosphates in different amounts, under conditions favoring primer extension to form nucleic acid fragments complementary to the DNA to be sequenced; labelling the nucleic acid fragments; separating them and determining the position of the deoxyribonucleoside triphosphates by differences in the intensity of the labels, thereby to determine the DNA sequence.
Kim, Joo-Hwan; Kim, Dong-Kap; Forest, Felix; Fay, Michael F.; Chase, Mark W.
2010-01-01
Background Previous phylogenetic studies of Asparagales, although extensive and generally well supported, have left several sets of taxa unclearly placed and have not addressed all relationships within certain clades thoroughly (some clades were relatively sparsely sampled). One of the most important of these is sampling within and placement of Nolinoideae (Ruscaceae s.l.) of Asparagaceae sensu Angiosperm Phylogeny Group (APG) III, which subfamily includes taxa previously referred to Convallariaceae, Dracaenaceae, Eriospermaceae, Nolinaceae and Ruscaceae. Methods A phylogenetic analysis of a combined data set for 126 taxa of Ruscaceae s.l. and related groups in Asparagales based on three nuclear and plastid DNA coding genes, 18S rDNA (1796 bp), rbcL (1338 bp) and matK (1668 bp), representing a total of approx. 4·8 kb is presented. Parsimony and Bayesian inference analyses were conducted to elucidate relationships of Ruscaceae s.l. and related groups, and parsimony bootstrap analysis was performed to assess support of clades. Key Results The combination of the three genes results in the most highly resolved and strongly supported topology yet obtained for Asparagales including Ruscaceae s.l. Asparagales relationships are nearly congruent with previous combined gene analyses, which were reflected in the APG III classification. Parsimony and Bayesian analyses yield identical relationships except for some slight variation among the core asparagoid families, which nevertheless form a strongly supported group in both types of analyses. In core asparagoids, five major clades are identified: (1) Alliaceae s.l. (sensu APG III, Amaryllidaceae–Agapanthaceae–Alliaceae); (2) Asparagaceae–Laxmanniaceae–Ruscaceae s.l.; (3) Themidaceae; (4) Hyacinthaceae; (5) Anemarrhenaceae–Behniaceae–Herreriaceae–Agavaceae (clades 2–5 collectively Asparagaceae s.l. sensu APG III). 
The position of Aphyllanthes is labile, but it is sister to Themidaceae in the combined maximum-parsimony tree and sister to Anemarrhenaceae in the Bayesian analysis. The highly supported clade of Xanthorrhoeaceae s.l. (sensu APG III, including Asphodelaceae and Hemerocallidaceae) is sister to the core asparagoids. Ruscaceae s.l. are a well-supported group. Asparagaceae s.s. are sister to Ruscaceae s.l., even though the clade of the two families is weakly supported; Laxmanniaceae are strongly supported as sister to Ruscaceae s.l. and Asparagaceae. Ruscaceae s.l. include six principal clades that often reflect previously named groups: (1) tribe Polygonateae (excluding Disporopsis); (2) tribe Ophiopogoneae; (3) tribe Convallarieae (excluding Theropogon); (4) Ruscaceae s.s. + Dracaenaceae + Theropogon + Disporopsis + Comospermum; (5) Nolinaceae, (6) Eriospermum. Conclusions The analyses here were largely conducted with new data collected for the same loci as in previous studies, but in this case from different species/DNA accessions and greater sampling in many cases than in previously published analyses; nonetheless, the results largely mirror those of previously conducted studies. This demonstrates the robustness of these results and answers questions often raised about reproducibility of DNA results, given the often sparse sampling of taxa in some studies, particularly the earliest ones. The results also provide a clear set of patterns on which to base a new classification of the subfamilies of Asparagaceae s.l., particularly Ruscaceae s.l. (= Nolinoideae of Asparagaceae s.l.), and examine other putatively important characters of Asparagales. PMID:20929900
Kukita, Yoji; Matoba, Ryo; Uchida, Junji; Hamakawa, Takuya; Doki, Yuichiro; Imamura, Fumio; Kato, Kikuya
2015-08-01
Circulating tumour DNA (ctDNA) is an emerging field of cancer research. However, current ctDNA analysis is usually restricted to one or a few mutation sites due to technical limitations. In the case of massively parallel DNA sequencers, the number of false positives caused by a high read error rate is a major problem. In addition, the final sequence reads do not represent the original DNA population due to the global amplification step during the template preparation. We established a high-fidelity target sequencing system of individual molecules identified in plasma cell-free DNA using barcode sequences; this system consists of the following two steps. (i) A novel target sequencing method that adds barcode sequences by adaptor ligation. This method uses linear amplification to eliminate the errors introduced during the early cycles of polymerase chain reaction. (ii) The monitoring and removal of erroneous barcode tags. This process involves the identification of individual molecules that have been sequenced and for which the number of mutations has been absolutely quantitated. Using plasma cell-free DNA from patients with gastric or lung cancer, we demonstrated that the system achieved near complete elimination of false positives and enabled de novo detection and absolute quantitation of mutations in plasma cell-free DNA. © The Author 2015. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
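The general idea of using barcode sequences to identify individual molecules can be illustrated with a short Python sketch that groups reads by barcode and calls a per-family consensus, so that errors appearing in only a minority of a family's reads are suppressed. This is a generic illustration and not the authors' pipeline: their handling of erroneous barcode tags is not reproduced, and the thresholds `min_family_size` and `min_agreement` are hypothetical.

```python
from collections import Counter, defaultdict


def consensus_by_barcode(reads, min_family_size: int = 3, min_agreement: float = 0.9):
    """Collapse reads that share a barcode into one consensus sequence per molecule.

    reads: iterable of (barcode, sequence) pairs, sequences already trimmed to the
    same target region. Positions where the most common base falls below
    min_agreement are reported as "N".
    """
    families = defaultdict(list)
    for barcode, seq in reads:
        families[barcode].append(seq)

    consensus = {}
    for barcode, seqs in families.items():
        if len(seqs) < min_family_size:
            continue  # too few reads to separate real variants from read errors
        if any(len(s) != len(seqs[0]) for s in seqs):
            continue  # this sketch skips families with reads of unequal length
        bases = []
        for column in zip(*seqs):
            base, count = Counter(column).most_common(1)[0]
            bases.append(base if count / len(seqs) >= min_agreement else "N")
        consensus[barcode] = "".join(bases)
    return consensus


def count_mutant_molecules(consensus: dict, position: int, mutant_base: str) -> int:
    """Count consensus molecules carrying mutant_base at a 0-based position."""
    return sum(1 for seq in consensus.values() if seq[position] == mutant_base)
```

Counting consensus sequences rather than raw reads is what makes the quantitation absolute: each surviving barcode family stands for one original DNA molecule, regardless of how many times it was amplified.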
Aguilar, William; Paz, Manuel M; Vargas, Anayatzinc; Clement, Cristina C; Cheng, Shu-Yuan; Champeil, Elise
2018-04-20
Mitomycin C (MC), a potent antitumor drug, and decarbamoylmitomycin C (DMC), a derivative lacking the carbamoyl group, form highly cytotoxic DNA interstrand crosslinks. The major interstrand crosslink formed by DMC is the C1'' epimer of the major crosslink formed by MC. The molecular basis for the stereochemical configuration exhibited by DMC was investigated using biomimetic synthesis. The formation of DNA-DNA crosslinks by DMC is diastereospecific and diastereodivergent: Only the 1''S-diastereomer of the initially formed monoadduct can form crosslinks at GpC sequences, and only the 1''R-diastereomer of the monoadduct can form crosslinks at CpG sequences. We also show that CpG and GpC sequences react with divergent diastereoselectivity in the first alkylation step: 1''S stereochemistry is favored at GpC sequences and 1''R stereochemistry is favored at CpG sequences. Therefore, the first alkylation step results, at each sequence, in the selective formation of the diastereomer able to generate an interstrand DNA-DNA crosslink after the "second arm" alkylation. Examination of the known DNA adduct pattern obtained after treatment of cancer cell cultures with DMC indicates that the GpC sequence is the major target for the formation of DNA-DNA crosslinks in vivo by this drug. © 2018 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
Sproul, John S; Maddison, David R
2017-11-01
Despite advances that allow DNA sequencing of old museum specimens, sequencing small-bodied, historical specimens can be challenging and unreliable as many contain only small amounts of fragmented DNA. Dependable methods to sequence such specimens are especially critical if the specimens are unique. We attempt to sequence small-bodied (3-6 mm) historical specimens (including nomenclatural types) of beetles that have been housed, dried, in museums for 58-159 years, and for which few or no suitable replacement specimens exist. To better understand ideal approaches of sample preparation and produce preparation guidelines, we compared different library preparation protocols using low amounts of input DNA (1-10 ng). We also explored low-cost optimizations designed to improve library preparation efficiency and sequencing success of historical specimens with minimal DNA, such as enzymatic repair of DNA. We report successful sample preparation and sequencing for all historical specimens despite our low-input DNA approach. We provide a list of guidelines related to DNA repair, bead handling, reducing adapter dimers and library amplification. We present these guidelines to facilitate more economical use of valuable DNA and enable more consistent results in projects that aim to sequence challenging, irreplaceable historical specimens. © 2017 John Wiley & Sons Ltd.
Mohammed, Monzoorul Haque; Ghosh, Tarini Shankar; Chadaram, Sudha; Mande, Sharmila S
2011-11-30
Obtaining accurate estimates of microbial diversity using rDNA profiling is the first step in most metagenomics projects. Consequently, most metagenomic projects spend considerable amounts of time, money and manpower for experimentally cloning, amplifying and sequencing the rDNA content in a metagenomic sample. In the second step, the entire genomic content of the metagenome is extracted, sequenced and analyzed. Since DNA sequences obtained in this second step also contain rDNA fragments, rapid in silico identification of these rDNA fragments would drastically reduce the cost, time and effort of current metagenomic projects by entirely bypassing the experimental steps of primer based rDNA amplification, cloning and sequencing. In this study, we present an algorithm called i-rDNA that can facilitate the rapid detection of 16S rDNA fragments from amongst millions of sequences in metagenomic data sets with high detection sensitivity. Performance evaluation with data sets/database variants simulating typical metagenomic scenarios indicates the significantly high detection sensitivity of i-rDNA. Moreover, i-rDNA can process a million sequences in less than an hour on a simple desktop with modest hardware specifications. In addition to the speed of execution, high sensitivity and low false positive rate, the utility of the algorithmic approach discussed in this paper is immense given that it would help in bypassing the entire experimental step of primer-based rDNA amplification, cloning and sequencing. Application of this algorithmic approach would thus drastically reduce the cost, time and human efforts invested in all metagenomic projects. A web-server for the i-rDNA algorithm is available at http://metagenomics.atc.tcs.com/i-rDNA/
Biosensors for DNA sequence detection
NASA Technical Reports Server (NTRS)
Vercoutere, Wenonah; Akeson, Mark
2002-01-01
DNA biosensors are being developed as alternatives to conventional DNA microarrays. These devices couple signal transduction directly to sequence recognition. Some of the most sensitive and functional technologies use fibre optics or electrochemical sensors in combination with DNA hybridization. In a shift from sequence recognition by hybridization, two emerging single-molecule techniques read sequence composition using zero-mode waveguides or electrical impedance in nanoscale pores.
Thomas, W. Kelley; Vida, J. T.; Frisse, Linda M.; Mundo, Manuel; Baldwin, James G.
1997-01-01
To effectively integrate DNA sequence analysis and classical nematode taxonomy, we must be able to obtain DNA sequences from formalin-fixed specimens. Microdissected sections of nematodes were removed from specimens fixed in formalin, using standard protocols and without destroying morphological features. The fixed sections provided sufficient template for multiple polymerase chain reaction-based DNA sequence analyses. PMID:19274156
Star, Bastiaan; Nederbragt, Alexander J.; Hansen, Marianne H. S.; Skage, Morten; Gilfillan, Gregor D.; Bradbury, Ian R.; Pampoulie, Christophe; Stenseth, Nils Chr; Jakobsen, Kjetill S.; Jentoft, Sissel
2014-01-01
Degradation-specific processes and variation in laboratory protocols can bias the DNA sequence composition from samples of ancient or historic origin. Here, we identify a novel artifact in sequences from historic samples of Atlantic cod (Gadus morhua), which forms interrupted palindromes consisting of reverse complementary sequence at the 5′ and 3′-ends of sequencing reads. The palindromic sequences themselves have specific properties – the bases at the 5′-end align well to the reference genome, whereas extensive misalignments exist among the bases at the terminal 3′-end. The terminal 3′ bases are artificial extensions likely caused by the occurrence of hairpin loops in single-stranded DNA (ssDNA), which can be ligated and amplified in particular library creation protocols. We propose that such hairpin loops allow the inclusion of erroneous nucleotides, specifically at the 3′-end of DNA strands, with the 5′-end of the same strand providing the template. We also find these palindromes in previously published ancient DNA (aDNA) datasets, albeit at varying and substantially lower frequencies. This artifact can negatively affect the yield of endogenous DNA in these types of samples and introduces sequence bias. PMID:24608104
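An interrupted palindrome of the kind described, where the 3′-end of a read is the reverse complement of its own 5′-end, can be screened for with a short check. The sketch below is an illustration only, not the detection procedure used in the study; the function names and the `min_len` cutoff are assumptions.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case A/C/G/T string."""
    return seq.translate(_COMPLEMENT)[::-1]


def terminal_palindrome_length(read, min_len=4):
    """Length of the longest 3' tail that is the reverse complement of the
    5' head of the same read, or 0 if no tail of at least `min_len` bases
    qualifies. `min_len` is an arbitrary illustrative cutoff.
    """
    best = 0
    for k in range(min_len, len(read) // 2 + 1):
        if read[-k:] == reverse_complement(read[:k]):
            best = k
    return best


# 5' head "AACCGG", a spacer, then the reverse complement of the head.
artifact_read = "AACCGG" + "TTTTT" + reverse_complement("AACCGG")
print(terminal_palindrome_length(artifact_read))
print(terminal_palindrome_length("ACGTACGTAAAA"))
```

A real screen would also need to tolerate mismatches and consider base qualities, which this exact-match version ignores.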
Yamada, Kazuhiko; Nishida-Umehara, Chizuko; Matsuda, Yoichi
2004-03-01
We isolated a new family of satellite DNA sequences from HaeIII- and EcoRI-digested genomic DNA of the Blakiston's fish owl (Ketupa blakistoni). The repetitive sequences were organized in tandem arrays of the 174 bp element, and localized to the centromeric regions of all macrochromosomes, including the Z and W chromosomes, and microchromosomes. This hybridization pattern was consistent with the distribution of C-band-positive centromeric heterochromatin, and the satellite DNA sequences occupied 10% of the total genome as a major component of centromeric heterochromatin. The sequences were homogenized between macro- and microchromosomes in this species, and therefore intraspecific divergence of the nucleotide sequences was low. The 174 bp element cross-hybridized to the genomic DNA of six other Strigidae species, but not to that of the Tytonidae, suggesting that the satellite DNA sequences are conserved in the same family but fairly divergent between the different families in the Strigiformes. Secondly, the centromeric satellite DNAs were cloned from eight Strigidae species, and the nucleotide sequences of 41 monomer fragments were compared within and between species. Molecular phylogenetic relationships of the nucleotide sequences were highly correlated with both the taxonomy based on morphological traits and the phylogenetic tree constructed by DNA-DNA hybridization. These results suggest that the satellite DNA sequence has evolved by concerted evolution in the Strigidae and that it is a good taxonomic and phylogenetic marker to examine genetic diversity between Strigiformes species.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sobottka, Marcelo, E-mail: sobottka@mtm.ufsc.br; Hart, Andrew G., E-mail: ahart@dim.uchile.cl
Highlights: We propose a simple stochastic model to construct primitive DNA sequences. The model provides an explanation for Chargaff's second parity rule in primitive DNA sequences. The model is also used to predict a novel type of strand symmetry in primitive DNA sequences. We extend the results to bacterial DNA sequences and compare distributional properties intrinsic to the model with statistical estimates from 1049 bacterial genomes. We find statistical evidence that the novel type of strand symmetry holds for bacterial DNA sequences. Abstract: Chargaff's second parity rule for short oligonucleotides states that the frequency of any short nucleotide sequence on a strand is approximately equal to the frequency of its reverse complement on the same strand. Recent studies have shown that, with the exception of organellar DNA, this parity rule generally holds for double-stranded DNA genomes and fails to hold for single-stranded genomes. While Chargaff's first parity rule is fully explained by the Watson-Crick pairing in the DNA double helix, a definitive explanation for the second parity rule has not yet been determined. In this work, we propose a model based on a hidden Markov process for approximating the distributional structure of primitive DNA sequences. Then, we use the model to provide another possible theoretical explanation for Chargaff's second parity rule, and to predict novel distributional aspects of bacterial DNA sequences.
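Chargaff's second parity rule, as stated above, compares the frequency of each short oligonucleotide with that of its reverse complement on the same strand. The following sketch measures how far a single strand departs from that rule; it is not the hidden Markov model proposed by the authors, and the statistic `parity_deviation` is a hypothetical summary chosen for illustration.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case A/C/G/T string."""
    return seq.translate(_COMPLEMENT)[::-1]


def kmer_counts(seq, k):
    """Count all overlapping k-mers on the given strand."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def parity_deviation(seq, k):
    """Illustrative statistic in [0, 1]: 0 means every k-mer occurs exactly
    as often as its reverse complement on this strand, 1 means no k-mer's
    reverse complement occurs at all.
    """
    counts = kmer_counts(seq, k)
    total = sum(counts.values())
    kmers = set(counts) | {reverse_complement(w) for w in counts}
    diff = sum(abs(counts[w] - counts[reverse_complement(w)]) for w in kmers)
    return diff / (2 * total)


print(parity_deviation("ACGT", 1))  # A matches T, C matches G
print(parity_deviation("AAAA", 1))  # only A present, no T
```

On real genomes the deviation would be evaluated for several values of k and compared against a null model, which this sketch leaves out.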
Xu, Ning; Kwon, Soonil; Abbott, David H; Geller, David H; Dumesic, Daniel A; Azziz, Ricardo; Guo, Xiuqing; Goodarzi, Mark O
2011-01-01
The pathogenesis of polycystic ovary syndrome (PCOS) is poorly understood. PCOS-like phenotypes are produced by prenatal androgenization (PA) of female rhesus monkeys. We hypothesize that perturbation of the epigenome, through altered DNA methylation, is one of the mechanisms whereby PA reprograms monkeys to develop PCOS. Infant and adult visceral adipose tissues (VAT) harvested from 15 PA and 10 control monkeys were studied. Bisulfite treated samples were subjected to genome-wide CpG methylation analysis, designed to simultaneously measure methylation levels at 27,578 CpG sites. Analysis was carried out using Bayesian Classification with Singular Value Decomposition (BCSVD), testing all probes simultaneously in a single test. Stringent criteria were then applied to filter out invalid probes due to sequence dissimilarities between human probes and monkey DNA, and then mapped to the rhesus genome. This yielded differentially methylated loci between PA and control monkeys, 163 in infant VAT, and 325 in adult VAT (BCSVD P<0.05). Among these two sets of genes, we identified several significant pathways, including the antiproliferative role of TOB in T cell signaling and transforming growth factor-β (TGF-β) signaling. Our results suggest PA may modify DNA methylation patterns in both infant and adult VAT. This pilot study suggests that excess fetal androgen exposure in female nonhuman primates may predispose to PCOS via alteration of the epigenome, providing a novel avenue to understand PCOS in humans.
Guo, Guo-Ye; Chen, Fang; Shi, Xiao-Dong; Tian, Yin-Shuai; Yu, Mao-Qun; Han, Xue-Qin; Yuan, Li-Chun; Zhang, Ying
2016-01-01
Genetic variation and phylogenetic relationships among 102 Jatropha curcas accessions from Asia, Africa, and the Americas were assessed using the internal transcribed spacer region of nuclear ribosomal DNA (nrDNA ITS). The average G+C content (65.04%) was considerably higher than the A+T (34.96%) content. The estimated genetic diversity revealed moderate genetic variation. The pairwise genetic divergences (GD) between haplotypes were evaluated and ranged from 0.000 to 0.017, suggesting a higher level of genetic differentiation in Mexican accessions than those of other regions. Phylogenetic relationships and intraspecific divergence were inferred by Bayesian inference (BI), maximum parsimony (MP), and median joining (MJ) network analysis and were generally resolved. The J. curcas accessions were consistently divided into three lineages, groups A, B, and C, which demonstrated distant geographical isolation and genetic divergence between American accessions and those from other regions. The MJ network analysis confirmed that Central America was the possible center of origin. The putative migration route suggested that J. curcas was distributed from Mexico or Brazil, via Cape Verde and then split into two routes. One route was dispersed to Spain, then migrated to China, eventually spreading to southeastern Asia, while the other route was dispersed to Africa, via Madagascar and migrated to China, later spreading to southeastern Asia. Copyright © 2016 Académie des sciences. Published by Elsevier SAS. All rights reserved.
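Two of the summary quantities quoted above, G+C content and pairwise divergence between aligned haplotypes, are simple to compute. The sketch below assumes an uncorrected proportion of differing sites (p-distance); the abstract does not say which distance measure the authors used, and the function names are hypothetical.

```python
def gc_content(seq):
    """Fraction of G and C bases in an upper-case nucleotide string."""
    return sum(base in "GC" for base in seq) / len(seq)


def p_distance(seq_a, seq_b):
    """Uncorrected pairwise divergence: the proportion of aligned sites
    that differ. Both sequences must already be aligned to equal length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(a != b for a, b in zip(seq_a, seq_b))
    return differences / len(seq_a)


print(gc_content("GGCCAT"))
print(p_distance("ACGT", "ACGA"))
```

Model-based distances correct for multiple substitutions at a site and would give slightly larger values than the p-distance at higher divergence.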
Rose, Emily; Masonjones, Heather D; Jones, Adam G
2016-11-01
Isolated populations provide special opportunities to study local adaptation and incipient speciation. In some cases, however, morphological evolution can obscure the taxonomic status of recently founded populations. Here, we use molecular markers to show that an anchialine-lake-restricted population of seahorses, originally identified as Hippocampus reidi, appears on the basis of DNA data to be Hippocampus erectus. We collected seahorses from Sweetings Pond, on Eleuthera Island, Bahamas, during the summer of 2014. We measured morphological traits and sequenced 2 genes, cytochrome b and ribosomal protein S7, from 19 seahorses in our sample. On the basis of morphology, Sweetings Pond seahorses could not be assigned definitively to either of the 2 species of seahorse, H. reidi and H. erectus, that occur in marine waters surrounding the Bahamas. However, our DNA-based phylogenetic analysis showed that the Sweetings Pond fish were firmly nested within the H. erectus clade with a Bayesian posterior probability greater than 0.99. Thus, Sweetings Pond seahorses most recently shared a common ancestor with H. erectus populations from the Western Atlantic. Interestingly, the seahorses from Sweetings Pond differ morphologically from other marine populations of H. erectus in having a more even torso to tail length ratio. The substantial habitat differences between Sweetings Pond and the surrounding coastal habitat make Sweetings Pond seahorses particularly interesting from the perspectives of conservation, local adaptation, and incipient speciation. © The American Genetic Association 2016. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
A Simulation of DNA Sequencing Utilizing 3M Post-It[R] Notes
ERIC Educational Resources Information Center
Christensen, Doug
2009-01-01
An inexpensive and equipment-free approach to teaching the technical aspects of DNA sequencing. The activity described requires an instructor with a familiarity of DNA sequencing technology but provides a straightforward method of teaching the technical aspects of sequencing in the absence of expensive sequencing equipment. The final sequence…
Lee, James W.; Thundat, Thomas G.
2005-06-14
An apparatus and method for performing nucleic acid (DNA and/or RNA) sequencing on a single molecule. The genetic sequence information is obtained by probing through a DNA or RNA molecule base by base at nanometer scale as though looking through a strip of movie film. This DNA sequencing nanotechnology has the theoretical capability of performing DNA sequencing at a maximal rate of about 1,000,000 bases per second. This enhanced performance is made possible by a series of innovations including: novel applications of a fine-tuned nanometer gap for passage of a single DNA or RNA molecule; thin layer microfluidics for sample loading and delivery; and programmable electric fields for precise control of DNA or RNA movement. Detection methods include nanoelectrode-gated tunneling current measurements, dielectric molecular characterization, and atomic force microscopy/electrostatic force microscopy (AFM/EFM) probing for nanoscale reading of the nucleic acid sequences.
The sequence specificity of UV-induced DNA damage in a systematically altered DNA sequence.
Khoe, Clairine V; Chung, Long H; Murray, Vincent
2018-06-01
The sequence specificity of UV-induced DNA damage was investigated in a specifically designed DNA plasmid using two procedures: end-labelling and linear amplification. Absorption of UV photons by DNA leads to dimerisation of pyrimidine bases and produces two major photoproducts, cyclobutane pyrimidine dimers (CPDs) and pyrimidine(6-4)pyrimidone photoproducts (6-4PPs). A previous study had determined that two hexanucleotide sequences, 5'-GCTC*AC and 5'-TATT*AA, were high intensity UV-induced DNA damage sites. The UV clone plasmid was constructed by systematically altering each nucleotide of these two hexanucleotide sequences. One of the main goals of this study was to determine the influence of single nucleotide alterations on the intensity of UV-induced DNA damage. The sequence 5'-GCTC*AC was designed to examine the sequence specificity of 6-4PPs and the highest intensity 6-4PP damage sites were found at 5'-GTTC*CC nucleotides. The sequence 5'-TATT*AA was devised to investigate the sequence specificity of CPDs and the highest intensity CPD damage sites were found at 5'-TTTT*CG nucleotides. It was proposed that the tetranucleotide DNA sequence, 5'-YTC*Y (where Y is T or C), was the consensus sequence for the highest intensity UV-induced 6-4PP adduct sites; while it was 5'-YTT*C for the highest intensity UV-induced CPD damage sites. These consensus tetranucleotides are composed entirely of consecutive pyrimidines and must have a DNA conformation that is highly productive for the absorption of UV photons. Crown Copyright © 2018. Published by Elsevier B.V. All rights reserved.
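The proposed consensus tetranucleotides, 5'-YTC*Y for 6-4PPs and 5'-YTT*C for CPDs (Y is T or C), can be located in a sequence with a small pattern search. This is only a sketch of how such a consensus might be scanned for; the function name and the abbreviated IUPAC table are assumptions, and a match says nothing about actual damage intensity.

```python
import re

# Subset of IUPAC nucleotide codes, sufficient for the motifs above.
_IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
          "Y": "[CT]", "R": "[AG]", "N": "[ACGT]"}


def find_motif_sites(seq, motif):
    """Return 0-based start positions of every (possibly overlapping)
    occurrence of an IUPAC motif in an upper-case DNA string.
    """
    body = "".join(_IUPAC[base] for base in motif)
    # A lookahead lets overlapping occurrences be reported separately.
    pattern = re.compile("(?=(" + body + "))")
    return [match.start() for match in pattern.finditer(seq)]


sequence = "GTTCCCTTCT"
print(find_motif_sites(sequence, "YTCY"))  # candidate 6-4PP consensus sites
print(find_motif_sites(sequence, "YTTC"))  # candidate CPD consensus sites
```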
Torque measurements reveal sequence-specific cooperative transitions in supercoiled DNA
Oberstrass, Florian C.; Fernandes, Louis E.; Bryant, Zev
2012-01-01
B-DNA becomes unstable under superhelical stress and is able to adopt a wide range of alternative conformations including strand-separated DNA and Z-DNA. Localized sequence-dependent structural transitions are important for the regulation of biological processes such as DNA replication and transcription. To directly probe the effect of sequence on structural transitions driven by torque, we have measured the torsional response of a panel of DNA sequences using single molecule assays that employ nanosphere rotational probes to achieve high torque resolution. The responses of Z-forming d(pGpC)n sequences match our predictions based on a theoretical treatment of cooperative transitions in helical polymers. “Bubble” templates containing 50–100 bp mismatch regions show cooperative structural transitions similar to B-DNA, although less torque is required to disrupt strand–strand interactions. Our mechanical measurements, including direct characterization of the torsional rigidity of strand-separated DNA, establish a framework for quantitative predictions of the complex torsional response of arbitrary sequences in their biological context. PMID:22474350
NASA Astrophysics Data System (ADS)
Yang, Hong
Until recently, recovery and analysis of genetic information encoded in ancient DNA sequences from Pleistocene fossils were impossible. Recent advances in molecular biology offered technical tools to obtain ancient DNA sequences from well-preserved Quaternary fossils and opened the possibilities to directly study genetic changes in fossil species to address various biological and paleontological questions. Ancient DNA studies involving Pleistocene fossil material and ancient DNA degradation and preservation in Quaternary deposits are reviewed. The molecular technology applied to isolate, amplify, and sequence ancient DNA is also presented. Authentication of ancient DNA sequences and technical problems associated with modern and ancient DNA contamination are discussed. As illustrated in recent studies on ancient DNA from proboscideans, it is apparent that fossil DNA sequence data can shed light on many aspects of Quaternary research such as systematics and phylogeny, conservation biology, evolutionary theory, molecular taphonomy, and forensic sciences. Improvement of molecular techniques and a better understanding of DNA degradation during fossilization are likely to build on current strengths and to overcome existing problems, making fossil DNA data a unique source of information for Quaternary scientists.
Enantiospecific recognition of DNA sequences by a proflavine Tröger base.
Bailly, C; Laine, W; Demeunynck, M; Lhomme, J
2000-07-05
The DNA interaction of a chiral Tröger base derived from proflavine was investigated by DNA melting temperature measurements and complementary biochemical assays. DNase I footprinting experiments demonstrate that the binding of the proflavine-based Tröger base is both enantio- and sequence-specific. The (+)-isomer poorly interacts with DNA in a non-sequence-selective fashion. In sharp contrast, the corresponding (-)-isomer recognizes preferentially certain DNA sequences containing both A·T and G·C base pairs, such as the motifs 5'-GTT·AAC and 5'-ATGA·TCAT. This is the first experimental demonstration that acridine-type Tröger bases can be used for enantiospecific recognition of DNA sequences. Copyright 2000 Academic Press.
NASA Astrophysics Data System (ADS)
Peng, Jun; Ling, Jian; Zhang, Xiu-Qing; Bai, Hui-Ping; Zheng, Liyan; Cao, Qiu-E.; Ding, Zhong-Tao
2015-02-01
In this work, we designed a new fluorescent oligonucleotide-stabilized silver nanocluster (DNA/AgNCs) probe for sensitive detection of mercury and copper ions. This probe contains two tailored DNA sequences. One is a signal probe that contains a cytosine-rich sequence template for AgNCs synthesis and a link sequence at both ends. The other is a guanine-rich sequence for signal enhancement with a link sequence complementary to the link sequence of the signal probe. After hybridization, the fluorescence of the hybridized double-strand DNA/AgNCs is enhanced 200-fold, based on the fluorescence enhancement effect of DNA/AgNCs in proximity to a guanine-rich DNA sequence. The double-strand DNA/AgNCs probe is brighter and more stable than the single-strand DNA/AgNCs and, more importantly, can be used as a novel fluorescent probe for detecting mercury and copper ions. Mercury and copper ions in the ranges of 6.0-160.0 and 6-240 nM, respectively, can be linearly detected, with detection limits of 2.1 and 3.4 nM, respectively. Our results indicate that the analytical parameters of the method for mercury and copper ion detection are much better than those obtained using single-strand DNA/AgNCs.
Antipova, Valeriya N; Zheleznaya, Lyudmila A; Zyrina, Nadezhda V
2014-08-01
In the absence of added DNA, thermophilic DNA polymerases synthesize, from free dNTPs, double-stranded DNA that consists of numerous repetitive units (ab initio DNA synthesis). The addition of thermophilic restriction endonuclease (REase), or nicking endonuclease (NEase), effectively stimulates ab initio DNA synthesis and determines the nucleotide sequence of reaction products. We have found that NEases Nt.AlwI, Nb.BbvCI, and Nb.BsmI with non-palindromic recognition sites stimulate the synthesis of sequences organized mainly as palindromes. Moreover, the nucleotide sequence of the palindromes appeared to be dependent on NEase recognition/cleavage modes. Thus, the heterodimeric Nb.BbvCI stimulated the synthesis of palindromes composed of two recognition sites of this NEase, which were separated by AT-rich sequences or (A)n (T)m spacers. Palindromic DNA sequences obtained in the ab initio DNA synthesis with the monomeric NEases Nb.BsmI and Nt.AlwI contained, along with the sites of these NEases, randomly synthesized sequences consisting of blocks of short repeats. These findings could help in investigating the potential of highly productive ab initio DNA synthesis for the creation of DNA molecules with a desired sequence. © 2014 Federation of European Microbiological Societies. Published by John Wiley & Sons Ltd. All rights reserved.
Shao, Zhiyong; Graf, Shannon; Chaga, Oleg Y; Lavrov, Dennis V
2006-10-15
The 16,937-nucleotide sequence of the linear mitochondrial DNA (mt-DNA) molecule of the moon jelly Aurelia aurita (Cnidaria, Scyphozoa) - the first mtDNA sequence from the class Scyphozoa and the first sequence of a linear mtDNA from Metazoa - has been determined. This sequence contains genes for 13 energy pathway proteins, small and large subunit rRNAs, and methionine and tryptophan tRNAs. In addition, two open reading frames of 324 and 969 base pairs in length have been found. The deduced amino-acid sequence of one of them, ORF969, displays extensive sequence similarity with the polymerase [but not the exonuclease] domain of family B DNA polymerases, and this ORF has been tentatively identified as dnab. This is the first report of dnab in animal mtDNA. The genes in A. aurita mtDNA are arranged in two clusters with opposite transcriptional polarities, with transcription proceeding toward the ends of the molecule. The determined sequences at the ends of the molecule are nearly identical but inverted and lack any obvious potential secondary structures or telomere-like repeat elements. The acquisition of mitochondrial genomic data for the second class of Cnidaria allows us to reconstruct characteristic features of mitochondrial evolution in this animal phylum.
Recent patents of nanopore DNA sequencing technology: progress and challenges.
Zhou, Jianfeng; Xu, Bingqian
2010-11-01
DNA sequencing techniques have developed rapidly in recent decades, primarily driven by the Human Genome Project. Among the proposed new techniques, the nanopore has been considered a suitable candidate for single-molecule DNA sequencing with ultrahigh speed and very low cost. Several fabrication and modification techniques have been developed to produce robust and well-defined nanopore devices. Many efforts have also been made to apply nanopores to analyze the properties of DNA molecules. Compared with traditional sequencing techniques, the nanopore has demonstrated distinct advantages in the main practical issues, such as sample preparation, sequencing speed, cost-effectiveness and read length. Although challenges remain, recent research on improving the capabilities of nanopores has shed light on how to achieve the ultimate goal: sequencing individual DNA strands at the single-nucleotide level. This patent review briefly highlights recent developments and technological achievements for DNA analysis and sequencing at the single-molecule level, focusing on nanopore-based methods.
Small tandemly repeated DNA sequences of higher plants likely originate from a tRNA gene ancestor.
Benslimane, A A; Dron, M; Hartmann, C; Rode, A
1986-01-01
Several monomers (177 bp) of a tandemly arranged repetitive nuclear DNA sequence of Brassica oleracea have been cloned and sequenced. They share up to 95% homology between one another and up to 80% with other satellite DNA sequences of Cruciferae, suggesting a common ancestor. Both strands of these monomers show more than 50% homology with many tRNA genes; the best homologies have been obtained with Lys and His yeast mitochondrial tRNA genes (respectively 64% and 60%). These results suggest that small tandemly repeated DNA sequences of plants may have evolved from a tRNA gene ancestor. These tandem repeats have probably arisen via a process involving reverse transcription of polymerase III RNA intermediates, as is the case for interspersed DNA sequences of mammalians. A model is proposed to explain the formation of such small tandemly repeated DNA sequences. PMID:3774553
Next-Generation Sequencing Platforms
NASA Astrophysics Data System (ADS)
Mardis, Elaine R.
2013-06-01
Automated DNA sequencing instruments embody an elegant interplay among chemistry, engineering, software, and molecular biology and have built upon Sanger's founding discovery of dideoxynucleotide sequencing to perform once-unfathomable tasks. Combined with innovative physical mapping approaches that helped to establish long-range relationships between cloned stretches of genomic DNA, fluorescent DNA sequencers produced reference genome sequences for model organisms and for the reference human genome. New types of sequencing instruments that permit amazing acceleration of data-collection rates for DNA sequencing have been developed. The ability to generate genome-scale data sets is now transforming the nature of biological inquiry. Here, I provide an historical perspective of the field, focusing on the fundamental developments that predated the advent of next-generation sequencing instruments and providing information about how these instruments work, their application to biological research, and the newest types of sequencers that can extract data from single DNA molecules.
Regulatory link between DNA methylation and active demethylation in Arabidopsis
Lei, Mingguang; Zhang, Huiming; Julian, Russell; Tang, Kai; Xie, Shaojun; Zhu, Jian-Kang
2015-01-01
De novo DNA methylation through the RNA-directed DNA methylation (RdDM) pathway and active DNA demethylation play important roles in controlling genome-wide DNA methylation patterns in plants. Little is known about how cells manage the balance between DNA methylation and active demethylation activities. Here, we report the identification of a unique RdDM target sequence, where DNA methylation is required for maintaining proper active DNA demethylation of the Arabidopsis genome. In a genetic screen for cellular antisilencing factors, we isolated several REPRESSOR OF SILENCING 1 (ros1) mutant alleles, as well as many RdDM mutants, which showed drastically reduced ROS1 gene expression and, consequently, transcriptional silencing of two reporter genes. A helitron transposon element (TE) in the ROS1 gene promoter negatively controls ROS1 expression, whereas DNA methylation of an RdDM target sequence between ROS1 5′ UTR and the promoter TE region antagonizes this helitron TE in regulating ROS1 expression. This RdDM target sequence is also targeted by ROS1, and defective DNA demethylation in loss-of-function ros1 mutant alleles causes DNA hypermethylation of this sequence and concomitantly causes increased ROS1 expression. Our results suggest that this sequence in the ROS1 promoter region serves as a DNA methylation monitoring sequence (MEMS) that senses DNA methylation and active DNA demethylation activities. Therefore, the ROS1 promoter functions like a thermostat (i.e., methylstat) to sense DNA methylation levels and regulates DNA methylation by controlling ROS1 expression. PMID:25733903
Exact calculation of distributions on integers, with application to sequence alignment.
Newberg, Lee A; Lawrence, Charles E
2009-01-01
Computational biology is replete with high-dimensional discrete prediction and inference problems. Dynamic programming recursions can be applied to several of the most important of these, including sequence alignment, RNA secondary-structure prediction, phylogenetic inference, and motif finding. In these problems, attention is frequently focused on some scalar quantity of interest, a score, such as an alignment score or the free energy of an RNA secondary structure. In many cases, the score is naturally defined on integers, such as a count of the number of pairing differences between two sequence alignments, or else an integer score has been adopted for computational reasons, such as in the test of significance of motif scores. The probability distribution of the score under an appropriate probabilistic model is of interest, such as in tests of significance of motif scores, or in calculation of Bayesian confidence limits around an alignment. Here we present three algorithms for calculating the exact distribution of a score of this type; then, in the context of pairwise local sequence alignments, we apply the approach so as to find the alignment score distribution and Bayesian confidence limits.
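The idea of an exact distribution over integer scores can be illustrated with a minimal sketch. It is not one of the three algorithms from the abstract: it assumes a much simpler setting, a fixed ungapped alignment of n positions whose per-position scores are independent and identically distributed, and the scoring values and probabilities are hypothetical.

```python
from collections import defaultdict

def exact_score_distribution(per_position, n):
    """Exact distribution of the sum of n i.i.d. integer scores.

    per_position maps an integer score to its probability at one position.
    Returns a dict mapping each reachable total score to its probability.
    """
    dist = {0: 1.0}  # distribution of the empty sum
    for _ in range(n):
        nxt = defaultdict(float)
        # Dynamic programming step: extend every partial total by one position.
        for total, p in dist.items():
            for score, q in per_position.items():
                nxt[total + score] += p * q
        dist = dict(nxt)
    return dist

# Hypothetical scoring: +1 for a match (prob 0.25), -1 for a mismatch (prob 0.75).
dist = exact_score_distribution({1: 0.25, -1: 0.75}, 4)

# Exact tail probability of observing a score of at least 2.
p_value = sum(p for s, p in dist.items() if s >= 2)
```

Because the scores are integers, the distribution is a finite table, so tail probabilities are summed exactly rather than estimated by simulation.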
van Riemsdijk, Isolde; Arntzen, Jan W; Bogaerts, Sergé; Franzen, Michael; Litvinchuk, Spartak N; Olgun, Kurtuluş; Wielstra, Ben
2017-09-01
The banded newt (genus Ommatotriton) is widely distributed in the Near East (Anatolia, Caucasus and the Levant) - an understudied region from the perspective of phylogeography. The genus is polytypic, but the number of species included and the phylogenetic relationships between them are not settled. We sequenced two mitochondrial and two nuclear DNA markers throughout the range of Ommatotriton. For mtDNA we constructed phylogenetic trees, estimated divergence times using fossil calibration, and investigated changes in effective population size with Bayesian skyline plots and mismatch analyses. For nuDNA we constructed phylogenetic trees and haplotype networks. Species trees were constructed for all markers and nuDNA only. Species distribution models were projected on current and Last Glacial Maximum climate layers. We confirm the presence of three Ommatotriton species: O. nesterovi, O. ophryticus and O. vittatus. These species are genetically distinct and their most recent common ancestor was dated at ∼25Ma (Oligocene). No evidence of recent gene flow between species was found. The species show deep intraspecific genetic divergence, represented by geographically structured clades, with crown nodes of species dated ∼8-13Ma (Miocene to Early Quaternary); evidence of long-term in situ evolution and survival in multiple glacial refugia. While a species tree based on nuDNA suggested a sister species relationship between O. vittatus and O. ophryticus, when mtDNA was included, phylogenetic relationships were unresolved, and we refrain from accepting a particular phylogenetic hypothesis at this stage. While species distribution models suggest reduced and fragmented ranges during the Last Glacial Maximum, we found no evidence for strong population bottlenecks. We discuss our results in the light of other phylogeographic studies from the Near East. Our study underlines the important role of the Near East in generating and sustaining biodiversity. 
Copyright © 2017 Elsevier Inc. All rights reserved.
Attomole-level Genomics with Single-molecule Direct DNA, cDNA and RNA Sequencing Technologies.
Ozsolak, Fatih
2016-01-01
With the introduction of next-generation sequencing (NGS) technologies in 2005, the domination of microarrays in genomics quickly came to an end due to NGS's superior technical performance and cost advantages. By enabling genetic analysis capabilities that were not possible previously, NGS technologies have started to play an integral role in all areas of biomedical research. This chapter outlines the low-quantity DNA and cDNA sequencing capabilities and applications developed with the Helicos single molecule DNA sequencing technology.
Walker, M D; Park, C W; Rosen, A; Aronheim, A
1990-01-01
Cell specific expression of the insulin gene is achieved through transcriptional mechanisms operating on multiple DNA sequence elements located in the 5' flanking region of the gene. Of particular importance in the rat insulin I gene are two closely similar 9 bp sequences (IEB1 and IEB2): mutation of either of these leads to a 5-10 fold reduction in transcriptional activity. We have screened an expression cDNA library derived from mouse pancreatic endocrine beta cells with a radioactive DNA probe containing multiple copies of the IEB1 sequence. A cDNA clone (A1) isolated by this procedure encodes a protein which shows efficient binding to the IEB1 probe, but much weaker binding to either an unrelated DNA probe or to a probe bearing a single base pair insertion within the recognition sequence. DNA sequence analysis indicates a protein belonging to the helix-loop-helix family of DNA-binding proteins. The ability of the protein encoded by clone A1 to recognize a number of wild type and mutant DNA sequences correlates closely with the ability of each sequence element to support transcription in vivo in the context of the insulin 5' flanking DNA. We conclude that the isolated cDNA may encode a transcription factor that participates in control of insulin gene expression. PMID:2181401
Highly multiplexed targeted DNA sequencing from single nuclei.
Leung, Marco L; Wang, Yong; Kim, Charissa; Gao, Ruli; Jiang, Jerry; Sei, Emi; Navin, Nicholas E
2016-02-01
Single-cell DNA sequencing methods are challenged by poor physical coverage, high technical error rates and low throughput. To address these issues, we developed a single-cell DNA sequencing protocol that combines flow-sorting of single nuclei, time-limited multiple-displacement amplification (MDA), low-input library preparation, DNA barcoding, targeted capture and next-generation sequencing (NGS). This approach represents a major improvement over our previous single nucleus sequencing (SNS) Nature Protocols paper in terms of generating higher-coverage data (>90%), thereby enabling the detection of genome-wide variants in single mammalian cells at base-pair resolution. Furthermore, by pooling 48-96 single-cell libraries together for targeted capture, this approach can be used to sequence many single-cell libraries in parallel in a single reaction. This protocol greatly reduces the cost of single-cell DNA sequencing, and it can be completed in 5-6 d by advanced users. This single-cell DNA sequencing protocol has broad applications for studying rare cells and complex populations in diverse fields of biological research and medicine.
Phylogenetic Relationships of American Willows (Salix L., Salicaceae)
Lauron-Moreau, Aurélien; Pitre, Frédéric E.; Argus, George W.; Labrecque, Michel; Brouillet, Luc
2015-01-01
Salix L. is the largest genus in the family Salicaceae (450 species). Several classifications have been published, but taxonomic subdivision has been under continuous revision. Our goal is to establish the phylogenetic structure of the genus using molecular data on all American willows, using three DNA markers. This complete phylogeny of American willows allows us to propose a biogeographic framework for the evolution of the genus. Material was obtained for the 122 native and introduced willow species of America. Sequences were obtained from the ITS (ribosomal nuclear DNA) and two plastid regions, matK and rbcL. Phylogenetic analyses (parsimony, maximum likelihood, Bayesian inference) were performed on the data. Geographic distribution was mapped onto the tree. The species tree provides strong support for a division of the genus into two subgenera, Salix and Vetrix. Subgenus Salix comprises temperate species from the Americas and Asia, and their disjunction may result from Tertiary events. Subgenus Vetrix is composed of boreo-arctic species of the Northern Hemisphere and their radiation may coincide with the Quaternary glaciations. Sixteen species have ambiguous positions; genetic diversity is lower in subg. Vetrix. A molecular phylogeny of all species of American willows has been inferred. It needs to be tested and further resolved using other molecular data. Nonetheless, the genus clearly has two clades that have distinct biogeographic patterns. PMID:25880993
Álvarez-Castañeda, Sergio Ticul; Murphy, Robert W.
2014-01-01
The Baja California peninsula is the second longest, most geographically isolated peninsula on Earth. Its physiography and the presence of many surrounding islands have facilitated studies of the underlying patterns and drivers of genetic structuring for a wide spectrum of organisms. Chaetodipus spinatus is endemic to the region and occurs on 12 associated islands, including 10 in the Gulf of California and two in the Pacific Ocean. This distribution makes it a model species for evaluating natural historical barriers. We test hypotheses associated with the relationship between the range of the species, patterns in other species, and its relationship to Pleistocene-Holocene climatic changes. We analyzed sequence data from mtDNA genes encoding cytochrome b (Cytb) and cytochrome c oxidase subunits I (COI) and III (COIII) in 26 populations including all 12 islands. The matrilineal genealogy, statistical parsimony network and Bayesian skyline plot indicated an origin of C. spinatus in the southern part of the peninsula. Our analyses detected several differences from the common pattern of peninsular animals: no mid-peninsula break exists, Isla Carmen hosts the most divergent population, the population on an ancient southern Midriff island does not differ from peninsular populations, and a mtDNA peninsular discordance occurs near Loreto. PMID:25542029
Mitochondrial Echoes of First Settlement and Genetic Continuity in El Salvador
Salas, Antonio; Lovo-Gómez, José; Álvarez-Iglesias, Vanesa; Cerezo, María; Lareu, María Victoria; Macaulay, Vincent; Richards, Martin B.; Carracedo, Ángel
2009-01-01
Background: From Paleo-Indian times to recent historical episodes, the Mesoamerican isthmus played an important role in the distribution and patterns of variability all around the double American continent. However, the amount of genetic information currently available on Central American continental populations is very scarce. In order to shed light on the role of Mesoamerica in the peopling of the New World, the present study focuses on the analysis of the mtDNA variation in a population sample from El Salvador. Methodology/Principal Findings: We have carried out DNA sequencing of the entire control region of the mitochondrial DNA (mtDNA) genome in 90 individuals from El Salvador. We have also compiled more than 3,985 control region profiles from the public domain and the literature in order to carry out inter-population comparisons. The results reveal a predominant Native American component in this region: by far, the most prevalent mtDNA haplogroup in this country (at ∼90%) is A2, in contrast with other North, Meso- and South American populations. Haplogroup A2 shows a star-like phylogeny and is very diverse with a substantial proportion of mtDNAs (45%; sequence range 16090–16365) still unobserved in other American populations. Two different Bayesian approaches used to estimate admixture proportions in El Salvador show that the majority of the mtDNAs observed come from North America. A preliminary founder analysis indicates that the settlement of El Salvador occurred about 13,400±5,200 Y.B.P. The founder age of A2 in El Salvador is close to the overall age of A2 in America, which suggests that the colonization of this region occurred within a few thousand years of the initial expansion into the Americas. Conclusions/Significance: As a whole, the results are compatible with the hypothesis that today's A2 variability in El Salvador represents to a large extent the indigenous component of the region. 
Concordant with this hypothesis is also the observation of a very limited contribution from European and African women (∼5%). This implies that the Atlantic slave trade had a very small demographic impact in El Salvador in contrast to its transformation of the gene pool in neighbouring populations from the Caribbean facade. PMID:19724647
Mayer, Werner; Pavlicev, Mihaela
2007-09-01
The family Lacertidae encompasses more than 250 species distributed in the Palearctis, Ethiopis and Orientalis. Lacertids have been subjected in the past to several morphological and molecular studies to establish their phylogeny. However, the problems of convergent adaptation in morphology and of excessively variable molecular markers have hampered the establishment of well supported deeper phylogenetic relationships. Particularly the adaptations to xeric environments have often been used to establish a scenario for the origin and radiation of major lineages within lacertids. Here we present a molecular phylogenetic study based on two nuclear marker genes and representatives of 37 lacertid genera and distinct species groups (as in the case of the collective genus Lacerta). Roughly 1600 bp of the nuclear rag1 and c-mos genes were sequenced and analyzed. While the results provide good support to the hitherto suggested main subfamilies of Gallotiinae (Gallotia and Psammodromus), Eremiainae and Lacertinae [Harris, D.J., Arnold, E.N., Thomas, R.H., 1998. Relationships of lacertid lizards (Reptilia: Lacertidae) estimated from mitochondrial DNA sequences and morphology. Proc. R. Soc. Lond. B 265, 1939-1948], they also suggest unexpected relationships. In particular, the oriental genus Takydromus, previously considered the sister-group to the three subfamilies, is nested within Lacertinae. Moreover, the genera within the Eremiainae are further divided into two groups, roughly corresponding to their respective geographical distributions in the Ethiopian and the Saharo-Eurasian ranges. The results support an independent origin of adaptations to xeric conditions in different subfamilies. The relationships within the subfamily Lacertinae could not be resolved with the markers used. The species groups of the collective genus Lacerta show a bush-like topology in the inferred Bayesian tree, suggesting rapid radiation. 
The composition of the subfamilies Eremiainae and Lacertinae as well as their phylogeography are discussed.
A Case Study into Microbial Genome Assembly Gap Sequences and Finishing Strategies
DOE Office of Scientific and Technical Information (OSTI.GOV)
Utturkar, Sagar M.; Klingeman, Dawn M.; Hurt, Jr., Richard A.
This study characterized regions of DNA which remained unassembled by either PacBio or Illumina sequencing technologies for seven bacterial genomes. Two genomes were manually finished using bioinformatics and PCR/Sanger sequencing approaches and regions not assembled by automated software were analyzed. Gaps present within Illumina assemblies mostly correspond to repetitive DNA regions such as multiple rRNA operon sequences. PacBio gap sequences were evaluated for several properties such as GC content, read coverage, gap length, ability to form strong secondary structures, and corresponding annotations. Our hypothesis that strong secondary DNA structures blocked DNA polymerases and contributed to gap sequences was not accepted. PacBio assemblies had few limitations overall and gaps were explained as the cumulative effect of lower than average sequence coverage and repetitive sequences at contig termini. An important aspect of the present study is the compilation of biological features that interfered with assembly and included active transposons, multiple plasmid sequences, phage DNA integration, and large sequence duplication. Furthermore, our targeted genome finishing approach and systematic evaluation of the unassembled DNA will be useful for others looking to close, finish, and polish microbial genome sequences.
A Case Study into Microbial Genome Assembly Gap Sequences and Finishing Strategies
Utturkar, Sagar M.; Klingeman, Dawn M.; Hurt, Jr., Richard A.; ...
2017-07-18
This study characterized regions of DNA which remained unassembled by either PacBio or Illumina sequencing technologies for seven bacterial genomes. Two genomes were manually finished using bioinformatics and PCR/Sanger sequencing approaches and regions not assembled by automated software were analyzed. Gaps present within Illumina assemblies mostly correspond to repetitive DNA regions such as multiple rRNA operon sequences. PacBio gap sequences were evaluated for several properties such as GC content, read coverage, gap length, ability to form strong secondary structures, and corresponding annotations. Our hypothesis that strong secondary DNA structures blocked DNA polymerases and contributed to gap sequences was not accepted. PacBio assemblies had few limitations overall and gaps were explained as the cumulative effect of lower than average sequence coverage and repetitive sequences at contig termini. An important aspect of the present study is the compilation of biological features that interfered with assembly and included active transposons, multiple plasmid sequences, phage DNA integration, and large sequence duplication. Furthermore, our targeted genome finishing approach and systematic evaluation of the unassembled DNA will be useful for others looking to close, finish, and polish microbial genome sequences.
A Case Study into Microbial Genome Assembly Gap Sequences and Finishing Strategies
Utturkar, Sagar M.; Klingeman, Dawn M.; Hurt, Richard A.; Brown, Steven D.
2017-01-01
This study characterized regions of DNA which remained unassembled by either PacBio or Illumina sequencing technologies for seven bacterial genomes. Two genomes were manually finished using bioinformatics and PCR/Sanger sequencing approaches and regions not assembled by automated software were analyzed. Gaps present within Illumina assemblies mostly correspond to repetitive DNA regions such as multiple rRNA operon sequences. PacBio gap sequences were evaluated for several properties such as GC content, read coverage, gap length, ability to form strong secondary structures, and corresponding annotations. Our hypothesis that strong secondary DNA structures blocked DNA polymerases and contributed to gap sequences was not accepted. PacBio assemblies had few limitations overall and gaps were explained as the cumulative effect of lower than average sequence coverage and repetitive sequences at contig termini. An important aspect of the present study is the compilation of biological features that interfered with assembly and included active transposons, multiple plasmid sequences, phage DNA integration, and large sequence duplication. Our targeted genome finishing approach and systematic evaluation of the unassembled DNA will be useful for others looking to close, finish, and polish microbial genome sequences. PMID:28769883
DOE Office of Scientific and Technical Information (OSTI.GOV)
Benasutti, M.; Ejadi, S.; Whitlow, M.D.
The mutagenic and carcinogenic chemical aflatoxin B1 (AFB1) reacts almost exclusively at the N(7)-position of guanine following activation to its reactive form, the 8,9-epoxide (AFB1 oxide). In general N(7)-guanine adducts yield DNA strand breaks when heated in base, a property that serves as the basis for the Maxam-Gilbert DNA sequencing reaction specific for guanine. Using DNA sequencing methods, other workers have shown that AFB1 oxide gives strand breaks at positions of guanines; however, the guanine bands varied in intensity. This phenomenon has been used to infer that AFB1 oxide prefers to react with guanines in some sequence contexts more than in others and has been referred to as sequence specificity of binding. Herein, data on the reaction of AFB1 oxide with several synthetic DNA polymers with different sequences are presented, and (following hydrolysis) adduct levels are determined by high-pressure liquid chromatography. These results reveal that for AFB1 oxide (1) the N(7)-guanine adduct is the major adduct found in all of the DNA polymers, (2) adduct levels vary in different sequences, and, thus, sequence specificity is also observed by this more direct method, and (3) the intensity of bands in DNA sequencing gels is likely to reflect adduct levels formed at the N(7)-position of guanine. Knowing this, a reinvestigation of the reactivity of guanines in different DNA sequences using DNA sequencing methods was undertaken. Methods are developed to determine that the X (5'-side) base and the Y (3'-side) base are most influential in determining guanine reactivity. These rules in conjunction with molecular modeling studies were used to assess the binding sites that might be utilized by AFB1 oxide in its reaction with DNA.
Chromosome specific repetitive DNA sequences
Moyzis, Robert K.; Meyne, Julianne
1991-01-01
A method is provided for determining specific nucleotide sequences useful in forming a probe which can identify specific chromosomes, preferably through in situ hybridization within the cell itself. In one embodiment, chromosome preferential nucleotide sequences are first determined from a library of recombinant DNA clones having families of repetitive sequences. Library clones are identified with a low homology with a sequence of repetitive DNA families to which the first clones respectively belong and variant sequences are then identified by selecting clones having a pattern of hybridization with genomic DNA dissimilar to the hybridization pattern shown by the respective families. In another embodiment, variant sequences are selected from a sequence of a known repetitive DNA family. The selected variant sequence is classified as chromosome specific, chromosome preferential, or chromosome nonspecific. Sequences which are classified as chromosome preferential are further sequenced and regions are identified having a low homology with other regions of the chromosome preferential sequence or with known sequences of other family members. This invention is the result of a contract with the Department of Energy (Contract No. W-7405-ENG-36).
ERIC Educational Resources Information Center
Shah, Kushani; Thomas, Shelby; Stein, Arnold
2013-01-01
In this report, we describe a 5-week laboratory exercise for undergraduate biology and biochemistry students in which students learn to sequence DNA and to genotype their DNA for selected single nucleotide polymorphisms (SNPs). Students use miniaturized DNA sequencing gels that require approximately 8 min to run. The students perform G, A, T, C…
DNA Barcode Goes Two-Dimensions: DNA QR Code Web Server
Li, Huan; Xing, Hang; Liang, Dong; Jiang, Kun; Pang, Xiaohui; Song, Jingyuan; Chen, Shilin
2012-01-01
The DNA barcoding technology uses a standard region of DNA sequence for species identification and discovery. At present, “DNA barcode” actually refers to DNA sequences, which are not amenable to information storage, recognition, and retrieval. Our aim is to identify the best symbology that can represent DNA barcode sequences in practical applications. A comprehensive set of sequences for five DNA barcode markers ITS2, rbcL, matK, psbA-trnH, and CO1 was used as the test data. Fifty-three different types of one-dimensional and ten two-dimensional barcode symbologies were compared based on different criteria, such as coding capacity, compression efficiency, and error detection ability. The quick response (QR) code was found to have the largest coding capacity and relatively high compression ratio. To facilitate the further usage of QR code-based DNA barcodes, a web server was developed and is accessible at http://qrfordna.dnsalias.org. The web server allows users to retrieve the QR code for a species of interest, convert a DNA sequence to and from a QR code, and perform species identification based on local and global sequence similarities. In summary, the first comprehensive evaluation of various barcode symbologies has been carried out. The QR code has been found to be the most appropriate symbology for DNA barcode sequences. A web server has also been constructed to allow biologists to utilize QR codes in practical DNA barcoding applications. PMID:22574113
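Why compression efficiency matters when fitting a sequence into a barcode symbol can be shown with a small sketch. It is not the encoding used by the web server described above, which the abstract does not specify. It only assumes that an unambiguous DNA sequence uses four letters, so each base fits in 2 bits; the function names `pack` and `unpack` are hypothetical.

```python
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"

def pack(seq):
    """Pack an unambiguous DNA string into bytes, 4 bases per byte.

    Returns (length, data) so that padding in the last byte can be dropped.
    """
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODE[base]
        byte <<= 2 * (4 - len(chunk))  # left-align a short final chunk
        out.append(byte)
    return len(seq), bytes(out)

def unpack(length, data):
    """Inverse of pack: recover the DNA string from (length, data)."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BASES[(byte >> shift) & 3])
    return "".join(bases[:length])

length, data = pack("ACGTAC")
restored = unpack(length, data)
```

Under this assumption a sequence needs a quarter of the bytes of its plain-text form, which is one reason a binary-capable symbology can hold longer barcode sequences than a text-only one.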
High-resolution characterization of sequence signatures due to non-random cleavage of cell-free DNA.
Chandrananda, Dineika; Thorne, Natalie P; Bahlo, Melanie
2015-06-17
High-throughput sequencing of cell-free DNA fragments found in human plasma has been used to non-invasively detect fetal aneuploidy, monitor organ transplants and investigate tumor DNA. However, many biological properties of this extracellular genetic material remain unknown. Research that further characterizes circulating DNA could substantially increase its diagnostic value by allowing the application of more sophisticated bioinformatics tools that lead to an improved signal to noise ratio in the sequencing data. In this study, we investigate various features of cell-free DNA in plasma using deep-sequencing data from two pregnant women (>70X, >50X) and compare them with matched cellular DNA. We utilize a descriptive approach to examine how the biological cleavage of cell-free DNA affects different sequence signatures such as fragment lengths, sequence motifs at fragment ends and the distribution of cleavage sites along the genome. We show that the size distributions of these cell-free DNA molecules are dependent on their autosomal and mitochondrial origin as well as the genomic location within chromosomes. DNA mapping to particular microsatellites and alpha repeat elements displays unique size signatures. We show how cell-free fragments occur in clusters along the genome, localizing to nucleosomal arrays and are preferentially cleaved at linker regions by correlating the mapping locations of these fragments with ENCODE annotation of chromatin organization. Our work further demonstrates that cell-free autosomal DNA cleavage is sequence dependent. The region spanning up to 10 positions on either side of the DNA cleavage site shows a consistent pattern of preference for specific nucleotides. This sequence motif is present in cleavage sites localized to nucleosomal cores and linker regions but is absent in nucleosome-free mitochondrial DNA. These background signals in cell-free DNA sequencing data stem from the non-random biological cleavage of these fragments. 
This sequence structure can be harnessed to improve bioinformatics algorithms, in particular for CNV and structural variant detection. Descriptive measures for cell-free DNA features developed here could also be used in biomarker analysis to monitor the changes that occur during different pathological conditions.
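The nucleotide preference around cleavage sites described above can be tallied with a few lines of code. The following is a minimal sketch, not the authors' pipeline: it assumes cleavage sites are already available as 0-based coordinates on a single reference string, and the function name, the `flank` parameter and the toy reference are hypothetical.

```python
from collections import Counter

def end_motif_counts(reference, cut_sites, flank=10):
    """Count bases at each offset around cleavage sites.

    A cut site c means the break lies between reference[c-1] and reference[c].
    Offsets run from -flank (upstream) to flank-1 (downstream).
    Sites too close to either end of the reference are skipped.
    """
    counts = {off: Counter() for off in range(-flank, flank)}
    for c in cut_sites:
        if c - flank < 0 or c + flank > len(reference):
            continue
        for off in range(-flank, flank):
            counts[off][reference[c + off]] += 1
    return counts

# Toy data: the third site is too close to the end and is skipped.
ref = "ACGTACGTAC"
counts = end_motif_counts(ref, [2, 6, 9], flank=2)
```

Dividing each counter by its total gives per-position base frequencies, which can then be compared with the genome-wide background to see whether a position shows a preference.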
Assessment of phylogenetic sensitivity for reconstructing HIV-1 epidemiological relationships.
Beloukas, Apostolos; Magiorkinis, Emmanouil; Magiorkinis, Gkikas; Zavitsanou, Asimina; Karamitros, Timokratis; Hatzakis, Angelos; Paraskevis, Dimitrios
2012-06-01
Phylogenetic analysis has been extensively used as a tool for the reconstruction of epidemiological relations for research or for forensic purposes. It was our objective to assess the sensitivity of different phylogenetic methods and various phylogenetic programs to reconstruct epidemiological links among HIV-1 infected patients, that is, the probability of revealing a true transmission relationship. Multiple datasets (90) were prepared consisting of HIV-1 sequences in protease (PR) and partial reverse transcriptase (RT) sampled from patients with documented epidemiological relationship (target population), and from unrelated individuals (control population) belonging to the same HIV-1 subtype as the target population. Each dataset varied regarding the number, the geographic origin and the transmission risk groups of the sequences among the control population. Phylogenetic trees were inferred by neighbor-joining (NJ), maximum likelihood heuristics (hML) and Bayesian methods. All clusters of sequences belonging to the target population were correctly reconstructed by NJ and Bayesian methods receiving high bootstrap and posterior probability (PP) support, respectively. On the other hand, TreePuzzle failed to reconstruct or provide significant support for several clusters; high puzzling step support was associated with the inclusion of control sequences from the same geographic area as the target population. In contrast, all clusters were correctly reconstructed by hML as implemented in PhyML 3.0 receiving high bootstrap support. We report that under the conditions of our study, hML using PhyML, NJ and Bayesian methods were the most sensitive for the reconstruction of epidemiological links mostly from sexually infected individuals. Copyright © 2012 Elsevier B.V. All rights reserved.
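Sensitivity in the sense used above, the probability of revealing a true transmission relationship, can be estimated as the fraction of documented clusters that a method recovers with adequate support. The sketch below assumes that reading; the support values and the threshold of 70 are hypothetical and are not taken from the study.

```python
def sensitivity(cluster_support, threshold):
    """Fraction of known transmission clusters recovered with enough support.

    cluster_support holds one entry per documented cluster: the support value
    (for example a bootstrap percentage) if the method reconstructed the
    cluster, or None if it did not.
    """
    if not cluster_support:
        raise ValueError("at least one documented cluster is required")
    recovered = sum(
        1 for s in cluster_support if s is not None and s >= threshold
    )
    return recovered / len(cluster_support)

# Hypothetical results for four documented clusters from one method.
result = sensitivity([95, 80, None, 60], threshold=70)
```

The same function applies to posterior probabilities if the threshold is given on that scale, for example 0.95 instead of 70.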
Bayesian reconstruction of transmission within outbreaks using genomic variants.
De Maio, Nicola; Worby, Colin J; Wilson, Daniel J; Stoesser, Nicole
2018-04-01
Pathogen genome sequencing can reveal details of transmission histories and is a powerful tool in the fight against infectious disease. In particular, within-host pathogen genomic variants identified through heterozygous nucleotide base calls are a potential source of information to identify linked cases and infer direction and time of transmission. However, using such data effectively to model disease transmission presents a number of challenges, including differentiating genuine variants from those observed due to sequencing error, as well as the specification of a realistic model for within-host pathogen population dynamics. Here we propose a new Bayesian approach to transmission inference, BadTrIP (BAyesian epiDemiological TRansmission Inference from Polymorphisms), that explicitly models evolution of pathogen populations in an outbreak, transmission (including transmission bottlenecks), and sequencing error. BadTrIP enables the inference of host-to-host transmission from pathogen sequencing data and epidemiological data. By assuming that genomic variants are unlinked, our method does not require the computationally intensive and unreliable reconstruction of individual haplotypes. Using simulations we show that BadTrIP is robust in most scenarios and can accurately infer transmission events by efficiently combining information from genetic and epidemiological sources; thanks to its realistic model of pathogen evolution and the inclusion of epidemiological data, BadTrIP is also more accurate than existing approaches. BadTrIP is distributed as an open source package (https://bitbucket.org/nicofmay/badtrip) for the phylogenetic software BEAST2. We apply our method to reconstruct transmission history at the early stages of the 2014 Ebola outbreak, showcasing the power of within-host genomic variants to reconstruct transmission events.
Analysis of DNA Sequences by An Optical Time-Integrating Correlator: Proof-Of-Concept Experiments.
1992-05-01
TABLES xv LIST OF ABBREVIATIONS xvii 1.0 INTRODUCTION 1 2.0 DNA ANALYSIS STRATEGY 4 2.1 Representation of DNA Bases 4 2.2 DNA Analysis Strategy 6 3.0...Zehnder architecture. 3 Figure 3: Short representations of the DNA bases where each base is represented by a 7-bits long pseudorandom sequence. 5... DNA bases where each base is represented by 7-bits long pseudorandom sequences. 4 Table 2: Long representations of the DNA bases with 255-bits maximum
SNP discovery through de novo deep sequencing using the next generation of DNA sequencers
USDA-ARS's Scientific Manuscript database
The production of high volumes of DNA sequence data using new technologies has permitted more efficient identification of single nucleotide polymorphisms in vertebrate genomes. This chapter presented practical methodology for production and analysis of DNA sequence data for SNP discovery....
A simple procedure for parallel sequence analysis of both strands of 5'-labeled DNA.
Razvi, F; Gargiulo, G; Worcel, A
1983-08-01
Ligation of a 5'-labeled DNA restriction fragment results in a circular DNA molecule carrying the two 32Ps at the reformed restriction site. Double digestion of the circular DNA with the original enzyme and a second restriction enzyme cleaving near the labeled site allows direct chemical sequencing of one 5'-labeled DNA strand. Similar double digestions, using an isoschizomer that cleaves differently at the 32P-labeled site, allow direct sequencing of the now 3'-labeled complementary DNA strand. It is possible to directly sequence both strands of cloned DNA inserts by using the above protocol and a multiple cloning site vector that provides the necessary restriction sites. The simultaneous and parallel visualization of both DNA strands eliminates sequence ambiguities. In addition, the labeled circular molecules are particularly useful for single-hit DNA cleavage studies and DNA footprint analysis. As an example, we show here an analysis of the micrococcal nuclease-induced breaks on the two strands of the somatic 5S RNA gene of Xenopus borealis, which suggests that the enzyme may recognize and cleave small AT-containing palindromes along the DNA helix.
A Glimpse into the Satellite DNA Library in Characidae Fish (Teleostei, Characiformes)
Utsunomia, Ricardo; Ruiz-Ruano, Francisco J.; Silva, Duílio M. Z. A.; Serrano, Érica A.; Rosa, Ivana F.; Scudeler, Patrícia E. S.; Hashimoto, Diogo T.; Oliveira, Claudio; Camacho, Juan Pedro M.; Foresti, Fausto
2017-01-01
Satellite DNA (satDNA) is an abundant fraction of repetitive DNA in eukaryotic genomes and plays an important role in genome organization and evolution. In general, satDNA sequences follow a concerted evolutionary pattern through the intragenomic homogenization of different repeat units. In addition, the satDNA library hypothesis predicts that related species share a series of satDNA variants descended from a common ancestor species, with differential amplification of different satDNA variants. The finding of the same satDNA family in species belonging to different genera within Characidae fish provided the opportunity to test both concerted evolution and library hypotheses. For this purpose, we analyzed here sequence variation and abundance of this satDNA family in ten species, by a combination of next generation sequencing (NGS), PCR and Sanger sequencing, and fluorescence in situ hybridization (FISH). We found extensive between-species variation for the number and size of pericentromeric FISH signals. At the genomic level, the analysis of thousands of DNA sequences obtained by Illumina sequencing and PCR amplification allowed defining 150 haplotypes which were linked in a common minimum spanning tree, where different patterns of concerted evolution were apparent. This also provided a glimpse into the satDNA library of this group of species. Consistent with the library hypothesis, different variants for this satDNA showed high differences in abundance between species, from highly abundant to simply relictual variants. PMID:28855916
Short, interspersed, and repetitive DNA sequences in Spiroplasma species.
Nur, I; LeBlanc, D J; Tully, J G
1987-03-01
Small fragments of DNA from an 8-kbp plasmid, pRA1, from a plant pathogenic strain of Spiroplasma citri were shown previously to be present in the chromosomal DNA of at least two species of Spiroplasma. We describe here the shot-gun cloning of chromosomal DNA from S. citri Maroc and the identification of two distinct sequences exhibiting homology to pRA1. Further subcloning experiments provided specific molecular probes for the identification of these two sequences in chromosomal DNA from three distinct plant pathogenic species of Spiroplasma. The results of Southern blot hybridization indicated that each of the pRA1-associated sequences is present as multiple copies in short, dispersed, and repetitive sequences in the chromosomes of these three strains. None of the sequences was detectable in chromosomal DNA from an additional nine Spiroplasma strains examined.
Laser Desorption Mass Spectrometry for DNA Sequencing and Analysis
NASA Astrophysics Data System (ADS)
Chen, C. H. Winston; Taranenko, N. I.; Golovlev, V. V.; Isola, N. R.; Allman, S. L.
1998-03-01
Rapid DNA sequencing and/or analysis is critically important for biomedical research. In the past, gel electrophoresis has been the primary tool to achieve DNA analysis and sequencing. However, gel electrophoresis is a time-consuming and labor-intensive process. Recently, we have developed and used laser desorption mass spectrometry (LDMS) to achieve sequencing of ss-DNA longer than 100 nucleotides. With LDMS, we succeeded in sequencing DNA in seconds instead of the hours or days required by gel electrophoresis. In addition to sequencing, we also applied LDMS to the detection of DNA probes for hybridization. LDMS was also used to detect short tandem repeats for forensic applications. Clinical applications for disease diagnosis, such as cystic fibrosis caused by base deletion and point mutation, have also been demonstrated. Experimental details will be presented in the meeting.
Using simple artificial intelligence methods for predicting amyloidogenesis in antibodies
2010-01-01
Background: All polypeptide backbones have the potential to form amyloid fibrils, which are associated with a number of degenerative disorders. However, the likelihood that amyloidosis would actually occur under physiological conditions depends largely on the amino acid composition of a protein. We explore using a naive Bayesian classifier and a weighted decision tree for predicting the amyloidogenicity of immunoglobulin sequences. Results: The average accuracy based on leave-one-out (LOO) cross validation of a Bayesian classifier generated from 143 amyloidogenic sequences is 60.84%. This is consistent with the average accuracy of 61.15% for a holdout test set comprised of 103 AM and 28 non-amyloidogenic sequences. The LOO cross validation accuracy increases to 81.08% when the training set is augmented by the holdout test set. In comparison, the average classification accuracy for the holdout test set obtained using a decision tree is 78.64%. Non-amyloidogenic sequences are predicted with average LOO cross validation accuracies between 74.05% and 77.24% using the Bayesian classifier, depending on the training set size. The accuracy for the holdout test set was 89%. For the decision tree, the non-amyloidogenic prediction accuracy is 75.00%. Conclusions: This exploratory study indicates that both classification methods may be promising in providing straightforward predictions on the amyloidogenicity of a sequence. Nevertheless, the number of available sequences that satisfy the premises of this study are limited, and are consequently smaller than the ideal training set size. Increasing the size of the training set clearly increases the accuracy, and the expansion of the training set to include not only more derivatives, but more alignments, would make the method more sound. The accuracy of the classifiers may also be improved when additional factors, such as structural and physico-chemical data, are considered. 
The development of this type of classifier has significant applications in evaluating engineered antibodies, and may be adapted for evaluating engineered proteins in general. PMID:20144194
Using simple artificial intelligence methods for predicting amyloidogenesis in antibodies.
David, Maria Pamela C; Concepcion, Gisela P; Padlan, Eduardo A
2010-02-08
All polypeptide backbones have the potential to form amyloid fibrils, which are associated with a number of degenerative disorders. However, the likelihood that amyloidosis would actually occur under physiological conditions depends largely on the amino acid composition of a protein. We explore using a naive Bayesian classifier and a weighted decision tree for predicting the amyloidogenicity of immunoglobulin sequences. The average accuracy based on leave-one-out (LOO) cross validation of a Bayesian classifier generated from 143 amyloidogenic sequences is 60.84%. This is consistent with the average accuracy of 61.15% for a holdout test set comprised of 103 AM and 28 non-amyloidogenic sequences. The LOO cross validation accuracy increases to 81.08% when the training set is augmented by the holdout test set. In comparison, the average classification accuracy for the holdout test set obtained using a decision tree is 78.64%. Non-amyloidogenic sequences are predicted with average LOO cross validation accuracies between 74.05% and 77.24% using the Bayesian classifier, depending on the training set size. The accuracy for the holdout test set was 89%. For the decision tree, the non-amyloidogenic prediction accuracy is 75.00%. This exploratory study indicates that both classification methods may be promising in providing straightforward predictions on the amyloidogenicity of a sequence. Nevertheless, the number of available sequences that satisfy the premises of this study are limited, and are consequently smaller than the ideal training set size. Increasing the size of the training set clearly increases the accuracy, and the expansion of the training set to include not only more derivatives, but more alignments, would make the method more sound. The accuracy of the classifiers may also be improved when additional factors, such as structural and physico-chemical data, are considered. 
The development of this type of classifier has significant applications in evaluating engineered antibodies, and may be adapted for evaluating engineered proteins in general.
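The leave-one-out evaluation of a naive Bayesian classifier described in this abstract can be illustrated with a minimal sketch. It assumes a multinomial model over amino acid residue counts with Laplace smoothing, which is an illustrative choice and not necessarily the feature set used by the authors. The toy sequences, the labels, and the function names (`train`, `predict`, `loo_accuracy`) are hypothetical.

```python
import math
from collections import Counter

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def train(seqs, labels):
    """Fit per-class log-priors and Laplace-smoothed residue log-likelihoods."""
    model = {}
    for c in sorted(set(labels)):
        counts = Counter()
        for s, y in zip(seqs, labels):
            if y == c:
                counts.update(ch for ch in s if ch in ALPHABET)
        total = sum(counts.values()) + len(ALPHABET)  # add-one smoothing
        log_like = {a: math.log((counts[a] + 1) / total) for a in ALPHABET}
        log_prior = math.log(labels.count(c) / len(labels))
        model[c] = (log_prior, log_like)
    return model


def predict(model, seq):
    """Return the class with the highest posterior log-score for seq."""
    scores = {
        c: log_prior + sum(log_like[ch] for ch in seq if ch in ALPHABET)
        for c, (log_prior, log_like) in model.items()
    }
    return max(scores, key=scores.get)


def loo_accuracy(seqs, labels):
    """Leave-one-out cross validation: hold out each sequence in turn."""
    hits = 0
    for i in range(len(seqs)):
        model = train(seqs[:i] + seqs[i + 1:], labels[:i] + labels[i + 1:])
        hits += predict(model, seqs[i]) == labels[i]
    return hits / len(seqs)


# Hypothetical toy data, not taken from the study.
toy_seqs = ["VVVIVLVV", "IVVVLIVV", "VLIVVVIV", "DEDKDEEK", "KDEEDKDE", "EDKKDEED"]
toy_labels = ["amyloidogenic"] * 3 + ["non-amyloidogenic"] * 3
print(loo_accuracy(toy_seqs, toy_labels))
```

Because each held-out sequence is scored by a model that never saw it, the LOO accuracy is a less optimistic estimate than accuracy measured on the training set itself.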
Constructing DNA Barcode Sets Based on Particle Swarm Optimization.
Wang, Bin; Zheng, Xuedong; Zhou, Shihua; Zhou, Changjun; Wei, Xiaopeng; Zhang, Qiang; Wei, Ziqi
2018-01-01
Following the completion of the human genome project, a large amount of high-throughput bio-data was generated. To analyze these data, massively parallel sequencing, namely next-generation sequencing, was rapidly developed. DNA barcodes are used to identify the ownership between sequences and samples when they are attached at the beginning or end of sequencing reads. Constructing DNA barcode sets provides the candidate DNA barcodes for this application. To increase the accuracy of DNA barcode sets, a particle swarm optimization (PSO) algorithm has been modified and used to construct the DNA barcode sets in this paper. Compared with the extant results, some lower bounds of DNA barcode sets are improved. The results show that the proposed algorithm is effective in constructing DNA barcode sets.
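What "constructing a barcode set" means can be sketched as follows. The sketch assumes the quality criterion is a minimum pairwise Hamming distance, a common constraint that this abstract does not itself state. It uses a simple greedy search in place of the paper's modified particle swarm optimization, so it only illustrates the problem and not the authors' algorithm. All function names are hypothetical.

```python
from itertools import combinations, product


def hamming(a, b):
    """Number of positions at which two equal-length barcodes differ."""
    if len(a) != len(b):
        raise ValueError("barcodes must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(barcodes):
    """Smallest Hamming distance between any two barcodes in the set."""
    return min(hamming(a, b) for a, b in combinations(barcodes, 2))


def greedy_barcode_set(length, min_dist):
    """Greedy baseline (not PSO): scan all candidates in lexicographic order
    and keep each one that is at least min_dist from every barcode kept so far."""
    chosen = []
    for candidate in ("".join(p) for p in product("ACGT", repeat=length)):
        if all(hamming(candidate, b) >= min_dist for b in chosen):
            chosen.append(candidate)
    return chosen


barcodes = greedy_barcode_set(length=4, min_dist=3)
print(len(barcodes), min_pairwise_distance(barcodes))
```

The size of the set returned by any such search is a lower bound on the largest achievable set for that length and distance, which is the sense in which a better search heuristic can "improve lower bounds".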
From sea to land and beyond – New insights into the evolution of euthyneuran Gastropoda (Mollusca)
2008-01-01
Background: The Euthyneura are considered to be the most successful and diverse group of Gastropoda. Phylogenetically, they are riven with controversy. Previous morphology-based phylogenetic studies have been greatly hampered by rampant parallelism in morphological characters or by incomplete taxon sampling. Based on sequences of nuclear 18S rRNA and 28S rRNA as well as mitochondrial 16S rRNA and COI DNA from 56 taxa, we reconstructed the phylogeny of Euthyneura utilising Maximum Likelihood and Bayesian inference methods. The evolution of colonization of freshwater and terrestrial habitats by pulmonate Euthyneura, considered crucial in the evolution of this group of Gastropoda, is reconstructed with Bayesian approaches. Results: We found several well supported clades within Euthyneura; however, we could not confirm the traditional classification, since Pulmonata are paraphyletic and Opisthobranchia are either polyphyletic or paraphyletic with several clades clearly distinguishable. Sacoglossa appear separately from the rest of the Opisthobranchia as sister taxon to basal Pulmonata. Within Pulmonata, Basommatophora are paraphyletic and Hygrophila and Eupulmonata form monophyletic clades. Pyramidelloidea are placed within Euthyneura rendering the Euthyneura paraphyletic. Conclusion: Based on the current phylogeny, it can be proposed for the first time that invasion of freshwater by Pulmonata is a unique evolutionary event and has taken place directly from the marine environment via an aquatic pathway. The origin of colonisation of terrestrial habitats is seeded in marginal zones and has probably occurred via estuaries or semi-terrestrial habitats such as mangroves. PMID:18294406
Demaio, Pablo H; Barfuss, Michael H J; Kiesling, Roberto; Till, Walter; Chiapella, Jorge O
2011-11-01
The South American genus Gymnocalycium (Cactoideae-Trichocereae) demonstrates how the sole use of morphological data in Cactaceae results in conflicts in assessing phylogeny, constructing a taxonomic system, and analyzing trends in the evolution of the genus. Molecular phylogenetic analysis was performed using parsimony and Bayesian methods on a 6195-bp data matrix of plastid DNA sequences (atpI-atpH, petL-psbE, trnK-matK, trnT-trnL-trnF) of 78 samples, including 52 species and infraspecific taxa representing all the subgenera of Gymnocalycium. We assessed morphological character evolution using likelihood methods to optimize characters on a Bayesian tree and to reconstruct possible ancestral states. The results of the phylogenetic study confirm the monophyly of the genus, while supporting overall the available infrageneric classification based on seed morphology. Analysis showed the subgenera Microsemineum and Macrosemineum to be polyphyletic and paraphyletic. Analysis of morphological characters showed a tendency toward reduction of stem size, reduction in quantity and hardiness of spines, increment of seed size, development of napiform roots, and change from juicy and colorful fruits to dry and green fruits. Gymnocalycium saglionis is the only species of Microsemineum and a new name is required to identify the clade including the remaining species of Microsemineum; we propose the name Scabrosemineum in agreement with seed morphology. Identifying morphological trends and environmental features allows for a better understanding of the events that might have influenced the diversification of the genus.
Bayesian Analysis of Evolutionary Divergence with Genomic Data under Diverse Demographic Models.
Chung, Yujin; Hey, Jody
2017-06-01
We present a new Bayesian method for estimating demographic and phylogenetic history using population genomic data. Several key innovations are introduced that allow the study of diverse models within an Isolation-with-Migration framework. The new method implements a 2-step analysis, with an initial Markov chain Monte Carlo (MCMC) phase that samples simple coalescent trees, followed by the calculation of the joint posterior density for the parameters of a demographic model. In step 1, the MCMC sampling phase, the method uses a reduced state space, consisting of coalescent trees without migration paths, and a simple importance sampling distribution without the demography of interest. Once obtained, a single sample of trees can be used in step 2 to calculate the joint posterior density for model parameters under multiple diverse demographic models, without having to repeat MCMC runs. Because migration paths are not included in the state space of the MCMC phase, but rather are handled by analytic integration in step 2 of the analysis, the method is scalable to a large number of loci with excellent MCMC mixing properties. With an implementation of the new method in the computer program MIST, we demonstrate the method's accuracy, scalability, and other advantages using simulated data and DNA sequences of two common chimpanzee subspecies: Pan troglodytes (P. t.) troglodytes and P. t. verus.
Evidence of transoceanic dispersion of the genus Vanilla based on plastid DNA phylogenetic analysis.
Bouetard, Anthony; Lefeuvre, Pierre; Gigant, Rodolphe; Bory, Séverine; Pignal, Marc; Besse, Pascale; Grisoni, Michel
2010-05-01
The phylogeny and the biogeographical history of the genus Vanilla were investigated using four chloroplast genes (psbB, psbC, psaB and rbcL), on 47 accessions of Vanilla chosen from the ex situ CIRAD collection maintained in Reunion Island and additional sequences from GenBank. Bayesian methods provided a fairly well supported reconstruction of the phylogeny of the Vanilloideae sub-family and more particularly of the genus Vanilla. Three major phylogenetic groups in the genus Vanilla were differentiated, which disagrees with the current classification into two sections (Foliosae and Aphyllae) based on morphological traits. Recent Bayesian relaxed molecular clock methods allowed us to test the two main hypotheses of the phylogeography of the genus Vanilla. Early radiation of the Vanilla genus and diversification by vicariance following the break-up of Gondwana, 95 million years ago (Mya), was incompatible with the accepted age of origin of angiosperms. Based on the Vanilloideae age recently estimated at 71 Mya, we conclude that the genus Vanilla would have appeared approximately 34 Mya in South America, when continents were already separated. Nevertheless, whichever of the two extreme scenarios is considered, at least three long distance migration events are needed to explain the present distribution of Vanilla species in tropical areas. These transoceanic dispersions could have occurred via transoceanic passageways such as the Rio Grande Ridge and the involvement of floating vegetation mats and migratory birds.
Ancient mitochondrial DNA provides high-resolution time scale of the peopling of the Americas
Llamas, Bastien; Fehren-Schmitz, Lars; Valverde, Guido; Soubrier, Julien; Mallick, Swapan; Rohland, Nadin; Nordenfelt, Susanne; Valdiosera, Cristina; Richards, Stephen M.; Rohrlach, Adam; Romero, Maria Inés Barreto; Espinoza, Isabel Flores; Cagigao, Elsa Tomasto; Jiménez, Lucía Watson; Makowski, Krzysztof; Reyna, Ilán Santiago Leboreiro; Lory, Josefina Mansilla; Torrez, Julio Alejandro Ballivián; Rivera, Mario A.; Burger, Richard L.; Ceruti, Maria Constanza; Reinhard, Johan; Wells, R. Spencer; Politis, Gustavo; Santoro, Calogero M.; Standen, Vivien G.; Smith, Colin; Reich, David; Ho, Simon Y. W.; Cooper, Alan; Haak, Wolfgang
2016-01-01
The exact timing, route, and process of the initial peopling of the Americas remains uncertain despite much research. Archaeological evidence indicates the presence of humans as far as southern Chile by 14.6 thousand years ago (ka), shortly after the Pleistocene ice sheets blocking access from eastern Beringia began to retreat. Genetic estimates of the timing and route of entry have been constrained by the lack of suitable calibration points and low genetic diversity of Native Americans. We sequenced 92 whole mitochondrial genomes from pre-Columbian South American skeletons dating from 8.6 to 0.5 ka, allowing a detailed, temporally calibrated reconstruction of the peopling of the Americas in a Bayesian coalescent analysis. The data suggest that a small population entered the Americas via a coastal route around 16.0 ka, following previous isolation in eastern Beringia for ~2.4 to 9 thousand years after separation from eastern Siberian populations. Following a rapid movement throughout the Americas, limited gene flow in South America resulted in a marked phylogeographic structure of populations, which persisted through time. All of the ancient mitochondrial lineages detected in this study were absent from modern data sets, suggesting a high extinction rate. To investigate this further, we applied a novel principal components multiple logistic regression test to Bayesian serial coalescent simulations. The analysis supported a scenario in which European colonization caused a substantial loss of pre-Columbian lineages. PMID:27051878
Wielstra, Ben; Arntzen, Jan W
2011-06-14
The rapid radiation of crested newts (Triturus cristatus superspecies) comprises four morphotypes: 1) the T. karelinii group, 2) T. carnifex - T. macedonicus, 3) T. cristatus and 4) T. dobrogicus. These vary in body build and the number of rib-bearing pre-sacral vertebrae (NRBV). The phylogenetic relationships of the morphotypes have not yet been settled, despite several previous attempts, employing a variety of molecular markers. We here resolve the crested newt phylogeny by using complete mitochondrial genome sequences. Bayesian inference based on the mitogenomic data yields a fully bifurcating, significantly supported tree, though Maximum Likelihood inference yields low support values. The internal branches connecting the morphotypes are short relative to the terminal branches. Seen from the root of Triturus (NRBV = 13), a basal dichotomy separates the T. karelinii group (NRBV = 13) from the remaining crested newts. The next split divides the latter assortment into T. carnifex - T. macedonicus (NRBV = 14) versus T. cristatus (NRBV = 15) and T. dobrogicus (NRBV = 16 or 17). We argue that the Bayesian full mitochondrial DNA phylogeny is superior to previous attempts aiming to recover the crested newt species tree. Furthermore, our new phylogeny involves a maximally parsimonious interpretation of NRBV evolution. Calibrating the phylogeny allows us to evaluate potential drivers for crested newt cladogenesis. The split between the T. karelinii group and the three other morphotypes, at ca. 10.4 Ma, is associated with the separation of the Balkan and Anatolian landmasses (12-9 Ma). No currently known vicariant events can be ascribed to the other two splits, first at ca. 9.3 Ma, separating T. carnifex - T. macedonicus, and second at ca. 8.8 Ma, splitting T. cristatus and T. dobrogicus. The crested newt morphotypes differ in the duration of their annual aquatic period. We speculate on the role that this ecological differentiation could have played during speciation.
Davis, Brian W; Li, Gang; Murphy, William J
2010-07-01
The pantherine lineage of cats diverged from the remainder of modern Felidae less than 11 million years ago and consists of the five big cats of the genus Panthera, the lion, tiger, jaguar, leopard, and snow leopard, as well as the closely related clouded leopard. A significant problem exists with respect to the precise phylogeny of these highly threatened great cats. Despite multiple publications on the subject, no two molecular studies have reconstructed Panthera with the same topology. These evolutionary relationships remain unresolved partially due to the recent and rapid radiation of pantherines in the Pliocene, individual speciation events occurring within less than 1 million years, and probable introgression between lineages following their divergence. We provide an alternative, highly supported interpretation of the evolutionary history of the pantherine lineage using novel and published DNA sequence data from the autosomes, both sex chromosomes and the mitochondrial genome. New sequences were generated for 39 single-copy regions of the felid Y chromosome, as well as four mitochondrial and four autosomal gene segments, totaling 28.7 kb. Phylogenetic analysis of these new data, combined with all published data in GenBank, highlighted the prevalence of phylogenetic disparities stemming either from the amplification of a mitochondrial to nuclear translocation event (numt), or errors in species identification. Our 47.6 kb combined dataset was analyzed as a supermatrix and with respect to individual partitions using maximum likelihood and Bayesian phylogenetic inference, in conjunction with Bayesian Estimation of Species Trees (BEST) which accounts for heterogeneous gene histories. Our results yield a robust consensus topology supporting the monophyly of lion and leopard, with jaguar sister to these species, as well as a sister species relationship of tiger and snow leopard. 
These results highlight new avenues for the study of speciation genomics and understanding the historical events surrounding the origin of the members of this lineage.
Winnowing DNA for rare sequences: highly specific sequence and methylation based enrichment.
Thompson, Jason D; Shibahara, Gosuke; Rajan, Sweta; Pel, Joel; Marziali, Andre
2012-01-01
Rare mutations in cell populations are known to be hallmarks of many diseases and cancers. Similarly, differential DNA methylation patterns arise in rare cell populations with diagnostic potential such as fetal cells circulating in maternal blood. Unfortunately, the frequency of alleles with diagnostic potential, relative to wild-type background sequence, is often well below the frequency of errors in currently available methods for sequence analysis, including very high throughput DNA sequencing. We demonstrate a DNA preparation and purification method that through non-linear electrophoretic separation in media containing oligonucleotide probes, achieves 10,000 fold enrichment of target DNA with single nucleotide specificity, and 100 fold enrichment of unmodified methylated DNA differing from the background by the methylation of a single cytosine residue.
Carpenter, Meredith L.; Buenrostro, Jason D.; Valdiosera, Cristina; Schroeder, Hannes; Allentoft, Morten E.; Sikora, Martin; Rasmussen, Morten; Gravel, Simon; Guillén, Sonia; Nekhrizov, Georgi; Leshtakov, Krasimir; Dimitrova, Diana; Theodossiev, Nikola; Pettener, Davide; Luiselli, Donata; Sandoval, Karla; Moreno-Estrada, Andrés; Li, Yingrui; Wang, Jun; Gilbert, M. Thomas P.; Willerslev, Eske; Greenleaf, William J.; Bustamante, Carlos D.
2013-01-01
Most ancient specimens contain very low levels of endogenous DNA, precluding the shotgun sequencing of many interesting samples because of cost. Ancient DNA (aDNA) libraries often contain <1% endogenous DNA, with the majority of sequencing capacity taken up by environmental DNA. Here we present a capture-based method for enriching the endogenous component of aDNA sequencing libraries. By using biotinylated RNA baits transcribed from genomic DNA libraries, we are able to capture DNA fragments from across the human genome. We demonstrate this method on libraries created from four Iron Age and Bronze Age human teeth from Bulgaria, as well as bone samples from seven Peruvian mummies and a Bronze Age hair sample from Denmark. Prior to capture, shotgun sequencing of these libraries yielded an average of 1.2% of reads mapping to the human genome (including duplicates). After capture, this fraction increased substantially, with up to 59% of reads mapped to human and enrichment ranging from 6- to 159-fold. Furthermore, we maintained coverage of the majority of regions sequenced in the precapture library. Intersection with the 1000 Genomes Project reference panel yielded an average of 50,723 SNPs (range 3,062–147,243) for the postcapture libraries sequenced with 1 million reads, compared with 13,280 SNPs (range 217–73,266) for the precapture libraries, increasing resolution in population genetic analyses. Our whole-genome capture approach makes it less costly to sequence aDNA from specimens containing very low levels of endogenous DNA, enabling the analysis of larger numbers of samples. PMID:24568772
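The fold-enrichment figures quoted in this abstract are ratios of the mapped-read fraction after capture to the fraction before capture. A minimal sketch of that arithmetic follows. The read counts are hypothetical and chosen only for illustration, the function names are not from the paper, and duplicate-read handling is ignored.

```python
def mapped_fraction(mapped_reads, total_reads):
    """Fraction of sequenced reads that map to the target genome."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return mapped_reads / total_reads


def fold_enrichment(pre_fraction, post_fraction):
    """Ratio of the post-capture to the pre-capture mapped fraction."""
    if pre_fraction <= 0:
        raise ValueError("pre_fraction must be positive")
    return post_fraction / pre_fraction


# Hypothetical library: 1.2% endogenous before capture, 30% after capture.
pre = mapped_fraction(12_000, 1_000_000)
post = mapped_fraction(300_000, 1_000_000)
fold = fold_enrichment(pre, post)
print(f"pre={pre:.1%} post={post:.1%} fold={fold:.1f}")
```

A library that starts with a very low endogenous fraction can show a large fold enrichment while still ending at a modest absolute fraction, which is why both numbers are reported.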
Biological nanopore MspA for DNA sequencing
NASA Astrophysics Data System (ADS)
Manrao, Elizabeth A.
Unlocking the information hidden in the human genome provides insight into the inner workings of complex biological systems and can be used to greatly improve health-care. In order to allow for widespread sequencing, new technologies are required that provide fast and inexpensive readings of DNA. Nanopore sequencing is a third generation DNA sequencing technology that is currently being developed to fulfill this need. In nanopore sequencing, a voltage is applied across a small pore in an electrolyte solution and the resulting ionic current is recorded. When DNA passes through the channel, the ionic current is partially blocked. If the DNA bases uniquely modulate the ionic current flowing through the channel, the time trace of the current can be related to the sequence of DNA passing through the pore. There are two main challenges to realizing nanopore sequencing: identifying a pore with sensitivity to single nucleotides and controlling the translocation of DNA through the pore so that the small single nucleotide current signatures are distinguishable from background noise. In this dissertation, I explore the use of Mycobacterium smegmatis porin A (MspA) for nanopore sequencing. In order to determine MspA's sensitivity to single nucleotides, DNA strands of various compositions are held in the pore as the resulting ionic current is measured. DNA is immobilized in MspA by attaching it to a large molecule which acts as an anchor. This technique confirms the single nucleotide resolution of the pore and additionally shows that MspA is sensitive to epigenetic modifications and single nucleotide polymorphisms. The forces from the electric field within MspA, the effective charge of nucleotides, and elasticity of DNA are estimated using a Freely Jointed Chain model of single stranded DNA. These results offer insight into the interactions of DNA within the pore. 
With the nucleotide sensitivity of MspA confirmed, a method is introduced to controllably pass DNA through the pore. Using a DNA polymerase, DNA strands are stepped through MspA one nucleotide at a time. The steps are observable as distinct levels on the ionic-current time-trace and are related to the DNA sequence. These experiments overcome the two fundamental challenges to realizing MspA nanopore sequencing and pave the way to the development of a commercial technology.
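For reference, the standard textbook force-extension relation of a Freely Jointed Chain is given below. The abstract does not say which form or parameter values the dissertation uses, so the symbols here are assumptions: $F$ is the stretching force, $L$ the contour length, $b$ the Kuhn segment length, and $k_B T$ the thermal energy.

```latex
\frac{\langle x \rangle}{L}
  = \coth\!\left(\frac{F b}{k_B T}\right) - \frac{k_B T}{F b}
```

At low force this reduces to a linear spring, $\langle x \rangle / L \approx F b / (3 k_B T)$, and at high force the extension approaches the contour length $L$.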
Effects of sequence on DNA wrapping around histones
NASA Astrophysics Data System (ADS)
Ortiz, Vanessa
2011-03-01
A central question in biophysics is whether the sequence of a DNA strand affects its mechanical properties. In epigenetics, these are thought to influence nucleosome positioning and gene expression. Theoretical and experimental attempts to answer this question have been hindered by an inability to directly resolve DNA structure and dynamics at the base-pair level. In our previous studies we used a detailed model of DNA to measure the effects of sequence on the stability of naked DNA under bending. Sequence was shown to influence DNA's ability to form kinks, which arise when certain motifs slide past others to form non-native contacts. Here, we have now included histone-DNA interactions to see if the results obtained for naked DNA are transferable to the problem of nucleosome positioning. Different DNA sequences interacting with the histone protein complex are studied, and their equilibrium and mechanical properties are compared among themselves and with the naked case. NLM training grant to the Computation and Informatics in Biology and Medicine Training Program (NLM T15LM007359).
Taggart, David J.; Camerlengo, Terry L.; Harrison, Jason K.; Sherrer, Shanen M.; Kshetry, Ajay K.; Taylor, John-Stephen; Huang, Kun; Suo, Zucai
2013-01-01
Cellular genomes are constantly damaged by endogenous and exogenous agents that covalently and structurally modify DNA to produce DNA lesions. Although most lesions are mended by various DNA repair pathways in vivo, a significant number of damage sites persist during genomic replication. Our understanding of the mutagenic outcomes derived from these unrepaired DNA lesions has been hindered by the low throughput of existing sequencing methods. Therefore, we have developed a cost-effective high-throughput short oligonucleotide sequencing assay that uses next-generation DNA sequencing technology for the assessment of the mutagenic profiles of translesion DNA synthesis catalyzed by any error-prone DNA polymerase. The vast amount of sequencing data produced were aligned and quantified by using our novel software. As an example, the high-throughput short oligonucleotide sequencing assay was used to analyze the types and frequencies of mutations upstream, downstream and at a site-specifically placed cis–syn thymidine–thymidine dimer generated individually by three lesion-bypass human Y-family DNA polymerases. PMID:23470999
Mamos, Tomasz; Bącela-Spychalska, Karolina; Rewicz, Tomasz; Wattier, Remi A.
2017-01-01
Background The Balkans are a major worldwide biodiversity and endemism hotspot. Among the freshwater biota, amphipods are known for their high cryptic diversity. However, little is known about the temporal and paleogeographic aspects of their evolutionary history. We used paleogeography as a framework for understanding the onset of diversification in Gammarus roeselii: (1) we hypothesised that, given the high number of isolated waterbodies in the Balkans, the species is characterised by a high level of cryptic diversity, even on a local scale; (2) the long geological history of the region might promote pre-Pleistocene divergence between lineages; (3) given that G. roeselii thrives both in lakes and rivers, its evolutionary history could be linked to the Balkan Neogene paleolake system; (4) we inspected whether the Pleistocene decline of hydrological networks could have had any impact on the diversification of G. roeselii. Material and Methods DNA was extracted from 177 individuals collected from 26 sites across the Balkans. A ca. 650 bp fragment of the mtDNA cytochrome oxidase subunit I (COI) was amplified from all individuals. After defining molecular operational taxonomic units (MOTUs) based on COI, a ca. 900 bp fragment of the nuclear 28S rDNA was amplified from 50 individuals. Molecular diversity, divergence, differentiation and historical demography based on COI sequences were estimated for each MOTU. The relative frequency, geographic distribution and molecular divergence between COI haplotypes were presented as a median-joining network. COI was also used to reconstruct a time-calibrated phylogeny with Bayesian inference. Probabilities of ancestors’ occurrence in riverine or lacustrine habitats, as well as their possible geographic locations, were estimated with the Bayesian method. A Neighbour Joining tree was constructed to illustrate the phylogenetic relationships between 28S rDNA haplotypes. Results We revealed that G. roeselii includes at least 13 cryptic species or MOTUs, mostly of Miocene origin. Substantial Pleistocene diversification within MOTUs was observed in several cases. We found evidence of secondary contacts between very divergent MOTUs and of introgression of nDNA. The Miocene ancestors could have lived in either lacustrine or riverine habitats, yet their presumed geographic localisations overlapped with those of the Neogene lakes. Several extant riverine populations had Pleistocene lacustrine ancestors. Discussion Neogene divergence of lineages resulting in substantial cryptic diversity may be a common phenomenon in extant freshwater benthic crustaceans occupying areas that were not glaciated during the Pleistocene. Evolution of G. roeselii could be associated with gradual deterioration of the paleolakes. The within-MOTU diversification might have been driven by fragmentation of river systems during the Pleistocene. Extant ancient lakes could have served as local microrefugia during that time. PMID:28265503
Singh, Pushpendra; Benjak, Andrej; Schuenemann, Verena J.; Herbig, Alexander; Avanzi, Charlotte; Busso, Philippe; Nieselt, Kay; Krause, Johannes; Vera-Cabrera, Lucio; Cole, Stewart T.
2015-01-01
Mycobacterium lepromatosis is an uncultured human pathogen associated with diffuse lepromatous leprosy and a reactional state known as Lucio's phenomenon. By using deep sequencing with and without DNA enrichment, we obtained the near-complete genome sequence of M. lepromatosis present in a skin biopsy from a Mexican patient, and compared it with that of Mycobacterium leprae, which has undergone extensive reductive evolution. The genomes display extensive synteny and are similar in size (∼3.27 Mb). Protein-coding genes share 93% nucleotide sequence identity, whereas pseudogenes are only 82% identical. The events that led to pseudogenization of 50% of the genome likely occurred before divergence from their most recent common ancestor (MRCA), and both M. lepromatosis and M. leprae have since accumulated new pseudogenes or acquired specific deletions. Functional comparisons suggest that M. lepromatosis has lost several enzymes required for amino acid synthesis whereas M. leprae has a defective heme pathway. M. lepromatosis has retained all functions required to infect the Schwann cells of the peripheral nervous system and therefore may also be neuropathogenic. A phylogeographic survey of 227 leprosy biopsies by differential PCR revealed that 221 contained M. leprae whereas only six, all from Mexico, harbored M. lepromatosis. Phylogenetic comparisons indicate that M. lepromatosis is closer than M. leprae to the MRCA, and a Bayesian dating analysis suggests that they diverged from their MRCA approximately 13.9 Mya. Thus, despite their ancient separation, the two leprosy bacilli are remarkably conserved and still cause similar pathologic conditions. PMID:25831531
Renz, Adina J.; Meyer, Axel; Kuraku, Shigehiro
2013-01-01
Cartilaginous fishes, divided into Holocephali (chimaeras) and Elasmobranchii (sharks, rays and skates), occupy a key phylogenetic position among extant vertebrates in reconstructing their evolutionary processes. An accurate evolutionary time scale is indispensable for better understanding of the relationship between phenotypic and molecular evolution of cartilaginous fishes. However, our current knowledge of the time scale of cartilaginous fish evolution largely relies on estimates using mitochondrial DNA sequences. In this study, making the best use of the still partial, but large-scale sequencing data of cartilaginous fish species, we estimate the divergence times between the major cartilaginous fish lineages employing nuclear genes. By rigorous orthology assessment based on available genomic and transcriptomic sequence resources for cartilaginous fishes, we selected 20 protein-coding genes in the nuclear genome, spanning 2973 amino acid residues. Our analysis based on Bayesian inference yielded a mean divergence time of 421 Ma, the late Silurian, for the Holocephali-Elasmobranchii split, and 306 Ma, the late Carboniferous, for the split between sharks and rays/skates. By applying these results and other documented divergence times, we measured the relative evolutionary rate of the Hox A cluster sequences in the cartilaginous fish lineages, which resulted in a substitution rate lower by a factor of at least 2.4 in comparison to tetrapod lineages. The obtained time scale enables mapping phenotypic and molecular changes in a quantitative framework. It is of great interest to corroborate the less derived nature of cartilaginous fish at the molecular level as a genome-wide phenomenon. PMID:23825540
Briosio-Aguilar, R; Pinto, H A; Rodríguez-Santiago, M A; López-García, K; García-Varela, M; de León, G Pérez-Ponce
2018-06-01
The phylogenetic position of Clinostomum heluans Braun, 1899 within the genus Clinostomum Leidy, 1856 is reported in this study based on sequences of the barcoding region of the mitochondrial cytochrome c oxidase subunit 1 gene ( COX1). Additionally, molecular data are used to link the adult and the metacercariae of the species. The metacercariae of C. heluans were found encysted infecting the cichlid fish Australoheros sp. in Minas Gerais, Brazil, whereas the adults were obtained from the mouth cavity of the Great White Egret, Ardea alba, in Campeche, Mexico. The COX1 sequences obtained for the Mexican clinostomes and the Brazilian metacercaria were almost identical (0.2% molecular divergence), indicating conspecificity. Similar molecular divergence (0.2-0.4%) was found between sequences of C. heluans reported here and Clinostomum sp. 6 previously obtained from a metacercaria recovered from the cichlid Cichlasoma boliviense in Santa Cruz, Bolivia. Both maximum likelihood and Bayesian inference analyses unequivocally showed the conspecificity between C. heluans and Clinostomum sp. 6, which form a monophyletic clade with high nodal support and very low genetic divergence. Moreover, tree topology reveals that C. heluans occupies a basal position with respect to New World species of Clinostomum, although a denser taxon sampling of species within the genus is further required. The metacercaria of C. heluans seems to be specific to cichlid fish because both samples from South America were recovered from species of this fish family, although not closely related.
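The divergence percentages quoted above (0.2%, 0.2-0.4%) are pairwise distances between aligned COX1 sequences. The abstract does not say which distance measure was used; the sketch below assumes the simplest one, the uncorrected p-distance (proportion of differing sites), and the function name and the treatment of gaps and ambiguous bases are illustrative choices only.

```python
def p_distance(seq_a, seq_b, skip="-N?"):
    """Uncorrected p-distance between two aligned sequences.

    Sites where either sequence has a gap or ambiguity code (characters
    listed in `skip`) are excluded from the comparison. This is an assumed
    convention, not one taken from the study.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in skip or b in skip:
            continue  # ignore sites that cannot be compared
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


# Toy alignment: 1 difference over 10 comparable sites.
print(p_distance("ACGTACGTAC", "ACGTACGTAT"))
```

Multiplying the result by 100 gives a percentage comparable in form to the values reported in the abstract.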
An extended sequence specificity for UV-induced DNA damage.
Chung, Long H; Murray, Vincent
2018-01-01
The sequence specificity of UV-induced DNA damage was determined with a higher precision and accuracy than previously reported. UV light induces two major damage adducts: cyclobutane pyrimidine dimers (CPDs) and pyrimidine(6-4)pyrimidone photoproducts (6-4PPs). Employing capillary electrophoresis with laser-induced fluorescence and taking advantage of the distinct properties of the CPDs and 6-4PPs, we studied the sequence specificity of UV-induced DNA damage in a purified DNA sequence using two approaches: end-labelling and a polymerase stop/linear amplification assay. A mitochondrial DNA sequence that contained a random nucleotide composition was employed as the target DNA sequence. With previous methodology, the UV sequence specificity was determined at a dinucleotide or trinucleotide level; however, in this paper, we have extended the UV sequence specificity to a hexanucleotide level. With the end-labelling technique (for 6-4PPs), the consensus sequence was found to be 5'-GCTC*AC (where C* is the breakage site); while with the linear amplification procedure, it was 5'-TCTT*AC. With end-labelling, the dinucleotide frequency of occurrence was highest for 5'-TC*, 5'-TT* and 5'-CC*; whereas it was 5'-TT* for linear amplification. The influence of neighbouring nucleotides on the degree of UV-induced DNA damage was also examined. The core sequences consisted of pyrimidine nucleotides 5'-CTC* and 5'-CTT* while an A at position "1" and C at position "2" enhanced UV-induced DNA damage. Crown Copyright © 2017. Published by Elsevier B.V. All rights reserved.
Church, George M.; Kieffer-Higgins, Stephen
1992-01-01
This invention features vectors and a method for sequencing DNA. The method includes the steps of: a) ligating the DNA into a vector comprising a tag sequence, the tag sequence includes at least 15 bases, wherein the tag sequence will not hybridize to the DNA under stringent hybridization conditions and is unique in the vector, to form a hybrid vector, b) treating the hybrid vector in a plurality of vessels to produce fragments comprising the tag sequence, wherein the fragments differ in length and terminate at a fixed known base or bases, wherein the fixed known base or bases differs in each vessel, c) separating the fragments from each vessel according to their size, d) hybridizing the fragments with an oligonucleotide able to hybridize specifically with the tag sequence, and e) detecting the pattern of hybridization of the tag sequence, wherein the pattern reflects the nucleotide sequence of the DNA.
BiQ Analyzer HT: locus-specific analysis of DNA methylation by high-throughput bisulfite sequencing
Lutsik, Pavlo; Feuerbach, Lars; Arand, Julia; Lengauer, Thomas; Walter, Jörn; Bock, Christoph
2011-01-01
Bisulfite sequencing is a widely used method for measuring DNA methylation in eukaryotic genomes. The assay provides single-base pair resolution and, given sufficient sequencing depth, its quantitative accuracy is excellent. High-throughput sequencing of bisulfite-converted DNA can be applied either genome wide or targeted to a defined set of genomic loci (e.g. using locus-specific PCR primers or DNA capture probes). Here, we describe BiQ Analyzer HT (http://biq-analyzer-ht.bioinf.mpi-inf.mpg.de/), a user-friendly software tool that supports locus-specific analysis and visualization of high-throughput bisulfite sequencing data. The software facilitates the shift from time-consuming clonal bisulfite sequencing to the more quantitative and cost-efficient use of high-throughput sequencing for studying locus-specific DNA methylation patterns. In addition, it is useful for locus-specific visualization of genome-wide bisulfite sequencing data. PMID:21565797
A DNA sequence analysis package for the IBM personal computer.
Lagrimini, L M; Brentano, S T; Donelson, J E
1984-01-01
We present here a collection of DNA sequence analysis programs, called "PC Sequence" (PCS), which are designed to run on the IBM Personal Computer (PC). These programs are written in IBM PC compiled BASIC and take full advantage of the IBM PC's speed, error handling, and graphics capabilities. For a modest initial expense in hardware any laboratory can use these programs to quickly perform computer analysis on DNA sequences. They are written with the novice user in mind and require very little training or previous experience with computers. Also provided are a text editing program for creating and modifying DNA sequence files and a communications program which enables the PC to communicate with and collect information from mainframe computers and DNA sequence databases. PMID:6546433
Genomic sequencing of Pleistocene cave bears
DOE Office of Scientific and Technical Information (OSTI.GOV)
Noonan, James P.; Hofreiter, Michael; Smith, Doug
2005-04-01
Despite the information content of genomic DNA, ancient DNA studies to date have largely been limited to amplification of mitochondrial DNA due to technical hurdles such as contamination and degradation of ancient DNAs. In this study, we describe two metagenomic libraries constructed using unamplified DNA extracted from the bones of two 40,000-year-old extinct cave bears. Analysis of ~1 Mb of sequence from each library showed that, despite significant microbial contamination, 5.8 percent and 1.1 percent of clones in the libraries contain cave bear inserts, yielding 26,861 bp of cave bear genome sequence. Alignment of this sequence to the dog genome, the closest sequenced genome to cave bear in terms of evolutionary distance, revealed roughly the expected ratio of cave bear exons, repeats and conserved noncoding sequences. Only 0.04 percent of all clones sequenced were derived from contamination with modern human DNA. Comparison of cave bear with orthologous sequences from several modern bear species revealed the evolutionary relationship of these lineages. Using the metagenomic approach described here, we have recovered substantial quantities of mammalian genomic sequence more than twice as old as any previously reported, establishing the feasibility of ancient DNA genomic sequencing programs.
M. -S. Kim; N. B. Klopfenstein; J. W. Hanna; G. I. McDonald
2006-01-01
Phylogenetic and genetic relationships among 10 North American Armillaria species were analysed using sequence data from ribosomal DNA (rDNA), including intergenic spacer (IGS-1), internal transcribed spacers with associated 5.8S (ITS + 5.8S), and nuclear large subunit rDNA (nLSU), and amplified fragment length polymorphism (AFLP) markers. Based on rDNA sequence data,...
Fractal landscape analysis of DNA walks
NASA Technical Reports Server (NTRS)
Peng, C. K.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Sciortino, F.; Simons, M.; Stanley, H. E.
1992-01-01
By mapping nucleotide sequences onto a "DNA walk", we uncovered remarkably long-range power law correlations [Nature 356 (1992) 168] that imply a new scale invariant property of DNA. We found such long-range correlations in intron-containing genes and in non-transcribed regulatory DNA sequences, but not in cDNA sequences or intron-less genes. In this paper, we present more explicit evidence to support our findings.
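The mapping from a sequence to a "DNA walk" can be illustrated with a short sketch. The abstract does not state the step rule; the code below assumes the purine/pyrimidine rule commonly associated with this line of work (one step up for a pyrimidine, one step down for a purine), and the function name is hypothetical. It covers only the construction of the walk, not the fluctuation analysis used to detect power-law correlations.

```python
from itertools import accumulate

# Assumed step rule: +1 for a pyrimidine (C, T), -1 for a purine (A, G).
STEP = {"C": 1, "T": 1, "A": -1, "G": -1}


def dna_walk(seq):
    """Return the net displacement of the walk after each nucleotide."""
    steps = [STEP[base] for base in seq.upper()]
    return list(accumulate(steps))


# Toy sequence; real analyses use sequences thousands of bases long.
print(dna_walk("ATGCCTTAG"))  # [-1, 0, -1, 0, 1, 2, 3, 2, 1]
```

Long-range correlations would then be assessed from how the fluctuations of this displacement grow with window length, a step omitted here.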
[Genome-scale sequence data processing and epigenetic analysis of DNA methylation].
Wang, Ting-Zhang; Shan, Gao; Xu, Jian-Hong; Xue, Qing-Zhong
2013-06-01
BS-Seq is a recently developed approach for detecting cytosine DNA methylation (mC) and profiling DNA methylation at the genome scale; it combines bisulfite conversion of genomic DNA with next-generation sequencing. The method can not only provide insight into differences in genome-scale DNA methylation among organisms, but also reveal the conservation of DNA methylation in all sequence contexts and the nucleotide preferences of different genomic regions, including genes, exons, and repetitive DNA sequences. It will be helpful for understanding the epigenetic impact of cytosine DNA methylation on the regulation of gene expression and on maintaining the silencing of repetitive sequences, such as transposable elements. In this paper, we introduce the preprocessing steps for DNA methylation data, in which cytosine (C) and guanine (G) in the reference sequence are converted to thymine (T) and adenine (A), respectively, and cytosine in the reads is converted to thymine. We also comprehensively review the main components of genome-scale DNA methylation analysis: (1) cytosine methylation in different sequence contexts; (2) the genomic distribution of methylcytosine; (3) DNA methylation context and nucleotide preference; (4) DNA-protein interaction sites of DNA methylation; (5) the degree of cytosine methylation in the different structural elements of genes. DNA methylation analysis provides a powerful tool for studying the epigenome in human and other species and gene-environment interactions, and lays a theoretical foundation for the further development of human disease diagnostics and therapeutics.
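The in-silico conversion used in this preprocessing can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline reviewed in the paper: the function names are hypothetical, reads are placed by exact string matching instead of a real aligner, and only forward-strand reads are handled.

```python
def c_to_t(seq):
    """Fully convert C to T (used for forward-strand reference and reads)."""
    return seq.upper().replace("C", "T")


def g_to_a(seq):
    """Fully convert G to A (reference copy for reverse-strand reads)."""
    return seq.upper().replace("G", "A")


def convert_reference(ref):
    """Build the two converted copies of the reference sequence."""
    return {"C2T": c_to_t(ref), "G2A": g_to_a(ref)}


def call_methylation(ref, read):
    """Place a forward-strand read and call methylation at reference Cs.

    Placement is a toy exact match of the converted read against the
    C-to-T reference (an assumption standing in for a real aligner).
    Returns {reference position: True if methylated}, or None if the
    read cannot be placed.
    """
    offset = c_to_t(ref).find(c_to_t(read))
    if offset == -1:
        return None
    calls = {}
    for i, base in enumerate(read.upper()):
        if ref[offset + i].upper() == "C":
            # A C that survived bisulfite treatment was methylated.
            calls[offset + i] = base == "C"
    return calls


reference = "ACGTCCGA"
read = "ACGTTTGA"  # C at position 1 protected; Cs at 4 and 5 converted
print(convert_reference(reference))
print(call_methylation(reference, read))
```

Converting both the reference and the reads before matching removes the C/T mismatches introduced by bisulfite treatment, so methylation status is read off afterwards by comparing the original read with the original reference.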
Extracting DNA words based on the sequence features: non-uniform distribution and integrity.
Li, Zhi; Cao, Hongyan; Cui, Yuehua; Zhang, Yanbo
2016-01-25
A DNA sequence can be viewed as an unknown language with words as its functional units. Given that most sequence alignment algorithms, such as motif discovery algorithms, depend on the quality of background information about sequences, it is necessary to develop an ab initio algorithm for extracting the "words" based only on the DNA sequences. We considered non-uniform distribution and integrity to be two important features of a word, and on this basis developed an ab initio algorithm to extract "DNA words" that have potential functional meaning. A Kolmogorov-Smirnov test was used to test for consistency with a uniform distribution in DNA sequences, and integrity was judged by sequence and position alignment. Two random base sequences were adopted as negative controls, and an English book was used as a positive control to verify our algorithm. We applied our algorithm to the genomes of Saccharomyces cerevisiae and 10 strains of Escherichia coli to show the utility of the method. The results provide strong evidence that the algorithm is a promising tool for building a DNA dictionary ab initio. Our method provides a fast way for large-scale screening of important DNA elements and offers potential insights into the understanding of a genome.
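The uniformity check can be illustrated with a one-sample Kolmogorov-Smirnov statistic computed on the positions at which a candidate word occurs. The abstract does not give the exact procedure, so the sketch below makes its own assumptions: positions are scaled by sequence length and compared with the uniform distribution on [0, 1), only the statistic D is computed (no p-value), and the function names are hypothetical.

```python
def word_positions(seq, word):
    """Start positions of all (possibly overlapping) occurrences of word."""
    positions = []
    start = seq.find(word)
    while start != -1:
        positions.append(start)
        start = seq.find(word, start + 1)
    return positions


def ks_uniform_statistic(positions, length):
    """One-sample KS statistic D against the uniform distribution.

    Positions are scaled to [0, 1); D is the largest gap between the
    empirical CDF of the scaled positions and the uniform CDF F(x) = x.
    """
    xs = sorted(p / length for p in positions)
    n = len(xs)
    if n == 0:
        raise ValueError("word does not occur in the sequence")
    d = 0.0
    for i, x in enumerate(xs):
        # Check the empirical CDF just after and just before each point.
        d = max(d, (i + 1) / n - x, x - i / n)
    return d


evenly_spread = "ACGT" * 5
hits = word_positions(evenly_spread, "ACG")
print(hits)                                    # [0, 4, 8, 12, 16]
print(ks_uniform_statistic(hits, len(evenly_spread)))

# Occurrences bunched at the start of a long sequence give a large D.
print(ks_uniform_statistic([0, 1, 2, 3, 4], 100))
```

A small D is consistent with occurrences spread uniformly along the sequence, while a large D indicates clustering, which is the non-uniformity the abstract treats as a feature of a word.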
Oliveira, Larissa Rosa de; Gehara, Marcelo C M; Fraga, Lúcia D; Lopes, Fernando; Túnez, Juan Ignacio; Cassini, Marcelo H; Majluf, Patricia; Cárdenas-Alayza, Susana; Pavés, Héctor J; Crespo, Enrique Alberto; García, Nestor; Loizaga de Castro, Rocío; Hoelzel, A Rus; Sepúlveda, Maritza; Olavarría, Carlos; Valiati, Victor Hugo; Quiñones, Renato; Pérez-Alvarez, Maria Jose; Ott, Paulo Henrique; Bonatto, Sandro L
2017-01-01
The South American sea lion (Otaria flavescens) is widely distributed along the southern Atlantic and Pacific coasts of South America with a history of significant commercial exploitation. We aimed to evaluate the population genetic structure and the evolutionary history of the South American sea lion across its distribution by analyses of mitochondrial DNA (mtDNA) and 10 nuclear microsatellite loci. We analyzed 147 sequences of the mtDNA control region and genotyped 111 individuals of South American sea lion for 10 microsatellite loci, representing six populations (Peru, Northern Chile, Southern Chile, Uruguay (Brazil), Argentina and Falkland (Malvinas) Islands) and covering the entire distribution of the species. The mtDNA phylogeny shows that haplotypes from the two oceans comprise two very divergent clades as observed in previous studies, suggesting a long period (>1 million years) of low inter-oceanic female gene flow. Bayesian analysis of bi-parental genetic diversity supports significant (but less pronounced than mitochondrial) genetic structure between Pacific and Atlantic populations, although it also suggested some inter-oceanic gene flow mediated by males. Higher male migration rates were found in the intra-oceanic population comparisons, supporting very high female philopatry in the species. Demographic analyses showed that populations from both oceans went through a large population expansion ~10,000 years ago, suggesting a very similar influence of historical environmental factors, such as the last glacial cycle, on both regions. Our results support the proposition that the Pacific and Atlantic populations of the South American sea lion should be considered distinct evolutionarily significant units, with at least two management units in each ocean.
Davis, Mark A.; Douglas, Marlis R.; Collyer, Michael L.; Douglas, Michael E.
2016-01-01
Morphological data are a conduit for the recognition and description of species, and their acquisition has recently been broadened by geometric morphometric (GM) approaches that conjoin the collection of digital data with exploratory ‘big data’ analytics. We employed this approach to dissect the Western Rattlesnake (Crotalus viridis) species-complex in North America, currently partitioned by mitochondrial (mt)DNA analyses into eastern and western lineages (two and seven subspecies, respectively). The GM data (i.e., 33 dorsal and 50 lateral head landmarks) were gleaned from 2,824 individuals located in 10 museum collections. We also downloaded and concatenated sequences for six mtDNA genes from the NCBI GenBank database. GM analyses revealed significant head shape differences attributable to size and subspecies-designation (but not their interactions). Pairwise shape distances among subspecies were significantly greater than those derived from ancestral character states via squared-change parsimony, with the greatest differences separating those most closely related. This, in turn, suggests the potential for historic character displacement as a diversifying force in the complex. All subspecies, save one, were significantly differentiated in a Bayesian discriminant function analysis (DFA), regardless of whether our priors were uniform or informative (i.e., mtDNA data). Shape differences among sister-clades were also significantly greater than expected by chance alone under a Brownian model of evolution, promoting the hypothesis that selection rather than drift was the driving force in the evolution of the complex. Lastly, we combined head shape and mtDNA data to derive an integrative taxonomy that produced robust boundaries for six OTUs (operational taxonomic units) of the C. viridis complex. We suggest these boundaries are concomitant with species-status and subsequently provide a relevant nomenclature for its recognition and representation. PMID:26816132
Hurtado, Luis A; Mateos, Mariana; Wang, Chang; Santamaria, Carlos A; Jung, Jongwoo; Khalaji-Pirbalouty, Valiallah; Kim, Won
2018-01-01
The native ranges and invasion histories of many marine species remain elusive due to a dynamic dispersal process via marine vessels. Molecular markers can aid in identification of native ranges and elucidation of the introduction and establishment process. The supralittoral isopod Ligia exotica has a wide tropical and subtropical distribution, frequently found in harbors and ports around the globe. This isopod is hypothesized to have an Old World origin, from where it was unintentionally introduced to other regions via wooden ships and solid ballast. Its native range, however, remains uncertain. Recent molecular studies uncovered the presence of two highly divergent lineages of L. exotica in East Asia, and suggest this region is a source of nonindigenous populations. In this study, we conducted phylogenetic analyses (Maximum Likelihood and Bayesian) of a fragment of the mitochondrial 16S ribosomal (r)DNA gene using a dataset of this isopod that greatly expanded previous representation from Asia and putative nonindigenous populations around the world. For a subset of samples, sequences of 12S rDNA and NaK were also obtained and analyzed together with 16S rDNA. Our results show that L. exotica is comprised of several highly divergent genetic lineages, which probably represent different species. Most of the 16S rDNA genetic diversity (48 haplotypes) was detected in East and Southeast Asia. Only seven haplotypes were observed outside this region (in the Americas, Hawai'i, Africa and India), which were identical or closely related to haplotypes found in East and Southeast Asia. Phylogenetic patterns indicate the L. exotica clade originated and diversified in East and Southeast Asia, and only members of one of the divergent lineages have spread out of this region, recently, suggesting the potential to become invasive is phylogenetically constrained.
Distinct patterns of mitochondrial genome diversity in bonobos (Pan paniscus) and humans.
Zsurka, Gábor; Kudina, Tatiana; Peeva, Viktoriya; Hallmann, Kerstin; Elger, Christian E; Khrapko, Konstantin; Kunz, Wolfram S
2010-09-02
We have analyzed the complete mitochondrial genomes of 22 Pan paniscus (bonobo, pygmy chimpanzee) individuals to assess the detailed mitochondrial DNA (mtDNA) phylogeny of this close relative of Homo sapiens. We identified three major clades among bonobos that separated approximately 540,000 years ago, as suggested by Bayesian analysis. Incidentally, we discovered that the current reference sequence for bonobo likely is a hybrid of the mitochondrial genomes of two distant individuals. When comparing spectra of polymorphic mtDNA sites in bonobos and humans, we observed two major differences: (i) Of all 31 bonobo mtDNA homoplasies, i.e. nucleotide changes that occurred independently on separate branches of the phylogenetic tree, 13 were not homoplasic in humans. This indicates that at least a part of the unstable sites of the mitochondrial genome is species-specific and difficult to be explained on the basis of a mutational hotspot concept. (ii) A comparison of the ratios of non-synonymous to synonymous changes (dN/dS) among polymorphic positions in bonobos and in 4902 Homo sapiens mitochondrial genomes revealed a remarkable difference in the strength of purifying selection in the mitochondrial genes of the F0F1-ATPase complex. While in bonobos this complex showed a similar low value as complexes I and IV, human haplogroups displayed 2.2 to 7.6 times increased dN/dS ratios when compared to bonobos. Some variants of mitochondrially encoded subunits of the ATPase complex in humans very likely decrease the efficiency of energy conversion leading to production of extra heat. Thus, we hypothesize that the species-specific release of evolutionary constraints for the mitochondrial genes of the proton-translocating ATPase is a consequence of altered heat homeostasis in modern humans.
Distinct patterns of mitochondrial genome diversity in bonobos (Pan paniscus) and humans
2010-01-01
Background We have analyzed the complete mitochondrial genomes of 22 Pan paniscus (bonobo, pygmy chimpanzee) individuals to assess the detailed mitochondrial DNA (mtDNA) phylogeny of this close relative of Homo sapiens. Results We identified three major clades among bonobos that separated approximately 540,000 years ago, as suggested by Bayesian analysis. Incidentally, we discovered that the current reference sequence for bonobo likely is a hybrid of the mitochondrial genomes of two distant individuals. When comparing spectra of polymorphic mtDNA sites in bonobos and humans, we observed two major differences: (i) Of all 31 bonobo mtDNA homoplasies, i.e. nucleotide changes that occurred independently on separate branches of the phylogenetic tree, 13 were not homoplasic in humans. This indicates that at least a part of the unstable sites of the mitochondrial genome is species-specific and difficult to be explained on the basis of a mutational hotspot concept. (ii) A comparison of the ratios of non-synonymous to synonymous changes (dN/dS) among polymorphic positions in bonobos and in 4902 Homo sapiens mitochondrial genomes revealed a remarkable difference in the strength of purifying selection in the mitochondrial genes of the F0F1-ATPase complex. While in bonobos this complex showed a similar low value as complexes I and IV, human haplogroups displayed 2.2 to 7.6 times increased dN/dS ratios when compared to bonobos. Conclusions Some variants of mitochondrially encoded subunits of the ATPase complex in humans very likely decrease the efficiency of energy conversion leading to production of extra heat. Thus, we hypothesize that the species-specific release of evolutionary constraints for the mitochondrial genes of the proton-translocating ATPase is a consequence of altered heat homeostasis in modern humans. PMID:20813043
A melting pot of multicontinental mtDNA lineages in admixed Venezuelans.
Gómez-Carballa, Alberto; Ignacio-Veiga, Ana; Alvarez-Iglesias, Vanesa; Pastoriza-Mourelle, Ana; Ruíz, Yarimar; Pineda, Lennie; Carracedo, Angel; Salas, Antonio
2012-01-01
The arrival of Europeans in Colonial and post-Colonial times coupled with the forced introduction of sub-Saharan Africans have dramatically changed the genetic background of Venezuela. The main aim of the present study was to evaluate, through the study of mitochondrial DNA (mtDNA) variation, the extent of admixture and the characterization of the most likely continental ancestral sources of present-day urban Venezuelans. We analyzed two admixed populations that have experienced different demographic histories, namely, Caracas (n = 131) and Pueblo Llano (n = 219). The native American component of admixed Venezuelans accounted for 80% (46% haplogroup [hg] A2, 7% hg B2, 21% hg C1, and 6% hg D1) of all mtDNAs; while the sub-Saharan and European contributions made up ∼10% each, indicating that Trans-Atlantic immigrants have only partially erased the native American nature of Venezuelans. A Bayesian-based model allowed the different contributions of European countries to admixed Venezuelans to be disentangled (Spain: ∼38.4%, Portugal: ∼35.5%, Italy: ∼27.0%), in good agreement with the documented history. Seventeen entire mtDNA genomes were sequenced, which allowed five new native American branches to be discovered. B2j and B2k, are supported by two different haplotypes and control region data, and their coalescence ages are 3.9 k.y. (95% C.I. 0-7.8) and 2.6 k.y. (95% C.I. 0.1-5.2), respectively. The other clades were exclusively observed in Pueblo Llano and they show the fingerprint of strong recent genetic drift coupled with severe historical consanguinity episodes that might explain the high prevalence of certain Mendelian and complex multi-factorial diseases in this region. Copyright © 2011 Wiley Periodicals, Inc.
Waqairatu, Salote S; Dierens, Leanne; Cowley, Jeff A; Dixon, Tom J; Johnson, Karyn N; Barnes, Andrew C; Li, Yutao
2012-08-01
The Black Tiger shrimp (Penaeus monodon) has a natural distribution range from East Africa to the South Pacific Islands. Although previous studies of Indo-Pacific P. monodon have found populations from the Indian Ocean and Australasia to differ genetically, their relatedness to South Pacific shrimp remains unknown. To address this, polymorphisms at eight shared microsatellite loci and haplotypes in a 418-bp mtDNA-CR (control region) sequence were examined across 682 P. monodon from locations spread widely across its natural range, including the South Pacific islands of Fiji, Palau, and Papua New Guinea (PNG). Observed microsatellite heterozygosities of 0.82-0.91, allele richness of 6.85-9.69, and significant mtDNA-CR haplotype variation indicated high levels of genetic diversity among the South Pacific shrimp. Analysis of microsatellite genotypes using a Bayesian STRUCTURE method segregated Indo-Pacific P. monodon into eight distinct clades, with Palau and PNG shrimp clustering among others from Southeast Asia and eastern Australia, respectively, and Fiji shrimp clustering as a distinct group. Phylogenetic analyses of mtDNA-CR haplotypes delineated shrimp into three groupings, with shrimp from Fiji again being distinct by sharing no haplotypes with other populations. Depending on regional location, the genetic structures and substructures identified from the genotyping and mtDNA-CR haplotype phylogeny could be explained by Metapopulation and/or Member-Vagrant type evolutionary processes. Neutrality tests of mutation-drift equilibrium and estimation of the time since population expansion supported a hypothesis that South Pacific P. monodon were colonized from Southeast Asia and eastern Australia during the Pleistocene period over 60,000 years ago when land bridges were more expansive and linked these regions more closely.
Gehara, Marcelo C. M.; Fraga, Lúcia D.; Lopes, Fernando; Túnez, Juan Ignacio; Cassini, Marcelo H.; Majluf, Patricia; Cárdenas-Alayza, Susana; Pavés, Héctor J.; Crespo, Enrique Alberto; García, Nestor; Loizaga de Castro, Rocío; Hoelzel, A. Rus; Sepúlveda, Maritza; Olavarría, Carlos; Valiati, Victor Hugo; Quiñones, Renato; Pérez-Alvarez, Maria Jose; Ott, Paulo Henrique
2017-01-01
The South American sea lion (Otaria flavescens) is widely distributed along the southern Atlantic and Pacific coasts of South America and has a history of significant commercial exploitation. We aimed to evaluate the population genetic structure and the evolutionary history of the South American sea lion across its distribution by analyzing mitochondrial DNA (mtDNA) and 10 nuclear microsatellite loci. We analyzed 147 sequences of the mtDNA control region and genotyped 111 individuals for 10 microsatellite loci, representing six populations (Peru, Northern Chile, Southern Chile, Uruguay (Brazil), Argentina and Falkland (Malvinas) Islands) and covering the entire distribution of the species. The mtDNA phylogeny shows that haplotypes from the two oceans comprise two very divergent clades, as observed in previous studies, suggesting a long period (>1 million years) of low inter-oceanic female gene flow. Bayesian analysis of bi-parental genetic diversity supports significant (but less pronounced than mitochondrial) genetic structure between Pacific and Atlantic populations, although it also suggests some inter-oceanic gene flow mediated by males. Higher male migration rates were found in the intra-oceanic population comparisons, supporting very high female philopatry in the species. Demographic analyses showed that populations from both oceans went through a large population expansion ~10,000 years ago, suggesting a very similar influence of historical environmental factors, such as the last glacial cycle, on both regions. Our results support the proposition that the Pacific and Atlantic populations of the South American sea lion should be considered distinct evolutionarily significant units, with at least two management units in each ocean. PMID:28654647
Waqairatu, Salote S; Dierens, Leanne; Cowley, Jeff A; Dixon, Tom J; Johnson, Karyn N; Barnes, Andrew C; Li, Yutao
2012-01-01
The Black Tiger shrimp (Penaeus monodon) has a natural distribution range from East Africa to the South Pacific Islands. Although previous studies of Indo-Pacific P. monodon have found populations from the Indian Ocean and Australasia to differ genetically, their relatedness to South Pacific shrimp remains unknown. To address this, polymorphisms at eight shared microsatellite loci and haplotypes in a 418-bp mtDNA-CR (control region) sequence were examined across 682 P. monodon from locations spread widely across its natural range, including the South Pacific islands of Fiji, Palau, and Papua New Guinea (PNG). Observed microsatellite heterozygosities of 0.82–0.91, allele richness of 6.85–9.69, and significant mtDNA-CR haplotype variation indicated high levels of genetic diversity among the South Pacific shrimp. Analysis of microsatellite genotypes using a Bayesian STRUCTURE method segregated Indo-Pacific P. monodon into eight distinct clades, with Palau and PNG shrimp clustering among others from Southeast Asia and eastern Australia, respectively, and Fiji shrimp clustering as a distinct group. Phylogenetic analyses of mtDNA-CR haplotypes delineated shrimp into three groupings, with shrimp from Fiji again being distinct by sharing no haplotypes with other populations. Depending on regional location, the genetic structures and substructures identified from the genotyping and mtDNA-CR haplotype phylogeny could be explained by Metapopulation and/or Member–Vagrant type evolutionary processes. Neutrality tests of mutation-drift equilibrium and estimation of the time since population expansion supported a hypothesis that South Pacific P. monodon were colonized from Southeast Asia and eastern Australia during the Pleistocene period over 60,000 years ago when land bridges were more expansive and linked these regions more closely. PMID:22957205
Xu, Yi-Hua; Manoharan, Herbert T; Pitot, Henry C
2007-09-01
The bisulfite genomic sequencing technique is one of the most widely used techniques to study sequence-specific DNA methylation because it unambiguously reveals DNA methylation status at single-nucleotide resolution. One characteristic feature of the technique is that a number of sample sequence files are produced from a single DNA sample. The PCR products of bisulfite-treated DNA samples cannot be sequenced directly because they are heterogeneous; they must therefore be cloned into suitable plasmids and then sequenced. This procedure generates an enormous number of sample DNA sequence files and adds extra plasmid-derived bases to the sequences, which causes problems in the final sequence comparison. Finding the methylation status of each CpG in each sample sequence is not an easy job, and CpG PatternFinder was developed for this purpose. The main functions of CpG PatternFinder are: (i) to analyze the reference sequence to obtain CpG and non-CpG-C residue position information; (ii) to tailor sample sequence files (delete insertions and mark deletions from the sample sequence files) based on a configuration of ClustalW multiple alignment; (iii) to align sample sequence files with a reference file to obtain bisulfite conversion efficiency and CpG methylation status; and (iv) to produce graphics, highlighted aligned sequence text and a summary report which can be easily exported to the Microsoft Office suite. CpG PatternFinder is designed to operate cooperatively with BioEdit, a freeware program available on the internet. It can handle up to 100 files of sample DNA sequences simultaneously, and the total CpG pattern analysis process can be finished in minutes. CpG PatternFinder is an ideal software tool for DNA methylation studies to determine the differential methylation pattern in a large number of individuals in a population.
Previously we developed the CpG Analyzer program; CpG PatternFinder is our further effort to create software tools for DNA methylation studies.
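Functions (i) and (iii) described above can be illustrated with a minimal Python sketch. This is not the CpG PatternFinder implementation; the function names `cytosine_positions` and `call_methylation` are hypothetical. It assumes that the sample has already been tailored to the reference (same length, no gaps in the reference), that reads are on the forward strand, and that non-CpG cytosines are unmethylated, so that any C remaining at a non-CpG site counts as a conversion failure.

```python
from typing import Dict, List, Optional, Tuple


def cytosine_positions(reference: str) -> Tuple[List[int], List[int]]:
    """Return 0-based positions of CpG cytosines and of non-CpG cytosines."""
    ref = reference.upper()
    cpg: List[int] = []
    non_cpg: List[int] = []
    for i, base in enumerate(ref):
        if base != "C":
            continue
        # A cytosine followed directly by G is a CpG site.
        if i + 1 < len(ref) and ref[i + 1] == "G":
            cpg.append(i)
        else:
            non_cpg.append(i)
    return cpg, non_cpg


def call_methylation(
    reference: str, sample: str
) -> Tuple[Dict[int, str], Optional[float]]:
    """Call CpG status and estimate bisulfite conversion efficiency.

    Assumes `sample` is pre-aligned to `reference` (equal length).
    """
    if len(reference) != len(sample):
        raise ValueError("sample must be aligned to the reference")
    smp = sample.upper()
    cpg, non_cpg = cytosine_positions(reference)

    status: Dict[int, str] = {}
    for i in cpg:
        if smp[i] == "C":
            status[i] = "methylated"      # protected from conversion
        elif smp[i] == "T":
            status[i] = "unmethylated"    # converted C -> U, read as T
        else:
            status[i] = "undetermined"    # gap, N, or sequencing error

    # Efficiency: fraction of informative non-CpG cytosines read as T.
    converted = sum(1 for i in non_cpg if smp[i] == "T")
    informative = sum(1 for i in non_cpg if smp[i] in ("C", "T"))
    efficiency = converted / informative if informative else None
    return status, efficiency


if __name__ == "__main__":
    reference = "ACGTCCGATCA"   # toy reference, invented for illustration
    sample = "ATGTTCGATCA"      # toy bisulfite-converted clone
    print(call_methylation(reference, sample))
```

Running the same call over every clone of a sample and tabulating the per-position status would give the kind of CpG methylation pattern the summary report describes.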