dna sequence comparison: Topics by Science.gov

Sample records for dna sequence comparison

An improved model for whole genome phylogenetic analysis by Fourier transform.

PubMed

Yin, Changchuan; Yau, Stephen S-T

2015-10-07

DNA sequence similarity comparison is one of the major steps in computational phylogenetic studies. The sequence comparison of closely related DNA sequences and genomes is usually performed by multiple sequence alignments (MSA). While the MSA method is accurate for some types of sequences, it may produce incorrect results when DNA sequences have undergone rearrangements, as in many bacterial and viral genomes. It is also limited by its computational complexity for comparing large volumes of data. Previously, we proposed an alignment-free method that exploits the full information content of DNA sequences by Discrete Fourier Transform (DFT), but still with some limitations. Here, we present a significantly improved method for the similarity comparison of DNA sequences by DFT. In this method, we map DNA sequences into 2-dimensional (2D) numerical sequences and then apply DFT to transform the 2D numerical sequences into the frequency domain. In the 2D mapping, the nucleotide composition of a DNA sequence is a determining factor, and the 2D mapping reduces the nucleotide composition bias in the distance measure, thus improving the similarity measure of DNA sequences. To compare the DFT power spectra of DNA sequences with different lengths, we propose an improved even scaling algorithm to extend shorter DFT power spectra to the longest length of the underlying sequences. After the DFT power spectra are evenly scaled, the spectra have the same dimensionality in the Fourier frequency space, and the Euclidean distances between the full Fourier power spectra of the DNA sequences are used as the dissimilarity metric. The improved DFT method, with increased computational performance from the 2D numerical representation, is applicable to DNA sequences of any length range. We assess the accuracy of the improved DFT similarity measure in hierarchical clustering of different DNA sequences including simulated and real datasets. The method yields accurate and reliable phylogenetic trees and demonstrates that the improved DFT dissimilarity measure is an efficient and effective similarity measure of DNA sequences. Due to its high efficiency and accuracy, the proposed DFT similarity measure is successfully applied to phylogenetic analysis of individual genes and large whole bacterial genomes. Copyright © 2015 Elsevier Ltd. All rights reserved.
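The pipeline described in this abstract (2D mapping, DFT power spectrum, scaling to a common length, Euclidean distance) can be sketched roughly as follows. This is only an illustration under stated assumptions: the `MAPPING` table is a hypothetical 2D encoding, and `extend` uses plain linear interpolation as a stand-in for the authors' even scaling algorithm, neither of which is specified in the abstract.

```python
import cmath
import math

# Hypothetical 2D mapping (an assumption, not taken from the paper):
# each nucleotide becomes a point in the plane, stored as a complex number.
MAPPING = {"A": 0 - 1j, "T": 0 + 1j, "C": -1 + 0j, "G": 1 + 0j}


def power_spectrum(seq):
    """Return the DFT power spectrum |X[k]|^2 of the mapped sequence."""
    x = [MAPPING[base] for base in seq.upper()]
    n = len(x)
    spectrum = []
    for k in range(n):
        # Direct O(n^2) DFT; fine for short illustrative sequences.
        coeff = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def extend(spectrum, length):
    """Stretch a spectrum to `length` points by linear interpolation.

    A simple stand-in for the even scaling step; the real algorithm may differ.
    """
    n = len(spectrum)
    if length < n:
        raise ValueError("extend() only lengthens a spectrum")
    if length == n:
        return list(spectrum)
    out = []
    for i in range(length):
        pos = i * (n - 1) / (length - 1)  # fractional index into the original
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        out.append(spectrum[lo] * (1 - frac) + spectrum[hi] * frac)
    return out


def dft_distance(seq_a, seq_b):
    """Euclidean distance between length-matched power spectra."""
    pa, pb = power_spectrum(seq_a), power_spectrum(seq_b)
    length = max(len(pa), len(pb))
    pa, pb = extend(pa, length), extend(pb, length)
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(pa, pb)))
```

A matrix of `dft_distance` values over a set of sequences could then be passed to any hierarchical clustering routine; a production version would use an FFT rather than the direct sum shown here.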
BLAST and FASTA similarity searching for multiple sequence alignment.

PubMed

Pearson, William R

2014-01-01

BLAST, FASTA, and other similarity searching programs seek to identify homologous proteins and DNA sequences based on excess sequence similarity. If two sequences share much more similarity than expected by chance, the simplest explanation for the excess similarity is common ancestry, that is, homology. The most effective similarity searches compare protein sequences, rather than DNA sequences, for sequences that encode proteins, and use expectation values, rather than percent identity, to infer homology. The BLAST and FASTA packages of sequence comparison programs provide programs for comparing protein and DNA sequences to protein databases (the most sensitive searches). Protein and translated-DNA comparisons to protein databases routinely allow evolutionary look-back times from 1 to 2 billion years; DNA:DNA searches are 5-10-fold less sensitive. BLAST and FASTA can be run on popular web sites, but can also be downloaded and installed on local computers. With local installation, target databases can be customized for the sequence data being characterized. With today's very large protein databases, search sensitivity can also be improved by searching smaller comprehensive databases, for example, a complete protein set from an evolutionarily neighboring model organism. By default, BLAST and FASTA use scoring strategies targeted at distant evolutionary relationships; for comparisons involving short domains or queries, or searches that seek relatively close homologs (e.g. mouse-human), shallower scoring matrices will be more effective. Both BLAST and FASTA provide very accurate statistical estimates, which can be used to reliably identify protein sequences that diverged more than 2 billion years ago.
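The point about expectation values and database size can be illustrated with the standard relationship between a bit score S' and the expected number of chance hits, E = m · n · 2^(−S'), where m is the query length and n the database length. The sketch below is a simplification: the function name is illustrative, and it ignores the effective-length corrections that real search programs apply.

```python
def expect_value(bit_score, query_len, db_len):
    """Approximate E-value from a bit score: E = m * n * 2**(-S').

    query_len (m) and db_len (n) are raw residue counts here; real programs
    use corrected "effective" lengths, so treat the result as a rough guide.
    """
    return query_len * db_len * 2.0 ** (-bit_score)


# The same 50-bit alignment for a 300-residue query in a large database ...
e_large = expect_value(50, 300, 100_000_000)
# ... and in a database one hundred times smaller.
e_small = expect_value(50, 300, 1_000_000)
```

Because E scales linearly with database size, the same alignment is 100 times more significant in the smaller database, which is why restricting a search to a smaller comprehensive protein set can improve sensitivity.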
First Molecular Identification and Phylogeny of Moroccan Anopheles sergentii (Diptera: Culicidae) Based on Second Internal Transcribed Spacer (ITS2) and Cytochrome c Oxidase I (COI) Sequences.

PubMed

Benabdelkrim Filali, Oumama; Kabine, Mostafa; El Hamouchi, Adil; Lemrani, Meryem; Debboun, Mustapha; Sarih, M'hammed

2018-06-05

Anopheles sergentii, known as the "oasis vector" or the "desert malaria vector", is considered the main vector of malaria in the southern parts of Morocco. Its presence in Morocco is confirmed for the first time through sequencing of mitochondrial DNA (mtDNA) cytochrome c oxidase subunit I (COI) barcodes and nuclear ribosomal DNA (rDNA) second internal transcribed spacer (ITS2) sequences and direct comparison with specimens of A. sergentii from other countries. The DNA barcodes (n = 39) obtained from A. sergentii collected in 2015 and 2016 showed more diversity, with 10 haplotypes, compared with 3 haplotypes obtained from ITS2 sequences (n = 59). Moreover, the comparison using the ITS2 sequences showed a closer evolutionary relationship between the Moroccan and Egyptian strains than with the Iranian strain. Nevertheless, genetic differences due to geographical segregation were also observed. This study provides the first report on the sequence of rDNA-ITS2 and mtDNA COI, which could be used to better understand the biodiversity of A. sergentii.
Nucleotide sequence analysis establishes the role of endogenous murine leukemia virus DNA segments in formation of recombinant mink cell focus-forming murine leukemia viruses.

PubMed Central

Khan, A S

1984-01-01

The sequence of 363 nucleotides near the 3' end of the pol gene and 564 nucleotides from the 5' terminus of the env gene in an endogenous murine leukemia viral (MuLV) DNA segment, cloned from AKR/J mouse DNA and designated as A-12, was obtained. For comparison, the nucleotide sequence in an analogous portion of AKR mink cell focus-forming (MCF) 247 MuLV provirus was also determined. Sequence features unique to MCF247 MuLV DNA in the 3' pol and 5' env regions were identified by comparison with nucleotide sequences in analogous regions of NFS-Th-1 xenotropic and AKR ecotropic MuLV proviruses. These included (i) an insertion of 12 base pairs encoding four amino acids located 60 base pairs from the 3' terminus of the pol gene and immediately preceding the env gene, (ii) the deletion of 12 base pairs (encoding four amino acids) and the insertion of 3 base pairs (encoding one amino acid) in the 5' portion of the env gene, and (iii) single base substitutions resulting in 2 MCF247-specific amino acids in the 3' pol and 23 in the 5' env regions. Nucleotide sequence comparison involving the 3' pol and 5' env regions of AKR MCF247, NFS xenotropic, and AKR ecotropic MuLV proviruses with the cloned endogenous MuLV DNA indicated that MCF247 proviral DNA sequences were conserved in the cloned endogenous MuLV proviral segment. In fact, total nucleotide sequence identity existed between the endogenous MuLV DNA and the MCF247 MuLV provirus in the 3' portion of the pol gene. In the 5' env region, only 4 of 564 nucleotides were different, resulting in three amino acid changes between AKR MCF247 MuLV DNA and the endogenous MuLV DNA present in clone A-12. In addition, nucleotide sequence comparison indicated that Moloney- and Friend-MCF MuLVs were also highly related in the 3' pol and 5' env regions to the cloned endogenous MuLV DNA. These results establish the role of endogenous MuLV DNA segments in generation of recombinant MCF viruses. PMID:6328017
Comparison of sequencing-based methods to profile DNA methylation and identification of monoallelic epigenetic modifications

USDA-ARS's Scientific Manuscript database

Analysis of DNA methylation patterns relies increasingly on sequencing-based profiling methods. The four most frequently used sequencing-based technologies are the bisulfite-based methods MethylC-seq and reduced representation bisulfite sequencing (RRBS), and the enrichment-based techniques methylat...
Gene sequence analyses and other DNA-based methods for yeast species recognition

USDA-ARS's Scientific Manuscript database

DNA sequence analyses, as well as other DNA-based methodologies, have transformed the way in which yeasts are identified. The focus of this chapter will be on the resolution of species using various types of DNA comparisons. In other chapters in this book, Rozpedowska, Piškur and Wolfe discuss mul...
Intervening sequences in a plant gene-comparison of the partial sequence of cDNA and genomic DNA of French bean phaseolin

NASA Astrophysics Data System (ADS)

Sun, S. M.; Slightom, J. L.; Hall, T. C.

1981-01-01

A plant gene coding for the major storage protein (phaseolin, G1-globulin) of the French bean was isolated from a genomic library constructed in the phage vector Charon 24A. Comparison of the nucleotide sequence of part of the gene with that of the cloned messenger RNA (cDNA) revealed the presence of three intervening sequences, all beginning with GT and ending with AG. The 5' and 3' boundaries of intervening sequences IVS-A (88 base pairs) and IVS-B (124 base pairs) are similar to those described for animal and viral genes, but the 3' boundary of IVS-C (129 base pairs) shows some differences. A sequence of 185 amino acids deduced from the cloned DNAs represents about 40% of a phaseolin polypeptide.
Nucleotide sequence of the gene encoding the nitrogenase iron protein of Thiobacillus ferrooxidans

DOE Office of Scientific and Technical Information (OSTI.GOV)

Pretorius, I.M.; Rawlings, D.E.; O'Neill, E.G.

1987-01-01

The DNA sequence was determined for the cloned Thiobacillus ferrooxidans nifH and part of the nifD genes. The DNA chains were radiolabeled with [α-³²P]dCTP (3000 Ci/mmol) or [α-³⁵S]dCTP (400 Ci/mmol). A putative T. ferrooxidans nifH promoter was identified whose sequences showed perfect consensus with those of the Klebsiella pneumoniae nif promoter. Two putative consensus upstream activator sequences were also identified. The amino acid sequence was deduced from the DNA sequence. In a comparison of nifH DNA sequences from T. ferrooxidans and eight other nitrogen-fixing microbes, a Rhizobium sp. isolated from Parasponia andersonii showed the greatest homology (74%) and Clostridium pasteurianum (nifH1) showed the least homology (54%). In the comparison of the amino acid sequences of the Fe proteins, the Rhizobium sp. and Rhizobium japonicum showed the greatest homology (both 86%) and C. pasteurianum (nifH1 gene product) demonstrated the least homology (56%) to the T. ferrooxidans Fe protein.
FastID: Extremely Fast Forensic DNA Comparisons

DTIC Science & Technology

2017-05-19

FastID: Extremely Fast Forensic DNA Comparisons Darrell O. Ricke, PhD Bioengineering Systems & Technologies Massachusetts Institute of...Technology Lincoln Laboratory Lexington, MA USA Darrell.Ricke@ll.mit.edu Abstract—Rapid analysis of DNA forensic samples can have a critical impact on...time sensitive investigations. Analysis of forensic DNA samples by massively parallel sequencing is creating the next gold standard for DNA
Long-range correlations and charge transport properties of DNA sequences

NASA Astrophysics Data System (ADS)

Liu, Xiao-liang; Ren, Yi; Xie, Qiong-tao; Deng, Chao-sheng; Xu, Hui

2010-04-01

By using Hurst's analysis and transfer approach, the rescaled range functions and Hurst exponents of human chromosome 22 and enterobacteria phage lambda DNA sequences are investigated and the transmission coefficients, Landauer resistances and Lyapunov coefficients of finite segments based on above genomic DNA sequences are calculated. In a comparison with quasiperiodic and random artificial DNA sequences, we find that λ-DNA exhibits anticorrelation behavior characterized by a Hurst exponent 0.5
Identification of food and beverage spoilage yeasts from DNA sequence analyses

USDA-ARS's Scientific Manuscript database

Detection, identification, and classification of yeasts has undergone a major transformation in the last decade and a half following application of gene sequence analyses and genome comparisons. Development of a database (barcode) of easily determined DNA sequences from domains 1 and 2 (D1/D2) of th...
Homogeneity of the 16S rDNA sequence among geographically disparate isolates of Taylorella equigenitalis

PubMed Central

Matsuda, M; Tazumi, A; Kagawa, S; Sekizuka, T; Murayama, O; Moore, JE; Millar, BC

2006-01-01

Background At present, six accessible sequences of 16S rDNA from Taylorella equigenitalis (T. equigenitalis) are available, whose sequence differences occur at a few nucleotide positions. Thus it is important to determine these sequences from additional strains in other countries, if possible, in order to clarify any anomalies regarding 16S rDNA sequence heterogeneity. Here, we clone and sequence the approximate full-length 16S rDNA from additional strains of T. equigenitalis isolated in Japan, Australia and France and compare these sequences to the existing published sequences. Results Clarification of any anomalies regarding 16S rDNA sequence heterogeneity of T. equigenitalis was carried out. Cloning, sequencing and comparison of the approximate full-length 16S rDNA from 17 strains of T. equigenitalis isolated in Japan, Australia and France demonstrated nucleotide sequence differences at six loci in the 1,469-nucleotide sequence. Moreover, 12 polymorphic sites occurred among 23 sequences of the 16S rDNA, including the six reference sequences. Conclusion High sequence similarity (99.5% or more) was observed throughout, except from nucleotide positions 138 to 501 where substitutions and deletions were noted. PMID:16398935
Genomics dataset of unidentified disclosed isolates.

PubMed

Rekadwad, Bhagwan N

2016-09-01

Analysis of DNA sequences is necessary for higher hierarchical classification of organisms. It gives clues about the characteristics of organisms and their taxonomic position. This dataset was chosen to find complexities in the unidentified DNA in disclosed patents. A total of 17 unidentified DNA sequences were thoroughly analyzed. Quick response (QR) codes were generated, and the AT/GC content of the DNA sequences was analyzed. The QR codes are helpful for quick identification of isolates. AT/GC content is helpful for studying their stability at different temperatures. Additionally, a dataset on cleavage codes and enzyme codes from a restriction digestion study, which is helpful for performing studies using short DNA sequences, is reported. The dataset disclosed here is new data for exploration of unique DNA sequences for evaluation, identification, comparison and analysis.
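The AT/GC content analysis mentioned in this record is simple to compute. The sketch below is an illustration only: the function name and the decision to skip ambiguous symbols such as N are assumptions, not details from the dataset.

```python
def gc_content(seq):
    """Return the GC fraction of a DNA sequence (AT fraction is 1 - GC).

    Only unambiguous bases (A, C, G, T) are counted; other symbols are skipped.
    """
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A/C/G/T bases")
    gc = sum(1 for b in bases if b in "GC")
    return gc / len(bases)


gc = gc_content("ATGCGCNNTA")
at = 1.0 - gc
```

A higher GC fraction generally corresponds to a higher duplex melting temperature, which is the connection to thermal stability noted in the abstract.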
The number of reduced alignments between two DNA sequences

PubMed Central

2014-01-01

Background In this study we consider DNA sequences as mathematical strings. Total and reduced alignments between two DNA sequences have been considered in the literature to measure their similarity. Results for explicit representations of some alignments have been already obtained. Results We present exact, explicit and computable formulas for the number of different possible alignments between two DNA sequences and a new formula for a class of reduced alignments. Conclusions A unified approach for a wide class of alignments between two DNA sequences has been provided. The formula is computable and, if complemented by software development, will provide a deeper insight into the theory of sequence alignment and give rise to new comparison methods. AMS Subject Classification Primary 92B05, 33C20, secondary 39A14, 65Q30 PMID:24684679
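As a point of reference for this record, the classical count of total alignments between sequences of lengths m and n satisfies the recurrence f(m, n) = f(m−1, n) + f(m, n−1) + f(m−1, n−1), which gives the Delannoy numbers. The sketch below assumes that every column pairs two letters or one letter with a gap, and that differently ordered gap columns count as distinct alignments. It does not reproduce the paper's explicit formulas or its class of reduced alignments.

```python
def count_alignments(m, n):
    """Count total alignments of two sequences of lengths m and n.

    Each alignment ends in one of three column types (letter/letter,
    letter/gap, gap/letter), which yields the Delannoy recurrence.
    """
    # table[i][j] holds the count for prefixes of lengths i and j.
    table = [[1] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            table[i][j] = table[i - 1][j] + table[i][j - 1] + table[i - 1][j - 1]
    return table[m][n]
```

The count grows very quickly with sequence length, which is one reason closed-form and reduced counts are of interest.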
Comparison of base composition analysis and Sanger sequencing of mitochondrial DNA for four U.S. population groups.

PubMed

Kiesler, Kevin M; Coble, Michael D; Hall, Thomas A; Vallone, Peter M

2014-01-01

A set of 711 samples from four U.S. population groups was analyzed using a novel mass spectrometry based method for mitochondrial DNA (mtDNA) base composition profiling. Comparison of the mass spectrometry results with Sanger sequencing derived data yielded a concordance rate of 99.97%. Length heteroplasmy was identified in 46% of samples and point heteroplasmy was observed in 6.6% of samples in the combined mass spectral and Sanger data set. Using discrimination capacity as a metric, Sanger sequencing of the full control region had the highest discriminatory power, followed by the mass spectrometry base composition method, which was more discriminating than Sanger sequencing of just the hypervariable regions. This trend is in agreement with the number of nucleotides covered by each of the three assays. Published by Elsevier Ireland Ltd.
A comparison of RNA with DNA in template-directed synthesis

NASA Technical Reports Server (NTRS)

Zielinski, M.; Kozlov, I. A.; Orgel, L. E.; Bada, J. L. (Principal Investigator)

2000-01-01

Nonenzymatic template-directed copying of RNA sequences rich in cytidylic acid using nucleoside 5'-(2-methylimidazol-1-yl phosphates) as substrates is substantially more efficient than the copying of corresponding DNA sequences. However, many sequences cannot be copied, and the prospect of replication in this system is remote, even for RNA. Surprisingly, wobble-pairing leads to much more efficient incorporation of G opposite U on RNA templates than of G opposite T on DNA templates.
Comparison of variable region 3 sequences of human immunodeficiency virus type 1 from infected children with the RNA and DNA sequences of the virus populations of their mothers.

PubMed Central

Scarlatti, G; Leitner, T; Halapi, E; Wahlberg, J; Marchisio, P; Clerici-Schoeller, M A; Wigzell, H; Fenyö, E M; Albert, J; Uhlén, M

1993-01-01

We have compared the variable region 3 sequences from 10 human immunodeficiency virus type 1 (HIV-1)-infected infants to virus sequences from the corresponding mothers. The sequences were derived from DNA of uncultured peripheral blood mononuclear cells (PBMC), DNA of cultured PBMC, and RNA from serum collected at or shortly after delivery. The infected infants, in contrast to the mothers, harbored homogeneous virus populations. Comparison of sequences from the children and clones derived from DNA of the corresponding mothers showed that the transmitted virus represented either a minor or a major virus population of the mother. In contrast to an earlier study, we found no evidence of selection of minor virus variants during transmission. Furthermore, the transmitted virus variant did not show any characteristic molecular features. In some cases the transmitted virus was more related to the virus RNA population of the mother and in other cases it was more related to the virus DNA population. This suggests that either cell-free or cell-associated virus may be transmitted. These data will help AIDS researchers to understand the mechanism of transmission and to plan strategies for prevention of transmission. PMID:8446584
USE OF COMPETITIVE DNA HYBRIDIZATION TO IDENTIFY DIFFERENCES IN THE GENOMES OF TWO CLOSELY RELATED FECAL INDICATOR BACTERIA

EPA Science Inventory

Although recent technological advances in DNA sequencing and computational biology now allow scientists to compare entire microbial genomes, comparisons of closely related bacterial species and individual isolates by whole-genome sequencing approaches remains prohibitively expens...
Complete Genome Sequence of ER2796, a DNA Methyltransferase-Deficient Strain of Escherichia coli K-12

PubMed Central

Anton, Brian P.; Mongodin, Emmanuel F.; Agrawal, Sonia; Fomenkov, Alexey; Byrd, Devon R.; Roberts, Richard J.; Raleigh, Elisabeth A.

2015-01-01

We report the complete sequence of ER2796, a laboratory strain of Escherichia coli K-12 that is completely defective in DNA methylation. Because of its lack of any native methylation, it is extremely useful as a host into which heterologous DNA methyltransferase genes can be cloned and the recognition sequences of their products deduced by Pacific Biosciences Single-Molecule Real Time (SMRT) sequencing. The genome was itself sequenced from a long-insert library using the SMRT platform, resulting in a single closed contig devoid of methylated bases. Comparison with K-12 MG1655, the first E. coli K-12 strain to be sequenced, shows an essentially co-linear relationship with no major rearrangements despite many generations of laboratory manipulation. The comparison revealed a total of 41 insertions and deletions, and 228 single base pair substitutions. In addition, the long-read approach facilitated the surprising discovery of four gene conversion events, three involving rRNA operons and one between two cryptic prophages. Such events thus contribute both to genomic homogenization and to bacteriophage diversification. As one of relatively few laboratory strains of E. coli to be sequenced, the genome also reveals the sequence changes underlying a number of classical mutant alleles including those affecting the various native DNA methylation systems. PMID:26010885

Comparison of Methods of Detection of Exceptional Sequences in Prokaryotic Genomes.

PubMed

Rusinov, I S; Ershova, A S; Karyagina, A S; Spirin, S A; Alexeevski, A V

2018-02-01

Many proteins need recognition of specific DNA sequences for functioning. The number of recognition sites and their distribution along the DNA might be of biological importance. For example, the number of restriction sites is often reduced in prokaryotic and phage genomes to decrease the probability of DNA cleavage by restriction endonucleases. We call a sequence an exceptional one if its frequency in a genome significantly differs from one predicted by some mathematical model. An exceptional sequence could be either under- or over-represented, depending on its frequency in comparison with the predicted one. Exceptional sequences could be considered biologically meaningful, for example, as targets of DNA-binding proteins or as parts of abundant repetitive elements. Several methods are used to predict the frequency of a short sequence in a genome based on the actual frequencies of certain of its subsequences. The most popular are methods based on Markov chain models. However, no rigorous comparison of the methods has previously been performed. We compared three methods for the prediction of short sequence frequencies: the maximum-order Markov chain model-based method, the method that uses geometric mean of extended Markovian estimates, and the method that utilizes frequencies of all subsequences including discontiguous ones. We applied them to restriction sites in complete genomes of 2500 prokaryotic species and demonstrated that the results depend greatly on the method used: lists of 5% of the most under-represented sites differed by up to 50%. The method designed by Burge and coauthors in 1992, which utilizes all subsequences of the sequence, showed a higher precision than the other two methods both on prokaryotic genomes and randomly generated sequences after computational imitation of selective pressure. We propose this method as the first choice for detection of exceptional sequences in prokaryotic genomes.
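Of the three methods compared in this abstract, the maximum-order Markov estimate is the easiest to illustrate. For a word w = w1…wk, the expected count is N(w1…w(k−1)) · N(w2…wk) / N(w2…w(k−1)), where N is the observed count in the genome. The sketch below is a minimal single-strand version with illustrative function names; it is not the authors' implementation and does not cover the other two methods.

```python
def count_occurrences(genome, word):
    """Count occurrences of `word` in `genome`, allowing overlaps."""
    last_start = len(genome) - len(word)
    return sum(1 for i in range(last_start + 1) if genome.startswith(word, i))


def markov_expected(genome, word):
    """Maximum-order Markov estimate of the expected count of `word`."""
    if len(word) < 3:
        raise ValueError("word must have at least 3 letters")
    prefix = count_occurrences(genome, word[:-1])   # w1..w(k-1)
    suffix = count_occurrences(genome, word[1:])    # w2..wk
    core = count_occurrences(genome, word[1:-1])    # w2..w(k-1)
    if core == 0:
        return 0.0
    return prefix * suffix / core


def representation_ratio(genome, word):
    """Observed/expected ratio; values well below 1 suggest under-representation."""
    expected = markov_expected(genome, word)
    if expected == 0:
        return float("nan")
    return count_occurrences(genome, word) / expected
```

For a restriction site such as GAATTC, a ratio well below 1 would flag the site as a candidate under-represented sequence, subject to a proper significance test and to counting both strands.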
Sequencing of cDNA Clones from the Genetic Map of Tomato (Lycopersicon esculentum)

PubMed Central

Ganal, Martin W.; Czihal, Rosemarie; Hannappel, Ulrich; Kloos, Dorothee-U.; Polley, Andreas; Ling, Hong-Qing

1998-01-01

The dense RFLP linkage map of tomato (Lycopersicon esculentum) contains >300 anonymous cDNA clones. Of those clones, 272 were partially or completely sequenced. The sequences were compared at the DNA and protein level to known genes in databases. For 57% of the clones, a significant match to previously described genes was found. The information will permit the conversion of those markers to STS markers and allow their use in PCR-based mapping experiments. Furthermore, it will facilitate the comparative mapping of genes across distantly related plant species by direct comparison of DNA sequences and map positions. [cDNA sequence data reported in this paper have been submitted to the EMBL database under accession nos. AA824695–AA825005 and the dbEST_Id database under accession nos. 1546519–1546862.] PMID:9724330
cgDNAweb: a web interface to the cgDNA sequence-dependent coarse-grain model of double-stranded DNA.

PubMed

De Bruin, Lennart; Maddocks, John H

2018-06-14

The sequence-dependent statistical mechanical properties of fragments of double-stranded DNA are believed to be pertinent to its biological function at length scales from a few base pairs (or bp) to a few hundreds of bp, e.g. indirect read-out protein binding sites, nucleosome positioning sequences, phased A-tracts, etc. In turn, the equilibrium statistical mechanics behaviour of DNA depends upon its ground state configuration, or minimum free energy shape, as well as on its fluctuations as governed by its stiffness (in an appropriate sense). We here present cgDNAweb, which provides browser-based interactive visualization of the sequence-dependent ground states of double-stranded DNA molecules, as predicted by the underlying cgDNA coarse-grain rigid-base model of fragments with arbitrary sequence. The cgDNAweb interface is specifically designed to facilitate comparison between ground state shapes of different sequences. The server is freely available at cgDNAweb.epfl.ch with no login requirement.
cDNA cloning of the human peroxisomal enoyl-CoA hydratase: 3-Hydroxyacyl-CoA dehydrogenase bifunctional enzyme and localization to chromosome 3q26.3-3q28: A free left Alu arm is inserted in the 3′ noncoding region

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hoefler, G.; Forstner, M.; Hulla, W.

1994-01-01

Enoyl-CoA hydratase:3-hydroxyacyl-CoA dehydrogenase bifunctional enzyme is one of the four enzymes of the peroxisomal β-oxidation pathway. Here, the authors report the full-length human cDNA sequence and the localization of the corresponding gene on chromosome 3q26.3-3q28. The cDNA sequence spans 3779 nucleotides with an open reading frame of 2169 nucleotides. The tripeptide SKL at the carboxy terminus, known to serve as a peroxisomal targeting signal, is present. DNA sequence comparison of the coding region showed an 80% homology between human and rat bifunctional enzyme cDNA. The 3′ noncoding sequence contains 117 nucleotides homologous to an Alu repeat. Based on sequence comparison, they propose that these nucleotides are a free left Alu arm with 86% homology to the Alu-J family. RNA analysis shows one band with highest intensity in liver and kidney. This cDNA will allow in-depth studies of molecular defects in patients with defective peroxisomal bifunctional enzyme. Moreover, it will also provide a means for studying the regulation of peroxisomal β-oxidation in humans. 33 refs., 5 figs.
Genomic sequencing of Pleistocene cave bears

DOE Office of Scientific and Technical Information (OSTI.GOV)

Noonan, James P.; Hofreiter, Michael; Smith, Doug

2005-04-01

Despite the information content of genomic DNA, ancient DNA studies to date have largely been limited to amplification of mitochondrial DNA due to technical hurdles such as contamination and degradation of ancient DNAs. In this study, we describe two metagenomic libraries constructed using unamplified DNA extracted from the bones of two 40,000-year-old extinct cave bears. Analysis of ~1 Mb of sequence from each library showed that, despite significant microbial contamination, 5.8 percent and 1.1 percent of clones in the libraries contain cave bear inserts, yielding 26,861 bp of cave bear genome sequence. Alignment of this sequence to the dog genome, the closest sequenced genome to cave bear in terms of evolutionary distance, revealed roughly the expected ratio of cave bear exons, repeats and conserved noncoding sequences. Only 0.04 percent of all clones sequenced were derived from contamination with modern human DNA. Comparison of cave bear with orthologous sequences from several modern bear species revealed the evolutionary relationship of these lineages. Using the metagenomic approach described here, we have recovered substantial quantities of mammalian genomic sequence more than twice as old as any previously reported, establishing the feasibility of ancient DNA genomic sequencing programs.
Blastocystis phylogeny among various isolates from humans to insects.

PubMed

Yoshikawa, Hisao; Koyama, Yukiko; Tsuchiya, Erika; Takami, Kazutoshi

2016-12-01

Blastocystis is a common unicellular eukaryotic parasite found not only in humans, but also in various kinds of animal species worldwide. Since Blastocystis isolates are morphologically indistinguishable, many molecular biological approaches have been applied to classify these isolates. The complete or partial sequences of the small subunit rRNA gene (SSU rDNA) are mainly used for comparisons and phylogenetic analyses among Blastocystis isolates. However, various lengths of the partial SSU rDNA sequence have been used for phylogenetic inference among genetically different isolates. Based on the complete SSU rDNA sequences, consensus terminology of nine subtypes (STs) of Blastocystis sp., supported by nine phylogenetically monophyletic clades, was proposed in 2007. Thereafter, eight additional kinds of STs comprising non-human mammalian Blastocystis isolates have been reported based on the phylogeny of SSU rDNA sequences, while STs 11 and 12 were only proposed on the basis of partial sequences. Although many sequence data from mammalian and avian Blastocystis are registered in GenBank, only limited data on SSU rDNA are available for poikilotherm-derived Blastocystis isolates. Therefore, the phylogenetic positions of the reptilian/amphibian Blastocystis clades are unstable. The phylogenetic inference of various STs comprising mammalian and/or avian Blastocystis isolates was verified herein based on comparisons between partial and complete SSU rDNA sequences, and the phylogenetic positions of reptilian and amphibian Blastocystis isolates were also investigated using 14 new Blastocystis isolates from reptiles with all known isolates from other reptilians, amphibians, and insects registered in GenBank. Copyright © 2016. Published by Elsevier Ireland Ltd.
Comparison and quantitative verification of mapping algorithms for whole genome bisulfite sequencing

USDA-ARS's Scientific Manuscript database

Coupling bisulfite conversion with next-generation sequencing (Bisulfite-seq) enables genome-wide measurement of DNA methylation, but poses unique challenges for mapping. However, despite a proliferation of Bisulfite-seq mapping tools, no systematic comparison of their genomic coverage and quantitat...
Application of discrete Fourier inter-coefficient difference for assessing genetic sequence similarity.

PubMed

King, Brian R; Aburdene, Maurice; Thompson, Alex; Warres, Zach

2014-01-01

Digital signal processing (DSP) techniques for biological sequence analysis continue to grow in popularity due to the inherent digital nature of these sequences. DSP methods have demonstrated early success for detection of coding regions in a gene. Recently, these methods are being used to establish DNA gene similarity. We present the inter-coefficient difference (ICD) transformation, a novel extension of the discrete Fourier transformation, which can be applied to any DNA sequence. The ICD method is a mathematical, alignment-free DNA comparison method that generates a genetic signature for any DNA sequence that is used to generate relative measures of similarity among DNA sequences. We demonstrate our method on a set of insulin genes obtained from an evolutionarily wide range of species, and on a set of avian influenza viral sequences, which represents a set of highly similar sequences. We compare phylogenetic trees generated using our technique against trees generated using traditional alignment techniques for similarity and demonstrate that the ICD method produces a highly accurate tree without requiring an alignment prior to establishing sequence similarity.
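The general idea behind DFT-based, alignment-free comparison can be sketched in a few lines of Python. This is not the ICD transformation itself, whose details the abstract does not give; it is a minimal illustration that assumes the common four-indicator (Voss-style) numerical mapping and compares two equal-length sequences by the Euclidean distance between their power spectra. The function names `power_spectrum` and `spectral_distance` are hypothetical.

```python
import cmath


def power_spectrum(seq):
    """Sum of DFT power spectra of the four base-indicator sequences (assumed mapping)."""
    seq = seq.upper()
    n = len(seq)
    spectrum = [0.0] * n
    for base in "ACGT":
        # Indicator sequence: 1 where this base occurs, 0 elsewhere.
        indicator = [1.0 if s == base else 0.0 for s in seq]
        for k in range(n):
            # Direct O(n^2) DFT; adequate for a short illustrative example.
            coeff = sum(
                x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(indicator)
            )
            spectrum[k] += abs(coeff) ** 2
    return spectrum


def spectral_distance(a, b):
    """Euclidean distance between power spectra of two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("this sketch only handles sequences of equal length")
    pa, pb = power_spectrum(a), power_spectrum(b)
    # Skip k = 0, which only reflects base composition.
    return sum((x - y) ** 2 for x, y in zip(pa[1:], pb[1:])) ** 0.5
```

Because the power spectrum discards phase, a circularly shifted copy of a sequence (for example `"ACGTAC"` versus `"CGTACA"`) yields a distance of essentially zero, which is one reason spectral signatures can be compared without a prior alignment.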
Comparison of Flow Injection MS, NMR, and DNA Sequencing: Methods for Identification and Authentication of Black Cohosh (Actaea racemosa)

USDA-ARS's Scientific Manuscript database

Flow injection mass spectrometry (FIMS) and proton nuclear magnetic resonance spectrometry (1H-NMR), two metabolic fingerprinting methods, and DNA sequencing were used to identify and authenticate Actaea species. Initially, samples of Actaea racemosa L. from a single source were distinguished from ...
Comparison of internal transcribed spacers and intergenic spacer regions of five common Iranian sheep bursate nematodes.

PubMed

Nabavi, Reza; Conneely, Brendan; McCarthy, Elaine; Good, Barbara; Shayan, Parviz; DE Waal, Theo

2014-09-01

Accurate identification of sheep nematodes is a critical point in epidemiological studies and monitoring of drug resistance in flocks. However, due to the close morphological similarity between the eggs and larval stages of many of these nematodes, such identification is not a trivial task. A number of studies show that molecular targets in ribosomal DNA (internal transcribed spacers 1 and 2 and the intergenic spacer) are suitable for accurate identification of sheep bursate nematodes. The objective of the present study was to compare the ITS1, ITS2 and IGS regions of common Iranian bursate nematodes in order to choose the best target for specific identification methods. The first and second internal transcribed spacers (ITS1 and ITS2) and intergenic spacer (IGS) of the ribosomal DNA (rDNA) of 5 common Iranian bursate nematodes of sheep were sequenced. The sequences of some non-Iranian isolates were used for comparison in order to evaluate the variation in sequence homology between geographically different nematode populations. Comparison of the ITS1 and ITS2 sequences of Iranian nematodes showed the greatest similarity between Teladorsagia circumcincta and Marshallagia marshalli, at 94% and 88%, respectively, while Trichostrongylus colubriformis and M. marshalli showed the highest homology (99%) in the IGS sequences. Comparison of the spacer sequences of Iranian with non-Iranian isolates showed significantly higher variation in Haemonchus contortus compared to the other species. Both the ITS1 and ITS2 sequences are convenient targets for species-specific identification of Iranian bursate nematodes. On the other hand, the IGS region may be a less suitable molecular target.
Genome-wide comparison of medieval and modern Mycobacterium leprae.

PubMed

Schuenemann, Verena J; Singh, Pushpendra; Mendum, Thomas A; Krause-Kyora, Ben; Jäger, Günter; Bos, Kirsten I; Herbig, Alexander; Economou, Christos; Benjak, Andrej; Busso, Philippe; Nebel, Almut; Boldsen, Jesper L; Kjellström, Anna; Wu, Huihai; Stewart, Graham R; Taylor, G Michael; Bauer, Peter; Lee, Oona Y-C; Wu, Houdini H T; Minnikin, David E; Besra, Gurdyal S; Tucker, Katie; Roffey, Simon; Sow, Samba O; Cole, Stewart T; Nieselt, Kay; Krause, Johannes

2013-07-12

Leprosy was endemic in Europe until the Middle Ages. Using DNA array capture, we have obtained genome sequences of Mycobacterium leprae from skeletons of five medieval leprosy cases from the United Kingdom, Sweden, and Denmark. In one case, the DNA was so well preserved that full de novo assembly of the ancient bacterial genome could be achieved through shotgun sequencing alone. The ancient M. leprae sequences were compared with those of 11 modern strains, representing diverse genotypes and geographic origins. The comparisons revealed remarkable genomic conservation during the past 1000 years, a European origin for leprosy in the Americas, and the presence of an M. leprae genotype in medieval Europe now commonly associated with the Middle East. The exceptional preservation of M. leprae biomarkers, both DNA and mycolic acids, in ancient skeletons has major implications for palaeomicrobiology and human pathogen evolution.
Multi-modulus algorithm based on global artificial fish swarm intelligent optimization of DNA encoding sequences.

PubMed

Guo, Y C; Wang, H; Wu, H P; Zhang, M Q

2015-12-21

To address the large mean square error (MSE) and slow convergence speed of the constant modulus algorithm (CMA) in equalizing multi-modulus signals, a multi-modulus algorithm (MMA) based on global artificial fish swarm (GAFS) intelligent optimization of DNA encoding sequences (GAFS-DNA-MMA) was proposed. To improve the convergence rate and reduce the MSE, the proposed algorithm adopted an encoding method based on DNA nucleotide chains to represent candidate solutions. Furthermore, the GAFS algorithm, with its fast convergence and global search ability, was used to find the best sequence. The real and imaginary parts of the initial optimal weight vector of MMA were obtained through DNA coding of the best sequence. The simulation results show that the proposed algorithm has a faster convergence speed and smaller MSE in comparison with the CMA, the MMA, and the AFS-DNA-MMA.
Application and comparison of large-scale solution-based DNA capture-enrichment methods on ancient DNA

PubMed Central

Ávila-Arcos, María C.; Cappellini, Enrico; Romero-Navarro, J. Alberto; Wales, Nathan; Moreno-Mayar, J. Víctor; Rasmussen, Morten; Fordyce, Sarah L.; Montiel, Rafael; Vielle-Calzada, Jean-Philippe; Willerslev, Eske; Gilbert, M. Thomas P.

2011-01-01

The development of second-generation sequencing technologies has greatly benefitted the field of ancient DNA (aDNA). Its application can be further exploited by the use of targeted capture-enrichment methods to overcome restrictions posed by low endogenous and contaminating DNA in ancient samples. We tested the performance of Agilent's SureSelect and Mycroarray's MySelect in-solution capture systems on Illumina sequencing libraries built from ancient maize to identify key factors influencing aDNA capture experiments. High levels of clonality as well as the presence of multiple-copy sequences in the capture targets led to biases in the data regardless of the capture method. Neither method consistently outperformed the other in terms of average target enrichment, and no obvious difference was observed either when two tiling designs were compared. In addition to demonstrating the plausibility of capturing aDNA from ancient plant material, our results also enable us to provide useful recommendations for those planning targeted-sequencing on aDNA. PMID:22355593
Cloning and sequence analysis of Hemonchus contortus HC58cDNA.

PubMed

Muleke, Charles I; Ruofeng, Yan; Lixin, Xu; Xinwen, Bo; Xiangrui, Li

2007-06-01

The complete coding sequence of Hemonchus contortus HC58cDNA was generated by rapid amplification of cDNA ends and polymerase chain reaction using primers based on the 5' and 3' ends of the parasite mRNA, accession no. AF305964. The HC58cDNA gene was 851 bp long, with open reading frame of 717 bp, precursors to 239 amino acids coding for approximately 27 kDa protein. Analysis of amino acid sequence revealed conserved residues of cysteine, histidine, asparagine, occluding loop pattern, hemoglobinase motif and glutamine of the oxyanion hole characteristic of cathepsin B like proteases (CBL). Comparison of the predicted amino acid sequences showed the protein shared 33.5-58.7% identity to cathepsin B homologues in the papain clan CA family (family C1). Phylogenetic analysis revealed close evolutionary proximity of the protein sequence to counterpart sequences in the CBL, suggesting that HC58cDNA was a member of the papain family.
A Linked Series of Laboratory Exercises in Molecular Biology Utilizing Bioinformatics and GFP

ERIC Educational Resources Information Center

Medin, Carey L.; Nolin, Katie L.

2011-01-01

Molecular biologists commonly use bioinformatics to map and analyze DNA and protein sequences and to align different DNA and protein sequences for comparison. Additionally, biologists can create and view 3D models of protein structures to further understand intramolecular interactions. The primary goal of this 10-week laboratory was to introduce…
Comparison of complete mitochondrial DNA sequences between old and new world strains of the cowpea aphid, Aphis craccivora (Hemiptera: Aphididae)

USDA-ARS's Scientific Manuscript database

Mitochondrial DNA provides useful tools for inferring population genetic structure within a species and phylogenetic relationships between species. The complete mitogenome sequences were assembled from strains of the cowpea aphids, Aphis craccivora, from the old (15,308 bp) and new world (15,305 bp...
A 28,000 Years Old Cro-Magnon mtDNA Sequence Differs from All Potentially Contaminating Modern Sequences

PubMed Central

Caramelli, David; Milani, Lucio; Vai, Stefania; Modi, Alessandra; Pecchioli, Elena; Girardi, Matteo; Pilli, Elena; Lari, Martina; Lippi, Barbara; Ronchitelli, Annamaria; Mallegni, Francesco; Casoli, Antonella; Bertorelle, Giorgio; Barbujani, Guido

2008-01-01

Background: DNA sequences from ancient specimens may in fact result from undetected contamination of the ancient specimens by modern DNA, and the problem is particularly challenging in studies of human fossils. Doubts on the authenticity of the available sequences have so far hampered genetic comparisons between anatomically archaic (Neandertal) and early modern (Cro-Magnoid) Europeans. Methodology/Principal Findings: We typed the mitochondrial DNA (mtDNA) hypervariable region I in a 28,000-year-old Cro-Magnoid individual from the Paglicci cave, in Italy (Paglicci 23) and in all the people who had contact with the sample since its discovery in 2003. The Paglicci 23 sequence, determined through the analysis of 152 clones, is the Cambridge reference sequence, and cannot possibly reflect contamination because it differs from all potentially contaminating modern sequences. Conclusions/Significance: The Paglicci 23 individual carried a mtDNA sequence that is still common in Europe, and which radically differs from those of the almost contemporary Neandertals, demonstrating a genealogical continuity across 28,000 years, from Cro-Magnoid to modern Europeans. Because all potential sources of modern DNA contamination are known, the Paglicci 23 sample will offer a unique opportunity to get insight for the first time into the nuclear genes of early modern Europeans. PMID:18628960
Comparison of dkgB-linked intergenic sequence ribotyping to DNA microarray hybridization for assigning serotype to Salmonella enterica

PubMed Central

Guard, Jean; Sanchez-Ingunza, Roxana; Morales, Cesar; Stewart, Tod; Liljebjelke, Karen; Kessel, JoAnn; Ingram, Kim; Jones, Deana; Jackson, Charlene; Fedorka-Cray, Paula; Frye, Jonathan; Gast, Richard; Hinton, Arthur

2012-01-01

Two DNA-based methods were compared for the ability to assign serotype to 139 isolates of Salmonella enterica ssp. I. Intergenic sequence ribotyping (ISR) evaluated single nucleotide polymorphisms occurring in a 5S ribosomal gene region and flanking sequences bordering the gene dkgB. A DNA microarray hybridization method that assessed the presence and the absence of sets of genes was the second method. Serotype was assigned for 128 (92.1%) of submissions by the two DNA methods. ISR detected mixtures of serotypes within single colonies and it cost substantially less than Kauffmann–White serotyping and DNA microarray hybridization. Decreasing the cost of serotyping S. enterica while maintaining reliability may encourage routine testing and research. PMID:22998607
Complementary DNA sequences encoding the multimammate rat MHC class II DQ alpha and beta chains and cross-species sequence comparison in rodents.

PubMed

de Bellocq, J Goüy; Leirs, H

2009-09-01

Sequences of the complete open reading frame (ORF) for rodent major histocompatibility complex (MHC) class II genes are rare. Multimammate rat (Mastomys natalensis) complementary DNA (cDNA) encoding the alpha and beta chains of the MHC class II DQ gene was cloned from a rapid amplification of cDNA ends (RACE) cDNA library. The ORFs consist of 801 and 771 bp encoding 266 and 256 amino acid residues for DQB and DQA, respectively. The genomic structure of Mana-DQ genes is globally analogous to that described for other rodents except for the insertion of a serine residue in the signal peptide of Mana-DQB, which is unique among known rodents.
The LINE-1 DNA sequences in four mammalian orders predict proteins that conserve homologies to retrovirus proteins.

PubMed Central

Fanning, T; Singer, M

1987-01-01

Recent work suggests that one or more members of the highly repeated LINE-1 (L1) DNA family found in all mammals may encode one or more proteins. Here we report the sequence of a portion of an L1 cloned from the domestic cat (Felis catus). These data permit comparison of the L1 sequences in four mammalian orders (Carnivore, Lagomorph, Rodent and Primate) and the comparison supports the suggested coding potential. In two separate, noncontiguous regions in the carboxy terminal half of the proteins predicted from the DNA sequences, there are several strongly conserved segments. In one region, these share homology with known or suspected reverse transcriptases, as described by others in rodents and primates. In the second region, closer to the carboxy terminus, the strongly conserved segments are over 90% homologous among the four orders. One of the latter segments is cysteine rich and resembles the putative metal binding domains of nucleic acid binding proteins, including those of TFIIIA and retroviruses. PMID:3562227

Illuminating choices for library prep: a comparison of library preparation methods for whole genome sequencing of Cryptococcus neoformans using Illumina HiSeq.

PubMed

Rhodes, Johanna; Beale, Mathew A; Fisher, Matthew C

2014-01-01

The industry of next-generation sequencing is constantly evolving, with novel library preparation methods and new sequencing machines being released by the major sequencing technology companies annually. The Illumina TruSeq v2 library preparation method was the most widely used kit and the market leader; however, it has now been discontinued, and in 2013 was replaced by the TruSeq Nano and TruSeq PCR-free methods, leaving a gap in knowledge regarding which is the most appropriate library preparation method to use. Here, we used isolates from the pathogenic fungus Cryptococcus neoformans var. grubii and sequenced them using the existing TruSeq DNA v2 kit (Illumina), along with two new kits: the TruSeq Nano DNA kit (Illumina) and the NEBNext Ultra DNA kit (New England Biolabs) to provide a comparison. Compared to the original TruSeq DNA v2 kit, both newer kits gave equivalent or better sequencing data, with increased coverage. When comparing the two newer kits, we found little difference in cost and workflow, with the NEBNext Ultra both slightly cheaper and faster than the TruSeq Nano. However, the quality of data generated using the TruSeq Nano DNA kit was superior due to higher coverage at regions of low GC content, and more SNPs identified. Researchers should therefore evaluate their resources and the type of application (and hence data quality) being considered when ultimately deciding on which library prep method to use.
Genomics dataset on unclassified published organism (patent US 7547531).

PubMed

Khan Shawan, Mohammad Mahfuz Ali; Hasan, Md Ashraful; Hossain, Md Mozammel; Hasan, Md Mahmudul; Parvin, Afroza; Akter, Salina; Uddin, Kazi Rasel; Banik, Subrata; Morshed, Mahbubul; Rahman, Md Nazibur; Rahman, S M Badier

2016-12-01

Nucleotide (DNA) sequence analysis provides important clues regarding the characteristics and taxonomic position of an organism, and is therefore crucial for learning about the hierarchical classification of that organism. This dataset (patent US 7547531) was chosen to simplify the complex raw data buried in undisclosed DNA sequences, which may help open doors for new collaborations. A total of 48 unidentified DNA sequences from patent US 7547531 were selected and their complete sequences were retrieved from the NCBI BioSample database. Quick response (QR) codes for those DNA sequences were constructed with the DNA BarID tool. The QR code is useful for the identification and comparison of isolates with other organisms. The AT/GC content of the DNA sequences, which indicates their stability at different temperatures, was determined using the ENDMEMO GC Content Calculator. The highest GC content was observed in GP445188 (62.5%), followed by GP445198 (61.8%) and GP445189 (59.44%), while the lowest was in GP445178 (24.39%). In addition, the New England BioLabs (NEB) database was used to identify the cleavage code, indicating the 5', 3' and blunt ends, and the enzyme code, indicating the methylation site of the DNA sequences. These data will be helpful for the construction of the organisms' hierarchical classification, determination of their phylogenetic and taxonomic position and revelation of their molecular characteristics.
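For readers unfamiliar with the GC-content figures quoted above, the quantity is simply the percentage of G and C bases among the unambiguous bases of a sequence. The following Python sketch shows the arithmetic; it is an illustration only, not the ENDMEMO calculator's implementation, and the function name `gc_content` and the choice to ignore ambiguous symbols such as N are assumptions.

```python
def gc_content(seq):
    """Return the GC percentage of a DNA sequence, ignoring non-ACGT symbols."""
    # Assumption: ambiguous symbols (e.g. N) are excluded from the denominator.
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)
```

For example, `gc_content("GGCCAATT")` returns `50.0`. AT content under the same convention is `100.0 - gc_content(seq)`.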
The cDNA sequence of a neutral horseradish peroxidase.

PubMed

Bartonek-Roxå, E; Eriksson, H; Mattiasson, B

1991-02-16

A cDNA clone encoding a horseradish (Armoracia rusticana) peroxidase has been isolated and characterized. The cDNA contains 1378 nucleotides excluding the poly(A) tail and the deduced protein contains 327 amino acids which includes a 28 amino acid leader sequence. The predicted amino acid sequence is nine amino acids shorter than the major isoenzyme belonging to the horseradish peroxidase C group (HRP-C) and the sequence shows 53.7% identity with this isoenzyme. The described clone encodes nine cysteines of which eight correspond well with the cysteines found in HRP-C. Five potential N-glycosylation sites with the general sequence Asn-X-Thr/Ser are present in the deduced sequence. Compared to the earlier described HRP-C this is three glycosylation sites less. The shorter sequence and fewer N-glycosylation sites give the native isoenzyme a molecular weight of several thousands less than the horseradish peroxidase C isoenzymes. Comparison with the net charge value of HRP-C indicates that the described cDNA clone encodes a peroxidase which has either the same or a slightly less basic pI value, depending on whether the encoded protein is N-terminally blocked or not. This excludes the possibility that HRP-n could belong to either the HRP-A, -D or -E groups. The low sequence identity (53.7%) with HRP-C indicates that the described clone does not belong to the HRP-C isoenzyme group and comparison of the total amino acid composition with the HRP-B group does not place the described clone within this isoenzyme group. Our conclusion is that the described cDNA clone encodes a neutral horseradish peroxidase which belongs to a new, not earlier described, horseradish peroxidase group.
Mitochondrial DNA typing from human axillary, pubic and head hair shafts - success rates and sequence comparisons.

PubMed

Pfeiffer, H; Hühne, J; Ortmann, C; Waterkamp, K; Brinkmann, B

1999-01-01

The analysis of mitochondrial DNA (mtDNA) from shed hairs has gained high importance in forensic casework since telogen hairs are one of the most common types of evidence left at the crime scene. In this systematic study of hair shafts from 20 individuals, the correlation of mtDNA recovery with hair morphology (length, diameter, volume, colour), with sex, and with body localisation (head, armpit, pubis) was investigated. The highest average success rate of hypervariable region 1 (HV 1) sequencing was found in head hair shafts (75%) followed by pubic (66%) and axillary hair shafts (52%). No statistically significant correlation between morphological parameters or sex and the success rate of sequencing was found. MtDNA sequences of buccal cells, head, pubic and axillary hair shafts did not show intraindividual differences. Heteroplasmic base positions were observed neither in the hair shafts nor in control samples of buccal cells.
Molecular characterization of banana bunchy top virus isolate from Sri Lanka and its genetic relationship with other isolates.

PubMed

Wickramaarachchi, W A R T; Shankarappa, K S; Rangaswamy, K T; Maruthi, M N; Rajapakse, R G A S; Ghosh, Saptarshi

2016-06-01

Bunchy top disease of banana, caused by Banana bunchy top virus (BBTV, genus Babuvirus, family Nanoviridae), is one of the most important constraints on banana production in different parts of the world. Six genomic DNA components of a BBTV isolate from Kandy, Sri Lanka (BBTV-K) were amplified by polymerase chain reaction (PCR) with specific primers using total DNA extracted from banana tissues showing typical symptoms of bunchy top disease. The amplicons were of the expected size of 1.0-1.1 kb and were cloned and sequenced. Analysis of sequence data revealed the presence of six DNA components (DNA-R, DNA-U3, DNA-S, DNA-N, DNA-M and DNA-C) for the Sri Lanka isolate. Comparison of the sequence data of the DNA components, followed by phylogenetic analysis, grouped the Sri Lanka (Kandy) isolate in the Pacific Indian Oceans (PIO) group. The Sri Lanka (Kandy) isolate of BBTV is classified as a new member of the PIO group based on analysis of six components of the virus.
Finding functional features in Saccharomyces genomes by phylogenetic footprinting.

PubMed

Cliften, Paul; Sudarsanam, Priya; Desikan, Ashwin; Fulton, Lucinda; Fulton, Bob; Majors, John; Waterston, Robert; Cohen, Barak A; Johnston, Mark

2003-07-04

The sifting and winnowing of DNA sequence that occur during evolution cause nonfunctional sequences to diverge, leaving phylogenetic footprints of functional sequence elements in comparisons of genome sequences. We searched for such footprints among the genome sequences of six Saccharomyces species and identified potentially functional sequences. Comparison of these sequences allowed us to revise the catalog of yeast genes and identify sequence motifs that may be targets of transcriptional regulatory proteins. Some of these conserved sequence motifs reside upstream of genes with similar functional annotations or similar expression patterns or those bound by the same transcription factor and are thus good candidates for functional regulatory sequences.
2000 Year-old ancient equids: an ancient-DNA lesson from pompeii remains.

PubMed

Di Bernardo, Giovanni; Del Gaudio, Stefania; Galderisi, Umberto; Cipollaro, Marilena

2004-11-15

Ancient DNA extracted from 2000 year-old equine bones was examined in order to amplify mitochondrial and nuclear DNA fragments. A specific equine satellite-type sequence representing 3.7%-11% of the entire equine genome, proved to be a suitable target to address the question of the presence of aDNA in ancient bones. The PCR strategy designed to investigate this specific target also allowed us to calculate the molecular weight of amplifiable DNA fragments. Sequencing of a 370 bp DNA fragment of mitochondrial control region allowed the comparison of ancient DNA sequences with those of modern horses to assess their genetic relationship. The 16S rRNA mitochondrial gene was also examined to unravel the post-mortem base modification feature and to test the status of Pompeian equids taxon on the basis of a Mae III restriction site polymorphism. Copyright 2004 Wiley-Liss, Inc.
Phylogenetic tree of 16s rRNA sequences from sulfate-reducing bacteria in a sandy marine sediment

DOE Office of Scientific and Technical Information (OSTI.GOV)

Devereux, R.; Mundfrom, G.W.

1994-01-01

Phylogenetic divergence among sulfate-reducing bacteria in an estuarine sediment sample was investigated by PCR amplification and comparison of partial 16S rDNA sequences. Twenty unique 16S rDNA sequences were found, 12 from delta subclass bacteria based on overall sequence similarity (82-91%). Two successive PCR amplifications were used to obtain and clone the 16S rDNA. The first reaction used templates derived from phosphate-buffered saline washed sediment with primers designed to amplify nearly full-length bacterial domain 16S rDNA. A product from the first reaction was used as template in a second reaction with primers designed to selectively amplify a region of 16S rDNA genes of sulfate-reducing bacteria. A phylogenetic tree incorporating the cloned sequences suggests the presence of yet to be cultivated lines of sulfate-reducing bacteria within the sediment sample.
Phylo-VISTA: Interactive visualization of multiple DNA sequence alignments

DOE Office of Scientific and Technical Information (OSTI.GOV)

Shah, Nameeta; Couronne, Olivier; Pennacchio, Len A.

The power of multi-sequence comparison for biological discovery is well established. The need for new capabilities to visualize and compare cross-species alignment data is intensified by the growing number of genomic sequence datasets being generated for an ever-increasing number of organisms. To be efficient these visualization algorithms must support the ability to accommodate consistently a wide range of evolutionary distances in a comparison framework based upon phylogenetic relationships. Results: We have developed Phylo-VISTA, an interactive tool for analyzing multiple alignments by visualizing a similarity measure for multiple DNA sequences. The complexity of visual presentation is effectively organized using a framework based upon interspecies phylogenetic relationships. The phylogenetic organization supports rapid, user-guided interspecies comparison. To aid in navigation through large sequence datasets, Phylo-VISTA leverages concepts from VISTA that provide a user with the ability to select and view data at varying resolutions. The combination of multiresolution data visualization and analysis, combined with the phylogenetic framework for interspecies comparison, produces a highly flexible and powerful tool for visual data analysis of multiple sequence alignments. Availability: Phylo-VISTA is available at http://www-gsd.lbl.gov/phylovista. It requires an Internet browser with Java Plugin 1.4.2 and it is integrated into the global alignment program LAGAN at http://lagan.stanford.edu
Differences in expression of retinal pigment epithelium mRNA between normal canines

PubMed Central

2004-01-01

A reference database of differences in mRNA expression in normal healthy canine retinal pigment epithelium (RPE) has been established. This database identifies non-informative differences in mRNA expression that can be used in screening canine RPE for mutations associated with clinical effects on vision. Complementary DNA (cDNA) pools were prepared from mRNA harvested from RPE, amplified by PCR, and used in a subtractive hybridization protocol (representational difference analysis) to identify differences in RPE mRNA expression between canines. The effect of relatedness of the test canines on the frequency of occurrence of differences was evaluated by using 2 unrelated canines for comparison with 2 female sibling canines of blue heeler/bull terrier lineage. Differentially expressed cDNA species were cloned, sequenced, and identified by comparison to public database entries. The most frequently observed differentially expressed sequence from the unrelated canine comparison was cDNA with 21 base pairs (bp) identical to the human epithelial membrane protein 1 gene (present in 8 of 20 clones). Different clones from the same-sex sibling RPE contained repetitions of several short sequence motifs including the human epithelial membrane protein 1 (4 of 25 clones). Other prevalent differences between sibling RPE included sequences similar to a chicken genetic marker sequence motif (5 of 25), and 6 clones with homology to porcine major histocompatibility loci. In addition to identifying several repetitively occurring, noninformative, differentially expressed RPE mRNA species, the findings confirm that fewer differences occurred between siblings, highlighting the importance of using closely related subjects in representational difference analysis studies. PMID:15352545
Amplification of the major satellite DNA family (FA-SAT) in a cat fibrosarcoma might be related to chromosomal instability.

PubMed

Santos, Sara; Chaves, Raquel; Adega, Filomena; Bastos, Estela; Guedes-Pinto, Henrique

2006-01-01

Most mammalian chromosomes have satellite DNA sequences located at or near the centromeres, organized in arrays of variable size and higher order structure. The implications of these specific repetitive DNA sequences and their organization for centromere function are still quite cloudy. In contrast to most mammalian species, the domestic cat seems to have the major satellite DNA family (FA-SAT) localized primarily at the telomeres and secondarily at the centromeres of the chromosomes. In the present work, we analyzed chromosome preparations from a fibrosarcoma, in comparison with nontumor cells (epithelial tissue) from the same individual, by in situ hybridization of the FA-SAT cat satellite DNA family. This repetitive sequence was found to be amplified in the cat tumor chromosomes analyzed. The amplification of these satellite DNA sequences in the cat chromosomes with variable number and appearance (marker chromosomes) is discussed and might be related to mitotic instability, which could explain the exhibition of complex patterns of chromosome aberrations detected in the fibrosarcoma analyzed.
Local alignment of two-base encoded DNA sequence

PubMed Central

Homer, Nils; Merriman, Barry; Nelson, Stanley F

2009-01-01

Background: DNA sequence comparison is based on optimal local alignment of two sequences using a similarity score. However, some new DNA sequencing technologies do not directly measure the base sequence, but rather an encoded form, such as the two-base encoding considered here. In order to compare such data to a reference sequence, the data must be decoded into sequence. The decoding is deterministic, but the possibility of measurement errors requires searching among all possible error modes and resulting alignments to achieve an optimal balance of fewer errors versus greater sequence similarity. Results: We present an extension of the standard dynamic programming method for local alignment, which simultaneously decodes the data and performs the alignment, maximizing a similarity score based on a weighted combination of errors and edits, and allowing an affine gap penalty. We also present simulations that demonstrate the performance characteristics of our two-base encoded alignment method and contrast those with standard DNA sequence alignment under the same conditions. Conclusion: The new local alignment algorithm for two-base encoded data has substantial power to properly detect and correct measurement errors while identifying underlying sequence variants, and facilitating genome re-sequencing efforts based on this form of sequence data. PMID:19508732
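To see why decoding and alignment benefit from being done together, it helps to look at how a two-base encoding behaves. The Python sketch below assumes the widely used color-space convention in which each base gets a 2-bit code (A=0, C=1, G=2, T=3) and each "color" is the XOR of two adjacent base codes; the abstract does not spell out the encoding, so this convention, along with the names `encode_colors` and `decode_colors`, is an assumption made for illustration. It is not the authors' alignment algorithm.

```python
# Assumed 2-bit base codes; a color is the XOR of two adjacent base codes.
BASE_TO_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}
BITS_TO_BASE = "ACGT"


def encode_colors(seq):
    """Return (first base, list of colors) for a non-empty DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    codes = [BASE_TO_BITS[base] for base in seq]
    colors = [left ^ right for left, right in zip(codes, codes[1:])]
    return seq[0], colors


def decode_colors(first_base, colors):
    """Deterministically rebuild the base sequence from its first base and colors."""
    current = BASE_TO_BITS[first_base.upper()]
    bases = [BITS_TO_BASE[current]]
    for color in colors:
        # Each color tells us how to step from the previous base to the next.
        current ^= color
        bases.append(BITS_TO_BASE[current])
    return "".join(bases)
```

Under this convention `encode_colors("ATGGC")` gives `("A", [3, 1, 0, 3])`, and decoding those colors returns `"ATGGC"`. If the second color is misread as 2, `decode_colors("A", [3, 2, 0, 3])` yields `"ATCCG"`: a single measurement error corrupts every downstream base, which is why naive decoding followed by standard alignment is fragile.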
Cloning and sequence analysis of a cDNA encoding the alpha-subunit of mouse beta-N-acetylhexosaminidase and comparison with the human enzyme.

PubMed Central

Beccari, T; Hoade, J; Orlacchio, A; Stirling, J L

1992-01-01

cDNAs encoding the mouse beta-N-acetylhexosaminidase alpha-subunit were isolated from a mouse testis library. The longest of these (1.7 kb) was sequenced and showed 83% similarity with the human alpha-subunit cDNA sequence. The 5' end of the coding sequence was obtained from a genomic DNA clone. Alignment of the human and mouse sequences showed that all three putative N-glycosylation sites are conserved, but that the mouse alpha-subunit has an additional site towards the C-terminus. All eight cysteines in the human sequence are conserved in the mouse. There are an additional two cysteines in the mouse alpha-subunit signal peptide. All amino acids affected in Tay-Sachs-disease mutations are conserved in the mouse. PMID:1379046
Positive Streptobacillus moniliformis PCR in guinea pigs likely due to Leptotrichia spp.

PubMed

Boot, Ron; Van de Berg, Lia; Reubsaet, Frans A G; Vlemminx, Maurice J

2008-04-30

Streptobacillus moniliformis is a zoonotic bacterium. We obtained positive S. moniliformis PCR results in oral swab samples from guinea pigs from an experimental colony and the breeding colony of origin. Comparison of the DNA sequence of an amplicon with deposited 16S rDNA sequences revealed that Leptotrichia sp. can be the source of a false positive S. moniliformis PCR outcome.
Characterization of a cDNA encoding a protein involved in formation of the skeleton during development of the sea urchin Lytechinus pictus.

PubMed

Livingston, B T; Shaw, R; Bailey, A; Wilt, F

1991-12-01

In order to investigate the role of proteins in the formation of mineralized tissues during development, we have isolated a cDNA that encodes a protein that is a component of the organic matrix of the skeletal spicule of the sea urchin, Lytechinus pictus. The expression of the RNA encoding this protein is regulated over development and is localized to the descendants of the micromere lineage. Comparison of the sequence of this cDNA to homologous cDNAs from other species of urchin reveals that the protein is basic and contains three conserved structural motifs: a signal peptide, a proline-rich region, and an unusual region composed of a series of direct repeats. Studies on the protein encoded by this cDNA confirm the predicted reading frame deduced from the nucleotide sequence and show that the protein is secreted and not glycosylated. Comparison of the amino acid sequence to databases reveals that the repeat domain is similar to proteins that form a unique beta-spiral supersecondary structure.
Benchmarking of the Oxford Nanopore MinION sequencing for quantitative and qualitative assessment of cDNA populations.

PubMed

Oikonomopoulos, Spyros; Wang, Yu Chang; Djambazian, Haig; Badescu, Dunarel; Ragoussis, Jiannis

2016-08-24

To assess the performance of the Oxford Nanopore Technologies MinION sequencing platform, cDNAs from the External RNA Controls Consortium (ERCC) RNA Spike-In mix were sequenced. This mix mimics mammalian mRNA species and consists of 92 polyadenylated transcripts with known concentration. cDNA libraries were generated using a template switching protocol to facilitate the direct comparison between different sequencing platforms. The MinION performance was assessed for its ability to sequence the cDNAs directly with good accuracy in terms of abundance and full length. The abundance of the ERCC cDNA molecules sequenced by MinION agreed with their expected concentration. No length or GC content bias was observed. The majority of cDNAs were sequenced as full length. Additionally, a complex cDNA population derived from a human HEK-293 cell line was sequenced on Illumina HiSeq 2500, PacBio RS II and ONT MinION platforms. We observed that there was a good agreement in the measured cDNA abundance between PacBio RS II and ONT MinION (Pearson r = 0.82, isoforms with length more than 700 bp) and between Illumina HiSeq 2500 and ONT MinION (Pearson r = 0.75). This indicates that the ONT MinION can sequence quantitatively both long and short full length cDNA molecules.
Application of a time-dependent coalescence process for inferring the history of population size changes from DNA sequence data.

PubMed

Polanski, A; Kimmel, M; Chakraborty, R

1998-05-12

Distribution of pairwise differences of nucleotides from data on a sample of DNA sequences from a given segment of the genome has been used in the past to draw inferences about the past history of population size changes. However, all earlier methods assume a given model of population size changes (such as sudden expansion), parameters of which (e.g., time and amplitude of expansion) are fitted to the observed distributions of nucleotide differences among pairwise comparisons of all DNA sequences in the sample. Our theory indicates that for any time-dependent population size, N(tau) (in which time tau is counted backward from present), a time-dependent coalescence process yields the distribution, p(tau), of the time of coalescence between two DNA sequences randomly drawn from the population. Prediction of p(tau) and N(tau) requires the use of a reverse Laplace transform known to be unstable. Nevertheless, simulated data obtained from three models of monotone population change (stepwise, exponential, and logistic) indicate that the pattern of a past population size change leaves its signature on the pattern of DNA polymorphism. Application of the theory to the published mtDNA sequences indicates that the current mtDNA sequence variation is not inconsistent with a logistic growth of the human population.
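The relationship between N(tau) and p(tau) described in the abstract above can be sketched numerically in the forward direction, which is the stable one. The sketch assumes the standard pairwise coalescent with instantaneous rate 1/N(tau), so that p(tau) = exp(-integral from 0 to tau of ds/N(s)) / N(tau); this scaling, the function names, and the parameter values of the logistic history are illustrative assumptions and are not taken from the paper.

```python
import math


def coalescence_density(pop_size, taus):
    """Approximate p(tau) = exp(-integral_0^tau ds / N(s)) / N(tau) on a grid.

    pop_size: function N(tau), with tau counted backward from the present.
    taus: increasing grid of times starting at 0.
    The coalescence rate 1/N(tau) is an assumed (haploid) scaling.
    """
    densities = []
    cumulative = 0.0  # running trapezoid estimate of the integrated rate
    for i, tau in enumerate(taus):
        if i > 0:
            step = tau - taus[i - 1]
            cumulative += 0.5 * step * (
                1.0 / pop_size(taus[i - 1]) + 1.0 / pop_size(tau)
            )
        densities.append(math.exp(-cumulative) / pop_size(tau))
    return densities


def logistic_growth(tau, n_now=10000.0, n_ancient=1000.0, rate=0.01, midpoint=500.0):
    """Hypothetical logistic history: small in the past, large at present."""
    return n_ancient + (n_now - n_ancient) / (1.0 + math.exp(rate * (tau - midpoint)))


grid = [float(t) for t in range(0, 5001, 5)]
p_constant = coalescence_density(lambda tau: 1000.0, grid)
p_logistic = coalescence_density(logistic_growth, grid)
```

For a constant population the result reduces to the familiar exponential density, which gives a quick sanity check. Recovering N(tau) from an observed p(tau) is the inverse problem that the abstract describes as unstable, and this sketch does not attempt it.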
Computational and experimental analysis of DNA shuffling

PubMed Central

Maheshri, Narendra; Schaffer, David V.

2003-01-01

We describe a computational model of DNA shuffling based on the thermodynamics and kinetics of this process. The model independently tracks a representative ensemble of DNA molecules and records their states at every stage of a shuffling reaction. These data can subsequently be analyzed to yield information on any relevant metric, including reassembly efficiency, crossover number, type and distribution, and DNA sequence length distributions. The predictive ability of the model was validated by comparison to three independent sets of experimental data, and analysis of the simulation results led to several unique insights into the DNA shuffling process. We examine a tradeoff between crossover frequency and reassembly efficiency and illustrate the effects of experimental parameters on this relationship. Furthermore, we discuss conditions that promote the formation of useless “junk” DNA sequences or multimeric sequences containing multiple copies of the reassembled product. This model will therefore aid in the design of optimal shuffling reaction conditions. PMID:12626764
Comparison of DNA Microarray, Loop-Mediated Isothermal Amplification (LAMP) and Real-Time PCR with DNA Sequencing for Identification of Fusarium spp. Obtained from Patients with Hematologic Malignancies.

PubMed

de Souza, Marcela; Matsuzawa, Tetsuhiro; Sakai, Kanae; Muraosa, Yasunori; Lyra, Luzia; Busso-Lopes, Ariane Fidelis; Levin, Anna Sara Shafferman; Schreiber, Angélica Zaninelli; Mikami, Yuzuru; Gonoi, Tohoru; Kamei, Katsuhiko; Moretti, Maria Luiza; Trabasso, Plínio

2017-08-01

The performance of three molecular biology techniques, i.e., DNA microarray, loop-mediated isothermal amplification (LAMP), and real-time PCR, was compared with DNA sequencing for the proper identification of 20 isolates of Fusarium spp. obtained from the bloodstream as etiologic agents of invasive infections in patients with hematologic malignancies. DNA microarray, LAMP and real-time PCR identified 16 (80%) out of 20 samples as Fusarium solani species complex (FSSC) and four (20%) as Fusarium spp. The agreement among the techniques was 100%. LAMP exhibited 100% specificity, while DNA microarray, LAMP and real-time PCR showed 100% sensitivity. The three techniques had 100% agreement with DNA sequencing. Sixteen isolates were identified as FSSC by sequencing: five Fusarium keratoplasticum, nine Fusarium petroliphilum and two Fusarium solani. Sequencing identified the other four isolates as Fusarium non-solani species complex (FNSSC): three as Fusarium napiforme and one as Fusarium oxysporum. Finally, LAMP proved to be faster and more accessible than DNA microarray and real-time PCR, since it does not require a thermocycler. Therefore, LAMP emerges as a promising methodology for routine identification of Fusarium spp. among cases of invasive fungal infections.
Applications of alignment-free methods in epigenomics.

PubMed

Pinello, Luca; Lo Bosco, Giosuè; Yuan, Guo-Cheng

2014-05-01

Epigenetic mechanisms play an important role in the regulation of cell type-specific gene activities, yet how epigenetic patterns are established and maintained remains poorly understood. Recent studies have supported a role of DNA sequences in recruitment of epigenetic regulators. Alignment-free methods have been applied to identify distinct sequence features that are associated with epigenetic patterns and to predict epigenomic profiles. Here, we review recent advances in such applications, including the methods to map DNA sequence to feature space, sequence comparison and prediction models. Computational studies using these methods have provided important insights into the epigenetic regulatory mechanisms.

DNABP: Identification of DNA-Binding Proteins Based on Feature Selection Using a Random Forest and Predicting Binding Residues.

PubMed

Ma, Xin; Guo, Jing; Sun, Xiao

2016-01-01

DNA-binding proteins are fundamentally important in cellular processes. Several computational-based methods have been developed to improve the prediction of DNA-binding proteins in previous years. However, insufficient work has been done on the prediction of DNA-binding proteins from protein sequence information. In this paper, a novel predictor, DNABP (DNA-binding proteins), was designed to predict DNA-binding proteins using the random forest (RF) classifier with a hybrid feature. The hybrid feature contains two types of novel sequence features, which reflect information about the conservation of physicochemical properties of the amino acids, and the binding propensity of DNA-binding residues and non-binding propensities of non-binding residues. The comparisons with each feature demonstrated that these two novel features contributed most to the improvement in predictive ability. Furthermore, to improve the prediction performance of the DNABP model, feature selection using the minimum redundancy maximum relevance (mRMR) method combined with incremental feature selection (IFS) was carried out during the model construction. The results showed that the DNABP model could achieve 86.90% accuracy, 83.76% sensitivity, 90.03% specificity and a Matthews correlation coefficient of 0.727. High prediction accuracy and performance comparisons with previous research suggested that DNABP could be a useful approach to identify DNA-binding proteins from sequence information. The DNABP web server system is freely available at http://www.cbi.seu.edu.cn/DNABP/.
Phylogenetic Analysis of Shewanella Strains by DNA Relatedness Derived from Whole Genome Microarray DNA-DNA Hybridization and Comparison with Other Methods

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wu, Liyou; Yi, T. Y.; Van Nostrand, Joy

Phylogenetic analyses were done for the Shewanella strains isolated from Baltic Sea (38 strains), US DOE Hanford Uranium bioremediation site [Hanford Reach of the Columbia River (HRCR), 11 strains], Pacific Ocean and Hawaiian sediments (8 strains), and strains from other resources (16 strains) with three out group strains, Rhodopseudomonas palustris, Clostridium cellulolyticum, and Thermoanaerobacter ethanolicus X514, using DNA relatedness derived from WCGA-based DNA-DNA hybridizations, sequence similarities of 16S rRNA gene and gyrB gene, and sequence similarities of 6 loci of Shewanella genome selected from a shared gene list of the Shewanella strains with whole genome sequenced based on the average nucleotide identity of them (ANI). The phylogenetic trees based on 16S rRNA and gyrB gene sequences, and DNA relatedness derived from WCGA hybridizations of the tested Shewanella strains share exactly the same sub-clusters with very few exceptions, in which the strains were basically grouped by species. However, the phylogenetic analysis based on DNA relatedness derived from WCGA hybridizations dramatically increased the differentiation resolution at species and strains level within Shewanella genus. When the tree based on DNA relatedness derived from WCGA hybridizations was compared to the tree based on the combined sequences of the selected functional genes (6 loci), we found that the resolutions of both methods are similar, but the clustering of the tree based on DNA relatedness derived from WCGA hybridizations was clearer. These results indicate that WCGA-based DNA-DNA hybridization is an ideal alternative to conventional DNA-DNA hybridization methods and it is superior to the phylogenetic methods based on sequence similarities of single genes. Detailed analysis is being performed for the re-classification of the strains examined.
Phylogenetic study on Shiraia bambusicola by rDNA sequence analyses.

PubMed

Cheng, Tian-Fan; Jia, Xiao-Ming; Ma, Xiao-Hang; Lin, Hai-Ping; Zhao, Yu-Hua

2004-01-01

In this study, 18S rDNA and ITS-5.8S rDNA regions of four Shiraia bambusicola isolates collected from different species of bamboos were amplified by PCR with universal primer pairs NS1/NS8 and ITS5/ITS4, respectively, and sequenced. Phylogenetic analyses were conducted on three selected datasets of rDNA sequences. Maximum parsimony, distance and maximum likelihood criteria were used to infer trees. Morphological characteristics were also observed. The positioning of Shiraia in the order Pleosporales was well supported by bootstrap, which agreed with the placement by Amano (1980) according to their morphology. We did not find significant inter-hostal differences among these four isolates from different species of bamboos. From the results of analyses and comparison of their rDNA sequences, we conclude that Shiraia should be classified into Pleosporales as Amano (1980) proposed and suggest that it might be positioned in the family Phaeosphaeriaceae. Copyright 2004 WILEY-VCH Verlag GmbH & Co.
Mitochondrial DNA Evidence Supports the Hypothesis that Triodontophorus Species Belong to Cyathostominae

PubMed Central

Gao, Yuan; Zhang, Yan; Yang, Xin; Qiu, Jian-Hua; Duan, Hong; Xu, Wen-Wen; Chang, Qiao-Cheng; Wang, Chun-Ren

2017-01-01

Equine strongyles, the significant nematode pathogens of horses, are characterized by high quantities and species abundance, but classification of this group of parasitic nematodes is debated. Mitochondrial (mt) genome DNA data are often used to address classification controversies. Thus, the objectives of this study were to determine the complete mt genomes of three Cyathostominae nematode species (Cyathostomum catinatum, Cylicostephanus minutus, and Poteriostomum imparidentatum) of horses and reconstruct the phylogenetic relationship of Strongylidae with other nematodes in Strongyloidea to test the hypothesis that Triodontophorus spp. belong to Cyathostominae using the mt genomes. The mt genomes of Cy. catinatum, Cs. minutus, and P. imparidentatum were 13,838, 13,826, and 13,817 bp in length, respectively. Complete mt nucleotide sequence comparison of all Strongylidae nematodes revealed that sequence identity ranged from 77.8 to 91.6%. The mt genome sequences of Triodontophorus species had relatively high identity with Cyathostominae nematodes, rather than Strongylus species of the same subfamily (Strongylinae). Comparative analyses of mt genome organization for Strongyloidea nematodes sequenced to date revealed that members of this superfamily possess identical gene arrangements. Phylogenetic analyses using mtDNA data indicated that the Triodontophorus species clustered with Cyathostominae species instead of Strongylus species. The present study first determined the complete mt genome sequences of Cy. catinatum, Cs. minutus, and P. imparidentatum, which will provide novel genetic markers for further studies of Strongylidae taxonomy, population genetics, and systematics. Importantly, sequence comparison and phylogenetic analyses based on mtDNA sequences supported the hypothesis that Triodontophorus belongs to Cyathostominae. PMID:28824575
Methylation pattern of fish lymphocystis disease virus DNA.

PubMed

Wagner, H; Simon, D; Werner, E; Gelderblom, H; Darai, C; Flügel, R M

1985-03-01

The content and distribution of 5-methylcytosine in DNA from fish lymphocystis disease virus was analyzed by high-pressure liquid chromatography, nearest-neighbor analysis, and with restriction endonucleases. We found that 22% of all C residues were methylated, including methylation of the following dinucleotide sequences: CpG to 75%, CpC to ca. 1%, and CpA to 2 to 5%. Comparison of relative digestion of viral DNA with MspI and HpaII indicated that CCGG sequences were almost completely methylated at the inner C. The degree of methylation of GCGC was much lower. The methylation pattern of fish lymphocystis disease virus DNA differed from that of the host cell DNA.
Methylation pattern of fish lymphocystis disease virus DNA.

PubMed Central

Wagner, H; Simon, D; Werner, E; Gelderblom, H; Darai, C; Flügel, R M

1985-01-01

The content and distribution of 5-methylcytosine in DNA from fish lymphocystis disease virus was analyzed by high-pressure liquid chromatography, nearest-neighbor analysis, and with restriction endonucleases. We found that 22% of all C residues were methylated, including methylation of the following dinucleotide sequences: CpG to 75%, CpC to ca. 1%, and CpA to 2 to 5%. Comparison of relative digestion of viral DNA with MspI and HpaII indicated that CCGG sequences were almost completely methylated at the inner C. The degree of methylation of GCGC was much lower. The methylation pattern of fish lymphocystis disease virus DNA differed from that of the host cell DNA. PMID:3973962
MethVisual - visualization and exploratory statistical analysis of DNA methylation profiles from bisulfite sequencing.

PubMed

Zackay, Arie; Steinhoff, Christine

2010-12-15

Exploration of DNA methylation and its impact on various regulatory mechanisms has become a very active field of research. Simultaneously there is an arising need for tools to process and analyse the data together with statistical investigation and visualisation. MethVisual is a new application that enables exploratory analysis and intuitive visualization of DNA methylation data as is typically generated by bisulfite sequencing. The package allows the import of DNA methylation sequences, aligns them and performs quality control comparison. It comprises basic analysis steps such as lollipop visualization, co-occurrence display of methylation of neighbouring and distant CpG sites, summary statistics on methylation status, clustering and correspondence analysis. The package has been developed for methylation data but can also be used for other data types for which binary coding can be inferred. The application of the package, as well as a comparison to existing DNA methylation analysis tools and its workflow based on two datasets is presented in this paper. The R package MethVisual offers various analysis procedures for data that can be binarized, in particular for bisulfite sequenced methylation data. R/Bioconductor has become one of the most important environments for statistical analysis of various types of biological and medical data. Therefore, any data analysis within R that allows the integration of various data types as provided from different technological platforms is convenient. It is the first and so far the only specific package for DNA methylation analysis, in particular for bisulfite sequenced data available in R/Bioconductor environment. The package is available for free at http://methvisual.molgen.mpg.de/ and from the Bioconductor Consortium http://www.bioconductor.org.
MethVisual - visualization and exploratory statistical analysis of DNA methylation profiles from bisulfite sequencing

PubMed Central

2010-01-01

Background Exploration of DNA methylation and its impact on various regulatory mechanisms has become a very active field of research. Simultaneously there is an arising need for tools to process and analyse the data together with statistical investigation and visualisation. Findings MethVisual is a new application that enables exploratory analysis and intuitive visualization of DNA methylation data as is typically generated by bisulfite sequencing. The package allows the import of DNA methylation sequences, aligns them and performs quality control comparison. It comprises basic analysis steps such as lollipop visualization, co-occurrence display of methylation of neighbouring and distant CpG sites, summary statistics on methylation status, clustering and correspondence analysis. The package has been developed for methylation data but can also be used for other data types for which binary coding can be inferred. The application of the package, as well as a comparison to existing DNA methylation analysis tools and its workflow based on two datasets is presented in this paper. Conclusions The R package MethVisual offers various analysis procedures for data that can be binarized, in particular for bisulfite sequenced methylation data. R/Bioconductor has become one of the most important environments for statistical analysis of various types of biological and medical data. Therefore, any data analysis within R that allows the integration of various data types as provided from different technological platforms is convenient. It is the first and so far the only specific package for DNA methylation analysis, in particular for bisulfite sequenced data available in R/Bioconductor environment. The package is available for free at http://methvisual.molgen.mpg.de/ and from the Bioconductor Consortium http://www.bioconductor.org. PMID:21159174
Analysis on the DNA Fingerprinting of Aspergillus Oryzae Mutant Induced by High Hydrostatic Pressure

NASA Astrophysics Data System (ADS)

Wang, Hua; Zhang, Jian; Yang, Fan; Wang, Kai; Shen, Si-Le; Liu, Bing-Bing; Zou, Bo; Zou, Guang-Tian

2011-01-01

The mutant strains of Aspergillus oryzae (HP300a) are screened under 300 MPa for 20 min. Compared with the control strains, the screened mutant strains have unique properties such as genetic stability, rapid growth, abundant spores, and high protease activity. Random amplified polymorphic DNA (RAPD) and inter simple sequence repeats (ISSR) are used to analyze the DNA fingerprinting of HP300a and the control strains. These two markers yield 67.9% and 51.3% polymorphic bands, respectively, indicating significant genetic variation between HP300a and the control strains. In addition, the genetic distances between HP300a and the control strains are 0.51 for the random sequences and 0.34 for the simple sequence repeats.
Biosynthesis of Lipoic Acid in Arabidopsis: Cloning and Characterization of the cDNA for Lipoic Acid Synthase

PubMed Central

Yasuno, Rie; Wada, Hajime

1998-01-01

Lipoic acid is a coenzyme that is essential for the activity of enzyme complexes such as those of pyruvate dehydrogenase and glycine decarboxylase. We report here the isolation and characterization of LIP1 cDNA for lipoic acid synthase of Arabidopsis. The Arabidopsis LIP1 cDNA was isolated using an expressed sequence tag homologous to the lipoic acid synthase of Escherichia coli. This cDNA was shown to code for Arabidopsis lipoic acid synthase by its ability to complement a lipA mutant of E. coli defective in lipoic acid synthase. DNA-sequence analysis of the LIP1 cDNA revealed an open reading frame predicting a protein of 374 amino acids. Comparisons of the deduced amino acid sequence with those of E. coli and yeast lipoic acid synthase homologs showed a high degree of sequence similarity and the presence of a leader sequence presumably required for import into the mitochondria. Southern-hybridization analysis suggested that LIP1 is a single-copy gene in Arabidopsis. Western analysis with an antibody against lipoic acid synthase demonstrated that this enzyme is located in the mitochondrial compartment in Arabidopsis cells as a 43-kD polypeptide. PMID:9808738
Complete genome sequence of a new begomovirus associated with yellow mosaic disease of Hemidesmus indicus in India.

PubMed

Reddy, M Sreekanth; Kanakala, S; Srinivas, K P; Hema, M; Malathi, V G; Sreenivasulu, P

2014-05-01

The complete DNA A genome of a virus isolate associated with yellow mosaic disease of a medicinal plant, Hemidesmus indicus, from India was cloned and sequenced. The length of DNA A was 2825 nucleotides, 35 nucleotides longer than the unit genome of monopartite begomoviruses. Comparison of the nucleotide sequence of DNA A of the virus isolate with those of other begomoviruses showed maximum sequence identity of 69 % to DNA A of ageratum yellow vein China virus (AYVCNV; AJ558120) and 68 % with tomato yellow leaf curl virus- LBa4 (TYLCV; EF185318), and it formed a distinct clade in phylogenetic analysis. The genome organization of the present virus isolate was found to be similar to that of Old World monopartite begomoviruses. The genome was considered to be monopartite, because association of DNA B and β satellite DNA components was not detected. Based on its sequence identity (<70 %) to all other begomoviruses known to date and ICTV (International Committee on Taxonomy of Viruses) species demarcating criteria (<89 % identity), it is considered a member of a novel begomovirus species, and the tentative name "Hemidesmus yellow mosaic virus" (HeYMV) is proposed.
The Neandertal genome and ancient DNA authenticity

PubMed Central

Green, Richard E; Briggs, Adrian W; Krause, Johannes; Prüfer, Kay; Burbano, Hernán A; Siebauer, Michael; Lachmann, Michael; Pääbo, Svante

2009-01-01

Recent advances in high-throughput DNA sequencing have made genome-scale analyses of genomes of extinct organisms possible. With these new opportunities come new difficulties in assessing the authenticity of the DNA sequences retrieved. We discuss how these difficulties can be addressed, particularly with regard to analyses of the Neandertal genome. We argue that only direct assays of DNA sequence positions in which Neandertals differ from all contemporary humans can serve as a reliable means to estimate human contamination. Indirect measures, such as the extent of DNA fragmentation, nucleotide misincorporations, or comparison of derived allele frequencies in different fragment size classes, are unreliable. Fortunately, interim approaches based on mtDNA differences between Neandertals and current humans, detection of male contamination through Y chromosomal sequences, and repeated sequencing from the same fossil to detect autosomal contamination allow initial large-scale sequencing of Neandertal genomes. This will result in the discovery of fixed differences in the nuclear genome between Neandertals and current humans that can serve as future direct assays for contamination. For analyses of other fossil hominins, which may become possible in the future, we suggest a similar ‘boot-strap’ approach in which interim approaches are applied until sufficient data for more definitive direct assays are acquired. PMID:19661919
CpG PatternFinder: a Windows-based utility program for easy and rapid identification of the CpG methylation status of DNA.

PubMed

Xu, Yi-Hua; Manoharan, Herbert T; Pitot, Henry C

2007-09-01

The bisulfite genomic sequencing technique is one of the most widely used techniques to study sequence-specific DNA methylation because of its unambiguous ability to reveal DNA methylation status at single-nucleotide resolution. One characteristic feature of the bisulfite genomic sequencing technique is that a number of sample sequence files will be produced from a single DNA sample. The PCR products of bisulfite-treated DNA samples cannot be sequenced directly because they are heterogeneous in nature; therefore they should be cloned into suitable plasmids and then sequenced. This procedure generates an enormous number of sample DNA sequence files and adds extra bases belonging to the plasmids to the sequence, which will cause problems in the final sequence comparison. Finding the methylation status for each CpG in each sample sequence is not an easy job, and CpG PatternFinder was developed for this purpose. The main functions of CpG PatternFinder are: (i) to analyze the reference sequence to obtain CpG and non-CpG-C residue position information; (ii) to tailor sample sequence files (delete insertions and mark deletions from the sample sequence files) based on a configuration of ClustalW multiple alignment; (iii) to align sample sequence files with a reference file to obtain bisulfite conversion efficiency and CpG methylation status; and (iv) to produce graphics, highlighted aligned sequence text and a summary report which can be easily exported to Microsoft Office suite. CpG PatternFinder is designed to operate cooperatively with BioEdit, a freeware program available on the internet. It can handle up to 100 files of sample DNA sequences simultaneously, and the total CpG pattern analysis process can be finished in minutes. CpG PatternFinder is an ideal software tool for DNA methylation studies to determine the differential methylation pattern in a large number of individuals in a population. Previously we developed the CpG Analyzer program; CpG PatternFinder is our further effort to create software tools for DNA methylation studies.
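Steps (i) and (iii) of the workflow above can be sketched for a single, already aligned sample. This is not the CpG PatternFinder implementation: the function name is hypothetical, and the sketch assumes uppercase sequences of equal length read on the same strand, with '-' marking gaps. It relies on the usual bisulfite chemistry, in which unmethylated C is read as T after conversion and PCR while methylated C is still read as C.

```python
def call_cpg_methylation(reference, sample):
    """Return ({cpg_position: status}, conversion_efficiency).

    reference: unconverted reference sequence.
    sample: bisulfite-converted read, aligned to the reference position by position.
    Conversion efficiency is estimated from non-CpG cytosines, which are
    assumed here to be unmethylated.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    cpg_status = {}
    converted = 0
    informative_non_cpg = 0
    for i, ref_base in enumerate(reference):
        if ref_base != "C":
            continue
        read = sample[i]
        is_cpg = i + 1 < len(reference) and reference[i + 1] == "G"
        if is_cpg:
            calls = {"C": "methylated", "T": "unmethylated"}
            cpg_status[i] = calls.get(read, "unknown")
        elif read in ("C", "T"):
            informative_non_cpg += 1
            converted += read == "T"
    efficiency = converted / informative_non_cpg if informative_non_cpg else None
    return cpg_status, efficiency


reference = "ACGTCCATCGGACTCG"
sample = "ACGTTTATTGGACTCG"
status, efficiency = call_cpg_methylation(reference, sample)
```

In this toy input the reference has CpG cytosines at positions 1, 8 and 14 and non-CpG cytosines at positions 4, 5 and 12; one of the non-CpG cytosines is left unconverted, so the estimated conversion efficiency is two thirds.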
Annotation, submission and screening of repetitive elements in Repbase: RepbaseSubmitter and Censor.

PubMed

Kohany, Oleksiy; Gentles, Andrew J; Hankus, Lukasz; Jurka, Jerzy

2006-10-25

Repbase is a reference database of eukaryotic repetitive DNA, which includes prototypic sequences of repeats and basic information described in annotations. Updating and maintenance of the database requires specialized tools, which we have created and made available for use with Repbase, and which may be useful as a template for other curated databases. We describe the software tools RepbaseSubmitter and Censor, which are designed to facilitate updating and screening the content of Repbase. RepbaseSubmitter is a java-based interface for formatting and annotating Repbase entries. It eliminates many common formatting errors, and automates actions such as calculation of sequence lengths and composition, thus facilitating curation of Repbase sequences. In addition, it has several features for predicting protein coding regions in sequences; searching and including Pubmed references in Repbase entries; and searching the NCBI taxonomy database for correct inclusion of species information and taxonomic position. Censor is a tool to rapidly identify repetitive elements by comparison to known repeats. It uses WU-BLAST for speed and sensitivity, and can conduct DNA-DNA, DNA-protein, or translated DNA-translated DNA searches of genomic sequence. Defragmented output includes a map of repeats present in the query sequence, with the options to report masked query sequence(s), repeat sequences found in the query, and alignments. Censor and RepbaseSubmitter are available as both web-based services and downloadable versions. They can be found at http://www.girinst.org/repbase/submission.html (RepbaseSubmitter) and http://www.girinst.org/censor/index.php (Censor).
Comparison of microbial DNA enrichment tools for metagenomic whole genome sequencing.

PubMed

Thoendel, Matthew; Jeraldo, Patricio R; Greenwood-Quaintance, Kerryl E; Yao, Janet Z; Chia, Nicholas; Hanssen, Arlen D; Abdel, Matthew P; Patel, Robin

2016-08-01

Metagenomic whole genome sequencing for detection of pathogens in clinical samples is an exciting new area for discovery and clinical testing. A major barrier to this approach is the overwhelming ratio of human to pathogen DNA in samples with low pathogen abundance, which is typical of most clinical specimens. Microbial DNA enrichment methods offer the potential to relieve this limitation by improving this ratio. Two commercially available enrichment kits, the NEBNext Microbiome DNA Enrichment Kit and the Molzym MolYsis Basic kit, were tested for their ability to enrich for microbial DNA from resected arthroplasty component sonicate fluids from prosthetic joint infections or uninfected sonicate fluids spiked with Staphylococcus aureus. Using spiked uninfected sonicate fluid there was a 6-fold enrichment of bacterial DNA with the NEBNext kit and 76-fold enrichment with the MolYsis kit. Metagenomic whole genome sequencing of sonicate fluid revealed 13- to 85-fold enrichment of bacterial DNA using the NEBNext enrichment kit. The MolYsis approach achieved 481- to 9580-fold enrichment, resulting in 7 to 59% of sequencing reads being from the pathogens known to be present in the samples. These results demonstrate the usefulness of these tools when testing clinical samples with low microbial burden using next generation sequencing. Copyright © 2016 Elsevier B.V. All rights reserved.
Digital signal processing methods for biosequence comparison.

PubMed Central

Benson, D C

1990-01-01

A method is discussed for DNA or protein sequence comparison using a finite field fast Fourier transform, a digital signal processing technique; and statistical methods are discussed for analyzing the output of this algorithm. This method compares two sequences of length N in computing time proportional to N log N, compared to N^2 for methods currently used. This method makes it feasible to compare very long sequences. An example is given to show that the method correctly identifies sites of known homology. PMID:2349096
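To make the N log N claim concrete, the following is a minimal sketch of transform-based sequence comparison, not a reproduction of the paper's algorithm. It assumes a number-theoretic transform modulo the prime 998244353 (one possible finite field; the abstract does not name one) and counts, for every relative shift, how many positions of two DNA sequences carry the same base. The names `ntt` and `match_counts` are illustrative.

```python
MOD = 998244353   # assumed prime of the form c * 2^k + 1
ROOT = 3          # a primitive root modulo MOD

def ntt(a, invert=False):
    """In-place number-theoretic transform; len(a) must be a power of two."""
    n = len(a)
    j = 0
    for i in range(1, n):              # bit-reversal permutation
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j ^= bit
        if i < j:
            a[i], a[j] = a[j], a[i]
    length = 2
    while length <= n:                 # butterfly passes
        w_len = pow(ROOT, (MOD - 1) // length, MOD)
        if invert:
            w_len = pow(w_len, MOD - 2, MOD)
        half = length // 2
        for start in range(0, n, length):
            w = 1
            for k in range(start, start + half):
                u, v = a[k], a[k + half] * w % MOD
                a[k] = (u + v) % MOD
                a[k + half] = (u - v) % MOD
                w = w * w_len % MOD
        length <<= 1
    if invert:
        n_inv = pow(n, MOD - 2, MOD)
        for i in range(n):
            a[i] = a[i] * n_inv % MOD
    return a

def match_counts(s, t):
    """Map each shift d to the number of j with s[j + d] == t[j]."""
    n, m = len(s), len(t)
    size = 1
    while size < n + m - 1:
        size <<= 1
    total = [0] * size
    for base in "ACGT":
        # Indicator vectors; t is reversed so that convolution becomes correlation.
        a = [1 if ch == base else 0 for ch in s] + [0] * (size - n)
        r = [1 if ch == base else 0 for ch in reversed(t)] + [0] * (size - m)
        ntt(a)
        ntt(r)
        prod = [x * y % MOD for x, y in zip(a, r)]
        ntt(prod, invert=True)
        for p in range(size):
            total[p] += prod[p]
    # Index p of the convolution corresponds to shift d = p - (m - 1).
    return {p - (m - 1): total[p] for p in range(n + m - 1)}
```

For example, `match_counts("ACGTACGT", "ACGT")` reports 4 matches at shifts 0 and 4, where the shorter sequence lines up exactly. Because the arithmetic is done in a finite field, the counts are exact integers with no floating-point rounding.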
ABI Base Recall: Automatic Correction and Ends Trimming of DNA Sequences.

PubMed

Elyazghi, Zakaria; Yazouli, Loubna El; Sadki, Khalid; Radouani, Fouzia

2017-12-01

Automated DNA sequencers produce chromatogram files in ABI format. When viewing chromatograms, some ambiguities are shown at various sites along the DNA sequences, because the base-calling program implemented in the sequencing machine cannot always determine the right nucleotide precisely, especially when it is represented by either a broad peak or a set of overlapping peaks. In such cases, a letter other than A, C, G, or T is recorded, most commonly N. Thus, DNA sequencing chromatograms need manual examination: checking for mis-calls and truncating the sequence when errors become too frequent. The purpose of this paper is to develop a program allowing the automatic correction of these ambiguities. This application is a Web-based program powered by Shiny that runs on the R platform for ease of use. As part of the interface, we added automatic end clipping, alignment against reference sequences, and BLAST. To develop and test our tool, we collected several bacterial DNA sequences from different laboratories within Institut Pasteur du Maroc and performed both manual and automatic correction, then compared the two methods. We found that our program, ABI Base Recall, corrects ambiguities with high accuracy. It increases the rate of identity and coverage and minimizes the number of mismatches and gaps; hence it provides a solution to sequencing ambiguities and saves biologists' time and labor.
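The end-clipping step mentioned above can be illustrated with a short sketch. The abstract does not describe the rule the tool applies, so the windowed criterion, the function name `trim_ends`, and the default thresholds below are all assumptions chosen for illustration; the published tool itself is written in R, while this sketch uses Python.

```python
# IUPAC ambiguity codes, i.e. any letter other than A, C, G, T.
AMBIGUOUS = set("NRYKMSWBDHV")

def trim_ends(seq, window=10, max_ambiguous=1):
    """Clip both ends of a base-called read (hypothetical rule).

    Each end is clipped one base at a time until the window at that end
    holds at most max_ambiguous ambiguity codes; any ambiguity codes left
    at the very ends are then removed as well.
    """
    seq = seq.upper()

    def too_noisy(win):
        return sum(ch in AMBIGUOUS for ch in win) > max_ambiguous

    start, end = 0, len(seq)
    while end - start >= window and too_noisy(seq[start:start + window]):
        start += 1
    while end - start >= window and too_noisy(seq[end - window:end]):
        end -= 1
    while start < end and seq[start] in AMBIGUOUS:
        start += 1
    while end > start and seq[end - 1] in AMBIGUOUS:
        end -= 1
    return seq[start:end]
```

With `window=5` and `max_ambiguous=0`, the read `NNNNACGTACGTACGTACGTNNANN` is reduced to its clean 16-base core. Interior ambiguities are left untouched; resolving them would require the chromatogram peak data, which this sketch does not model.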
Purification of High Molecular Weight Genomic DNA from Powdery Mildew for Long-Read Sequencing.

PubMed

Feehan, Joanna M; Scheibel, Katherine E; Bourras, Salim; Underwood, William; Keller, Beat; Somerville, Shauna C

2017-03-31

The powdery mildew fungi are a group of economically important fungal plant pathogens. Relatively little is known about the molecular biology and genetics of these pathogens, in part due to a lack of well-developed genetic and genomic resources. These organisms have large, repetitive genomes, which have made genome sequencing and assembly prohibitively difficult. Here, we describe methods for the collection, extraction, purification and quality control assessment of high molecular weight genomic DNA from one powdery mildew species, Golovinomyces cichoracearum. The protocol described includes mechanical disruption of spores followed by an optimized phenol/chloroform genomic DNA extraction. A typical yield was 7 µg DNA per 150 mg conidia. The genomic DNA that is isolated using this procedure is suitable for long-read sequencing (i.e., > 48.5 kbp). Quality control measures to ensure the size, yield, and purity of the genomic DNA are also described in this method. Sequencing of the genomic DNA of the quality described here will allow for the assembly and comparison of multiple powdery mildew genomes, which in turn will lead to a better understanding and improved control of this agricultural pathogen.
Effects of nucleoside analog incorporation on DNA binding to the DNA binding domain of the GATA-1 erythroid transcription factor.

PubMed

Foti, M; Omichinski, J G; Stahl, S; Maloney, D; West, J; Schweitzer, B I

1999-02-05

We investigate here the effects of the incorporation of the nucleoside analogs araC (1-beta-D-arabinofuranosylcytosine) and ganciclovir (9-[(1,3-dihydroxy-2-propoxy)methyl] guanine) into the DNA binding recognition sequence for the GATA-1 erythroid transcription factor. A 10-fold decrease in binding affinity was observed for the ganciclovir-substituted DNA complex in comparison to an unmodified DNA of the same sequence composition. AraC substitution did not result in any changes in binding affinity. 1H-15N HSQC and NOESY NMR experiments revealed a number of chemical shift changes in both DNA and protein in the ganciclovir-modified DNA-protein complex when compared to the unmodified DNA-protein complex. These changes in chemical shift and binding affinity suggest a change in the binding mode of the complex when ganciclovir is incorporated into the GATA DNA binding site.
Biomolecule Sequencer: Next-Generation DNA Sequencing Technology for In-Flight Environmental Monitoring, Research, and Beyond

NASA Technical Reports Server (NTRS)

Smith, David J.; Burton, Aaron; Castro-Wallace, Sarah; John, Kristen; Stahl, Sarah E.; Dworkin, Jason Peter; Lupisella, Mark L.

2016-01-01

On the International Space Station (ISS), technologies capable of rapid microbial identification and disease diagnostics are not currently available. NASA still relies upon sample return for comprehensive, molecular-based sample characterization. Next-generation DNA sequencing is a powerful approach for identifying microorganisms in air, water, and surfaces onboard spacecraft. The Biomolecule Sequencer payload, manifested to SpaceX-9 and scheduled on the Increment 4748 research plan (June 2016), will assess the functionality of a commercially-available next-generation DNA sequencer in the microgravity environment of ISS. The MinION device from Oxford Nanopore Technologies (Oxford, UK) measures picoamp changes in electrical current dependent on nucleotide sequences of the DNA strand migrating through nanopores in the system. The hardware is exceptionally small (9.5 x 3.2 x 1.6 cm), lightweight (120 grams), and powered only by a USB connection. For the ISS technology demonstration, the Biomolecule Sequencer will be powered by a Microsoft Surface Pro3. Ground-prepared samples containing lambda bacteriophage, Escherichia coli, and mouse genomic DNA, will be launched and stored frozen on the ISS until experiment initiation. Immediately prior to sequencing, a crew member will collect and thaw frozen DNA samples, connect the sequencer to the Surface Pro3, inject thawed samples into a MinION flow cell, and initiate sequencing. At the completion of the sequencing run, data will be downlinked for ground analysis. Identical, synchronous ground controls will be used for data comparisons to determine sequencer functionality, run-time sequence, current dynamics, and overall accuracy. 
We will present our latest results from the ISS flight experiment, the first time DNA has ever been sequenced in space, and discuss the many potential applications of the Biomolecule Sequencer for environmental monitoring, medical diagnostics, higher fidelity and more adaptable Space Biology Human Research Program investigations, and even life detection experiments for astrobiology missions.

Promoter Sequences Prediction Using Relational Association Rule Mining

PubMed Central

Czibula, Gabriela; Bocicor, Maria-Iuliana; Czibula, Istvan Gergely

2012-01-01

In this paper we approach, from a computational perspective, the problem of promoter sequence prediction, an important problem within the field of bioinformatics. As the conditions for a DNA sequence to function as a promoter are not known, machine learning based classification models are still being developed to approach the problem of promoter identification in DNA. We propose a classification model based on relational association rule mining. Relational association rules are a particular type of association rule and describe numerical orderings between attributes that commonly occur over a data set. Our classifier is based on the discovery of relational association rules for predicting whether or not a DNA sequence contains a promoter region. An experimental evaluation of the proposed model and a comparison with similar existing approaches are provided. The obtained results show that our classifier outperforms the existing techniques for identifying promoter sequences, confirming the potential of our proposal. PMID:22563233
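As an illustration of what a relational association rule looks like, here is a minimal sketch restricted to the simplest case: rules between two numeric attributes. It assumes records are equal-length tuples of numbers and keeps an ordering `attribute i < attribute j` (or `>`) when it holds in a sufficiently large fraction of records. The function name, the confidence threshold, and the numeric encoding of DNA are assumptions; the paper's actual mining procedure and classifier are not reproduced here.

```python
from itertools import combinations

def mine_binary_rules(records, min_confidence=0.9):
    """Return {(i, relation, j): confidence} for frequently holding orderings.

    records: list of equal-length tuples of numbers (hypothetical encoding).
    A rule (i, "<", j) means attribute i is smaller than attribute j in at
    least min_confidence of the records.
    """
    relations = (("<", lambda x, y: x < y), (">", lambda x, y: x > y))
    n_attrs = len(records[0])
    rules = {}
    for i, j in combinations(range(n_attrs), 2):
        for name, holds in relations:
            confidence = sum(holds(r[i], r[j]) for r in records) / len(records)
            if confidence >= min_confidence:
                rules[(i, name, j)] = confidence
    return rules
```

On the toy data `[(1, 2, 3), (2, 3, 1), (0, 5, 4)]` with the default threshold, only the rule "attribute 0 < attribute 1" survives, because it holds in all three records. A classifier in this spirit could mine one rule set per class and assign a new record to the class whose rules it satisfies most often, though that scoring scheme is also an assumption.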
Re-sequencing transgenic plants revealed rearrangements at T-DNA inserts, and integration of a short T-DNA fragment, but no increase of small mutations elsewhere.

PubMed

Schouten, Henk J; Vande Geest, Henri; Papadimitriou, Sofia; Bemer, Marian; Schaart, Jan G; Smulders, Marinus J M; Perez, Gabino Sanchez; Schijlen, Elio

2017-03-01

Transformation resulted in deletions and translocations at T-DNA inserts, but not in genome-wide small mutations. A tiny T-DNA splinter was detected that probably would remain undetected by conventional techniques. We investigated to what extent Agrobacterium tumefaciens-mediated transformation is mutagenic, on top of inserting T-DNA. To prevent mutations due to in vitro propagation, we applied floral dip transformation of Arabidopsis thaliana. We re-sequenced the genomes of five primary transformants, and compared these to genomic sequences derived from a pool of four wild-type plants. By genome-wide comparisons, we identified ten small mutations in the genomes of the five transgenic plants, not correlated to the positions or number of T-DNA inserts. This mutation frequency is within the range of spontaneous mutations occurring during seed propagation in A. thaliana, as determined earlier. In addition, we detected small as well as large deletions specifically at the T-DNA insert sites. Furthermore, we detected partial T-DNA inserts, one of these a tiny 50-bp fragment originating from a central part of the T-DNA construct used, inserted into the plant genome without other flanking T-DNA. Because of its small size, we named this fragment a T-DNA splinter. As far as we know, this is the first report of such a small T-DNA fragment insert in the absence of any T-DNA border sequence. Finally, we found evidence for translocations from other chromosomes, flanking T-DNA inserts. In this study, we showed that next-generation sequencing (NGS) is a highly sensitive approach to detect T-DNA inserts in transgenic plants.
Comparison of the ITS1 and ITS2 rDNA in Eimeria callospermophili (Apicomplexa: Eimeriidae) from Sciurid Rodents

PubMed Central

Motriuk-Smith, Dagmara; Seville, R Scott; Quealy, Leah; Oliver, Clinton E.

2011-01-01

The taxonomy of the coccidia has historically been morphologically based. The purpose of this study was to establish if conspecificity of isolates of Eimeria callospermophili from 4 ground-dwelling squirrel hosts (Rodentia: Sciuridae) is supported by comparison of rDNA sequence data and to examine how this species relates to eimerian species from other sciurid hosts. Eimeria callospermophili was isolated from 4 wild caught hosts, i.e., Urocitellus elegans, Cynomys leucurus, Marmota flaviventris, and Cynomys ludovicianus. The ITS1 and ITS2 genomic rDNA sequences were PCR generated, sequenced, and analyzed. The highest intraspecific pairwise distance values of 6.0% in ITS1 and 7.1% in ITS2 were observed in C. leucurus. Interspecific pairwise distance values greater than 5% do not support E. callospermophili conspecificity. Generated E. callospermophili sequences were compared to Eimeria lancasterensis from Sciurus niger and Sciurus niger cinereus, and Eimeria ontarioensis from S. niger. A single well-supported clade was formed by E. callospermophili amplicons in Neighbor Joining and Maximum Parsimony analyses. However, within the clade there was little evidence of host or geographic structuring of the species. PMID:21506777
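The pairwise distance values quoted above can be illustrated with the simplest such measure, the uncorrected p-distance: the fraction of compared alignment columns at which two sequences differ. The abstract does not state which distance model the authors used, so treating it as a p-distance, and skipping gapped columns, are assumptions of this sketch.

```python
def p_distance(a, b, gap="-"):
    """Uncorrected pairwise distance between two aligned sequences.

    Columns where either sequence has a gap are skipped (an assumed
    convention, often called pairwise deletion).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differences = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == gap or y == gap:
            continue
        compared += 1
        differences += x != y
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differences / compared
```

For instance, `p_distance("ACGTACGTAC", "ACGTTCGTAC")` is 0.1, i.e. 10%, which would exceed the 5% level mentioned in the abstract.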
Phylogenetic position of the pentastomida and [pan]crustacean relationships

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lavrov, Dennis V.; Brown, Wesley M.; Boore, Jeffrey L.

2004-01-31

Pentastomids are a small group of vermiform animals with unique morphology and parasitic lifestyle. They are generally recognized as being related to the Arthropoda; however, the nature of this relationship is controversial. We have determined the complete sequence of the mitochondrial DNA (mtDNA) of the pentastomid Armillifer armillatus and complete, or nearly complete, mtDNA sequences from representatives of four previously unsampled groups of Crustacea: Remipedia (Speleonectes tulumensis), Cephalocarida (Hutchinsoniella macracantha), Cirripedia (Pollicipes polymerus), and Branchiura (Argulus americanus). Analyses of the mtDNA gene arrangements and sequences determined in this study indicate unambiguously that pentastomids are a group of modified crustaceans likely related to branchiurans. In addition, gene arrangement comparisons strongly support an unforeseen assemblage of pentastomids with maxillopod and cephalocarid crustaceans, to the exclusion of remipedes, branchiopods, malacostracans and insects.
Construction and EST sequencing of full-length, drought stress cDNA libraries for common beans (Phaseolus vulgaris L.)

PubMed Central

2011-01-01

Background Common bean is an important legume crop with only a moderate number of short expressed sequence tags (ESTs) made with traditional methods. The goal of this research was to use full-length cDNA technology to develop ESTs that would overlap with the beginning of open reading frames and therefore be useful for gene annotation of genomic sequences. The library was also constructed to represent genes expressed under drought, low soil phosphorus and high soil aluminum toxicity. We also undertook comparisons of the full-length cDNA library to two previous non-full clone EST sets for common bean. Results Two full-length cDNA libraries were constructed: one for the drought tolerant Mesoamerican genotype BAT477 and the other one for the acid-soil tolerant Andean genotype G19833 which has been selected for genome sequencing. Plants were grown in three soil types using deep rooting cylinders subjected to drought and non-drought stress and tissues were collected from both roots and above ground parts. A total of 20,000 clones were selected robotically, half from each library. Then, nearly 10,000 clones from the G19833 library were sequenced with an average read length of 850 nucleotides. A total of 4,219 unigenes were identified consisting of 2,981 contigs and 1,238 singletons. These were functionally annotated with gene ontology terms and placed into KEGG pathways. Compared to other EST sequencing efforts in common bean, about half of the sequences were novel or represented the 5' ends of known genes. Conclusions The present full-length cDNA libraries add to the technological toolbox available for common bean and our sequencing of these clones substantially increases the number of unique EST sequences available for the common bean genome. All of this should be useful for both functional gene annotation, analysis of splice site variants and intron/exon boundary determination by comparison to soybean genes or with common bean whole-genome sequences. 
In addition the library has a large number of transcription factors and will be interesting for discovery and validation of drought or abiotic stress related genes in common bean. PMID:22118559
Phylogeography, intraspecific structure and sex-biased dispersal of Dall's porpoise, Phocoenoides dalli, revealed by mitochondrial and microsatellite DNA analyses.

PubMed

Escorza-Treviño, S; Dizon, A E

2000-08-01

Mitochondrial DNA (mtDNA) control-region sequences and microsatellite loci length polymorphisms were used to estimate phylogeographical patterns (historical patterns underlying contemporary distribution), intraspecific population structure and gender-biased dispersal of Phocoenoides dalli dalli across its entire range. One-hundred and thirteen animals from several geographical strata were sequenced over 379 bp of mtDNA, resulting in 58 mtDNA haplotypes. Analysis using F(ST) values (based on haplotype frequencies) and phi(ST) values (based on frequencies and genetic distances between haplotypes) yielded statistically significant separation (bootstrap values P < 0.05) among most of the stocks currently used for management purposes. A minimum spanning network of haplotypes showed two very distinctive clusters, differentially occupied by western and eastern populations, with some common widespread haplotypes. This suggests some degree of phyletic radiation from west to east, superimposed on gene flow. Highly male-biased migration was detected for several population comparisons. Nuclear microsatellite DNA markers (119 individuals and six loci) provided additional support for population subdivision and gender-biased dispersal detected in the mtDNA sequences. Analysis using F(ST) values (based on allelic frequencies) yielded statistically significant separation between some, but not all, populations distinguished by mtDNA analysis. R(ST) values (based on frequencies of and genetic distance between alleles) showed no statistically significant subdivision. Again, highly male-biased dispersal was detected for all population comparisons, suggesting, together with morphological and reproductive data, the existence of sexual selection. Our molecular results argue for nine distinct dalli-type populations that should be treated as separate units for management purposes.
Partial characterization of normal and Haemophilus influenzae-infected mucosal complementary DNA libraries in chinchilla middle ear mucosa.

PubMed

Kerschner, Joseph E; Erdos, Geza; Hu, Fen Ze; Burrows, Amy; Cioffi, Joseph; Khampang, Pawjai; Dahlgren, Margaret; Hayes, Jay; Keefe, Randy; Janto, Benjamin; Post, J Christopher; Ehrlich, Garth D

2010-04-01

We sought to construct and partially characterize complementary DNA (cDNA) libraries prepared from the middle ear mucosa (MEM) of chinchillas to better understand pathogenic aspects of infection and inflammation, particularly with respect to leukotriene biogenesis and response. Chinchilla MEM was harvested from controls and after middle ear inoculation with nontypeable Haemophilus influenzae. RNA was extracted to generate cDNA libraries. Randomly selected clones were subjected to sequence analysis to characterize the libraries and to provide DNA sequence for phylogenetic analyses. Reverse transcription-polymerase chain reaction of the RNA pools was used to generate cDNA sequences corresponding to genes associated with leukotriene biosynthesis and metabolism. Sequence analysis of 921 randomly selected clones from the uninfected MEM cDNA library produced approximately 250,000 nucleotides of almost entirely novel sequence data. Searches of the GenBank database with the Basic Local Alignment Search Tool provided for identification of 515 unique genes expressed in the MEM and not previously described in chinchillas. In almost all cases, the chinchilla cDNA sequences displayed much greater homology to human or other primate genes than with rodent species. Genes associated with leukotriene metabolism were present in both normal and infected MEM. Based on both phylogenetic comparisons and gene expression similarities with humans, chinchilla MEM appears to be an excellent model for the study of middle ear inflammation and infection. The higher degree of sequence similarity between chinchillas and humans compared to chinchillas and rodents was unexpected. The cDNA libraries from normal and infected chinchilla MEM will serve as useful molecular tools in the study of otitis media and should yield important information with respect to middle ear pathogenesis.
Partial Characterization of Normal and Haemophilus influenzae–Infected Mucosal Complementary DNA Libraries in Chinchilla Middle Ear Mucosa

PubMed Central

Kerschner, Joseph E.; Erdos, Geza; Hu, Fen Ze; Burrows, Amy; Cioffi, Joseph; Khampang, Pawjai; Dahlgren, Margaret; Hayes, Jay; Keefe, Randy; Janto, Benjamin; Post, J. Christopher; Ehrlich, Garth D.

2010-01-01

Objectives We sought to construct and partially characterize complementary DNA (cDNA) libraries prepared from the middle ear mucosa (MEM) of chinchillas to better understand pathogenic aspects of infection and inflammation, particularly with respect to leukotriene biogenesis and response. Methods Chinchilla MEM was harvested from controls and after middle ear inoculation with nontypeable Haemophilus influenzae. RNA was extracted to generate cDNA libraries. Randomly selected clones were subjected to sequence analysis to characterize the libraries and to provide DNA sequence for phylogenetic analyses. Reverse transcription–polymerase chain reaction of the RNA pools was used to generate cDNA sequences corresponding to genes associated with leukotriene biosynthesis and metabolism. Results Sequence analysis of 921 randomly selected clones from the uninfected MEM cDNA library produced approximately 250,000 nucleotides of almost entirely novel sequence data. Searches of the GenBank database with the Basic Local Alignment Search Tool provided for identification of 515 unique genes expressed in the MEM and not previously described in chinchillas. In almost all cases, the chinchilla cDNA sequences displayed much greater homology to human or other primate genes than with rodent species. Genes associated with leukotriene metabolism were present in both normal and infected MEM. Conclusions Based on both phylogenetic comparisons and gene expression similarities with humans, chinchilla MEM appears to be an excellent model for the study of middle ear inflammation and infection. The higher degree of sequence similarity between chinchillas and humans compared to chinchillas and rodents was unexpected. The cDNA libraries from normal and infected chinchilla MEM will serve as useful molecular tools in the study of otitis media and should yield important information with respect to middle ear pathogenesis. PMID:20433028
The twilight zone of cis element alignments.

PubMed

Sebastian, Alvaro; Contreras-Moreira, Bruno

2013-02-01

Sequence alignment of proteins and nucleic acids is a routine task in bioinformatics. Although the comparison of complete peptides, genes or genomes can be undertaken with a great variety of tools, the alignment of short DNA sequences and motifs entails pitfalls that have not been fully addressed yet. Here we confront the structural superposition of transcription factors with the sequence alignment of their recognized cis elements. Our goals are (i) to test TFcompare (http://floresta.eead.csic.es/tfcompare), a structural alignment method for protein-DNA complexes; (ii) to benchmark the pairwise alignment of regulatory elements; (iii) to define the confidence limits and the twilight zone of such alignments and (iv) to evaluate the relevance of these thresholds with elements obtained experimentally. We find that the structure of cis elements and protein-DNA interfaces is significantly more conserved than their sequence and measure how this correlates with alignment errors when only sequence information is considered. Our results confirm that DNA motifs in the form of matrices produce better alignments than individual sequences. Finally, we report that empirical and theoretically derived twilight thresholds are useful for estimating the natural plasticity of regulatory sequences, and hence for filtering out unreliable alignments.
The twilight zone of cis element alignments

PubMed Central

Sebastian, Alvaro; Contreras-Moreira, Bruno

2013-01-01

Sequence alignment of proteins and nucleic acids is a routine task in bioinformatics. Although the comparison of complete peptides, genes or genomes can be undertaken with a great variety of tools, the alignment of short DNA sequences and motifs entails pitfalls that have not been fully addressed yet. Here we confront the structural superposition of transcription factors with the sequence alignment of their recognized cis elements. Our goals are (i) to test TFcompare (http://floresta.eead.csic.es/tfcompare), a structural alignment method for protein–DNA complexes; (ii) to benchmark the pairwise alignment of regulatory elements; (iii) to define the confidence limits and the twilight zone of such alignments and (iv) to evaluate the relevance of these thresholds with elements obtained experimentally. We find that the structure of cis elements and protein–DNA interfaces is significantly more conserved than their sequence and measure how this correlates with alignment errors when only sequence information is considered. Our results confirm that DNA motifs in the form of matrices produce better alignments than individual sequences. Finally, we report that empirical and theoretically derived twilight thresholds are useful for estimating the natural plasticity of regulatory sequences, and hence for filtering out unreliable alignments. PMID:23268451
Comparative analysis of the complete sequence of the plastid genome of Parthenium argentatum and identification of DNA barcodes to differentiate Parthenium species and lines

PubMed Central

2009-01-01

Background Parthenium argentatum (guayule) is an industrial crop that produces latex, which was recently commercialized as a source of latex rubber safe for people with Type I latex allergy. The complete plastid genome of P. argentatum was sequenced. The sequence provides important information useful for genetic engineering strategies. Comparison to the sequences of plastid genomes from three other members of the Asteraceae, Lactuca sativa, Guizotia abyssinica and Helianthus annuus, revealed details of the evolution of the four genomes. Chloroplast-specific DNA barcodes were developed for identification of Parthenium species and lines. Results The complete plastid genome of P. argentatum is 152,803 bp. Based on the overall comparison of individual protein coding genes with those in L. sativa, G. abyssinica and H. annuus, we demonstrate that the P. argentatum chloroplast genome sequence is most closely related to that of H. annuus. Similar to chloroplast genomes in G. abyssinica, L. sativa and H. annuus, the plastid genome of P. argentatum has a large 23 kb inversion with a smaller 3.4 kb inversion within the large inversion. Using the matK and psbA-trnH spacer chloroplast DNA barcodes, three of the four Parthenium species tested, P. tomentosum, P. hysterophorus and P. schottii, can be differentiated from P. argentatum. In addition, we identified lines within P. argentatum. Conclusion The genome sequence of the P. argentatum chloroplast will enrich the sequence resources of plastid genomes in commercial crops. The availability of the complete plastid genome sequence may facilitate transformation efficiency by using the precise sequence of endogenous flanking sequences and regulatory elements in chloroplast transformation vectors. The DNA barcoding study forms the foundation for genetic identification of commercially significant lines of P. argentatum that are important for producing latex. PMID:19917140
Identification of gyrB and rpoB gene mutations and differentially expressed proteins between a novobiocin-resistant Aeromonas hydrophila catfish vaccine strain and its virulent parent strain

USDA-ARS's Scientific Manuscript database

Sequence comparison between the full-length 2412 bp DNA gyrase subunit B (gyrB) gene of a novobiocin resistant Aeromonas hydrophila AH11NOVO vaccine strain and that of its virulent parent strain AH11P revealed 10 missense mutations. Similarly, sequence comparison between the full-length 4092 bp RNA ...
Morphological description and DNA barcoding of Hydrobaenus majus sp. nov. (Diptera: Chironomidae: Orthocladiinae) from the Russian Far East.

PubMed

Makarchenko, Eugenyi A; Makarchenko, Marina A; Semenchenko, Alexander A

2015-08-14

Illustrated descriptions of adult male, pupa and fourth instar larva, as well as DNA barcoding, of Hydrobaenus majus sp. nov. in comparison with the closely related species H. sikhotealinensis Makarchenko et Makarchenko from the Russian Far East are provided. The species-specificity of H. majus sp. nov. COI sequences is analyzed and the sequences are presented as diagnostic characters (molecular markers) of H. majus and H. sikhotealinensis.
Cloning and restriction enzyme mapping of ribosomal DNA of Giardia duodenalis, Giardia ardeae and Giardia muris.

PubMed

van Keulen, H; Campbell, S R; Erlandsen, S L; Jarroll, E L

1991-06-01

In an attempt to study Giardia at the DNA sequence level, the rRNA genes of three species, Giardia duodenalis, Giardia ardeae and Giardia muris were cloned and restriction enzyme maps were constructed. The rDNA repeats of these Giardia show completely different restriction enzyme recognition patterns. The size of the rDNA repeat ranges from approximately 5.6 kb in G. duodenalis to 7.6 kb in both G. muris and G. ardeae. These size differences are mainly attributable to the variation in length of the spacer. Minor differences exist among these Giardia in the sizes of their small subunit rRNA and the internal transcribed spacer between small and large subunit rRNA. The genetic maps were constructed by sequence analysis of the DNA around the 5' and 3' ends of the mature rRNA genes and between the rRNA covering the 5.8S rRNA gene and internal transcribed spacer. Comparison of the 5.8S rDNA and 3' end of large subunit rDNA from these three Giardia species showed considerable sequence variation, but the rDNA sequences of G. duodenalis and G. ardeae appear more closely related to each other than to G. muris.
Development and validation of a D-loop mtDNA SNP assay for the screening of specimens in forensic casework.

PubMed

Chemale, Gustavo; Paneto, Greiciane Gaburro; Menezes, Meiga Aurea Mendes; de Freitas, Jorge Marcelo; Jacques, Guilherme Silveira; Cicarelli, Regina Maria Barretto; Fagundes, Paulo Roberto

2013-05-01

Mitochondrial DNA (mtDNA) analysis is usually a last resort in routine forensic DNA casework. However, it has become a powerful tool for the analysis of highly degraded samples or samples containing too little or no nuclear DNA, such as old bones and hair shafts. The gold standard methodology still constitutes the direct sequencing of polymerase chain reaction (PCR) products or cloned amplicons from the HVS-1 and HVS-2 (hypervariable segment) control region segments. Identifications using mtDNA are time consuming, expensive and can be very complex, depending on the amount and nature of the material being tested. The main goal of this work is to develop a less labour-intensive and less expensive screening method for mtDNA analysis, in order to aid in the exclusion of non-matching samples and as a presumptive test prior to final confirmatory DNA sequencing. We have selected 14 highly discriminatory single nucleotide polymorphisms (SNPs) based on simulations performed by Salas and Amigo (2010) to be typed using SNaPShot(TM) (Applied Biosystems, Foster City, CA, USA). The assay was validated by typing more than 100 HVS-1/HVS-2 sequenced samples. No differences were observed between the SNP typing and DNA sequencing when results were compared, with the exception of allelic dropouts observed in a few haplotypes. Haplotype diversity simulations were performed using 172 mtDNA sequences representative of the Brazilian population and a score of 0.9794 was obtained when the 14 SNPs were used, showing that the theoretical prediction approach for the selection of highly discriminatory SNPs suggested by Salas and Amigo (2010) was confirmed in the population studied. As the main goal of the work is to develop a screening assay to skip the sequencing of all samples in a particular case, a pair-wise comparison of the sequences was done using the selected SNPs. 
When both HVS-1/HVS-2 SNPs were used for simulations, at least two differences were observed in 93.2% of the comparisons performed. The assay was validated with casework samples. Results show that the method is straightforward and can be used for exclusionary purposes, saving time and laboratory resources. The assay confirms the theoretical prediction suggested by Salas and Amigo (2010). All forensic advantages, such as high sensitivity and power of discrimination, as well as the disadvantages, such as the occurrence of allele dropouts, are discussed throughout the article. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
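The two summary statistics in this abstract (a haplotype diversity score and the share of pairwise comparisons showing at least two differences) can be sketched in a few lines. This is a minimal illustration, assuming the reported score is Nei's unbiased haplotype diversity, which the abstract does not state; the function names and the four-SNP haplotypes are hypothetical and are not data from the study.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity: n/(n-1) * (1 - sum(p_i^2))."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two haplotypes")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


def fraction_pairs_differing(haplotypes, min_diffs=2):
    """Share of sample pairs that differ at `min_diffs` or more SNP positions."""
    pairs = list(combinations(haplotypes, 2))
    hits = sum(
        1
        for a, b in pairs
        if sum(x != y for x, y in zip(a, b)) >= min_diffs
    )
    return hits / len(pairs)


# Hypothetical 4-SNP haplotypes (illustration only, not study data)
samples = ["ACGT", "ACGT", "ATGT", "GTGA"]
print(haplotype_diversity(samples))
print(fraction_pairs_differing(samples, min_diffs=2))
```

In a screening setting, a pair of samples that differs at two or more typed SNPs would be a candidate for exclusion without full sequencing; the threshold of two is taken from the abstract, while everything else here is an assumption.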
Inferring coarse-grain histone-DNA interaction potentials from high-resolution structures of the nucleosome

NASA Astrophysics Data System (ADS)

Meyer, Sam; Everaers, Ralf

2015-02-01

The histone-DNA interaction in the nucleosome is a fundamental mechanism of genomic compaction and regulation, which remains largely unknown despite increasing structural knowledge of the complex. In this paper, we propose a framework for the extraction of a nanoscale histone-DNA force-field from a collection of high-resolution structures, which may be adapted to a larger class of protein-DNA complexes. We applied the procedure to a large crystallographic database extended by snapshots from molecular dynamics simulations. The comparison of the structural models first shows that, at histone-DNA contact sites, the DNA base-pairs are shifted outwards locally, consistent with locally repulsive forces exerted by the histones. The second step shows that the various force profiles of the structures under analysis derive locally from a unique, sequence-independent, quadratic repulsive force-field, while the sequence preferences are entirely due to internal DNA mechanics. We have thus obtained the first knowledge-derived nanoscale interaction potential for histone-DNA in the nucleosome. The conformations obtained by relaxation of nucleosomal DNA with high-affinity sequences in this potential accurately reproduce the experimental values of binding preferences. Finally we address the more generic binding mechanisms relevant to the 80% of genomic sequences incorporated in nucleosomes, by computing the conformation of nucleosomal DNA with sequence-averaged properties. This conformation differs from those found in crystals, and the analysis suggests that repulsive histone forces are related to local stretch tension in nucleosomal DNA, mostly between adjacent contact points. This tension could play a role in the stability of the complex.
Triazole-linked DNA as a primer surrogate in the synthesis of first-strand cDNA.

PubMed

Fujino, Tomoko; Yasumoto, Ken-ichi; Yamazaki, Naomi; Hasome, Ai; Sogawa, Kazuhiro; Isobe, Hiroyuki

2011-11-04

A phosphate-eliminated nonnatural oligonucleotide serves as a primer surrogate in reverse transcription reaction of mRNA. Despite the nonnatural triazole linkages in the surrogate, the reverse transcriptase effectively elongated cDNA sequences on the 3'-downstream of the primer by transcription of the complementary sequence of mRNA. A structure-activity comparison with the reference natural oligonucleotides shows the superior priming activity of the surrogate containing triazole-linkages. The nonnatural linkages also protect the transcribed cDNA from digestion reactions with 5'-exonuclease and enable us to remove noise transcripts of unknown origins. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Mesoscopic modeling of DNA denaturation rates: Sequence dependence and experimental comparison

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dahlen, Oda, E-mail: oda.dahlen@ntnu.no; Erp, Titus S. van, E-mail: titus.van.erp@ntnu.no

Using rare event simulation techniques, we calculated DNA denaturation rate constants for a range of sequences and temperatures for the Peyrard-Bishop-Dauxois (PBD) model with two different parameter sets. We studied a larger variety of sequences compared to previous studies that only consider DNA homopolymers and DNA sequences containing an equal amount of weak AT- and strong GC-base pairs. Our results show that, contrary to previous findings, an even distribution of the strong GC-base pairs does not always result in the fastest possible denaturation. In addition, we applied an adaptation of the PBD model to study hairpin denaturation for which experimental data are available. This is the first quantitative study in which dynamical results from the mesoscopic PBD model have been compared with experiments. Our results show that present parameterized models, although giving good results regarding thermodynamic properties, overestimate denaturation rates by orders of magnitude. We believe that our dynamical approach is, therefore, an important tool for verifying DNA models and for developing next generation models that have higher predictive power than present ones.
Nonneutral mitochondrial DNA variation in humans and chimpanzees

DOE Office of Scientific and Technical Information (OSTI.GOV)

Nachman, M.W.; Aquadro, C.F.; Brown, W.M.

1996-03-01

We sequenced the NADH dehydrogenase subunit 3 (ND3) gene from a sample of 61 humans, five common chimpanzees, and one gorilla to test whether patterns of mitochondrial DNA (mtDNA) variation are consistent with a neutral model of molecular evolution. Within humans and within chimpanzees, the ratio of replacement to silent nucleotide substitutions was higher than observed in comparisons between species, contrary to neutral expectations. To test the generality of this result, we reanalyzed published human RFLP data from the entire mitochondrial genome. Gains of restriction sites relative to a known human mtDNA sequence were used to infer unambiguous nucleotide substitutions. We also compared the complete mtDNA sequences of three humans. Both the RFLP data and the sequence data reveal a higher ratio of replacement to silent nucleotide substitutions within humans than is seen between species. This pattern is observed at most or all human mitochondrial genes and is inconsistent with a strictly neutral model. These data suggest that many mitochondrial protein polymorphisms are slightly deleterious, consistent with studies of human mitochondrial diseases. 59 refs., 2 figs., 8 tabs.
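The comparison described here, replacement-to-silent ratios within species versus between species, can be illustrated with a small calculation. The sketch below uses the neutrality index, a commonly used summary of such a two-by-two comparison; the abstract does not name the statistic the authors used, and the function names and counts are hypothetical.

```python
def substitution_ratio(replacement, silent):
    """Ratio of replacement (amino-acid changing) to silent substitutions."""
    if silent == 0:
        raise ValueError("silent count must be non-zero")
    return replacement / silent


def neutrality_index(poly_rep, poly_sil, fixed_rep, fixed_sil):
    """(within-species ratio) / (between-species ratio).

    Under strict neutrality the two ratios are expected to be similar
    (index near 1); an index above 1 indicates an excess of replacement
    polymorphism within species.
    """
    between = substitution_ratio(fixed_rep, fixed_sil)
    if between == 0:
        raise ValueError("between-species replacement count must be non-zero")
    return substitution_ratio(poly_rep, poly_sil) / between


# Hypothetical counts (illustration only, not the ND3 data)
print(neutrality_index(poly_rep=6, poly_sil=12, fixed_rep=4, fixed_sil=32))
```

An index well above 1, as in this made-up example, is the pattern the abstract describes as contrary to neutral expectations. A real analysis would also attach a significance test to the counts.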
Primary structure of prostaglandin G/H synthase from sheep vesicular gland determined from the complementary DNA sequence.

PubMed Central

DeWitt, D L; Smith, W L

1988-01-01

Prostaglandin G/H synthase (8,11,14-icosatrienoate, hydrogen-donor:oxygen oxidoreductase, EC 1.14.99.1) catalyzes the first step in the formation of prostaglandins and thromboxanes, the conversion of arachidonic acid to prostaglandin endoperoxides G and H. This enzyme is the site of action of nonsteroidal anti-inflammatory drugs. We have isolated a 2.7-kilobase complementary DNA (cDNA) encompassing the entire coding region of prostaglandin G/H synthase from sheep vesicular glands. This cDNA, cloned from a lambda gt 10 library prepared from poly(A)+ RNA of vesicular glands, hybridizes with a single 2.75-kilobase mRNA species. The cDNA clone was selected using oligonucleotide probes modeled from amino acid sequences of tryptic peptides prepared from the purified enzyme. The full-length cDNA encodes a protein of 600 amino acids, including a signal sequence of 24 amino acids. Identification of the cDNA as coding for prostaglandin G/H synthase is based on comparison of amino acid sequences of seven peptides comprising 103 amino acids with the amino acid sequence deduced from the nucleotide sequence of the cDNA. The molecular weight of the unglycosylated enzyme lacking the signal peptide is 65,621. The synthase is a glycoprotein, and there are three potential sites for N-glycosylation, two of them in the amino-terminal half of the molecule. The serine reported to be acetylated by aspirin is at position 530, near the carboxyl terminus. There is no significant similarity between the sequence of the synthase and that of any other protein in amino acid or nucleotide sequence libraries, and a heme binding site(s) is not apparent from the amino acid sequence. The availability of a full-length cDNA clone coding for prostaglandin G/H synthase should facilitate studies of the regulation of expression of this enzyme and the structural features important for catalysis and for interaction with anti-inflammatory drugs. PMID:3125548

Intestinal flora of FAP patients containing APC-like sequences.

PubMed

Hainova, K; Adamcikova, Z; Ciernikova, S; Stevurkova, V; Tyciakova, S; Zajac, V

2014-01-01

Colorectal cancer is one of the most common causes of cancer-related mortality. Multiple risk factors are associated with colorectal cancer, including hereditary, environmental and inflammatory syndromes affecting the gastrointestinal tract. Familial adenomatous polyposis (FAP) is characterized by the emergence of hundreds to thousands of colorectal adenomatous polyps, and FAP syndrome is caused by mutations within the adenomatous polyposis coli (APC) tumor suppressor gene. We analyzed 21 rectal bacterial subclones isolated from FAP patient 41-1, with a confirmed 5-bp ACAAA deletion within codons 1060-1063, for the presence of APC-like sequences in exon 15, the longest exon. The studied section was defined by primers 15Efor-15Erev, which corresponds to the mutation cluster region (MCR), in which 75% of all APC germline mutations were detected. More than 90% homology was shown by sequencing and subsequent software comparison. The expression of APC-like sequences was demonstrated by Western blot analysis using monoclonal and polyclonal antibodies against APC protein. To study the missing link between the DNA analysis (PCR, DNA sequencing) and the protein expression experiments (Western blotting), we analyzed bacterial transcripts containing the 15Efor-15Erev sequence of the APC gene by reverse transcription-PCR, which indicated that an APC gene-derived fragment may be produced. We observed 97-100% homology after computer comparison of cDNA PCR products. Our results suggest that the presence of APC-like sequences in intestinal/rectal bacteria is an enrichment of bacterial genetic information in which horizontal gene transfer between humans and microflora plays an important role.
The Status, Quality, and Expansion of the NIH Full-Length cDNA Project: The Mammalian Gene Collection (MGC)

PubMed Central

2004-01-01

The National Institutes of Health's Mammalian Gene Collection (MGC) project was designed to generate and sequence a publicly accessible cDNA resource containing a complete open reading frame (ORF) for every human and mouse gene. The project initially used a random strategy to select clones from a large number of cDNA libraries from diverse tissues. Candidate clones were chosen based on 5′-EST sequences, and then fully sequenced to high accuracy and analyzed by algorithms developed for this project. Currently, more than 11,000 human and 10,000 mouse genes are represented in MGC by at least one clone with a full ORF. The random selection approach is now reaching a saturation point, and a transition to protocols targeted at the missing transcripts is now required to complete the mouse and human collections. Comparison of the sequence of the MGC clones to reference genome sequences reveals that most cDNA clones are of very high sequence quality, although it is likely that some cDNAs may carry missense variants as a consequence of experimental artifact, such as PCR, cloning, or reverse transcriptase errors. Recently, a rat cDNA component was added to the project, and ongoing frog (Xenopus) and zebrafish (Danio) cDNA projects were expanded to take advantage of the high-throughput MGC pipeline. PMID:15489334
GATA: A graphic alignment tool for comparative sequence analysis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Nix, David A.; Eisen, Michael B.

2005-01-01

Several problems exist with current methods used to align DNA sequences for comparative sequence analysis. Most dynamic programming algorithms assume that conserved sequence elements are collinear. This assumption appears valid when comparing orthologous protein coding sequences. Functional constraints on proteins provide strong selective pressure against sequence inversions, and minimize sequence duplications and feature shuffling. For non-coding sequences this collinearity assumption is often invalid. For example, enhancers contain clusters of transcription factor binding sites that change in number, orientation, and spacing during evolution yet the enhancer retains its activity. Dotplot analysis is often used to estimate non-coding sequence relatedness. Yet dot plots do not actually align sequences and thus cannot account well for base insertions or deletions. Moreover, they lack an adequate statistical framework for comparing sequence relatedness and are limited to pairwise comparisons. Lastly, dot plots and dynamic programming text outputs fail to provide an intuitive means for visualizing DNA alignments.
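To make the dot plot idea concrete, the sketch below records every position pair at which a short word in one sequence matches a word in the other, on either strand. It is an illustration of plain dot plot matching, not of the GATA tool itself; the function names, the word length and the example sequences are assumptions.

```python
def reverse_complement(seq):
    """Reverse complement of an uppercase A/C/G/T string."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[b] for b in reversed(seq))


def dot_matches(seq_a, seq_b, word=4):
    """Return (forward, inverted) lists of (i, j) word-start matches.

    forward:  seq_a[i:i+word] equals seq_b[j:j+word]
    inverted: the reverse complement of seq_a[i:i+word] equals seq_b[j:j+word]
    """
    index = {}
    for j in range(len(seq_b) - word + 1):
        index.setdefault(seq_b[j:j + word], []).append(j)
    forward, inverted = [], []
    for i in range(len(seq_a) - word + 1):
        w = seq_a[i:i + word]
        forward.extend((i, j) for j in index.get(w, []))
        inverted.extend((i, j) for j in index.get(reverse_complement(w), []))
    return forward, inverted


# Hypothetical example: seq_b is the reverse complement of seq_a
fwd, inv = dot_matches("GATTACA", "TGTAATC", word=4)
print(fwd)  # no same-strand matches
print(inv)  # an anti-diagonal of inverted matches
```

The inverted matches fall on an anti-diagonal, which is how an inversion shows up in a dot plot. A collinear dynamic programming alignment of the same two sequences would not recover that relationship, which is the limitation the abstract points to.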
Length Variation, Heteroplasmy and Sequence Divergence in the Mitochondrial DNA of Four Species of Sturgeon (Acipenser)

PubMed Central

Brown, J. R.; Beckenbach, K.; Beckenbach, A. T.; Smith, M. J.

1996-01-01

The extent of mtDNA length variation and heteroplasmy as well as DNA sequences of the control region and two tRNA genes were determined for four North American sturgeon species: Acipenser transmontanus, A. medirostris, A. fulvescens and A. oxyrhynchus. Across the Continental Divide, a division in the occurrence of length variation and heteroplasmy was observed that was concordant with species biogeography as well as with phylogenies inferred from restriction fragment length polymorphisms (RFLP) of whole mtDNA and pairwise comparisons of unique sequences of the control region. In all species, mtDNA length variation was due to repeated arrays of 78-82-bp sequences each containing a D-loop strand synthesis termination associated sequence (TAS). Individual repeats showed greater sequence conservation within individuals and species rather than between species, which is suggestive of concerted evolution. Differences in the frequencies of multiple copy genomes and heteroplasmy among the four species may be ascribed to differences in the rates of recurrent mutation. A mechanism that may offset the high rate of mutation for increased copy number is suggested on the basis that an increase in the number of functional TAS motifs might reduce the frequency of successfully initiated H-strand replications. PMID:8852850
Phylogeny and taxonomy of the genus Gliocephalotrichum.

PubMed

Lombard, L; Serrato-Diaz, L M; Cheewangkoon, R; French-Monar, R D; Decock, C; Crous, P W

2014-06-01

Species in the genus Gliocephalotrichum (= Leuconectria) (Hypocreales, Nectriaceae) are soilborne fungi, associated with post-harvest fruit spoilage of several important tropical fruit crops. Contemporary taxonomic studies of these fungi have relied on morphology and DNA sequence comparisons of the internal transcribed spacer region of the nuclear rDNA (ITS) and the β-tubulin gene regions. Employing DNA sequence data from four loci (β-tubulin, histone H3, ITS, and translation elongation factor 1-alpha) and morphological comparisons, the taxonomic status of the genus Gliocephalotrichum was re-evaluated. As a result five species are newly described, namely G. humicola (Taiwan, soil), G. mexicanum (rambutan fruit from Mexico), G. nephelii (rambutan fruit from Guatemala), G. queenslandicum (Australia, endophytic isolations) and G. simmonsii (rambutan fruit from Guatemala). Although species of Gliocephalotrichum are generally not regarded as important plant pathogens, their ability to cause post-harvest fruit rot could have an impact on fruit export and storage.
Vander Lugt correlation of DNA sequence data

NASA Astrophysics Data System (ADS)

Christens-Barry, William A.; Hawk, James F.; Martin, James C.

1990-12-01

DNA, the molecule containing the genetic code of an organism, is a linear chain of subunits. It is the sequence of subunits, of which there are four kinds, that constitutes the unique blueprint of an individual. This sequence is the focus of a large number of analyses performed by an army of geneticists, biologists, and computer scientists. Most of these analyses entail searches for specific subsequences within the larger set of sequence data. Thus, most analyses are essentially pattern recognition or correlation tasks. Yet, there are special features to such analysis that influence the strategy and methods of an optical pattern recognition approach. While the serial processing employed in digital electronic computers remains the main engine of sequence analyses, there is no fundamental reason that more efficient parallel methods cannot be used. We describe an approach using optical pattern recognition (OPR) techniques based on matched spatial filtering. This allows parallel comparison of large blocks of sequence data. In this study we have simulated a Vander Lugt [1] architecture implementing our approach. Searches for specific target sequence strings within a block of DNA sequence from the ColE1 plasmid [2] are performed.
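The view of subsequence search as a correlation task can be shown with a digital stand-in for the matched filter. The sketch below encodes each base as its own 0/1 indicator channel, correlates the target against the block channel by channel, and sums the results; a score equal to the target length marks an exact match. This is an illustrative serial computation, not the optical architecture simulated in the study, and the function names and sequences are hypothetical.

```python
def correlation_scores(block, target):
    """scores[k] = number of positions at which `target` matches `block`
    when the target is placed at offset k.

    Computed as the sum over the four bases of the cross-correlation of
    the 0/1 indicator sequences for that base.
    """
    n, m = len(block), len(target)
    if m == 0 or m > n:
        raise ValueError("target must be non-empty and no longer than block")
    scores = [0] * (n - m + 1)
    for base in "ACGT":
        b = [1 if x == base else 0 for x in block]
        t = [1 if x == base else 0 for x in target]
        for k in range(n - m + 1):
            scores[k] += sum(b[k + i] * t[i] for i in range(m))
    return scores


def find_target(block, target):
    """Offsets at which the correlation peak equals the target length."""
    scores = correlation_scores(block, target)
    return [k for k, s in enumerate(scores) if s == len(target)]


# Hypothetical sequence block and target string
print(find_target("TTGACGTTGACA", "GAC"))
```

Lowering the peak threshold below the target length would report near matches as well. In an optical correlator all offsets are evaluated simultaneously, whereas this loop visits them one at a time.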
DNA sequence similarity recognition by hybridization to short oligomers

DOEpatents

Milosavljevic, Aleksandar

1999-01-01

Methods are disclosed for the comparison of nucleic acid sequences. Data is generated by hybridizing sets of oligomers with target nucleic acids. The data thus generated is manipulated simultaneously with respect to both (i) matching between oligomers and (ii) matching between oligomers and putative reference sequences available in databases. Using data compression methods to manipulate this mutual information, sequences for the target can be constructed.
Improved efficiency in amplification of Escherichia coli O-antigen gene clusters using genome-wide sequence comparison

USDA-ARS's Scientific Manuscript database

Background: In many bacteria including E. coli, genes encoding O-antigens are clustered in the chromosome, with a 39-bp JUMPstart sequence and gnd gene located upstream and downstream of the cluster, respectively. For determining the DNA sequence of the E. coli O-antigen gene cluster, one set of P...
Clustered regularly interspaced short palindromic repeats (CRISPRs) for the genotyping of bacterial pathogens.

PubMed

Grissa, Ibtissem; Vergnaud, Gilles; Pourcel, Christine

2009-01-01

Clustered regularly interspaced short palindromic repeats (CRISPRs) are DNA sequences composed of a succession of repeats (23- to 47-bp long) separated by unique sequences called spacers. Polymorphism can be observed in different strains of a species and may be used for genotyping. We describe protocols and bioinformatics tools that allow the identification of CRISPRs from sequenced genomes, their comparison, and their component determination (the direct repeats and the spacers). A schematic representation of the spacer organization can be produced, allowing an easy comparison between strains.
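The component determination step (separating direct repeats from spacers) and the comparison of spacer content between strains can be sketched as follows. This is a simplified illustration that assumes the repeat sequence is already known and occurs as exact copies; it is not the authors' tool, and the function names, the unrealistically short repeat and the sequences are hypothetical.

```python
def extract_spacers(locus, repeat):
    """Split a CRISPR locus on exact copies of the direct repeat.

    The first and last pieces are treated as flanking leader and trailer
    sequence; the pieces in between are the spacers, in locus order.
    """
    parts = locus.split(repeat)
    return [p for p in parts[1:-1] if p]


def shared_spacers(spacers_a, spacers_b):
    """Spacers of strain A that also occur in strain B, in A's order."""
    in_b = set(spacers_b)
    return [s for s in spacers_a if s in in_b]


# Hypothetical loci; real repeats are 23 to 47 bp long
repeat = "GTTCCA"
leader, trailer = "ACAC", "TGTG"
strain_a = leader + repeat + "AAAT" + repeat + "CCCT" + repeat + trailer
strain_b = leader + repeat + "CCCT" + repeat + "TTTA" + repeat + trailer

spacers_a = extract_spacers(strain_a, repeat)
spacers_b = extract_spacers(strain_b, repeat)
print(spacers_a, spacers_b, shared_spacers(spacers_a, spacers_b))
```

Listing each strain's spacers in order, with shared spacers aligned, gives the kind of schematic comparison the abstract mentions. Repeats in real loci can carry point mutations, so a practical implementation would need approximate rather than exact repeat matching.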
SGP-1: Prediction and Validation of Homologous Genes Based on Sequence Alignments

PubMed Central

Wiehe, Thomas; Gebauer-Jung, Steffi; Mitchell-Olds, Thomas; Guigó, Roderic

2001-01-01

Conventional methods of gene prediction rely on the recognition of DNA-sequence signals, the coding potential or the comparison of a genomic sequence with a cDNA, EST, or protein database. Reasons for limited accuracy in many circumstances are species-specific training and the incompleteness of reference databases. Lately, comparative genome analysis has attracted increasing attention. Several analysis tools that are based on human/mouse comparisons are already available. Here, we present a program for the prediction of protein-coding genes, termed SGP-1 (Syntenic Gene Prediction), which is based on the similarity of homologous genomic sequences. In contrast to most existing tools, the accuracy of SGP-1 depends little on species-specific properties such as codon usage or the nucleotide distribution. SGP-1 may therefore be applied to nonstandard model organisms in vertebrates as well as in plants, without the need for extensive parameter training. In addition to predicting genes in large-scale genomic sequences, the program may be useful to validate gene structure annotations from databases. To this end, SGP-1 output also contains comparisons between predicted and annotated gene structures in HTML format. The program can be accessed via a Web server at http://soft.ice.mpg.de/sgp-1. The source code, written in ANSI C, is available on request from the authors. PMID:11544202
Flow cytometry sorting of nuclei enables the first global characterization of Paramecium germline DNA and transposable elements.

PubMed

Guérin, Frédéric; Arnaiz, Olivier; Boggetto, Nicole; Denby Wilkes, Cyril; Meyer, Eric; Sperling, Linda; Duharcourt, Sandra

2017-04-26

DNA elimination is developmentally programmed in a wide variety of eukaryotes, including unicellular ciliates, and leads to the generation of distinct germline and somatic genomes. The ciliate Paramecium tetraurelia harbors two types of nuclei with different functions and genome structures. The transcriptionally inactive micronucleus contains the complete germline genome, while the somatic macronucleus contains a reduced genome streamlined for gene expression. During development of the somatic macronucleus, the germline genome undergoes massive and reproducible DNA elimination events. Availability of both the somatic and germline genomes is essential to examine the genome changes that occur during programmed DNA elimination and ultimately decipher the mechanisms underlying the specific removal of germline-limited sequences. We developed a novel experimental approach that uses flow cell imaging and flow cytometry to sort subpopulations of nuclei to high purity. We sorted vegetative micronuclei and macronuclei during development of P. tetraurelia. We validated the method by flow cell imaging and by high throughput DNA sequencing. Our work establishes the proof of principle that developing somatic macronuclei can be sorted from a complex biological sample to high purity based on their size, shape and DNA content. This method enabled us to sequence, for the first time, the germline DNA from pure micronuclei and to identify novel transposable elements. Sequencing the germline DNA confirms that the Pgm domesticated transposase is required for the excision of all ~45,000 Internal Eliminated Sequences. Comparison of the germline DNA and unrearranged DNA obtained from PGM-silenced cells reveals that the latter does not provide a faithful representation of the germline genome. We developed a flow cytometry-based method to purify P. tetraurelia nuclei to high purity and provided quality control with flow cell imaging and high throughput DNA sequencing. 
We identified 61 germline transposable elements including the first Paramecium retrotransposons. This approach paves the way to sequence the germline genomes of P. aurelia sibling species for future comparative genomic studies.
Single-strand conformation polymorphism (SSCP)-based mutation scanning approaches to fingerprint sequence variation in ribosomal DNA of ascaridoid nematodes.

PubMed

Zhu, X Q; Gasser, R B

1998-06-01

In this study, we assessed single-strand conformation polymorphism (SSCP)-based approaches for their capacity to fingerprint sequence variation in ribosomal DNA (rDNA) of ascaridoid nematodes of veterinary and/or human health significance. The second internal transcribed spacer region (ITS-2) of rDNA was utilised as the target region because it is known to provide species-specific markers for this group of parasites. ITS-2 was amplified by PCR from genomic DNA derived from individual parasites and subjected to analysis. Direct SSCP analysis of amplicons from seven taxa (Toxocara vitulorum, Toxocara cati, Toxocara canis, Toxascaris leonina, Baylisascaris procyonis, Ascaris suum and Parascaris equorum) showed that the single-strand (ss) ITS-2 patterns produced allowed their unequivocal identification to species. While no variation in SSCP patterns was detected in the ITS-2 within four species for which multiple samples were available, the method allowed the direct display of four distinct sequence types of ITS-2 among individual worms of T. cati. Comparison of SSCP/sequencing with the methods of dideoxy fingerprinting (ddF) and restriction endonuclease fingerprinting (REF) revealed that ddF also allowed the definition of the four sequence types, whereas REF displayed three of four. The findings indicate the usefulness of the SSCP-based approaches for the identification of ascaridoid nematodes to species, the direct display of sequence variation in rDNA and the detection of population variation. The ability to fingerprint microheterogeneity in ITS-2 rDNA using such approaches also has implications for studying fundamental aspects relating to mutational change in rDNA.
Nucleotide Sequence Database Comparison for Routine Dermatophyte Identification by Internal Transcribed Spacer 2 Genetic Region DNA Barcoding.

PubMed

Normand, A C; Packeu, A; Cassagne, C; Hendrickx, M; Ranque, S; Piarroux, R

2018-05-01

Conventional dermatophyte identification is based on morphological features. However, recent studies have proposed to use the nucleotide sequences of the rRNA internal transcribed spacer (ITS) region as an identification barcode of all fungi, including dermatophytes. Several nucleotide databases are available to compare sequences and thus identify isolates; however, these databases often contain mislabeled sequences that impair sequence-based identification. We evaluated five of these databases on a clinical isolate panel. We selected 292 clinical dermatophyte strains that were prospectively subjected to an ITS2 nucleotide sequence analysis. Sequences were analyzed against the databases, and the results were compared to clusters obtained via DNA alignment of sequence segments. The DNA tree served as the identification standard throughout the study. According to the ITS2 sequence identification, the majority of strains (255/292) belonged to the genus Trichophyton, mainly T. rubrum complex (n = 184), T. interdigitale (n = 40), T. tonsurans (n = 26), and T. benhamiae (n = 5). Other genera included Microsporum (e.g., M. canis [n = 21], M. audouinii [n = 10], Nannizzia gypsea [n = 3], and Epidermophyton [n = 3]). Species-level identification of T. rubrum complex isolates was an issue. Overall, ITS DNA sequencing is a reliable tool to identify dermatophyte species given that a comprehensive and correctly labeled database is consulted. Since many inaccurate identification results exist in the DNA databases used for this study, reference databases must be verified frequently and amended in line with the current revisions of fungal taxonomy. Before describing a new species or adding a new DNA reference to the available databases, its position in the phylogenetic tree must be verified. Copyright © 2018 American Society for Microbiology.
Crystal structure of the Msx-1 homeodomain/DNA complex.

PubMed

Hovde, S; Abate-Shen, C; Geiger, J H

2001-10-09

The Msx-1 homeodomain protein plays a crucial role in craniofacial, limb, and nervous system development. Homeodomain DNA-binding domains are comprised of 60 amino acids that show a high degree of evolutionary conservation. We have determined the structure of the Msx-1 homeodomain complexed to DNA at 2.2 Å resolution. The structure has an unusually well-ordered N-terminal arm with a unique trajectory across the minor groove of the DNA. DNA specificity conferred by bases flanking the core TAAT sequence is explained by well-ordered water-mediated interactions at Q50. Most interactions seen at the TAAT sequence are typical of the interactions seen in other homeodomain structures. Comparison of the Msx-1-HD structure to all other high resolution HD-DNA complex structures indicates a remarkably well-conserved sphere of hydration between the DNA and protein in these complexes.
Direct Comparison of Amino Acid and Salt Interactions with Double-Stranded and Single-Stranded DNA from Explicit-Solvent Molecular Dynamics Simulations.

PubMed

Andrews, Casey T; Campbell, Brady A; Elcock, Adrian H

2017-04-11

Given the ubiquitous nature of protein-DNA interactions, it is important to understand the interaction thermodynamics of individual amino acid side chains for DNA. One way to assess these preferences is to perform molecular dynamics (MD) simulations. Here we report MD simulations of 20 amino acid side chain analogs interacting simultaneously with both a 70-base-pair double-stranded DNA and with a 70-nucleotide single-stranded DNA. The relative preferences of the amino acid side chains for dsDNA and ssDNA match well with values deduced from crystallographic analyses of protein-DNA complexes. The estimated apparent free energies of interaction for ssDNA, on the other hand, correlate well with previous simulation values reported for interactions with isolated nucleobases, and with experimental values reported for interactions with guanosine. Comparisons of the interactions with dsDNA and ssDNA indicate that, with the exception of the positively charged side chains, all types of amino acid side chain interact more favorably with ssDNA, with intercalation of aromatic and aliphatic side chains being especially notable. Analysis of the data on a base-by-base basis indicates that positively charged side chains, as well as sodium ions, preferentially bind to cytosine in ssDNA, and that negatively charged side chains, and chloride ions, preferentially bind to guanine in ssDNA. These latter observations provide a novel explanation for the lower salt dependence of DNA duplex stability in GC-rich sequences relative to AT-rich sequences.
Human ribosomal RNA gene: nucleotide sequence of the transcription initiation region and comparison of three mammalian genes.

PubMed Central

Financsek, I; Mizumoto, K; Mishima, Y; Muramatsu, M

1982-01-01

The transcription initiation site of the human ribosomal RNA gene (rDNA) was located by using the single-strand specific nuclease protection method and by determining the first nucleotide of the in vitro capped 45S preribosomal RNA. The sequence of 1,211 nucleotides surrounding the initiation site was determined. The sequenced region was found to consist of 75% G and C and to contain a number of short direct and inverted repeats and palindromes. By comparison of the corresponding initiation regions of three mammalian species, several conserved sequences were found upstream and downstream from the transcription starting point. Two short A + T-rich sequences are present on human, mouse, and rat ribosomal RNA genes between the initiation site and 40 nucleotides upstream, and a C + T cluster is located at a position around -60. At and downstream from the initiation site, a common sequence, T-AG-C-T-G-A-C-A-C-G-C-T-G-T-C-C-T-CT-T, was found in the three genes from position -1 through +18. The strong conservation of these sequences suggests their functional significance in rDNA. The S1 nuclease protection experiments with cloned rDNA fragments indicated the presence in human 45S RNA of molecules several hundred nucleotides shorter than the supposed primary transcript. The first 19 nucleotides of these molecules appear identical--except for one mismatch--to the nucleotide sequence of the 5' end of a supposed early processing product of the mouse 45S RNA. PMID:6954460
Dynamics of actin evolution in dinoflagellates.

PubMed

Kim, Sunju; Bachvaroff, Tsvetan R; Handy, Sara M; Delwiche, Charles F

2011-04-01

Dinoflagellates have unique nuclei and intriguing genome characteristics with very high DNA content making complete genome sequencing difficult. In dinoflagellates, many genes are found in multicopy gene families, but the processes involved in the establishment and maintenance of these gene families are poorly understood. Understanding the dynamics of gene family evolution in dinoflagellates requires comparisons at different evolutionary scales. Studies of closely related species provide fine-scale information relative to species divergence, whereas comparisons of more distantly related species provide broad context. We selected the actin gene family as a highly expressed conserved gene previously studied in dinoflagellates. Of the 142 sequences determined in this study, 103 were from the two closely related species, Dinophysis acuminata and D. caudata, including full length and partial cDNA sequences as well as partial genomic amplicons. For these two Dinophysis species, at least three types of sequences could be identified. Most copies (79%) were relatively similar and in nucleotide trees, the sequences formed two bushy clades corresponding to the two species. In comparisons within species, only eight to ten nucleotide differences were found between these copies. The two remaining types formed clades containing sequences from both species. One type included the most similar sequences in between-species comparisons with as few as 12 nucleotide differences between species. The second type included the most divergent sequences in comparisons between and within species with up to 93 nucleotide differences between sequences. In all the sequences, most variation occurred in synonymous sites or the 5' UnTranslated Region (UTR), although there was still limited amino acid variation between most sequences. 
Several potential pseudogenes were found (approximately 10% of all sequences depending on species) with incomplete open reading frames due to frameshifts or early stop codons. Overall, variation in the actin gene family fits best with the "birth and death" model of evolution based on recent duplications, pseudogenes, and incomplete lineage sorting. Divergence between species was similar to variation within species, so that actin may be too conserved to be useful for phylogenetic estimation of closely related species.
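The two measurements this abstract relies on, pairwise nucleotide differences between aligned copies and in-frame stop codons that mark potential pseudogenes, can be illustrated with a minimal sketch. The function names are hypothetical, the sequences are assumed to be already aligned and in frame, and the standard stop codons are assumed; this is not the authors' pipeline.

```python
# Standard stop codons (assumed here; not stated in the abstract).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def count_differences(a, b):
    """Count mismatched positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a.upper(), b.upper()) if x != y)


def first_premature_stop(cds):
    """Return the 0-based codon index of the first in-frame stop codon
    that occurs before the final codon, or None if there is none."""
    cds = cds.upper()
    n_codons = len(cds) // 3
    for i in range(n_codons - 1):  # the final codon is allowed to be a stop
        if cds[3 * i:3 * i + 3] in STOP_CODONS:
            return i
    return None
```

A copy for which `first_premature_stop` returns an index would be a candidate for the "early stop codon" class of pseudogene; frameshifts would need an alignment-aware check that this sketch does not attempt.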
Evolutionary genomics of miniature inverted-repeat transposable elements (MITEs) in Brassica.

PubMed

Nouroz, Faisal; Noreen, Shumaila; Heslop-Harrison, J S

2015-12-01

Miniature inverted-repeat transposable elements (MITEs) are truncated derivatives of autonomous DNA transposons, and are dispersed abundantly in most eukaryotic genomes. We aimed to characterize various MITE families in Brassica in terms of their presence, sequence characteristics and evolutionary activity. Dot plot analyses involving comparison of homoeologous bacterial artificial chromosome (BAC) sequences allowed identification of 15 novel families of mobile MITEs. Of these, 5 were Stowaway-like with TA Target Site Duplications (TSDs), 4 Tourist-like with TAA/TTA TSDs, 5 Mutator-like with 9-10 bp TSDs and 1 novel MITE (BoXMITE1) flanked by 3 bp TSDs. Our data suggested that there are about 30,000 MITE-related sequences in Brassica rapa and B. oleracea genomes. In situ hybridization showed one abundant family was dispersed in the A-genome, while another was located near 45S rDNA sites. PCR analysis using primers flanking sequences of MITE elements detected MITE insertion polymorphisms between and within the three Brassica (AA, BB, CC) genomes, with many insertions being specific to single genomes and others showing evidence of more recent evolutionary insertions. Our BAC sequence comparison strategy enables identification of evolutionarily active MITEs with no prior knowledge of MITE sequences. The details of MITE families reported in Brassica enable their identification, characterization and annotation. Insertion polymorphisms of MITEs and their transposition activity indicate an important mechanism of genome evolution and diversification. MITE families derived from known Mariner, Harbinger and Mutator DNA transposons were discovered, as well as some novel structures. The identification of Brassica MITEs will have broad applications in Brassica genomics, breeding, hybridization and phylogeny through their use as DNA markers.
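The dot plot comparison mentioned above can be sketched as a word-match search between two sequences: an insertion present in only one sequence shows up as a gap or offset in the main diagonal of hits. The function name and the default word size are assumptions chosen for illustration, not parameters reported in the abstract.

```python
def dot_plot(seq_a, seq_b, word=4):
    """Return (i, j) pairs where the word starting at position i in seq_a
    is identical to the word starting at position j in seq_b."""
    seq_a, seq_b = seq_a.upper(), seq_b.upper()

    # Index every word of seq_b by its start positions.
    index = {}
    for j in range(len(seq_b) - word + 1):
        index.setdefault(seq_b[j:j + word], []).append(j)

    # Look up each word of seq_a in the index.
    hits = []
    for i in range(len(seq_a) - word + 1):
        for j in index.get(seq_a[i:i + word], []):
            hits.append((i, j))
    return hits
```

Plotting the returned pairs as points gives the familiar dot matrix; for BAC-sized sequences a larger word size would be needed to keep random matches manageable.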
Escaping introns in COI through cDNA barcoding of mushrooms: Pleurotus as a test case.

PubMed

Avin, Farhat A; Subha, Bhassu; Tan, Yee-Shin; Braukmann, Thomas W A; Vikineswary, Sabaratnam; Hebert, Paul D N

2017-09-01

DNA barcoding involves the use of one or more short, standardized DNA fragments for the rapid identification of species. A 648-bp segment near the 5' terminus of the mitochondrial cytochrome c oxidase subunit I (COI) gene has been adopted as the universal DNA barcode for members of the animal kingdom, but its utility in mushrooms is complicated by the frequent occurrence of large introns. As a consequence, ITS has been adopted as the standard DNA barcode marker for mushrooms despite several shortcomings. This study employed newly designed primers coupled with cDNA analysis to examine COI sequence diversity in six species of Pleurotus and compared these results with those for ITS. The ability of the COI gene to discriminate six species of Pleurotus, the commonly cultivated oyster mushroom, was examined by analysis of cDNA. The amplification success, sequence variation within and among species, and the ability to design effective primers were tested. We compared ITS sequences to their COI cDNA counterparts for all isolates. ITS discriminated between all six species, but some sequence results were uninterpretable because of length variation among ITS copies. By comparison, complete COI sequences were recovered from all but three individuals of Pleurotus giganteus, where only the 5' region was obtained. The COI sequences permitted the resolution of all species when partial data was excluded for P. giganteus. Our results suggest that COI can be a useful barcode marker for mushrooms when cDNA analysis is adopted, permitting identifications in cases where ITS cannot be recovered or where it offers higher resolution when fresh tissue is available. The suitability of this approach remains to be confirmed for other mushrooms.
Quantitation of next generation sequencing library preparation protocol efficiencies using droplet digital PCR assays - a systematic comparison of DNA library preparation kits for Illumina sequencing.

PubMed

Aigrain, Louise; Gu, Yong; Quail, Michael A

2016-06-13

The emergence of next-generation sequencing (NGS) technologies in the past decade has allowed the democratization of DNA sequencing both in terms of price per sequenced base and ease of producing DNA libraries. When it comes to preparing DNA sequencing libraries for Illumina, the current market leader, a plethora of kits are available and it can be difficult for users to determine which kit is the most appropriate and efficient for their applications; the main concerns being not only cost but also minimal bias, yield and time efficiency. We compared 9 commercially available library preparation kits in a systematic manner using the same DNA sample by probing the amount of DNA remaining after each protocol step using a new droplet digital PCR (ddPCR) assay. This method allows the precise quantification of fragments bearing either adaptors or P5/P7 sequences on both ends just after ligation or PCR enrichment. We also investigated the potential influence of DNA input and DNA fragment size on the final library preparation efficiency. The overall library preparation efficiencies show important variations between the different kits, with the ones combining several steps into a single one exhibiting some final yields 4 to 7 times higher than the other kits. Detailed ddPCR data also reveal that the adaptor ligation yield itself varies by more than a factor of 10 between kits, certain ligation efficiencies being so low that they could impair the original library complexity and impoverish the sequencing results. When a PCR enrichment step is necessary, lower adaptor-ligated DNA inputs lead to greater amplification yields, hiding the latent disparity between kits. We describe a ddPCR assay that allows us to probe the efficiency of the most critical step in the library preparation, ligation, and to draw conclusions on which kits are more likely to preserve the sample heterogeneity and reduce the need for amplification.

Comparison of large-insert, small-insert and pyrosequencing libraries for metagenomic analysis.

PubMed

Danhorn, Thomas; Young, Curtis R; DeLong, Edward F

2012-11-01

The development of DNA sequencing methods for characterizing microbial communities has evolved rapidly over the past decades. To evaluate more traditional, as well as newer methodologies for DNA library preparation and sequencing, we compared fosmid, short-insert shotgun and 454 pyrosequencing libraries prepared from the same metagenomic DNA samples. GC content was elevated in all fosmid libraries, compared with shotgun and 454 libraries. Taxonomic composition of the different libraries suggested that this was caused by a relative underrepresentation of dominant taxonomic groups with low GC content, notably Prochlorales and the SAR11 cluster, in fosmid libraries. While these abundant taxa had a large impact on library representation, we also observed a positive correlation between taxon GC content and fosmid library representation in other low-GC taxa, suggesting a general trend. Analysis of gene category representation in different libraries indicated that the functional composition of a library was largely a reflection of its taxonomic composition, and no additional systematic biases against particular functional categories were detected at the level of sequencing depth in our samples. Another important but less predictable factor influencing the apparent taxonomic and functional library composition was the read length afforded by the different sequencing technologies. Our comparisons and analyses provide a detailed perspective on the influence of library type on the recovery of microbial taxa in metagenomic libraries and underscore the different uses and utilities of more traditional, as well as contemporary 'next-generation' DNA library construction and sequencing technologies for exploring the genomics of the natural microbial world.
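The GC-content comparison between library types described above can be illustrated with a short sketch that computes the GC fraction of a single read and of a pooled library. The function names are hypothetical, and skipping ambiguous bases such as N is an assumption made for this example rather than a detail taken from the study.

```python
def gc_content(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T) in seq."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        return 0.0
    return sum(base in "GC" for base in counted) / len(counted)


def library_gc(reads):
    """GC fraction of a whole library, pooling all reads so that
    longer reads contribute proportionally more bases."""
    return gc_content("".join(reads))
```

Comparing `library_gc` across fosmid, shotgun and 454 read sets prepared from the same sample would reproduce the kind of library-level GC comparison the abstract reports, although the read sets themselves are not part of this record.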
Error Rate Comparison during Polymerase Chain Reaction by DNA Polymerase

DOE PAGES

McInerney, Peter; Adams, Paul; Hadi, Masood Z.

2014-01-01

As larger-scale cloning projects become more prevalent, there is an increasing need for comparisons among high fidelity DNA polymerases used for PCR amplification. All polymerases marketed for PCR applications are tested for fidelity properties (i.e., error rate determination) by vendors, and numerous literature reports have addressed PCR enzyme fidelity. Nonetheless, it is often difficult to make direct comparisons among different enzymes due to numerous methodological and analytical differences from study to study. We have measured the error rates for 6 DNA polymerases commonly used in PCR applications, including 3 polymerases typically used for cloning applications requiring high fidelity. Error rate measurement values reported here were obtained by direct sequencing of cloned PCR products. The strategy employed here allows interrogation of error rate across a very large DNA sequence space, since 94 unique DNA targets were used as templates for PCR cloning. Of the six enzymes included in the study (Taq polymerase, AccuPrime-Taq High Fidelity, KOD Hot Start, cloned Pfu polymerase, Phusion Hot Start, and Pwo polymerase), we find the lowest error rates with Pfu, Phusion, and Pwo polymerases. Error rates are comparable for these 3 enzymes and are >10x lower than the error rate observed with Taq polymerase. Mutation spectra are reported, with the 3 high fidelity enzymes displaying broadly similar types of mutations. For these enzymes, transition mutations predominate, with little bias observed for type of transition.
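The abstract does not state how error rates were computed from the sequenced clones. One common convention, assumed here purely for illustration, expresses the error rate as mutations per base per template doubling, with the number of doublings estimated from the fold amplification. The function names and the example figures are hypothetical.

```python
import math


def template_doublings(input_amount, output_amount):
    """Estimate the number of template doublings from the fold amplification
    (both amounts in the same unit, e.g. ng of target DNA)."""
    return math.log2(output_amount / input_amount)


def error_rate(mutations, bases_sequenced, doublings):
    """Errors per base per doubling (an assumed convention, not one
    confirmed by the abstract)."""
    return mutations / (bases_sequenced * doublings)


# Hypothetical example: 5 mutations in 100,000 sequenced bases after
# a 1,024-fold amplification (10 doublings).
example = error_rate(5, 100_000, template_doublings(1, 1024))
```

Normalizing by doublings rather than by PCR cycle count matters because later cycles rarely double the template, so the two denominators can differ substantially.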
Structural analysis of the rDNA intergenic spacer of Brassica nigra: evolutionary divergence of the spacers of the three diploid Brassica species.

PubMed

Bhatia, S; Singh Negi, M; Lakshmikumaran, M

1996-11-01

EcoRI restriction of the B. nigra rDNA recombinants, isolated from a lambda genomic library, showed that the 3.9-kb fragment corresponded to the Intergenic Spacer (IGS), which was sequenced and found to be 3,928 bp in size. Sequence and dot-matrix analyses showed that the organization of the B. nigra rDNA IGS was typical of most rDNA spacers, consisting of a central repetitive region and flanking unique sequences on either side. The repetitive region was composed of two repeat families, RF 'A' and RF 'B.' The B. nigra RF 'A' consisted of a tandem array of three full-length copies of a 106-bp sequence element. RF 'B' was composed of 66 tandemly repeated elements. Each 'B' element was only 21 bp in size and this is the smallest repeat unit identified in plant rDNA to date. The putative transcription initiation site (TIS) was identified as nucleotide position 3,110. Based on the sequence analysis it was suggested that the present organization of the repeat families was generated by successive cycles of deletions and amplifications and was being maintained by homogenization processes such as gene conversion and crossing-over. A detailed comparison of the rDNA IGS sequences of the three diploid Brassica species, namely B. nigra, B. campestris, and B. oleracea, was carried out. First, comparisons revealed that B. campestris and B. oleracea were close to each other as the repeat families in both showed high sequence homology between each other. Second, the repeat elements in both the species were organized in an interspersed manner. Third, a 52-bp sequence, present just downstream of the repeats in B. campestris, was found to be identical to the B. oleracea repeats, thereby suggesting a common progenitor. On the other hand, in B. nigra no interspersion pattern of organization of repeats was observed. Further, the B. nigra RF 'A' was identified as distinct from the repeat families of B. campestris and B. oleracea. 
Based on this analysis, it was suggested that during speciation B. campestris and B. oleracea evolved in one lineage whereas B. nigra diverged into a separate lineage. The comparative analysis of the IGS helped in identifying not only conserved ancestral sequence motifs of possible functional significance such as promoters and enhancers, but also sequences which showed variation between the three diploid species and were therefore identified as species-specific sequences.
Advances in DNA metabarcoding for food and wildlife forensic species identification.

PubMed

Staats, Martijn; Arulandhu, Alfred J; Gravendeel, Barbara; Holst-Jensen, Arne; Scholtens, Ingrid; Peelen, Tamara; Prins, Theo W; Kok, Esther

2016-07-01

Species identification using DNA barcodes has been widely adopted by forensic scientists as an effective molecular tool for tracking adulterations in food and for analysing samples from alleged wildlife crime incidents. DNA barcoding is an approach that involves sequencing of short DNA sequences from standardized regions and comparison to a reference database as a molecular diagnostic tool in species identification. In recent years, remarkable progress has been made towards developing DNA metabarcoding strategies, which involves next-generation sequencing of DNA barcodes for the simultaneous detection of multiple species in complex samples. Metabarcoding strategies can be used in processed materials containing highly degraded DNA e.g. for the identification of endangered and hazardous species in traditional medicine. This review aims to provide insight into advances of plant and animal DNA barcoding and highlights current practices and recent developments for DNA metabarcoding of food and wildlife forensic samples from a practical point of view. Special emphasis is placed on new developments for identifying species listed in the Convention on International Trade of Endangered Species (CITES) appendices for which reliable methods for species identification may signal and/or prevent illegal trade. Current technological developments and challenges of DNA metabarcoding for forensic scientists will be assessed in the light of stakeholders' needs.
Phylogenetic Analysis of Pasteuria penetrans by 16S rRNA Gene Cloning and Sequencing.

PubMed

Anderson, J M; Preston, J F; Dickson, D W; Hewlett, T E; Williams, N H; Maruniak, J E

1999-09-01

Pasteuria penetrans is an endospore-forming bacterial parasite of Meloidogyne spp. This organism is among the most promising agents for the biological control of root-knot nematodes. In order to establish the phylogenetic position of this species relative to other endospore-forming bacteria, the 16S ribosomal genes from two isolates of P. penetrans, P-20, which preferentially infects M. arenaria race 1, and P-100, which preferentially infects M. incognita and M. javanica, were PCR-amplified from a purified endospore extraction. Universal primers for the 16S rRNA gene were used to amplify DNA which was cloned, and a nucleotide sequence was obtained for 92% of the gene (1,390 base pairs) encoding the 16S rDNA from each isolate. Comparison of both isolates showed identical sequences that were compared to 16S rDNA sequences of 30 other endospore-forming bacteria obtained from GenBank. Parsimony analyses indicated that P. penetrans is a species within a clade that includes Alicyclobacillus acidocaldarius, A. cycloheptanicus, Sulfobacillus sp., Bacillus tusciae, B. schlegelii, and P. ramosa. Its closest neighbor is P. ramosa, a parasite of Daphnia spp. (water fleas). This study provided a genomic basis for the relationship of species assigned to the genus Pasteuria, and for comparison of species that are parasites of different phytopathogenic nematodes.
Draft genome sequence of Cryptococcus terricola JCM 24523, an oleaginous yeast capable of expressing exogenous DNA

DOE PAGES

Close, Dan; Ojumu, John O.; Zhang, Gui X.

2016-11-03

Cryptococcus terricola JCM 24523 has recently been identified as an oleaginous yeast capable of converting starch into fatty acids. Here, this draft genome sequence provides a platform for elucidating its fatty acid production potential and supporting comparisons with other oleaginous species.
Draft genome sequence of Cryptococcus terricola JCM 24523, an oleaginous yeast capable of expressing exogenous DNA

DOE Office of Scientific and Technical Information (OSTI.GOV)

Close, Dan; Ojumu, John O.; Zhang, Gui X.

Cryptococcus terricola JCM 24523 has recently been identified as an oleaginous yeast capable of converting starch into fatty acids. Here, this draft genome sequence provides a platform for elucidating its fatty acid production potential and supporting comparisons with other oleaginous species.
Ebbie: automated analysis and storage of small RNA cloning data using a dynamic web server

PubMed Central

Ebhardt, H Alexander; Wiese, Kay C; Unrau, Peter J

2006-01-01

Background: DNA sequencing is used ubiquitously: from deciphering genomes [1] to determining the primary sequence of small RNAs (smRNAs) [2-5]. The cloning of smRNAs is currently the most conventional method to determine the actual sequence of these important regulators of gene expression. Typical smRNA cloning projects involve the sequencing of hundreds to thousands of smRNA clones that are delimited at their 5' and 3' ends by fixed sequence regions. These primers result from the biochemical protocol used to isolate and convert the smRNA into clonable PCR products. Recently we completed a smRNA cloning project involving tobacco plants, where analysis was required for ~700 smRNA sequences [6]. Finding no easily accessible research tool to enter and analyze smRNA sequences we developed Ebbie to assist us with our study. Results: Ebbie is a semi-automated smRNA cloning data processing algorithm, which initially searches for any substring within a DNA sequencing text file, which is flanked by two constant strings. The substring, also termed smRNA or insert, is stored in a MySQL and BlastN database. These inserts are then compared using BlastN to locally installed databases allowing the rapid comparison of the insert to both the growing smRNA database and to other static sequence databases. Our laboratory used Ebbie to analyze scores of DNA sequencing data originating from an smRNA cloning project [6]. Through its built-in instant analysis of all inserts using BlastN, we were able to quickly identify 33 groups of smRNAs from ~700 database entries. This clustering allowed the easy identification of novel and highly expressed clusters of smRNAs. Ebbie is available under GNU GPL and currently implemented on Conclusion: Ebbie was designed for medium sized smRNA cloning projects with about 1,000 database entries [6-8]. Ebbie can be used for any type of sequence analysis where two constant primer regions flank a sequence of interest. 
The reliable storage of inserts, and their annotation in a MySQL database, BlastN[9] comparison of new inserts to dynamic and static databases make it a powerful new tool in any laboratory using DNA sequencing. Ebbie also prevents manual mistakes during the excision process and speeds up annotation and data-entry. Once the server is installed locally, its access can be restricted to protect sensitive new DNA sequencing data. Ebbie was primarily designed for smRNA cloning projects, but can be applied to a variety of RNA and DNA cloning projects[2,3,10,11]. PMID:16584563
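The core excision step described above, finding any substring flanked by two constant strings, can be sketched with a regular expression. This is a minimal illustration and not Ebbie's actual code; the function name and the flank sequences in the usage note are hypothetical.

```python
import re


def extract_inserts(read, left_flank, right_flank):
    """Return every insert in read that lies between left_flank and
    right_flank. Matching is case-insensitive and non-greedy, so
    concatenated inserts in one read are returned separately."""
    pattern = re.compile(
        re.escape(left_flank.upper())
        + r"([ACGTN]+?)"
        + re.escape(right_flank.upper())
    )
    return [match.group(1) for match in pattern.finditer(read.upper())]
```

For example, with the hypothetical flanks `GAATTC` and `GGATCC`, the read `ttGAATTCacgtacgtGGATCCaa` yields the single insert `ACGTACGT`. A production tool would also need to handle sequencing errors in the flanks and reads sequenced from the opposite strand, which this sketch ignores.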
Optimal Ancient DNA Yields from the Inner Ear Part of the Human Petrous Bone.

PubMed

Pinhasi, Ron; Fernandes, Daniel; Sirak, Kendra; Novak, Mario; Connell, Sarah; Alpaslan-Roodenberg, Songül; Gerritsen, Fokke; Moiseyev, Vyacheslav; Gromov, Andrey; Raczky, Pál; Anders, Alexandra; Pietrusewsky, Michael; Rollefson, Gary; Jovanovic, Marija; Trinhhoang, Hiep; Bar-Oz, Guy; Oxenham, Marc; Matsumura, Hirofumi; Hofreiter, Michael

2015-01-01

The invention and development of next or second generation sequencing methods has resulted in a dramatic transformation of ancient DNA research and allowed shotgun sequencing of entire genomes from fossil specimens. However, although there are exceptions, most fossil specimens contain only low (~ 1% or less) percentages of endogenous DNA. The only skeletal element for which a systematically higher endogenous DNA content compared to other skeletal elements has been shown is the petrous part of the temporal bone. In this study we investigate whether (a) different parts of the petrous bone of archaeological human specimens give different percentages of endogenous DNA yields, (b) there are significant differences in average DNA read lengths, damage patterns and total DNA concentration, and (c) it is possible to obtain endogenous ancient DNA from petrous bones from hot environments. We carried out intra-petrous comparisons for ten petrous bones from specimens from Holocene archaeological contexts across Eurasia dated between 10,000-1,800 calibrated years before present (cal. BP). We obtained shotgun DNA sequences from three distinct areas within the petrous: a spongy part of trabecular bone (part A), the dense part of cortical bone encircling the osseous inner ear, or otic capsule (part B), and the dense part within the otic capsule (part C). Our results confirm that dense bone parts of the petrous bone can provide high endogenous aDNA yields and indicate that endogenous DNA fractions for part C can exceed those obtained for part B by up to 65-fold and those from part A by up to 177-fold, while total endogenous DNA concentrations are up to 126-fold and 109-fold higher for these comparisons. 
Our results also show that while endogenous yields from part C were lower than 1% for samples from hot (both arid and humid) parts, the DNA damage patterns indicate that at least some of the reads originate from ancient DNA molecules, potentially enabling ancient DNA analyses of samples from hot regions that are otherwise not amenable to ancient DNA analyses.
G-Anchor: a novel approach for whole-genome comparative mapping utilizing evolutionary conserved DNA sequences.

PubMed

Lenis, Vasileios Panagiotis E; Swain, Martin; Larkin, Denis M

2018-05-01

Cross-species whole-genome sequence alignment is a critical first step for genome comparative analyses, ranging from the detection of sequence variants to studies of chromosome evolution. Animal genomes are large and complex, and whole-genome alignment is a computationally intense process, requiring expensive high-performance computing systems due to the need to explore extensive local alignments. With hundreds of sequenced animal genomes available from multiple projects, there is an increasing demand for genome comparative analyses. Here, we introduce G-Anchor, a new, fast, and efficient pipeline that uses a strictly limited but highly effective set of local sequence alignments to anchor (or map) an animal genome to another species' reference genome. G-Anchor makes novel use of a databank of highly conserved DNA sequence elements. We demonstrate how these elements may be aligned to a pair of genomes, creating anchors. These anchors enable the rapid mapping of scaffolds from a de novo assembled genome to chromosome assemblies of a reference species. Our results demonstrate that G-Anchor can successfully anchor a vertebrate genome onto a phylogenetically related reference species genome using a desktop or laptop computer within a few hours and with comparable accuracy to that achieved by a highly accurate whole-genome alignment tool such as LASTZ. G-Anchor thus makes whole-genome comparisons accessible to researchers with limited computational resources. G-Anchor is a ready-to-use tool for anchoring a pair of vertebrate genomes. It may be used with large genomes that contain a significant fraction of evolutionarily conserved DNA sequences and that are not highly repetitive, polyploid, or excessively fragmented. G-Anchor is not a substitute for whole-genome aligning software but can be used for fast and accurate initial genome comparisons. G-Anchor is freely available and a ready-to-use tool for the pairwise comparison of two genomes.
Secondary structure prediction for complete rDNA sequences (18S, 5.8S, and 28S rDNA) of Demodex folliculorum, and comparison of divergent domains structures across Acari.

PubMed

Zhao, Ya-E; Wang, Zheng-Hang; Xu, Yang; Wu, Li-Ping; Hu, Li

2013-10-01

According to base pairing, the rRNA folds into corresponding secondary structures, which contain additional phylogenetic information. On the basis of sequencing for complete rDNA sequences (18S, ITS1, 5.8S, ITS2 and 28S rDNA) of Demodex, we predicted the secondary structure of the complete rDNA sequence (18S, 5.8S, and 28S rDNA) of Demodex folliculorum, which was in concordance with that of the main arthropod lineages in past studies. And together with the sequence data from GenBank, we also predicted the secondary structures of divergent domains in SSU rRNA of 51 species and in LSU rRNA of 43 species from four superfamilies in Acari (Cheyletoidea, Tetranychoidea, Analgoidea and Ixodoidea). The multiple alignment among the four superfamilies in Acari showed that, insertions from Tetranychoidea SSU rRNA formed two newly proposed helixes, and helix c3-2b of LSU rRNA was absent in Demodex (Cheyletoidea) taxa. Generally speaking, LSU rRNA presented more remarkable differences than SSU rRNA did, mainly in D2, D3, D5, D7a, D7b, D8 and D10. Copyright © 2013 Elsevier Inc. All rights reserved.
Molecular cloning of chitinase 33 (chit33) gene from Trichoderma atroviride

PubMed Central

Matroudi, S.; Zamani, M.R.; Motallebi, M.

2008-01-01

In this study Trichoderma atroviride was selected as an overproducer of chitinase among 30 different isolates of Trichoderma sp. on the basis of chitinase specific activity. From this isolate the genomic and cDNA clones encoding chit33 have been isolated and sequenced. Comparison of genomic and cDNA sequences for defining gene structure indicates that this gene contains three short introns and an open reading frame coding for a protein of 321 amino acids. The deduced amino acid sequence includes a 19 aa putative signal peptide. Homology between this sequence and other reported Trichoderma Chit33 proteins is discussed. The coding sequence of the chit33 gene was cloned in the pEt26b(+) expression vector and expressed in E. coli. PMID:24031242
Comparative whole genome DNA methylation profiling of cattle sperm and somatic tissues reveals striking hypomethylated patterns in sperm

USDA-ARS's Scientific Manuscript database

Using whole-genome bisulfite sequencing (WGBS), we profiled the DNA methylome of cattle sperm through comparison with three bovine somatic tissues (mammary gland, brain and blood). Large differences between them were observed in the methylation patterns of global CpGs, pericentromeric satellites, p...
DETECTION AND COMPARISON OF GIARDIAVIRUS (GLV) FROM DIFFERENT ASSEMBLAGES OF GIARDIA DUODENALIS

USDA-ARS's Scientific Manuscript database

Five assemblages of Giardia were identified from cysts in cattle, dog, cat, sheep, and reindeer feces using ribosomal DNA (rDNA) sequencing. Assemblage A was present in cattle and reindeer feces, Assemblages C and D were present in dog feces, Assemblage E was present in cattle and sheep feces, and ...
A force-based, parallel assay for the quantification of protein-DNA interactions.

PubMed

Limmer, Katja; Pippig, Diana A; Aschenbrenner, Daniela; Gaub, Hermann E

2014-01-01

Analysis of transcription factor binding to DNA sequences is of utmost importance to understand the intricate regulatory mechanisms that underlie gene expression. Several techniques exist that quantify DNA-protein affinity, but they are either very time-consuming or suffer from possible misinterpretation due to complicated algorithms or approximations like many high-throughput techniques. We present a more direct method to quantify DNA-protein interaction in a force-based assay. In contrast to single-molecule force spectroscopy, our technique, the Molecular Force Assay (MFA), parallelizes force measurements so that it can test one or multiple proteins against several DNA sequences in a single experiment. The interaction strength is quantified by comparison to the well-defined rupture stability of different DNA duplexes. As a proof-of-principle, we measured the interaction of the zinc finger construct Zif268/NRE against six different DNA constructs. We could show the specificity of our approach and quantify the strength of the protein-DNA interaction.
Haplogroup relationships between domestic and wild sheep resolved using a mitogenome panel.

PubMed

Meadows, J R S; Hiendleder, S; Kijas, J W

2011-04-01

Five haplogroups have been identified in domestic sheep through global surveys of mitochondrial (mt) sequence variation; however, these group classifications are often based on small fragments of the complete mtDNA sequence: the partial control region or the cytochrome B gene. This study presents the complete mitogenome from representatives of each haplogroup identified in domestic sheep, plus a sample of their wild relatives. Comparison of the sequence successfully resolved the relationships between each haplogroup and provided insight into the relationship with wild sheep. The five haplogroups were characterised as branching independently, a radiation that shared a common ancestor 920,000 ± 190,000 years ago based on protein coding sequence. The utility of various mtDNA components to inform the true relationship between sheep was also examined with Bayesian, maximum likelihood and partitioned Bremer support analyses. The control region was found to be the mtDNA component that contributed the highest amount of support to the tree generated using the complete data set. This study provides the nucleus of a mtDNA mitogenome panel, which can be used to assess additional mitogenomes and serve as a reference set to evaluate small fragments of the mtDNA.
Haplogroup relationships between domestic and wild sheep resolved using a mitogenome panel

PubMed Central

Meadows, J R S; Hiendleder, S; Kijas, J W

2011-01-01

Five haplogroups have been identified in domestic sheep through global surveys of mitochondrial (mt) sequence variation; however, these group classifications are often based on small fragments of the complete mtDNA sequence: the partial control region or the cytochrome B gene. This study presents the complete mitogenome from representatives of each haplogroup identified in domestic sheep, plus a sample of their wild relatives. Comparison of the sequence successfully resolved the relationships between each haplogroup and provided insight into the relationship with wild sheep. The five haplogroups were characterised as branching independently, a radiation that shared a common ancestor 920,000 ± 190,000 years ago based on protein coding sequence. The utility of various mtDNA components to inform the true relationship between sheep was also examined with Bayesian, maximum likelihood and partitioned Bremer support analyses. The control region was found to be the mtDNA component that contributed the highest amount of support to the tree generated using the complete data set. This study provides the nucleus of a mtDNA mitogenome panel, which can be used to assess additional mitogenomes and serve as a reference set to evaluate small fragments of the mtDNA. PMID:20940734
Promoter selection in human mitochondria involves binding of a transcription factor to orientation-independent upstream regulatory elements.

PubMed

Fisher, R P; Topper, J N; Clayton, D A

1987-07-17

Selective transcription of human mitochondrial DNA requires a transcription factor (mtTF) in addition to an essentially nonselective RNA polymerase. Partially purified mtTF is able to sequester promoter-containing DNA in preinitiation complexes in the absence of mitochondrial RNA polymerase, suggesting a DNA-binding mechanism for factor activity. Functional domains, required for positive transcriptional regulation by mtTF, are identified within both major promoters of human mtDNA through transcription of mutant promoter templates in a reconstituted in vitro system. These domains are essentially coextensive with DNA sequences protected from nuclease digestion by mtTF-binding. Comparison of the sequences of the two mtTF-responsive elements reveals significant homology only when one sequence is inverted; the binding sites are in opposite orientations with respect to the predominant direction of transcription. Thus mtTF may function bidirectionally, requiring additional protein-DNA interactions to dictate transcriptional polarity. The mtTF-responsive elements are arrayed as direct repeats, separated by approximately 80 bp within the displacement-loop region of human mitochondrial DNA; this arrangement may reflect duplication of an ancestral bidirectional promoter, giving rise to separate, unidirectional promoters for each strand.
Cloud-based adaptive exon prediction for DNA analysis.

PubMed

Putluri, Srinivasareddy; Zia Ur Rahman, Md; Fathima, Shaik Yasmeen

2018-02-01

Cloud computing offers significant research and economic benefits to healthcare organisations. Cloud services provide a safe place for storing and managing large amounts of sensitive data. Under the conventional flow of gene information, gene sequence laboratories send out raw and inferred information via the Internet to several sequence libraries. DNA sequencing storage costs can be minimised by the use of cloud services. In this study, the authors put forward a novel genomic informatics system using Amazon Cloud Services, where genomic sequence information is stored and accessed for processing. Accurate identification of exon regions in a DNA sequence is a key task in bioinformatics, which helps in disease identification and drug design. The three-base periodicity property of exons forms the basis of all exon identification techniques. Adaptive signal processing techniques are found to be promising in comparison with several other methods. Several adaptive exon predictors (AEPs) are developed using variable normalised least mean square and its maximum normalised variants to reduce computational complexity. Finally, performance evaluation of the various AEPs is done based on measures such as sensitivity, specificity and precision using various standard genomic datasets taken from the National Center for Biotechnology Information genomic sequence database.
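The three-base periodicity mentioned in this abstract can be illustrated with a plain discrete Fourier transform. The following is a minimal sketch, not the authors' adaptive exon predictor: it swaps in a fixed DFT evaluated at a frequency of one cycle per three bases, and the function name `period3_power` and the toy sequences are hypothetical.

```python
import cmath


def period3_power(seq):
    """Return the spectral power at period 3, summed over the four bases.

    Each base is treated as a binary indicator sequence (1 where the base
    occurs, 0 elsewhere), and its DFT is evaluated at 1/3 cycles per base.
    This is a fixed-transform illustration, not an adaptive (NLMS) filter.
    """
    n = len(seq)
    if n == 0:
        raise ValueError("sequence must not be empty")
    total = 0.0
    for base in "ACGT":
        # Sum the complex exponentials only at positions holding this base.
        coeff = sum(
            cmath.exp(-2j * cmath.pi * i / 3)
            for i, ch in enumerate(seq)
            if ch == base
        )
        total += abs(coeff) ** 2
    return total / n


# Toy inputs (hypothetical): one with a strict 3-base repeat, one without.
codon_like = "ATG" * 30
period_four = "ACGT" * 24
print(period3_power(codon_like))
print(period3_power(period_four))
```

A sequence with strong codon-position structure gives a large value, while a sequence whose repeat length is not a multiple of three gives a value near zero. In practice the measure would be computed over a sliding window along the sequence.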
High-throughput sequencing of three Lemnoideae (duckweeds) chloroplast genomes from total DNA.

PubMed

Wang, Wenqin; Messing, Joachim

2011-01-01

Chloroplast genomes provide a wealth of information for evolutionary and population genetic studies. Chloroplasts play a particularly important role in the adaptation of aquatic plants because they float on water and their major surface is exposed continuously to sunlight. The subfamily Lemnoideae represents such a collection of aquatic species that, because of photosynthesis, includes some of the fastest growing plant species on earth. We sequenced the chloroplast genomes from three different genera of Lemnoideae, Spirodela polyrhiza, Wolffiella lingulata and Wolffia australiana, by high-throughput DNA sequencing of genomic DNA using the SOLiD platform. Unfractionated total DNA contains high copies of plastid DNA so that sequences from the nucleus and mitochondria can easily be filtered computationally. Remaining sequence reads were assembled into contiguous sequences (contigs) using SOLiD software tools. Contigs were mapped to a reference genome of Lemna minor and gaps, selected by PCR, were sequenced on the ABI3730xl platform. This combinatorial approach yielded whole genomic contiguous sequences in a cost-effective manner. More than 1,000-fold coverage of the chloroplast genome from total DNA was reached by the SOLiD platform in a single spot on a quadrant slide without purification. Comparative analysis indicated that the chloroplast genome was conserved in gene number and organization with respect to the reference genome of L. minor. However, higher nucleotide substitution and abundant deletions and insertions occurred in non-coding regions of these genomes, indicating greater genomic dynamics than expected from the comparison of other related species in the Pooideae. Notably, there was no transition bias over transversion in Lemnoideae. The data should have immediate applications in evolutionary biology and plant taxonomy with increased resolution and statistical power.

High-Throughput Sequencing of Three Lemnoideae (Duckweeds) Chloroplast Genomes from Total DNA

PubMed Central

Wang, Wenqin; Messing, Joachim

2011-01-01

Background: Chloroplast genomes provide a wealth of information for evolutionary and population genetic studies. Chloroplasts play a particularly important role in the adaptation of aquatic plants because they float on water and their major surface is exposed continuously to sunlight. The subfamily Lemnoideae represents such a collection of aquatic species that, because of photosynthesis, includes some of the fastest growing plant species on earth. Methods: We sequenced the chloroplast genomes from three different genera of Lemnoideae, Spirodela polyrhiza, Wolffiella lingulata and Wolffia australiana, by high-throughput DNA sequencing of genomic DNA using the SOLiD platform. Unfractionated total DNA contains high copies of plastid DNA so that sequences from the nucleus and mitochondria can easily be filtered computationally. Remaining sequence reads were assembled into contiguous sequences (contigs) using SOLiD software tools. Contigs were mapped to a reference genome of Lemna minor and gaps, selected by PCR, were sequenced on the ABI3730xl platform. Conclusions: This combinatorial approach yielded whole genomic contiguous sequences in a cost-effective manner. More than 1,000-fold coverage of the chloroplast genome from total DNA was reached by the SOLiD platform in a single spot on a quadrant slide without purification. Comparative analysis indicated that the chloroplast genome was conserved in gene number and organization with respect to the reference genome of L. minor. However, higher nucleotide substitution and abundant deletions and insertions occurred in non-coding regions of these genomes, indicating greater genomic dynamics than expected from the comparison of other related species in the Pooideae. Notably, there was no transition bias over transversion in Lemnoideae. The data should have immediate applications in evolutionary biology and plant taxonomy with increased resolution and statistical power. PMID:21931804
Molecular characterization of a distinct begomovirus species from Vernonia cinerea and its associated DNA-beta using the bacteriophage Phi 29 DNA polymerase.

PubMed

Packialakshmi, R M; Srivastava, N; Girish, K R; Usha, R

2010-08-01

Vernonia cinerea plants with yellow vein symptoms were collected around crop fields in Madurai. A portion (550 bp) of the AV1 gene amplified using degenerate primers from the total DNA purified from diseased leaf sample was cloned and sequenced. Specific primers derived from the above sequence were used to amplify 2,745 nucleotides with the typical genome organization of begomoviral DNA A (EMBL Accession No. AM182232). Sequence comparison with other begomoviruses revealed the greatest identity (82.4%) with Emilia yellow vein virus (EmYVV-[Fz1]) from China and less than 80% with all other known begomoviruses. The International Committee on Taxonomy of Viruses (ICTV) has therefore recognized Vernonia yellow vein virus (VeYVV) as a distinct begomovirus species. Conventional PCR could not amplify the DNA B or DNA beta from the diseased tissue. However, the beta DNA (1364 bp) associated with the disease was obtained (Accession No. FN435836) by the rolling circle amplification-restriction fragment length polymorphism method (RCA-RFLP) using Phi 29 DNA polymerase. Sequence analysis shows that DNA beta of VeYVV has the highest identity (56.8%) with DNA beta of Sigesbeckia yellow vein Guangxi betasatellite (SibYVGxB-[CN: Gx111:05]) and 56-53% with DNA beta associated with other begomoviruses. This is the first report of the molecular characterization of VeYVV from V. cinerea in India. The complete molecular characterization, phylogenetic analysis, and putative recombination events in VeYVV are reported.
Biochemical Characterization of a Mycobacteriophage Derived DnaB Ortholog Reveals New Insight into the Evolutionary Origin of DnaB Helicases

PubMed Central

Bhowmik, Priyanka; Das Gupta, Sujoy K.

2015-01-01

The bacterial replicative helicases known as DnaB are considered to be members of the RecA superfamily. All members of this superfamily, including DnaB, have a conserved C-terminal domain, known as the RecA core. We unearthed a series of mycobacteriophage encoded proteins in which the RecA core domain alone was present. These proteins were phylogenetically related to each other and formed a distinct clade within the RecA superfamily. A mycobacteriophage encoded protein, Wildcat Gp80, that roots deep in the DnaB family was found to possess a core domain having significant sequence homology (Expect value < 10⁻⁵) with members of this novel cluster. This indicated that Wildcat Gp80, and by extrapolation, other members of the DnaB helicase family, may have evolved from a single domain RecA core polypeptide belonging to this novel group. Biochemical investigations confirmed that Wildcat Gp80 was a helicase. Surprisingly, our investigations also revealed that a thioredoxin tagged truncated version of the protein in which the N-terminal sequences were removed was fully capable of supporting helicase activity, although its ATP dependence properties were different. DnaB helicase activity is thus primarily a function of the RecA core, although additional N-terminal sequences may be necessary for fine tuning its activity and stability. Based on sequence comparison and biochemical studies we propose that DnaB helicases may have evolved from single domain RecA core proteins having helicase activities of their own, through the incorporation of additional N-terminal sequences. PMID:26237048
Authentication of Cordyceps sinensis by DNA Analyses: Comparison of ITS Sequence Analysis and RAPD-Derived Molecular Markers.

PubMed

Lam, Kelly Y C; Chan, Gallant K L; Xin, Gui-Zhong; Xu, Hong; Ku, Chuen-Fai; Chen, Jian-Ping; Yao, Ping; Lin, Huang-Quan; Dong, Tina T X; Tsim, Karl W K

2015-12-15

Cordyceps sinensis is an endoparasitic fungus widely used as a tonic and medicinal food in the practice of traditional Chinese medicine (TCM). In historical usage, Cordyceps refers specifically to the species C. sinensis. However, a number of closely related species are also named Cordyceps, and they are commonly sold as C. sinensis. Substitutes and adulterants of C. sinensis are often introduced, either intentionally or accidentally, into the herbal market, which seriously affects the therapeutic effects or even leads to life-threatening poisoning. Here, we aim to identify Cordyceps by DNA sequencing technology. Two different DNA-based approaches were compared. The internal transcribed spacer (ITS) sequences and the random amplified polymorphic DNA (RAPD)-sequence characterized amplified region (SCAR) were developed here to authenticate different species of Cordyceps. Both approaches generally enabled discrimination of C. sinensis from others. The application of the two methods, supporting each other, increases the security of identification. For better reproducibility and faster analysis, the SCAR markers derived from the RAPD results provide a new method for quick authentication of Cordyceps.
DMINDA: an integrated web server for DNA motif identification and analyses

PubMed Central

Ma, Qin; Zhang, Hanyuan; Mao, Xizeng; Zhou, Chuan; Liu, Bingqiang; Chen, Xin; Xu, Ying

2014-01-01

DMINDA (DNA motif identification and analyses) is an integrated web server for DNA motif identification and analyses, which is accessible at http://csbl.bmb.uga.edu/DMINDA/. This web site is freely available to all users and there is no login requirement. This server provides a suite of cis-regulatory motif analysis functions on DNA sequences, which are important to elucidation of the mechanisms of transcriptional regulation: (i) de novo motif finding for a given set of promoter sequences along with statistical scores for the predicted motifs derived based on information extracted from a control set, (ii) scanning motif instances of a query motif in provided genomic sequences, (iii) motif comparison and clustering of identified motifs, and (iv) co-occurrence analyses of query motifs in given promoter sequences. The server is powered by a backend computer cluster with over 150 computing nodes, and is particularly useful for motif prediction and analyses in prokaryotic genomes. We believe that DMINDA, as a new and comprehensive web server for cis-regulatory motif finding and analyses, will benefit the genomic research community in general and prokaryotic genome researchers in particular. PMID:24753419
Benchmark Dataset for Whole Genome Sequence Compression.

PubMed

C L, Biji; S Nair, Achuthsankar

2017-01-01

Research in DNA data compression lacks a standard dataset for testing compression tools specific to DNA. This paper argues that the current state of achievement in DNA compression cannot be benchmarked in the absence of such a scientifically compiled whole genome sequence dataset and proposes a benchmark dataset using a multistage sampling procedure. Considering the genome sequences of organisms available in the National Center for Biotechnology Information (NCBI) as the universe, the proposed dataset selects 1,105 prokaryotes, 200 plasmids, 164 viruses, and 65 eukaryotes. This paper reports the results of using three established tools on the newly compiled dataset and shows that their strengths and weaknesses are evident only with a comparison based on the scientifically compiled benchmark dataset. The sample dataset and the respective links are available at https://sourceforge.net/projects/benchmarkdnacompressiondataset/.
Development and expansion of high-quality control region databases to improve forensic mtDNA evidence interpretation.

PubMed

Irwin, Jodi A; Saunier, Jessica L; Strouss, Katharine M; Sturk, Kimberly A; Diegoli, Toni M; Just, Rebecca S; Coble, Michael D; Parson, Walther; Parsons, Thomas J

2007-06-01

In an effort to increase the quantity, breadth and availability of mtDNA databases suitable for forensic comparisons, we have developed a high-throughput process to generate approximately 5000 control region sequences per year from regional US populations, global populations from which the current US population is derived and global populations currently under-represented in available forensic databases. The system utilizes robotic instrumentation for all laboratory steps from pre-extraction through sequence detection, and a rigorous eight-step, multi-laboratory data review process with entirely electronic data transfer. Over the past 3 years, nearly 10,000 control region sequences have been generated using this approach. These data are being made publicly available and should further address the need for consistent, high-quality mtDNA databases for forensic testing.
The complete chloroplast genome of Aconitum chiisanense Nakai (Ranunculaceae).

PubMed

Lim, Chae Eun; Kim, Goon-Bo; Baek, Seunghoon; Han, Su-Min; Yu, Hee-Ju; Mun, Jeong-Hwan

2017-01-01

We determined the complete chloroplast DNA sequence of Aconitum chiisanense Nakai, a rare Aconitum species endemic to Korea. The chloroplast genome is 155 934 bp in length and contains 4 rRNA, 30 tRNA, and 78 protein-coding genes. Phylogenetic analysis revealed that the chloroplast genome of A. chiisanense is closely related to that of A. barbatum var. puberulum. Sequence comparison with other Ranunculaceae chloroplasts identified a unique deletion in the rps16 gene of A. chiisanense chloroplast DNA that can serve as a molecular marker for species identification.
Time-resolved fluorescence imaging of slab gels for lifetime base-calling in DNA sequencing applications.

PubMed

Lassiter, S J; Stryjewski, W; Legendre, B L; Erdmann, R; Wahl, M; Wurm, J; Peterson, R; Middendorf, L; Soper, S A

2000-11-01

A compact time-resolved near-IR fluorescence imager was constructed to obtain lifetime and intensity images of DNA sequencing slab gels. The scanner consisted of a microscope body with f/1.2 relay optics onto which was mounted a pulsed diode laser (repetition rate 80 MHz, lasing wavelength 680 nm, average power 5 mW), filtering optics, and a large photoactive area (diameter 500 microns) single-photon avalanche diode that was actively quenched to provide a large dynamic operating range. The time-resolved data were processed using electronics configured in a conventional time-correlated single-photon-counting format with all of the counting hardware situated on a PC card resident on the computer bus. The microscope head produced a timing response of 450 ps (fwhm) in a scanning mode, allowing the measurement of subnanosecond lifetimes. The time-resolved microscope head was placed in an automated DNA sequencer and translated across a 21-cm-wide gel plate in approximately 6 s (scan rate 3.5 cm/s) with an accumulation time per pixel of 10 ms. The sampling frequency was 0.17 Hz (duty cycle 0.0017), sufficient to prevent signal aliasing during the electrophoresis separation. Software (written in Visual Basic) allowed acquisition of both the intensity image and lifetime analysis of DNA bands migrating through the gel in real time. Using a dual-labeling (IRD700 and Cy5.5 labeling dyes)/two-lane sequencing strategy, we successfully read 670 bases of a control M13mp18 ssDNA template using lifetime identification. Comparison of the reconstructed sequence with the known sequence of the phage indicated the number of miscalls was only 2, producing an error rate of approximately 0.3% (identification accuracy 99.7%). The lifetimes were calculated using maximum likelihood estimators and allowed on-line determinations with high precision, even when short integration times were used to construct the decay profiles. Comparison of the lifetime base calling to a single-dye/four-lane sequencing strategy indicated similar results in terms of miscalls, but reduced insertion and deletion errors using lifetime identification methods, improving the overall read accuracy.
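The scan and accuracy figures quoted in this abstract are mutually consistent, as a short arithmetic check suggests. This is a sketch under one assumption that the abstract does not state explicitly: each gel position is sampled once per full scan, so the sampling frequency is the reciprocal of the scan time and the duty cycle is the pixel dwell time divided by the scan time.

```python
# Values taken from the abstract.
scan_width_cm = 21.0        # gel plate width
scan_rate_cm_per_s = 3.5    # translation speed of the microscope head
pixel_dwell_s = 0.010       # accumulation time per pixel (10 ms)
bases_read = 670
miscalls = 2

# Assumption: one sample per pixel per complete scan across the gel.
scan_time_s = scan_width_cm / scan_rate_cm_per_s
sampling_hz = 1.0 / scan_time_s
duty_cycle = pixel_dwell_s / scan_time_s

error_rate = miscalls / bases_read
accuracy = 1.0 - error_rate

print(f"scan time      : {scan_time_s:.1f} s")
print(f"sampling freq. : {sampling_hz:.2f} Hz")
print(f"duty cycle     : {duty_cycle:.4f}")
print(f"error rate     : {error_rate * 100:.1f} %")
print(f"accuracy       : {accuracy * 100:.1f} %")
```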
Molecular detection of fungal pathogens in clinical specimens by 18S rDNA high-throughput screening in comparison to ITS PCR and culture.

PubMed

Wagner, K; Springer, B; Pires, V P; Keller, P M

2018-05-03

The rising incidence of invasive fungal infections and the expanding spectrum of fungal pathogens makes early and accurate identification of the causative pathogen a daunting task. Diagnostics using molecular markers enable rapid identification of fungi, offer new insights into infectious disease dynamics, and open new possibilities for infectious disease control and prevention. We performed a retrospective study using clinical specimens (N = 233) from patients with suspected fungal infection previously subjected to culture and/or internal transcribed spacer (ITS) PCR. We used these specimens to evaluate a high-throughput screening method for fungal detection using automated DNA extraction (QIASymphony), fungal ribosomal small subunit (18S) rDNA RT-PCR and amplicon sequencing. Fungal sequences were compared with sequences from the curated, commercially available SmartGene IDNS database for pathogen identification. Concordance between 18S rDNA RT-PCR and culture results was 91%, and congruence between 18S rDNA RT-PCR and ITS PCR results was 94%. In addition, 18S rDNA RT-PCR and Sanger sequencing detected fungal pathogens in culture negative (N = 13) and ITS PCR negative specimens (N = 12) from patients with a clinically confirmed fungal infection. Our results support the use of the 18S rDNA RT-PCR diagnostic workflow for rapid and accurate identification of fungal pathogens in clinical specimens.
Structure and characterization of a cDNA clone for phenylalanine ammonia-lyase from cut-injured roots of sweet potato

DOE Office of Scientific and Technical Information (OSTI.GOV)

Tanaka, Yoshiyuki; Matsuoka, Makoto; Yamanoto, Naoki

A cDNA clone for phenylalanine ammonia-lyase (PAL) induced in wounded sweet potato (Ipomoea batatas Lam.) root was obtained by immunoscreening a cDNA library. The protein produced in Escherichia coli cells containing the plasmid pPAL02 was indistinguishable from sweet potato PAL as judged by Ouchterlony double diffusion assays. The M_r of its subunit was 77,000. The cells converted (¹⁴C)-L-phenylalanine into (¹⁴C)-t-cinnamic acid and PAL activity was detected in the homogenate of the cells. The activity was dependent on the presence of the pPAL02 plasmid DNA. The nucleotide sequence of the cDNA contained a 2,121-base pair (bp) open-reading frame capable of coding for a polypeptide with 707 amino acids (M_r 77,137), a 22-bp 5′-noncoding region and a 207-bp 3′-noncoding region. The results suggest that the insert DNA fully encoded the amino acid sequence for sweet potato PAL that is induced by wounding. Comparison of the deduced amino acid sequence with that of a PAL cDNA fragment from Phaseolus vulgaris revealed 78.9% homology. The sequence from amino acid residues 258 to 494 was highly conserved, showing 90.7% homology.
Mitochondrial DNA variant at HVI region as a candidate of genetic markers of type 2 diabetes

NASA Astrophysics Data System (ADS)

Gumilar, Gun Gun; Purnamasari, Yunita; Setiadi, Rahmat

2016-02-01

Mitochondrial DNA (mtDNA) is maternally inherited, and mtDNA mutations can contribute to the excess of maternal inheritance of type 2 diabetes. Because of its high mutation rate, one of the areas of the mtDNA that is often associated with the disease is the hypervariable region I (HVI). Therefore, this study was conducted to determine the genetic variants of human mtDNA HVI that are related to type 2 diabetes in four samples taken from four generations in one lineage. The steps taken included lysis of hair follicles, amplification of the mtDNA HVI fragment using the polymerase chain reaction (PCR), detection of PCR products by agarose gel electrophoresis, measurement of the mtDNA concentration using a UV-Vis spectrophotometer, determination of the nucleotide sequence by direct sequencing, and analysis of the sequencing results using the SeqMan DNASTAR program. Comparison of the nucleotide sequences of the samples with the revised Cambridge Reference Sequence (rCRS) yielded six mutations shared by the samples: C16147T, T16189C, C16193del, T16127C, A16235G, and A16293C. After comparing the data obtained with secondary data from Mitomap and NCBI, it was found that two mutations, T16189C and T16217C, are candidate genetic markers of type 2 diabetes, even though the mutations were also found in generations not diagnosed with type 2 diabetes. The results of this study are expected to contribute to the collection of human mtDNA databases of genetic variants associated with metabolic diseases, so that in the future they can be utilized in various fields, especially in medicine.
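The comparison of a sample against a reference sequence, reported in notation such as T16189C or C16193del, can be sketched as follows. This is an illustrative helper, not the SeqMan workflow used in the study: the function name `call_variants`, the toy sequences and the start coordinate are hypothetical, the two sequences are assumed to be already aligned to equal length, and insertions relative to the reference are not handled.

```python
def call_variants(ref, sample, start):
    """List differences between a pre-aligned reference and sample.

    ref, sample : equal-length strings; '-' in the sample marks a deletion.
    start       : reference coordinate of the first aligned position.
    Substitutions are reported as <ref base><position><sample base>,
    deletions as <ref base><position>del.
    """
    if len(ref) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    variants = []
    for offset, (r, s) in enumerate(zip(ref, sample)):
        if r == s:
            continue
        pos = start + offset
        if s == "-":
            variants.append(f"{r}{pos}del")
        else:
            variants.append(f"{r}{pos}{s}")
    return variants


# Toy data (hypothetical, not real rCRS sequence).
toy_ref = "ACTGC"
toy_sample = "ACCG-"
print(call_variants(toy_ref, toy_sample, start=100))
```

Variants shared across several samples, as in the four-generation lineage described above, could then be found by intersecting the per-sample lists.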
Finding needles in haystacks: linking scientific names, reference specimens and molecular data for Fungi

Treesearch

C.L. Schoch; B. Robbertse; V. Robert; R.G. Haight; K. Kovacs; B. Leung; W. Meyer; R.H. Nilsson; K. Hughes; A.N. Miller; P.M. Kirk; K. Abarenkov; M.C. Aime; H.A. Ariyawansa; M. Bidartondo; T. Boekhout; B. Buyck; Q. Cai; J. Chen; A. Crespo; P.W. Crous; U. Damm; Z.W. De Beer; B.T.M. Dentinger; P.K. Divakar; M. Duenas; N. Feau; K. Fliegerova; M.A. Garcia; Z.-W. Ge; G.W. Griffith; J.Z. Groenewald; M. Groenewald; M. Grube; M. Gryzenhout; C. Gueidan; L. Guo; S. Hambleton; R. Hamelin; K. Hansen; V. Hofstetter; S.-B. Hong; J. Houbraken; K.D. Hyde; P. Inderbitzin; P.R. Johnston; S.C. Karunarathna; U. Koljalg; G.M. Kovacs; E. Kraichak; K. Krizsan; C.P. Kurtzman; K.-H. Larsson; S. Leavitt; P.M. Letcher; K. Liimatainen; J.-K. Liu; D.J. Lodge; J. Jennifer Luangsa-ard; H.T. Lumbsch; S.S.N. Maharachchikumbura; D. Manamgoda; M.P. Martin; A.M. Minnis; J.-M. Moncalvo; G. Mule; K.K. Nakasone; T. Niskanen; I. Olariaga; T. Papp; T. Petkovits; R. Pino-Bodas; M.J. Powell; H.A. Raja; D. Redecker; J.M. Sarmiento-Ramirez; K.A. Seifert; B. Shrestha; S. Stenroos; B. Stielow; S.-O. Suh; K. Tanaka; L. Tedersoo; M.T. Telleria; D. Udayanga; W.A. Untereiner; J. Dieguez Uribeondo; K.V. Subbarao; C. Vagvolgyi; C. Visagie; K. Voigt; D.M. Walker; B.S. Weir; M. Weiss; N.N. Wijayawardene; M.J. Wingfield; J.P. Xu; Z.L. Yang; N. Zhang; W.-Y. Zhuang; S. Federhen

2014-01-01

DNA phylogenetic comparisons have shown that morphology-based species recognition often underestimates fungal diversity. Therefore, the need for accurate DNA sequence data, tied to both correct taxonomic names and clearly annotated specimen data, has never been greater. Furthermore, the growing number of molecular ecology and microbiome projects using high-throughput...
Structural insight into the specificity of the B3 DNA-binding domains provided by the co-crystal structure of the C-terminal fragment of BfiI restriction enzyme

PubMed Central

Golovenko, Dmitrij; Manakova, Elena; Zakrys, Linas; Zaremba, Mindaugas; Sasnauskas, Giedrius; Gražulis, Saulius; Siksnys, Virginijus

2014-01-01

The B3 DNA-binding domains (DBDs) of plant transcription factors (TF) and DBDs of EcoRII and BfiI restriction endonucleases (EcoRII-N and BfiI-C) share a common structural fold, classified as the DNA-binding pseudobarrel. The B3 DBDs in the plant TFs recognize a diverse set of target sequences. The only available co-crystal structure of the B3-like DBD is that of EcoRII-N (recognition sequence 5′-CCTGG-3′). In order to understand the structural and molecular mechanisms of specificity of B3 DBDs, we have solved the crystal structure of BfiI-C (recognition sequence 5′-ACTGGG-3′) complexed with 12-bp cognate oligoduplex. Structural comparison of BfiI-C–DNA and EcoRII-N–DNA complexes reveals a conserved DNA-binding mode and a conserved pattern of interactions with the phosphodiester backbone. The determinants of the target specificity are located in the loops that emanate from the conserved structural core. The BfiI-C–DNA structure presented here expands a range of templates for modeling of the DNA-bound complexes of the B3 family of plant TFs. PMID:24423868
Isolation and characterization of full-length putative alcohol dehydrogenase genes from Polygonum minus

NASA Astrophysics Data System (ADS)

Hamid, Nur Athirah Abd; Ismail, Ismanizan

2013-11-01

Polygonum minus, locally known as Kesum, is an aromatic herb that is high in secondary metabolite content. Alcohol dehydrogenase (ADH) is an important enzyme that catalyzes the reversible oxidation of alcohol to aldehyde in the presence of NAD(P)(H) as co-factor. The main focus of this research is to identify the ADH gene. Total RNA was extracted from leaves of P. minus treated with 150 μM jasmonic acid. The full-length cDNA sequence of ADH was isolated via rapid amplification of cDNA ends (RACE). Subsequently, in silico analysis was conducted on the full-length cDNA sequence and PCR was done on genomic DNA to determine the exon and intron organization. Two sequences of ADH, designated PmADH1 and PmADH2, were successfully isolated. Both sequences have an ORF of 801 bp, which encodes 266 aa residues. Nucleotide sequence comparison of PmADH1 and PmADH2 indicated that the sequences are highly similar in the ORF region but divergent in the 3' untranslated regions (UTR). The amino acid sequences differ at residue 107: PmADH1 contains a Gly (G) residue, while PmADH2 contains a Cys (C) residue. The intron-exon organization of both sequences is also the same, with 3 introns and 4 exons. Based on in silico analysis, both sequences contain the "classical" short chain alcohol dehydrogenases/reductases ((c) SDRs) conserved domain. The results suggest that both sequences are members of the short chain alcohol dehydrogenase family.
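The relationship between the 801 bp ORF and the 266 residues it encodes follows from codon arithmetic, provided the quoted ORF length includes the stop codon. That is an assumption made here, not something the abstract states, and the helper name `orf_protein_length` is hypothetical.

```python
def orf_protein_length(orf_bp, includes_stop=True):
    """Return the number of amino acids encoded by an ORF of orf_bp bases.

    Assumes the ORF length is a whole number of codons. When the length
    includes the terminating stop codon, that codon adds no residue.
    """
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


# 801 bp = 267 codons; one of them is the stop codon under this assumption.
print(orf_protein_length(801))
```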
Application of a mitochondrial DNA control region frequency database for UK domestic cats.

PubMed

Ottolini, Barbara; Lall, Gurdeep Matharu; Sacchini, Federico; Jobling, Mark A; Wetton, Jon H

2017-03-01

DNA variation in 402 bp of the mitochondrial control region flanked by repeat sequences RS2 and RS3 was evaluated by Sanger sequencing in 152 English domestic cats, in order to determine the significance of matching DNA sequences between hairs found with a victim's body and the suspect's pet cat. Whilst 95% of English cats possessed one of the twelve globally widespread mitotypes, four new variants were observed, the most common of which (2% frequency) was shared with the evidential samples. No significant difference in mitotype frequency was seen between 32 individuals from the locality of the crime and 120 additional cats from the rest of England, suggesting a lack of local population structure. However, significant differences were observed in comparison with frequencies in other countries, including the closely neighbouring Netherlands, highlighting the importance of appropriate genetic databases when determining the evidential significance of mitochondrial DNA evidence. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Simple diazonium chemistry to develop specific gene sensing platforms.

PubMed

Revenga-Parra, M; García-Mendiola, T; González-Costas, J; González-Romero, E; Marín, A García; Pau, J L; Pariente, F; Lorenzo, E

2014-02-27

A simple strategy for covalently immobilizing DNA sequences, based on the formation of stable diazonized conducting platforms, is described. The electrochemical reduction of 4-nitrobenzenediazonium salt onto screen-printed carbon electrodes (SPCE) in aqueous media gives rise to terminal grafted amino groups. The presence of primary aromatic amines allows the formation of diazonium cations capable of reacting with the amines present on the DNA capture probe. For comparison, a second strategy based on the binding of aminated DNA capture probes to the developed diazonized conducting platforms through a crosslinking agent was also employed. The resulting DNA sensing platforms were characterized by cyclic voltammetry, electrochemical impedance spectroscopy and spectroscopic ellipsometry. The hybridization event with the complementary sequence was detected using hexaamineruthenium (III) chloride as electrochemical indicator. Finally, the platforms were applied to the analysis of a 145-bp sequence from the human gene MRP3, reaching a detection limit of 210 pg μL⁻¹. Copyright © 2014 Elsevier B.V. All rights reserved.
A molecular survey of Babesia spp. and Theileria spp. in red foxes (Vulpes vulpes) and their ticks from Thuringia, Germany.

PubMed

Najm, Nour-Addeen; Meyer-Kayser, Elisabeth; Hoffmann, Lothar; Herb, Ingrid; Fensterer, Veronika; Pfister, Kurt; Silaghi, Cornelia

2014-06-01

Wild canines, which are closely related to dogs, constitute a potential reservoir for haemoparasites by both hosting tick species that infest dogs and harbouring tick-transmitted canine haemoparasites. In this study, the prevalence of Babesia spp. and Theileria spp. was investigated in German red foxes (Vulpes vulpes) and their ticks. DNA extracts of 261 spleen samples and of 1953 ticks, comprising 4 tick species, Ixodes ricinus (n=870), I. canisuga (n=585), I. hexagonus (n=485), and Dermacentor reticulatus (n=13), were examined for the presence of Babesia/Theileria spp. by a conventional PCR targeting the 18S rRNA gene. One hundred twenty-one out of 261 foxes (46.4%) were PCR-positive. Of these, 44 samples were sequenced, and all sequences had 100% similarity to Theileria annae. Similarly, sequencing was carried out for 65 out of 118 PCR-positive ticks. Theileria annae DNA was detected in 61.5% of the sequenced samples, Babesia microti DNA was found in 9.2%, and Babesia venatorum in 7.6% of the sequenced samples. Positivity in foxes was highest in June and October, whereas the peak of tick positivity was in October. Furthermore, the positivity of the ticks was higher for I. canisuga in comparison to the other tick species and for nymphs in comparison to adults. The high prevalence of T. annae DNA in red foxes in this study suggests a reservoir function of those animals for T. annae. To our knowledge, this is the first report of T. annae in foxes from Germany as well as the first detection of T. annae and B. microti in the fox tick I. canisuga. Detection of DNA of T. annae and B. microti in three tick species collected from foxes adds new potential vectors for these two pathogens and suggests a potential role of the red fox in their natural endemic cycles. Copyright © 2014 Elsevier GmbH. All rights reserved.
Identifications of captive and wild tilapia species existing in Hawaii by mitochondrial DNA control region sequence.

PubMed

Wu, Liang; Yang, Jinzeng

2012-01-01

The tilapia family of the Cichlidae includes many fish species, which live in freshwater and saltwater environments. Several species, such as O. niloticus, O. aureus, and O. mossambicus, are excellent for aquaculture because these fish are easily reproduced and readily adapt to diverse environments. Historically, tilapia species, including O. mossambicus, S. melanotheron, and O. aureus, were introduced to Hawaii many decades ago, and the state of Hawaii uses the import permit policy to prevent O. niloticus from coming into the islands. However, hybrids produced from O. niloticus may already be present in the freshwater and marine environments of the islands. The purpose of this study was to identify tilapia species that exist in Hawaii using mitochondrial DNA analysis. In this study, we analyzed 382 samples collected from 13 farm (captive) and wild tilapia populations in Oahu and the Hawaii Islands. Comparison of intraspecies variation between the mitochondrial DNA control region (mtDNA CR) and cytochrome c oxidase I (COI) gene from five populations indicated that mtDNA CR had higher nucleotide diversity than COI. A phylogenetic tree of all sampled tilapia was generated using mtDNA CR sequences. The neighbor-joining tree analysis identified seven distinctive tilapia species: O. aureus, O. mossambicus, O. niloticus, S. melanotheron, O. urolepis, T. rendalli, and a hybrid of O. mossambicus and O. niloticus. Of all the populations examined, 10 populations consisting of O. aureus, O. mossambicus, O. urolepis, and O. niloticus from the farmed sites were relatively pure, whereas three wild populations showed some degree of introgression and hybridization. This DNA-based tilapia species identification is the first report that confirmed tilapia species identities in the wild and captive populations in Hawaii. The DNA sequence comparisons of mtDNA CR appear to be a valid method for tilapia species identification. The suspected tilapia hybrids that consist of O. niloticus are present in captive and wild populations in Hawaii.
Identifications of Captive and Wild Tilapia Species Existing in Hawaii by Mitochondrial DNA Control Region Sequence

PubMed Central

Wu, Liang; Yang, Jinzeng

2012-01-01

Background: The tilapia family of the Cichlidae includes many fish species, which live in freshwater and saltwater environments. Several species, such as O. niloticus, O. aureus, and O. mossambicus, are excellent for aquaculture because these fish are easily reproduced and readily adapt to diverse environments. Historically, tilapia species, including O. mossambicus, S. melanotheron, and O. aureus, were introduced to Hawaii many decades ago, and the state of Hawaii uses the import permit policy to prevent O. niloticus from coming into the islands. However, hybrids produced from O. niloticus may already be present in the freshwater and marine environments of the islands. The purpose of this study was to identify tilapia species that exist in Hawaii using mitochondrial DNA analysis. Methodology/Principal Findings: In this study, we analyzed 382 samples collected from 13 farm (captive) and wild tilapia populations in Oahu and the Hawaii Islands. Comparison of intraspecies variation between the mitochondrial DNA control region (mtDNA CR) and cytochrome c oxidase I (COI) gene from five populations indicated that mtDNA CR had higher nucleotide diversity than COI. A phylogenetic tree of all sampled tilapia was generated using mtDNA CR sequences. The neighbor-joining tree analysis identified seven distinctive tilapia species: O. aureus, O. mossambicus, O. niloticus, S. melanotheron, O. urolepis, T. rendalli, and a hybrid of O. mossambicus and O. niloticus. Of all the populations examined, 10 populations consisting of O. aureus, O. mossambicus, O. urolepis, and O. niloticus from the farmed sites were relatively pure, whereas three wild populations showed some degree of introgression and hybridization. Conclusions/Significance: This DNA-based tilapia species identification is the first report that confirmed tilapia species identities in the wild and captive populations in Hawaii. The DNA sequence comparisons of mtDNA CR appear to be a valid method for tilapia species identification. The suspected tilapia hybrids that consist of O. niloticus are present in captive and wild populations in Hawaii. PMID:23251613

Molecular cloning of two human liver 3 alpha-hydroxysteroid/dihydrodiol dehydrogenase isoenzymes that are identical with chlordecone reductase and bile-acid binder.

PubMed Central

Deyashiki, Y; Ogasawara, A; Nakayama, T; Nakanishi, M; Miyabe, Y; Sato, K; Hara, A

1994-01-01

Human liver contains two dihydrodiol dehydrogenases, DD2 and DD4, associated with 3 alpha-hydroxysteroid dehydrogenase activity. We have raised polyclonal antibodies that cross-reacted with the two enzymes and isolated two 1.2 kb cDNA clones (C9 and C11) for the two enzymes from a human liver cDNA library using the antibodies. The clones of C9 and C11 contained coding sequences corresponding to 306 and 321 amino acid residues respectively, but lacked 5'-coding regions around the initiation codon. Sequence analyses of several peptides obtained by enzymic and chemical cleavages of the two purified enzymes verified that the C9 and C11 clones encoded DD2 and DD4 respectively, and further indicated that the sequence of DD2 had at least 16 additional residues upstream of the N-terminal sequence deduced from the cDNA. There was 82% amino acid sequence identity between the two enzymes, indicating that the enzymes are genetic isoenzymes. A computer-based comparison of the cDNAs of the isoenzymes with the DNA sequence database revealed that the nucleotide and amino acid sequences of DD2 and DD4 are virtually identical with those of human bile-acid binder and human chlordecone reductase cDNAs respectively. PMID:8172617
Whats, hows and whys of programmed DNA elimination in Tetrahymena

PubMed Central

Noto, Tomoko

2017-01-01

Programmed genome rearrangements in ciliates provide fascinating examples of flexible epigenetic genome regulations and important insights into the interaction between transposable elements (TEs) and host genomes. DNA elimination in Tetrahymena thermophila removes approximately 12 000 internal eliminated sequences (IESs), which correspond to one-third of the genome, when the somatic macronucleus (MAC) differentiates from the germline micronucleus (MIC). More than half of the IESs, many of which show high similarity to TEs, are targeted for elimination in cis by the small RNA-mediated genome comparison of the MIC to the MAC. Other IESs are targeted for elimination in trans by the same small RNAs through repetitive sequences. Furthermore, the small RNA–heterochromatin feedback loop ensures robust DNA elimination. Here, we review an updated picture of the DNA elimination mechanism, discuss the physiological and evolutionary roles of DNA elimination, and outline the key questions that remain unanswered. PMID:29021213
Precise assignment of the heavy-strand promoter of mouse mitochondrial DNA: cognate start sites are not required for transcriptional initiation.

PubMed Central

Chang, D D; Clayton, D A

1986-01-01

Transcription of the heavy strand of mouse mitochondrial DNA starts from two closely spaced, distinct sites located in the displacement loop region of the genome. We report here an analysis of regulatory sequences required for faithful transcription from these two sites. Data obtained from in vitro assays demonstrated that a 51-base-pair region, encompassing nucleotides -40 to +11 of the downstream start site, contains sufficient information for accurate transcription from both start sites. Deletion of the 3' flanking sequences, including one or both start sites to -17, resulted in the initiation of transcription by the mitochondrial RNA polymerase from alternative sites within vector DNA sequences. This feature places the mouse heavy-strand promoter uniquely among other known mitochondrial promoters, all of which absolutely require cognate start sites for transcription. Comparison of the heavy-strand promoter with those of other vertebrate mitochondrial DNAs revealed a remarkably high rate of sequence divergence among species. PMID:3785226
The most conserved genome segments for life detection on Earth and other planets.

PubMed

Isenbarger, Thomas A; Carr, Christopher E; Johnson, Sarah Stewart; Finney, Michael; Church, George M; Gilbert, Walter; Zuber, Maria T; Ruvkun, Gary

2008-12-01

On Earth, very simple but powerful methods to detect and classify broad taxa of life by the polymerase chain reaction (PCR) are now standard practice. Using DNA primers corresponding to the 16S ribosomal RNA gene, one can survey a sample from any environment for its microbial inhabitants. Due to massive meteoritic exchange between Earth and Mars (as well as other planets), a reasonable case can be made for life on Mars or other planets to be related to life on Earth. In this case, the supremely sensitive technologies used to study life on Earth, including in extreme environments, can be applied to the search for life on other planets. Though the 16S gene has become the standard for life detection on Earth, no genome comparisons have established that the ribosomal genes are, in fact, the most conserved DNA segments across the kingdoms of life. We present here a computational comparison of full genomes from 13 diverse organisms from the Archaea, Bacteria, and Eucarya to identify genetic sequences conserved across the widest divisions of life. Our results identify the 16S and 23S ribosomal RNA genes as well as other universally conserved nucleotide sequences in genes encoding particular classes of transfer RNAs and within the nucleotide binding domains of ABC transporters as the most conserved DNA sequence segments across phylogeny. This set of sequences defines a core set of DNA regions that have changed the least over billions of years of evolution and provides a means to identify and classify divergent life, including ancestrally related life on other planets.
Conserved features of eukaryotic hsp70 genes revealed by comparison with the nucleotide sequence of human hsp70.

PubMed Central

Hunt, C; Morimoto, R I

1985-01-01

We have determined the nucleotide sequence of the human hsp70 gene and 5' flanking region. The hsp70 gene is transcribed as an uninterrupted primary transcript of 2440 nucleotides composed of a 5' noncoding leader sequence of 212 nucleotides, a 3' noncoding region of 242 nucleotides, and a continuous open reading frame of 1986 nucleotides that encodes a protein with predicted molecular mass of 69,800 daltons. Upstream of the 5' terminus are the canonical TATAAA box, the sequence ATTGG that corresponds in the inverted orientation to the CCAAT motif, and the dyad sequence CTGGAAT/ATTCCCG that shares homology in 12 of 14 positions with the consensus transcription regulatory sequence common to Drosophila heat shock genes. Comparison of the predicted amino acid sequences of human hsp70 with the published sequences of Drosophila hsp70 and Escherichia coli dnaK reveals that human hsp70 is 73% identical to Drosophila hsp70 and 47% identical to E. coli dnaK. Surprisingly, the nucleotide sequences of the human and Drosophila genes are 72% identical and human and E. coli genes are 50% identical, which is more highly conserved than necessary given the degeneracy of the genetic code. The lack of accumulated silent nucleotide substitutions leads us to propose that there may be additional information in the nucleotide sequence of the hsp70 gene or the corresponding mRNA that precludes the maximum divergence allowed in the silent codon positions. PMID:3931075
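The identity percentages quoted in this abstract (73%, 47%, 72%, 50%) are pairwise comparisons over aligned sequences. As a minimal sketch of how such a figure can be computed, assuming the two sequences are already aligned to equal length with `-` marking gaps: the function name `percent_identity` is hypothetical, and skipping gap columns is only one of several conventions, not necessarily the one the authors used.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of identical residues over the gap-free columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        # Skip any column where either sequence has a gap (assumed convention).
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared
```

For example, `percent_identity("ATGC-A", "ATGGTA")` compares five gap-free columns, four of which match. The same function works on aligned amino acid strings.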
An adaptive, object oriented strategy for base calling in DNA sequence analysis.

PubMed Central

Giddings, M C; Brumley, R L; Haker, M; Smith, L M

1993-01-01

An algorithm has been developed for the determination of nucleotide sequence from data produced in fluorescence-based automated DNA sequencing instruments employing the four-color strategy. This algorithm takes advantage of object oriented programming techniques for modularity and extensibility. The algorithm is adaptive in that data sets from a wide variety of instruments and sequencing conditions can be used with good results. Confidence values are provided on the base calls as an estimate of accuracy. The algorithm iteratively employs confidence determinations from several different modules, each of which examines a different feature of the data for accurate peak identification. Modules within this system can be added or removed for increased performance or for application to a different task. In comparisons with commercial software, the algorithm performed well. PMID:8233787
Massively parallel sequencing of forensic STRs: Considerations of the DNA commission of the International Society for Forensic Genetics (ISFG) on minimal nomenclature requirements.

PubMed

Parson, Walther; Ballard, David; Budowle, Bruce; Butler, John M; Gettings, Katherine B; Gill, Peter; Gusmão, Leonor; Hares, Douglas R; Irwin, Jodi A; King, Jonathan L; Knijff, Peter de; Morling, Niels; Prinz, Mechthild; Schneider, Peter M; Neste, Christophe Van; Willuweit, Sascha; Phillips, Christopher

2016-05-01

The DNA Commission of the International Society for Forensic Genetics (ISFG) is reviewing factors that need to be considered ahead of the adoption by the forensic community of short tandem repeat (STR) genotyping by massively parallel sequencing (MPS) technologies. MPS produces sequence data that provide a precise description of the repeat allele structure of a STR marker and variants that may reside in the flanking areas of the repeat region. When a STR contains a complex arrangement of repeat motifs, the level of genetic polymorphism revealed by the sequence data can increase substantially. As repeat structures can be complex and include substitutions, insertions, deletions, variable tandem repeat arrangements of multiple nucleotide motifs, and flanking region SNPs, established capillary electrophoresis (CE) allele descriptions must be supplemented by a new system of STR allele nomenclature, which retains backward compatibility with the CE data that currently populate national DNA databases and that will continue to be produced for the coming years. Thus, there is a pressing need to produce a standardized framework for describing complex sequences that enables comparison with currently used repeat allele nomenclature derived from conventional CE systems. It is important to discern three levels of information in hierarchical order: (i) the sequence, (ii) the alignment, and (iii) the nomenclature of STR sequence data. We propose a sequence (text) string format as the minimal requirement of data storage that laboratories should follow when adopting MPS of STRs. We further discuss the variant annotation and sequence comparison framework necessary to maintain compatibility among established and future data. This system must be easy to use and interpret by the DNA specialist, based on a universally accessible genome assembly, and in place before the uptake of MPS by the general forensic community starts to generate sequence data on a large scale. While the established nomenclature for CE-based STR analysis will remain unchanged in the future, the nomenclature of sequence-based STR genotypes will need to follow updated rules and be generated by expert systems that translate MPS sequences to match CE conventions in order to guarantee compatibility between the different generations of STR data. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Isolation and characterization of the chicken trypsinogen gene family.

PubMed Central

Wang, K; Gan, L; Lee, I; Hood, L

1995-01-01

Based on genomic Southern hybridizations and cDNA sequence analyses, the chicken trypsinogen gene family can be divided into two multi-member subfamilies, a six-member trypsinogen I subfamily which encodes the cationic trypsin isoenzymes and a three-member trypsinogen II subfamily which encodes the anionic trypsin isoenzymes. The chicken cDNA and genomic clones containing these two subfamilies were isolated and characterized by DNA sequence analysis. The results indicated that the chicken trypsinogen genes encoded a signal peptide of 15 to 16 amino acid residues, an activation peptide of 9 to 10 residues and a trypsin of 223 amino acid residues. The chicken trypsinogens contain all the common catalytic and structural features for trypsins, including the catalytic triad His, Asp and Ser and the six disulphide bonds. The trypsinogen I and II subfamilies share approximately 70% sequence identity at the nucleotide and amino acid level. The sequence comparison among chicken trypsinogen subfamily members and trypsin sequences from other species suggested that the chicken trypsinogen genes may have evolved in coincidental or concerted fashion. PMID:7733885
Evidence for Interspecies Gene Transfer in the Evolution of 2,4-Dichlorophenoxyacetic Acid Degraders

PubMed Central

McGowan, Catherine; Fulthorpe, Roberta; Wright, Alice; Tiedje, J. M.

1998-01-01

Small-subunit ribosomal DNA (SSU rDNA) from 20 phenotypically distinct strains of 2,4-dichlorophenoxyacetic acid (2,4-D)-degrading bacteria was partially sequenced, yielding 18 unique strains belonging to members of the alpha, beta, and gamma subgroups of the class Proteobacteria. To understand the origin of 2,4-D degradation in this diverse collection, the first gene in the 2,4-D pathway, tfdA, was sequenced. The sequences fell into three unique classes found in various members of the beta and gamma subgroups of Proteobacteria. None of the α-Proteobacteria yielded tfdA PCR products. A comparison of the dendrogram of the tfdA genes with that of the SSU rDNA genes demonstrated incongruency in phylogenies, and hence 2,4-D degradation must have originated from gene transfer between species. Only those strains with tfdA sequences highly similar to the tfdA sequence of strain JMP134 (tfdA class I) transferred all the 2,4-D genes and conferred the 2,4-D degradation phenotype to a Burkholderia cepacia recipient. PMID:9758850
NMR and computational methods applied to the 3-dimensional structure determination of DNA and ligand-DNA complexes in solution

NASA Astrophysics Data System (ADS)

Smith, Jarrod Anson

2D homonuclear 1H NMR methods and restrained molecular dynamics (rMD) calculations have been applied to determining the three-dimensional structures of DNA and minor groove-binding ligand-DNA complexes in solution. The structure of the DNA decamer sequence d(GCGTTAACGC)2 has been solved both with a distance-based rMD protocol and an NOE relaxation matrix backcalculation-based protocol in order to probe the relative merits of the different refinement methods. In addition, three minor groove binding ligand-DNA complexes have been examined. The solution structure of the oligosaccharide moiety of the antitumor DNA scission agent calicheamicin γ1I has been determined in complex with a decamer duplex containing its high affinity 5'-TCCT-3' binding sequence. The structure of the complex reinforces the belief that the oligosaccharide moiety is responsible for the sequence selective minor-groove binding activity of the agent, and critical intermolecular contacts are revealed. The solution structures of both the (+) and (-) enantiomers of the minor groove binding DNA alkylating agent duocarmycin SA have been determined in covalent complex with the undecamer DNA duplex d(GACTAATTGTC).d(GACAATTAGTC). The results support the proposal that the alkylation activity of the duocarmycin antitumor antibiotics is catalyzed by a binding-induced conformational change in the ligand which activates the cyclopropyl group for reaction with the DNA. Comparisons between the structures of the two enantiomers covalently bound to the same DNA sequence at the same 5'-AATTA-3' site have provided insight into the binding orientation and site selectivity, as well as the relative rates of reactivity of these two agents.
Evidence of accelerated evolution and ectodermal-specific expression of presumptive BDS toxin cDNAs from Anemonia viridis.

PubMed

Nicosia, Aldo; Maggio, Teresa; Mazzola, Salvatore; Cuttitta, Angela

2013-10-30

Anemonia viridis is a widespread and extensively studied Mediterranean species of sea anemone from which a large number of polypeptide toxins, such as blood depressing substances (BDS) peptides, have been isolated. The first members of this class, BDS-1 and BDS-2, are polypeptides belonging to the β-defensin fold family and were initially described for their antihypertensive and antiviral activities. BDS-1 and BDS-2 are 43 amino acid peptides characterised by three disulfide bonds that act as neurotoxins affecting Kv3.1, Kv3.2 and Kv3.4 channel gating kinetics. In addition, BDS-1 inactivates the Nav1.7 and Nav1.3 channels. The development of a large dataset of A. viridis expressed sequence tags (ESTs) and the identification of 13 putative BDS-like cDNA sequences have attracted interest, especially as scientific and diagnostic tools. A comparison of BDS cDNA sequences showed that the untranslated regions are more conserved than the protein-coding regions. Moreover, the KA/KS ratios calculated for all pairwise comparisons showed values greater than 1, suggesting mechanisms of accelerated evolution. The structures of the BDS homologs were predicted by molecular modelling. All toxins possess similar 3D structures that consist of a triple-stranded antiparallel β-sheet and an additional small antiparallel β-sheet located downstream of the cleavage/maturation site; however, the orientation of the triple-stranded β-sheet appears to differ among the toxins. To characterise the spatial expression profile of the putative BDS cDNA sequences, tissue-specific cDNA libraries, enriched for BDS transcripts, were constructed. In addition, the proper amplification of ectodermal or endodermal markers ensured the tissue specificity of each library. Sequencing randomly selected clones from each library revealed ectodermal-specific expression of ten BDS transcripts, while transcripts of BDS-8, BDS-13, BDS-14 and BDS-15 failed to be retrieved, likely due to under-representation in our cDNA libraries. The calculation of the relative abundance of BDS transcripts in the cDNA libraries revealed that BDS-1, BDS-3, BDS-4, BDS-5 and BDS-6 are the most represented transcripts.
Comparison of Four Human Papillomavirus Genotyping Methods: Next-generation Sequencing, INNO-LiPA, Electrochemical DNA Chip, and Nested-PCR.

PubMed

Nilyanimit, Pornjarim; Chansaenroj, Jira; Poomipak, Witthaya; Praianantathavorn, Kesmanee; Payungporn, Sunchai; Poovorawan, Yong

2018-03-01

Human papillomavirus (HPV) infection causes cervical cancer, thus necessitating early detection by screening. Rapid and accurate HPV genotyping is crucial both for the assessment of patients with HPV infection and for surveillance studies. Fifty-eight cervicovaginal samples were tested for HPV genotypes using four methods in parallel: nested-PCR followed by conventional sequencing, INNO-LiPA, electrochemical DNA chip, and next-generation sequencing (NGS). Seven HPV genotypes (16, 18, 31, 33, 45, 56, and 58) were identified by all four methods. Nineteen HPV genotypes were detected by NGS, but not by nested-PCR, INNO-LiPA, or electrochemical DNA chip. Although NGS is relatively expensive and complex, it may serve as a sensitive HPV genotyping method. Because of its highly sensitive detection of multiple HPV genotypes, NGS may serve as an alternative for diagnostic HPV genotyping in certain situations. © The Korean Society for Laboratory Medicine
Analysis of methylated patterns and quality-related genes in tobacco (Nicotiana tabacum) cultivars.

PubMed

Jiao, Junna; Jia, Yanlong; Lv, Zhuangwei; Sun, Chuanfei; Gao, Lijie; Yan, Xiaoxiao; Cui, Liusu; Tang, Zongxiang; Yan, Benju

2014-08-01

Methylation-sensitive amplified polymorphism was used in this study to investigate epigenetic information of four tobacco cultivars: Yunyan 85, NC89, K326, and Yunyan 87. The DNA fragments with methylated information were cloned by reamplified PCR and sequenced. The results of Blast alignments showed that the genes with methylation information included chitinase, nitrate reductase, chloroplast DNA, mitochondrial DNA, ornithine decarboxylase, ribulose carboxylase, and promoter sequences. Homologous comparison in three cloned gene sequences (nitrate reductase, ornithine decarboxylase, and ribulose decarboxylase) indicated that geographic factors had significant influence on the whole genome methylation. Introns also contained different information in different tobacco cultivars. These findings suggest that synthetic mechanisms for tobacco aromatic components could be affected by different environmental factors leading to variation of noncoding regions in the genome, which finally results in different fragrance and taste in different tobacco cultivars.
A 12-year molecular survey of clinical herpes simplex virus type 2 isolates demonstrates the circulation of clade A and B strains in Germany.

PubMed

Schmidt-Chanasit, Jonas; Bialonski, Alexandra; Heinemann, Patrick; Ulrich, Rainer G; Günther, Stephan; Rabenau, Holger F; Doerr, Hans Wilhelm

2010-07-01

Recently, two different herpes simplex virus type 2 (HSV-2) clades (A and B) were described based on DNA sequence data of the glycoprotein E (gE), G (gG) and I (gI) genes. The objectives of this study were to type the circulating HSV-2 wild-type strains in Germany by a novel approach and to monitor potential changes in the molecular epidemiology between 1997 and 2008. A total of 64 clinical HSV-2 isolates were analyzed by a novel approach using the DNA sequences of the complete open reading frames of glycoprotein B (gB) and gG. Recombination analysis of the gB and gG gene sequences was performed to reveal intragenic recombinants. Based on the phylogenetic analysis of the gB coding DNA sequence, 8 of 64 (12%) isolates were classified as clade A strains and 56 of 64 (88%) isolates were classified as clade B strains. Analysis of the gG coding DNA sequence classified 4 (6%) isolates as clade A strains and 60 (94%) isolates as clade B strains. In comparison, the 8 isolates classified as clade A strains using the gB sequence data were classified as clade B strains when using the gG coding DNA sequence, suggesting intergenic recombination events. Intragenic recombination events were not detected. The first molecular survey of clinical HSV-2 isolates from Germany demonstrated the circulation of clade A and B strains and of intergenic recombinants over a period of 12 years. Copyright (c) 2010 Elsevier B.V. All rights reserved.
Advances in yeast systematics and phylogeny and their use as predictors of biotechnologically important metabolic pathways

USDA-ARS's Scientific Manuscript database

Detection, identification, and classification of yeasts have undergone a major transformation in the last decade and a half following application of gene sequence analyses and genome comparisons. Development of a database (barcode) of easily determined DNA sequences from domains 1 and 2 (D1/D2) of t...
Microbial Analysis of Bite Marks by Sequence Comparison of Streptococcal DNA

PubMed Central

Kennedy, Darnell M.; Stanton, Jo-Ann L.; García, José A.; Mason, Chris; Rand, Christy J.; Kieser, Jules A.; Tompkins, Geoffrey R.

2012-01-01

Bite mark injuries often feature in violent crimes. Conventional morphometric methods for the forensic analysis of bite marks involve elements of subjective interpretation that threaten the credibility of this field. Human DNA recovered from bite marks has the highest evidentiary value, however recovery can be compromised by salivary components. This study assessed the feasibility of matching bacterial DNA sequences amplified from experimental bite marks to those obtained from the teeth responsible, with the aim of evaluating the capability of three genomic regions of streptococcal DNA to discriminate between participant samples. Bite mark and teeth swabs were collected from 16 participants. Bacterial DNA was extracted to provide the template for PCR primers specific for streptococcal 16S ribosomal RNA (16S rRNA) gene, 16S–23S intergenic spacer (ITS) and RNA polymerase beta subunit (rpoB). High throughput sequencing (GS FLX 454), followed by stringent quality filtering, generated reads from bite marks for comparison to those generated from teeth samples. For all three regions, the greatest overlaps of identical reads were between bite mark samples and the corresponding teeth samples. The average proportions of reads identical between bite mark and corresponding teeth samples were 0.31, 0.41 and 0.31, and for non-corresponding samples were 0.11, 0.20 and 0.016, for 16S rRNA, ITS and rpoB, respectively. The probabilities of correctly distinguishing matching and non-matching teeth samples were 0.92 for ITS, 0.99 for 16S rRNA and 1.0 for rpoB. These findings strongly support the tenet that bacterial DNA amplified from bite marks and teeth can provide corroborating information in the identification of assailants. PMID:23284761
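The comparison described in this abstract rests on the proportion of reads that are identical between a bite mark sample and each teeth sample. The abstract does not give the exact definition, so the following is only a sketch under a stated assumption: the proportion is taken as the fraction of distinct bite mark reads that also occur among the teeth reads. The names `shared_read_proportion` and `best_match` are hypothetical.

```python
def shared_read_proportion(bite_reads, teeth_reads):
    """Fraction of distinct bite-mark reads that also occur in the teeth sample.

    Assumed definition for illustration; the study's exact metric may differ.
    """
    bite_unique = set(bite_reads)
    if not bite_unique:
        raise ValueError("bite mark sample contains no reads")
    return len(bite_unique & set(teeth_reads)) / len(bite_unique)


def best_match(bite_reads, teeth_by_participant):
    """Return the participant whose teeth sample shares the most reads with the bite mark."""
    return max(
        teeth_by_participant,
        key=lambda pid: shared_read_proportion(bite_reads, teeth_by_participant[pid]),
    )
```

Reads are compared as exact strings here, which corresponds to the "identical reads" wording of the abstract; any quality filtering would have to happen before this step.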
Non-invasive method to obtain DNA from freshwater mussels (Bivalvia: Unionidae)

USGS Publications Warehouse

Henley, W.F.; Grobler, P.J.; Neves, R.J.

2006-01-01

To determine whether DNA could be isolated from tissues obtained by brush-swabbing the mantle, viscera and foot, mantle-clips and swabbed cells were obtained from eight Quadrula pustulosa (Lea, 1831). DNA yields from clips and swabbings were 447.0 and 975.3 ng/μL, respectively. Furthermore, comparisons of sequences from the ND-1 mitochondrial gene region showed a 100% sequence agreement of DNA from cells obtained by clips and swabs. To determine the number of swabs needed to obtain adequate yields of DNA for analyses, the viscera and feet of 5 Q. pustulosa each were successively swabbed 2, 4 and 6 times. DNA yields from the 2, 4 and 6 swabbed mussel groups were 399.4, 833.8 and 852.6 ng/μL, respectively. ND-1 sequences from the lowest yield still provided 846-901 bp for the ND-1 region. Nevertheless, to ensure adequate DNA yield from cell samples obtained by swabbing, we recommend that 4 swab-strokes of the viscera and foot be obtained. The use of integumental swabbing for collection of cells for determination of genetic relationships among freshwater mussels is noninvasive, when compared with tissue collection by mantle-clipping. Therefore, its use is recommended for freshwater mussels, especially state-protected or federally listed mussel species.
Population and forensic genetic analyses of mitochondrial DNA control region variation from six major provinces in the Korean population.

PubMed

Hong, Seung Beom; Kim, Ki Cheol; Kim, Wook

2015-07-01

We generated complete mitochondrial DNA (mtDNA) control region sequences from 704 unrelated individuals residing in six major provinces in Korea. In addition to our earlier survey of the distribution of mtDNA haplogroup variation, a total of 560 different haplotypes characterized by 271 polymorphic sites were identified, of which 473 haplotypes were unique. The gene diversity and random match probability were 0.9989 and 0.0025, respectively. According to the pairwise comparison of the 704 control region sequences, the mean number of pairwise differences between individuals was 13.47±6.06. Based on the result of mtDNA control region sequences, pairwise FST genetic distances revealed genetic homogeneity of the Korean provinces on a peninsular level, except in samples from Jeju Island. This result indicates there may be a need to formulate a local mtDNA database for Jeju Island, to avoid bias in forensic parameter estimates caused by genetic heterogeneity of the population. Thus, the present data may help not only in personal identification but also in determining maternal lineages to provide an expanded and reliable Korean mtDNA database. These data will be available on the EMPOP database via accession number EMP00661. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
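Gene diversity and random match probability, as reported in this abstract, are commonly derived from haplotype frequencies. The sketch below assumes the widely used definitions (random match probability as the sum of squared haplotype frequencies, and gene diversity as n/(n-1) times one minus that sum); the abstract does not state which formulas were applied, and the function name `forensic_parameters` is hypothetical.

```python
from collections import Counter


def forensic_parameters(haplotypes):
    """Return (gene_diversity, random_match_probability) for a list of haplotype labels."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two individuals are required")
    counts = Counter(haplotypes)
    # Sum of squared haplotype frequencies: chance that two random draws match.
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    gene_diversity = n / (n - 1) * (1.0 - sum_sq)
    return gene_diversity, sum_sq
```

Under these assumed formulas the two reported values are consistent with each other for n = 704, since (704/703) × (1 − 0.0025) ≈ 0.9989.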
Complete mitochondrial genomes of eleven extinct or possibly extinct bird species.

PubMed

Anmarkrud, Jarl A; Lifjeld, Jan T

2017-03-01

Natural history museum collections represent a vast source of ancient and historical DNA samples from extinct taxa that can be utilized by high-throughput sequencing tools to reveal novel genetic and phylogenetic information about them. Here, we report on the successful sequencing of complete mitochondrial genome sequences (mitogenomes) from eleven extinct bird species, using de novo assembly of short sequences derived from toepad samples of degraded DNA from museum specimens. For two species (the Passenger Pigeon Ectopistes migratorius and the South Island Piopio Turnagra capensis), whole mitogenomes were already available from recent studies, whereas for five others (the Great Auk Pinguinus impennis, the Imperial Woodpecker Campephilus imperialis, the Huia Heteralocha acutirostris, the Kauai Oo Moho braccatus and the South Island Kokako Callaeas cinereus), there were partial mitochondrial sequences available for comparison. For all seven species, we found sequence similarities of >98%. For the remaining four species (the Kamao Myadestes myadestinus, the Paradise Parrot Psephotellus pulcherrimus, the Ou Psittirostra psittacea and the Lesser Akialoa Akialoa obscura), there was no sequence information available for comparison, so we conducted BLAST searches and phylogenetic analyses to determine their phylogenetic positions and identify their closest extant relatives. These mitogenomes will be valuable for future analyses of avian phylogenetics and illustrate the importance of museum collections as repositories for genomics resources. © 2016 John Wiley & Sons Ltd.
DNA microarray-based genome comparison of a pathogenic and a nonpathogenic strain of Xylella fastidiosa delineates genes important for bacterial virulence.

PubMed

Koide, Tie; Zaini, Paulo A; Moreira, Leandro M; Vêncio, Ricardo Z N; Matsukuma, Adriana Y; Durham, Alan M; Teixeira, Diva C; El-Dorry, Hamza; Monteiro, Patrícia B; da Silva, Ana Claudia R; Verjovski-Almeida, Sergio; da Silva, Aline M; Gomes, Suely L

2004-08-01

Xylella fastidiosa is a phytopathogenic bacterium that causes serious diseases in a wide range of economically important crops. Despite extensive comparative analyses of genome sequences of Xylella pathogenic strains from different plant hosts, nonpathogenic strains have not been studied. In this report, we show that X. fastidiosa strain J1a12, associated with citrus variegated chlorosis (CVC), is nonpathogenic when injected into citrus and tobacco plants. Furthermore, a DNA microarray-based comparison of J1a12 with 9a5c, a CVC strain that is highly pathogenic and had its genome completely sequenced, revealed that 14 coding sequences of strain 9a5c are absent or highly divergent in strain J1a12. Among them, we found an arginase and a fimbrial adhesin precursor of type III pilus, which were confirmed to be absent in the nonpathogenic strain by PCR and DNA sequencing. The absence of arginase can be correlated to the inability of J1a12 to multiply in host plants. This enzyme has been recently shown to act as a bacterial survival mechanism by down-regulating host nitric oxide production. The lack of the adhesin precursor gene is in accordance with the less aggregated phenotype observed for J1a12 cells growing in vitro. Thus, the absence of both genes can be associated with the failure of the J1a12 strain to establish and spread in citrus and tobacco plants. These results provide the first detailed comparison between a nonpathogenic strain and a pathogenic strain of X. fastidiosa, constituting an important step towards understanding the molecular basis of the disease.

DNA Microarray-Based Genome Comparison of a Pathogenic and a Nonpathogenic Strain of Xylella fastidiosa Delineates Genes Important for Bacterial Virulence†

PubMed Central

Koide, Tie; Zaini, Paulo A.; Moreira, Leandro M.; Vêncio, Ricardo Z. N.; Matsukuma, Adriana Y.; Durham, Alan M.; Teixeira, Diva C.; El-Dorry, Hamza; Monteiro, Patrícia B.; da Silva, Ana Claudia R.; Verjovski-Almeida, Sergio; da Silva, Aline M.; Gomes, Suely L.

2004-01-01

Xylella fastidiosa is a phytopathogenic bacterium that causes serious diseases in a wide range of economically important crops. Despite extensive comparative analyses of genome sequences of Xylella pathogenic strains from different plant hosts, nonpathogenic strains have not been studied. In this report, we show that X. fastidiosa strain J1a12, associated with citrus variegated chlorosis (CVC), is nonpathogenic when injected into citrus and tobacco plants. Furthermore, a DNA microarray-based comparison of J1a12 with 9a5c, a CVC strain that is highly pathogenic and had its genome completely sequenced, revealed that 14 coding sequences of strain 9a5c are absent or highly divergent in strain J1a12. Among them, we found an arginase and a fimbrial adhesin precursor of type III pilus, which were confirmed to be absent in the nonpathogenic strain by PCR and DNA sequencing. The absence of arginase can be correlated to the inability of J1a12 to multiply in host plants. This enzyme has been recently shown to act as a bacterial survival mechanism by down-regulating host nitric oxide production. The lack of the adhesin precursor gene is in accordance with the less aggregated phenotype observed for J1a12 cells growing in vitro. Thus, the absence of both genes can be associated with the failure of the J1a12 strain to establish and spread in citrus and tobacco plants. These results provide the first detailed comparison between a nonpathogenic strain and a pathogenic strain of X. fastidiosa, constituting an important step towards understanding the molecular basis of the disease. PMID:15292146
Nucleotide sequence and regulatory studies of VGF, a nervous system-specific mRNA that is rapidly and relatively selectively induced by nerve growth factor.

PubMed

Salton, S R

1991-09-01

A nervous system-specific mRNA that is rapidly induced in PC12 cells to a greater extent by nerve growth factor (NGF) than by epidermal growth factor treatment has been cloned. The polypeptide deduced from the nucleic acid sequence of the NGF33.1 cDNA clone contains regions of amino acid sequence identity with that predicted by the cDNA clone VGF, and further analysis suggests that both NGF33.1 and VGF cDNA clones very likely correspond to the same mRNA (VGF). In this report both the nucleic acid sequence that corresponds to VGF mRNA and the polypeptide predicted by the NGF33.1 cDNA clone are presented. Genomic Southern analysis and database comparison did not detect additional sequences with high homology to the VGF gene. Induction of VGF mRNA by depolarization and phorbol 12-myristate 13-acetate treatment was greater than by serum stimulation or protein kinase A pathway activation. These studies suggest that VGF mRNA is induced to the greatest extent by NGF treatment and that VGF is one of the most rapidly regulated neuronal mRNAs identified in PC12 cells.
Birth and death of genes linked to chromosomal inversion

PubMed Central

Furuta, Yoshikazu; Kawai, Mikihiko; Yahara, Koji; Takahashi, Noriko; Handa, Naofumi; Tsuru, Takeshi; Oshima, Kenshiro; Yoshida, Masaru; Azuma, Takeshi; Hattori, Masahira; Uchiyama, Ikuo; Kobayashi, Ichizo

2011-01-01

The birth and death of genes is central to adaptive evolution, yet the underlying genome dynamics remain elusive. The availability of closely related complete genome sequences helps to follow changes in gene contents and clarify their relationship to overall genome organization. Helicobacter pylori, bacteria in our stomach, are known for their extreme genome plasticity through mutation and recombination and will make a good target for such an analysis. In comparing their complete genome sequences, we found that gain and loss of genes (loci) for outer membrane proteins, which mediate host interaction, occurred at breakpoints of chromosomal inversions. Sequence comparison there revealed a unique mechanism of DNA duplication: DNA duplication associated with inversion. In this process, a DNA segment at one chromosomal locus is copied and inserted, in an inverted orientation, into a distant locus on the same chromosome, while the entire region between these two loci is also inverted. Recognition of this and three more inversion modes, which occur through reciprocal recombination between long or short sequence similarity or adjacent to a mobile element, allowed reconstruction of synteny evolution through inversion events in this species. These results will guide the interpretation of extensive DNA sequencing results for understanding long- and short-term genome evolution in various organisms and in cancer cells. PMID:21212362
Diepoxybutane Interstrand Cross-Links Induce DNA Bending

PubMed Central

Millard, Julie T.; McGowan, Erin E.; Bradley, Sharonda Q.

2011-01-01

The bifunctional alkylating agent 1,2,3,4-diepoxybutane (DEB) is thought to be a major contributor to the carcinogenicity of 1,3-butadiene, from which it is derived in vivo. DEB forms DNA interstrand cross-links primarily between distal deoxyguanosine residues at the duplex sequence 5’-GNC. In order for the short butanediol tether to span this distance, distortion of the DNA target has been postulated. We determined that the electrophoretic mobility of ligated DNA oligomers containing DEB cross-links was retarded in comparison with control, uncross-linked DNA. Our data are consistent with DNA bending of ~34° per lesion towards the major groove. PMID:21839139
Phylum- and Class-Specific PCR Primers for General Microbial Community Analysis

PubMed Central

Blackwood, Christopher B.; Oaks, Adam; Buyer, Jeffrey S.

2005-01-01

Amplification of a particular DNA fragment from a mixture of organisms by PCR is a common first step in methods of examining microbial community structure. The use of group-specific primers in community DNA profiling applications can provide enhanced sensitivity and phylogenetic detail compared to domain-specific primers. Other uses for group-specific primers include quantitative PCR and library screening. The purpose of the present study was to develop several primer sets targeting commonly occurring and important groups. Primers specific for the 16S ribosomal sequences of Alphaproteobacteria, Betaproteobacteria, Bacilli, Actinobacteria, and Planctomycetes and for parts of both the 18S ribosomal sequence and the internal transcribed spacer region of Basidiomycota were examined. Primers were tested by comparison to sequences in the ARB 2003 database, and chosen primers were further tested by cloning and sequencing from soil community DNA. Eighty-five to 100% of the sequences obtained from clone libraries were found to be placed with the groups intended as targets, demonstrating the specificity of the primers under field conditions. It will be important to reevaluate primers over time because of the continual growth of sequence databases and revision of microbial taxonomy. PMID:16204538
Cloud-based adaptive exon prediction for DNA analysis

PubMed Central

Putluri, Srinivasareddy; Fathima, Shaik Yasmeen

2018-01-01

Cloud computing offers significant research and economic benefits to healthcare organisations. Cloud services provide a safe place for storing and managing large amounts of such sensitive data. Under the conventional flow of gene information, gene sequence laboratories send out raw and inferred information via the Internet to several sequence libraries. DNA sequencing storage costs will be minimised by use of a cloud service. In this study, the authors put forward a novel genomic informatics system using Amazon Cloud Services, where genomic sequence information is stored and accessed for processing. True identification of exon regions in a DNA sequence is a key task in bioinformatics, which helps in disease identification and drug design. The three-base periodicity property of exons forms the basis of all exon identification techniques. Adaptive signal processing techniques are found to be promising in comparison with several other methods. Several adaptive exon predictors (AEPs) are developed using the variable normalised least mean square algorithm and its maximum normalised variants to reduce computational complexity. Finally, performance evaluation of the various AEPs is done based on measures such as sensitivity, specificity and precision, using various standard genomic datasets taken from the National Center for Biotechnology Information genomic sequence database. PMID:29515813
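
The three-base periodicity property mentioned above can be illustrated with a minimal sketch. This is not the adaptive (normalised least mean square) predictor described in the abstract; it is the simpler fixed Fourier measure that such predictors are usually compared against. The function names, the normalisation by window length, and the default window and step values are all assumptions chosen for illustration.

```python
import cmath


def period3_power(seq):
    """Summed spectral power at period 3 over the four base-indicator sequences.

    For each base, the indicator sequence is 1 where that base occurs and 0
    elsewhere; its Fourier component at period 3 is sum(exp(-2*pi*i*pos/3)).
    The result is divided by the sequence length (an assumed normalisation).
    """
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("empty sequence")
    total = 0.0
    for base in "ACGT":
        coeff = sum(cmath.exp(-2j * cmath.pi * pos / 3)
                    for pos, ch in enumerate(seq) if ch == base)
        total += abs(coeff) ** 2
    return total / n


def sliding_period3(seq, window=99, step=3):
    """Return (start, power) pairs for windows slid along the sequence."""
    return [(start, period3_power(seq[start:start + window]))
            for start in range(0, len(seq) - window + 1, step)]


# Hypothetical toy inputs: a perfectly codon-periodic string and a homopolymer.
periodic = period3_power("ATG" * 30)
flat = period3_power("A" * 90)
print(periodic, flat)
```

A strongly periodic string scores high while a homopolymer scores near zero; in a scan along a real sequence, windows with elevated power would be candidate coding regions, subject to whatever threshold the analyst chooses.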
Magnetic bead purification of labeled DNA fragments for high-throughput capillary electrophoresis sequencing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Elkin, Christopher; Kapur, Hitesh; Smith, Troy

2001-09-15

We have developed an automated purification method for terminator sequencing products based on a magnetic bead technology. This 384-well protocol generates labeled DNA fragments that are essentially free of contaminants for less than $0.005 per reaction. In comparison to laborious ethanol precipitation protocols, this method increases the phred20 read length by forty bases with various DNA templates such as PCR fragments, plasmids, cosmids and RCA products. Our method eliminates centrifugation and is compatible with both the MegaBACE 1000 and ABI Prism 3700 capillary instruments. As of September 2001, this method has produced over 1.6 million samples with 93 percent averaging 620 phred20 bases as part of the Joint Genome Institute's production process.
New progress in snake mitochondrial gene rearrangement.

PubMed

Chen, Nian; Zhao, Shujin

2009-08-01

To further understand the evolution of snake mitochondrial genomes, the complete mitochondrial DNA (mtDNA) sequences were determined for representative species from two snake families: the Many-banded krait, the Banded krait, the Chinese cobra, the King cobra, the Hundred-pace viper, the Short-tailed mamushi, and the Chain viper. Thirteen protein-coding genes, 22-23 tRNA genes, 2 rRNA genes, and 2 control regions were identified in these mtDNAs. Duplication of the control region and translocation of the tRNAPro gene were two notable features of the snake mtDNAs. These results from the gene rearrangement comparisons confirm the correctness of traditional classification schemes and validate the utility of comparing complete mtDNA sequences for snake phylogeny reconstruction.
How close is close: 16S rRNA sequence identity may not be sufficient to guarantee species identity

NASA Technical Reports Server (NTRS)

Fox, G. E.; Wisotzkey, J. D.; Jurtshuk, P. Jr

1992-01-01

16S rRNA (genes coding for rRNA) sequence comparisons were conducted with the following three psychrophilic strains: Bacillus globisporus W25T (T = type strain) and Bacillus psychrophilus W16AT, and W5. These strains exhibited more than 99.5% sequence identity and within experimental uncertainty could be regarded as identical. Their close taxonomic relationship was further documented by phenotypic similarities. In contrast, previously published DNA-DNA hybridization results have convincingly established that these strains do not belong to the same species if current standards are used. These results emphasize the important point that effective identity of 16S rRNA sequences is not necessarily a sufficient criterion to guarantee species identity. Thus, although 16S rRNA sequences can be used routinely to distinguish and establish relationships between genera and well-resolved species, very recently diverged species may not be recognizable.
Shotgun Optical Maps of the Whole Escherichia coli O157:H7 Genome

PubMed Central

Lim, Alex; Dimalanta, Eileen T.; Potamousis, Konstantinos D.; Yen, Galex; Apodoca, Jennifer; Tao, Chunhong; Lin, Jieyi; Qi, Rong; Skiadas, John; Ramanathan, Arvind; Perna, Nicole T.; Plunkett, Guy; Burland, Valerie; Mau, Bob; Hackett, Jeremiah; Blattner, Frederick R.; Anantharaman, Thomas S.; Mishra, Bhubaneswar; Schwartz, David C.

2001-01-01

We have constructed NheI and XhoI optical maps of Escherichia coli O157:H7 solely from genomic DNA molecules to provide a uniquely valuable scaffold for contig closure and sequence validation. E. coli O157:H7 is a common pathogen found in contaminated food and water. Our approach obviated the need for the analysis of clones, PCR products, and hybridizations, because maps were constructed from ensembles of single DNA molecules. Shotgun sequencing of bacterial genomes remains labor-intensive, despite advances in sequencing technology. This is partly due to manual intervention required during the last stages of finishing. The applicability of optical mapping to this problem was enhanced by advances in machine vision techniques that improved mapping throughput and created a path to full automation of mapping. Comparisons were made between maps and sequence data that characterized sequence gaps and guided nascent assemblies. PMID:11544203
Correlation of Local Effects of DNA Sequence and Position of Beta-Alanine Inserts with Polyamide-DNA Complex Binding Affinities and Kinetics

PubMed Central

Wang, Shuo; Nanjunda, Rupesh; Aston, Karl; Bashkin, James K.; Wilson, W. David

2012-01-01

In order to better understand the effects of β-alanine (β) substitution and the number of heterocycles on DNA binding affinity and selectivity, the interactions of an eight-ring hairpin polyamide (PA) and two β derivatives as well as a six-heterocycle analog have been investigated with their cognate DNA sequence, 5′-TGGCTT-3′. Binding selectivity and the effects of β have been investigated with the cognate and five mutant DNAs. A set of powerful and complementary methods have been employed for both energetic and structural evaluations: UV-melting, biosensor-surface plasmon resonance, isothermal titration calorimetry, circular dichroism and a DNA ligation ladder global structure assay. The reduced number of heterocycles in the six-ring PA weakens the binding affinity; however, the smaller PA aggregates significantly less than the larger PAs, and allows us to obtain the binding thermodynamics. The PA-DNA binding enthalpy is large and negative with a large negative ΔCp, and is the primary driving component of the Gibbs free energy. The complete SPR binding results clearly show that β substitutions can substantially weaken the binding affinity of hairpin PAs in a position-dependent manner. More importantly, the changes in PA binding to the mutant DNAs further confirm the position-dependent effects on PA-DNA interaction affinity. Comparison of mutant DNA sequences also shows a different effect in recognition of T•A versus A•T base pairs. The effects of DNA mutations on binding of a single PA as well as the effects of the position of β substitution on binding tell a clear and very important story about sequence dependent binding of PAs to DNA. PMID:23167504
Complete nucleotide sequences of a new bipartite begomovirus from Malvastrum sp. plants with bright yellow mosaic symptoms in South Texas.

PubMed

Alabi, Olufemi J; Villegas, Cecilia; Gregg, Lori; Murray, K Daniel

2016-06-01

Two isolates of a novel bipartite begomovirus, tentatively named malvastrum bright yellow mosaic virus (MaBYMV), were molecularly characterized from naturally infected plants of the genus Malvastrum showing bright yellow mosaic disease symptoms in South Texas. Six complete DNA-A and five DNA-B genome sequences of MaBYMV obtained from the isolates ranged in length from 2,608 to 2,609 nucleotides (nt) and 2,578 to 2,605 nt, respectively. Both genome segments shared a 178- to 180-nt common region. In pairwise comparisons, the complete DNA-A and DNA-B sequences of MaBYMV were most similar (87-88 % and 79-81 % identity, respectively) and phylogenetically related to the corresponding sequences of sida mosaic Sinaloa virus-[MX-Gua-06]. Further analysis revealed that MaBYMV is a putative recombinant virus, thus supporting the notion that malvaceous hosts may be influencing the evolution of several begomoviruses. The design of new diagnostic primers enabled the detection of MaBYMV in cohorts of Bemisia tabaci collected from symptomatic Malvastrum sp. plants, thus implicating whiteflies as potential vectors of the virus.
DMINDA: an integrated web server for DNA motif identification and analyses.

PubMed

Ma, Qin; Zhang, Hanyuan; Mao, Xizeng; Zhou, Chuan; Liu, Bingqiang; Chen, Xin; Xu, Ying

2014-07-01

DMINDA (DNA motif identification and analyses) is an integrated web server for DNA motif identification and analyses, which is accessible at http://csbl.bmb.uga.edu/DMINDA/. This web site is freely available to all users and there is no login requirement. This server provides a suite of cis-regulatory motif analysis functions on DNA sequences, which are important to elucidation of the mechanisms of transcriptional regulation: (i) de novo motif finding for a given set of promoter sequences along with statistical scores for the predicted motifs derived based on information extracted from a control set, (ii) scanning motif instances of a query motif in provided genomic sequences, (iii) motif comparison and clustering of identified motifs, and (iv) co-occurrence analyses of query motifs in given promoter sequences. The server is powered by a backend computer cluster with over 150 computing nodes, and is particularly useful for motif prediction and analyses in prokaryotic genomes. We believe that DMINDA, as a new and comprehensive web server for cis-regulatory motif finding and analyses, will benefit the genomic research community in general and prokaryotic genome researchers in particular. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Bacterial community composition in different sediments from the Eastern Mediterranean Sea: a comparison of four 16S ribosomal DNA clone libraries.

PubMed

Polymenakou, Paraskevi N; Bertilsson, Stefan; Tselepides, Anastasios; Stephanou, Euripides G

2005-10-01

The regional variability of sediment bacterial community composition and diversity was studied by comparative analysis of four large 16S ribosomal DNA (rDNA) clone libraries from sediments in different regions of the Eastern Mediterranean Sea (Thermaikos Gulf, Cretan Sea, and South Ionian Sea). Amplified rDNA restriction analysis of 664 clones from the libraries indicate that the rDNA richness and evenness was high: for example, a near-1:1 relationship among screened clones and number of unique restriction patterns when up to 190 clones were screened for each library. Phylogenetic analysis of 207 bacterial 16S rDNA sequences from the sediment libraries demonstrated that Gamma-, Delta-, and Alphaproteobacteria, Holophaga/Acidobacteria, Planctomycetales, Actinobacteria, Bacteroidetes, and Verrucomicrobia were represented in all four libraries. A few clones also grouped with the Betaproteobacteria, Nitrospirae, Spirochaetales, Chlamydiae, Firmicutes, and candidate division OP11. The abundance of sequences affiliated with Gammaproteobacteria was higher in libraries from shallow sediments in the Thermaikos Gulf (30 m) and the Cretan Sea (100 m) compared to the deeper South Ionian station (2790 m). Most sequences in the four sediment libraries clustered with uncultured 16S rDNA phylotypes from marine habitats, and many of the closest matches were clones from hydrocarbon seeps, benzene-mineralizing consortia, sulfate reducers, sulfur oxidizers, and ammonia oxidizers. LIBSHUFF statistics of 16S rDNA gene sequences from the four libraries revealed major differences, indicating either a very high richness in the sediment bacterial communities or considerable variability in bacterial community composition among regions, or both.
Preselection of EGFR mutations in non-small-cell lung cancer patients by immunohistochemistry: comparison with DNA-sequencing, EGFR wild-type expression, gene copy number gain and clinicopathological data.

PubMed

Gaber, Rania; Watermann, Iris; Kugler, Christian; Vollmer, Ekkehard; Perner, Sven; Reck, Martin; Goldmann, Torsten

2017-01-01

Targeting epidermal growth factor receptor (EGFR) in patients with non-small-cell lung cancer (NSCLC) having EGFR mutations is associated with an improved overall survival. The aim of this study is to verify whether detection of EGFR mutations by immunohistochemistry (IHC) is a convincing way to preselect patients for DNA-sequencing, and to determine the statistical association between EGFR mutation, wild-type EGFR overexpression and gene copy number gain, which are the main factors inducing EGFR tumorigenic activity, and the clinicopathological data. Two hundred sixteen tumor tissue samples of primarily chemotherapeutic naïve NSCLC patients were analyzed for EGFR mutations E746-A750del and L858R and correlated with DNA-sequencing. Two hundred six of these were assessed by IHC, using 6B6 and 43B2 specific antibodies, followed by DNA-sequencing of positive cases, and 10 already genotyped tumor tissues were also included to investigate debugging accuracy of IHC. In addition, EGFR wild-type overexpression was evaluated by IHC and EGFR gene copy number determination was performed by fluorescence in situ hybridization (FISH). Forty-one of 206 (19.9%) cases were positive for mutated EGFR by IHC. Eight of them had EGFR mutations of exons 18-21 by DNA-sequencing. The hit rate for the 10 already genotyped NSCLC mutated cases was 90% by IHC. A positive association was found between EGFR mutations determined by IHC and both EGFR overexpression and increased gene copy number (p=0.002 and p<0.001, respectively). Additionally, a positive association was detected between EGFR mutations, high tumor grade and clinical stage (p<0.001). IHC staining with mutation specific antibodies was demonstrated as a possible useful screening test to preselect patients for DNA-sequencing.
Determination of fetal DNA fraction from the plasma of pregnant women using sequence read counts.

PubMed

Kim, Sung K; Hannum, Gregory; Geis, Jennifer; Tynan, John; Hogg, Grant; Zhao, Chen; Jensen, Taylor J; Mazloom, Amin R; Oeth, Paul; Ehrich, Mathias; van den Boom, Dirk; Deciu, Cosmin

2015-08-01

This study introduces a novel method, referred to as SeqFF, for estimating the fetal DNA fraction in the plasma of pregnant women, and infers the underlying mechanism that allows for such statistical modeling. Autosomal regional read counts from whole-genome massively parallel single-end sequencing of circulating cell-free DNA (ccfDNA) from the plasma of 25 312 pregnant women were used to train a multivariate model. The pretrained model was then applied to 505 pregnant samples to assess the performance of SeqFF against known methodologies for fetal DNA fraction calculations. Pearson's correlation between chromosome Y and SeqFF for pregnancies with male fetuses from two independent cohorts ranged from 0.932 to 0.938. Comparison between a single-nucleotide polymorphism-based approach and SeqFF yielded a Pearson's correlation of 0.921. Paired-end sequencing suggests that shorter ccfDNA, that is, less than 150 bp in length, is nonuniformly distributed across the genome. Regions exhibiting an increased proportion of short ccfDNA, which are more likely of fetal origin, tend to provide more information in the SeqFF calculations. SeqFF is a robust and direct method to determine fetal DNA fraction. Furthermore, the method is applicable to both male and female pregnancies and can greatly improve the accuracy of noninvasive prenatal testing for fetal copy number variation. © 2015 John Wiley & Sons, Ltd.
Studies of Xenopus laevis mitochondrial DNA: D-loop mapping and characterization of DNA-binding proteins

DOE Office of Scientific and Technical Information (OSTI.GOV)

Cairns, S.S.

1987-01-01

In X. laevis oocytes, mitochondrial DNA accumulates to 10^5 times the somatic cell complement, and is characterized by a high frequency of a triple-stranded displacement loop structure at the origin of replication. To map the termini of the single strands, it was necessary to correct the nucleotide sequence of the D-loop region. The revised sequence of 2458 nucleotides contains 54 discrepancies in comparison to a previously published sequence. Radiolabeling of the nascent strands of the D-loop structure either at the 5' end or at the 3' end identifies a major species with a length of 1670 nucleotides. Cleavage of the 5' labeled strands reveals two families of ends located near several matches to an element, designated CSB-1, that is conserved in this location in several vertebrate genomes. Cleavage of 3' labeled strands produced one fragment. The unique 3' end maps to about 15 nucleotides preceding the tRNA^Pro gene. A search for proteins which may bind to mtDNA in this region to regulate nucleic acid synthesis has identified three activities in lysates of X. laevis mitochondria. The DNA-binding proteins were assayed by monitoring their ability to retard the migration of labeled double- or single-stranded DNA fragments in polyacrylamide gels. The DNA binding preference was determined by competition with an excess of either ds- or ssDNA.
Human Contamination in Public Genome Assemblies.

PubMed

Kryukov, Kirill; Imanishi, Tadashi

2016-01-01

Contamination in a genome assembly can lead to wrong or confusing results when that genome is used as a reference in sequence comparison. Although bacterial contamination is well known, the problem of human-originated contamination has received little attention. In this study we surveyed 45,735 available genome assemblies for evidence of human contamination. We used lineage specificity to distinguish between contamination and conservation. We found that 154 genome assemblies contain fragments that, with high confidence, originate from human DNA contamination. The majority of the contaminating human sequences have been present in the reference human genome assembly for over a decade. We recommend that existing contaminated genomes should be revised to remove contaminated sequence, and that new assemblies should be thoroughly checked for the presence of human DNA before they are submitted to public databases.
Details of the evolutionary history from invertebrates to vertebrates, as deduced from the sequences of 18S rDNA.

PubMed Central

Wada, H; Satoh, N

1994-01-01

Almost the entire sequences of 18S rDNA were determined for two chaetognaths, five echinoderms, a hemichordate, and two urochordates (a larvacean and a salp). Phylogenetic comparisons of the sequences, together with those of other deuterostomes (an ascidian, a cephalochordate, and vertebrates) and protostomes (an arthropod and a mollusc), suggest the monophyly of the deuterostomes, with the exception of the chaetognaths. Chaetognaths may not be a group of deuterostomes. The deuterostome group closest to vertebrates was the group of cephalochordates. Ascidians, larvaceans, and salps seem to form a discrete group (urochordates), in which the early divergence of larvaceans is evident. These results support the hypothesis that chordates evolved from free-living ancestors. PMID:8127885
Target Site Recognition by a Diversity-Generating Retroelement

PubMed Central

Guo, Huatao; Tse, Longping V.; Nieh, Angela W.; Czornyj, Elizabeth; Williams, Steven; Oukil, Sabrina; Liu, Vincent B.; Miller, Jeff F.

2011-01-01

Diversity-generating retroelements (DGRs) are in vivo sequence diversification machines that are widely distributed in bacterial, phage, and plasmid genomes. They function to introduce vast amounts of targeted diversity into protein-encoding DNA sequences via mutagenic homing. Adenine residues are converted to random nucleotides in a retrotransposition process from a donor template repeat (TR) to a recipient variable repeat (VR). Using the Bordetella bacteriophage BPP-1 element as a prototype, we have characterized requirements for DGR target site function. Although sequences upstream of VR are dispensable, a 24 bp sequence immediately downstream of VR, which contains short inverted repeats, is required for efficient retrohoming. The inverted repeats form a hairpin or cruciform structure and mutational analysis demonstrated that, while the structure of the stem is important, its sequence can vary. In contrast, the loop has a sequence-dependent function. Structure-specific nuclease digestion confirmed the existence of a DNA hairpin/cruciform, and marker coconversion assays demonstrated that it influences the efficiency, but not the site of cDNA integration. Comparisons with other phage DGRs suggested that similar structures are a conserved feature of target sequences. Using a kanamycin resistance determinant as a reporter, we found that transplantation of the IMH and hairpin/cruciform-forming region was sufficient to target the DGR diversification machinery to a heterologous gene. In addition to furthering our understanding of DGR retrohoming, our results suggest that DGRs may provide unique tools for directed protein evolution via in vivo DNA diversification. PMID:22194701
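
The idea that short inverted repeats can fold into a hairpin can be illustrated with a minimal sketch that scans a sequence for a stem followed, after a short loop, by its reverse complement. It assumes perfect base pairing and fixed stem and loop sizes; the function names, parameter defaults and toy sequence are hypothetical and are not taken from the BPP-1 element. Because the abstract reports that the stem's structure matters more than its sequence, a realistic search would also have to tolerate mismatches, which this sketch does not.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_hairpins(seq, stem=4, min_loop=3, max_loop=8):
    """List (start, stem_sequence, loop_sequence) for perfect inverted repeats.

    A hit is a stem of length `stem` that is followed, after a loop of
    min_loop..max_loop bases, by the stem's exact reverse complement.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * stem - min_loop + 1):
        left = seq[i:i + stem]
        target = reverse_complement(left)
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop  # start of the candidate right arm
            if seq[j:j + stem] == target:
                hits.append((i, left, seq[i + stem:j]))
    return hits


# Hypothetical toy sequence: GACC ... TTTT loop ... GGTC (reverse complement).
hits = find_hairpins("AAGACCTTTTGGTCAA")
print(hits)
```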

False positives complicate ancient pathogen identifications using high-throughput shotgun sequencing

PubMed Central

2014-01-01

Background Identification of historic pathogens is challenging since false positives and negatives are a serious risk. Environmental non-pathogenic contaminants are ubiquitous. Furthermore, public genetic databases contain limited information regarding these species. High-throughput sequencing may help reliably detect and identify historic pathogens. Results We shotgun-sequenced 8 16th-century Mixtec individuals from the site of Teposcolula Yucundaa (Oaxaca, Mexico) who are reported to have died from the huey cocoliztli (‘Great Pestilence’ in Nahuatl), an unknown disease that decimated native Mexican populations during the Spanish colonial period, in order to identify the pathogen. Comparison of these sequences with those deriving from the surrounding soil and from 4 precontact individuals from the site found a wide variety of contaminant organisms that confounded analyses. Without the comparative sequence data from the precontact individuals and soil, false positives for Yersinia pestis and rickettsiosis could have been reported. Conclusions False positives and negatives remain problematic in ancient DNA analyses despite the application of high-throughput sequencing. Our results suggest that several studies claiming the discovery of ancient pathogens may need further verification. Additionally, true single molecule sequencing’s short read lengths, inability to sequence through DNA lesions, and limited ancient-DNA-specific technical development hinder its application to palaeopathology. PMID:24568097
Isolation of a gammaherpesvirus similar to asinine herpesvirus-2 (AHV-2) from a mule and a survey of mules and donkeys for AHV-2 infection by real-time PCR.

PubMed

Bell, Stephanie A; Pusterla, Nicola; Balasuriya, Udeni B R; Mapes, Samantha M; Nyberg, Nicole L; MacLachlan, N James

2008-07-27

Equids are commonly infected by herpesviruses, but isolation of herpesviruses from mules has apparently not been previously reported. Furthermore, the genomic relationships among the various equid herpesviruses are poorly characterized. We describe the isolation and preliminary characterization of a mule gammaherpesvirus tentatively identified as asinine herpesvirus-2 (AHV-2; also designated equid herpesvirus-7 (EHV-7)) from the nasal secretions (NS) of a healthy mule in northern California. The virus was initially identified by transmission electron microscopic examination of lysates of cell culture inoculated with NS collected from the mule. A 913 nucleotide sequence of the DNA polymerase gene was amplified using degenerate primers, and comparison of this sequence with those of various other herpesviruses showed that the mule herpesvirus was most closely related to EHV-2 (AHV-2 sequences were not available for comparison). The sequence of a shorter portion (166 nucleotides) of the mule herpesvirus DNA polymerase gene was identical to that of the published sequence of an asinine gammaherpesvirus, previously designated as AHV-4-3 (AY054992). AHV-2 was detected by real-time polymerase chain reaction assay in the NS of approximately 8% of a cohort of 114 healthy mules and 13 donkeys.
The sequence of camelpox virus shows it is most closely related to variola virus, the cause of smallpox.

PubMed

Gubser, Caroline; Smith, Geoffrey L

2002-04-01

Camelpox virus (CMPV) and variola virus (VAR) are orthopoxviruses (OPVs) that share several biological features and cause high mortality and morbidity in their single host species. The sequence of a virulent CMPV strain was determined; it is 202182 bp long, with inverted terminal repeats (ITRs) of 6045 bp and has 206 predicted open reading frames (ORFs). As for other poxviruses, the genes are tightly packed with little non-coding sequence. Most genes within 25 kb of each terminus are transcribed outwards towards the terminus, whereas genes within the centre of the genome are transcribed from either DNA strand. The central region of the genome contains genes that are highly conserved in other OPVs and 87 of these are conserved in all sequenced chordopoxviruses. In contrast, genes towards either terminus are more variable and encode proteins involved in host range, virulence or immunomodulation. In some cases, these are broken versions of genes found in other OPVs. The relationship of CMPV to other OPVs was analysed by comparisons of DNA and predicted protein sequences, repeats within the ITRs and arrangement of ORFs within the terminal regions. Each comparison gave the same conclusion: CMPV is the closest known virus to variola virus, the cause of smallpox.
Archaebacterial rhodopsin sequences: Implications for evolution

NASA Technical Reports Server (NTRS)

Lanyi, J. K.

1991-01-01

It was proposed over 10 years ago that the archaebacteria represent a separate kingdom which diverged very early from the eubacteria and eukaryotes. It follows that investigations of archaebacterial characteristics might reveal features of early evolution. So far, two genes, one for bacteriorhodopsin and another for halorhodopsin, both from Halobacterium halobium, have been sequenced. We cloned and sequenced the gene coding for the polypeptide of another one of these rhodopsins, a halorhodopsin in Natronobacterium pharaonis. Peptide sequencing of cyanogen bromide fragments, and immuno-reactions of the protein and synthetic peptides derived from the C-terminal gene sequence, confirmed that the open reading frame was the structural gene for the pharaonis halorhodopsin polypeptide. The flanking DNA sequences of this gene, as well as those of other bacterial rhodopsins, were compared to previously proposed archaebacterial consensus sequences. In pairwise comparisons of the open reading frame with DNA sequences for bacterio-opsin and halo-opsin from Halobacterium halobium, silent divergences were calculated. These indicate very considerable evolutionary distance between each pair of genes, even in the same organism. In spite of this, the three protein sequences show extensive similarities, indicating strong selective pressures.
Diversity and distribution of single-stranded DNA phages in the North Atlantic Ocean

PubMed Central

Tucker, Kimberly P; Parsons, Rachel; Symonds, Erin M; Breitbart, Mya

2011-01-01

Knowledge of marine phages is highly biased toward double-stranded DNA (dsDNA) phages; however, recent metagenomic surveys have also identified single-stranded DNA (ssDNA) phages in the oceans. Here, we describe two complete ssDNA phage genomes that were reconstructed from a viral metagenome from 80 m depth at the Bermuda Atlantic Time-series Study (BATS) site in the northwestern Sargasso Sea and examine their spatial and temporal distributions. Both genomes (SARssφ1 and SARssφ2) exhibited similarity to known phages of the Microviridae family in terms of size, GC content, genome organization and protein sequence. PCR amplification of the replication initiation protein (Rep) gene revealed narrow and distinct depth distributions for the newly described ssDNA phages within the upper 200 m of the water column at the BATS site. Comparison of Rep gene sequences obtained from the BATS site over time revealed changes in the diversity of ssDNA phages over monthly time scales, although some nearly identical sequences were recovered from samples collected 4 years apart. Examination of ssDNA phage diversity along transects through the North Atlantic Ocean revealed a positive correlation between genetic distance and geographic distance between sampling sites. Together, the data suggest fundamental differences between the distribution of these ssDNA phages and the distribution of known marine dsDNA phages, possibly because of differences in host range, host distribution, virion stability, or viral evolution mechanisms and rates. Future work needs to elucidate the host ranges for oceanic ssDNA phages and determine their ecological roles in the marine ecosystem. PMID:21124487
[Hepatitis C virus: sequence homology of a European isolate and divergence from the prototype].

PubMed

Seelig, R; Seelig, H P; Renz, M

1991-08-01

The polymerase chain reaction (PCR) detected specific hepatitis C viral (HCV) RNA sequences in liver biopsies from two patients with chronic hepatitis, in the tissue of a liver implant, in plasma from four chronic non-A, non-B hepatitis (NANBH) patients and, for the first time, in an infectious anti-D-immunoglobulin preparation. A comparison of the viral sequences coding for a region of the nonstructural NS3 protein from the liver tissues revealed only a very small degree of sequence divergence at the cDNA as well as at the amino acid level (between 0 and 5%). The sequence similarities of the RNA isolated from plasma of the four chronic NANBH patients and the anti-D-immunoglobulin preparation were in part somewhat lower but still high overall (between 90 and 100%). In contrast, all eight cDNA and amino acid sequences exhibited a significantly higher degree of divergence in comparison with the HCV prototype sequence (between 29 and 32%) than among themselves (between 0 and 10%). This unexpectedly high sequence similarity of the eight European isolates and their low homology to the North American prototype sequence is indicative of the existence of different types of HCV. This will be important not only for epidemiological studies but also for the development of effective diagnostic procedures and vaccines. Concerning the pathogenesis of NANBH, a double infection or a helper mechanism has to be considered: in addition to the C virus, sequences of another virus particle were found in the infectious IgG preparation as well as in the liver biopsies.
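The divergence percentages quoted in this abstract are the kind of figure obtained by counting mismatches between aligned sequences. A minimal sketch follows, assuming the two sequences are already aligned to equal length and that gap positions are skipped; the function name and the toy sequences are hypothetical and are not data from the study.

```python
# Sketch: percent divergence between two pre-aligned sequences.
# Assumes equal-length alignment; positions with a gap ("-") are ignored.
def percent_divergence(a, b):
    """Return the percentage of compared positions at which a and b differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue  # skip alignment gaps
        compared += 1
        if x != y:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * mismatches / compared


# Toy example: 2 mismatches over 10 compared positions.
toy_divergence = percent_divergence("ACGTACGTAC", "ACGAACGTAT")
```

Percent similarity is simply 100 minus this value, which is how a divergence of 0 to 10% corresponds to a similarity of 90 to 100%.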
Energetics of protein-DNA interactions.

PubMed

Donald, Jason E; Chen, William W; Shakhnovich, Eugene I

2007-01-01

Protein-DNA interactions are vital for many processes in living cells, especially transcriptional regulation and DNA modification. To further our understanding of these important processes on the microscopic level, it is necessary that theoretical models describe the macromolecular interaction energetics accurately. While several methods have been proposed, there has not been a careful comparison of how well the different methods are able to predict biologically important quantities such as the correct DNA binding sequence, total binding free energy and free energy changes caused by DNA mutation. In addition to carrying out the comparison, we present two important theoretical models developed initially in protein folding that have not yet been tried on protein-DNA interactions. In the process, we find that the results of these knowledge-based potentials show a strong dependence on the interaction distance and the derivation method. Finally, we present a knowledge-based potential that gives comparable or superior results to the best of the other methods, including the molecular mechanics force field AMBER99.
The TGA codons are present in the open reading frame of selenoprotein P cDNA

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hill, K.E.; Lloyd, R.S.; Read, R.

1991-03-11

The TGA codon in DNA has been shown to direct incorporation of selenocysteine into protein. Several proteins from bacteria and animals contain selenocysteine in their primary structures. Each of the cDNA clones of these selenoproteins contains one TGA codon in the open reading frame which corresponds to the selenocysteine in the protein. A cDNA clone for selenoprotein P (SeP), obtained from a γZAP rat liver library, was sequenced by the dideoxy termination method. The correct reading frame was determined by comparison of the deduced amino acid sequence with the amino acid sequence of several peptides from SeP. Using SeP labelled with ⁷⁵Se in vivo, the selenocysteine content of the peptides was verified by the collection of carboxymethylated ⁷⁷Se-selenocysteine as it eluted from the amino acid analyzer and determination of the radioactivity contained in the collected samples. Ten TGA codons are present in the open reading frame of the cDNA. Peptide fragmentation studies and the deduced sequence indicate that selenium-rich regions are located close to the carboxy terminus. Nine of the 10 selenocysteines are located in the terminal 26% of the sequence with four in the terminal 15 amino acids. The deduced sequence codes for a protein of 385 amino acids. Cleavage of the signal peptide gives the mature protein with 366 amino acids and a calculated mol wt of 41,052 Da. Searches of PIR and SWISSPROT protein databases revealed no similarity with glutathione peroxidase or other selenoproteins.
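Counting TGA codons in an open reading frame, as this abstract reports, depends on reading the sequence in the correct frame. A minimal sketch is given below; the function name and the toy sequence are hypothetical, and the sequence is assumed to start at the first base of the frame.

```python
# Sketch: positions of in-frame TGA codons in an open reading frame.
# The sequence is assumed to begin at codon position 1 of the reading frame.
from typing import List


def in_frame_tga_positions(orf: str) -> List[int]:
    """Return 0-based nucleotide offsets of TGA codons read in frame."""
    orf = orf.upper()
    return [i for i in range(0, len(orf) - 2, 3) if orf[i:i + 3] == "TGA"]


# Toy ORF: ATG TGA GCA TGA TAA has TGA at offsets 3 and 9.
toy_hits = in_frame_tga_positions("ATGTGAGCATGATAA")
```

Stepping in threes is the design point here: a TGA that spans two codons (for example in ATG ATG AAA) is not a codon in that frame and is deliberately not reported.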
Filling Gaps in Biodiversity Knowledge for Macrofungi: Contributions and Assessment of an Herbarium Collection DNA Barcode Sequencing Project

PubMed Central

Osmundson, Todd W.; Robert, Vincent A.; Schoch, Conrad L.; Baker, Lydia J.; Smith, Amy; Robich, Giovanni; Mizzan, Luca; Garbelotto, Matteo M.

2013-01-01

Despite recent advances spearheaded by molecular approaches and novel technologies, species description and DNA sequence information are significantly lagging for fungi compared to many other groups of organisms. Large scale sequencing of vouchered herbarium material can aid in closing this gap. Here, we describe an effort to obtain broad ITS sequence coverage of the approximately 6000 macrofungal-species-rich herbarium of the Museum of Natural History in Venice, Italy. Our goals were to investigate issues related to large sequencing projects, develop heuristic methods for assessing the overall performance of such a project, and evaluate the prospects of such efforts to reduce the current gap in fungal biodiversity knowledge. The effort generated 1107 sequences submitted to GenBank, including 416 previously unrepresented taxa and 398 sequences exhibiting a best BLAST match to an unidentified environmental sequence. Specimen age and taxon affected sequencing success, and subsequent work on failed specimens showed that an ITS1 mini-barcode greatly increased sequencing success without greatly reducing the discriminating power of the barcode. Similarity comparisons and nonmetric multidimensional scaling ordinations based on pairwise distance matrices proved to be useful heuristic tools for validating the overall accuracy of specimen identifications, flagging potential misidentifications, and identifying taxa in need of additional species-level revision. Comparison of within- and among-species nucleotide variation showed a strong increase in species discriminating power at 1–2% dissimilarity, and identified potential barcoding issues (same sequence for different species and vice-versa). All sequences are linked to a vouchered specimen, and results from this study have already prompted revisions of species-sequence assignments in several taxa. PMID:23638077
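The comparison of within- and among-species nucleotide variation mentioned in this abstract can be sketched as a split of pairwise distances by species label. This is an illustration only, assuming pre-aligned sequences of equal length and an uncorrected p-distance; the function names, species labels, and toy sequences are hypothetical and are not from the project.

```python
# Sketch: separate pairwise p-distances into within-species and among-species sets.
# Assumes pre-aligned, equal-length sequences; the toy records are made up.
from itertools import combinations


def p_distance(a, b):
    """Proportion of aligned positions at which two sequences differ."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and equal in length")
    return sum(x != y for x, y in zip(a.upper(), b.upper())) / len(a)


def split_distances(records):
    """records: list of (species_label, aligned_sequence) tuples."""
    within, among = [], []
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        (within if sp1 == sp2 else among).append(p_distance(s1, s2))
    return within, among


toy_records = [
    ("species_A", "ACGTACGTAC"),
    ("species_A", "ACGTACGTAT"),
    ("species_B", "TCGTTCGTAA"),
]
within, among = split_distances(toy_records)
```

If the largest within-species distance is below the smallest among-species distance, a threshold between them separates species cleanly. Overlap between the two sets is what flags the barcoding issues the abstract describes.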
Comparison between TRF2 and TRF1 of their telomeric DNA-bound structures and DNA-binding activities

PubMed Central

Hanaoka, Shingo; Nagadoi, Aritaka; Nishimura, Yoshifumi

2005-01-01

Mammalian telomeres consist of long tandem arrays of double-stranded telomeric TTAGGG repeats packaged by the telomeric DNA-binding proteins TRF1 and TRF2. Both contain a similar C-terminal Myb domain that mediates sequence-specific binding to telomeric DNA. In a DNA complex of TRF1, only the single Myb-like domain consisting of three helices can bind specifically to double-stranded telomeric DNA. TRF2 also binds to double-stranded telomeric DNA. Although the DNA binding mode of TRF2 is likely identical to that of TRF1, TRF2 plays an important role in the t-loop formation that protects the ends of telomeres. Here, to clarify the details of the double-stranded telomeric DNA-binding modes of TRF1 and TRF2, we determined the solution structure of the DNA-binding domain of human TRF2 bound to telomeric DNA; it consists of three helices, and like TRF1, the third helix recognizes TAGGG sequence in the major groove of DNA with the N-terminal arm locating in the minor groove. However, small but significant differences are observed; in contrast to the minor groove recognition of TRF1, in which an arginine residue recognizes the TT sequence, a lysine residue of TRF2 interacts with the TT part. We examined the telomeric DNA-binding activities of both DNA-binding domains of TRF1 and TRF2 and found that TRF1 binds more strongly than TRF2. Based on the structural differences of both domains, we created several mutants of the DNA-binding domain of TRF2 with stronger binding activities compared to the wild-type TRF2. PMID:15608118
Structural impact of complete CpG methylation within target DNA on specific complex formation of the inducible transcription factor Egr-1.

PubMed

Zandarashvili, Levani; White, Mark A; Esadze, Alexandre; Iwahara, Junji

2015-07-08

The inducible transcription factor Egr-1 binds specifically to 9-bp target sequences containing two CpG sites that can potentially be methylated at four cytosine bases. Although it appears that complete CpG methylation would make an unfavorable steric clash in the previous crystal structures of the complexes with unmethylated or partially methylated DNA, our affinity data suggest that DNA recognition by Egr-1 is insensitive to CpG methylation. We have determined, at a 1.4-Å resolution, the crystal structure of the Egr-1 zinc-finger complex with completely methylated target DNA. Structural comparison of the three different methylation states reveals why Egr-1 can recognize the target sequences regardless of CpG methylation. Copyright © 2015 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved.
Spermine Condenses DNA, but Not RNA Duplexes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Katz, Andrea M.; Tolokh, Igor S.; Pabit, Suzette A.

Interactions between the polyamine spermine and nucleic acids drive important cellular processes. Spermine condenses DNA, and some RNAs such as poly(rA):poly(rU). A large fraction of the spermine present in cells is bound to RNA, but apparently does not condense it. Here, we study the effect of spermine binding to short duplex RNA and DNA and compare our findings with predictions of molecular dynamics simulations. When small numbers of spermine are introduced, RNA with a designed sequence, containing a mixture of 14 GC pairs and 11 AU pairs, resists condensation relative to DNA of an equivalent sequence or to 25 base pair poly(rA):poly(rU) RNA. Comparison of wide-angle x-ray scattering profiles with simulation suggests that spermine is sequestered deep within the major groove of mixed sequence RNA, preventing condensation by limiting opportunities to bridge to other molecules as well as stabilizing the RNA by locking it into a particular conformation. In contrast, for DNA, simulations suggest that spermine binds external to the duplex, offering opportunities for intermolecular interaction. The goal of this study is to explain how RNA can remain soluble, and available for interaction with other molecules in the cell, despite the presence of spermine at concentrations high enough to precipitate DNA.
Deletions of fetal and adult muscle cDNA in Duchenne and Becker muscular dystrophy patients.

PubMed Central

Cross, G S; Speer, A; Rosenthal, A; Forrest, S M; Smith, T J; Edwards, Y; Flint, T; Hill, D; Davies, K E

1987-01-01

We have isolated a cDNA molecule from a human adult muscle cDNA library which is deleted in several Duchenne muscular dystrophy patients. Patient deletions have been used to map the exons across the Xp21 region of the short arm of the X chromosome. We demonstrate that a very mildly affected 61 year old patient is deleted for at least nine exons of the adult cDNA. We find no evidence for differential exon usage between adult and fetal muscle in this region of the gene. There must therefore be less essential domains of the protein structure which can be removed without complete loss of function. The sequence of 2.0 kb of the adult cDNA shows no homology to any previously described protein listed in the data banks although sequence comparison at the amino acid level suggests that the protein has a structure not dissimilar to rod structures of cytoskeletal proteins such as lamin and myosin. There are single nucleotide differences in the DNA sequence between the adult and fetal cDNAs which result in amino acid changes but none that would be predicted to change the structure of the protein dramatically. PMID:3428261
A single mini-barcode test to screen for Australian mammalian predators from environmental samples

PubMed Central

MacDonald, Anna J; Sarre, Stephen D

2017-01-01

Identification of species from trace samples is now possible through the comparison of diagnostic DNA fragments against reference DNA sequence databases. DNA detection of animals from non-invasive samples, such as predator faeces (scats) that contain traces of DNA from their species of origin, has proved to be a valuable tool for the management of elusive wildlife. However, application of this approach can be limited by the availability of appropriate genetic markers. Scat DNA is often degraded, meaning that longer DNA sequences, including standard DNA barcoding markers, are difficult to recover. Instead, targeted short diagnostic markers are required to serve as diagnostic mini-barcodes. The mitochondrial genome is a useful source of such trace DNA markers because it provides good resolution at the species level and occurs in high copy numbers per cell. We developed a mini-barcode based on a short (178 bp) fragment of the conserved 12S ribosomal ribonucleic acid mitochondrial gene sequence, with the goal of discriminating amongst the scats of large mammalian predators of Australia. We tested the sensitivity and specificity of our primers and can accurately detect and discriminate amongst quolls, cats, dogs, foxes, and devils from trace DNA samples. Our approach provides a cost-effective, time-efficient, and non-invasive tool that enables identification of all 8 medium-large mammal predators in Australia, including native and introduced species, using a single test. With modification, this approach is likely to be of broad applicability elsewhere. PMID:28810700
Genomic resources for Myzus persicae: EST sequencing, SNP identification, and microarray design

PubMed Central

Ramsey, John S; Wilson, Alex CC; de Vos, Martin; Sun, Qi; Tamborindeguy, Cecilia; Winfield, Agnese; Malloch, Gaynor; Smith, Dawn M; Fenton, Brian; Gray, Stewart M; Jander, Georg

2007-01-01

Background The green peach aphid, Myzus persicae (Sulzer), is a world-wide insect pest capable of infesting more than 40 plant families, including many crop species. However, despite the significant damage inflicted by M. persicae in agricultural systems through direct feeding damage and by its ability to transmit plant viruses, limited genomic information is available for this species. Results Sequencing of 16 M. persicae cDNA libraries generated 26,669 expressed sequence tags (ESTs). Aphids for library construction were raised on Arabidopsis thaliana, Nicotiana benthamiana, Brassica oleracea, B. napus, and Physalis floridana (with and without Potato leafroll virus infection). The M. persicae cDNA libraries include ones made from sexual and asexual whole aphids, guts, heads, and salivary glands. In silico comparison of cDNA libraries identified aphid genes with tissue-specific expression patterns, and gene expression that is induced by feeding on Nicotiana benthamiana. Furthermore, 2423 genes that are novel to science and potentially aphid-specific were identified. Comparison of cDNA data from three aphid lineages identified single nucleotide polymorphisms that can be used as genetic markers and, in some cases, may represent functional differences in the protein products. In particular, non-conservative amino acid substitutions in a highly expressed gut protease may be of adaptive significance for M. persicae feeding on different host plants. The Agilent eArray platform was used to design an M. persicae oligonucleotide microarray representing over 10,000 unique genes. Conclusion New genomic resources have been developed for M. persicae, an agriculturally important insect pest. These include previously unknown sequence data, a collection of expressed genes, molecular markers, and a DNA microarray that can be used to study aphid gene expression. These resources will help elucidate the adaptations that allow M. persicae to develop compatible interactions with its host plants, complementing ongoing work illuminating plant molecular responses to phloem-feeding insects. PMID:18021414
iDNA-Prot: Identification of DNA Binding Proteins Using Random Forest with Grey Model

PubMed Central

Lin, Wei-Zhong; Fang, Jian-An; Xiao, Xuan; Chou, Kuo-Chen

2011-01-01

DNA-binding proteins play crucial roles in various cellular processes. Developing high throughput tools for rapidly and effectively identifying DNA-binding proteins is one of the major challenges in the field of genome annotation. Although many efforts have been made in this regard, further effort is needed to enhance the prediction power. By incorporating the features into the general form of pseudo amino acid composition that were extracted from protein sequences via the “grey model” and by adopting the random forest operation engine, we proposed a new predictor, called iDNA-Prot, for identifying uncharacterized proteins as DNA-binding proteins or non-DNA binding proteins based on their amino acid sequences information alone. The overall success rate by iDNA-Prot was 83.96% that was obtained via jackknife tests on a newly constructed stringent benchmark dataset in which none of the proteins included has pairwise sequence identity to any other in a same subset. In addition to achieving high success rate, the computational time for iDNA-Prot is remarkably shorter in comparison with the relevant existing predictors. Hence it is anticipated that iDNA-Prot may become a useful high throughput tool for large-scale analysis of DNA-binding proteins. As a user-friendly web-server, iDNA-Prot is freely accessible to the public at the web-site on http://icpr.jci.edu.cn/bioinfo/iDNA-Prot or http://www.jci-bioinfo.cn/iDNA-Prot. Moreover, for the convenience of the vast majority of experimental scientists, a step-by-step guide is provided on how to use the web-server to get the desired results. PMID:21935457
Estimating the efficiency of fish cross-species cDNA microarray hybridization.

PubMed

Cohen, Raphael; Chalifa-Caspi, Vered; Williams, Timothy D; Auslander, Meirav; George, Stephen G; Chipman, James K; Tom, Moshe

2007-01-01

Using an available cross-species cDNA microarray is advantageous for examining multigene expression patterns in non-model organisms, saving the need for construction of species-specific arrays. The aim of the present study was to estimate relative efficiency of cross-species hybridizations across bony fishes, using bioinformatics tools. The methodology may serve also as a model for similar evaluations in other taxa. The theoretical evaluation was done by substituting comparative whole-transcriptome sequence similarity information into the thermodynamic hybridization equation. Complementary DNA sequence assemblages of nine fish species belonging to common families or suborders and distributed across the bony fish taxonomic branch were selected for transcriptome-wise comparisons. Actual cross-species hybridizations among fish of different taxonomic distances were used to validate and eventually to calibrate the theoretically computed relative efficiencies.
Typing Clostridium difficile strains based on tandem repeat sequences

PubMed Central

2009-01-01

Background Genotyping of epidemic Clostridium difficile strains is necessary to track their emergence and spread. Portability of genotyping data is desirable to facilitate inter-laboratory comparisons and epidemiological studies. Results This report presents results from a systematic screen for variation in repetitive DNA in the genome of C. difficile. We describe two tandem repeat loci, designated 'TR6' and 'TR10', which display extensive sequence variation that may be useful for sequence-based strain typing. Based on an investigation of 154 C. difficile isolates comprising 75 ribotypes, tandem repeat sequencing demonstrated excellent concordance with widely used PCR ribotyping and equal discriminatory power. Moreover, tandem repeat sequences enabled the reconstruction of the isolates' largely clonal population structure and evolutionary history. Conclusion We conclude that sequence analysis of the two repetitive loci introduced here may be highly useful for routine typing of C. difficile. Tandem repeat sequence typing resolves phylogenetic diversity to a level equivalent to PCR ribotypes. DNA sequences may be stored in databases accessible over the internet, obviating the need for the exchange of reference strains. PMID:19133124
Comparison of impedimetric detection of DNA hybridization on the various biosensors based on modified glassy carbon electrodes with PANHS and nanomaterials of RGO and MWCNTs.

PubMed

Benvidi, Ali; Tezerjani, Marzieh Dehghan; Jahanbani, Shahriar; Mazloum Ardakani, Mohammad; Moshtaghioun, Seyed Mohammad

2016-01-15

In this research, we have developed label-free DNA biosensors based on glassy carbon electrodes (GCE) modified with reduced graphene oxide (RGO) and multi-wall carbon nanotubes (MWCNTs) for detection of DNA sequences. This paper compares the detection of the BRCA1 5382insC mutation using independent GCEs modified with RGO and MWCNTs. A probe (ssDNA for BRCA1 5382insC mutation detection) was then immobilized on the modified electrodes for a specific time. The immobilization of the probe and its hybridization with the target DNA (complementary DNA) were performed under optimum conditions using different electrochemical techniques such as cyclic voltammetry (CV) and electrochemical impedance spectroscopy (EIS). The proposed biosensors were used for determination of complementary DNA sequences. The non-modified DNA biosensor (1-pyrenebutyric acid-N-hydroxysuccinimide ester (PANHS)/GCE) revealed a linear relationship between ∆Rct and the logarithm of the complementary target DNA concentration ranging from 1.0×10⁻¹⁶ mol L⁻¹ to 1.0×10⁻¹⁰ mol L⁻¹ with a correlation coefficient of 0.992. For DNA biosensors modified with MWCNTs and RGO, wider linear ranges and lower detection limits were obtained. For ssDNA/PANHS/MWCNTs/GCE, a linear range from 1.0×10⁻¹⁷ mol L⁻¹ to 1.0×10⁻¹⁰ mol L⁻¹ with a correlation coefficient of 0.993 was obtained, and for ssDNA/PANHS/RGO/GCE, a linear range from 1.0×10⁻¹⁸ mol L⁻¹ to 1.0×10⁻¹⁰ mol L⁻¹ with a correlation coefficient of 0.985. In addition, the biosensors were satisfactorily applied to discriminating complementary sequences from noncomplementary sequences, so they can be used for the detection of BRCA1-associated breast cancer. Copyright © 2015. Published by Elsevier B.V.

Morphological and genetic evidence of contemporary intersectional hybridisation in Mediterranean Helichrysum (Asteraceae, Gnaphalieae).

PubMed

Galbany-Casals, M; Carnicero-Campmany, P; Blanco-Moreno, J M; Smissen, R D

2012-09-01

Hybridisation is considered an important evolutionary phenomenon in Gnaphalieae, but contemporary hybridisation has been little explored within the tribe. Here, hybridisation between Helichrysum orientale and Helichrysum stoechas is studied at two different localities in the islands of Crete and Rhodes (Greece). Using three different types of molecular data (AFLP, nrDNA ITS sequences and cpDNA ndhF sequences) and morphological data, the aim is to provide simultaneous and direct comparisons between molecular and morphological variation among the parental species and the studied hybrid populations. AFLP profiles, ITS sequences and morphological data support the existence of hybrids at the two localities studied, shown as morphological and genetic intermediates between the parental species. Chloroplast DNA sequences show that both parental species can act either as pollen donor or as maternal parent. Fertility of hybrids is demonstrated by the viability of seeds produced by hybrids from both localities, and the detection of a backcross specimen to H. orientale. Although there is general congruence of morphological and molecular data, the analysis of morphology and ITS sequences can fail to detect backcross hybrids. © 2012 German Botanical Society and The Royal Botanical Society of the Netherlands.
Performance comparison of two commercial human whole-exome capture systems on formalin-fixed paraffin-embedded lung adenocarcinoma samples.

PubMed

Bonfiglio, Silvia; Vanni, Irene; Rossella, Valeria; Truini, Anna; Lazarevic, Dejan; Dal Bello, Maria Giovanna; Alama, Angela; Mora, Marco; Rijavec, Erika; Genova, Carlo; Cittaro, Davide; Grossi, Francesco; Coco, Simona

2016-08-30

Next Generation Sequencing (NGS) has become a valuable tool for molecular landscape characterization of cancer genomes, leading to a better understanding of tumor onset and progression, and opening new avenues in translational oncology. Formalin-fixed paraffin-embedded (FFPE) tissue is the method of choice for storage of clinical samples; however, the low quality of FFPE genomic DNA (gDNA) can limit its use for downstream applications. To investigate the suitability of FFPE specimens for NGS analysis and to establish the performance of two solution-based exome capture technologies, we compared the whole-exome sequencing (WES) data of gDNA extracted from 5 fresh frozen (FF) and 5 matched FFPE lung adenocarcinoma tissues using SeqCap EZ Human Exome v.3.0 (Roche NimbleGen) and SureSelect XT Human All Exon v.5 (Agilent Technologies). Sequencing metrics on Illumina HiSeq were optimal for both exome systems and comparable among FFPE and FF samples, with a slight increase of PCR duplicates in FFPE, mainly in Roche NimbleGen libraries. Comparison of single nucleotide variants (SNVs) between FFPE-FF pairs reached overlapping values >90% in both systems. Both WES systems showed high concordance with target re-sequencing data by Ion PGM™ in 22 lung-cancer genes, regardless of the source of samples. Exon coverage of 623 cancer-related genes revealed high coverage efficiency of both kits, proposing WES as a valid alternative to target re-sequencing. High-quality and reliable data can be successfully obtained from WES of FFPE samples starting from a relatively low amount of input gDNA, suggesting the inclusion of NGS-based tests into the clinical context. In conclusion, our analysis suggests that the WES approach could be extended to a translational research context as well as to the clinic (e.g. to study rare malignancies), where the simultaneous analysis of the whole coding region of the genome may help in the detection of cancer-linked variants.
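The SNV overlap between matched FFPE and FF samples mentioned above can be illustrated with a simple set comparison. This is a sketch only: the abstract does not say how overlap was computed, so the union-based denominator, the function name `snv_concordance`, and the variant tuples are all assumptions made for illustration.

```python
def snv_concordance(ffpe_calls, ff_calls):
    """Return the fraction of SNVs shared between two call sets.

    Each call is a (chromosome, position, ref, alt) tuple. The denominator
    is the union of both sets, which is only one of several possible
    definitions of overlap.
    """
    ffpe, ff = set(ffpe_calls), set(ff_calls)
    union = ffpe | ff
    if not union:
        return 0.0
    return len(ffpe & ff) / len(union)


# Made-up calls for illustration only; not data from the study.
ffpe = [("chr1", 100, "A", "G"), ("chr2", 200, "C", "T"), ("chr3", 300, "G", "A")]
ff = [("chr1", 100, "A", "G"), ("chr2", 200, "C", "T"), ("chr4", 400, "T", "C")]

print(snv_concordance(ffpe, ff))  # 2 shared out of 4 distinct calls -> 0.5
```

Dividing by the size of one call set instead of the union would give a directional overlap, which may be closer to what a given pipeline reports.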
Mapping of transcription factor binding regions in mammalian cells by ChIP: Comparison of array- and sequencing-based technologies

PubMed Central

Euskirchen, Ghia M.; Rozowsky, Joel S.; Wei, Chia-Lin; Lee, Wah Heng; Zhang, Zhengdong D.; Hartman, Stephen; Emanuelsson, Olof; Stolc, Viktor; Weissman, Sherman; Gerstein, Mark B.; Ruan, Yijun; Snyder, Michael

2007-01-01

Recent progress in mapping transcription factor (TF) binding regions can largely be credited to chromatin immunoprecipitation (ChIP) technologies. We compared strategies for mapping TF binding regions in mammalian cells using two different ChIP schemes: ChIP with DNA microarray analysis (ChIP-chip) and ChIP with DNA sequencing (ChIP-PET). We first investigated parameters central to obtaining robust ChIP-chip data sets by analyzing STAT1 targets in the ENCODE regions of the human genome, and then compared ChIP-chip to ChIP-PET. We devised methods for scoring and comparing results among various tiling arrays and examined parameters such as DNA microarray format, oligonucleotide length, hybridization conditions, and the use of competitor Cot-1 DNA. The best performance was achieved with high-density oligonucleotide arrays, oligonucleotides ≥50 bases (b), the presence of competitor Cot-1 DNA and hybridizations conducted in microfluidics stations. When target identification was evaluated as a function of array number, 80%–86% of targets were identified with three or more arrays. Comparison of ChIP-chip with ChIP-PET revealed strong agreement for the highest ranked targets with less overlap for the low ranked targets. With advantages and disadvantages unique to each approach, we found that ChIP-chip and ChIP-PET are frequently complementary in their relative abilities to detect STAT1 targets for the lower ranked targets; each method detected validated targets that were missed by the other method. The most comprehensive list of STAT1 binding regions is obtained by merging results from ChIP-chip and ChIP-sequencing. Overall, this study provides information for robust identification, scoring, and validation of TF targets using ChIP-based technologies. PMID:17568005
Signatures of DNA Methylation across Insects Suggest Reduced DNA Methylation Levels in Holometabola

PubMed Central

Provataris, Panagiotis; Meusemann, Karen; Niehuis, Oliver; Grath, Sonja; Misof, Bernhard

2018-01-01

It has been experimentally shown that DNA methylation is involved in the regulation of gene expression and the silencing of transposable element activity in eukaryotes. The variable levels of DNA methylation among different insect species indicate an evolutionarily flexible role of DNA methylation in insects, which due to a lack of comparative data is not yet well-substantiated. Here, we use computational methods to trace signatures of DNA methylation across insects by analyzing transcriptomic and genomic sequence data from all currently recognized insect orders. We conclude that: 1) a functional methylation system relying exclusively on DNA methyltransferase 1 is widespread across insects. 2) DNA methylation has potentially been lost or extremely reduced in species belonging to springtails (Collembola), flies and relatives (Diptera), and twisted-winged parasites (Strepsiptera). 3) Holometabolous insects display signs of reduced DNA methylation levels in protein-coding sequences compared with hemimetabolous insects. 4) Evolutionarily conserved insect genes associated with housekeeping functions tend to display signs of heavier DNA methylation in comparison to the genomic/transcriptomic background. With this comparative study, we provide the much needed basis for experimental and detailed comparative analyses required to gain a deeper understanding of the evolution and function of DNA methylation in insects. PMID:29697817
Mitochondrial genes in the colourless alga Prototheca wickerhamii resemble plant genes in their exons but fungal genes in their introns.

PubMed Central

Wolff, G; Burger, G; Lang, B F; Kück, U

1993-01-01

The mitochondrial DNA from the colourless alga Prototheca wickerhamii contains two mosaic genes, as was revealed by complete sequencing of the circular extranuclear genome. The genes for the large subunit of the ribosomal RNA (LSUrRNA) and for subunit I of the cytochrome oxidase (coxI) carry two and three intronic sequences, respectively. On the basis of their canonical nucleotide sequences they can be classified as group I introns. Phylogenetic comparisons of the coxI protein sequences allow us to conclude that the P. wickerhamii mtDNA is much more closely related to higher plant mtDNAs than to those of the chlorophyte alga C. reinhardtii. The comparison of the intron sequences revealed several unusual features: (1) The P. wickerhamii introns are structurally related to mitochondrial introns from various ascomycetous fungi. (2) Phylogenetic analyses indicate a close relationship between fungal and algal intronic sequences. (3) The P. wickerhamii introns are located at positions within the structural genes which can be considered as preferred intron insertion sites in homologous mitochondrial genes from fungi or liverwort. In all cases, the sequences adjacent to the insertion sites are very well conserved over large evolutionary distances. Our finding of highly similar introns in fungi and algae is consistent with the idea that introns were already present in the bacterial ancestors of present day mitochondria and evolved concomitantly with the organelles. PMID:7680126
Structure of the highly repeated, long interspersed DNA family (LINE or L1Rn) of the rat.

PubMed Central

D'Ambrosio, E; Waitzkin, S D; Witney, F R; Salemme, A; Furano, A V

1986-01-01

We present the DNA sequence of a 6.7-kilobase member of the rat long interspersed repeated DNA family (LINE or L1Rn). This member (LINE 3) is flanked by a perfect 14-base-pair (bp) direct repeat and is a full-length, or close-to-full-length, member of this family. LINE 3 contains an approximately 100-bp A-rich right end, a number of long (greater than 400-bp) open reading frames, and a ca. 200-bp G + C-rich (ca. 60%) cluster near each terminus. Comparison of the LINE 3 sequence with the sequence of about one-half of another member, which we also present, as well as restriction enzyme analysis of the genomic copies of this family, indicates that in length and overall structure LINE 3 is quite typical of the 40,000 or so other genomic members of this family which would account for as much as 10% of the rat genome. Therefore, the rat LINE family is relatively homogeneous, which contrasts with the heterogeneous LINE families in primates and mice. Transcripts corresponding to the entire LINE sequence are abundant in the nuclear RNA of rat liver. The characteristics of the rat LINE family are discussed with respect to the possible function and evolution of this family of DNA sequences. PMID:3023845
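G + C-rich clusters such as the ones described near each terminus of LINE 3 are typically located by scanning the sequence in windows. The sketch below shows one simple way to do this; the function names `gc_fraction` and `windowed_gc`, the window size, and the toy sequence are assumptions for illustration and do not come from the paper.

```python
def gc_fraction(seq):
    """Return the fraction of G and C bases in a DNA sequence (0.0 if empty)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def windowed_gc(seq, window, step):
    """Return (start, GC fraction) for each full window along the sequence."""
    return [
        (start, gc_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


# Toy sequence for illustration only: a G + C-only half followed by an A + T-only half.
toy = "GCGCGCATATAT"
print(gc_fraction(toy))        # 0.5
print(windowed_gc(toy, 6, 6))  # [(0, 1.0), (6, 0.0)]
```

On a real element, a window of roughly 200 bp with a smaller step would be a natural starting point for finding regions near the reported 60% G + C content.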
PDNAsite: Identification of DNA-binding Site from Protein Sequence by Incorporating Spatial and Sequence Context

PubMed Central

Zhou, Jiyun; Xu, Ruifeng; He, Yulan; Lu, Qin; Wang, Hongpeng; Kong, Bing

2016-01-01

Protein-DNA interactions are involved in many fundamental biological processes essential for cellular function. Most of the existing computational approaches employ only the sequence context of the target residue for its prediction. In the present study, for each target residue, we applied both the spatial context and the sequence context to construct the feature space. Subsequently, Latent Semantic Analysis (LSA) was applied to remove the redundancies in the feature space. Finally, a predictor (PDNAsite) was developed through the integration of the support vector machines (SVM) classifier and ensemble learning. Results on the PDNA-62 and the PDNA-224 datasets demonstrate that features extracted from spatial context provide more information than those from sequence context, and that combining them gives a further performance gain. An analysis of the number of binding sites in the spatial context of the target site indicates that the interactions between binding sites next to each other are important for protein-DNA recognition and their binding ability. The comparison between our proposed PDNAsite method and the existing methods indicates that PDNAsite outperforms most of the existing methods and is a useful tool for DNA-binding site identification. A web-server of our predictor (http://hlt.hitsz.edu.cn:8080/PDNAsite/) is freely available to the biological research community. PMID:27282833
Isolation, cDNA cloning and gene expression of an antibacterial protein from larvae of the coconut rhinoceros beetle, Oryctes rhinoceros.

PubMed

Yang, J; Yamamoto, M; Ishibashi, J; Taniai, K; Yamakawa, M

1998-08-01

An antibacterial protein, designated rhinocerosin, was purified to homogeneity from larvae of the coconut rhinoceros beetle, Oryctes rhinoceros, immunized with Escherichia coli. Based on the amino acid sequence of the N-terminal region, a degenerate primer was synthesized and reverse-transcriptase PCR was performed to clone rhinocerosin cDNA. As a result, a 279-bp fragment was obtained. The complete nucleotide sequence was determined by sequencing the extended rhinocerosin cDNA clone by 5' rapid amplification of cDNA ends. The deduced amino acid sequence of the mature portion of rhinocerosin was composed of 72 amino acids without cysteine residues and was shown to be rich in glycine (11.1%) and proline (11.1%) residues. Comparison of the deduced amino acid sequence of rhinocerosin with those of other antibacterial proteins indicated that it has 77.8% and 44.6% identity with holotricin 2 and coleoptericin, respectively. Rhinocerosin had strong antibacterial activity against E. coli, Streptococcus pyogenes, and Staphylococcus aureus but not against Pseudomonas aeruginosa. Results of reverse-transcriptase PCR analysis of gene expression in different tissues indicated that the rhinocerosin gene is strongly expressed in the fat body and the Malpighian tubule, and weakly expressed in hemocytes and midgut. In addition, gene expression was inducible by bacteria in the fat body, the Malpighian tubule and hemocytes, but constitutive expression was observed in the midgut.
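The composition and identity percentages quoted above follow from simple counting. For example, 8 glycines in a 72-residue peptide gives 8/72 = 11.1%, which is consistent with the reported glycine content (the count of 8 is inferred here, not stated in the abstract). A minimal sketch of both calculations follows, assuming a pre-aligned sequence pair and identity computed over ungapped columns only. The function names and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def residue_percent(seq, residue):
    """Percentage of positions in seq occupied by the given residue."""
    return 100.0 * seq.count(residue) / len(seq) if seq else 0.0


# Toy alignment for illustration only: 4 matches over 5 ungapped columns.
print(percent_identity("GPGA-K", "GPGSRK"))  # 80.0

# A 72-residue toy peptide with 8 glycines, matching the arithmetic above.
toy_peptide = "G" * 8 + "A" * 64
print(round(residue_percent(toy_peptide, "G"), 1))  # 11.1
```

Published identity values depend on the alignment method and on whether gapped positions count in the denominator, so this convention is only one of several in use.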
Molecular and Ecological Evidence for Species Specificity and Coevolution in a Group of Marine Algal-Bacterial Symbioses

PubMed Central

Ashen, Jon B.; Goff, Lynda J.

2000-01-01

The phylogenetic relationships of bacterial symbionts from three gall-bearing species in the marine red algal genus Prionitis (Rhodophyta) were inferred from 16S rDNA sequence analysis and compared to host phylogeny also inferred from sequence comparisons (nuclear ribosomal internal-transcribed-spacer region). Gall formation has been described previously on two species of Prionitis, P. lanceolata (from central California) and P. decipiens (from Peru). This investigation reports gall formation on a third related host, Prionitis filiformis. Phylogenetic analyses based on sequence comparisons place the bacteria as a single lineage within the Roseobacter grouping of the α subclass of the division Proteobacteria (99.4 to 98.25% sequence identity among phylotypes). Comparison of symbiont and host molecular phylogenies confirms the presence of three gall-bearing algal lineages and is consistent with the hypothesis that these red seaweeds and their bacterial symbionts are coevolving. The species specificity of these associations was investigated in nature by whole-cell hybridization of gall bacteria and in the laboratory by using cross-inoculation trials. Whole-cell in situ hybridization confirmed that a single bacterial symbiont phylotype is present in galls on each host. In laboratory trials, bacterial symbionts were incapable of inducing galls on alternate hosts (including two non-gall-bearing species). Symbiont-host specificity in Prionitis gall formation indicates an effective ecological separation between these closely related symbiont phylotypes and provides an example of a biological context in which to consider the organismic significance of 16S rDNA sequence variation. PMID:10877801
Detection of regional DNA methylation using DNA-graphene affinity interactions.

PubMed

Haque, Md Hakimul; Gopalan, Vinod; Yadav, Sharda; Islam, Md Nazmul; Eftekhari, Ehsan; Li, Qin; Carrascosa, Laura G; Nguyen, Nam-Trung; Lam, Alfred K; Shiddiky, Muhammad J A

2017-01-15

We report a new method for the detection of regional DNA methylation using base-dependent affinity interaction (i.e., adsorption) of DNA with graphene. Due to the strongest adsorption affinity of guanine bases towards graphene, bisulfite-treated guanine-enriched methylated DNA leads to a larger amount of the adsorbed DNA on the graphene-modified electrodes in comparison to the adenine-enriched unmethylated DNA. The level of the methylation is quantified by monitoring the differential pulse voltammetric current as a function of the adsorbed DNA. The assay is sensitive to distinguish methylated and unmethylated DNA sequences at single CpG resolution by differentiating changes in DNA methylation as low as 5%. Furthermore, this method has been used to detect methylation levels in a collection of DNA samples taken from oesophageal cancer tissues. Copyright © 2016 Elsevier B.V. All rights reserved.
DNA-DNA interaction beyond the ground state

NASA Astrophysics Data System (ADS)

Lee, D. J.; Wynveen, A.; Kornyshev, A. A.

2004-11-01

The electrostatic interaction potential between DNA duplexes in solution is a basis for the statistical mechanics of columnar DNA assemblies. It may also play an important role in recombination of homologous genes. We develop a theory of this interaction that includes thermal torsional fluctuations of DNA using field-theoretical methods and Monte Carlo simulations. The theory extends and rationalizes the earlier suggested variational approach which was developed in the context of a ground state theory of interaction of nonhomologous duplexes. It shows that the heuristic variational theory is equivalent to the Hartree self-consistent field approximation. By comparison of the Hartree approximation with an exact solution based on the QM analogy of path integrals, as well as Monte Carlo simulations, we show that this easily analytically-tractable approximation works very well in most cases. Thermal fluctuations do not remove the ability of DNA molecules to attract each other at favorable azimuthal conformations, neither do they wash out the possibility of electrostatic “snap-shot” recognition of homologous sequences, considered earlier on the basis of ground state calculations. At short distances DNA molecules undergo a “torsional alignment transition,” which is first order for nonhomologous DNA and weaker order for homologous sequences.
Human somatostatin I: sequence of the cDNA.

PubMed Central

Shen, L P; Pictet, R L; Rutter, W J

1982-01-01

RNA has been isolated from a human pancreatic somatostatinoma and used to prepare a cDNA library. After prescreening, clones containing somatostatin I sequences were identified by hybridization with an anglerfish somatostatin I-cloned cDNA probe. From the nucleotide sequence of two of these clones, we have deduced an essentially full-length mRNA sequence, including the preprosomatostatin coding region, 105 nucleotides from the 5' untranslated region and the complete 150-nucleotide 3' untranslated region. The coding region predicts a 116-amino acid precursor protein (Mr, 12.727) that contains somatostatin-14 and -28 at its COOH terminus. The predicted amino acid sequence of human somatostatin-28 is identical to that of somatostatin-28 isolated from the porcine and ovine species. A comparison of the amino acid sequences of human and anglerfish preprosomatostatin I indicated that the COOH-terminal region encoding somatostatin-14 and the adjacent 6 amino acids are highly conserved, whereas the remainder of the molecule, including the signal peptide region, is more divergent. However, many of the amino acid differences found in the pro region of the human and anglerfish proteins are conservative changes. This suggests that the propeptides have a similar secondary structure, which in turn may imply a biological function for this region of the molecule. PMID:6126875
Noninvasive genome sampling in chimpanzees.

PubMed

Kohn, Michael H

2010-12-01

The inevitable has happened: genomic technologies have been added to our noninvasive genetic sampling repertoire. In this issue of Molecular Ecology, Perry et al. (2010) demonstrate how DNA extraction from chimpanzee faeces, followed by a series of steps to enrich for target loci, can be coupled with next-generation sequencing. These authors collected sequence and single-nucleotide polymorphism (SNP) data at more than 600 genomic loci (chromosome 21 and the X) and the complete mitochondrial DNA. By design, each locus was 'deep sequenced' to enable SNP identification. To demonstrate the reliability of their data, the work included samples from six captive chimps, which allowed for a comparison between presumably genuine SNPs obtained from blood and potentially flawed SNPs deduced from faeces. Thus, with this method, anyone with the resources, skills and ambition to do genome sequencing of wild, elusive, or protected mammals can enjoy all of the benefits of noninvasive sampling. © 2010 Blackwell Publishing Ltd.
A communal catalogue reveals Earth's multiscale microbial diversity.

PubMed

Thompson, Luke R; Sanders, Jon G; McDonald, Daniel; Amir, Amnon; Ladau, Joshua; Locey, Kenneth J; Prill, Robert J; Tripathi, Anupriya; Gibbons, Sean M; Ackermann, Gail; Navas-Molina, Jose A; Janssen, Stefan; Kopylova, Evguenia; Vázquez-Baeza, Yoshiki; González, Antonio; Morton, James T; Mirarab, Siavash; Zech Xu, Zhenjiang; Jiang, Lingjing; Haroon, Mohamed F; Kanbar, Jad; Zhu, Qiyun; Jin Song, Se; Kosciolek, Tomasz; Bokulich, Nicholas A; Lefler, Joshua; Brislawn, Colin J; Humphrey, Gregory; Owens, Sarah M; Hampton-Marcell, Jarrad; Berg-Lyons, Donna; McKenzie, Valerie; Fierer, Noah; Fuhrman, Jed A; Clauset, Aaron; Stevens, Rick L; Shade, Ashley; Pollard, Katherine S; Goodwin, Kelly D; Jansson, Janet K; Gilbert, Jack A; Knight, Rob

2017-11-23

Our growing awareness of the microbial world's importance and diversity contrasts starkly with our limited understanding of its fundamental structure. Despite recent advances in DNA sequencing, a lack of standardized protocols and common analytical frameworks impedes comparisons among studies, hindering the development of global inferences about microbial life on Earth. Here we present a meta-analysis of microbial community samples collected by hundreds of researchers for the Earth Microbiome Project. Coordinated protocols and new analytical methods, particularly the use of exact sequences instead of clustered operational taxonomic units, enable bacterial and archaeal ribosomal RNA gene sequences to be followed across multiple studies and allow us to explore patterns of diversity at an unprecedented scale. The result is both a reference database giving global context to DNA sequence data and a framework for incorporating data from future studies, fostering increasingly complete characterization of Earth's microbial diversity.
Genetic discovery in Xylella fastidiosa through sequence analysis of selected randomly amplified polymorphic DNAs.

PubMed

Chen, Jianchi; Civerolo, Edwin L; Jarret, Robert L; Van Sluys, Marie-Anne; de Oliveira, Mariana C

2005-02-01

Xylella fastidiosa causes many important plant diseases including Pierce's disease (PD) in grape and almond leaf scorch disease (ALSD). DNA-based methodologies, such as randomly amplified polymorphic DNA (RAPD) analysis, have been playing key roles in genetic information collection of the bacterium. This study further analyzed the nucleotide sequences of selected RAPDs from X. fastidiosa strains in conjunction with the available genome sequence databases and unveiled several previously unknown novel genetic traits. These include a sequence highly similar to those in the phage family of Podoviridae. Genome comparisons among X. fastidiosa strains suggested that the "phage" is currently active. Two other RAPDs were also related to horizontal gene transfer: one was part of a broadly distributed cryptic plasmid and the other was associated with conjugal transfer. One RAPD inferred a genomic rearrangement event among X. fastidiosa PD strains and another identified a single nucleotide polymorphism of evolutionary value.
Use of mutation spectra analysis software.

PubMed

Rogozin, I; Kondrashov, F; Glazko, G

2001-02-01

The study and comparison of mutation(al) spectra is an important problem in molecular biology, because these spectra often reflect on important features of mutations and their fixation. Such features include the interaction of DNA with various mutagens, the function of repair/replication enzymes, and properties of target proteins. It is known that mutability varies significantly along nucleotide sequences, such that mutations often concentrate at certain positions, called "hotspots," in a sequence. In this paper, we discuss in detail two approaches for mutation spectra analysis: the comparison of mutation spectra with a HG-PUBL program, (FTP: sunsite.unc.edu/pub/academic/biology/dna-mutations/hyperg) and hotspot prediction with the CLUSTERM program (www.itba.mi.cnr.it/webmutation; ftp.bionet.nsc.ru/pub/biology/dbms/clusterm.zip). Several other approaches for mutational spectra analysis, such as the analysis of a target protein structure, hotspot context revealing, multiple spectra comparisons, as well as a number of mutation databases are briefly described. Mutation spectra in the lacI gene of E. coli and the human p53 gene are used for illustration of various difficulties of such analysis. Copyright 2001 Wiley-Liss, Inc.
Impact of cadmium, cobalt and nickel on sequence-specific DNA binding of p63 and p73 in vitro and in cells

DOE Office of Scientific and Technical Information (OSTI.GOV)

Adámik, Matej; Bažantová, Pavla; Department of Biology and Ecology, Faculty of Science, University of Ostrava, Chittussiho 10, 701 03 Ostrava

Highlights: • DNA binding of p53 family core domains is inhibited by cadmium, cobalt and nickel. • Binding to DNA protects p53 family core domains from metal induced inhibition. • Cadmium, cobalt and nickel induced inhibition was reverted by EDTA in vitro. - Abstract: Site-specific DNA recognition and binding activity belong to common attributes of all three members of tumor suppressor p53 family proteins: p53, p63 and p73. It was previously shown that heavy metals can affect p53 conformation, sequence-specific binding and suppress p53 response to DNA damage. Here we report for the first time that cadmium, nickel and cobalt, which have already been shown to disturb various DNA repair mechanisms, can also influence p63 and p73 sequence-specific DNA binding activity and transactivation of p53 family target genes. Based on results of electrophoretic mobility shift assay and luciferase reporter assay, we conclude that cadmium inhibits sequence-specific binding of all three core domains to p53 consensus sequences and abolishes transactivation of several promoters (e.g. BAX and MDM2) by 50 μM concentrations. In the presence of specific DNA, all p53 family core domains were partially protected against loss of DNA binding activity due to cadmium treatment. Effective cadmium concentration to abolish DNA–protein interactions was about two times higher for p63 and p73 proteins than for p53. Furthermore, we detected partial reversibility of cadmium inhibition for all p53 family members by EDTA. DTT was able to reverse cadmium inhibition only for p53 and p73. Nickel and cobalt abolished DNA–p53 interaction at sub-millimolar concentrations while inhibition of p63 and p73 DNA binding was observed at millimolar concentrations. In summary, cadmium strongly inhibits p53, p63 and p73 DNA binding in vitro and in cells in comparison to nickel and cobalt. The role of cadmium inhibition of p53 tumor suppressor family in carcinogenesis is discussed.
Comparison of Human and Guinea Pig Acetylcholinesterase Sequences and Rates of Oxime-Assisted Reactivation

DTIC Science & Technology

2010-01-01

of appropriate animal model systems. For OP poisoning, the guinea pig (Cavia porcellus) is a commonly used animal model because guinea pigs more...endogenous bioscavenger in vivo. Although guinea pigs historically have been used to test OP poisoning therapies, it has been found recently that guinea pig AChE...transcribed mRNA encoding guinea pig AChE, amplified the resulting cDNA, and sequenced this product. The nucleotide and deduced amino acid sequences of
The occurrence of Toxocara malaysiensis in cats in China, confirmed by sequence-based analyses of ribosomal DNA.

PubMed

Li, Ming-Wei; Zhu, Xing-Quan; Gasser, Robin B; Lin, Rui-Qing; Sani, Rehana A; Lun, Zhao-Rong; Jacobs, Dennis E

2006-10-01

Non-isotopic polymerase chain reaction (PCR)-based single-strand conformation polymorphism and sequence analyses of the second internal transcribed spacer (ITS-2) of nuclear ribosomal DNA (rDNA) were utilized to genetically characterise ascaridoids from dogs and cats from China by comparison with those from other countries. The study showed that Toxocara canis, Toxocara cati, and Toxascaris leonina from China were genetically the same as those from other geographical origins. Specimens from cats from Guangzhou, China, which were morphologically consistent with Toxocara malaysiensis, were the same genetically as those from Malaysia, with the exception of a polymorphism in the ITS-2 but no unequivocal sequence difference. This is the first report of T. malaysiensis in cats outside of Malaysia (from where it was originally described), supporting the proposal that this species has a broader geographical distribution. The molecular approach employed provides a powerful tool for elucidating the biology, epidemiology, and zoonotic significance of T. malaysiensis.
Complete Sequence of the mitochondrial genome of the tapeworm Hymenolepis diminuta: Gene arrangements indicate that platyhelminths are eutrochozoans

DOE Office of Scientific and Technical Information (OSTI.GOV)

von Nickisch-Rosenegk, Markus; Brown, Wesley M.; Boore, Jeffrey L.

2001-01-01

Using "long-PCR", we have amplified in overlapping fragments the complete mitochondrial genome of the tapeworm Hymenolepis diminuta (Platyhelminthes: Cestoda) and determined its 13,900 nucleotide sequence. The gene content is the same as that typically found for animal mitochondrial DNA (mtDNA) except that atp8 appears to be lacking, a condition found previously for several other animals. Despite the small size of this mtDNA, there are two large non-coding regions, one of which contains 13 repeats of a 31 nucleotide sequence and a potential stem-loop structure of 25 base pairs with an 11-member loop. Large potential secondary structures are identified also for the non-coding regions of two other cestode mtDNAs. Comparison of the mitochondrial gene arrangement of H. diminuta with those previously published supports a phylogenetic position of flatworms as members of the Eutrochozoa, rather than being basal to either a clade of protostomes or a clade of coelomates.

Calibrating genomic and allelic coverage bias in single-cell sequencing.

PubMed

Zhang, Cheng-Zhong; Adalsteinsson, Viktor A; Francis, Joshua; Cornils, Hauke; Jung, Joonil; Maire, Cecile; Ligon, Keith L; Meyerson, Matthew; Love, J Christopher

2015-04-16

Artifacts introduced in whole-genome amplification (WGA) make it difficult to derive accurate genomic information from single-cell genomes and require different analytical strategies from bulk genome analysis. Here, we describe statistical methods to quantitatively assess the amplification bias resulting from whole-genome amplification of single-cell genomic DNA. Analysis of single-cell DNA libraries generated by different technologies revealed universal features of the genome coverage bias predominantly generated at the amplicon level (1-10 kb). The magnitude of coverage bias can be accurately calibrated from low-pass sequencing (∼0.1 × ) to predict the depth-of-coverage yield of single-cell DNA libraries sequenced at arbitrary depths. We further provide a benchmark comparison of single-cell libraries generated by multi-strand displacement amplification (MDA) and multiple annealing and looping-based amplification cycles (MALBAC). Finally, we develop statistical models to calibrate allelic bias in single-cell whole-genome amplification and demonstrate a census-based strategy for efficient and accurate variant detection from low-input biopsy samples.
Calibrating genomic and allelic coverage bias in single-cell sequencing

PubMed Central

Francis, Joshua; Cornils, Hauke; Jung, Joonil; Maire, Cecile; Ligon, Keith L.; Meyerson, Matthew; Love, J. Christopher

2016-01-01

Artifacts introduced in whole-genome amplification (WGA) make it difficult to derive accurate genomic information from single-cell genomes and require different analytical strategies from bulk genome analysis. Here, we describe statistical methods to quantitatively assess the amplification bias resulting from whole-genome amplification of single-cell genomic DNA. Analysis of single-cell DNA libraries generated by different technologies revealed universal features of the genome coverage bias predominantly generated at the amplicon level (1–10 kb). The magnitude of coverage bias can be accurately calibrated from low-pass sequencing (~0.1 ×) to predict the depth-of-coverage yield of single-cell DNA libraries sequenced at arbitrary depths. We further provide a benchmark comparison of single-cell libraries generated by multi-strand displacement amplification (MDA) and multiple annealing and looping-based amplification cycles (MALBAC). Finally, we develop statistical models to calibrate allelic bias in single-cell whole-genome amplification and demonstrate a census-based strategy for efficient and accurate variant detection from low-input biopsy samples. PMID:25879913
Scaling up discovery of hidden diversity in fungi: impacts of barcoding approaches.

PubMed

Yahr, Rebecca; Schoch, Conrad L; Dentinger, Bryn T M

2016-09-05

The fungal kingdom is a hyperdiverse group of multicellular eukaryotes with profound impacts on human society and ecosystem function. The challenge of documenting and describing fungal diversity is exacerbated by their typically cryptic nature, their ability to produce seemingly unrelated morphologies from a single individual and their similarity in appearance to distantly related taxa. This multiplicity of hurdles resulted in the early adoption of DNA-based comparisons to study fungal diversity, including linking curated DNA sequence data to expertly identified voucher specimens. DNA-barcoding approaches in fungi were first applied in specimen-based studies for identification and discovery of taxonomic diversity, but are now widely deployed for community characterization based on sequencing of environmental samples. Collectively, fungal barcoding approaches have yielded important advances across biological scales and research applications, from taxonomic, ecological, industrial and health perspectives. A major outstanding issue is the growing problem of 'sequences without names' that are somewhat uncoupled from the traditional framework of fungal classification based on morphology and preserved specimens. This review summarizes some of the most significant impacts of fungal barcoding, its limitations, and progress towards the challenge of effective utilization of the exponentially growing volume of data gathered from high-throughput sequencing technologies. This article is part of the themed issue 'From DNA barcodes to biomes'. © 2016 The Authors.
Cloning of a coconut endosperm cDNA encoding a 1-acyl-sn-glycerol-3-phosphate acyltransferase that accepts medium-chain-length substrates.

PubMed Central

Knutzon, D S; Lardizabal, K D; Nelsen, J S; Bleibaum, J L; Davies, H M; Metz, J G

1995-01-01

Immature coconut (Cocos nucifera) endosperm contains a 1-acyl-sn-glycerol-3-phosphate acyltransferase (LPAAT) activity that shows a preference for medium-chain-length fatty acyl-coenzyme A substrates (H.M. Davies, D.J. Hawkins, J.S. Nelsen [1995] Phytochemistry 39:989-996). Beginning with solubilized membrane preparations, we have used chromatographic separations to identify a polypeptide with an apparent molecular mass of 29 kD, whose presence in various column fractions correlates with the acyltransferase activity detected in those same fractions. Amino acid sequence data obtained from several peptides generated from this protein were used to isolate a full-length clone from a coconut endosperm cDNA library. Clone pCGN5503 contains a 1325-bp cDNA insert with an open reading frame encoding a 308-amino acid protein with a calculated molecular mass of 34.8 kD. Comparison of the deduced amino acid sequence of pCGN5503 to sequences in the data banks revealed significant homology to other putative LPAAT sequences. Expression of the coconut cDNA in Escherichia coli conferred upon those cells a novel LPAAT activity whose substrate activity profile matched that of the coconut enzyme. PMID:8552723
Molecular cloning of a cDNA encoding the glycoprotein of hen oviduct microsomal signal peptidase.

PubMed Central

Newsome, A L; McLean, J W; Lively, M O

1992-01-01

Detergent-solubilized hen oviduct signal peptidase has been characterized previously as an apparent complex of a 19 kDa protein and a 23 kDa glycoprotein (GP23) [Baker & Lively (1987) Biochemistry 26, 8561-8567]. A cDNA clone encoding GP23 from a chicken oviduct lambda gt11 cDNA library has now been characterized. The cDNA encodes a protein of 180 amino acid residues with a single site for asparagine-linked glycosylation that has been directly identified by amino acid sequence analysis of a tryptic-digest peptide containing the glycosylated site. Immunoblot analysis reveals cross-reactivity with a dog pancreas protein. Comparison of the deduced amino acid sequence of GP23 with the 22/23 kDa glycoprotein of dog microsomal signal peptidase [Shelness, Kanwar & Blobel (1988) J. Biol. Chem. 263, 17063-17070], one of five proteins associated with this enzyme, reveals that the amino acid sequences are 90% identical. Thus the signal peptidase glycoprotein is as highly conserved as the sequences of cytochromes c and b from these same species and is likely to be found in a similar form in many, if not all, vertebrate species. The data also show conclusively that the dog and avian signal peptidases have at least one protein subunit in common. PMID:1546959
Small gene family encoding an eggshell (chorion) protein of the human parasite Schistosoma mansoni

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bobek, L.A.; Rekosh, D.M.; Lo Verde, P.T.

1988-08-01

The authors isolated six independent genomic clones encoding schistosome chorion or eggshell proteins from a Schistosoma mansoni genomic library. A linkage map of five of the clones spanning 35 kilobase pairs (kbp) of the S. mansoni genome was constructed. The region contained two eggshell protein genes closely linked, separated by 7.5 kbp of intergenic DNA. The two genes of the cluster were arranged in the same orientation, that is, they were transcribed from the same strand. The sixth clone probably represents a third copy of the eggshell gene that is not contained within the 35-kbp region. The 5' end of the mRNA transcribed from these genes was defined by primer extension directly off the RNA. The ATCAT cap site sequence was homologous to a silkmoth chorion PuTCATT cap site sequence, where Pu indicates any purine. DNA sequence analysis showed that there were no introns in these genes. The DNA sequences of the three genes were very homologous to each other and to a cDNA clone, pSMf61-46, differing only in three or four nucleotides. A multiple TATA box was located at positions -23 to -31, and a CAAAT sequence was located at -52 upstream of the eggshell transcription unit. Comparison of sequences in regions further upstream with silkmoth and Drosophila sequences revealed very short elements that were shared. One such element, TCACGT, recently shown to be an essential cis-regulatory element for silkmoth chorion gene promoter function, was found at a similar position in all three organisms.
The mitochondrial genome of the gymnosperm Cycas taitungensis contains a novel family of short interspersed elements, Bpu sequences, and abundant RNA editing sites.

PubMed

Chaw, Shu-Miaw; Shih, Arthur Chun-Chieh; Wang, Daryi; Wu, Yu-Wei; Liu, Shu-Mei; Chou, The-Yuan

2008-03-01

The mtDNA of Cycas taitungensis is a circular molecule of 414,903 bp, making it 2- to 6-fold larger than the known mtDNAs of charophytes and bryophytes, but similar to the average of 7 elucidated angiosperm mtDNAs. It is characterized by abundant RNA editing sites (1,084), more than twice the number found in the angiosperm mtDNAs. The A + T content of Cycas mtDNA is 53.1%, the lowest among known land plants. About 5% of the Cycas mtDNA is composed of a novel family of mobile elements, which we designated as "Bpu sequences." They share a consensus sequence of 36 bp with 2 terminal direct repeats (AAGG) and a recognition site for the Bpu 10I restriction endonuclease (CCTGAAGC). Comparison of the Cycas mtDNA with other plant mtDNAs revealed many new insights into the biology and evolution of land plant mtDNAs. For example, the noncoding sequences in mtDNAs have drastically expanded as land plants have evolved, with abrupt increases appearing in the bryophytes, and then in the seed plants. As a result, the genomic organizations of seed plant mtDNAs are much less compact than in other plants. Also, the Cycas mtDNA appears to have been exempted from the frequent gene loss observed in angiosperm mtDNAs. Similar to the angiosperms, the 3 Cycas genes nad1, nad2, and nad5 are disrupted by 5 group II intron sequences, which have brought the genes into trans-splicing arrangements. The evolutionary origin and invasion/duplication mechanism of the Bpu sequences in Cycas mtDNA are hypothesized and discussed.
The D1-D2 region of the large subunit ribosomal DNA as barcode for ciliates.

PubMed

Stoeck, T; Przybos, E; Dunthorn, M

2014-05-01

Ciliates are a major evolutionary lineage within the alveolates, which are distributed in nearly all habitats on our planet and are an essential component for ecosystem function, processes and stability. Accurate identification of these unicellular eukaryotes through, for example, microscopy or mating type reactions is reserved to few specialists. To satisfy the demand for a DNA barcode for ciliates, which meets the standard criteria for DNA barcodes defined by the Consortium for the Barcode of Life (CBOL), we here evaluated the D1-D2 region of the ribosomal DNA large subunit (LSU-rDNA). Primer universality for the phylum Ciliophora was tested in silico with available database sequences as well as in the laboratory with 73 ciliate species, which represented nine of 12 ciliate classes. Primers tested in this study were successful for all tested classes. To test the ability of the D1-D2 region to resolve conspecific and congeneric sequence divergence, 63 Paramecium strains were sampled from 24 mating species. The average conspecific D1-D2 variation was 0.18%, whereas congeneric sequence divergence averaged 4.83%. In pairwise genetic distance analyses, we identified a D1-D2 sequence divergence of <0.6% as an ideal threshold to discriminate Paramecium species. Using this definition, only 3.8% of all conspecific and 3.9% of all congeneric sequence comparisons had the potential of false assignments. Neighbour-joining analyses inferred monophyly for all taxa but for two Paramecium octaurelia strains. Here, we present a protocol for easy DNA amplification of single cells and voucher deposition. In conclusion, the presented data pinpoint the D1-D2 region as an excellent candidate for an official CBOL barcode for ciliated protists. © 2013 John Wiley & Sons Ltd.
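
The threshold rule quoted in this abstract (a D1-D2 divergence below 0.6% is treated as conspecific) can be illustrated with a minimal sketch. The abstract does not say which distance measure was used, so the uncorrected p-distance here is an assumption, as are the function names `p_distance` and `same_species`; the sequences are expected to be pre-aligned.

```python
SPECIES_THRESHOLD = 0.006  # 0.6% divergence, the figure quoted in the abstract


def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected pairwise distance between two aligned sequences.

    Assumption: columns where either sequence has a gap ('-') or an
    ambiguous base ('N') are skipped rather than counted as mismatches.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return mismatches / compared


def same_species(seq_a: str, seq_b: str, threshold: float = SPECIES_THRESHOLD) -> bool:
    """Hypothetical helper: True when divergence falls below the threshold."""
    return p_distance(seq_a, seq_b) < threshold
```

With a 400-column alignment, one mismatch gives a distance of 0.25% (below the threshold), while three mismatches give 0.75% (above it).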
Does a global DNA barcoding gap exist in Annelida?

PubMed

Kvist, Sebastian

2016-05-01

Accurate identification of unknown specimens by means of DNA barcoding is contingent on the presence of a DNA barcoding gap, among other factors, as its absence may result in dubious specimen identifications - false negatives or positives. Whereas the utility of DNA barcoding would be greatly reduced in the absence of a distinct and sufficiently sized barcoding gap, the limits of intraspecific and interspecific distances are seldom thoroughly inspected across comprehensive sampling. The present study aims to illuminate this aspect of barcoding in a comprehensive manner for the animal phylum Annelida. All cytochrome c oxidase subunit I sequences (cox1 gene; the chosen region for zoological DNA barcoding) present in GenBank for Annelida, as well as for "Polychaeta", "Oligochaeta", and Hirudinea separately, were downloaded and curated for length, coverage and potential contaminations. The final datasets consisted of 9782 (Annelida), 5545 ("Polychaeta"), 3639 ("Oligochaeta"), and 598 (Hirudinea) cox1 sequences and these were either (i) used as is in an automated global barcoding gap detection analysis or (ii) further analyzed for genetic distances, separated into bins containing intraspecific and interspecific comparisons and plotted in a graph to visualize any potential global barcoding gap. Over 70 million pairwise genetic comparisons were made and results suggest that although there is a tendency towards separation, no distinct or sufficiently sized global barcoding gap exists in either of the datasets rendering future barcoding efforts at risk of erroneous specimen identifications (but local barcoding gaps may still exist allowing for the identification of specimens at lower taxonomic ranks). This seems to be especially true for earthworm taxa, which account for fully 35% of the total number of interspecific comparisons that show 0% divergence.
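
The binning step described in this abstract, separating pairwise distances into intraspecific and interspecific comparisons, can be sketched as follows. This is a simplified illustration and not the automated detection analysis the study mentions: the gap is assumed here to be the smallest interspecific distance minus the largest intraspecific distance, and the names `split_distances` and `barcoding_gap` are hypothetical.

```python
from itertools import combinations


def split_distances(labels, dist):
    """Split pairwise distances into intraspecific and interspecific bins.

    labels: species name for each specimen, in matrix order.
    dist:   symmetric matrix (list of lists) of pairwise genetic distances.
    """
    intra, inter = [], []
    for i, j in combinations(range(len(labels)), 2):
        if labels[i] == labels[j]:
            intra.append(dist[i][j])
        else:
            inter.append(dist[i][j])
    return intra, inter


def barcoding_gap(labels, dist):
    """Return min(interspecific) - max(intraspecific).

    Assumed definition: a positive value means the two bins do not overlap;
    zero or a negative value means no distinct gap exists.
    """
    intra, inter = split_distances(labels, dist)
    if not intra or not inter:
        raise ValueError("need at least one intraspecific and one interspecific pair")
    return min(inter) - max(intra)
```

A single interspecific pair at 0% divergence, the situation the abstract reports for some earthworm taxa, is enough to make this value zero or negative.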
W-curve alignments for HIV-1 genomic comparisons.

PubMed

Cork, Douglas J; Lembark, Steven; Tovanabutra, Sodsai; Robb, Merlin L; Kim, Jerome H

2010-06-01

The W-curve was originally developed as a graphical visualization technique for viewing DNA and RNA sequences. Its ability to render features of DNA also makes it suitable for computational studies. Its main advantage in this area is utilizing a single-pass algorithm for comparing the sequences. Avoiding recursion during sequence alignments offers advantages for speed and in-process resources. The graphical technique also allows for multiple models of comparison to be used depending on the nucleotide patterns embedded in similar whole genomic sequences. The W-curve approach allows us to compare large numbers of samples quickly. We are currently tuning the algorithm to accommodate quirks specific to HIV-1 genomic sequences so that it can be used to aid in diagnostic and vaccine efforts. Tracking the molecular evolution of the virus has been greatly hampered by gap-associated problems predominantly embedded within the envelope gene of the virus. Gaps and hypermutation of the virus slow conventional string-based alignments of the whole genome. This paper describes the W-curve algorithm itself, and how we have adapted it for comparison of similar HIV-1 genomes. A treebuilding method is developed with the W-curve that utilizes a novel Cylindrical Coordinate distance method and gap analysis method. HIV-1 C2-V5 env sequence regions from a Mother/Infant cohort study are used in the comparison. The output distance matrix and neighbor results produced by the W-curve are functionally equivalent to those from Clustal for C2-V5 sequences in the mother/infant pairs infected with CRF01_AE. Significant potential exists for utilizing this method in place of conventional string-based alignment of HIV-1 genomes, such as Clustal X. With W-curve heuristic alignment, it may be possible to obtain clinically useful results in a short time, short enough to affect clinical choices for acute treatment.
A description of the W-curve generation process, including a comparison technique of aligning extremes of the curves to effectively phase-shift them past the HIV-1 gap problem, is presented. Besides yielding similar neighbor-joining phenogram topologies, most Mother and Infant C2-V5 sequences in the cohort pairs geometrically map closest to each other, indicating that W-curve heuristics overcame any gap problem.
Phylogenic study of Lemnoideae (duckweeds) through complete chloroplast genomes for eight accessions.

PubMed

Ding, Yanqiang; Fang, Yang; Guo, Ling; Li, Zhidan; He, Kaize; Zhao, Yun; Zhao, Hai

2017-01-01

Phylogenetic relationship within different genera of Lemnoideae, a kind of small aquatic monocotyledonous plants, was not well resolved, using either morphological characters or traditional markers. Given that rich genetic information in chloroplast genome makes them particularly useful for phylogenetic studies, we used chloroplast genomes to clarify the phylogeny within Lemnoideae. DNAs were sequenced with next-generation sequencing. The duckweeds chloroplast genomes were indirectly filtered from the total DNA data, or directly obtained from chloroplast DNA data. To test the reliability of assembling the chloroplast genome based on the filtration of the total DNA, two methods were used to assemble the chloroplast genome of Landoltia punctata strain ZH0202. A phylogenetic tree was built on the basis of the whole chloroplast genome sequences using MrBayes v.3.2.6 and PhyML 3.0. Eight complete duckweeds chloroplast genomes were assembled, with lengths ranging from 165,775 bp to 171,152 bp, and each contains 80 protein-coding sequences, four rRNAs, 30 tRNAs and two pseudogenes. The identity of L. punctata strain ZH0202 chloroplast genomes assembled through two methods was 100%, and their sequences and lengths were completely identical. The chloroplast genome comparison demonstrated that the differences in chloroplast genome sizes among the Lemnoideae primarily resulted from variation in non-coding regions, especially from repeat sequence variation. The phylogenetic analysis demonstrated that the different genera of Lemnoideae are derived from each other in the following order: Spirodela, Landoltia, Lemna, Wolffiella, and Wolffia. This study demonstrates potential of whole chloroplast genome DNA as an effective option for phylogenetic studies of Lemnoideae. It also showed the possibility of using chloroplast DNA data to elucidate those phylogenies which were not yet solved well by traditional methods even in plants other than duckweeds.
Phylogenic study of Lemnoideae (duckweeds) through complete chloroplast genomes for eight accessions

PubMed Central

Ding, Yanqiang; Fang, Yang; Guo, Ling; Li, Zhidan; He, Kaize

2017-01-01

Background Phylogenetic relationship within different genera of Lemnoideae, a kind of small aquatic monocotyledonous plants, was not well resolved, using either morphological characters or traditional markers. Given that rich genetic information in chloroplast genome makes them particularly useful for phylogenetic studies, we used chloroplast genomes to clarify the phylogeny within Lemnoideae. Methods DNAs were sequenced with next-generation sequencing. The duckweeds chloroplast genomes were indirectly filtered from the total DNA data, or directly obtained from chloroplast DNA data. To test the reliability of assembling the chloroplast genome based on the filtration of the total DNA, two methods were used to assemble the chloroplast genome of Landoltia punctata strain ZH0202. A phylogenetic tree was built on the basis of the whole chloroplast genome sequences using MrBayes v.3.2.6 and PhyML 3.0. Results Eight complete duckweeds chloroplast genomes were assembled, with lengths ranging from 165,775 bp to 171,152 bp, and each contains 80 protein-coding sequences, four rRNAs, 30 tRNAs and two pseudogenes. The identity of L. punctata strain ZH0202 chloroplast genomes assembled through two methods was 100%, and their sequences and lengths were completely identical. The chloroplast genome comparison demonstrated that the differences in chloroplast genome sizes among the Lemnoideae primarily resulted from variation in non-coding regions, especially from repeat sequence variation. The phylogenetic analysis demonstrated that the different genera of Lemnoideae are derived from each other in the following order: Spirodela, Landoltia, Lemna, Wolffiella, and Wolffia. Discussion This study demonstrates potential of whole chloroplast genome DNA as an effective option for phylogenetic studies of Lemnoideae. 
It also showed the possibility of using chloroplast DNA data to elucidate those phylogenies which were not yet solved well by traditional methods even in plants other than duckweeds. PMID:29302399
Control control control: a reassessment and comparison of GenBank and chromatogram mtDNA sequence variation in Baltic grey seals (Halichoerus grypus).

PubMed

Fietz, Katharina; Graves, Jeff A; Olsen, Morten Tange

2013-01-01

Genetic data can provide a powerful tool for those interested in the biology, management and conservation of wildlife, but also lead to erroneous conclusions if appropriate controls are not taken at all steps of the analytical process. This particularly applies to data deposited in public repositories such as GenBank, whose utility relies heavily on the assumption of high data quality. Here we report on an in-depth reassessment and comparison of GenBank and chromatogram mtDNA sequence data generated in a previous study of Baltic grey seals. By re-editing the original chromatogram data we found that approximately 40% of the grey seal mtDNA haplotype sequences posted in GenBank contained errors. The re-analysis of the edited chromatogram data yielded overall similar results and conclusions as the original study. However, a significantly different outcome was observed when using the uncorrected dataset based on the GenBank haplotypes. We therefore suggest disregarding the existing GenBank data and instead using the correct haplotypes reported here. Our study serves as an illustrative example reiterating the importance of quality control through every step of a research project, from data generation to interpretation and submission to an online repository. Errors conducted in any step may lead to biased results and conclusions, and could impact management decisions.
Control Control Control: A Reassessment and Comparison of GenBank and Chromatogram mtDNA Sequence Variation in Baltic Grey Seals (Halichoerus grypus)

PubMed Central

Fietz, Katharina; Graves, Jeff A.; Olsen, Morten Tange

2013-01-01

Genetic data can provide a powerful tool for those interested in the biology, management and conservation of wildlife, but also lead to erroneous conclusions if appropriate controls are not taken at all steps of the analytical process. This particularly applies to data deposited in public repositories such as GenBank, whose utility relies heavily on the assumption of high data quality. Here we report on an in-depth reassessment and comparison of GenBank and chromatogram mtDNA sequence data generated in a previous study of Baltic grey seals. By re-editing the original chromatogram data we found that approximately 40% of the grey seal mtDNA haplotype sequences posted in GenBank contained errors. The re-analysis of the edited chromatogram data yielded overall similar results and conclusions as the original study. However, a significantly different outcome was observed when using the uncorrected dataset based on the GenBank haplotypes. We therefore suggest disregarding the existing GenBank data and instead using the correct haplotypes reported here. Our study serves as an illustrative example reiterating the importance of quality control through every step of a research project, from data generation to interpretation and submission to an online repository. Errors conducted in any step may lead to biased results and conclusions, and could impact management decisions. PMID:23977362
Sequencing and analysis of 10,967 full-length cDNA clones from Xenopus laevis and Xenopus tropicalis reveals post-tetraploidization transcriptome remodeling

PubMed Central

Morin, Ryan D.; Chang, Elbert; Petrescu, Anca; Liao, Nancy; Griffith, Malachi; Kirkpatrick, Robert; Butterfield, Yaron S.; Young, Alice C.; Stott, Jeffrey; Barber, Sarah; Babakaiff, Ryan; Dickson, Mark C.; Matsuo, Corey; Wong, David; Yang, George S.; Smailus, Duane E.; Wetherby, Keith D.; Kwong, Peggy N.; Grimwood, Jane; Brinkley, Charles P.; Brown-John, Mabel; Reddix-Dugue, Natalie D.; Mayo, Michael; Schmutz, Jeremy; Beland, Jaclyn; Park, Morgan; Gibson, Susan; Olson, Teika; Bouffard, Gerard G.; Tsai, Miranda; Featherstone, Ruth; Chand, Steve; Siddiqui, Asim S.; Jang, Wonhee; Lee, Ed; Klein, Steven L.; Blakesley, Robert W.; Zeeberg, Barry R.; Narasimhan, Sudarshan; Weinstein, John N.; Pennacchio, Christa Prange; Myers, Richard M.; Green, Eric D.; Wagner, Lukas; Gerhard, Daniela S.; Marra, Marco A.; Jones, Steven J.M.; Holt, Robert A.

2006-01-01

Sequencing of full-insert clones from full-length cDNA libraries from both Xenopus laevis and Xenopus tropicalis has been ongoing as part of the Xenopus Gene Collection Initiative. Here we present 10,967 full ORF verified cDNA clones (8049 from X. laevis and 2918 from X. tropicalis) as a community resource. Because the genome of X. laevis, but not X. tropicalis, has undergone allotetraploidization, comparison of coding sequences from these two clawed (pipid) frogs provides a unique angle for exploring the molecular evolution of duplicate genes. Within our clone set, we have identified 445 gene trios, each comprised of an allotetraploidization-derived X. laevis gene pair and their shared X. tropicalis ortholog. Pairwise dN/dS comparisons within trios show strong evidence for purifying selection acting on all three members. However, dN/dS ratios between X. laevis gene pairs are elevated relative to their X. tropicalis ortholog. This difference is highly significant and indicates an overall relaxation of selective pressures on duplicated gene pairs. We have found that the paralogs that have been lost since the tetraploidization event are enriched for several molecular functions, but have found no such enrichment in the extant paralogs. Approximately 14% of the paralogous pairs analyzed here also show differential expression indicative of subfunctionalization. PMID:16672307
Targeted enrichment of ancient pathogens yielding the pPCP1 plasmid of Yersinia pestis from victims of the Black Death.

PubMed

Schuenemann, Verena J; Bos, Kirsten; DeWitte, Sharon; Schmedes, Sarah; Jamieson, Joslyn; Mittnik, Alissa; Forrest, Stephen; Coombes, Brian K; Wood, James W; Earn, David J D; White, William; Krause, Johannes; Poinar, Hendrik N

2011-09-20

Although investigations of medieval plague victims have identified Yersinia pestis as the putative etiologic agent of the pandemic, methodological limitations have prevented large-scale genomic investigations to evaluate changes in the pathogen's virulence over time. We screened over 100 skeletal remains from Black Death victims of the East Smithfield mass burial site (1348-1350, London, England). Recent methods of DNA enrichment coupled with high-throughput DNA sequencing subsequently permitted reconstruction of ten full human mitochondrial genomes (16 kb each) and the full pPCP1 (9.6 kb) virulence-associated plasmid at high coverage. Comparisons of molecular damage profiles between endogenous human and Y. pestis DNA confirmed its authenticity as an ancient pathogen, thus representing the longest contiguous genomic sequence for an ancient pathogen to date. Comparison of our reconstructed plasmid against modern Y. pestis shows identity with several isolates matching the Medievalis biovar; however, our chromosomal sequences indicate the victims were infected with a Y. pestis variant that has not been previously reported. Our data reveal that the Black Death in medieval Europe was caused by a variant of Y. pestis that may no longer exist, and genetic data carried on its pPCP1 plasmid were not responsible for the purported epidemiological differences between ancient and modern forms of Y. pestis infections.
Genome Analysis of the Domestic Dog (Korean Jindo) by Massively Parallel Sequencing

PubMed Central

Kim, Ryong Nam; Kim, Dae-Soo; Choi, Sang-Haeng; Yoon, Byoung-Ha; Kang, Aram; Nam, Seong-Hyeuk; Kim, Dong-Wook; Kim, Jong-Joo; Ha, Ji-Hong; Toyoda, Atsushi; Fujiyama, Asao; Kim, Aeri; Kim, Min-Young; Park, Kun-Hyang; Lee, Kang Seon; Park, Hong-Seog

2012-01-01

Although pioneering sequencing projects have shed light on the boxer and poodle genomes, a number of challenges need to be met before the sequencing and annotation of the dog genome can be considered complete. Here, we present the DNA sequence of the Jindo dog genome, sequenced to 45-fold average coverage using Illumina massively parallel sequencing technology. A comparison of the sequence to the reference boxer genome led to the identification of 4 675 437 single nucleotide polymorphisms (SNPs, including 3 346 058 novel SNPs), 71 642 indels and 8131 structural variations. Of these, 339 non-synonymous SNPs and 3 indels are located within coding sequences (CDS). In particular, 3 non-synonymous SNPs and a 26-bp deletion occur in the TCOF1 locus, implying that the difference observed in cranial facial morphology between Jindo and boxer dogs might be influenced by those variations. Through the annotation of the Jindo olfactory receptor gene family, we found 2 unique olfactory receptor genes and 236 olfactory receptor genes harbouring non-synonymous homozygous SNPs that are likely to affect smelling capability. In addition, we determined the DNA sequence of the Jindo dog mitochondrial genome and identified Jindo dog-specific mtDNA genotypes. This Jindo genome data upgrade our understanding of dog genomic architecture and will be a very valuable resource for investigating not only dog genetics and genomics but also human and dog disease genetics and comparative genomics. PMID:22474061
Myxobolus cerebralis internal transcribed spacer 1 (ITS-1) sequences support recent spread of the parasite to North America and within Europe

USGS Publications Warehouse

Whipps, Christopher M.; El-Matbouli, M.; Hedrick, R.P.; Blazer, V.; Kent, M.L.

2004-01-01

Molecular approaches for resolving relationships among the Myxozoa have relied mainly on small subunit (SSU) ribosomal DNA (rDNA) sequence analysis. This region of the gene is generally used for higher phylogenetic studies, and the conservative nature of this gene may make it inadequate for intraspecific comparisons. Previous intraspecific studies of Myxobolus cerebralis based on molecular analyses reported that the sequence of SSU rDNA and the internal transcribed spacer (ITS) were highly conserved in representatives of the parasite from North America and Europe. Considering that the ITS is usually a more variable region than the SSU, we reanalyzed available sequences on GenBank and obtained sequences from other M. cerebralis representatives from the states of California and West Virginia in the USA and from Germany and Russia. With the exception of 7 base pairs, most of the sequence designated as ITS-1 in GenBank was a highly conserved portion of the rDNA near the 3-prime end of the SSU region. Nonetheless, the additional ITS-1 sequences obtained from the available geographic representatives were well conserved. It is unlikely that we would have observed virtually identical ITS-1 sequences between European and American M. cerebralis samples had it spread naturally over time, particularly when compared to the variation seen between isolates of another myxozoan (Kudoa thyrsites) that has most likely spread naturally. These data further support the hypothesis that the current distribution of M. cerebralis in North America is a result of recent introductions followed by dispersal via anthropogenic means, largely through the stocking of infected trout for sport fishing.
Isolation and characterization of adrenoleukodystrophy protein (ALDP) related sequences in the human genome

DOE Office of Scientific and Technical Information (OSTI.GOV)

Geraghty, M.T.; Stetten, G.; Kearns, W.

1994-09-01

X-linked adrenoleukodystrophy (ALD) is a disorder of peroxisomal β-oxidation of very long chain fatty acids. It presents either as progressive dementia in childhood or as progressive paraparesis in later years. Adrenal insufficiency occurs in both phenotypes. The gene of the ALD protein has been mapped to Xq28 and has recently been cloned and characterized. The ALD protein has significant homology to the peroxisomal membrane protein, PMP70 and belongs to the ATP binding cassette superfamily of transporters. We screened a human genomic library with an ALDP cDNA and isolated 5 different but highly similar clones containing sequences corresponding to the 3′ end of the ALDP gene. Comparison of the sequences over the region corresponding to exon 9 through the 3′ end of the ALDP gene reveals ~96% nucleotide identity in both exonic and intronic regions. Splice sites and open reading frames are maintained. Using both FISH and human-rodent DNA mapping panels, we positively assign these ALDP-related sequences to chromosomes 2, 16 and 22, and provisionally to 1 and 20. Southern blot of primate DNA probed with a partial ALDP cDNA (exon 2-10) shows that expansion of ALDP-related sequences occurred in higher primates (chimp, gorilla and human). Although Northern blots show multiple ALDP-hybridizing transcripts in certain tissues, we have no evidence to date for expression of these ALDP-related sequences. In conclusion, our data show there has been an unusual and recent dispersal to multiple chromosomes of structural gene sequences related to the ALDP gene. The functional significance of these sequences remains to be determined but their existence complicates PCR and mutation analysis of the ALDP gene.
Introduction of a novel 18S rDNA gene arrangement along with distinct ITS region in the saline water microalga Dunaliella

PubMed Central

2010-01-01

Comparison of 18S rDNA gene sequences is a very promising method for identification and classification of living organisms. Molecular identification and discrimination of different Dunaliella species were carried out based on the size of the 18S rDNA gene and the number and position of introns in the gene. Three types of 18S rDNA structure have already been reported: the gene with a size of ~1770 bp lacking any intron, with a size of ~2170 bp containing one intron near the 5' terminus, and with a size of ~2570 bp harbouring two introns near the 5' and 3' termini. Here, we report a new 18S rDNA gene arrangement in terms of intron localization and nucleotide sequence in a Dunaliella isolated from Iranian salt lakes (ABRIINW-M1/2). PCR amplification with genus-specific primers resulted in production of a ~2170 bp DNA band, which is similar to that of the D. salina 18S rDNA gene containing only one intron near the 5' terminus. However, the sequence composition of the gene revealed the lack of any intron near the 5' terminus in our isolate. Furthermore, another alteration was observed due to the presence of a 440 bp DNA fragment near the 3' terminus. Accordingly, the 18S rDNA gene of the isolate is clearly different from those of D. salina and any other Dunaliella species reported so far. Moreover, analysis of the ITS region sequence showed the diversity of this region compared to the previously reported species. The 18S rDNA and ITS sequences of our isolate were submitted to the NCBI database with accession numbers EU678868 and EU927373, respectively. The optimum growth rate of this isolate occurred at a salinity level of 1 M NaCl. The maximum carotenoid content under stress conditions of intense light (400 μmol photon m⁻² s⁻¹), high salinity (4 M NaCl) and deficiency of nitrate and phosphate nutrients reached 240 ng/cell after 15 days. PMID:20377865

DOE Office of Scientific and Technical Information (OSTI.GOV)

Man, Viet Hoang; Pan, Feng; Sagui, Celeste, E-mail: sagui@ncsu.edu

We explore the use of a fast laser melting simulation approach combined with atomistic molecular dynamics simulations in order to determine the melting and healing responses of B-DNA and Z-DNA dodecamers with the same d(5′-CGCGCGCGCGCG-3′)₂ sequence. The frequency of the laser pulse is specifically tuned to disrupt Watson-Crick hydrogen bonds, thus inducing melting of the DNA duplexes. Subsequently, the structures relax and partially refold, depending on the field strength. In addition to the inherent interest of the nonequilibrium melting process, we propose that fast melting by an infrared laser pulse could be used as a technique for a fast comparison of relative stabilities of same-sequence oligonucleotides with different secondary structures with full atomistic detail of the structures and solvent. This could be particularly useful for nonstandard secondary structures involving non-canonical base pairs, mismatches, etc.
Clinical comparison of branched DNA and reverse transcriptase-PCR and nucleic acid sequence-based amplification assay for the quantitation of circulating recombinant form_BC HIV-1 RNA in plasma.

PubMed

Pan, Pinliang; Tao, Xiaoxia; Zhang, Qi; Xing, Wenge; Sun, Xianguang; Pei, Lijian; Jiang, Yan

2007-12-01

To investigate the correlation between three viral load assays for circulating recombinant form (CRF)_BC. Recent studies in HIV-1 molecular epidemiology reveal that CRF_BC is the dominant subtype of HIV-1 virus in mainland China, representing over 45% of the HIV-1 infected population. The performances of nucleic acid sequence-based amplification (NASBA), branched DNA (bDNA) and reverse transcriptase polymerase chain reaction (RT-PCR) were compared for the HIV-1 viral load detection and quantitation of CRF_BC in China. Sixteen HIV-1 positive and three HIV-1 negative samples were collected. Sequencing of the positive samples in the gp41 region was conducted. The HIV-1 viral load values were determined using bDNA, RT-PCR and NASBA assays. Deming regression analysis with SPSS 12.0 (SPSS Inc., Chicago, Illinois, USA) was performed for data analysis. Sequencing and phylogenetic analysis of the env gene (gp41) region of the 16 HIV-1 positive clinical specimens from Guizhou Province in southwest China revealed the dominance of the subtype CRF_BC in that region. A good correlation of their viral load values was observed among the three assays. Pearson's correlation between RT-PCR and bDNA is 0.969, Lg(VL)RT-PCR = 0.969 * Lg(VL)bDNA + 0.55; Pearson's correlation between RT-PCR and NASBA is 0.968, Lg(VL)RT-PCR = 0.968 * Lg(VL)NASBA + 0.937; Pearson's correlation between NASBA and bDNA is 0.980, Lg(VL)NASBA = 0.980 * Lg(VL)bDNA - 0.318. When testing with the 3 different assays, RT-PCR, bDNA and NASBA, the group of 16 HIV-1 positive samples showed the viral load value was highest for RT-PCR, followed by bDNA then NASBA, which is consistent with the former results in subtype B. The three viral load assays are highly correlated for CRF_BC in China.
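The three linear relationships reported in this abstract can be applied directly to convert a log10 viral load from one assay scale to another. The following is a minimal sketch only: the slopes and intercepts are taken from the abstract, while the names `REPORTED_FITS` and `predict_log_vl` and the input value 4.0 are hypothetical and purely illustrative, not data from the study.

```python
# Slopes and intercepts as reported in the abstract, on the log10 viral load scale.
# Key: (target assay, source assay) -> (slope, intercept)
REPORTED_FITS = {
    ("RT-PCR", "bDNA"): (0.969, 0.55),
    ("RT-PCR", "NASBA"): (0.968, 0.937),
    ("NASBA", "bDNA"): (0.980, -0.318),
}


def predict_log_vl(target, source, log_vl_source):
    """Apply the reported line Lg(VL)target = slope * Lg(VL)source + intercept."""
    slope, intercept = REPORTED_FITS[(target, source)]
    return slope * log_vl_source + intercept


# Illustrative input only: a bDNA result of 4.0 log10 (hypothetical value).
bdna = 4.0
rt_pcr = predict_log_vl("RT-PCR", "bDNA", bdna)
nasba = predict_log_vl("NASBA", "bDNA", bdna)
print(f"bDNA {bdna:.3f} -> RT-PCR {rt_pcr:.3f}, NASBA {nasba:.3f}")
```

For this illustrative input the predicted ordering is RT-PCR above bDNA above NASBA, which matches the ranking described in the abstract.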
Hydrophobic and electrostatic interactions between cell penetrating peptides and plasmid DNA are important for stable non-covalent complexation and intracellular delivery.

PubMed

Upadhya, Archana; Sangave, Preeti C

2016-10-01

Cell penetrating peptides are useful tools for intracellular delivery of nucleic acids. Delivery of plasmid DNA, a large nucleic acid, poses a challenge for peptide mediated transport. The paper investigates and compares efficacy of five novel peptide designs for complexation of plasmid DNA and subsequent delivery into cells. The peptides were designed to contain reported DNA condensing agents and basic cell penetrating sequences, octa-arginine (R8) and CHK6HC coupled to cell penetration accelerating peptides such as Bax inhibitory mutant peptide (KLPVM) and a peptide derived from the Kaposi fibroblast growth factor (kFGF) membrane translocating sequence. A tryptophan rich peptide, an analogue of Pep-3, flanked with CH3 on either end was also a part of the study. The peptides were analysed for plasmid DNA complexation, protection of peptide-plasmid DNA complexes against DNase I, serum components and competitive ligands by simple agarose gel electrophoresis techniques. Hemolysis of rat red blood corpuscles (RBCs) in the presence of the peptides was used as a measure of peptide cytotoxicity. Plasmid DNA delivery through the designed peptides was evaluated in two cell lines, human cervical cancer cell line (HeLa) and mouse embryonic fibroblasts (NIH/3T3), via expression of the secreted alkaline phosphatase (SEAP) reporter gene. The importance of hydrophobic sequences in addition to cationic sequences in peptides for non-covalent plasmid DNA complexation and delivery has been illustrated. An alternative to the employment of fatty acid moieties for enhanced gene transfer has been proposed. Comparison of peptides for plasmid DNA complexation and delivery of peptide-plasmid DNA complexes to cells estimated by expression of a reporter gene, SEAP. Copyright © 2016 European Peptide Society and John Wiley & Sons, Ltd.
DNA barcode and identification of the varieties and provenances of Taiwan's domestic and imported made teas using ribosomal internal transcribed spacer 2 sequences.

PubMed

Lee, Shih-Chieh; Wang, Chia-Hsiang; Yen, Cheng-En; Chang, Chieh

2017-04-01

The major aim of made tea identification is to identify the variety and provenance of the tea plant. The present experiment used 113 tea plants [Camellia sinensis (L.) O. Kuntze] housed at the Tea Research and Extension Substation, from which 113 internal transcribed spacer 2 (ITS2) fragments, 104 trnL intron, and 98 trnL-trnF intergenic sequence region DNA sequences were successfully sequenced. The similarity of the ITS2 nucleotide sequences between tea plants housed at the Tea Research and Extension Substation was 0.379-0.994. In this polymerase chain reaction-amplified noncoding region, no varieties possessed identical sequences. Compared with the trnL intron and trnL-trnF intergenic sequence fragments of chloroplast cpDNA, the proportion of ITS2 nucleotide sequence variation was large and is more suitable for establishing a DNA barcode database to identify tea plant varieties. After establishing the database, 30 imported teas and 35 domestic made teas were used in this model system to explore the feasibility of using ITS2 sequences to identify the varieties and provenances of made teas. A phylogenetic tree was constructed using ITS2 sequences with the unweighted pair group method with arithmetic mean, which indicated that the same variety of tea plant is likely to be successfully categorized into one cluster, but contamination from other tea plants was also detected. This result provides molecular evidence that the similarity between important tea varieties in Taiwan remains high. We suggest a direct, wide collection of made tea and original samples of tea plants to establish an ITS2 sequence molecular barcode identification database to identify the varieties and provenances of tea plants. The DNA barcode comparison method can satisfy the need for a rapid, low-cost, frontline differentiation of the large amount of made teas from Taiwan and abroad, and can provide molecular evidence of their varieties and provenances. Copyright © 2016. Published by Elsevier B.V.
Cloning of the cDNA for U1 small nuclear ribonucleoprotein particle 70K protein from Arabidopsis thaliana

NASA Technical Reports Server (NTRS)

Reddy, A. S.; Czernik, A. J.; An, G.; Poovaiah, B. W.

1992-01-01

We cloned and sequenced a plant cDNA that encodes U1 small nuclear ribonucleoprotein (snRNP) 70K protein. The plant U1 snRNP 70K protein cDNA is not full length and lacks the coding region for 68 amino acids in the amino-terminal region as compared to human U1 snRNP 70K protein. Comparison of the deduced amino acid sequence of the plant U1 snRNP 70K protein with the amino acid sequence of animal and yeast U1 snRNP 70K protein showed a high degree of homology. The plant U1 snRNP 70K protein is more closely related to the human counterpart than to the yeast 70K protein. The carboxy-terminal half is less well conserved but, like the vertebrate 70K proteins, is rich in charged amino acids. Northern analysis with the RNA isolated from different parts of the plant indicates that the snRNP 70K gene is expressed in all of the parts tested. Southern blotting of genomic DNA using the cDNA indicates that the U1 snRNP 70K protein is coded by a single gene.
mtDNA control-region sequence variation suggests multiple independent origins of an "Asian-specific" 9-bp deletion in sub-Saharan Africans.

PubMed Central

Soodyall, H.; Vigilant, L.; Hill, A. V.; Stoneking, M.; Jenkins, T.

1996-01-01

The intergenic COII/tRNA(Lys) 9-bp deletion in human mtDNA, which is found at varying frequencies in Asia, Southeast Asia, Polynesia, and the New World, was also found in 81 of 919 sub-Saharan Africans. Using mtDNA control-region sequence data from a subset of 41 individuals with the deletion, we identified 22 unique mtDNA types associated with the deletion in Africa. A comparison of the unique mtDNA types from sub-Saharan Africans and Asians with the 9-bp deletion revealed that sub-Saharan Africans and Asians have sequence profiles that differ in the locations and frequencies of variant sites. Both phylogenetic and mismatch-distribution analyses suggest that the 9-bp deletion arose independently in sub-Saharan Africa and Asia and that the deletion has arisen more than once in Africa. Within Africa, the deletion was not found among Khoisan peoples and was rare to absent in western and southwestern African populations, but it did occur in Pygmy and Negroid populations from central Africa and in Malawi and southern African Bantu-speakers. The distribution of the 9-bp deletion in Africa suggests that the deletion could have arisen in central Africa and was then introduced to southern Africa via the recent "Bantu expansion." PMID:8644719
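A mismatch distribution, as mentioned in this abstract, is the histogram of pairwise nucleotide differences among a set of aligned sequences. The sketch below shows the general idea only; it is not the authors' procedure, the function name `mismatch_distribution` is hypothetical, and the toy sequences are invented for illustration. It assumes the sequences are already aligned to equal length and counts every differing column, including gaps.

```python
from collections import Counter
from itertools import combinations


def mismatch_distribution(sequences):
    """Return {number of differences: number of sequence pairs} for aligned sequences."""
    counts = Counter()
    for a, b in combinations(sequences, 2):
        if len(a) != len(b):
            raise ValueError("sequences must be aligned to the same length")
        # Count columns where the two sequences disagree.
        differences = sum(1 for x, y in zip(a, b) if x != y)
        counts[differences] += 1
    return dict(sorted(counts.items()))


# Toy, invented alignment (not data from the study).
toy = ["AAAA", "AAAT", "AATT"]
print(mismatch_distribution(toy))
```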
Genetic characterization and phylogenetic analysis of Eimeria arloingi in Iranian native kids.

PubMed

Khodakaram-Tafti, A; Hashemnia, M; Razavi, S M; Sharifiyazdi, H; Nazifi, S

2013-09-01

Among the 16 species of Eimeria from goats, Eimeria arloingi and Eimeria ninakohlyakimovae are regarded as the most pathogenic species in the world and cause clinical caprine coccidiosis. E. arloingi is known to be an important cause of coccidiosis in Iranian kids. Molecular analyses of two portions of nuclear ribosomal DNA (internal transcribed spacer 1 (ITS1) and 18S rDNA) were used for the genetic characterization of E. arloingi. Comparison of the sequencing data of E. arloingi obtained in the present study (ITS1: KC507793 and 18S rDNA: KC507792) with other Eimeria species in the GenBank database revealed a particularly close relationship between E. arloingi and Eimeria spp. from cattle and sheep. The phylogram based on the ITS1 sequences shows that E. arloingi, Eimeria bovis, and Eimeria zuernii formed a distinct group separate from the other remaining Eimeria spp. in cattle and poultry. In pairwise alignment, the 18S rDNA sequence derived from E. arloingi showed 99% similarity to Eimeria ahsata with differences observed at only three nucleotides. This study showed that the ITS1 and 18S rDNA genes are useful genetic markers for the specific identification and differentiation of Eimeria spp. in ruminants.
Sequence of the cDNA of a human dihydrodiol dehydrogenase isoform (AKR1C2) and tissue distribution of its mRNA.

PubMed Central

Shiraishi, H; Ishikura, S; Matsuura, K; Deyashiki, Y; Ninomiya, M; Sakai, S; Hara, A

1998-01-01

Human liver contains three isoforms (DD1, DD2 and DD4) of dihydrodiol dehydrogenase with 20alpha- or 3alpha-hydroxysteroid dehydrogenase activity; the dehydrogenases belong to the aldo-oxo reductase (AKR) superfamily. cDNA species encoding DD1 and DD4 have been identified. However, four cDNA species with more than 99% sequence identity have been cloned and are compatible with a partial amino acid sequence of DD2. In this study we have isolated a cDNA clone encoding DD2, which was confirmed by comparison of the properties of the recombinant and hepatic enzymes. This cDNA showed differences of one, two, four and five nucleotides from the previously reported four cDNA species for a dehydrogenase of human colon carcinoma HT29 cells, human prostatic 3alpha-hydroxysteroid dehydrogenase, a human liver 3alpha-hydroxysteroid dehydrogenase-like protein and chlordecone reductase-like protein respectively. Expression of mRNA species for the five similar cDNA species in 20 liver samples and 10 other different tissue samples was examined by reverse transcriptase-mediated PCR with specific primers followed by diagnostic restriction with endonucleases. All the tissues expressed only one mRNA species corresponding to the newly identified cDNA for DD2: mRNA transcripts corresponding to the other cDNA species were not detected. We suggest that the new cDNA is derived from the principal gene for DD2, which has been named AKR1C2 by a new nomenclature for the AKR superfamily. It is possible that some of the other cDNA species previously reported are rare allelic variants of this gene. PMID:9716498
New Ceratocystis species from Eucalyptus and Cunninghamia in South China.

PubMed

Liu, FeiFei; Mbenoun, Michael; Barnes, Irene; Roux, Jolanda; Wingfield, Michael J; Li, GuoQing; Li, JieQiong; Chen, ShuaiFei

2015-06-01

During routine surveys for possible fungal pathogens in the rapidly expanding plantations of Eucalyptus and Cunninghamia lanceolata in China, numerous isolates of unknown species in the genus Ceratocystis (Microascales) were obtained from tree wounds. In this study we identified the Ceratocystis isolates from Eucalyptus and Cunninghamia in the GuangDong, GuangXi, FuJian and HaiNan Provinces of South China based on morphology and through comparisons of DNA sequence data for the ITS, partial β-tubulin and TEF-1α gene regions. Morphological and DNA sequence comparisons revealed two previously unknown species residing in the Indo-Pacific Clade. These are described here as Ceratocystis cercfabiensis sp. nov. and Ceratocystis collisensis sp. nov. Isolates of Ceratocystis cercfabiensis showed intragenomic variation in their ITS sequences and four strains were selected for cloning of the ITS gene region. Twelve ITS haplotypes were obtained from 17 clones selected for sequencing, differing in up to seven base positions and representing two separate phylogenetic groups. This is the first evidence of multiple ITS types in isolates of Ceratocystis residing in the Indo-Pacific Clade. Caution should thus be exercised when using the ITS gene region as a barcoding marker for Ceratocystis species in this clade. This study also represents the first record of a species of Ceratocystis from Cunninghamia.
Grasshopper, a long terminal repeat (LTR) retroelement in the phytopathogenic fungus Magnaporthe grisea.

PubMed

Dobinson, K F; Harris, R E; Hamer, J E

1993-01-01

The fungal phytopathogen Magnaporthe grisea parasitizes a wide variety of gramineous hosts. In the course of investigating the genetic relationship between pathogen genotype and host specificity we identified a retroelement that is present in some strains of M. grisea that infect finger millet and goosegrass (members of the plant genus Eleusine). The element, designated grasshopper (grh), is present in multiple copies and dispersed throughout the genome. DNA sequence analysis showed that grasshopper contains 198 base pair direct, long terminal repeats (LTRs) with features characteristic of retroviral and retrotransposon LTRs. Within the element we identified an open reading frame with sequences homologous to the reverse transcriptase, RNaseH, and integrase domains of retroelement pol genes. Comparison of the open reading frame with sequences from other retroelements showed that grh is related to the gypsy family of retrotransposons. Comparisons of the distribution of the grasshopper element with other dispersed repeated DNA sequences in M. grisea indicated that grasshopper was present in a broadly dispersed subgroup of Eleusine pathogens, suggesting that the element was acquired subsequent to the evolution of this host-specific form. We present arguments that the amplification of different retroelements within populations of M. grisea is a consequence of the clonal organization of the fungal populations.
Genome sequence determination and metagenomic characterization of a Dehalococcoides mixed culture grown on cis-1,2-dichloroethene.

PubMed

Yohda, Masafumi; Yagi, Osami; Takechi, Ayane; Kitajima, Mizuki; Matsuda, Hisashi; Miyamura, Naoaki; Aizawa, Tomoko; Nakajima, Mutsuyasu; Sunairi, Michio; Daiba, Akito; Miyajima, Takashi; Teruya, Morimi; Teruya, Kuniko; Shiroma, Akino; Shimoji, Makiko; Tamotsu, Hinako; Juan, Ayaka; Nakano, Kazuma; Aoyama, Misako; Terabayashi, Yasunobu; Satou, Kazuhito; Hirano, Takashi

2015-07-01

A Dehalococcoides-containing bacterial consortium that performed dechlorination of 0.20 mM cis-1,2-dichloroethene to ethene in 14 days was obtained from the sediment mud of the lotus field. To obtain detailed information of the consortium, the metagenome was analyzed using the short-read next-generation sequencer SOLiD 3. Matching the obtained sequence tags with the reference genome sequences indicated that the Dehalococcoides sp. in the consortium was highly homologous to Dehalococcoides mccartyi CBDB1 and BAV1. Sequence comparison with the reference sequence constructed from 16S rRNA gene sequences in a public database showed the presence of Sedimentibacter, Sulfurospirillum, Clostridium, Desulfovibrio, Parabacteroides, Alistipes, Eubacterium, Peptostreptococcus and Proteocatella in addition to Dehalococcoides sp. After further enrichment, the members of the consortium were narrowed down to almost three species. Finally, the full-length circular genome sequence of the Dehalococcoides sp. in the consortium, D. mccartyi IBARAKI, was determined by analyzing the metagenome with the single-molecule DNA sequencer PacBio RS. The accuracy of the sequence was confirmed by matching it to the tag sequences obtained by SOLiD 3. The genome is 1,451,062 nt and the number of CDS is 1566, which includes 3 rRNA genes and 47 tRNA genes. There exist twenty-eight RDase genes that are accompanied by the genes for anchor proteins. The genome exhibits significant sequence identity with other Dehalococcoides spp. throughout the genome, but there exists a significant difference in the distribution of RDase genes. The combination of a short-read next-generation DNA sequencer and a long-read single-molecule DNA sequencer gives detailed information of a bacterial consortium. Copyright © 2014 The Society for Biotechnology, Japan. Published by Elsevier B.V. All rights reserved.
Comparison of manual and semi-automatic DNA extraction protocols for the barcoding characterization of hematophagous louse flies (Diptera: Hippoboscidae).

PubMed

Gutiérrez-López, Rafael; Martínez-de la Puente, Josué; Gangoso, Laura; Soriguer, Ramón C; Figuerola, Jordi

2015-06-01

The barcoding of life initiative provides a universal molecular tool to distinguish animal species based on the amplification and sequencing of a fragment of the subunit 1 of the cytochrome oxidase (COI) gene. Obtaining good quality DNA for barcoding purposes is a limiting factor, especially in studies conducted on small-sized samples or those requiring the maintenance of the organism as a voucher. In this study, we compared the number of positive amplifications and the quality of the sequences obtained using DNA extraction methods that also differ in their economic costs and time requirements and we applied them for the genetic characterization of louse flies. Four DNA extraction methods were studied: chloroform/isoamyl alcohol, HotShot procedure, Qiagen DNeasy® Tissue and Blood Kit and DNA Kit Maxwell® 16LEV. All the louse flies were morphologically identified as Ornithophila gestroi and a single COI-based haplotype was identified. The number of positive amplifications did not differ significantly among DNA extraction procedures. However, the quality of the sequences was significantly lower for the case of the chloroform/isoamyl alcohol procedure with respect to the rest of methods tested here. These results may be useful for the genetic characterization of louse flies, leaving most of the remaining insect as a voucher. © 2015 The Society for Vector Ecology.
Amino acid racemization in amber-entombed insects: implications for DNA preservation

NASA Technical Reports Server (NTRS)

Bada, J. L.; Wang, X. S.; Poinar, H. N.; Paabo, S.; Poinar, G. O.

1994-01-01

DNA depurination and amino acid racemization take place at similar rates in aqueous solution at neutral pH. This relationship suggests that amino acid racemization may be useful in assessing the extent of DNA chain breakage in ancient biological remains. To test this suggestion, we have investigated the amino acids in insects entombed in fossilized tree resins ranging in age from <100 years to 130 million years. The amino acids present in 40 to 130 million year old amber-entombed insects resemble those in a modern fly and are probably the most ancient, unaltered amino acids found so far on Earth. In comparison to other geochemical environments on the surface of the Earth, the amino acid racemization rate in amber insect inclusions is retarded by a factor of >10^4. These results suggest that in amber insect inclusions DNA depurination rates would also likely be retarded in comparison to aqueous solution measurements, and thus DNA fragments containing many hundreds of base pairs should be preserved. This conclusion is consistent with the reported successful retrieval of DNA sequences from amber-entombed organisms.
The gene space in wheat: the complete γ-gliadin gene family from the wheat cultivar Chinese Spring.

PubMed

Anderson, Olin D; Huo, Naxin; Gu, Yong Q

2013-06-01

The complete set of unique γ-gliadin genes is described for the wheat cultivar Chinese Spring using a combination of expressed sequence tag (EST) and Roche 454 DNA sequences. Assemblies of Chinese Spring ESTs yielded 11 different γ-gliadin gene sequences. Two of the sequences encode identical polypeptides and are assumed to be the result of a recent gene duplication. One gene has a 3' coding mutation that changes the reading frame in the final eight codons. A second assembly of Chinese Spring γ-gliadin sequences was generated using Roche 454 total genomic DNA sequences. The 454 assembly confirmed the same 11 active genes as the EST assembly plus two pseudogenes not represented by ESTs. These 13 γ-gliadin sequences represent the complete unique set of γ-gliadin genes for cv Chinese Spring, although not ruled out are additional genes that are exact duplications of these 13 genes. A comparison with the ESTs of two other hexaploid cultivars (Butte 86 and Recital) finds that the most active genes are present in all three cultivars, with exceptions likely due to too few ESTs for detection in Butte 86 and Recital. A comparison of the numbers of ESTs per gene indicates differential levels of expression within the γ-gliadin gene family. Genome assignments were made for 6 of the 13 Chinese Spring γ-gliadin genes, i.e., one assignment from a match to two γ-gliadin genes found within a tetraploid wheat A genome BAC and four genes that match four distinct γ-gliadin sequences assembled from Roche 454 sequences from Aegilops tauschii, the hexaploid wheat D-genome ancestor.
Saccharothrix ghardaiensis sp. nov., an actinobacterium isolated from Saharan soil.

PubMed

Bouznada, Khaoula; Bouras, Noureddine; Mokrane, Salim; Chaabane Chaouch, Fawzia; Zitouni, Abdelghani; Pötter, Gabriele; Spröer, Cathrin; Klenk, Hans-Peter; Sabaou, Nasserdine

2017-03-01

The taxonomic position of a new Saccharothrix strain, designated MB46T, isolated from a Saharan soil sample collected in Mzab region (Ghardaïa province, South Algeria) was established following a polyphasic approach. The novel microorganism has morphological and chemical characteristics typical of the members of the genus Saccharothrix and formed a phyletic line at the periphery of the Saccharothrix espanaensis subcluster in the 16S rRNA gene dendrograms. Results of the 16S rRNA gene sequence comparisons revealed that strain MB46T shares high degrees of similarity with S. espanaensis DSM 44229T (99.2%), Saccharothrix variisporea DSM 43911T (98.7%) and Saccharothrix texasensis NRRL B-16134T (98.6%). However, the new strain exhibited only 12.5-17.5% DNA relatedness to the neighbouring Saccharothrix spp. On the basis of phenotypic characteristics, 16S rRNA gene sequence comparisons and DNA-DNA hybridizations, strain MB46T is concluded to represent a novel species of the genus Saccharothrix, for which the name Saccharothrix ghardaiensis sp. nov. (type strain MB46T = DSM 46886T = CECT 9046T) is proposed.
Quantifying the Number of Independent Organelle DNA Insertions in Genome Evolution and Human Health

PubMed Central

Martin, William F.

2017-01-01

Fragments of organelle genomes are often found as insertions in nuclear DNA. These fragments of mitochondrial DNA (numts) and plastid DNA (nupts) are ubiquitous components of eukaryotic genomes. They are, however, often edited out during the genome assembly process, leading to systematic underestimation of their frequency. Numts and nupts, once inserted, can become further fragmented through subsequent insertion of mobile elements or other recombinational events that disrupt the continuity of the inserted sequence relative to the genuine organelle DNA copy. Because numts and nupts are typically identified through sequence comparison tools such as BLAST, disruption of insertions into smaller fragments can lead to systematic overestimation of numt and nupt frequencies. Accurate identification of numts and nupts is important, however, both for better understanding of their role during evolution, and for monitoring their increasingly evident role in human disease. Human populations are polymorphic for 141 numt loci, five numts are causal to genetic disease, and cancer genomic studies are revealing an abundance of numts associated with tumor progression. Here, we report investigation of salient parameters involved in obtaining accurate estimates of numt and nupt numbers in genome sequence data. Numts and nupts from 44 sequenced eukaryotic genomes reveal lineage-specific differences in the number, relative age and frequency of insertional events as well as lineage-specific dynamics of their postinsertional fragmentation. Our findings outline the main technical parameters influencing accurate identification and frequency estimation of numts in genomic studies pertinent to both evolution and human health. PMID:28444372
DNA sequence of the lymphotropic variant of minute virus of mice, MVM(i), and comparison with the DNA sequence of the fibrotropic prototype strain.

PubMed

Astell, C R; Gardiner, E M; Tattersall, P

1986-02-01

The sequence of molecular clones of the genome of MVM(i), a lymphotropic variant of minute virus of mice, was determined and compared with that of MVM(p), the fibrotropic prototype strain. At the nucleotide level there are 163 base changes: 129 transitions and 34 transversions. Most nucleotide changes are silent, with only 27 amino acid changes predicted, of which 22 are conservative. Notable differences between the MVM(i) and MVM(p) genomes which may account for the cell specificities of these viruses occur within the 3' nontranslated regions. The differences discussed include the absence of a 65-base-pair direct repeat in MVM(i), the presence of only two polyadenylation sites in MVM(i) compared with four in MVM(p), and sequences that bear a resemblance to enhancer sequences. Also included in this paper is an important correction to the MVM(p) sequence (C.R. Astell, M. Thomson, M. Merchlinsky, and D. C. Ward, Nucleic Acids Res. 11:999-1018, 1983).
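The split of base changes into transitions and transversions reported in this abstract follows a simple rule: a transition exchanges a purine for a purine (A/G) or a pyrimidine for a pyrimidine (C/T), and any other substitution is a transversion. The sketch below illustrates that rule on two aligned sequences. The function name `classify_substitutions` and the toy sequences are hypothetical, and skipping gaps and ambiguous bases is an assumption made here, not a detail from the paper.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitutions(seq_a, seq_b):
    """Count (transitions, transversions) between two aligned DNA sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = 0
    transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Assumption: ignore identical columns, gaps and ambiguous bases.
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1  # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1  # purine<->pyrimidine
    return transitions, transversions


# Toy, invented sequences (not the MVM genomes).
print(classify_substitutions("AACGT", "GATGA"))
```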
The current status of microscopical hair comparisons.

PubMed

Rowe, W F

2001-12-08

Although the microscopical comparison of human hairs has been accepted in courts of law for over a century, recent advances in DNA technology have called this type of forensic examination into question. In a number of cases, post-conviction DNA testing has exonerated defendants who were convicted in part on the results of microscopical hair comparisons. A federal judge has held a Daubert hearing on the microscopical comparison of human hairs and has concluded that this type of examination does not meet the criteria for admission of scientific evidence in federal courts. A review of the available scientific literature on microscopical hair comparisons (including studies conducted by the Royal Canadian Mounted Police and the Federal Bureau of Investigation) leads to three conclusions: (1) microscopical comparisons of human hairs can yield scientifically defensible conclusions that can contribute to criminal investigations and criminal prosecutions, (2) the reliability of microscopical hair comparisons is strongly affected by the training of the forensic hair examiner, (3) forensic hair examiners cannot offer estimates of the probability of a match of a questioned hair with a hair from a randomly selected person. In order for microscopical hair examinations to survive challenges under the U.S. Supreme Court's Daubert decision, hair microscopists must be better trained and undergo frequent proficiency testing. More research on the error rates of microscopical hair comparisons should be undertaken, and guidelines for the permissible interpretations of such comparisons should be established. Until these issues have been addressed and satisfactorily resolved, microscopical hair comparisons should be regarded by law enforcement agencies and courts of law as merely presumptive in nature, and all microscopical hair comparisons should be confirmed by nuclear DNA profiling or mitochondrial DNA sequencing.
De novo Assembly of a 40 Mb Eukaryotic Genome from Short Sequence Reads: Sordaria macrospora, a Model Organism for Fungal Morphogenesis

PubMed Central

Nowrousian, Minou; Stajich, Jason E.; Chu, Meiling; Engh, Ines; Espagne, Eric; Halliday, Karen; Kamerewerd, Jens; Kempken, Frank; Knab, Birgit; Kuo, Hsiao-Che; Osiewacz, Heinz D.; Pöggeler, Stefanie; Read, Nick D.; Seiler, Stephan; Smith, Kristina M.; Zickler, Denise; Kück, Ulrich; Freitag, Michael

2010-01-01

Filamentous fungi are of great importance in ecology, agriculture, medicine, and biotechnology. Thus, it is not surprising that genomes for more than 100 filamentous fungi have been sequenced, most of them by Sanger sequencing. While next-generation sequencing techniques have revolutionized genome resequencing, e.g. for strain comparisons, genetic mapping, or transcriptome and ChIP analyses, de novo assembly of eukaryotic genomes still presents significant hurdles, because of their large size and stretches of repetitive sequences. Filamentous fungi contain few repetitive regions in their 30–90 Mb genomes and thus are suitable candidates to test de novo genome assembly from short sequence reads. Here, we present a high-quality draft sequence of the Sordaria macrospora genome that was obtained by a combination of Illumina/Solexa and Roche/454 sequencing. Paired-end Solexa sequencing of genomic DNA to 85-fold coverage and an additional 10-fold coverage by single-end 454 sequencing resulted in ∼4 Gb of DNA sequence. Reads were assembled to a 40 Mb draft version (N50 of 117 kb) with the Velvet assembler. Comparative analysis with Neurospora genomes increased the N50 to 498 kb. The S. macrospora genome contains even fewer repeat regions than its closest sequenced relative, Neurospora crassa. Comparison with genomes of other fungi showed that S. macrospora, a model organism for morphogenesis and meiosis, harbors duplications of several genes involved in self/nonself-recognition. Furthermore, S. macrospora contains more polyketide biosynthesis genes than N. crassa. Phylogenetic analyses suggest that some of these genes may have been acquired by horizontal gene transfer from a distantly related ascomycete group. 
Our study shows that, for typical filamentous fungi, de novo assembly of genomes from short sequence reads alone is feasible, that a mixture of Solexa and 454 sequencing substantially improves the assembly, and that the resulting data can be used for comparative studies to address basic questions of fungal biology. PMID:20386741
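The N50 values quoted in this record (117 kb, rising to 498 kb) summarize assembly contiguity. As a rough illustration only, the sketch below computes N50 under the common definition: the length of the contig at which the running total of contig lengths, sorted from longest to shortest, first reaches half of the assembly size. The function name and the toy contig lengths are hypothetical and are not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (common definition)."""
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)   # longest contigs first
    half_total = sum(ordered) / 2             # half of the assembly size
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:             # half the assembly is now covered
            return length


# Made-up contig lengths, for illustration only.
toy_contigs = [80, 70, 50, 40, 30, 20]
print(n50(toy_contigs))  # 70: 80 + 70 = 150 reaches half of the 290 total
```

Under this definition, joining contigs into longer scaffolds can only keep N50 the same or raise it, which matches the direction of the change reported after the comparison with Neurospora genomes.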
De novo assembly of a 40 Mb eukaryotic genome from short sequence reads: Sordaria macrospora, a model organism for fungal morphogenesis.

PubMed

Nowrousian, Minou; Stajich, Jason E; Chu, Meiling; Engh, Ines; Espagne, Eric; Halliday, Karen; Kamerewerd, Jens; Kempken, Frank; Knab, Birgit; Kuo, Hsiao-Che; Osiewacz, Heinz D; Pöggeler, Stefanie; Read, Nick D; Seiler, Stephan; Smith, Kristina M; Zickler, Denise; Kück, Ulrich; Freitag, Michael

2010-04-08

Filamentous fungi are of great importance in ecology, agriculture, medicine, and biotechnology. Thus, it is not surprising that genomes for more than 100 filamentous fungi have been sequenced, most of them by Sanger sequencing. While next-generation sequencing techniques have revolutionized genome resequencing, e.g. for strain comparisons, genetic mapping, or transcriptome and ChIP analyses, de novo assembly of eukaryotic genomes still presents significant hurdles, because of their large size and stretches of repetitive sequences. Filamentous fungi contain few repetitive regions in their 30-90 Mb genomes and thus are suitable candidates to test de novo genome assembly from short sequence reads. Here, we present a high-quality draft sequence of the Sordaria macrospora genome that was obtained by a combination of Illumina/Solexa and Roche/454 sequencing. Paired-end Solexa sequencing of genomic DNA to 85-fold coverage and an additional 10-fold coverage by single-end 454 sequencing resulted in approximately 4 Gb of DNA sequence. Reads were assembled to a 40 Mb draft version (N50 of 117 kb) with the Velvet assembler. Comparative analysis with Neurospora genomes increased the N50 to 498 kb. The S. macrospora genome contains even fewer repeat regions than its closest sequenced relative, Neurospora crassa. Comparison with genomes of other fungi showed that S. macrospora, a model organism for morphogenesis and meiosis, harbors duplications of several genes involved in self/nonself-recognition. Furthermore, S. macrospora contains more polyketide biosynthesis genes than N. crassa. Phylogenetic analyses suggest that some of these genes may have been acquired by horizontal gene transfer from a distantly related ascomycete group. 
Our study shows that, for typical filamentous fungi, de novo assembly of genomes from short sequence reads alone is feasible, that a mixture of Solexa and 454 sequencing substantially improves the assembly, and that the resulting data can be used for comparative studies to address basic questions of fungal biology.

Construction of a high-density genetic map for grape using next generation restriction-site associated DNA sequencing

PubMed Central

2012-01-01

Background: Genetic mapping and QTL detection are powerful methodologies in plant improvement and breeding. Construction of a high-density and high-quality genetic map would be of great benefit in the production of superior grapes to meet human demand. High throughput and low cost of the recently developed next generation sequencing (NGS) technology have resulted in its wide application in genome research. Sequencing restriction-site associated DNA (RAD) might be an efficient strategy to simplify genotyping. Combining NGS with RAD has proven to be powerful for single nucleotide polymorphism (SNP) marker development. Results: An F1 population of 100 individual plants was developed. In-silico digestion-site prediction was used to select an appropriate restriction enzyme for construction of a RAD sequencing library. Next generation RAD sequencing was applied to genotype the F1 population and its parents. Applying a cluster strategy for SNP modulation, a total of 1,814 high-quality SNP markers were developed: 1,121 of these were mapped to the female genetic map, 759 to the male map, and 1,646 to the integrated map. A comparison of the genetic maps to the published Vitis vinifera genome revealed both conservation and variations. Conclusions: The applicability of next generation RAD sequencing for genotyping a grape F1 population was demonstrated, leading to the successful development of a genetic map with high density and quality using our designed SNP markers. Detailed analysis revealed that this newly developed genetic map can be used for a variety of genome investigations, such as QTL detection, sequence assembly and genome comparison. PMID:22908993
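The in-silico digestion-site prediction mentioned in this record can be pictured as scanning a reference sequence for an enzyme's recognition site and tallying the resulting fragments. The sketch below is a minimal illustration, not the authors' pipeline: the abstract does not name the enzyme, so EcoRI (recognition site GAATTC, cutting after the first G) is used purely as an assumed example, and the function names and toy sequence are hypothetical. It treats the sequence as linear and scans one strand only, which is sufficient for a palindromic site.

```python
def cut_positions(sequence, site, cut_offset):
    """Return 0-based cut positions for every occurrence of `site`."""
    sequence = sequence.upper()
    site = site.upper()
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start + cut_offset)      # cut inside the site
        start = sequence.find(site, start + 1)    # allow overlapping sites
    return positions


def fragment_lengths(sequence, site, cut_offset):
    """Return fragment lengths after complete digestion of a linear sequence."""
    cuts = cut_positions(sequence, site, cut_offset)
    bounds = [0] + cuts + [len(sequence)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


# Toy sequence with two assumed EcoRI sites (G^AATTC, so cut_offset = 1).
toy = "AAA" + "GAATTC" + "CCCC" + "GAATTC" + "TT"
print(cut_positions(toy, "GAATTC", 1))     # [4, 14]
print(fragment_lengths(toy, "GAATTC", 1))  # [4, 10, 7]
```

Comparing the number and spacing of predicted sites for several candidate enzymes is one plausible way to judge which would give a suitable marker density for a RAD library.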
Practical aspects of genetic identification of hallucinogenic and other poisonous mushrooms for clinical and forensic purposes

PubMed Central

Kowalczyk, Marek; Sekuła, Andrzej; Mleczko, Piotr; Olszowy, Zofia; Kujawa, Anna; Zubek, Szymon; Kupiec, Tomasz

2015-01-01

Aim: To assess the usefulness of a DNA-based method for identifying mushroom species for application in forensic laboratory practice. Methods: Two hundred twenty-one samples of clinical forensic material (dried mushrooms, food remains, stomach contents, feces, etc) were analyzed. The ITS2 region of nuclear ribosomal DNA (nrDNA) was sequenced and the sequences were compared with reference sequences collected from the National Center for Biotechnology Information gene bank (GenBank). Sporological identification of mushrooms was also performed for 57 samples of clinical material. Results: Of 221 samples, positive sequencing results were obtained for 152 (69%). The highest percentage of positive results was obtained for samples of dried mushrooms (96%) and food remains (91%). Comparison with GenBank sequences enabled identification of all samples at least at the genus level. Most samples (90%) were identified at the level of species or a group of closely related species. Sporological and molecular identification were consistent at the level of species or genus for 30% of analyzed samples. Conclusion: Molecular analysis identified a larger number of species than the sporological method. It proved to be suitable for analysis of evidential material (dried hallucinogenic mushrooms) in forensic genetic laboratories as well as to complement classical methods in the analysis of clinical material. PMID:25727040
Deletion endpoint allele-specificity in the developmentally regulated elimination of an internal sequence (IES) in Paramecium.

PubMed Central

Dubrana, K; Le Mouël, A; Amar, L

1997-01-01

Ciliated protozoa undergo thousands of site-specific DNA deletion events during the programmed development of micronuclear genomes to macronuclear genomes. Two deletion elements, W1 and W2, were identified in the Paramecium primaurelia wild-type 156 strain. Here, we report the characterization of both elements in wild-type strain 168 and show that they display variant deletion patterns when compared with those of strain 156. The W1(168) element is defective for deletion. The W2(168) element is excised utilizing two alternative boundaries on one side, both of which differ from the boundary utilized to excise the W2(156) element. By crossing the 156 and 168 strains, we demonstrate that the definition of each deletion endpoint is controlled by cis-acting determinant(s) rather than by strain-specific trans-acting factor(s). Sequence comparison of all deleted DNA segments indicates that the 5'-TA-3' terminal sequence is strictly required at their ends. Furthermore, the identity of the first eight base pairs of these ends to a previously established consensus sequence correlates with the frequency of the corresponding deletion events. Our data imply the existence of an adaptive convergent evolution of these Paramecium deleted DNA segment end sequences. PMID:9171098
Practical aspects of genetic identification of hallucinogenic and other poisonous mushrooms for clinical and forensic purposes.

PubMed

Kowalczyk, Marek; Sekuła, Andrzej; Mleczko, Piotr; Olszowy, Zofia; Kujawa, Anna; Zubek, Szymon; Kupiec, Tomasz

2015-02-01

To assess the usefulness of a DNA-based method for identifying mushroom species for application in forensic laboratory practice. Two hundred twenty-one samples of clinical forensic material (dried mushrooms, food remains, stomach contents, feces, etc) were analyzed. The ITS2 region of nuclear ribosomal DNA (nrDNA) was sequenced and the sequences were compared with reference sequences collected from the National Center for Biotechnology Information gene bank (GenBank). Sporological identification of mushrooms was also performed for 57 samples of clinical material. Of 221 samples, positive sequencing results were obtained for 152 (69%). The highest percentage of positive results was obtained for samples of dried mushrooms (96%) and food remains (91%). Comparison with GenBank sequences enabled identification of all samples at least at the genus level. Most samples (90%) were identified at the level of species or a group of closely related species. Sporological and molecular identification were consistent at the level of species or genus for 30% of analyzed samples. Molecular analysis identified a larger number of species than the sporological method. It proved to be suitable for analysis of evidential material (dried hallucinogenic mushrooms) in forensic genetic laboratories as well as to complement classical methods in the analysis of clinical material.
Specific DNA binding of the two chicken Deformed family homeodomain proteins, Chox-1.4 and Chox-a.

PubMed Central

Sasaki, H; Yokoyama, E; Kuroiwa, A

1990-01-01

The cDNA clones encoding two chicken Deformed (Dfd) family homeobox containing genes Chox-1.4 and Chox-a were isolated. Comparison of their amino acid sequences with another chicken Dfd family homeodomain protein and with those of mouse homologues revealed that strong homologies are located in the amino terminal regions and around the homeodomains. Although homologies in other regions were relatively low, some short conserved sequences were also identified. E. coli-made full length proteins were purified and used for the production of specific antibodies and for DNA binding studies. The binding profiles of these proteins to the 5'-leader and 5'-upstream sequences of Chox-1.4 and Chox-a coding regions were analyzed by immunoprecipitation and DNase I footprint assays. These two Chox proteins bound to the same sites in the 5'-flanking sequences of their coding regions with various affinities and their binding affinities to each site were nearly the same. The consensus sequences of the high and low affinity binding sites were TAATGA(C/G) and CTAATTTT, respectively. A clustered binding site was identified in the 5'-upstream of the Chox-a gene, suggesting that this clustered binding site works as a cis-regulatory element for auto- and/or cross-regulation of Chox-a gene expression. PMID:1970866
Diagnostic Applications of Next Generation Sequencing in Immunogenetics and Molecular Oncology

PubMed Central

Grumbt, Barbara; Eck, Sebastian H.; Hinrichsen, Tanja; Hirv, Kaimo

2013-01-01

Summary: With the introduction of the next generation sequencing (NGS) technologies, remarkable new diagnostic applications have been established in daily routine. Implementation of NGS is challenging in clinical diagnostics, but definite advantages and new diagnostic possibilities make the switch to the technology inevitable. In addition to the higher sequencing capacity, clonal sequencing of single molecules, multiplexing of samples, higher diagnostic sensitivity, workflow miniaturization, and cost benefits are some of the valuable features of the technology. After the recent advances, NGS emerged as a proven alternative to classical Sanger sequencing in the typing of human leukocyte antigens (HLA). By virtue of the clonal amplification of single DNA molecules, ambiguous typing results can be avoided. Simultaneously, a higher sample throughput can be achieved by tagging of DNA molecules with multiplex identifiers and pooling of PCR products before sequencing. In our experience, up to 380 samples can be typed for HLA-A, -B, and -DRB1 in high-resolution during every sequencing run. In molecular oncology, NGS shows a markedly increased sensitivity in comparison to conventional Sanger sequencing and is developing into the standard diagnostic tool in detection of somatic mutations in cancer cells, with great impact on personalized treatment of patients. PMID:23922545
Cloned plasmid DNA fragments as calibrators for controlling GMOs: different real-time duplex quantitative PCR methods.

PubMed

Taverniers, Isabel; Van Bockstaele, Erik; De Loose, Marc

2004-03-01

Analytical real-time PCR technology is a powerful tool for implementation of the GMO labeling regulations enforced in the EU. The quality of analytical measurement data obtained by quantitative real-time PCR depends on the correct use of calibrator and reference materials (RMs). For GMO methods of analysis, the choice of appropriate RMs is currently under debate. So far, genomic DNA solutions from certified reference materials (CRMs) are most often used as calibrators for GMO quantification by means of real-time PCR. However, due to some intrinsic features of these CRMs, errors may be expected in the estimations of DNA sequence quantities. In this paper, two new real-time PCR methods are presented for Roundup Ready soybean, in which two types of plasmid DNA fragments are used as calibrators. Single-target plasmids (STPs) diluted in a background of genomic DNA were used in the first method. Multiple-target plasmids (MTPs) containing both sequences in one molecule were used as calibrators for the second method. Both methods simultaneously detect a promoter 35S sequence as GMO-specific target and a lectin gene sequence as endogenous reference target in a duplex PCR. For the estimation of relative GMO percentages both "delta C(T)" and "standard curve" approaches are tested. Delta C(T) methods are based on direct comparison of measured C(T) values of both the GMO-specific target and the endogenous target. Standard curve methods measure absolute amounts of target copies or haploid genome equivalents. A duplex delta C(T) method with STP calibrators performed at least as well as a similar method with genomic DNA calibrators from commercial CRMs. Besides this, high quality results were obtained with a standard curve method using MTP calibrators. This paper demonstrates that plasmid DNA molecules containing either one or multiple target sequences form perfect alternative calibrators for GMO quantification and are especially suitable for duplex PCR reactions.
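The "delta C(T)" idea described in this record, comparing the C(T) of the GMO-specific target directly with the C(T) of the endogenous reference, can be illustrated with the widely used comparative (delta-delta C(T)) calculation. The sketch below is an assumption-laden illustration rather than the authors' exact procedure: it assumes both targets amplify with the same efficiency (a doubling per cycle by default), that a calibrator of known GMO percentage is run alongside the sample, and all names and C(T) values are hypothetical.

```python
def relative_gmo_percent(ct_gmo_sample, ct_ref_sample,
                         ct_gmo_cal, ct_ref_cal,
                         calibrator_percent, efficiency=2.0):
    """Estimate a relative GMO percentage from duplex C(T) values.

    Assumes equal amplification efficiency for both targets
    (efficiency=2.0 means perfect doubling each cycle).
    """
    delta_sample = ct_gmo_sample - ct_ref_sample   # delta C(T) of the sample
    delta_cal = ct_gmo_cal - ct_ref_cal            # delta C(T) of the calibrator
    delta_delta = delta_sample - delta_cal
    return calibrator_percent * efficiency ** (-delta_delta)


# Hypothetical C(T) values: the sample's GMO target appears one cycle later
# (relative to the reference) than in a 1% calibrator.
print(relative_gmo_percent(31.0, 24.0, 30.0, 24.0, calibrator_percent=1.0))  # 0.5
```

A standard curve approach would instead convert each C(T) to an absolute copy number before taking the ratio, which avoids relying on the equal-efficiency assumption made here.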
Dynamic programming algorithms for biological sequence comparison.

PubMed

Pearson, W R; Miller, W

1992-01-01

Efficient dynamic programming algorithms are available for a broad class of protein and DNA sequence comparison problems. These algorithms require computer time proportional to the product of the lengths of the two sequences being compared [O(N^2)] but require memory space proportional only to the sum of these lengths [O(N)]. Although the requirement for O(N^2) time limits use of the algorithms to the largest computers when searching protein and DNA sequence databases, many other applications of these algorithms, such as calculation of distances for evolutionary trees and comparison of a new sequence to a library of sequence profiles, are well within the capabilities of desktop computers. In particular, the results of library searches with rapid searching programs, such as FASTA or BLAST, should be confirmed by performing a rigorous optimal alignment. Whereas rapid methods do not overlook significant sequence similarities, FASTA limits the number of gaps that can be inserted into an alignment, so that a rigorous alignment may extend the alignment substantially in some cases. BLAST does not allow gaps in the local regions that it reports; a calculation that allows gaps is very likely to extend the alignment substantially. Although a Monte Carlo evaluation of the statistical significance of a similarity score with a rigorous algorithm is much slower than the heuristic approach used by the RDF2 program, the dynamic programming approach should take less than 1 hr on a 386-based PC or desktop Unix workstation. For descriptive purposes, we have limited our discussion to methods for calculating similarity scores and distances that use gap penalties of the form g = rk. Nevertheless, programs for the more general case (g = q+rk) are readily available. Versions of these programs that run either on Unix workstations, IBM-PC class computers, or the Macintosh can be obtained from either of the authors.
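The time and memory trade-off described in this record can be seen in a small sketch. The code below computes a global similarity score with a gap penalty of the form g = rk (each gapped position costs r), filling the dynamic programming matrix row by row and keeping only two rows in memory, so time grows with the product of the sequence lengths while memory grows only linearly. The function name, the match and mismatch scores, and the value of r are illustrative assumptions, not values from the paper, and this is not the authors' program.

```python
def similarity_score(a, b, match=1, mismatch=-1, r=2):
    """Global similarity score with linear gap penalty g = r * k.

    Time is proportional to len(a) * len(b); memory holds two rows only.
    """
    if len(b) > len(a):
        a, b = b, a                        # scoring is symmetric; keep rows short
    previous = [-r * j for j in range(len(b) + 1)]   # row for the empty prefix of a
    for i, x in enumerate(a, start=1):
        current = [-r * i]                 # aligning a[:i] against nothing
        for j, y in enumerate(b, start=1):
            diagonal = previous[j - 1] + (match if x == y else mismatch)
            up = previous[j] - r           # gap in b
            left = current[j - 1] - r      # gap in a
            current.append(max(diagonal, up, left))
        previous = current
    return previous[-1]


print(similarity_score("ACGT", "ACGT"))  # 4: four matches, no gaps
print(similarity_score("ACGT", "AGT"))   # 1: three matches minus one gap of cost 2
```

This sketch returns only the score. Recovering the alignment itself in linear space requires an additional divide-and-conquer step, which is omitted here.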
Mixed Sequence Reader: A Program for Analyzing DNA Sequences with Heterozygous Base Calling

PubMed Central

Chang, Chun-Tien; Tsai, Chi-Neu; Tang, Chuan Yi; Chen, Chun-Houh; Lian, Jang-Hau; Hu, Chi-Yu; Tsai, Chia-Lung; Chao, Angel; Lai, Chyong-Huey; Wang, Tzu-Hao; Lee, Yun-Shien

2012-01-01

The direct sequencing of PCR products generates heterozygous base-calling fluorescence chromatograms that are useful for identifying single-nucleotide polymorphisms (SNPs), insertion-deletions (indels), short tandem repeats (STRs), and paralogous genes. Indels and STRs can be easily detected using the currently available Indelligent or ShiftDetector programs, which do not search reference sequences. However, the detection of other genomic variants remains a challenge due to the lack of appropriate tools for heterozygous base-calling fluorescence chromatogram data analysis. In this study, we developed a free web-based program, Mixed Sequence Reader (MSR), which can directly analyze heterozygous base-calling fluorescence chromatogram data in .abi file format using comparisons with reference sequences. The heterozygous sequences are identified as two distinct sequences and aligned with reference sequences. Our results showed that MSR may be used to (i) physically locate indel and STR sequences and determine STR copy number by searching NCBI reference sequences; (ii) predict combinations of microsatellite patterns using the Federal Bureau of Investigation Combined DNA Index System (CODIS); (iii) determine human papilloma virus (HPV) genotypes by searching current viral databases in cases of double infections; (iv) estimate the copy number of paralogous genes, such as β-defensin 4 (DEFB4) and its paralog HSPDP3. PMID:22778697
Analysis of developmental gene conservation in the Actinomycetales using DNA/DNA microarray comparisons.

PubMed

Kirby, Ralph; Herron, Paul; Hoskisson, Paul

2011-02-01

Based on available genome sequences, Actinomycetales show significant gene synteny across a wide range of species and genera. In addition, many genera show varying degrees of complex morphological development. Using the presence of gene synteny as a basis, it is clear that an analysis of gene conservation across the Streptomyces and various other Actinomycetales will provide information on both the importance of genes and gene clusters and the evolution of morphogenesis in these bacteria. Genome sequencing, although becoming cheaper, is still relatively expensive for comparing large numbers of strains. Thus, a heterologous DNA/DNA microarray hybridization dataset based on a Streptomyces coelicolor microarray allows a cheaper and greater depth of analysis of gene conservation. This study, using both bioinformatical and microarray approaches, was able to classify genes previously identified as involved in morphogenesis in Streptomyces into various subgroups in terms of conservation across species and genera. This will allow the targeting of genes for further study based on their importance at the species level and at higher evolutionary levels.
Epitopes of human testis-specific lactate dehydrogenase deduced from a cDNA sequence

DOE Office of Scientific and Technical Information (OSTI.GOV)

Millan, J.L.; Driscoll, C.E.; LeVan, K.M.

The sequence and structure of human testis-specific L-lactate dehydrogenase (LDHC4, LDHX; (L)-lactate:NAD+ oxidoreductase, EC 1.1.1.27) have been derived from analysis of a complementary DNA (cDNA) clone comprising the complete protein coding region of the enzyme. From the deduced amino acid sequence, human LDHC4 is as different from rodent LDHC4 (73% homology) as it is from human LDHA4 (76% homology) and porcine LDHB4 (68% homology). Subunit homologies are consistent with the conclusion that the LDHC gene arose by at least two independent duplication events. Furthermore, the lower degree of homology between mouse and human LDHC4 and the appearance of this isozyme late in evolution suggest a higher rate of mutation in the mammalian LDHC genes than in the LDHA and -B genes. Comparison of exposed amino acid residues of discrete antigenic determinants of mouse and human LDHC4 reveals significant differences. Knowledge of the human LDHC4 sequence will help design human-specific peptides useful in the development of a contraceptive vaccine.
Sequence analysis of ORF IV RTBV isolated from tungro infected Oryza sativa L. cv Ciherang

NASA Astrophysics Data System (ADS)

Hastilestari, Bernadetta Rina; Astuti, Dwi; Estiati, Amy; Nugroho, Satya

2015-09-01

Efforts to increase rice production are often constrained by pests and diseases such as tungro. Tungro disease is caused by joint infection with two dissimilar viruses: a bacilliform DNA virus, Rice tungro bacilliform virus (RTBV), and a spherical RNA virus, Rice tungro spherical virus (RTSV), both transmitted by the green leafhopper (Nephotettix virescens). The symptoms of the disease are caused by the presence of RTBV. The genome of RTBV consists of four open reading frames (ORFs), which encode functional proteins. Of the four, ORF IV is unique because it exists only in RTBV. The most efficient method of generating disease-resistant plants is to look for natural sources of resistance genes in wild material or germplasm and then transfer the gene and the accompanying resistance into cultivated crop varieties. The aim of this study is, therefore, to isolate and analyze the 1170 bp ORF IV gene of the tungro virus isolated from an Indonesian rice cultivar, Ciherang (Oryza sativa L. cv Indica). DNA sequencing analysis using BLAST showed 94% similarity with the GenBank reference sequence (Acc. M65026.1). Comparisons and mutation analysis of the DNA sequences are discussed.
Enterobacter muelleri sp. nov., isolated from the rhizosphere of Zea mays.

PubMed

Kämpfer, Peter; McInroy, John A; Glaeser, Stefanie P

2015-11-01

A beige-pigmented, oxidase-negative bacterial strain (JM-458T), isolated from a rhizosphere sample, was studied using a polyphasic taxonomic approach. Cells of the isolate were rod-shaped and stained Gram-negative. A comparison of the 16S rRNA gene sequence of strain JM-458T with sequences of the type strains of closely related species of the genus Enterobacter showed that it shared highest sequence similarity with Enterobacter mori (98.7 %), Enterobacter hormaechei (98.3 %), Enterobacter cloacae subsp. dissolvens, Enterobacter ludwigii and Enterobacter asburiae (all 98.2 %). 16S rRNA gene sequence similarities to all other Enterobacter species were below 98 %. Multilocus sequence analysis based on concatenated partial rpoB, gyrB, infB and atpD gene sequences showed a clear distinction of strain JM-458T from its closest related type strains. The fatty acid profile of the strain consisted of C16 : 0, C17 : 0 cyclo, iso-C15 : 0 2-OH/C16 : 1ω7c and C18 : 1ω7c as major components. DNA-DNA hybridizations between strain JM-458T and the type strains of E. mori, E. hormaechei and E. ludwigii resulted in relatedness values of 29 % (reciprocal 25 %), 24 % (reciprocal 43 %) and 16 % (reciprocal 17 %), respectively. DNA-DNA hybridization results together with multilocus sequence analysis results and differential biochemical and chemotaxonomic properties showed that strain JM-458T represents a novel species of the genus Enterobacter, for which the name Enterobacter muelleri sp. nov. is proposed. The type strain is JM-458T ( = DSM 29346T = CIP 110826T = LMG 28480T = CCM 8546T).
Inferring Higher Functional Information for RIKEN Mouse Full-Length cDNA Clones With FACTS

PubMed Central

Nagashima, Takeshi; Silva, Diego G.; Petrovsky, Nikolai; Socha, Luis A.; Suzuki, Harukazu; Saito, Rintaro; Kasukawa, Takeya; Kurochkin, Igor V.; Konagaya, Akihiko; Schönbach, Christian

2003-01-01

FACTS (Functional Association/Annotation of cDNA Clones from Text/Sequence Sources) is a semiautomated knowledge discovery and annotation system that integrates molecular function information derived from sequence analysis results (sequence inferred) with functional information extracted from text. Text-inferred information was extracted from keyword-based retrievals of MEDLINE abstracts and by matching of gene or protein names to OMIM, BIND, and DIP database entries. Using FACTS, we found that 47.5% of the 60,770 RIKEN mouse cDNA FANTOM2 clone annotations were informative for text searches. MEDLINE queries yielded molecular interaction-containing sentences for 23.1% of the clones. When disease MeSH and GO terms were matched with retrieved abstracts, 22.7% of clones were associated with potential diseases, and 32.5% with GO identifiers. A significant number (23.5%) of disease MeSH-associated clones were also found to have a hereditary disease association (OMIM Morbidmap). Inferred neoplastic and nervous system disease represented 49.6% and 36.0% of disease MeSH-associated clones, respectively. A comparison of sequence-based GO assignments with informative text-based GO assignments revealed that for 78.2% of clones, identical GO assignments were provided for that clone by either method, whereas for 21.8% of clones, the assignments differed. In contrast, for OMIM assignments, only 28.5% of clones had identical sequence-based and text-based OMIM assignments. Sequence, sentence, and term-based functional associations are included in the FACTS database (http://facts.gsc.riken.go.jp/), which permits results to be annotated and explored through web-accessible keyword and sequence search interfaces. The FACTS database will be a critical tool for investigating the functional complexity of the mouse transcriptome, cDNA-inferred interactome (molecular interactions), and pathome (pathologies). PMID:12819151
Strong transcription blockage mediated by R-loop formation within a G-rich homopurine–homopyrimidine sequence localized in the vicinity of the promoter

PubMed Central

Soo Shin, Jane Hae

2017-01-01

Abstract: Guanine-rich (G-rich) homopurine–homopyrimidine nucleotide sequences can block transcription with an efficiency that depends upon their orientation, composition and length, as well as the presence of negative supercoiling or breaks in the non-template DNA strand. We report that a G-rich sequence in the non-template strand reduces the yield of T7 RNA polymerase transcription by more than an order of magnitude when positioned close (9 bp) to the promoter, in comparison to that for a distal (∼250 bp) location of the same sequence. This transcription blockage is much less pronounced for a C-rich sequence, and is not significant for an A-rich sequence. Remarkably, the blockage is not pronounced if transcription is performed in the presence of RNase H, which specifically digests the RNA strands within RNA–DNA hybrids. The blockage also becomes less pronounced upon reduced RNA polymerase concentration. Based upon these observations and those from control experiments, we conclude that the blockage is primarily due to the formation of stable RNA–DNA hybrids (R-loops), which inhibit successive rounds of transcription. Our results could be relevant to transcription dynamics in vivo (e.g. transcription ‘bursting’) and may also have practical implications for the design of expression vectors. PMID:28498974
Strong transcription blockage mediated by R-loop formation within a G-rich homopurine-homopyrimidine sequence localized in the vicinity of the promoter.

PubMed

Belotserkovskii, Boris P; Soo Shin, Jane Hae; Hanawalt, Philip C

2017-06-20

Guanine-rich (G-rich) homopurine-homopyrimidine nucleotide sequences can block transcription with an efficiency that depends upon their orientation, composition and length, as well as the presence of negative supercoiling or breaks in the non-template DNA strand. We report that a G-rich sequence in the non-template strand reduces the yield of T7 RNA polymerase transcription by more than an order of magnitude when positioned close (9 bp) to the promoter, in comparison to that for a distal (∼250 bp) location of the same sequence. This transcription blockage is much less pronounced for a C-rich sequence, and is not significant for an A-rich sequence. Remarkably, the blockage is not pronounced if transcription is performed in the presence of RNase H, which specifically digests the RNA strands within RNA-DNA hybrids. The blockage also becomes less pronounced upon reduced RNA polymerase concentration. Based upon these observations and those from control experiments, we conclude that the blockage is primarily due to the formation of stable RNA-DNA hybrids (R-loops), which inhibit successive rounds of transcription. Our results could be relevant to transcription dynamics in vivo (e.g. transcription 'bursting') and may also have practical implications for the design of expression vectors. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Molecular characterization of DnaJ 5 homologs in silkworm Bombyx mori and its expression during egg diapause.

PubMed

Sirigineedi, Sasibhushan; Vijayagowri, Esvaran; Murthy, Geetha N; Rao, Guruprasada; Ponnuvel, Kangayam M

2014-12-01

A comparison of the cDNA sequences (1 056 bp) of Bombyx mori DnaJ 5 homolog with B. mori genome revealed that unlike in other Hsps, it has an intron of 234 bp. The DnaJ 5 homolog contains 351 amino acids, of which 70 contain the conserved DnaJ domain at the N-terminal end. This homolog of B. mori has all desirable functional domains similar to other insects, and the 13 different DnaJ homologs identified in B. mori genome were distributed on different chromosomes. The expressed sequence tag database analysis of Hsp40 gene expression revealed higher expression in wing disc followed by diapause-induced eggs. Microarray analysis revealed higher expression of DnaJ 5 homolog at 18th h after oviposition in diapause-induced eggs. Further validation of DnaJ 5 expression through qPCR in diapause-induced and nondiapause eggs at different time intervals revealed higher expression in diapause eggs at 18 and 24 h after oviposition, which coincided with the expression of Hsp70 as the Hsp 40 is its co-chaperone. This study thus provides an outline of the genome organization of Hsp40 gene, and its role in egg diapause induction in B. mori. © 2013 Institute of Zoology, Chinese Academy of Sciences.
Evolution of the herpes thymidine kinase: identification and comparison of the equine herpesvirus 1 thymidine kinase gene reveals similarity to a cell-encoded thymidylate kinase.

PubMed Central

Robertson, G R; Whalley, J M

1988-01-01

We have identified the equine herpesvirus 1 (EHV-1) thymidine kinase gene (TK) by DNA-mediated transformation and by DNA sequencing. Alignment of the amino acid sequence of the EHV-1 TK with the TKs from 3 other herpesviruses revealed regions of homology, some of which correspond to the previously identified substrate binding sites, while others have as yet, no assigned function. In particular, the strict conservation of an aspartate within the proposed nucleoside binding site suggests a role in ATP binding for this residue. Comparison of 5 herpes TKs with the thymidylate kinase of yeast revealed significant similarity which was strongest in those regions important to catalytic activity of the herpes TKs, and, therefore we propose that the herpes TK may be derived from a cellular thymidylate kinase. The implications for the evolution of enzyme activities within a pathway of nucleotide metabolism are discussed. PMID:2849761
The mitochondrial genome of Paraspadella gotoi is highly reduced and reveals that chaetognaths are a sister-group to protostomes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Helfenbein, Kevin G.; Fourcade, H. Matthew; Vanjani, Rohit G.

2004-05-01

We report the first complete mitochondrial (mt) DNA sequence from a member of the phylum Chaetognatha (arrow worms). The Paraspadella gotoi mtDNA is highly unusual, missing 23 of the genes commonly found in animal mtDNAs, including atp6, which has otherwise been found universally to be present. Its 14 genes are unusually arranged into two groups, one on each strand. One group is punctuated by numerous non-coding intergenic nucleotides, while the other group is tightly packed, having no non-coding nucleotides, leading to speculation that there are two transcription units with differing modes of expression. The phylogenetic position of the Chaetognatha within the Metazoa has long been uncertain, with conflicting or equivocal results from various morphological analyses and rRNA sequence comparisons. Comparisons here of amino acid sequences from mitochondrially encoded proteins give a single most parsimonious tree that supports a position of Chaetognatha as sister to the protostomes studied here. From this, one can more clearly interpret the patterns of evolution of various developmental features, especially regarding the embryological fate of the blastopore.
Structure, organization and expression of common carp (Cyprinus carpio L.) SLP-76 gene.

PubMed

Huang, Rong; Sun, Xiao-Feng; Hu, Wei; Wang, Ya-Ping; Guo, Qiong-Lin

2008-05-01

SLP-76 is an important member of the SLP-76 family of adapters, and it plays a key role in TCR signaling and T cell function. A partial cDNA sequence of SLP-76 of common carp (Cyprinus carpio L.) was isolated from a thymus cDNA library by suppression subtractive hybridization (SSH). Subsequently, the full-length cDNA of carp SLP-76 was obtained by means of 3' RACE and 5' RACE. The full-length cDNA of carp SLP-76 was 2007 bp, consisting of a 5'-terminal untranslated region (UTR) of 285 bp, a 3'-terminal UTR of 240 bp, and an open reading frame of 1482 bp. Sequence comparison showed that the deduced amino acid sequence of carp SLP-76 had an overall similarity of 34-73% to that of homologues from other species, and it was composed of an NH2-terminal domain, a central proline-rich domain, and a C-terminal SH2 domain. Amino acid sequence analysis indicated the existence of a Gads binding site R-X-X-K, a 10-aa-long sequence which binds to the SH3 domain of LCK in vitro, and three conserved tyrosine-containing sequences in the NH2-terminal domain. We then used PCR to obtain a genomic DNA fragment covering the entire coding region of carp SLP-76. In the 9.2-kb genomic sequence, twenty-one exons and twenty introns were identified. RT-PCR results showed that carp SLP-76 was expressed predominantly in hematopoietic tissues, and was upregulated in thymus tissue of four-month-old carp compared to one-year-old carp. RT-PCR and virtual northern hybridization results showed that carp SLP-76 was also upregulated in thymus tissue of GH transgenic carp at the age of four months. These results suggest that the expression level of the SLP-76 gene may be related to thymocyte development in teleosts.
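The segment lengths quoted in this abstract can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the original record; the variable names are illustrative, and it assumes that the 1482-bp open reading frame includes a 3-nt stop codon, which the abstract does not state.

```python
# Lengths quoted in the abstract, in base pairs.
utr5, orf, utr3 = 285, 1482, 240

# The two UTRs plus the ORF should add up to the full-length cDNA.
total = utr5 + orf + utr3
assert total == 2007

# Assumption (not stated in the abstract): the ORF length includes the stop codon.
codons = orf // 3          # number of complete codons in the ORF
residues = codons - 1      # amino acids encoded if the last codon is a stop

print(total, codons, residues)  # 2007 494 493
```

If the stop codon were excluded from the reported ORF length, the deduced protein would instead be 494 residues long.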

Molecular characterization of Fasciola gigantica from Mauritania based on mitochondrial and nuclear ribosomal DNA sequences.

PubMed

Amor, Nabil; Farjallah, Sarra; Salem, Mohamed; Lamine, Dia Mamadou; Merella, Paolo; Said, Khaled; Ben Slimane, Badreddine

2011-10-01

Fasciolosis caused by Fasciola hepatica and Fasciola gigantica (Platyhelminthes: Trematoda: Digenea) is considered the most important helminth infection of ruminants in tropical countries, causing considerable socioeconomic problems. From Africa, F. gigantica has been previously characterized from Burkina Faso, Senegal, Kenya, Zambia and Mali, while F. hepatica has been reported from Morocco and Tunisia, and both species have been observed from Ethiopia and Egypt on the basis of morphometric differences, while the use of molecular markers is necessary to distinguish exactly between species. Samples identified morphologically as F. gigantica (n=60) from sheep and cattle from different geographical localities of Mauritania were genetically characterized by sequences of the first (ITS-1), the 5.8S, and second (ITS-2) Internal Transcribed Spacers (ITS) of nuclear ribosomal DNA (rDNA) genes and the mitochondrial Cytochrome c Oxidase I (COI) gene. Comparison of the sequences of the Mauritanian samples with sequences of Fasciola spp. from GenBank confirmed that all samples belong to the species F. gigantica. The nucleotide sequencing of ITS rDNA of F. gigantica showed no nucleotide variation in the ITS-1, 5.8S, and ITS-2 rDNA sequences among all samples examined and those from Burkina Faso, Kenya, Egypt and Iran. The phylogenetic trees based on the ITS-1 and ITS-2 sequences showed a close relationship of the Mauritanian samples with isolates of F. gigantica from different localities of Africa and Asia. The COI genotypes of the Mauritanian specimens of F. gigantica had a high level of diversity, and they belonged to the F. gigantica phylogenetically distinguishable clade. The present study is the first molecular characterization of F. gigantica in sheep and cattle from Mauritania, allowing a reliable approach for the genetic differentiation of Fasciola spp. and providing a basis for further studies on liver flukes in the African countries. Copyright © 2011 Elsevier Inc. All rights reserved.
Sequence of the tomato chloroplast DNA and evolutionary comparison of solanaceous plastid genomes.

PubMed

Kahlau, Sabine; Aspinall, Sue; Gray, John C; Bock, Ralph

2006-08-01

Tomato, Solanum lycopersicum (formerly Lycopersicon esculentum), has long been one of the classical model species of plant genetics. More recently, solanaceous species have become a model of evolutionary genomics, with several EST projects and a tomato genome project having been initiated. As a first contribution toward deciphering the genetic information of tomato, we present here the complete sequence of the tomato chloroplast genome (plastome). The size of this circular genome is 155,461 base pairs (bp), with an average AT content of 62.14%. It contains 114 genes and conserved open reading frames (ycfs). Comparison with the previously sequenced plastid DNAs of Nicotiana tabacum and Atropa belladonna reveals patterns of plastid genome evolution in the Solanaceae family and identifies varying degrees of conservation of individual plastid genes. In addition, we discovered several new sites of RNA editing by cytidine-to-uridine conversion. A detailed comparison of editing patterns in the three solanaceous species highlights the dynamics of RNA editing site evolution in chloroplasts. To assess the level of intraspecific plastome variation in tomato, the plastome of a second tomato cultivar was sequenced. Comparison of the two genotypes (IPA-6, bred in South America, and Ailsa Craig, bred in Europe) revealed no nucleotide differences, suggesting that the plastomes of modern tomato cultivars display very little, if any, sequence variation.
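A composition figure such as the 62.14% AT content reported above is a simple base count over the sequence. The sketch below is illustrative only: the function name `at_content` is hypothetical, the input is a toy string rather than tomato plastome data, and ambiguous symbols such as N are simply ignored, which may differ from how the authors computed their figure.

```python
def at_content(seq: str) -> float:
    """Return the percentage of A and T among the A/C/G/T bases of a DNA sequence."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    total = at + gc  # ambiguous symbols (e.g. N) are not counted
    if total == 0:
        raise ValueError("no A/C/G/T bases found")
    return 100.0 * at / total


# Toy example: 6 of the 8 bases are A or T.
print(at_content("ATGCATTA"))  # 75.0

# The complementary GC content implied by the reported AT content.
print(round(100 - 62.14, 2))  # 37.86
```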
DNA primers for amplification of mitochondrial cytochrome c oxidase subunit I from diverse metazoan invertebrates.

PubMed

Folmer, O; Black, M; Hoeh, W; Lutz, R; Vrijenhoek, R

1994-10-01

We describe "universal" DNA primers for polymerase chain reaction (PCR) amplification of a 710-bp fragment of the mitochondrial cytochrome c oxidase subunit I gene (COI) from 11 invertebrate phyla: Echinodermata, Mollusca, Annelida, Pogonophora, Arthropoda, Nemertinea, Echiura, Sipuncula, Platyhelminthes, Tardigrada, and Coelenterata, as well as the putative phylum Vestimentifera. Preliminary comparisons revealed that these COI primers generate informative sequences for phylogenetic analyses at the species and higher taxonomic levels.
DOE Office of Scientific and Technical Information (OSTI.GOV)

McInerney, Peter; Adams, Paul; Hadi, Masood Z.

As larger-scale cloning projects become more prevalent, there is an increasing need for comparisons among high fidelity DNA polymerases used for PCR amplification. All polymerases marketed for PCR applications are tested for fidelity properties (i.e., error rate determination) by vendors, and numerous literature reports have addressed PCR enzyme fidelity. Nonetheless, it is often difficult to make direct comparisons among different enzymes due to numerous methodological and analytical differences from study to study. We have measured the error rates for 6 DNA polymerases commonly used in PCR applications, including 3 polymerases typically used for cloning applications requiring high fidelity. Error rate measurement values reported here were obtained by direct sequencing of cloned PCR products. The strategy employed here allows interrogation of error rate across a very large DNA sequence space, since 94 unique DNA targets were used as templates for PCR cloning. Of the six enzymes included in the study (Taq polymerase, AccuPrime-Taq High Fidelity, KOD Hot Start, cloned Pfu polymerase, Phusion Hot Start, and Pwo polymerase), we find the lowest error rates with Pfu, Phusion, and Pwo polymerases. Error rates are comparable for these 3 enzymes and are >10x lower than the error rate observed with Taq polymerase. Mutation spectra are reported, with the 3 high fidelity enzymes displaying broadly similar types of mutations. For these enzymes, transition mutations predominate, with little bias observed for type of transition.
The mitochondrial genome sequence of Enterobius vermicularis (Nematoda: Oxyurida)--an idiosyncratic gene order and phylogenetic information for chromadorean nematodes.

PubMed

Kang, Seokha; Sultana, Tahera; Eom, Keeseon S; Park, Yung Chul; Soonthornpong, Nathan; Nadler, Steven A; Park, Joong-Ki

2009-01-15

The complete mitochondrial genome sequence was determined for the human pinworm Enterobius vermicularis (Oxyurida: Nematoda) and used to infer its phylogenetic relationship to other major groups of chromadorean nematodes. The E. vermicularis genome is a 14,010-bp circular DNA molecule that encodes 36 genes (12 proteins, 22 tRNAs, and 2 rRNAs). This mtDNA genome lacks atp8, as reported for almost all other nematode species investigated. Phylogenetic analyses (maximum parsimony, maximum likelihood, neighbor joining, and Bayesian inference) of nucleotide sequences for the 12 protein-coding genes of 25 nematode species placed E. vermicularis, a representative of the order Oxyurida, as sister to the main Ascaridida+Rhabditida group. Tree topology comparisons using statistical tests rejected an alternative hypothesis favoring a closer relationship among Ascaridida, Spirurida, and Oxyurida, which has been supported from most studies based on nuclear ribosomal DNA sequences. Unlike the relatively conserved gene arrangement found for most chromadorean taxa, E. vermicularis mtDNA gene order is very unique, not sharing similarity to any other nematode species reported to date. This lack of gene order similarity may represent idiosyncratic gene rearrangements unique to this specific lineage of the oxyurids. To more fully understand the extent of gene rearrangement and its evolutionary significance within the nematode phylogenetic framework, additional mitochondrial genomes representing a greater evolutionary diversity of species must be characterized.
Use of DNA barcodes to identify flowering plants.

PubMed

Kress, W John; Wurdack, Kenneth J; Zimmer, Elizabeth A; Weigt, Lee A; Janzen, Daniel H

2005-06-07

Methods for identifying species by using short orthologous DNA sequences, known as "DNA barcodes," have been proposed and initiated to facilitate biodiversity studies, identify juveniles, associate sexes, and enhance forensic analyses. The cytochrome c oxidase 1 sequence, which has been found to be widely applicable in animal barcoding, is not appropriate for most species of plants because of a much slower rate of cytochrome c oxidase 1 gene evolution in higher plants than in animals. We therefore propose the nuclear internal transcribed spacer region and the plastid trnH-psbA intergenic spacer as potentially usable DNA regions for applying barcoding to flowering plants. The internal transcribed spacer is the most commonly sequenced locus used in plant phylogenetic investigations at the species level and shows high levels of interspecific divergence. The trnH-psbA spacer, although short (approximately 450 bp), is the most variable plastid region in angiosperms and is easily amplified across a broad range of land plants. Comparison of the total plastid genomes of tobacco and deadly nightshade, enhanced with trials on widely divergent angiosperm taxa, including closely related species in seven plant families and a group of species sampled from a local flora encompassing 50 plant families (for a total of 99 species, 80 genera, and 53 families), suggests that the sequences in this pair of loci have the potential to discriminate among the largest number of plant species for barcoding purposes.
footprintDB: a database of transcription factors with annotated cis elements and binding interfaces.

PubMed

Sebastian, Alvaro; Contreras-Moreira, Bruno

2014-01-15

Traditional and high-throughput techniques for determining transcription factor (TF) binding specificities are generating large volumes of data of uneven quality, which are scattered across individual databases. FootprintDB integrates some of the most comprehensive freely available libraries of curated DNA binding sites and systematically annotates the binding interfaces of the corresponding TFs. The first release contains 2422 unique TF sequences, 10 112 DNA binding sites and 3662 DNA motifs. A survey of the included data sources, organisms and TF families was performed together with proprietary database TRANSFAC, finding that footprintDB has a similar coverage of multicellular organisms, while also containing bacterial regulatory data. A search engine has been designed that drives the prediction of DNA motifs for input TFs, or conversely of TF sequences that might recognize input regulatory sequences, by comparison with database entries. Such predictions can also be extended to a single proteome chosen by the user, and results are ranked in terms of interface similarity. Benchmark experiments with bacterial, plant and human data were performed to measure the predictive power of footprintDB searches, which were able to correctly recover 10, 55 and 90% of the tested sequences, respectively. Correctly predicted TFs had a higher interface similarity than the average, confirming its diagnostic value. Web site implemented in PHP, Perl, MySQL and Apache. Freely available from http://floresta.eead.csic.es/footprintdb.
Resolution of the African hominoid trichotomy by use of a mitochondrial gene sequence

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ruvolo, M.; Disotell, T.R.; Allard, M.W.

1991-02-15

Mitochondrial DNA sequences encoding the cytochrome oxidase subunit II gene have been determined for five primate species, siamang (Hylobates syndactylus), lowland gorilla (Gorilla gorilla), pygmy chimpanzee (Pan paniscus), crab-eating macaque (Macaca fascicularis), and green monkey (Cercopithecus aethiops), and compared with published sequences of other primate and nonprimate species. Comparisons of cytochrome oxidase subunit II gene sequences provide clear-cut evidence from the mitochondrial genome for the separation of the African ape trichotomy into two evolutionary lineages, one leading to gorillas and the other to humans and chimpanzees. Several different tree-building methods support this same phylogenetic tree topology. The comparisons also yield trees in which a substantial length separates the divergence point of gorillas from that of humans and chimpanzees, suggesting that the lineage most immediately ancestral to humans and chimpanzees may have been in existence for a relatively long time.
The impact of targeting repetitive BamHI-W sequences on the sensitivity and precision of EBV DNA quantification.

PubMed

Sanosyan, Armen; Fayd'herbe de Maudave, Alexis; Bollore, Karine; Zimmermann, Valérie; Foulongne, Vincent; Van de Perre, Philippe; Tuaillon, Edouard

2017-01-01

Viral load monitoring and early Epstein-Barr virus (EBV) DNA detection are essential in routine laboratory testing, especially in preemptive management of Post-transplant Lymphoproliferative Disorder. Targeting the repetitive BamHI-W sequence was shown to increase the sensitivity of EBV DNA quantification, but the variability of BamHI-W reiterations was suggested to be a source of quantification bias. We aimed to assess the extent of variability associated with BamHI-W PCR and its impact on the sensitivity of EBV DNA quantification using the 1st WHO international standard, EBV strains and clinical samples. Repetitive BamHI-W and single-copy LMP2 sequences were amplified by in-house qPCRs, and the BXLF-1 sequence by a commercial assay (EBV R-gene™, BioMerieux). Linearity and limits of detection of in-house methods were assessed. The impact of repeated versus single target sequences on EBV DNA quantification precision was tested on B95.8 and Raji cell lines, possessing 11 and 7 copies of the BamHI-W sequence, respectively, and on clinical samples. BamHI-W qPCR demonstrated a lower limit of detection compared to LMP2 qPCR (2.33 log10 versus 3.08 log10 IU/mL; P = 0.0002). BamHI-W qPCR underestimated the EBV DNA load on the Raji strain, which contained fewer BamHI-W copies than the WHO standard derived from the B95.8 EBV strain (mean bias: -0.21 log10; 95% CI, -0.54 to 0.12). Comparison of BamHI-W qPCR versus LMP2 and BXLF-1 qPCR showed an acceptable variability between EBV DNA levels in clinical samples, with the mean bias being within 0.5 log10 IU/mL EBV DNA, whereas a better quantitative concordance was observed between LMP2 and BXLF-1 assays. Targeting BamHI-W resulted in higher sensitivity compared to LMP2, but the variable reiterations of the BamHI-W segment are associated with higher quantification variability. BamHI-W can be considered for clinical and therapeutic monitoring to detect EBV DNA early and to follow dynamic changes in viral load.
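Because the limits of detection and the bias in this abstract are given on a log10 scale, it can help to convert them back to linear units. The sketch below is a reader's aid rather than anything from the study itself; the helper name `to_iu_per_ml` and the variable names are illustrative, and only figures quoted in the abstract are used.

```python
# Values quoted in the abstract (log10 IU/mL).
lod_bamhi_w = 2.33
lod_lmp2 = 3.08
raji_mean_bias = -0.21  # log10 difference observed on the Raji strain


def to_iu_per_ml(log10_value: float) -> float:
    """Convert a log10 IU/mL value to linear IU/mL."""
    return 10 ** log10_value


# Limits of detection on the linear scale.
print(round(to_iu_per_ml(lod_bamhi_w)))  # 214
print(round(to_iu_per_ml(lod_lmp2)))     # 1202

# A difference of 0.75 log10 corresponds to a fold difference of 10**0.75.
fold = 10 ** (lod_lmp2 - lod_bamhi_w)
print(round(fold, 1))  # 5.6

# A bias of -0.21 log10 means the measured load is about 62% of the reference.
print(round(10 ** raji_mean_bias, 2))  # 0.62
```

On this reading, the BamHI-W assay detects roughly 5.6-fold lower EBV DNA concentrations than the LMP2 assay.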
The impact of targeting repetitive BamHI-W sequences on the sensitivity and precision of EBV DNA quantification

PubMed Central

Fayd’herbe de Maudave, Alexis; Bollore, Karine; Zimmermann, Valérie; Foulongne, Vincent; Van de Perre, Philippe; Tuaillon, Edouard

2017-01-01

Background: Viral load monitoring and early Epstein-Barr virus (EBV) DNA detection are essential in routine laboratory testing, especially in preemptive management of Post-transplant Lymphoproliferative Disorder. Targeting the repetitive BamHI-W sequence was shown to increase the sensitivity of EBV DNA quantification, but the variability of BamHI-W reiterations was suggested to be a source of quantification bias. We aimed to assess the extent of variability associated with BamHI-W PCR and its impact on the sensitivity of EBV DNA quantification using the 1st WHO international standard, EBV strains and clinical samples. Methods: Repetitive BamHI-W and single-copy LMP2 sequences were amplified by in-house qPCRs, and the BXLF-1 sequence by a commercial assay (EBV R-gene™, BioMerieux). Linearity and limits of detection of in-house methods were assessed. The impact of repeated versus single target sequences on EBV DNA quantification precision was tested on B95.8 and Raji cell lines, possessing 11 and 7 copies of the BamHI-W sequence, respectively, and on clinical samples. Results: BamHI-W qPCR demonstrated a lower limit of detection compared to LMP2 qPCR (2.33 log10 versus 3.08 log10 IU/mL; P = 0.0002). BamHI-W qPCR underestimated the EBV DNA load on the Raji strain, which contained fewer BamHI-W copies than the WHO standard derived from the B95.8 EBV strain (mean bias: -0.21 log10; 95% CI, -0.54 to 0.12). Comparison of BamHI-W qPCR versus LMP2 and BXLF-1 qPCR showed an acceptable variability between EBV DNA levels in clinical samples, with the mean bias being within 0.5 log10 IU/mL EBV DNA, whereas a better quantitative concordance was observed between LMP2 and BXLF-1 assays. Conclusions: Targeting BamHI-W resulted in higher sensitivity compared to LMP2, but the variable reiterations of the BamHI-W segment are associated with higher quantification variability. BamHI-W can be considered for clinical and therapeutic monitoring to detect EBV DNA early and to follow dynamic changes in viral load. PMID:28850597
Evolutionary force of AT-rich repeats to trap genomic and episomal DNAs into the rice genome: lessons from endogenous pararetrovirus.

PubMed

Liu, Ruifang; Koyanagi, Kanako O; Chen, Sunlu; Kishima, Yuji

2012-12-01

In plant genomes, the incorporation of DNA segments is not a common method of artificial gene transfer. Nevertheless, various segments of pararetroviruses have been found in plant genomes in recent decades. The rice genome contains a number of segments of endogenous rice tungro bacilliform virus-like sequences (ERTBVs), many of which are present between AT dinucleotide repeats (ATrs). Comparison of genomic sequences between two closely related rice subspecies, japonica and indica, allowed us to verify the preferential insertion of ERTBVs into ATrs. In addition to ERTBVs, the comparative analyses showed that ATrs occasionally incorporate repeat sequences including transposable elements, and a wide range of other sequences. Besides the known genomic sequences, the insertion sequences also represented DNAs of unclear origins together with ERTBVs, suggesting that ATrs have integrated episomal DNAs that would have been suspended in the nucleus. Such insertion DNAs might be trapped by ATrs in the genome in a host-dependent manner. Conversely, other simple mono- and dinucleotide sequence repeats (SSR) were less frequently involved in insertion events relative to ATrs. Therefore, ATrs could be regarded as hot spots of double-strand breaks that induce non-homologous end joining. The insertions within ATrs occasionally generated new gene-related sequences or involved structural modifications of existing genes. Likewise, in a comparison between Arabidopsis thaliana and Arabidopsis lyrata, the insertions preferred ATrs to other SSRs. Therefore ATrs in plant genomes could be considered as genomic dumping sites that have trapped various DNA molecules and may have exerted a powerful evolutionary force. © 2012 The Authors. The Plant Journal © 2012 Blackwell Publishing Ltd.
RNA-primed complementary-sense DNA synthesis of the geminivirus African cassava mosaic virus.

PubMed Central

Saunders, K; Lucy, A; Stanley, J

1992-01-01

The plant DNA virus African cassava mosaic virus (ACMV) is believed to replicate by a rolling circle mechanism. To investigate complementary-sense DNA (lagging strand) synthesis, we have analysed the heterogeneous form of complementary-sense DNA (H3 DNA) from infected Nicotiana benthamiana by two-dimensional agarose gel electrophoresis and blot hybridisation. The presence of an RNA moiety is demonstrated by comparison of results for nucleic acids resolved on neutral/alkaline and neutral/formamide gels, suggesting that complementary-sense DNA synthesis on the virus-sense single-stranded DNA template is preceded by the synthesis of an RNA primer. Hybridisation with probes to specific parts of the ACMV DNA A genome indicates that synthesis of the putative RNA primer initiates between nucleotides 2581-221, a region that includes intergenic sequences that have been implicated in geminivirus DNA replication and the control of gene expression. PMID:1475192
Effect of DNA extraction and sample preservation method on rumen bacterial population.

PubMed

Fliegerova, Katerina; Tapio, Ilma; Bonin, Aurelie; Mrazek, Jakub; Callegari, Maria Luisa; Bani, Paolo; Bayat, Alireza; Vilkki, Johanna; Kopečný, Jan; Shingfield, Kevin J; Boyer, Frederic; Coissac, Eric; Taberlet, Pierre; Wallace, R John

2014-10-01

The comparison of the bacterial profile of intracellular (iDNA) and extracellular DNA (eDNA) isolated from cow rumen content stored under different conditions was conducted. The influence of rumen fluid treatment (cheesecloth squeezed, centrifuged, filtered), storage temperature (RT, -80 °C) and cryoprotectants (PBS-glycerol, ethanol) on quality and quantity parameters of extracted DNA was evaluated by bacterial DGGE analysis, real-time PCR quantification and metabarcoding approach using high-throughput sequencing. Samples clustered according to the type of extracted DNA due to considerable differences between iDNA and eDNA bacterial profiles, while storage temperature and cryoprotectants additives had little effect on sample clustering. The numbers of Firmicutes and Bacteroidetes were lower (P < 0.01) in eDNA samples. The qPCR indicated significantly higher amount of Firmicutes in iDNA sample frozen with glycerol (P < 0.01). Deep sequencing analysis of iDNA samples revealed the prevalence of Bacteroidetes and similarity of samples frozen with and without cryoprotectants, which differed from sample stored with ethanol at room temperature. Centrifugation and consequent filtration of rumen fluid subjected to the eDNA isolation procedure considerably changed the ratio of molecular operational taxonomic units (MOTUs) of Bacteroidetes and Firmicutes. Intracellular DNA extraction using bead-beating method from cheesecloth sieved rumen content mixed with PBS-glycerol and stored at -80 °C was found as the optimal method to study ruminal bacterial profile. Copyright © 2013 Elsevier Ltd. All rights reserved.
Genomics and museum specimens.

PubMed

Nachman, Michael W

2013-12-01

Nearly 25 years ago, Allan Wilson and colleagues isolated DNA sequences from museum specimens of kangaroo rats (Dipodomys panamintinus) and compared these sequences with those from freshly collected animals (Thomas et al. 1990). The museum specimens had been collected up to 78 years earlier, so the two samples provided a direct temporal comparison of patterns of genetic variation. This was not the first time DNA sequences had been isolated from preserved material, but it was the first time it had been carried out with a population sample. Population geneticists often try to make inferences about the influence of historical processes such as selection, drift, mutation and migration on patterns of genetic variation in the present. The work of Wilson and colleagues was important in part because it suggested a way in which population geneticists could actually study genetic change in natural populations through time, much the same way that experimentalists can do with artificial populations in the laboratory. Indeed, the work of Thomas et al. (1990) spawned dozens of studies in which museum specimens were used to compare historical and present-day genetic diversity (reviewed in Wandeler et al. 2007). All of these studies, however, were limited by the same fundamental problem: old DNA is degraded into short fragments. As a consequence, these studies mostly involved PCR amplification of short templates, usually short stretches of mitochondrial DNA or microsatellites. In this issue, Bi et al. (2013) report a breakthrough that should open the door to studies of genomic variation in museum specimens. They used target enrichment (exon capture) and next-generation (Illumina) sequencing to compare patterns of genetic variation in historic and present-day population samples of alpine chipmunks (Tamias alpinus) (Fig. 1). The historic samples came from specimens collected in 1915, so the temporal span of this comparison is nearly 100 years. © 2013 John Wiley & Sons Ltd.
Identification of Two Candidate Tumor Suppressor Genes on Chromosome 17p13.3: Assessment of Their Roles in Breast and Ovarian Carcinogenesis

DTIC Science & Technology

1997-07-01

minimum region of allelic loss on chromosome 17p 13.3, between polymorphic markers D17S5 and D17S28, in genomic DNA from breast and ovarian tumors (Figure 1...encode proteins of 443 and 227 amino acids, with no known functional motifs. Comparison of genomic and cDNA sequences showed that the genes overlap...is tissue specific (Figure 4). When zoo blots comprised of EcoRI fragments of genomic DNA from various species were probed with the unique exon 1 of
Interaction of Zn(II)bleomycin-A2 and Zn(II)peplomycin with a DNA hairpin containing the 5'-GT-3' binding site in comparison with the 5'-GC-3' binding site studied by NMR spectroscopy.

PubMed

Follett, Shelby E; Ingersoll, Azure D; Murray, Sally A; Reilly, Teresa M; Lehmann, Teresa E

2017-10-01

Bleomycins are a group of glycopeptide antibiotics synthesized by Streptomyces verticillus that are widely used for the treatment of various neoplastic diseases. These antibiotics have the ability to chelate a metal center, mainly Fe(II), and cause site-specific DNA cleavage. Bleomycins are differentiated by their C-terminal regions. Although this antibiotic family is a successful course of treatment for some types of cancers, it is known to cause pulmonary fibrosis. Previous studies have identified that bleomycin-related pulmonary toxicity is linked to the C-terminal region of these drugs. This region has been shown to closely interact with DNA. We examined the binding of Zn(II)peplomycin and Zn(II)bleomycin-A2 to a DNA hairpin of sequence 5'-CCAGTATTTTTACTGG-3', containing the binding site 5'-GT-3', and compared the results with those obtained from our studies of the same MBLMs bound to a DNA hairpin containing the binding site 5'-GC-3'. We provide evidence that the DNA base sequence has a strong impact on the final structure of the drug-target complex.
Multifocal clonal evolution characterized using circulating tumour DNA in a case of metastatic breast cancer

PubMed Central

Murtaza, Muhammed; Dawson, Sarah-Jane; Pogrebniak, Katherine; Rueda, Oscar M.; Provenzano, Elena; Grant, John; Chin, Suet-Feung; Tsui, Dana W. Y.; Marass, Francesco; Gale, Davina; Ali, H. Raza; Shah, Pankti; Contente-Cuomo, Tania; Farahani, Hossein; Shumansky, Karey; Kingsbury, Zoya; Humphray, Sean; Bentley, David; Shah, Sohrab P.; Wallis, Matthew; Rosenfeld, Nitzan; Caldas, Carlos

2015-01-01

Circulating tumour DNA analysis can be used to track tumour burden and analyse cancer genomes non-invasively but the extent to which it represents metastatic heterogeneity is unknown. Here we follow a patient with metastatic ER-positive and HER2-positive breast cancer receiving two lines of targeted therapy over 3 years. We characterize genomic architecture and infer clonal evolution in eight tumour biopsies and nine plasma samples collected over 1,193 days of clinical follow-up using exome and targeted amplicon sequencing. Mutation levels in the plasma samples reflect the clonal hierarchy inferred from sequencing of tumour biopsies. Serial changes in circulating levels of sub-clonal private mutations correlate with different treatment responses between metastatic sites. This comparison of biopsy and plasma samples in a single patient with metastatic breast cancer shows that circulating tumour DNA can allow real-time sampling of multifocal clonal evolution. PMID:26530965
Multifocal clonal evolution characterized using circulating tumour DNA in a case of metastatic breast cancer.

PubMed

Murtaza, Muhammed; Dawson, Sarah-Jane; Pogrebniak, Katherine; Rueda, Oscar M; Provenzano, Elena; Grant, John; Chin, Suet-Feung; Tsui, Dana W Y; Marass, Francesco; Gale, Davina; Ali, H Raza; Shah, Pankti; Contente-Cuomo, Tania; Farahani, Hossein; Shumansky, Karey; Kingsbury, Zoya; Humphray, Sean; Bentley, David; Shah, Sohrab P; Wallis, Matthew; Rosenfeld, Nitzan; Caldas, Carlos

2015-11-04

Circulating tumour DNA analysis can be used to track tumour burden and analyse cancer genomes non-invasively but the extent to which it represents metastatic heterogeneity is unknown. Here we follow a patient with metastatic ER-positive and HER2-positive breast cancer receiving two lines of targeted therapy over 3 years. We characterize genomic architecture and infer clonal evolution in eight tumour biopsies and nine plasma samples collected over 1,193 days of clinical follow-up using exome and targeted amplicon sequencing. Mutation levels in the plasma samples reflect the clonal hierarchy inferred from sequencing of tumour biopsies. Serial changes in circulating levels of sub-clonal private mutations correlate with different treatment responses between metastatic sites. This comparison of biopsy and plasma samples in a single patient with metastatic breast cancer shows that circulating tumour DNA can allow real-time sampling of multifocal clonal evolution.
IM-TORNADO: a tool for comparison of 16S reads from paired-end libraries.

PubMed

Jeraldo, Patricio; Kalari, Krishna; Chen, Xianfeng; Bhavsar, Jaysheel; Mangalam, Ashutosh; White, Bryan; Nelson, Heidi; Kocher, Jean-Pierre; Chia, Nicholas

2014-01-01

16S rDNA hypervariable tag sequencing has become the de facto method for accessing microbial diversity. Illumina paired-end sequencing, which produces two separate reads for each DNA fragment, has become the platform of choice for this application. However, when the two reads do not overlap, existing computational pipelines analyze data from each read separately and underutilize the information contained in the paired-end reads. We created a workflow known as Illinois Mayo Taxon Organization from RNA Dataset Operations (IM-TORNADO) for processing non-overlapping reads while retaining maximal information content. Using synthetic mock datasets, we show that the use of both reads produced answers with greater correlation to those from full length 16S rDNA when looking at taxonomy, phylogeny, and beta-diversity. IM-TORNADO is freely available at http://sourceforge.net/projects/imtornado and produces BIOM format output for cross compatibility with other pipelines such as QIIME, mothur, and phyloseq.
Environmental DNA (eDNA) metabarcoding assays to detect invasive invertebrate species in the Great Lakes.

PubMed

Klymus, Katy E; Marshall, Nathaniel T; Stepien, Carol A

2017-01-01

Describing and monitoring biodiversity comprise integral parts of ecosystem management. Recent research coupling metabarcoding and environmental DNA (eDNA) demonstrates that these methods can serve as important tools for surveying biodiversity, while significantly decreasing the time, expense and resources spent on traditional survey methods. The literature emphasizes the importance of genetic marker development, as the markers dictate the applicability, sensitivity and resolution ability of an eDNA assay. The present study developed two metabarcoding eDNA assays using the mtDNA 16S RNA gene with Illumina MiSeq platform to detect invertebrate fauna in the Laurentian Great Lakes and surrounding waterways, with a focus on monitoring invasive bivalve and gastropod species. We employed careful primer design and in vitro testing with mock communities to assess ability of the markers to amplify and sequence targeted species DNA, while retaining rank abundance information. In our mock communities, read abundances reflected the initial input abundance, with regressions having significant slopes (p<0.05) and high coefficients of determination (R²) for all comparisons. Tests on field environmental samples revealed similar ability of our markers to measure relative abundance. Due to the limited reference sequence data available for these invertebrate species, care must be taken when analyzing results and identifying sequence reads to species level. These markers extend eDNA metabarcoding research for molluscs and appear relevant to other invertebrate taxa, such as rotifers and bryozoans. Furthermore, the sphaeriid mussel assay is group-specific, exclusively amplifying bivalves in the Sphaeriidae family and providing species-level identification. Our assays provide useful tools for managers and conservation scientists, facilitating early detection of invasive species as well as improving resolution of mollusc diversity.

Environmental DNA (eDNA) metabarcoding assays to detect invasive invertebrate species in the Great Lakes

PubMed Central

Klymus, Katy E.; Marshall, Nathaniel T.

2017-01-01

Describing and monitoring biodiversity comprise integral parts of ecosystem management. Recent research coupling metabarcoding and environmental DNA (eDNA) demonstrates that these methods can serve as important tools for surveying biodiversity, while significantly decreasing the time, expense and resources spent on traditional survey methods. The literature emphasizes the importance of genetic marker development, as the markers dictate the applicability, sensitivity and resolution ability of an eDNA assay. The present study developed two metabarcoding eDNA assays using the mtDNA 16S RNA gene with Illumina MiSeq platform to detect invertebrate fauna in the Laurentian Great Lakes and surrounding waterways, with a focus on monitoring invasive bivalve and gastropod species. We employed careful primer design and in vitro testing with mock communities to assess ability of the markers to amplify and sequence targeted species DNA, while retaining rank abundance information. In our mock communities, read abundances reflected the initial input abundance, with regressions having significant slopes (p<0.05) and high coefficients of determination (R²) for all comparisons. Tests on field environmental samples revealed similar ability of our markers to measure relative abundance. Due to the limited reference sequence data available for these invertebrate species, care must be taken when analyzing results and identifying sequence reads to species level. These markers extend eDNA metabarcoding research for molluscs and appear relevant to other invertebrate taxa, such as rotifers and bryozoans. Furthermore, the sphaeriid mussel assay is group-specific, exclusively amplifying bivalves in the Sphaeriidae family and providing species-level identification. Our assays provide useful tools for managers and conservation scientists, facilitating early detection of invasive species as well as improving resolution of mollusc diversity. PMID:28542313
The Mitochondrial Genome of Chara vulgaris: Insights into the Mitochondrial DNA Architecture of the Last Common Ancestor of Green Algae and Land Plants

PubMed Central

Turmel, Monique; Otis, Christian; Lemieux, Claude

2003-01-01

Mitochondrial DNA (mtDNA) has undergone radical changes during the evolution of green plants, yet little is known about the dynamics of mtDNA evolution in this phylum. Land plant mtDNAs differ from the few green algal mtDNAs that have been analyzed to date by their expanded size, long spacers, and diversity of introns. We have determined the mtDNA sequence of Chara vulgaris (Charophyceae), a green alga belonging to the charophycean order (Charales) that is thought to be the most closely related alga to land plants. This 67,737-bp mtDNA sequence, displaying 68 conserved genes and 27 introns, was compared with those of three angiosperms, the bryophyte Marchantia polymorpha, the charophycean alga Chaetosphaeridium globosum (Coleochaetales), and the green alga Mesostigma viride. Despite important differences in size and intron composition, Chara mtDNA strikingly resembles Marchantia mtDNA; for instance, all except 9 of 68 conserved genes lie within blocks of colinear sequences. Overall, our genome comparisons and phylogenetic analyses provide unequivocal support for a sister-group relationship between the Charales and the land plants. Only four introns in land plant mtDNAs appear to have been inherited vertically from a charalean algal ancestor. We infer that the common ancestor of green algae and land plants harbored a tightly packed, gene-rich, and relatively intron-poor mitochondrial genome. The group II introns in this ancestral genome appear to have spread to new mtDNA sites during the evolution of bryophytes and charalean green algae, accounting for part of the intron diversity found in Chara and land plant mitochondria. PMID:12897260
Methylene blue binding to DNA with alternating AT base sequence: minor groove binding is favored over intercalation.

PubMed

Rohs, Remo; Sklenar, Heinz

2004-04-01

The results presented in this paper on methylene blue (MB) binding to DNA with AT alternating base sequence complement the data obtained in two former modeling studies of MB binding to GC alternating DNA. In the light of the large amount of experimental data for both systems, this theoretical study is focused on a detailed energetic analysis and comparison in order to understand their different behavior. Since experimental high-resolution structures of the complexes are not available, the analysis is based on energy minimized structural models of the complexes in different binding modes. For both sequences, four different intercalation structures and two models for MB binding in the minor and major groove have been proposed. Solvent electrostatic effects were included in the energetic analysis by using electrostatic continuum theory, and the dependence of MB binding on salt concentration was investigated by solving the non-linear Poisson-Boltzmann equation. We find that the relative stability of the different complexes is similar for the two sequences, in agreement with the interpretation of spectroscopic data. Subtle differences, however, are seen in energy decompositions and can be attributed to the change from symmetric 5'-YpR-3' intercalation to minor groove binding with increasing salt concentration, which is experimentally observed for the AT sequence at lower salt concentration than for the GC sequence. According to our results, this difference is due to the significantly lower non-electrostatic energy for the minor groove complex with AT alternating DNA, whereas the slightly lower binding energy to this sequence is caused by a higher deformation energy of DNA. The energetic data are in agreement with the conclusions derived from different spectroscopic studies and can also be structurally interpreted on the basis of the modeled complexes. The simple static modeling technique and the neglect of entropy terms and of non-electrostatic solute-solvent interactions, which are assumed to be nearly constant for the compared complexes of MB with DNA, seem to be justified by the results.
A Predictive Approach to Network Reverse-Engineering

NASA Astrophysics Data System (ADS)

Wiggins, Chris

2005-03-01

A central challenge of systems biology is the "reverse engineering" of transcriptional networks: inferring which genes exert regulatory control over which other genes. Attempting such inference at the genomic scale has only recently become feasible, via data-intensive biological innovations such as DNA microarrays ("DNA chips") and the sequencing of whole genomes. In this talk we present a predictive approach to network reverse-engineering, in which we integrate DNA chip data and sequence data to build a model of the transcriptional network of the yeast S. cerevisiae capable of predicting the response of genes in unseen experiments. The technique can also be used to extract "motifs," sequence elements which act as binding sites for regulatory proteins. We validate by a number of approaches and present comparison of theoretical prediction vs. experimental data, along with biological interpretations of the resulting model. En route, we will illustrate some basic notions in statistical learning theory (fitting vs. over-fitting; cross-validation; assessing statistical significance), highlighting ways in which physicists can make a unique contribution in data-driven approaches to reverse engineering.
Molecular characterization of the probiotic strain Bacillus cereus var. toyoi NCIMB 40112 and differentiation from food poisoning strains.

PubMed

Klein, Günter

2011-07-01

Bacillus cereus var. toyoi strain NCIMB 40112 (Toyocerin), a probiotic authorized in the European Union as feed additive for swine, bovines, poultry, and rabbits, was characterized by DNA fingerprinting applying pulsed-field gel electrophoresis and multilocus sequence typing and was compared with reference strains (of clinical and environmental origins). The probiotic strain was clearly characterized by pulsed-field gel electrophoresis using the restriction enzymes Apa I and Sma I resulting in unique DNA patterns. The comparison to the clinical reference strain B. cereus DSM 4312 was done with the same restriction enzymes, and again a clear differentiation of the two strains was possible by the resulting DNA patterns. The use of the restriction enzymes Apa I and Sma I is recommended for further studies. Furthermore, multilocus sequence typing analysis revealed a sequence type (ST 111) that was different from all known STs of B. cereus strains from food poisoning incidents. Thus, a strain characterization and differentiation from food poisoning strains for the probiotic strain was possible. Copyright ©, International Association for Food Protection
B chromosome dynamics in Prochilodus costatus (Teleostei, Characiformes) and comparisons with supernumerary chromosome system in other Prochilodus species

PubMed Central

Melo, Silvana; Utsunomia, Ricardo; Penitente, Manolo; Sobrinho-Scudeler, Patrícia Elda; Porto-Foresti, Fábio; Oliveira, Claudio; Foresti, Fausto; Dergam, Jorge Abdala

2017-01-01

Within the genus Prochilodus Agassiz, 1829, five species are known to carry B chromosomes, i.e. chromosomes beyond the usual diploid number that have been traditionally considered as accessory for the genome. Chromosome microdissection and mapping of repetitive DNA sequences are effective tools to assess the DNA content and allow a better understanding about the origin and composition of these elements in an array of species. In this study, a novel characterization of B chromosomes in Prochilodus costatus Valenciennes, 1850 (2n=54) was reported for the first time and their sequence complementarity with the supernumerary chromosomes observed in Prochilodus lineatus (Valenciennes, 1836) and Prochilodus argenteus Agassiz, 1829 was investigated. The hybridization patterns obtained with chromosome painting using the micro B probe of P. costatus and the satDNA SATH1 mapping made it possible to assume homology of sequences between the B chromosomes of these congeneric species. Our results suggest that the origin of B chromosomes in the genus Prochilodus is a phylogenetically old event. PMID:28919971
Construction and characterization of a normalized cDNA library of Nannochloropsis oculata (Eustigmatophyceae)

NASA Astrophysics Data System (ADS)

Yu, Jianzhong; Ma, Xiaolei; Pan, Kehou; Yang, Guanpin; Yu, Wengong

2010-07-01

We constructed and characterized a normalized cDNA library of Nannochloropsis oculata CS-179, and obtained 905 nonredundant sequences (NRSs) ranging from 431 to 1,756 bp in length. Among them, 496 were very similar to nonredundant ones in the GenBank (E ≤ 1.0e-05), and 349 ESTs had significant hits with the clusters of eukaryotic orthologous groups (KOG). Bases G and/or C at the third position of codons of 14 amino acid residues suggested a strong bias in the conserved domain of 362 NRSs (>60%). We also identified the unigenes encoding phosphorus and nitrogen transporters, suggesting that N. oculata could efficiently transport and metabolize phosphorus and nitrogen, and recognized the unigenes that are involved in biosynthesis and storage of both fatty acids and polyunsaturated fatty acids (PUFAs), which will facilitate the demonstration of eicosapentaenoic acid (EPA) biosynthesis pathway of N. oculata. In comparison with the original cDNA library, the normalized library significantly increased the efficiencies of random sequencing and of discovering rarely expressed genes, and decreased the frequency of abundant gene sequences.
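Illustrative sketch (not part of the cited record): the third-position G/C bias mentioned in this abstract is commonly summarized as the fraction of codons ending in G or C. The function name `gc3_fraction` and the example sequence are made up for illustration; the authors' actual procedure is not described in the abstract.

```python
# Illustrative sketch: fraction of codons whose third base is G or C.

def gc3_fraction(cds):
    """Return the share of complete codons in `cds` that end in G or C."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    third_bases = [cds[i + 2] for i in range(0, usable, 3)]
    if not third_bases:
        return 0.0
    return sum(base in "GC" for base in third_bases) / len(third_bases)

example_cds = "ATGGCCAAGCTGTAA"  # made-up coding sequence, five codons
print(gc3_fraction(example_cds))
```

The sketch assumes the sequence is already in the correct reading frame; an out-of-frame input would give a meaningless value.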
Comparison of sequencing-based methods to profile DNA methylation and identification of monoallelic epigenetic modifications

PubMed Central

Harris, R. Alan; Wang, Ting; Coarfa, Cristian; Nagarajan, Raman P.; Hong, Chibo; Downey, Sara L.; Johnson, Brett E.; Fouse, Shaun D.; Delaney, Allen; Zhao, Yongjun; Olshen, Adam; Ballinger, Tracy; Zhou, Xin; Forsberg, Kevin J.; Gu, Junchen; Echipare, Lorigail; O’Geen, Henriette; Lister, Ryan; Pelizzola, Mattia; Xi, Yuanxin; Epstein, Charles B.; Bernstein, Bradley E.; Hawkins, R. David; Ren, Bing; Chung, Wen-Yu; Gu, Hongcang; Bock, Christoph; Gnirke, Andreas; Zhang, Michael Q.; Haussler, David; Ecker, Joseph; Li, Wei; Farnham, Peggy J.; Waterland, Robert A.; Meissner, Alexander; Marra, Marco A.; Hirst, Martin; Milosavljevic, Aleksandar; Costello, Joseph F.

2010-01-01

Sequencing-based DNA methylation profiling methods are comprehensive and, as accuracy and affordability improve, will increasingly supplant microarrays for genome-scale analyses. Here, four sequencing-based methodologies were applied to biological replicates of human embryonic stem cells to compare their CpG coverage genome-wide and in transposons, resolution, cost, concordance and its relationship with CpG density and genomic context. The two bisulfite methods reached concordance of 82% for CpG methylation levels and 99% for non-CpG cytosine methylation levels. Using binary methylation calls, two enrichment methods were 99% concordant, while regions assessed by all four methods were 97% concordant. To achieve comprehensive methylome coverage while reducing cost, an approach integrating two complementary methods was examined. The integrative methylome profile along with histone methylation, RNA, and SNP profiles derived from the sequence reads allowed genome-wide assessment of allele-specific epigenetic states, identifying most known imprinted regions and new loci with monoallelic epigenetic marks and monoallelic expression. PMID:20852635
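Illustrative sketch (not part of the cited record): one simple way to express concordance between binary methylation calls is the share of jointly assessed regions on which two methods agree. The abstract does not define the measure, so this definition, the function name `binary_concordance`, and the toy region labels are all assumptions.

```python
# Illustrative sketch: agreement between two sets of binary methylation calls.

def binary_concordance(calls_a, calls_b):
    """Share of regions assessed by both methods that received the same call."""
    shared = calls_a.keys() & calls_b.keys()
    if not shared:
        raise ValueError("no regions assessed by both methods")
    agree = sum(calls_a[region] == calls_b[region] for region in shared)
    return agree / len(shared)

# Toy calls (True = methylated); region names are placeholders.
method_a = {"region1": True, "region2": False, "region3": True, "region4": True}
method_b = {"region1": True, "region2": False, "region3": False, "region5": True}
print(binary_concordance(method_a, method_b))
```

Only regions covered by both methods enter the calculation, which mirrors the abstract's restriction to regions assessed by the methods being compared.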
Sequencing, Analysis, and Annotation of Expressed Sequence Tags for Camelus dromedarius

PubMed Central

Al-Swailem, Abdulaziz M.; Shehata, Maher M.; Abu-Duhier, Faisel M.; Al-Yamani, Essam J.; Al-Busadah, Khalid A.; Al-Arawi, Mohammed S.; Al-Khider, Ali Y.; Al-Muhaimeed, Abdullah N.; Al-Qahtani, Fahad H.; Manee, Manee M.; Al-Shomrani, Badr M.; Al-Qhtani, Saad M.; Al-Harthi, Amer S.; Akdemir, Kadir C.; Otu, Hasan H.

2010-01-01

Despite its economic, cultural, and biological importance, there has not been a large scale sequencing project to date for Camelus dromedarius. With the goal of sequencing complete DNA of the organism, we first established and sequenced camel EST libraries, generating 70,272 reads. Following trimming, chimera check, repeat masking, cluster and assembly, we obtained 23,602 putative gene sequences, out of which over 4,500 potentially novel or fast evolving gene sequences do not carry any homology to other available genomes. Functional annotation of sequences with similarities in nucleotide and protein databases has been obtained using Gene Ontology classification. Comparison to available full length cDNA sequences and Open Reading Frame (ORF) analysis of camel sequences that exhibit homology to known genes show more than 80% of the contigs with an ORF>300 bp and ∼40% hits extending to the start codons of full length cDNAs, suggesting successful characterization of camel genes. Similarity analyses are done separately for different organisms including human, mouse, bovine, and rat. The accompanying web portal, CAGBASE (http://camel.kacst.edu.sa/), hosts a relational database containing annotated EST sequences and analysis tools with possibility to add sequences from public domain. We anticipate our results to provide a home base for genomic studies of camel and other comparative studies enabling a starting point for whole genome sequencing of the organism. PMID:20502665
Cloning of a cDNA encoding 1-aminocyclopropane-1-carboxylate synthase and expression of its mRNA in ripening apple fruit.

PubMed

Dong, J G; Kim, W T; Yip, W K; Thompson, G A; Li, L; Bennett, A B; Yang, S F

1991-08-01

1-Aminocyclopropane-1-carboxylate (ACC) synthase (EC 4.4.1.14) purified from apple (Malus sylvestris Mill.) fruit was subjected to trypsin digestion. Following separation by reversed-phase high-pressure liquid chromatography, ten tryptic peptides were sequenced. Based on the sequences of three tryptic peptides, three sets of mixed oligonucleotide probes were synthesized and used to screen a plasmid cDNA library prepared from poly(A)(+) RNA of ripe apple fruit. A 1.5-kb (kilobase) cDNA clone which hybridized to all three probes was isolated. The clone contained an open reading frame of 1214 base pairs (bp) encoding a sequence of 404 amino acids. While the polyadenine tail at the 3'-end was intact, it lacked a portion of sequence at the 5'-end. Using the RNA-based polymerase chain reaction, an additional sequence of 148 bp was obtained at the 5'-end. Thus, 1362 bp were sequenced and they encode 454 amino acids. The deduced amino-acid sequence contained peptide sequences corresponding to all ten tryptic fragments, confirming the identity of the cDNA clone. Comparison of the deduced amino-acid sequence between ACC synthase from apple fruit and those from tomato (Lycopersicon esculentum Mill.) and winter squash (Cucurbita maxima Duch.) fruits demonstrated the presence of seven highly conserved regions, including the previously identified region for the active site. The size of the translation product of ACC-synthase mRNA was similar to that of the mature protein on sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE), indicating that apple ACC-synthase undergoes only minor, if any, post-translational proteolytic processing. Analysis of ACC-synthase mRNA by in-vitro translation-immunoprecipitation, and by Northern blotting indicates that the ACC-synthase mRNA was undetectable in unripe fruit, but was accumulated massively during the ripening process. These data demonstrate that the expression of the ACC-synthase gene is developmentally regulated.
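Illustrative sketch (not part of the cited record): the base-pair and amino-acid counts quoted in this abstract can be checked with codon arithmetic, at three nucleotides per codon. The variable names are hypothetical, and the sketch makes no claim about where the stop codon falls.

```python
# Illustrative sketch: codon arithmetic for the lengths quoted in the abstract.
partial_orf_bp = 1214  # open reading frame in the original clone
extension_bp = 148     # additional 5' sequence obtained by RNA-based PCR

total_bp = partial_orf_bp + extension_bp
total_codons, total_remainder = divmod(total_bp, 3)
partial_codons, partial_remainder = divmod(partial_orf_bp, 3)

print(total_bp, total_codons, total_remainder)
print(partial_codons, partial_remainder)
```

The combined 1362 bp divide evenly into 454 codons, matching the 454 amino acids reported, while the 5'-truncated clone holds 404 complete codons plus two extra bases.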
Continuous expression in tobacco leaves of a Brassica napus PEND homologue blocks differentiation of plastids and development of palisade cells.

PubMed

Wycliffe, Paul; Sitbon, Folke; Wernersson, Jonny; Ezcurra, Inés; Ellerström, Mats; Rask, Lars

2005-10-01

Brassica napus complementary deoxyribonucleic acid (cDNA) clones encoding a DNA-binding protein, BnPEND, were isolated by Southwestern screening. A distinctive feature of the protein was a bZIP-like sequence in the amino-terminal portion, which, after expression in Escherichia coli, bound DNA. BnPEND transcripts were present in B. napus roots and flower buds, and to a lesser extent in stems, flowers and young leaves. Treatment in the dark for 72 h markedly increased the amount of BnPEND transcript in leaves of all ages. Sequence comparison showed that BnPEND was similar to a presumed transcription factor from B. napus, GSBF1, a protein deduced from an Arabidopsis thaliana cDNA (BX825084) and the PEND protein from Pisum sativum, believed to anchor the plastid DNA to the envelope early during plastid development. Homology to expressed sequence tag (EST) sequences from additional species suggested that BnPEND homologues are widespread among the angiosperms. Transient expression of BnPEND fused with green fluorescent protein (GFP) in Nicotiana benthamiana epidermal cells showed that BnPEND is a plastid protein, and that the 15 amino acids at the amino-terminal contain information about plastid targeting. Expression of BnPEND in Nicotiana tabacum from the Cauliflower Mosaic Virus 35S promoter gave stable transformants with different extents of white to light-green areas in the leaves, and even albino plants. In the white areas, but not in adjacent green tissue, the development of palisade cells and chloroplasts was disrupted. Our data demonstrate that the BnPEND protein, when over-expressed at an inappropriate stage, functionally blocks the development of plastids and leads to altered leaf anatomy, possibly by preventing the release of plastid DNA from the envelope.
Supervised DNA Barcodes species classification: analysis, comparisons and results

PubMed Central

2014-01-01

Background Specific fragments, coming from short portions of DNA (e.g., mitochondrial, nuclear, and plastid sequences), have been defined as DNA Barcode and can be used as markers for organisms of the main life kingdoms. Species classification with DNA Barcode sequences has been proven effective on different organisms. Indeed, specific gene regions have been identified as Barcode: COI in animals, rbcL and matK in plants, and ITS in fungi. The classification problem assigns an unknown specimen to a known species by analyzing its Barcode. This task has to be supported with reliable methods and algorithms. Methods In this work the efficacy of supervised machine learning methods to classify species with DNA Barcode sequences is shown. The Weka software suite, which includes a collection of supervised classification methods, is adopted to address the task of DNA Barcode analysis. Classifier families are tested on synthetic and empirical datasets belonging to the animal, fungus, and plant kingdoms. In particular, the function-based method Support Vector Machines (SVM), the rule-based RIPPER, the decision tree C4.5, and the Naïve Bayes method are considered. Additionally, the classification results are compared with respect to ad-hoc and well-established DNA Barcode classification methods. Results Software that converts the DNA Barcode FASTA sequences to the Weka format is released, to adapt different input formats and to allow the execution of the classification procedure. The analysis of results on synthetic and real datasets shows that SVM and Naïve Bayes outperform on average the other considered classifiers, although they do not provide a human interpretable classification model. Rule-based methods have slightly inferior classification performances, but deliver the species specific positions and nucleotide assignments. On synthetic data the supervised machine learning methods obtain superior classification performances with respect to the traditional DNA Barcode classification methods. On empirical data their classification performances are at a comparable level to the other methods. Conclusions The classification analysis shows that supervised machine learning methods are promising candidates for successfully handling the DNA Barcoding species classification problem, obtaining excellent performances. To conclude, a powerful tool to perform species identification is now available to the DNA Barcoding community. PMID:24721333
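Illustrative sketch (not part of the cited record): a FASTA-to-Weka conversion of the kind this abstract mentions could look roughly like the following, which writes one nominal ARFF attribute per alignment position plus a species class attribute. This is not the released converter. The header convention (`>id|species_label`), the function names, and the choice to map gaps and ambiguity codes to `N` are all assumptions made for the example.

```python
# Illustrative sketch: turn aligned barcode sequences in FASTA text into ARFF text.
# Assumption: each header ends with "|<species_label>" and labels contain no spaces.

def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)

def fasta_to_arff(text, relation="barcodes"):
    """Return ARFF text with one nominal attribute per aligned position."""
    records = []
    for header, seq in parse_fasta(text):
        label = header.split("|")[-1]
        # Assumption: anything other than A/C/G/T (gaps, ambiguity codes) becomes N.
        cleaned = "".join(base if base in "ACGT" else "N" for base in seq)
        records.append((label, cleaned))
    if not records:
        raise ValueError("no FASTA records found")
    length = len(records[0][1])
    if any(len(seq) != length for _, seq in records):
        raise ValueError("sequences must be aligned to the same length")
    species = sorted({label for label, _ in records})
    lines = [f"@relation {relation}"]
    lines += [f"@attribute pos{i + 1} {{A,C,G,T,N}}" for i in range(length)]
    lines.append("@attribute species {" + ",".join(species) + "}")
    lines.append("@data")
    lines += [",".join(seq) + "," + label for label, seq in records]
    return "\n".join(lines)

demo_fasta = ">id1|sp_a\nACGT\n>id2|sp_b\nAC-A\n"  # made-up records
print(fasta_to_arff(demo_fasta))
```

Treating each aligned position as a nominal attribute is what lets rule-based and tree-based classifiers report species-specific positions and nucleotides, as the abstract describes for the rule-based methods.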
Comparison of Boiling and Robotics Automation Method in DNA Extraction for Metagenomic Sequencing of Human Oral Microbes.

PubMed

Yamagishi, Junya; Sato, Yukuto; Shinozaki, Natsuko; Ye, Bin; Tsuboi, Akito; Nagasaki, Masao; Yamashita, Riu

2016-01-01

The rapid improvement of next-generation sequencing performance now enables us to analyze huge sample sets with more than ten thousand specimens. However, DNA extraction can still be a limiting step in such metagenomic approaches. In this study, we analyzed human oral microbes to compare the performance of three DNA extraction methods: PowerSoil (a method widely used in this field), QIAsymphony (a robotics method), and a simple boiling method. Dental plaque was initially collected from three volunteers in the pilot study and then expanded to 12 volunteers in the follow-up study. Bacterial flora was estimated by sequencing the V4 region of 16S rRNA following species-level profiling. Our results indicate that the efficiency of PowerSoil and QIAsymphony was comparable to the boiling method. Therefore, the boiling method may be a promising alternative because of its simplicity, cost effectiveness, and short handling time. Moreover, this method was reliable for estimating bacterial species and could be used in the future to examine the correlation between oral flora and health status. Despite this, differences in the efficiency of DNA extraction for various bacterial species were observed among the three methods. Based on these findings, there is no "gold standard" for DNA extraction. In future, we suggest that the DNA extraction method should be selected on a case-by-case basis considering the aims and specimens of the study.
Structure and Dynamics of DNA and RNA Double Helices Obtained from the CCG and GGC Trinucleotide Repeats.

PubMed

Pan, Feng; Man, Viet Hoang; Roland, Christopher; Sagui, Celeste

2018-04-26

Expansions of both GGC and CCG sequences lead to a number of expandable, trinucleotide repeat (TR) neurodegenerative diseases. Understanding of these diseases involves, among other things, the structural characterization of the atypical DNA and RNA secondary structures. We have performed molecular dynamics simulations of (GCC)n and (GGC)n homoduplexes in order to characterize their conformations, stability, and dynamics. Each TR has two reading frames, which results in eight nonequivalent RNA/DNA homoduplexes, characterized by CpG or GpC steps between the Watson-Crick base pairs. Free energy maps for the eight homoduplexes indicate that the C-mismatches prefer anti-anti conformations, while G-mismatches prefer anti-syn conformations. Comparison between three modifications of the DNA AMBER force field shows good agreement for the mismatch free energy maps. The mismatches in DNA-GCC (but not CCG) are extrahelical, forming an extended e-motif. The mismatched duplexes exhibit characteristic sequence-dependent step twist, with strong variations in the G-rich sequences and the e-motif. The distribution of Na+ is highly localized around the mismatches, especially G-mismatches. In the e-motif, there is strong Na+ binding by two G(N7) atoms belonging to the pseudo GpC step created when cytosines are extruded and by extrahelical cytosines. Finally, we used a novel technique based on fast melting by means of an infrared laser pulse to classify the relative stability of the different DNA-CCG and -GGC homoduplexes.
Repatriation and Identification of Finnish World War II Soldiers

PubMed Central

Palo, Jukka U.; Hedman, Minttu; Söderholm, Niklas; Sajantila, Antti

2007-01-01

Aim To present a summary of the organization, field search, repatriation, forensic anthropological examination, and DNA analysis for the purpose of identification of Finnish soldiers with unresolved fate in World War II. Methods Field searches were organized, executed, and financed by the Ministry of Education and the Association for Cherishing the Memory of the Dead of the War. Anthropological examination conducted on human remains retrieved in the field searches was used to establish the minimum number of individuals and description of the skeletal diseases, treatment, anomalies, or injuries. DNA tests were performed by extracting DNA from powdered bones and blood samples from relatives. Mitochondrial DNA (mtDNA) sequence comparisons, together with circumstantial evidence, were used to connect the remains to the putative family members. Results At present, the skeletal remains of about a thousand soldiers have been found and repatriated. In forensic anthropological examination, several injuries related to death were documented. For the total of 181 bone samples, mtDNA HVR-1 and HVR-2 sequences were successfully obtained for 167 (92.3%) and 148 (81.8%) of the samples, respectively. Five samples yielded no reliable sequence data. Our data suggest that mtDNA is preserved for at least 60 years in boreal acidic soil. The quality of the obtained mtDNA sequence data varied depending on the sample bone type, with long compact bones (femur, tibia and humerus) having a significantly better (90.0%) success rate than other bones (51.2%). Conclusion Although more than 60 years have passed since World War II, our experience is that resolving the fate of soldiers missing in action is still of utmost importance for people having lost their relatives in the war. Although cultural and individual differences may exist, our experience presented here gives a good perspective on the importance of individual identification performed by forensic professionals. PMID:17696308
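The HVR-1 and HVR-2 success rates quoted in this record follow directly from the reported sample counts. A minimal Python check, in which the function and variable names are illustrative and not taken from the study:

```python
# Counts as reported in the abstract above.
total_samples = 181
hvr1_success = 167
hvr2_success = 148


def success_rate(successes: int, total: int) -> float:
    """Return the success rate as a percentage rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * successes / total, 1)


hvr1_rate = success_rate(hvr1_success, total_samples)
hvr2_rate = success_rate(hvr2_success, total_samples)
print(hvr1_rate, hvr2_rate)
```

Both values round to the percentages given in the abstract.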
Bisulfite Conversion of DNA: Performance Comparison of Different Kits and Methylation Quantitation of Epigenetic Biomarkers that Have the Potential to Be Used in Non-Invasive Prenatal Testing

PubMed Central

Leontiou, Chrysanthia A.; Hadjidaniel, Michael D.; Mina, Petros; Antoniou, Pavlos; Ioannides, Marios; Patsalis, Philippos C.

2015-01-01

Introduction Epigenetic alterations, including DNA methylation, play an important role in the regulation of gene expression. Several methods exist for evaluating DNA methylation, but bisulfite sequencing remains the gold standard by which base-pair resolution of CpG methylation is achieved. The challenge of the method is that the desired outcome (conversion of unmethylated cytosines) positively correlates with the undesired side effects (DNA degradation and inappropriate conversion), thus several commercial kits try to adjust a balance between the two. The aim of this study was to compare the performance of four bisulfite conversion kits [Premium Bisulfite kit (Diagenode), EpiTect Bisulfite kit (Qiagen), MethylEdge Bisulfite Conversion System (Promega) and BisulFlash DNA Modification kit (Epigentek)] regarding conversion efficiency, DNA degradation and conversion specificity. Methods Performance was tested by combining fully methylated and fully unmethylated λ-DNA controls in a series of spikes by means of Sanger sequencing (0%, 25%, 50% and 100% methylated spikes) and Next-Generation Sequencing (0%, 3%, 5%, 7%, 10%, 25%, 50% and 100% methylated spikes). We also studied the methylation status of two of our previously published differentially methylated regions (DMRs) at base resolution by using spikes of chorionic villus sample in whole blood. Results The kits studied showed different but comparable results regarding DNA degradation, conversion efficiency and conversion specificity. However, the best performance was observed with the MethylEdge Bisulfite Conversion System (Promega) followed by the Premium Bisulfite kit (Diagenode). The DMRs, EP6 and EP10, were confirmed to be hypermethylated in the CVS and hypomethylated in whole blood. Conclusion Our findings indicate that the MethylEdge Bisulfite Conversion System (Promega) was shown to have the best performance among the kits. In addition, the methylation level of two of our DMRs, EP6 and EP10, was confirmed. 
Finally, we showed that bisulfite amplicon sequencing is a suitable approach for methylation analysis of targeted regions. PMID:26247357
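Bisulfite treatment converts unmethylated cytosines so that they are read as T, while methylated cytosines are still read as C. Both the methylation level of a spike and the conversion efficiency on an unmethylated control can therefore be estimated from C and T read counts. The sketch below assumes simple per-site read counts; the function names and the numbers are hypothetical and are not the kits' or the authors' analysis pipeline.

```python
def methylation_percent(c_reads: int, t_reads: int) -> float:
    """Percent methylation at a cytosine: reads still showing C are counted as methylated."""
    total = c_reads + t_reads
    if total == 0:
        raise ValueError("no reads cover this position")
    return 100 * c_reads / total


def conversion_efficiency(c_reads: int, t_reads: int) -> float:
    """Percent conversion on a fully unmethylated control: every C should read as T."""
    total = c_reads + t_reads
    if total == 0:
        raise ValueError("no reads cover this position")
    return 100 * t_reads / total


# Hypothetical counts: a 25% methylated spike and an unmethylated control.
spike_level = methylation_percent(c_reads=250, t_reads=750)
control_efficiency = conversion_efficiency(c_reads=5, t_reads=995)
print(spike_level, control_efficiency)
```

Comparing the measured spike level with the nominal spike percentage is one simple way to judge how faithfully a kit reports methylation.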
Bisulfite Conversion of DNA: Performance Comparison of Different Kits and Methylation Quantitation of Epigenetic Biomarkers that Have the Potential to Be Used in Non-Invasive Prenatal Testing.

PubMed

Leontiou, Chrysanthia A; Hadjidaniel, Michael D; Mina, Petros; Antoniou, Pavlos; Ioannides, Marios; Patsalis, Philippos C

2015-01-01

Epigenetic alterations, including DNA methylation, play an important role in the regulation of gene expression. Several methods exist for evaluating DNA methylation, but bisulfite sequencing remains the gold standard by which base-pair resolution of CpG methylation is achieved. The challenge of the method is that the desired outcome (conversion of unmethylated cytosines) positively correlates with the undesired side effects (DNA degradation and inappropriate conversion), thus several commercial kits try to adjust a balance between the two. The aim of this study was to compare the performance of four bisulfite conversion kits [Premium Bisulfite kit (Diagenode), EpiTect Bisulfite kit (Qiagen), MethylEdge Bisulfite Conversion System (Promega) and BisulFlash DNA Modification kit (Epigentek)] regarding conversion efficiency, DNA degradation and conversion specificity. Performance was tested by combining fully methylated and fully unmethylated λ-DNA controls in a series of spikes by means of Sanger sequencing (0%, 25%, 50% and 100% methylated spikes) and Next-Generation Sequencing (0%, 3%, 5%, 7%, 10%, 25%, 50% and 100% methylated spikes). We also studied the methylation status of two of our previously published differentially methylated regions (DMRs) at base resolution by using spikes of chorionic villus sample in whole blood. The kits studied showed different but comparable results regarding DNA degradation, conversion efficiency and conversion specificity. However, the best performance was observed with the MethylEdge Bisulfite Conversion System (Promega) followed by the Premium Bisulfite kit (Diagenode). The DMRs, EP6 and EP10, were confirmed to be hypermethylated in the CVS and hypomethylated in whole blood. Our findings indicate that the MethylEdge Bisulfite Conversion System (Promega) was shown to have the best performance among the kits. In addition, the methylation level of two of our DMRs, EP6 and EP10, was confirmed. 
Finally, we showed that bisulfite amplicon sequencing is a suitable approach for methylation analysis of targeted regions.
DNA sequence selectivity of hairpin polyamide turn units

PubMed Central

Farkas, Michelle E.; Li, Benjamin C.; Dose, Christian; Dervan, Peter B.

2011-01-01

A class of hairpin polyamides linked by 3,4-diaminobutyric acid, resulting in a β-amine residue at the turn unit, showed improved binding affinities relative to their α-amino-γ-turn analogs for particular sequences. We incorporated β-amino-γ-turns in six-ring polyamides and determined whether there are any sequence preferences under the turn unit by quantitative footprinting titrations. Although there was an energetic penalty for G·C and C·G base pairs, we found little preference for T·A over A·T at the β-amino-γ-turn position. Fluorine and hydroxyl substituted α-amino-γ-turns were synthesized for comparison. Their binding affinities and specificities in the context of six-ring polyamides demonstrated overall diminished affinity and no additional specificity at the turn position. We anticipate that this study will be a baseline for further investigation of the turn subunit as a recognition element for the DNA minor groove. PMID:19349175
Identification of the skeletal remains of Martin Bormann by mtDNA analysis.

PubMed

Anslinger, K; Weichhold, G; Keil, W; Bayer, B; Eisenmenger, W

2001-01-01

Contrary to statements of an eye-witness who reported that Martin Bormann, the second most powerful man in the Third Reich, died on 2 May 1945 in Berlin, rumours persisted over the years that he had escaped from Germany after World War II. In 1972, skeletal remains were found during construction work, and by investigating the teeth and the bones experts concluded that they were from Bormann. Nevertheless, new rumours arose and in order to end this speculation we were commissioned to identify the skeletal remains by mitochondrial DNA analysis. The comparison of the sequence of HV1 and HV2 from the skeletal remains and a living maternal relative of Martin Bormann revealed no differences and this sequence was not found in 1,500 Caucasoid reference sequences. Based on this investigation, we support the hypothesis that the skeletal remains are those of Martin Bormann.
Characterization of Zea mays endosperm C-24 sterol methyltransferase: one of two types of sterol methyltransferase in higher plants.

PubMed

Grebenok, R J; Galbraith, D W; Penna, D D

1997-08-01

We report the characterization of a higher-plant C-24 sterol methyltransferase by yeast complementation. A Zea mays endosperm expressed sequence tag (EST) was identified which, upon complete sequencing, showed 46% identity to the yeast C-24 methyltransferase gene (ERG6) and 75% and 37% amino acid identity to recently isolated higher-plant sterol methyltransferases from soybean and Arabidopsis, respectively. When placed under GALA regulation, the Z. mays cDNA functionally complemented the erg6 mutation, restoring ergosterol production and conferring resistance to cycloheximide. Complementation was both plasmid-dependent and galactose-inducible. The Z. mays cDNA clone contains an open reading frame encoding a 40 kDa protein containing motifs common to a large number of S-adenosyl-L-methionine methyltransferases (SMTs). Sequence comparisons and functional studies of the maize, soybean and Arabidopsis cDNAs indicate that two types of C-24 SMTs exist in higher plants.

Comparison of the Heme Iron Utilization Systems of Pathogenic Vibrios

PubMed Central

O’Malley, S. M.; Mouton, S. L.; Occhino, D. A.; Deanda, M. T.; Rashidi, J. R.; Fuson, K. L.; Rashidi, C. E.; Mora, M. Y.; Payne, S. M.; Henderson, D. P.

1999-01-01

Vibrio alginolyticus, Vibrio fluvialis, and Vibrio parahaemolyticus utilized heme and hemoglobin as iron sources and contained chromosomal DNA similar to several Vibrio cholerae heme iron utilization genes. A V. parahaemolyticus gene that performed the function of V. cholerae hutA was isolated. A portion of the tonB1 locus of V. parahaemolyticus was sequenced and found to encode proteins similar in amino acid sequence to V. cholerae HutW, TonB1, and ExbB1. A recombinant plasmid containing the V. cholerae tonB1 and exbB1D1 genes complemented a V. alginolyticus heme utilization mutant. These data suggest that the heme iron utilization systems of the pathogenic vibrios tested, particularly V. parahaemolyticus and V. alginolyticus, are similar at the DNA level, the functional level, and, in the case of V. parahaemolyticus, the amino acid sequence or protein level to that of V. cholerae. PMID:10348876
Isolation of a complementary DNA clone for thyroid microsomal antigen. Homology with the gene for thyroid peroxidase.

PubMed Central

Seto, P; Hirayu, H; Magnusson, R P; Gestautas, J; Portmann, L; DeGroot, L J; Rapoport, B

1987-01-01

The thyroid microsomal antigen (MSA) in autoimmune thyroid disease is a protein of approximately 107 kD. We screened a human thyroid cDNA library constructed in the expression vector lambda gt11 with anti-107-kD monoclonal antibodies. Of five clones obtained, the recombinant beta-galactosidase fusion protein from one clone (PM-5) was confirmed to react with the monoclonal antiserum. The complementary DNA (cDNA) insert from PM-5 (0.8 kb) was used as a probe on Northern blot analysis to estimate the size of the mRNA coding for the MSA. The 2.9-kb messenger RNA (mRNA) species observed was the same size as that coding for human thyroid peroxidase (TPO). The probe did not bind to human liver mRNA, indicating the thyroid-specific nature of the PM-5-related mRNA. The nucleotide sequence of PM-5 (842 bp) was determined and consisted of a single open reading frame. Comparison of the nucleotide sequence of PM-5 with that presently available for pig TPO indicates 84% homology. In conclusion, a cDNA clone representing part of the microsomal antigen has been isolated. Sequence homology with porcine TPO, as well as identity in the size of the mRNA species for both the microsomal antigen and TPO, indicate that the microsomal antigen is, at least in part, TPO. PMID:3654979
Characterization of a molt-inhibiting hormone (MIH) of the crayfish, Orconectes limosus, by cDNA cloning and mass spectrometric analysis.

PubMed

Bulau, Patrick; Okuno, Atsuro; Thome, Elke; Schmitz, Tina; Peter-Katalinic, Jasna; Keller, Rainer

2005-11-01

The structure of the precursor of a molt-inhibiting hormone (MIH) of the American crayfish, Orconectes limosus was determined by cloning of a cDNA based on RNA from the neurosecretory perikarya of the X-organ in the eyestalk ganglia. The open reading frame includes the complete precursor sequence, consisting of a signal peptide of 29, and the MIH sequence of 77 amino acids. In addition, the mature peptide was isolated by HPLC from the neurohemal sinus gland and analyzed by ESI-MS and MALDI-TOF-MS peptide mapping. This showed that the mature peptide (Mass 8664.29 Da) consists of only 75 amino acids, having Ala75-NH2 as C-terminus. Thus, C-terminal Arg77 of the precursor is removed during processing, and Gly76 serves as an amide donor. Sequence comparison confirms this peptide as a novel member of the large family, which includes crustacean hyperglycaemic hormone (CHH), MIH and gonad (vitellogenesis)-inhibiting hormone (GIH/VIH). The lack of a CPRP (CHH-precursor related peptide) in the hormone precursor, the size and specific sequence characteristics show that Orl MIH belongs to the MIH/GIH(VIH) subgroup of this larger family. Comparison with the MIH of Procambarus clarkii, the only other MIH that has thus far been identified in freshwater crayfish, shows extremely high sequence conservation. Both MIHs differ in only one amino acid residue (approximately 99% identity), whereas the sequence identity to several other known MIHs is between 40 and 46%.
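The "approximately 99% identity" figure is what a single substitution gives over a peptide of this length. The sketch below assumes the comparison is made over the 75-residue mature peptide, which the abstract does not state explicitly. It uses placeholder residues, not the real MIH sequences, and the function name is illustrative.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of identical positions between two equal-length, pre-aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100 * matches / len(seq_a)


# Hypothetical 75-residue peptides that differ at exactly one position.
peptide_a = "A" * 75
peptide_b = "A" * 40 + "G" + "A" * 34
identity = percent_identity(peptide_a, peptide_b)
print(f"{identity:.1f}% identity")
```

One mismatch in 75 positions gives 74/75, which rounds to the roughly 99% quoted above.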
Lactobacillus cypricasei Lawson et al. 2001 is a later heterotypic synonym of Lactobacillus acidipiscis Tanasupawat et al. 2000.

PubMed

Naser, Sabri M; Vancanneyt, Marc; Hoste, Bart; Snauwaert, Cindy; Swings, Jean

2006-07-01

The applicability of a multilocus sequence analysis (MLSA)-based identification system for lactobacilli was evaluated. Two housekeeping genes that code for the phenylalanyl-tRNA synthase alpha-subunit (pheS) and RNA polymerase alpha-subunit (rpoA) were sequenced and analysed for members of the Lactobacillus salivarius species group. The type strains of Lactobacillus acidipiscis and Lactobacillus cypricasei were investigated further using a third gene that encodes the alpha-subunit of ATP synthase (atpA). The MLSA data revealed close relatedness between L. acidipiscis and L. cypricasei, with 99.8-100 % pheS, rpoA and atpA gene sequence similarities. Comparison of the 16S rRNA gene sequences of the type strains of the two species confirmed the close relatedness (99.8 % gene sequence similarity) between the two taxa. Similar phenotypes and high DNA-DNA binding values in the range of 84 to 97.5 % confirmed that L. acidipiscis and L. cypricasei are synonymous species. On the basis of the present study, it is proposed that Lactobacillus cypricasei is a later heterotypic synonym of Lactobacillus acidipiscis.
Colorimetric DNA detection of transgenic plants using gold nanoparticles functionalized with L-shaped DNA probes

NASA Astrophysics Data System (ADS)

Nourisaeid, Elham; Mousavi, Amir; Arpanaei, Ayyoob

2016-01-01

In this study, a DNA colorimetric detection system based on gold nanoparticles functionalized with L-shaped DNA probes was prepared and evaluated. We investigated the hybridization efficiency of the L-shaped probes and studied the effect of nanoparticle size and the L-shaped DNA probe length on the performance of the as-prepared system. Probes were attached to the surface of gold nanoparticles using an adenine sequence. An optimal sequence of 35S rRNA gene promoter from the cauliflower mosaic virus, which is frequently used in the development of transgenic plants, and the two complementary ends of this gene were employed as model target strands and probe molecules, respectively. The spectrophotometric properties of the as-prepared systems indicated that the large NPs show better changes in the absorption spectrum and consequently present a better performance. The results of this study revealed that the probe/Au-NPs prepared using a vertical spacer containing 5 thymine oligonucleotides exhibited a stronger spectrophotometric response in comparison to that of larger probes. These results in general indicate the suitable performance of the L-shaped DNA probe-functionalized Au-NPs, and in particular emphasize the important role of the gold nanoparticle size and length of the DNA probes in enhancing the performance of such a system.
The evolutionary history of Saccharomyces species inferred from completed mitochondrial genomes and revision in the ‘yeast mitochondrial genetic code’

PubMed Central

Szabóová, Dana; Bielik, Peter; Poláková, Silvia; Šoltys, Katarína; Jatzová, Katarína; Szemes, Tomáš

2017-01-01

The yeast Saccharomyces are widely used to test ecological and evolutionary hypotheses. A large number of nuclear genomic DNA sequences are available, but mitochondrial genomic data are insufficient. We completed mitochondrial DNA (mtDNA) sequencing from Illumina MiSeq reads for all Saccharomyces species. All are circularly mapped molecules decreasing in size with phylogenetic distance from Saccharomyces cerevisiae but with similar gene content including regulatory and selfish elements like origins of replication, introns, free-standing open reading frames or GC clusters. Their most profound feature is species-specific alteration in gene order. The genetic code slightly differs from well-established yeast mitochondrial code as GUG is used rarely as the translation start and CGA and CGC code for arginine. The multilocus phylogeny, inferred from mtDNA, does not correlate with the trees derived from nuclear genes. mtDNA data demonstrate that Saccharomyces cariocanus should be assigned as a separate species and Saccharomyces bayanus CBS 380T should not be considered as a distinct species due to mtDNA nearly identical to Saccharomyces uvarum mtDNA. Apparently, comparison of mtDNAs should not be neglected in genomic studies as it is an important tool to understand the origin and evolutionary history of some yeast species. PMID:28992063
Do neighboring lakes share common taxa of bacterioplankton? Comparison of 16S rDNA fingerprints and sequences from three geographic regions.

PubMed

Lindström, E S; Leskinen, E

2002-07-01

Bacterioplankton community composition was studied in 12 lakes in three different geographic regions in Scandinavia using denaturing gradient gel electrophoresis (DGGE) and sequencing of 16S rDNA. Area-specific abundant taxa were found in the lakes in two of the regions. In the region of Uppland the lakes had an alpha-proteobacterium, belonging to the subgroup Alpha V in common. The Alpha V bacteria appeared to be favored by neutral or higher pH values. The lakes in Lappland were found to harbor Actinobacteria, which appeared to be favored in bog lakes. No abundant taxon was found to be in common for the lakes in Svalbard, the third region studied.
Direct comparisons of Illumina vs. Roche 454 sequencing technologies on the same microbial community DNA sample.

PubMed

Luo, Chengwei; Tsementzi, Despina; Kyrpides, Nikos; Read, Timothy; Konstantinidis, Konstantinos T

2012-01-01

Next-generation sequencing (NGS) is commonly used in metagenomic studies of complex microbial communities but whether or not different NGS platforms recover the same diversity from a sample and their assembled sequences are of comparable quality remains unclear. We compared the two most frequently used platforms, the Roche 454 FLX Titanium and the Illumina Genome Analyzer (GA) II, on the same DNA sample obtained from a complex freshwater planktonic community. Despite the substantial differences in read length and sequencing protocols, the platforms provided a comparable view of the community sampled. For instance, derived assemblies overlapped in ~90% of their total sequences and in situ abundances of genes and genotypes (estimated based on sequence coverage) correlated highly between the two platforms (R²>0.9). Evaluation of base-call error, frameshift frequency, and contig length suggested that Illumina offered equivalent, if not better, assemblies than Roche 454. The results from metagenomic samples were further validated against DNA samples of eighteen isolate genomes, which showed a range of genome sizes and G+C% content. We also provide quantitative estimates of the errors in gene and contig sequences assembled from datasets characterized by different levels of complexity and G+C% content. For instance, we noted that homopolymer-associated, single-base errors affected ~1% of the protein sequences recovered in Illumina contigs of 10× coverage and 50% G+C; this frequency increased to ~3% when non-homopolymer errors were also considered. Collectively, our results should serve as a useful practical guide for choosing proper sampling strategies and data processing protocols for future metagenomic studies.
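The abstract reports that coverage-based abundance estimates from the two platforms correlated highly. One common way to quantify such agreement is the squared Pearson correlation between paired per-gene coverage values. The sketch below assumes that definition, since the abstract does not say how the statistic was computed. The coverage numbers are invented for illustration only.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov ** 2 / (var_x * var_y)


# Made-up per-gene coverage on two platforms (same genes, same order).
coverage_platform_a = [12.0, 30.5, 8.2, 55.1, 20.3]
coverage_platform_b = [11.1, 33.0, 9.0, 51.7, 22.4]
r2 = r_squared(coverage_platform_a, coverage_platform_b)
print(f"R^2 = {r2:.3f}")
```

A value close to 1 indicates that the two platforms rank and scale gene abundances similarly, even if their absolute coverage differs.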
Plasma DNA tissue mapping by genome-wide methylation sequencing for noninvasive prenatal, cancer, and transplantation assessments

PubMed Central

Sun, Kun; Jiang, Peiyong; Chan, K. C. Allen; Wong, John; Cheng, Yvonne K. Y.; Liang, Raymond H. S.; Chan, Wai-kong; Ma, Edmond S. K.; Chan, Stephen L.; Cheng, Suk Hang; Chan, Rebecca W. Y.; Tong, Yu K.; Ng, Simon S. M.; Wong, Raymond S. M.; Hui, David S. C.; Leung, Tse Ngong; Leung, Tak Y.; Lai, Paul B. S.; Chiu, Rossa W. K.; Lo, Yuk Ming Dennis

2015-01-01

Plasma consists of DNA released from multiple tissues within the body. Using genome-wide bisulfite sequencing of plasma DNA and deconvolution of the sequencing data with reference to methylation profiles of different tissues, we developed a general approach for studying the major tissue contributors to the circulating DNA pool. We tested this method in pregnant women, patients with hepatocellular carcinoma, and subjects following bone marrow and liver transplantation. In most subjects, white blood cells were the predominant contributors to the circulating DNA pool. The placental contributions in the plasma of pregnant women correlated with the proportional contributions as revealed by fetal-specific genetic markers. The graft-derived contributions to the plasma in the transplant recipients correlated with those determined using donor-specific genetic markers. Patients with hepatocellular carcinoma showed elevated plasma DNA contributions from the liver, which correlated with measurements made using tumor-associated copy number aberrations. In hepatocellular carcinoma patients and in pregnant women exhibiting copy number aberrations in plasma, comparison of methylation deconvolution results using genomic regions with different copy number status pinpointed the tissue type responsible for the aberrations. In a pregnant woman diagnosed as having follicular lymphoma during pregnancy, methylation deconvolution indicated a grossly elevated contribution from B cells into the plasma DNA pool and localized B cells as the origin of the copy number aberrations observed in plasma. This method may serve as a powerful tool for assessing a wide range of physiological and pathological conditions based on the identification of perturbed proportional contributions of different tissues into plasma. PMID:26392541
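Methylation deconvolution of this kind can be viewed as a mixture problem: the methylation level observed in plasma at each marker is modeled as a weighted sum of the tissue-specific levels, and the weights are the tissue contributions. The sketch below is a generic non-negative least-squares solver using multiplicative updates. It is not the authors' algorithm, and the marker values, tissue labels and fractions are hypothetical.

```python
def deconvolve(tissue_profiles, plasma, iterations=2000):
    """Estimate non-negative tissue fractions p with tissue_profiles * p ~ plasma.

    tissue_profiles: one row per marker; each row holds the methylation
    level (0..1) of that marker in every reference tissue.
    plasma: observed methylation level of each marker in plasma.
    Uses multiplicative updates for non-negative least squares, then
    rescales the fractions so that they sum to 1.
    """
    n_markers = len(tissue_profiles)
    n_tissues = len(tissue_profiles[0])
    # A^T b and A^T A stay fixed during the iterations.
    atb = [sum(tissue_profiles[m][t] * plasma[m] for m in range(n_markers))
           for t in range(n_tissues)]
    ata = [[sum(tissue_profiles[m][i] * tissue_profiles[m][j] for m in range(n_markers))
            for j in range(n_tissues)] for i in range(n_tissues)]
    p = [1.0 / n_tissues] * n_tissues  # strictly positive starting point
    for _ in range(iterations):
        atap = [sum(ata[i][j] * p[j] for j in range(n_tissues)) for i in range(n_tissues)]
        p = [p[i] * atb[i] / atap[i] if atap[i] > 0 else 0.0 for i in range(n_tissues)]
    total = sum(p)
    return [value / total for value in p]


# Hypothetical reference profiles; columns: white blood cells, liver, placenta.
profiles = [
    [0.9, 0.1, 0.1],
    [0.8, 0.2, 0.1],
    [0.1, 0.9, 0.2],
    [0.2, 0.8, 0.1],
    [0.1, 0.1, 0.9],
    [0.1, 0.2, 0.8],
]
true_fractions = [0.7, 0.2, 0.1]
# Simulate noise-free plasma measurements from the known mixture.
plasma = [sum(row[t] * true_fractions[t] for t in range(3)) for row in profiles]
estimated = deconvolve(profiles, plasma)
print([round(value, 3) for value in estimated])
```

With noise-free synthetic input the solver recovers the mixture it was given. Real data would need many more markers and some treatment of measurement noise.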
Mitochondrial DNA Sequence Divergence among Meloidogyne incognita, Romanomermis culicivorax, Ascaris suum, and Caenorhabditis elegans

PubMed Central

Powers, T. O.; Harris, T. S.; Hyman, B. C.

1993-01-01

Mitochondrial DNA sequences were obtained from the NADH dehydrogenase subunit 3 (ND3), large rRNA, and cytochrome b genes from Meloidogyne incognita and Romanomermis culicivorax. Both species show considerable genetic distance within these same genes when compared with Caenorhabditis elegans or Ascaris suum, two species previously analyzed. Caenorhabditis, Ascaris, and Meloidogyne were selected as representatives of three subclasses in the nematode class Secernentea: Rhabditia, Spiruria, and Diplogasteria, respectively. Romanomermis served as a representative out-group of the class Adenophorea. The divergence between the phytoparasitic lineage (represented by Meloidogyne) and the three other species is so great that virtually every variable position in these genes appears to have accumulated multiple mutations, obscuring the phylogenetic information obtainable from these comparisons. The 39 and 42% amino acid similarity between the M. incognita and C. elegans ND3 and cytochrome b coding sequences, respectively, are approximately the same as those of C. elegans-mouse comparisons for the same genes (26 and 44%). This discovery calls into question the feasibility of employing cloned C. elegans probes as reagents to isolate phytoparasitic nematode genes. The genetic distance between the phytoparasitic nematode lineage and C. elegans markedly contrasts with the 79% amino acid similarity between C. elegans and A. suum for the same sequences. The molecular data suggest that Caenorhabditis and Ascaris belong to the same subclass. PMID:19279810
Quantifying the Number of Independent Organelle DNA Insertions in Genome Evolution and Human Health.

PubMed

Hazkani-Covo, Einat; Martin, William F

2017-05-01

Fragments of organelle genomes are often found as insertions in nuclear DNA. These fragments of mitochondrial DNA (numts) and plastid DNA (nupts) are ubiquitous components of eukaryotic genomes. They are, however, often edited out during the genome assembly process, leading to systematic underestimation of their frequency. Numts and nupts, once inserted, can become further fragmented through subsequent insertion of mobile elements or other recombinational events that disrupt the continuity of the inserted sequence relative to the genuine organelle DNA copy. Because numts and nupts are typically identified through sequence comparison tools such as BLAST, disruption of insertions into smaller fragments can lead to systematic overestimation of numt and nupt frequencies. Accurate identification of numts and nupts is important, however, both for better understanding of their role during evolution, and for monitoring their increasingly evident role in human disease. Human populations are polymorphic for 141 numt loci, five numts are causal to genetic disease, and cancer genomic studies are revealing an abundance of numts associated with tumor progression. Here, we report investigation of salient parameters involved in obtaining accurate estimates of numt and nupt numbers in genome sequence data. Numts and nupts from 44 sequenced eukaryotic genomes reveal lineage-specific differences in the number, relative age and frequency of insertional events as well as lineage-specific dynamics of their postinsertional fragmentation. Our findings outline the main technical parameters influencing accurate identification and frequency estimation of numts in genomic studies pertinent to both evolution and human health. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
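The overestimation described above arises when one insertion, later split by a mobile element, is reported as several separate hits. A simple way to illustrate the effect is to merge hits that lie close together on the same chromosome before counting. The sketch below assumes hits are given as (start, end) nuclear coordinates. The gap threshold is a hypothetical parameter, not a value from the study, and a fuller treatment would also check that the pieces are collinear in the organelle genome.

```python
def count_insertions(hits, max_gap=1000):
    """Count independent insertions after merging nearby hits on one chromosome.

    hits: iterable of (start, end) nuclear coordinates of organelle-like hits.
    max_gap: hits separated by at most this many bases are treated as pieces
    of a single insertion (hypothetical threshold).
    """
    merged = []
    for start, end in sorted(hits):
        if merged and start - merged[-1][1] <= max_gap:
            # Extend the previous insertion instead of starting a new one.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return len(merged)


# Made-up hits: two fragmented insertions and one intact insertion.
hits = [(100, 400), (450, 900), (5000, 5300), (5350, 5600), (20000, 20500)]
naive_count = len(hits)
merged_count = count_insertions(hits)
print(naive_count, merged_count)
```

Counting raw hits gives five events here, while merging fragments gives three. That is the kind of discrepancy the abstract attributes to postinsertional fragmentation.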
Introducing OTUshuff and DwOdum: A new set of tools for estimating beta diversity for under-sampled communities

USDA-ARS's Scientific Manuscript database

Characterization of complex microbial communities by DNA sequencing has become a standard technique in microbial ecology. Yet, particular features of this approach render traditional methods of community comparison problematic. In particular, a very low proportion of community members are typically ...
Estimating beta diversity for under-sampled communities using the variably weighted Odum dissimilarity index and OTUshuff

USDA-ARS's Scientific Manuscript database

Characterization of complex microbial communities by DNA sequencing has become a standard technique in microbial ecology. Yet, particular features of this approach render traditional methods of community comparison problematic. In particular, a very low proportion of community members are typically ...
Scar-less multi-part DNA assembly design automation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hillson, Nathan J.

The present invention provides a method of designing an implementation of a DNA assembly. In an exemplary embodiment, the method includes (1) receiving a list of DNA sequence fragments to be assembled together and an order in which to assemble the DNA sequence fragments, (2) designing DNA oligonucleotides (oligos) for each of the DNA sequence fragments, and (3) creating a plan for adding flanking homology sequences to each of the DNA oligos. In an exemplary embodiment, the method includes (1) receiving a list of DNA sequence fragments to be assembled together and an order in which to assemble the DNA sequence fragments, (2) designing DNA oligonucleotides (oligos) for each of the DNA sequence fragments, and (3) creating a plan for adding optimized overhang sequences to each of the DNA oligos.
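As a rough illustration of the flanking-homology idea, and not the patented method itself, the sketch below takes an ordered list of fragments and prepends to each fragment the last few bases of its predecessor, so that neighbouring parts share an overlapping end. The function name, the overlap length and the sequences are all hypothetical. Oligo design and overhang optimization are not modeled.

```python
def add_flanking_homology(fragments, overlap=4):
    """Prepend to each fragment the last `overlap` bases of the previous one.

    fragments: DNA sequences in the order they are to be assembled (linear).
    overlap: number of bases shared between neighbours (hypothetical value).
    Returns the extended fragment sequences in the same order.
    """
    if not fragments:
        return []
    extended = [fragments[0]]  # the first fragment has no upstream neighbour
    for previous, current in zip(fragments, fragments[1:]):
        extended.append(previous[-overlap:] + current)
    return extended


# Made-up fragments in assembly order.
parts = ["ATGGCCAAGT", "TTGACCGGTA", "CCATAGGCTA"]
extended_parts = add_flanking_homology(parts, overlap=4)
for sequence in extended_parts:
    print(sequence)
```

Each extended fragment now begins with the tail of the fragment before it, which is the shared homology that an overlap-based assembly relies on.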
An integrated pipeline for next generation sequencing and annotation of the complete mitochondrial genome of the giant intestinal fluke, Fasciolopsis buski (Lankester, 1857) Looss, 1899

PubMed Central

Biswal, Devendra Kumar; Ghatani, Sudeep; Shylla, Jollin A.; Sahu, Ranjana; Mullapudi, Nandita

2013-01-01

Helminths include both parasitic nematodes (roundworms) and platyhelminths (trematode and cestode flatworms) that are abundant, and are of clinical importance. The genetic characterization of parasitic flatworms using advanced molecular tools is central to the diagnosis and control of infections. Although the nuclear genome houses suitable genetic markers (e.g., in ribosomal (r) DNA) for species identification and molecular characterization, the mitochondrial (mt) genome consistently provides a rich source of novel markers for informative systematics and epidemiological studies. In the last decade, there have been some important advances in mtDNA genomics of helminths, especially lung flukes, liver flukes and intestinal flukes. Fasciolopsis buski, often called the giant intestinal fluke, is one of the largest digenean trematodes infecting humans and found primarily in Asia, in particular the Indian subcontinent. Next-generation sequencing (NGS) technologies now provide opportunities for high throughput sequencing, assembly and annotation within a short span of time. Herein, we describe a high-throughput sequencing and bioinformatics pipeline for mt genomics for F. buski that emphasizes the utility of short read NGS platforms such as Ion Torrent and Illumina in successfully sequencing and assembling the mt genome using innovative approaches for PCR primer design as well as assembly. We took advantage of our NGS whole genome sequence data (unpublished so far) for F. buski and its comparison with available data for the Fasciola hepatica mtDNA as the reference genome for design of precise and specific primers for amplification of mt genome sequences from F. buski. A long-range PCR was carried out to create an NGS library enriched in mt DNA sequences. Two different NGS platforms were employed for complete sequencing, assembly and annotation of the F. buski mt genome. 
The complete mt genome sequence of the intestinal fluke comprises 14,118 bp and is thus the shortest trematode mitochondrial genome sequenced to date. The noncoding control regions are separated into two parts by the tRNA-Gly gene and do not contain either tandem repeats or secondary structures, which are typical for trematode control regions. The gene content and arrangement are identical to those of F. hepatica. The F. buski mtDNA genome closely resembles that of F. hepatica and has a gene order similar to that of other trematodes. The mtDNA of the intestinal fluke is reported herein for the first time by our group, which should help investigate Fasciolidae taxonomy and systematics with the aid of mtDNA NGS data. Moreover, it would serve as a resource for comparative mitochondrial genomics and systematic studies of trematode parasites. PMID:24255820
Comparative Sequence Analysis of Multidrug-Resistant IncA/C Plasmids from Salmonella enterica.

PubMed

Hoffmann, Maria; Pettengill, James B; Gonzalez-Escalona, Narjol; Miller, John; Ayers, Sherry L; Zhao, Shaohua; Allard, Marc W; McDermott, Patrick F; Brown, Eric W; Monday, Steven R

2017-01-01

Determinants of multidrug resistance (MDR) are often encoded on mobile elements, such as plasmids, transposons, and integrons, which have the potential to transfer among foodborne pathogens, as well as to other virulent pathogens, increasing the threats these traits pose to human and veterinary health. Our understanding of MDR among Salmonella has been limited by the lack of closed plasmid genomes for comparisons across resistance phenotypes, due to difficulties in effectively separating the DNA of these high-molecular weight, low-copy-number plasmids from chromosomal DNA. To resolve this problem, we demonstrate an efficient protocol for isolating, sequencing and closing IncA/C plasmids from Salmonella sp. using single molecule real-time sequencing on a Pacific Biosciences (Pacbio) RS II Sequencer. We obtained six Salmonella enterica isolates from poultry, representing six different serovars, each exhibiting the MDR-Ampc resistance profile. Salmonella plasmids were obtained using a modified mini preparation and transformed with Escherichia coli DH10Br. A Qiagen Large-Construct kit™ was used to recover highly concentrated and purified plasmid DNA that was sequenced using PacBio technology. These six closed IncA/C plasmids ranged in size from 104 to 191 kb and shared a stable, conserved backbone containing 98 core genes, with only six differences among those core genes. The plasmids encoded a number of antimicrobial resistance genes, including those for quaternary ammonium compounds and mercury. We then compared our six IncA/C plasmid sequences: first with 14 IncA/C plasmids derived from S. enterica available at the National Center for Biotechnology Information (NCBI), and then with an additional 38 IncA/C plasmids derived from different taxa. These comparisons allowed us to build an evolutionary picture of how antimicrobial resistance may be mediated by this common plasmid backbone. 
Our project provides detailed genetic information about resistance genes in plasmids, advances in plasmid sequencing, and phylogenetic analyses, and important insights about how MDR evolution occurs across diverse serotypes from different animal sources, particularly in agricultural settings where antimicrobial drug use practices vary.
Elongation Factor-Tu (EF-Tu) proteins structural stability and bioinformatics in ancestral gene reconstruction

NASA Astrophysics Data System (ADS)

Dehipawala, Sunil; Nguyen, A.; Tremberger, G.; Cheung, E.; Schneider, P.; Lieberman, D.; Holden, T.; Cheung, T.

2013-09-01

A paleo-experimental evolution report on elongation factor EF-Tu structural stability results has provided an opportunity to rewind the tape of life using the ancestral protein sequence reconstruction modeling approach; consistent with the book of life dogma in current biology and being an important component in the astrobiology community. Fractal dimension via the Higuchi fractal method and Shannon entropy of the DNA sequence classification could be used in a diagram that serves as a simple summary. Results from biomedical gene research provide examples on the diagram methodology. Comparisons between biomedical genes such as EEF2 (elongation factor 2 human, mouse, etc), WDR85 in epigenetics, HAR1 in human specificity, DLG1 in cognitive skill, and HLA-C in mosquito bite immunology with EF Tu DNA sequences have accounted for the reported circular dichroism thermo-stability data systematically; the results also infer a relatively less volatility geologic time period from 2 to 3 Gyr from adaptation viewpoint. Comparison to Thermotoga maritima MSB8 and Psychrobacter shows that Thermus thermophilus HB8 EF-Tu calibration sequence could be an outlier, consistent with free energy calculation by NUPACK. Diagram methodology allows computer simulation studies and HAR1 shows about 0.5% probability from chimp to human in terms of diagram location, and SNP simulation results such as amoebic meningoencephalitis NAF1 suggest correlation. Extensions to the studies of the translation and transcription elongation factor sequences in Megavirus Chiliensis, Megavirus Lba and Pandoravirus show that the studied Pandoravirus sequence could be an outlier with the highest fractal dimension and lowest entropy, as compared to chicken as a deviant in the DNMT3A DNA methylation gene sequences from zebrafish to human and to the less than one percent probability in computer simulation using the HAR1 0.5% probability as reference. 
The diagram methodology would be useful in ancestral gene reconstruction studies in astrobiology and also be applicable to the study of point mutation in conformational thermostabilization research with Synchrotron based X-ray data for drug applications such as Parkinson's disease.
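The abstract above uses the Shannon entropy of a DNA sequence as one axis of its summary diagram but does not say how the entropy was computed. As a minimal sketch, assuming the simplest reading (entropy of the single-nucleotide composition, in bits per symbol), the quantity could be computed as follows. The function name `shannon_entropy` is hypothetical and is not taken from the paper.

```python
import math
from collections import Counter


def shannon_entropy(seq: str) -> float:
    """Shannon entropy, in bits per symbol, of the symbol composition of seq.

    Assumption: entropy is taken over single-nucleotide frequencies; the
    source abstract does not specify the exact definition it used.
    """
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    total = len(seq)
    # Sum of -p * log2(p) over the symbols that actually occur.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

Under this definition a sequence with equal amounts of A, C, G and T reaches the maximum of 2 bits, while a homopolymer has zero entropy.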
Structural analysis of two length variants of the rDNA intergenic spacer from Eruca sativa.

PubMed

Lakshmikumaran, M; Negi, M S

1994-03-01

Restriction enzyme analysis of the rRNA genes of Eruca sativa indicated the presence of many length variants within a single plant and also between different cultivars which is unusual for most crucifers studied so far. Two length variants of the rDNA intergenic spacer (IGS) from a single individual E. sativa (cv. Itsa) plant were cloned and characterized. The complete nucleotide sequences of both the variants (3 kb and 4 kb) were determined. The intergenic spacer contains three families of tandemly repeated DNA sequences denoted as A, B and C. However, the long (4 kb) variant shows the presence of an additional repeat, denoted as D, which is a duplication of a 224 bp sequence just upstream of the putative transcription initiation site. Repeat units belonging to the three different families (A, B and C) were in the size range of 22 to 30 bp. Such short repeat elements are present in the IGS of most of the crucifers analysed so far. Sequence analysis of the variants (3 kb and 4 kb) revealed that the length heterogeneity of the spacer is located at three different regions and is due to the varying copy numbers of repeat units belonging to families A and B. Length variation of the spacer is also due to the presence of a large duplication (D repeats) in the 4 kb variant which is absent in the 3 kb variant. The putative transcription initiation site was identified by comparisons with the rDNA sequences from other plant species.
Quantitative high-throughput profiling of snake venom gland transcriptomes and proteomes (Ovophis okinavensis and Protobothrops flavoviridis)

PubMed Central

2013-01-01

Background Advances in DNA sequencing and proteomics have facilitated quantitative comparisons of snake venom composition. Most studies have employed one approach or the other. Here, both Illumina cDNA sequencing and LC/MS were used to compare the transcriptomes and proteomes of two pit vipers, Protobothrops flavoviridis and Ovophis okinavensis, which differ greatly in their biology. Results Sequencing of venom gland cDNA produced 104,830 transcripts. The Protobothrops transcriptome contained transcripts for 103 venom-related proteins, while the Ovophis transcriptome contained 95. In both, transcript abundances spanned six orders of magnitude. Mass spectrometry identified peptides from 100% of transcripts that occurred at higher than contaminant (e.g. human keratin) levels, including a number of proteins never before sequenced from snakes. These transcriptomes reveal fundamentally different envenomation strategies. Adult Protobothrops venom promotes hemorrhage, hypotension, incoagulable blood, and prey digestion, consistent with mammalian predation. Ovophis venom composition is less readily interpreted, owing to insufficient pharmacological data for venom serine and metalloproteases, which comprise more than 97.3% of Ovophis transcripts, but only 38.0% of Protobothrops transcripts. Ovophis venom apparently represents a hybrid strategy optimized for frogs and small mammals. Conclusions This study illustrates the power of cDNA sequencing combined with MS profiling. The former quantifies transcript composition, allowing detection of novel proteins, but cannot indicate which proteins are actually secreted, as does MS. We show, for the first time, that transcript and peptide abundances are correlated. This means that MS can be used for quantitative, non-invasive venom profiling, which will be beneficial for studies of endangered species. PMID:24224955
Amino acid sequence of bovine muzzle epithelial desmocollin derived from cloned cDNA: a novel subtype of desmosomal cadherins.

PubMed

Koch, P J; Goldschmidt, M D; Walsh, M J; Zimbelmann, R; Schmelz, M; Franke, W W

1991-05-01

Desmosomes are cell-type-specific intercellular junctions found in epithelium, myocardium and certain other tissues. They consist of assemblies of molecules involved in the adhesion of specific cell types and in the anchorage of cell-type-specific cytoskeletal elements, the intermediate-size filaments, to the plasma membrane. To explore the individual desmosomal components and their functions we have isolated DNA clones encoding the desmosomal glycoprotein, desmocollin, using antibodies and a cDNA expression library from bovine muzzle epithelium. The cDNA-deduced amino-acid sequence of desmocollin (presently we cannot decide to which of the two desmocollins, DC I or DC II, this clone relates) defines a polypeptide with a calculated molecular weight of 85,000, with a single candidate sequence of 24 amino acids sufficiently long for a transmembrane arrangement, and an extracellular aminoterminal portion of 561 amino acid residues, compared to a cytoplasmic part of only 176 amino acids. Amino acid sequence comparisons have revealed that desmocollin is highly homologous to members of the cadherin family of cell adhesion molecules, including the previously sequenced desmoglein, another desmosome-specific cadherin. Using riboprobes derived from cDNAs for Northern-blot analyses, we have identified an mRNA of approximately 6 kb in stratified epithelia such as muzzle epithelium and tongue mucosa but not in two epithelial cell culture lines containing desmosomes and desmoplakins. The difference may indicate drastic differences in mRNA concentration or the existence of cell-type-specific desmocollin subforms. The molecular topology of desmocollin(s) is discussed in relation to possible functions of the individual molecular domains.

A compositional segmentation of the human mitochondrial genome is related to heterogeneities in the guanine mutation rate

PubMed Central

Samuels, David C.; Boys, Richard J.; Henderson, Daniel A.; Chinnery, Patrick F.

2003-01-01

We applied a hidden Markov model segmentation method to the human mitochondrial genome to identify patterns in the sequence, to compare these patterns to the gene structure of mtDNA and to see whether these patterns reveal additional characteristics important for our understanding of genome evolution, structure and function. Our analysis identified three segmentation categories based upon the sequence transition probabilities. Category 2 segments corresponded to the tRNA and rRNA genes, with a greater strand-symmetry in these segments. Category 1 and 3 segments covered the protein-coding genes and almost all of the non-coding D-loop. Compared to category 1, the mtDNA segments assigned to category 3 had much lower guanine abundance. A comparison to two independent databases of mitochondrial mutations and polymorphisms showed that the high substitution rate of guanine in human mtDNA is largest in the category 3 segments. Analysis of synonymous mutations showed the same pattern. This suggests that this heterogeneity in the mutation rate is partly independent of respiratory chain function and is a direct property of the genome sequence itself. This has important implications for our understanding of mtDNA evolution and its use as a ‘molecular clock’ to determine the rate of population and species divergence. PMID:14530452
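To illustrate the general idea of hidden Markov model segmentation mentioned in the abstract above, the following is a minimal sketch of Viterbi decoding for a discrete HMM whose hidden states label sequence segments. It is not the authors' method: their model distinguishes three categories based on sequence transition probabilities, whereas this sketch assumes a generic HMM with independent per-base emissions. The function name and every probability passed to it are hypothetical and chosen by the caller.

```python
import math


def viterbi(seq, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state path for seq (log-space Viterbi).

    start_p[s], trans_p[r][s] and emit_p[s][symbol] are probabilities
    supplied by the caller; all of them must be strictly positive.
    """
    if not seq:
        raise ValueError("sequence must not be empty")
    log = math.log
    # Best log-probability of any path ending in each state at position 0.
    score = {s: log(start_p[s]) + log(emit_p[s][seq[0]]) for s in states}
    back = []  # back[i][s] = best predecessor of state s at position i + 1
    for symbol in seq[1:]:
        prev, score, pointers = score, {}, {}
        for s in states:
            best = max(states, key=lambda r: prev[r] + log(trans_p[r][s]))
            score[s] = prev[best] + log(trans_p[best][s]) + log(emit_p[s][symbol])
            pointers[s] = best
        back.append(pointers)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: score[s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path
```

Runs of identical labels in the returned path correspond to segments; for example, a two-state model with one guanine-rich and one guanine-poor state would split a sequence where guanine abundance changes.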
Replication Protein A-1 Has a Preference for the Telomeric G-rich Sequence in Trypanosoma cruzi.

PubMed

Pavani, Raphael Souza; Vitarelli, Marcela O; Fernandes, Carlos A H; Mattioli, Fabio F; Morone, Mariana; Menezes, Milene C; Fontes, Marcos R M; Cano, Maria Isabel N; Elias, Maria Carolina

2018-05-01

Replication protein A (RPA), the major eukaryotic single-stranded binding protein, is a heterotrimeric complex formed by RPA-1, RPA-2, and RPA-3. RPA is a fundamental player in replication, repair, recombination, and checkpoint signaling. In addition, increasing evidence points to functions for RPA in telomere maintenance, such as interaction with telomerase to facilitate its activity and involvement in telomere capping under some conditions. Trypanosoma cruzi, the etiological agent of Chagas disease, is a protozoan parasite that appeared early in the evolution of eukaryotes. Recently, we showed that T. cruzi RPA has canonical functions, being involved in DNA replication and the DNA damage response. Here, we found by FISH/IF assays that T. cruzi RPA localizes at telomeres even outside replication (S) phase. In vitro analysis showed that one telomeric repeat is sufficient to bind RPA-1. Telomeric DNA induces different secondary structural modifications in RPA-1 in comparison with other types of DNA. In addition, RPA-1 has a higher affinity for telomeric sequence than for random sequence, suggesting that RPA may play specific roles in the T. cruzi telomeric region. © 2017 The Author(s) Journal of Eukaryotic Microbiology © 2017 International Society of Protistologists.
Molecular dynamics study of some non-hydrogen-bonding base pair DNA strands

NASA Astrophysics Data System (ADS)

Tiwari, Rakesh K.; Ojha, Rajendra P.; Tiwari, Gargi; Pandey, Vishnudatt; Mall, Vijaysree

2018-05-01

In order to elucidate the structural activity of hydrophobic modified DNA, the DMMO2-D5SICS base pair is introduced as a constituent in different sets of 12-mer and 14-mer DNA sequences for molecular dynamics (MD) simulation in explicit water solvent. The AMBER 14 force field was employed for each set of duplexes during the 200 ns production-dynamics simulation in orthogonal-box-water solvent by the Particle-Mesh-Ewald (PME) method in infinite periodic boundary conditions (PBC) to determine conformational parameters of the complex. The force-field parameters of the modified base pair were calculated with the Gaussian code using Hartree-Fock ab initio methodology. RMSD results reveal that the conformation of the duplex is sequence dependent and the binding energy of the complex depends on the position of the modified base pair in the nucleic acid strand. We found that non-bonding energy made a significant contribution to stabilising this type of duplex in comparison to electrostatic energy. The distortion produced within strands by this type of base pair was local and destabilised the duplex integrity near the substitution. Moreover, the binding energy of the duplex depends on the position of substitution of the hydrophobic base pair and on the DNA sequence, which strongly supports the corresponding experimental study.
Using next-generation sequencing for high resolution multiplex analysis of copy number variation from nanogram quantities of DNA from formalin-fixed paraffin-embedded specimens.

PubMed

Wood, Henry M; Belvedere, Ornella; Conway, Caroline; Daly, Catherine; Chalkley, Rebecca; Bickerdike, Melissa; McKinley, Claire; Egan, Phil; Ross, Lisa; Hayward, Bruce; Morgan, Joanne; Davidson, Leslie; MacLennan, Ken; Ong, Thian K; Papagiannopoulos, Kostas; Cook, Ian; Adams, David J; Taylor, Graham R; Rabbitts, Pamela

2010-08-01

The use of next-generation sequencing technologies to produce genomic copy number data has recently been described. Most approaches, however, rely on optimal starting DNA, and are therefore unsuitable for the analysis of formalin-fixed paraffin-embedded (FFPE) samples, which largely precludes the analysis of many tumour series. We have sought to challenge the limits of this technique with regard to quality and quantity of starting material and the depth of sequencing required. We confirm that the technique can be used to interrogate DNA from cell lines, fresh frozen material and FFPE samples to assess copy number variation. We show that as little as 5 ng of DNA is needed to generate a copy number karyogram, and follow this up with data from a series of FFPE biopsies and surgical samples. We have used various levels of sample multiplexing to demonstrate the adjustable resolution of the methodology, depending on the number of samples and available resources. We also demonstrate reproducibility by use of replicate samples and comparison with microarray-based comparative genomic hybridization (aCGH) and digital PCR. This technique can be valuable in both the analysis of routine diagnostic samples and in examining large repositories of fixed archival material.
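The abstract above does not describe how read counts are turned into a copy number karyogram. One common, generic approach to read-depth copy number analysis is to count reads in genomic bins for a test and a reference sample, normalise each by its library size, and take the log2 ratio per bin. The sketch below shows only that generic idea, under the assumption that binned counts are already available; the function name, the pseudocount and the example counts are hypothetical and are not taken from the paper.

```python
import math


def log2_ratios(test_counts, ref_counts, pseudocount=0.5):
    """Per-bin log2 ratio of library-size-normalised read counts.

    test_counts and ref_counts are equal-length lists of reads per bin.
    The pseudocount (an illustrative choice) avoids log(0) in empty bins.
    """
    if len(test_counts) != len(ref_counts):
        raise ValueError("both samples must use the same bins")
    test_total = sum(test_counts)
    ref_total = sum(ref_counts)
    if test_total == 0 or ref_total == 0:
        raise ValueError("each sample needs at least one read")
    ratios = []
    for t, r in zip(test_counts, ref_counts):
        test_frac = (t + pseudocount) / test_total
        ref_frac = (r + pseudocount) / ref_total
        ratios.append(math.log2(test_frac / ref_frac))
    return ratios


# Hypothetical bins: the middle bin has twice the relative depth in the test sample.
print(log2_ratios([100, 200, 100], [100, 100, 100]))
```

Bins with values near zero are consistent with no copy number change relative to the reference, positive values with gain and negative values with loss. Because the ratio is normalised by total reads, a gain in one bin pushes the other bins slightly below zero.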
Genomics approach to the environmental community of microorganisms

NASA Astrophysics Data System (ADS)

Kawarabayasi, Y.; Maruyama, A.

2004-12-01

Microscopic observation and comparison of 16S rDNA sequences have indicated that many extremophiles survive in hydrothermal environments. However, it is generally said that over 99% of all microbes are currently uncultivable. We therefore planned to identify uncultivable microbes through direct sequencing of environmental DNA. First, shotgun plasmid libraries were constructed directly from DNA prepared from mixed microbes collected from low-temperature hydrothermal water at RM24 on the Southern East Pacific Rise (S-EPR). The sequences of a number of clones showed features similar to eukaryotic introns or to the tandem repetitive sequences identified in some human familial diseases. These results indicated that many microorganisms with eukaryotic features were dominant in the low-temperature water of the S-EPR. Second, shotgun plasmid libraries were constructed from environmental DNA prepared from Beppu hot springs. ORFs were easily identified in all clones whose entire sequence was determined. Thus, hot springs appear to be a good resource for searching for novel genes. Finally, mixed microbes isolated from Suiyo seamount were used to construct a shotgun library. The clones in this library contained ORFs. From some clones in the hot spring and Suiyo samples, aminoacyl-tRNA synthetase, which is generally present in all organisms, was identified by sequence similarity. Phylogenetic analysis of the identified aminoacyl-tRNA synthetases indicated that novel, unidentified microorganisms should be present in the hot spring or at Suiyo seamount. The novel genes identified from Suiyo seamount were also expressed in E. coli. Some gene products were successfully obtained from the E. coli cells as soluble proteins. Some proteins were thermostable up to 70 °C, suggesting that the original host cell of the gene should be stable up to the same temperature. Our work indicates that environmental genomics, including direct cloning and sequencing of environmental DNA and expression of the identified genes, is a powerful approach for discovering novel uncultivable microbes or novel active genes.
SNBRFinder: A Sequence-Based Hybrid Algorithm for Enhanced Prediction of Nucleic Acid-Binding Residues.

PubMed

Yang, Xiaoxia; Wang, Jia; Sun, Jun; Liu, Rong

2015-01-01

Protein-nucleic acid interactions are central to various fundamental biological processes. Automated methods capable of reliably identifying DNA- and RNA-binding residues in protein sequence are assuming ever-increasing importance. The majority of current algorithms rely on feature-based prediction, but their accuracy remains to be further improved. Here we propose a sequence-based hybrid algorithm SNBRFinder (Sequence-based Nucleic acid-Binding Residue Finder) by merging a feature predictor SNBRFinderF and a template predictor SNBRFinderT. SNBRFinderF was established using the support vector machine whose inputs include sequence profile and other complementary sequence descriptors, while SNBRFinderT was implemented with the sequence alignment algorithm based on profile hidden Markov models to capture the weakly homologous template of query sequence. Experimental results show that SNBRFinderF was clearly superior to the commonly used sequence profile-based predictor and SNBRFinderT can achieve comparable performance to the structure-based template methods. Leveraging the complementary relationship between these two predictors, SNBRFinder reasonably improved the performance of both DNA- and RNA-binding residue predictions. More importantly, the sequence-based hybrid prediction reached competitive performance relative to our previous structure-based counterpart. Our extensive and stringent comparisons show that SNBRFinder has obvious advantages over the existing sequence-based prediction algorithms. The value of our algorithm is highlighted by establishing an easy-to-use web server that is freely accessible at http://ibi.hzau.edu.cn/SNBRFinder.
The complete sequence of the mitochondrial genome of the African Penguin (Spheniscus demersus).

PubMed

Labuschagne, Christiaan; Kotzé, Antoinette; Grobler, J Paul; Dalton, Desiré L

2014-01-15

The complete mitochondrial genome of the African Penguin (Spheniscus demersus) was sequenced. The molecule was sequenced via next generation sequencing and primer walking. The genome is 17,346 bp in length. It was compared with the mitochondrial DNA of the two other penguin genomes reported so far, namely the Little blue penguin (Eudyptula minor) and the Rockhopper penguin (Eudyptes chrysocome). This analysis made it possible to identify common penguin mitochondrial DNA characteristics. The S. demersus mtDNA genome is very similar, both in composition and length, to both the E. chrysocome and E. minor genomes. The gene content of the African penguin mitochondrial genome is typical of vertebrates and all three penguin species have the standard gene order originally identified in the chicken. The control region for S. demersus is located between tRNA-Glu and tRNA-Phe and all three species of penguins contain two sets of similar repeats with varying copy numbers towards the 3' end of the control region, accounting for the size variance. This is the first report of the complete nucleotide sequence for the mitochondrial genome of the African penguin, S. demersus. These results can subsequently be used to provide information for penguin phylogenetic studies and insights into the evolution of genomes. © 2013 Elsevier B.V. All rights reserved.
Recombinational hotspot specific to female meiosis in the mouse major histocompatibility complex.

PubMed

Shiroishi, T; Hanzawa, N; Sagai, T; Ishiura, M; Gojobori, T; Steinmetz, M; Moriwaki, K

1990-01-01

The wm7 haplotype of the major histocompatibility complex (MHC), derived from the Japanese wild mouse Mus musculus molossinus, enhances recombination specific to female meiosis in the K/A beta interval of the MHC. We have mapped crossover points of fifteen independent recombinants from genetic crosses of the wm7 and laboratory haplotypes. Most of them were confined to a short segment of approximately 1 kilobase (kb) of DNA between the A beta 3 and A beta 2 genes, indicating the presence of a female-specific recombinational hotspot. Its location overlaps with a sex-independent hotspot previously identified in the Mus musculus castaneus CAS3 haplotype. We have cloned and sequenced DNA fragments surrounding the hotspot from the wm7 haplotype and the corresponding regions from the hotspot-negative B10.A and C57BL/10 strains. There is no significant difference between the sequences of these three strains, or between these and the published sequences of the CAS3 and C57BL/6 strains. However, a comparison of this A beta 3/A beta 2 hotspot with a previously characterized hotspot in the E beta gene revealed that they have a very similar molecular organization. Each hotspot consists of two elements, the consensus sequence of the mouse middle repetitive MT family and the tetrameric repeated sequences, which are separated by 1 kb of DNA.
Sequenza: allele-specific copy number and mutation profiles from tumor sequencing data.

PubMed

Favero, F; Joshi, T; Marquard, A M; Birkbak, N J; Krzystanek, M; Li, Q; Szallasi, Z; Eklund, A C

2015-01-01

Exome or whole-genome deep sequencing of tumor DNA along with paired normal DNA can potentially provide a detailed picture of the somatic mutations that characterize the tumor. However, analysis of such sequence data can be complicated by the presence of normal cells in the tumor specimen, by intratumor heterogeneity, and by the sheer size of the raw data. In particular, determination of copy number variations from exome sequencing data alone has proven difficult; thus, single nucleotide polymorphism (SNP) arrays have often been used for this task. Recently, algorithms to estimate absolute, but not allele-specific, copy number profiles from tumor sequencing data have been described. We developed Sequenza, a software package that uses paired tumor-normal DNA sequencing data to estimate tumor cellularity and ploidy, and to calculate allele-specific copy number profiles and mutation profiles. We applied Sequenza, as well as two previously published algorithms, to exome sequence data from 30 tumors from The Cancer Genome Atlas. We assessed the performance of these algorithms by comparing their results with those generated using matched SNP arrays and processed by the allele-specific copy number analysis of tumors (ASCAT) algorithm. Comparison between Sequenza/exome and SNP/ASCAT revealed strong correlation in cellularity (Pearson's r = 0.90) and ploidy estimates (r = 0.42, or r = 0.94 after manually inspecting alternative solutions). This performance was noticeably superior to that of previously published algorithms. In addition, in artificial data simulating normal-tumor admixtures, Sequenza detected the correct ploidy in samples with tumor content as low as 30%. The agreement between Sequenza and SNP array-based copy number profiles suggests that exome sequencing alone is sufficient not only for identifying small scale mutations but also for estimating cellularity and inferring DNA copy number aberrations. © The Author 2014. Published by Oxford University Press on behalf of the European Society for Medical Oncology.
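The agreement figures in the abstract above are Pearson correlation coefficients between two sets of per-tumor estimates. As a minimal sketch of how such a coefficient is computed, the function below takes two equal-length lists of paired values. The function name and the example numbers are hypothetical; they are not data from the study and this is not code from the Sequenza package.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between paired samples x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical cellularity estimates for five tumors from two methods.
method_a = [0.35, 0.50, 0.62, 0.80, 0.91]
method_b = [0.30, 0.55, 0.60, 0.78, 0.95]
print(round(pearson_r(method_a, method_b), 3))
```

A value near 1 means the two methods rank and scale the samples similarly; it does not by itself show that the estimates agree in absolute terms.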
Effect of the reflectional symmetry on the coherent hole transport across DNA hairpins

NASA Astrophysics Data System (ADS)

Zarea, Mehdi; Berlin, Yuri; Ratner, Mark A.

2017-03-01

The coherent hole transfer in three types of DNA hairpins containing strands with adenine (A) and guanine (G) nucleobases has been studied. The investigated hairpins involve An+1GGAn, AnGAGAn, or (AG)2nA strands that connect the hole donor and hole acceptor located on opposite ends of hairpins. The positive charge transfer from the photo-excited donor to the acceptor is shown to be slower for An+1GGAn in comparison with AnGAGAn and (AG)2nA sequences. We have revealed that this is due to the reflectional symmetry of the last two sequences with respect to the axis passing through the middle base. As has been demonstrated, the symmetry of the sequence structure manifests itself in the reflectional symmetry of the energy eigenstates. In addition, it has been shown that (AG)2nA is the only symmetric sequence with a zero energy state in the middle of the LUMO tight-binding energy band. Based on our theoretical findings, we predict that the hairpin with this sequence should have the fastest coherent hole transfer rate among the class of base sequences studied.
Extensive structural variations between mitochondrial genomes of CMS and normal peppers (Capsicum annuum L.) revealed by complete nucleotide sequencing.

PubMed

Jo, Yeong Deuk; Choi, Yoomi; Kim, Dong-Hwan; Kim, Byung-Dong; Kang, Byoung-Cheorl

2014-07-04

Cytoplasmic male sterility (CMS) is an inability to produce functional pollen that is caused by mutation of the mitochondrial genome. Comparative analyses of mitochondrial genomes of lines with and without CMS in several species have revealed structural differences between genomes, including extensive rearrangements caused by recombination. However, the mitochondrial genome structure and the DNA rearrangements that may be related to CMS have not been characterized in Capsicum spp. We obtained the complete mitochondrial genome sequences of the pepper CMS line FS4401 (507,452 bp) and the fertile line Jeju (511,530 bp). Comparative analysis between mitochondrial genomes of peppers and tobacco that are included in Solanaceae revealed extensive DNA rearrangements and poor conservation in non-coding DNA. In comparison between pepper lines, FS4401 and Jeju mitochondrial DNAs contained the same complement of protein coding genes except for one additional copy of an atp6 gene (ψatp6-2) in FS4401. In terms of genome structure, we found eighteen syntenic blocks in the two mitochondrial genomes, which have been rearranged in each genome. By contrast, sequences between syntenic blocks, which were specific to each line, accounted for 30,380 and 17,847 bp in FS4401 and Jeju, respectively. The previously-reported CMS candidate genes, orf507 and ψatp6-2, were located on the edges of the largest sequence segments that were specific to FS4401. In this region, large number of small sequence segments which were absent or found on different locations in Jeju mitochondrial genome were combined together. The incorporation of repeats and overlapping of connected sequence segments by a few nucleotides implied that extensive rearrangements by homologous recombination might be involved in evolution of this region. Further analysis using mtDNA pairs from other plant species revealed common features of DNA regions around CMS-associated genes. 
Although a large portion of the sequence context was shared by the mitochondrial genomes of CMS and male-fertile pepper lines, extensive genome rearrangements were detected. CMS candidate genes were located on the edges of highly rearranged CMS-specific DNA regions and near repeat sequences. These characteristics were also detected among CMS-associated genes in other species, implying that a common mechanism might be involved in the evolution of CMS-associated genes.
A novel species-specific tandem repeat DNA family from Sinapis arvensis: detection of telomere-like sequences.

PubMed

Kapila, R; Das, S; Srivastava, P S; Lakshmikumaran, M

1996-08-01

DNA sequences representing a tandemly repeated DNA family of the Sinapis arvensis genome were cloned and characterized. The 700-bp tandem repeat family is represented by two clones, pSA35 and pSA52, which are 697 and 709 bp in length, respectively. Dot matrix analysis of the sequences indicates the presence of repeated elements within each monomeric unit. Sequence analysis of the repetitive region of clones pSA35 and pSA52 shows that there are several copies of a 7-bp repeat element organized in tandem. The consensus sequence of this repeat element is 5'-TTTAGGG-3'. These elements are highly mutated and the difference in length between the two clones is due to different copy numbers of these elements. The repetitive region of clone pSA35 has 26 copies of the element TTTAGGG, whereas clone pSA52 has 28 copies. The repetitive region in both clones is flanked on either side by inverted repeats that may be footprints of a transposition event. Sequence comparison indicates that the element TTTAGGG is identical to telomeric repeats present in Arabidopsis, maize, tomato, and other plants. However, Bal31 digestion kinetics indicates non-telomeric localization of the 700-bp tandem repeats. The clones represent a novel repeat family as (i) they contain telomere-like motifs as subrepeats within each unit; and (ii) they do not hybridize to related crucifers and are species-specific in nature.
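The abstract above reports copy numbers of the 7-bp element TTTAGGG within each cloned repeat unit. As a minimal sketch of how such copies could be tallied in a sequence string, the code below counts exact matches and finds the longest uninterrupted tandem run. This is a simplification: the abstract notes that the elements are highly mutated, so exact matching would undercount diverged copies. The function names are hypothetical.

```python
import re


def count_copies(seq: str, unit: str = "TTTAGGG") -> int:
    """Number of non-overlapping exact copies of unit anywhere in seq."""
    return seq.upper().count(unit.upper())


def longest_tandem_run(seq: str, unit: str = "TTTAGGG") -> int:
    """Largest number of back-to-back exact copies of unit in seq."""
    pattern = re.compile(f"(?:{re.escape(unit.upper())})+")
    runs = [len(m.group(0)) // len(unit) for m in pattern.finditer(seq.upper())]
    return max(runs, default=0)
```

For a real clone sequence, an approximate matcher that tolerates a few mismatches per unit would be a closer fit to the mutated elements described in the abstract.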
An Efficient Method for Electroporation of Small Interfering RNAs into ENCODE Project Tier 1 GM12878 and K562 Cell Lines.

PubMed

Muller, Ryan Y; Hammond, Ming C; Rio, Donald C; Lee, Yeon J

2015-12-01

The Encyclopedia of DNA Elements (ENCODE) Project aims to identify all functional sequence elements in the human genome sequence by use of high-throughput DNA/cDNA sequencing approaches. To aid the standardization, comparison, and integration of data sets produced from different technologies and platforms, the ENCODE Consortium selected several standard human cell lines to be used by the ENCODE Projects. The Tier 1 ENCODE cell lines include GM12878, K562, and H1 human embryonic stem cell lines. GM12878 is a lymphoblastoid cell line, transformed with the Epstein-Barr virus, that was selected by the International HapMap Project for whole genome and transcriptome sequencing by use of the Illumina platform. K562 is an immortalized myelogenous leukemia cell line. The GM12878 cell line is attractive for the ENCODE Projects, as it offers potential synergy with the International HapMap Project. Despite the vast amount of sequencing data available on the GM12878 cell line through the ENCODE Project, including transcriptome, chromatin immunoprecipitation-sequencing for histone marks, and transcription factors, no small interfering RNA (siRNA)-mediated knockdown studies have been performed in the GM12878 cell line, as cationic lipid-mediated transfection methods are inefficient for lymphoid cell lines. Here, we present an efficient and reproducible method for transfection of a variety of siRNAs into the GM12878 and K562 cell lines, which subsequently results in targeted protein depletion.
Comparison of performance of three commercial platforms for warfarin sensitivity genotyping.

PubMed

Babic, Nikolina; Haverfield, Eden V; Burrus, Julie A; Lozada, Anthony; Das, Soma; Yeo, Kiang-Teck J

2009-08-01

We performed a 3-way comparison on the Osmetech eSensor, AutoGenomics INFINITI, and a real-time PCR method (ParagonDx reagents/Stratagene RT-PCR platform) for their FDA-cleared warfarin panels, and additional polymorphisms (CYP2C9*5, *6, and *11 and extended VKORC1 panels) where available. One hundred de-identified DNA samples were used in this IRB-approved study. Accuracy was determined by comparison of genotyping results across three platforms. Any discrepancy was resolved by bi-directional sequencing. The CYP4F2 on Osmetech was validated by bi-directional sequencing. Accuracies for CYP2C9*2 and *3 were 100% for all 3 platforms. VKORC1 3673 genotyping accuracies were 100% on eSensor and 97% on Infiniti. CYP2C9*5, *6 and *11 showed 100% concordance between eSensor and Infiniti. VKORC1 6484 and 9041 variants compared between ParagonDx and Infiniti analyzer were 100% (6484) and 99% (9041) concordant. CYP4F2 was 100% concordant with sequencing results. The time required to generate the results from automated DNA extraction-to-result was approximately 8 h on Infiniti and 4 h on eSensor and ParagonDx. Overall, we observed excellent CYP2C9*2 and *3 genotyping accuracy for all three platforms. For VKORC1 3673 genotyping, eSensor demonstrated a slightly higher accuracy than the Infiniti, and CYP4F2 on Osmetech was 100% accurate.
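The cross-platform concordance figures reported above are simple agreement percentages. A minimal Python sketch of that calculation follows; the function name, sample identifiers, and genotype calls are hypothetical and are not data from the study.

```python
def concordance(calls_a: dict[str, str], calls_b: dict[str, str]) -> float:
    """Percent of shared samples whose genotype calls agree on both platforms."""
    shared = sorted(set(calls_a) & set(calls_b))
    if not shared:
        raise ValueError("no samples in common")
    agree = sum(calls_a[s] == calls_b[s] for s in shared)
    return 100.0 * agree / len(shared)


# Hypothetical calls for four samples at one locus (not data from the study).
platform_a = {"S1": "*1/*1", "S2": "*1/*2", "S3": "*1/*3", "S4": "*2/*2"}
platform_b = {"S1": "*1/*1", "S2": "*1/*2", "S3": "*1/*1", "S4": "*2/*2"}
print(f"{concordance(platform_a, platform_b):.0f}% concordant")
```

In the study, discordant calls such as the hypothetical sample S3 here were resolved by bi-directional sequencing, which then served as the reference for accuracy.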
eShadow: A tool for comparing closely related sequences

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ovcharenko, Ivan; Boffelli, Dario; Loots, Gabriela G.

2004-01-15

Primate sequence comparisons are difficult to interpret due to the high degree of sequence similarity shared between such closely related species. Recently, a novel method, phylogenetic shadowing, has been pioneered for predicting functional elements in the human genome through the analysis of multiple primate sequence alignments. We have expanded this theoretical approach to create a computational tool, eShadow, for the identification of elements under selective pressure in multiple sequence alignments of closely related genomes, such as in comparisons of human to primate or mouse to rat DNA. This tool integrates two different statistical methods and allows for the dynamic visualization of the resulting conservation profile. eShadow also includes a versatile optimization module capable of training the underlying Hidden Markov Model to differentially predict functional sequences. This module grants the tool high flexibility in the analysis of multiple sequence alignments and in comparing sequences with different divergence rates. Here, we describe the eShadow comparative tool and its potential uses for analyzing both multiple nucleotide and protein alignments to predict putative functional elements. The eShadow tool is publicly available at http://eshadow.dcode.org/
Molecular Cloning and Sequence Analysis of the Sta58 Major Antigen Gene of Rickettsia tsutsugamushi: Sequence homology and Antigenic Comparison of Sta58 to the 60-Kilodalton Family of Stress Proteins

DTIC Science & Technology

1990-05-01

Sta58 antigen and the Sta56 strain- GroES, C. burnetii HtpA, Mycobacterium tuberculosis 12- specific major antigen of R. tsutsugamushi (strain Karp...kb HindIII fragment carrying the gene for the Sta58 tuberculosis, and Mycobacterium smegmatis (65-kDa anti- protein was subjected to DNA sequence...the Hsp60 and Hsp10 proteins. R. tsu., R. tsutsugamushi; M. lep., Mycobacterium leprae; C. bur., C. burnetii; Synech., Synechococcus strain 6301; T
A new family of polymerases related to superfamily A DNA polymerases and T7-like DNA-dependent RNA polymerases.

PubMed

Iyer, Lakshminarayan M; Abhiman, Saraswathi; Aravind, L

2008-10-04

Using sequence profile methods and structural comparisons we characterize a previously unknown family of nucleic acid polymerases in a group of mobile elements from genomes of diverse bacteria, an algal plastid and certain DNA viruses, including the recently reported Sputnik virus. Using contextual information from domain architectures and gene-neighborhoods we present evidence that they are likely to possess both primase and DNA polymerase activity, comparable to the previously reported prim-pol proteins. These newly identified polymerases help in defining the minimal functional core of superfamily A DNA polymerases and related RNA polymerases. Thus, they provide a framework to understand the emergence of both DNA and RNA polymerization activity in this class of enzymes. They also provide evidence that enigmatic DNA viruses, such as Sputnik, might have emerged from mobile elements coding these polymerases.
A new family of polymerases related to superfamily A DNA polymerases and T7-like DNA-dependent RNA polymerases

PubMed Central

Iyer, Lakshminarayan M; Abhiman, Saraswathi; Aravind, L

2008-01-01

Using sequence profile methods and structural comparisons we characterize a previously unknown family of nucleic acid polymerases in a group of mobile elements from genomes of diverse bacteria, an algal plastid and certain DNA viruses, including the recently reported Sputnik virus. Using contextual information from domain architectures and gene-neighborhoods we present evidence that they are likely to possess both primase and DNA polymerase activity, comparable to the previously reported prim-pol proteins. These newly identified polymerases help in defining the minimal functional core of superfamily A DNA polymerases and related RNA polymerases. Thus, they provide a framework to understand the emergence of both DNA and RNA polymerization activity in this class of enzymes. They also provide evidence that enigmatic DNA viruses, such as Sputnik, might have emerged from mobile elements coding these polymerases. This article was reviewed by Eugene Koonin and Mark Ragan. PMID:18834537
Tissue-specific DNA methylation is conserved across human, mouse, and rat, and driven by primary sequence conservation.

PubMed

Zhou, Jia; Sears, Renee L; Xing, Xiaoyun; Zhang, Bo; Li, Daofeng; Rockweiler, Nicole B; Jang, Hyo Sik; Choudhary, Mayank N K; Lee, Hyung Joo; Lowdon, Rebecca F; Arand, Jason; Tabers, Brianne; Gu, C Charles; Cicero, Theodore J; Wang, Ting

2017-09-12

Uncovering mechanisms of epigenome evolution is an essential step towards understanding the evolution of different cellular phenotypes. While studies have confirmed DNA methylation as a conserved epigenetic mechanism in mammalian development, little is known about the conservation of tissue-specific genome-wide DNA methylation patterns. Using a comparative epigenomics approach, we identified and compared the tissue-specific DNA methylation patterns of rat against those of mouse and human across three shared tissue types. We confirmed that tissue-specific differentially methylated regions are strongly associated with tissue-specific regulatory elements. Comparisons between species revealed that at a minimum 11-37% of tissue-specific DNA methylation patterns are conserved, a phenomenon that we define as epigenetic conservation. Conserved DNA methylation is accompanied by conservation of other epigenetic marks including histone modifications. Although a significant amount of locus-specific methylation is epigenetically conserved, the majority of tissue-specific DNA methylation is not conserved across the species and tissue types that we investigated. Examination of the genetic underpinning of epigenetic conservation suggests that primary sequence conservation is a driving force behind epigenetic conservation. In contrast, evolutionary dynamics of tissue-specific DNA methylation are best explained by the maintenance or turnover of binding sites for important transcription factors. Our study extends the limited literature of comparative epigenomics and suggests a new paradigm for epigenetic conservation without genetic conservation through analysis of transcription factor binding sites.
cDNA identification, comparison and phylogenetic aspects of lombricine kinase from two oligochaete species.

PubMed

Doumen, Chris

2010-06-01

Creatine kinase and arginine kinase are the typical representatives of an eight-member phosphagen kinase family, which play important roles in the cellular energy metabolism of animals. The phylum Annelida underwent a series of evolutionary processes that resulted in rapid divergence and radiation of these enzymes, producing the greatest diversity of the phosphagen kinases within this phylum. Lombricine kinase (EC 2.7.3.5) is one such enzyme, and its sequence information is rather limited compared to other phosphagen kinases. This study presents data on the cDNA sequences of lombricine kinase from two oligochaete species, the California blackworm (Lumbriculus variegatus) and the sludge worm (Tubifex tubifex). The deduced amino acid sequences are analyzed and compared with other selected phosphagen kinases, including two additional lombricine kinase sequences extracted from DNA databases, and provide further insight into the evolution and position of these enzymes within the phosphagen kinase family. The data confirm the presence of a deleted region within the flexible loop (the GS region) of all six examined lombricine kinases. A phylogenetic analysis of these six lombricine kinases clearly positions the enzymes together in a small subcluster within the larger creatine kinase (EC 2.7.3.2) clade. 2010. Published by Elsevier Inc.

High-resolution phylogeography of zoonotic tapeworm Echinococcus granulosus sensu stricto genotype G1 with an emphasis on its distribution in Turkey, Italy and Spain.

PubMed

Kinkar, Liina; Laurimäe, Teivi; Simsek, Sami; Balkaya, Ibrahim; Casulli, Adriano; Manfredi, Maria Teresa; Ponce-Gordo, Francisco; Varcasia, Antonio; Lavikainen, Antti; González, Luis Miguel; Rehbein, Steffen; van der Giessen, Joke; Sprong, Hein; Saarma, Urmas

2016-11-01

Echinococcus granulosus is the causative agent of cystic echinococcosis. The disease is a significant global public health concern and human infections are most commonly associated with E. granulosus sensu stricto (s. s.) genotype G1. The objectives of this study were to: (i) analyse the genetic variation and phylogeography of E. granulosus s. s. G1 in part of its main distribution range in Europe using 8274 bp of mtDNA; (ii) compare the results with those derived from previously used shorter mtDNA sequences and highlight the major differences. We sequenced a total of 91 E. granulosus s. s. G1 isolates from six different intermediate host species, including humans. The isolates originated from seven countries representing primarily Turkey, Italy and Spain. A few samples were also from Albania, Greece, Romania and from a patient originating from Algeria, but diagnosed in Finland. The analysed 91 sequences were divided into 83 haplotypes, revealing complex phylogeography and high genetic variation of E. granulosus s. s. G1 in Europe, particularly in the high-diversity domestication centre of western Asia. Comparisons with shorter mtDNA datasets revealed that 8274 bp sequences provided significantly higher phylogenetic resolution and thus more power to reveal the genetic relations between different haplotypes.
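Dividing sequences into haplotypes, and the loss of resolution with shorter sequences, can be illustrated with a minimal Python sketch. It groups isolates by identical aligned sequence and then repeats the grouping on a truncated alignment. The function name, isolate names, and 8-bp toy sequences are hypothetical stand-ins for the 8274-bp mtDNA data.

```python
from collections import defaultdict


def collapse_haplotypes(seqs: dict[str, str]) -> dict[str, list[str]]:
    """Group isolate names by identical aligned sequence (one group per haplotype)."""
    groups: dict[str, list[str]] = defaultdict(list)
    for name, seq in seqs.items():
        groups[seq.upper()].append(name)
    return dict(groups)


# Toy aligned sequences (hypothetical; real input would be 8274-bp mtDNA alignments).
isolates = {
    "iso1": "ACGTACGT",
    "iso2": "ACGTACGT",
    "iso3": "ACGTACGA",
    "iso4": "ACCTACGT",
}
haplotypes = collapse_haplotypes(isolates)
print(len(isolates), "sequences ->", len(haplotypes), "haplotypes")

# Truncating to a shorter fragment hides the variant at the last position,
# so fewer haplotypes are distinguished.
short = collapse_haplotypes({name: seq[:4] for name, seq in isolates.items()})
print("first 4 bp only ->", len(short), "haplotypes")
```

The toy data give three haplotypes at full length and two after truncation, mirroring in miniature why longer sequences offer more phylogenetic resolution.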
Sequential addition of short DNA oligos in DNA-polymerase-based synthesis reactions

DOEpatents

Gardner, Shea N; Mariella, Jr., Raymond P; Christian, Allen T; Young, Jennifer A; Clague, David S

2013-06-25

A method of preselecting a multiplicity of DNA sequence segments that will comprise the DNA molecule of user-defined sequence, separating the DNA sequence segments temporally, and combining the multiplicity of DNA sequence segments with at least one polymerase enzyme wherein the multiplicity of DNA sequence segments join to produce the DNA molecule of user-defined sequence. Sequence segments may be of length n, where n is an odd integer. In one embodiment the length of desired hybridizing overlap is specified by the user and the sequences and the protocol for combining them are guided by computational (bioinformatics) predictions. In one embodiment sequence segments are combined from multiple reading frames to span the same region of a sequence, so that multiple desired hybridizations may occur with different overlap lengths.
Sequential addition of short DNA oligos in DNA-polymerase-based synthesis reactions

DOEpatents

Gardner, Shea N [San Leandro, CA]; Mariella, Jr., Raymond P.; Christian, Allen T [Tracy, CA]; Young, Jennifer A [Berkeley, CA]; Clague, David S [Livermore, CA]

2011-01-18

A method of fabricating a DNA molecule of user-defined sequence. The method comprises the steps of preselecting a multiplicity of DNA sequence segments that will comprise the DNA molecule of user-defined sequence, separating the DNA sequence segments temporally, and combining the multiplicity of DNA sequence segments with at least one polymerase enzyme wherein the multiplicity of DNA sequence segments join to produce the DNA molecule of user-defined sequence. Sequence segments may be of length n, where n is an even or odd integer. In one embodiment the length of desired hybridizing overlap is specified by the user and the sequences and the protocol for combining them are guided by computational (bioinformatics) predictions. In one embodiment sequence segments are combined from multiple reading frames to span the same region of a sequence, so that multiple desired hybridizations may occur with different overlap lengths. In one embodiment starting sequence fragments are of different lengths, n, n+1, n+2, etc.
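The preselection step, in which a target sequence is divided into segments of length n with a user-specified hybridizing overlap, can be sketched in Python. This is a simplified illustration under stated assumptions, not the patented procedure: the function name and the 21-base target are hypothetical, all segments are taken from one strand, and hybridization thermodynamics are ignored.

```python
def overlapping_segments(target: str, n: int, overlap: int) -> list[str]:
    """Split target into segments of length n; adjacent segments share `overlap` bases.

    The final segment may be shorter than n if the target length does not fit evenly.
    """
    if not 0 <= overlap < n:
        raise ValueError("overlap must satisfy 0 <= overlap < n")
    step = n - overlap
    segments = []
    start = 0
    while True:
        segments.append(target[start:start + n])
        if start + n >= len(target):
            break
        start += step
    return segments


# Hypothetical 21-base target, segment length n = 9, overlap of 3 bases.
target = "ATGGCCATTGTAATGGGCCGC"
segments = overlapping_segments(target, n=9, overlap=3)
print(segments)

# Sanity check: dropping each later segment's overlap reconstructs the target.
rebuilt = segments[0] + "".join(seg[3:] for seg in segments[1:])
print(rebuilt == target)
```

Varying n per segment (n, n+1, n+2, and so on), as one embodiment describes, would require passing a list of lengths rather than a single value.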
Impact of cultivation on characterisation of species composition of soil bacterial communities.

PubMed

McCaig, A E; Grayston, S J; Prosser, J I; Glover, L A

2001-03-01

The species composition of culturable bacteria in Scottish grassland soils was investigated using a combination of Biolog and 16S rDNA analysis for characterisation of isolates. The inclusion of a molecular approach allowed direct comparison of sequences from culturable bacteria with sequences obtained during analysis of DNA extracted directly from the same soil samples. Bacterial strains were isolated on Pseudomonas isolation agar (PIA), a selective medium, and on tryptone soya agar (TSA), a general laboratory medium. In total, 12 and 21 morphologically different bacterial cultures were isolated on PIA and TSA, respectively. Biolog and sequencing placed PIA isolates in the same taxonomic groups, the majority of cultures belonging to the Pseudomonas (sensu stricto) group. However, analysis of 16S rDNA sequences proved more efficient than Biolog for characterising TSA isolates due to limitations of the Microlog database for identifying environmental bacteria. In general, 16S rDNA sequences from TSA isolates showed high similarities to cultured species represented in sequence databases, although TSA-8 showed only 92.5% similarity to the nearest relative, Bacillus insolitus. In general, there was very little overlap between the culturable and uncultured bacterial communities, although two sequences, PIA-2 and TSA-13, showed >99% similarity to soil clones. A cloning step was included prior to sequence analysis of two isolates, TSA-5 and TSA-14, and analysis of several clones confirmed that these cultures comprised at least four and three sequence types, respectively. All isolate clones were most closely related to uncultured bacteria, with clone TSA-5.1 showing 99.8% similarity to a sequence amplified directly from the same soil sample. Interestingly, one clone, TSA-5.4, clustered within a novel group comprising only uncultured sequences. 
This group, which is associated with the novel, deep-branching Acidobacterium capsulatum lineage, also included clones isolated during direct analysis of the same soil and from a wide range of other sample types studied elsewhere. The study demonstrates the value of fine-scale molecular analysis for identification of laboratory isolates and indicates the culturability of approximately 1% of the total population but under a restricted range of media and cultivation conditions.
Are commercial providers a viable option for clinical bacterial sequencing?

PubMed

Raven, Kathy; Blane, Beth; Churcher, Carol; Parkhill, Julian; Peacock, Sharon J

2018-04-05

Bacterial whole-genome sequencing in the clinical setting has the potential to bring major improvements to infection control and clinical practice. Sequencing instruments are not currently available in the majority of routine microbiology laboratories worldwide, but an alternative is to use external sequencing providers. To foster discussion around this we investigated whether send-out services were a viable option. Four providers offering MiSeq sequencing were selected based on cost and evaluated based on the service provided and sequence data quality. DNA was prepared from five methicillin-resistant Staphylococcus aureus (MRSA) isolates, four of which were investigated during a previously published outbreak in the UK together with a reference MRSA isolate (ST22 HO 5096 0412). Cost of sequencing per isolate ranged from £155 to £342 and turnaround times from DNA postage to arrival of sequence data ranged from 12 to 63 days. Comparison of commercially generated genomes against the original sequence data demonstrated very high concordance, with no more than one single nucleotide polymorphism (SNP) difference on core genome mapping between the original sequences and the new sequence for all four providers. Multilocus sequence type could not be assigned based on assembly for the two cheapest sequence providers due to fragmented assemblies probably caused by a lower output of sequence data per isolate. Our results indicate that external providers returned highly accurate genome data, but that improvements are required in turnaround time to make this a viable option for use in clinical practice.
Use of DNA barcodes to identify flowering plants

PubMed Central

Kress, W. John; Wurdack, Kenneth J.; Zimmer, Elizabeth A.; Weigt, Lee A.; Janzen, Daniel H.

2005-01-01

Methods for identifying species by using short orthologous DNA sequences, known as “DNA barcodes,” have been proposed and initiated to facilitate biodiversity studies, identify juveniles, associate sexes, and enhance forensic analyses. The cytochrome c oxidase 1 sequence, which has been found to be widely applicable in animal barcoding, is not appropriate for most species of plants because of a much slower rate of cytochrome c oxidase 1 gene evolution in higher plants than in animals. We therefore propose the nuclear internal transcribed spacer region and the plastid trnH-psbA intergenic spacer as potentially usable DNA regions for applying barcoding to flowering plants. The internal transcribed spacer is the most commonly sequenced locus used in plant phylogenetic investigations at the species level and shows high levels of interspecific divergence. The trnH-psbA spacer, although short (≈450-bp), is the most variable plastid region in angiosperms and is easily amplified across a broad range of land plants. Comparison of the total plastid genomes of tobacco and deadly nightshade, enhanced with trials on widely divergent angiosperm taxa, including closely related species in seven plant families and a group of species sampled from a local flora encompassing 50 plant families (for a total of 99 species, 80 genera, and 53 families), suggests that the sequences in this pair of loci have the potential to discriminate among the largest number of plant species for barcoding purposes. PMID:15928076
LexA Binds to Transcription Regulatory Site of Cell Division Gene ftsZ in Toxic Cyanobacterium Microcystis aeruginosa.

PubMed

Honda, Takashi; Morimoto, Daichi; Sako, Yoshihiko; Yoshida, Takashi

2018-05-17

Previously, we showed that DNA replication and cell division in toxic cyanobacterium Microcystis aeruginosa are coordinated by transcriptional regulation of cell division gene ftsZ and that an unknown protein specifically bound upstream of ftsZ (BpFz; DNA-binding protein to an upstream site of ftsZ) during successful DNA replication and cell division. Here, we purified BpFz from M. aeruginosa strain NIES-298 using DNA-affinity chromatography and gel-slicing combined with gel electrophoresis mobility shift assay (EMSA). The N-terminal amino acid sequence of BpFz was identified as TNLESLTQ, which was identical to that of transcription repressor LexA from NIES-843. EMSA analysis using mutant probes showed that the sequence GTACTA-N3-GTGTTC was important in LexA binding. Comparison of the upstream regions of lexA in the genomes of closely related cyanobacteria suggested that the sequence TASTRNNNNTGTWC could be a putative LexA recognition sequence (LexA box). Searches for TASTRNNNNTGTWC as a transcriptional regulatory site (TRS) in the genome of M. aeruginosa NIES-843 showed that it was present in genes involved in cell division, photosynthesis, and extracellular polysaccharide biosynthesis. Considering that BpFz binds to the TRS of ftsZ during normal cell division, LexA may function as a transcriptional activator of genes related to cell reproduction in M. aeruginosa, including ftsZ. This may be an example of informality in the control of bacterial cell division.
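A genome search for a degenerate motif such as TASTRNNNNTGTWC can be sketched by translating IUPAC ambiguity codes into a regular expression. The Python below is an illustrative sketch, not the authors' pipeline: the function names and the toy probe are hypothetical, and only the forward strand is scanned.

```python
import re

# Standard IUPAC nucleotide ambiguity codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def iupac_to_regex(motif: str) -> re.Pattern:
    """Translate an IUPAC nucleotide motif into a compiled regular expression."""
    return re.compile("".join(IUPAC[base] for base in motif.upper()))


def find_sites(seq: str, motif: str) -> list[int]:
    """Return 0-based start positions of all (possibly overlapping) matches."""
    pattern = iupac_to_regex(motif)
    # A lookahead lets overlapping sites be reported.
    return [m.start() for m in re.finditer(f"(?={pattern.pattern})", seq.upper())]


LEXA_BOX = "TASTRNNNNTGTWC"  # putative LexA box from the abstract
probe = "CCGTACTAAAGGTGTTCGG"  # toy sequence containing GTACTA, 3 bases, GTGTTC
print(find_sites(probe, LEXA_BOX))
```

To cover both strands, the same search could be run on the reverse complement of the sequence and the coordinates mapped back.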
PCR amplification and DNA sequencing of Demodex injai from otic secretions of a dog.

PubMed

Milosevic, Milivoj A; Frank, Linda A; Brahmbhatt, Rupal A; Kania, Stephen A

2013-04-01

The identification of Demodex mites from dogs is usually based on morphology and location. Mites with uncharacteristic features or from unusual locations, hosts or disease manifestations could represent new species not previously described; however, this is difficult to determine based on morphology alone. The goal of this study was to identify and confirm Demodex injai in association with otitis externa in a dog using PCR amplification and DNA sequencing. Otic samples were obtained from a beagle in which a long-bodied Demodex mite was identified. For comparison, Demodex mite samples were collected from a swab and scraping of the dorsal skin of a wire-haired fox terrier and an otic sample from a dog with generalized and otic demodicosis. To identify the Demodex mite, DNA was extracted, and 16S rRNA was amplified by PCR, sequenced and compared with Demodex sequences available in public databases and from separate samples morphologically diagnosed as D. injai and Demodex canis. PCR amplification of the long-bodied mite rRNA DNA obtained from otic samples was approximately 330 bp and was identical to that from the mite morphologically identified as D. injai obtained from the dorsal skin of a dog. Furthermore, the examined mite did not have any significant homology to any of the reported genes from Demodex spp. These results confirmed that the Demodex mites in this case were D. injai. © 2013 The Authors. Veterinary Dermatology © 2013 ESVD and ACVD.
Molecular and Morphological Characterization of Fasciola spp. Isolated from Different Host Species in a Newly Emerging Focus of Human Fascioliasis in Iran

PubMed Central

Shafiei, Reza; Sarkari, Bahador; Sadjjadi, Seyed Mahmuod; Mowlavi, Gholam Reza; Moshfe, Abdolali

2014-01-01

The current study aimed to find out the morphometric and genotypic divergences of the flukes isolated from different hosts in a newly emerging focus of human fascioliasis in Iran. Adult Fasciola spp. were collected from 34 cattle, 13 sheep, and 11 goats from Kohgiluyeh and Boyer-Ahmad province, southwest of Iran. Genomic DNA was extracted from the flukes and PCR-RFLP was used to characterize the isolates. The ITS1, ITS2, and mitochondrial genes (mtDNA) of NDI and COI from individual liver flukes were amplified and the amplicons were sequenced. Genetic variation within and between the species was evaluated by comparing the sequences. Moreover, morphometric characteristics of flukes were measured through a computer image analysis system. Based on RFLP profile, from the total of 58 isolates, 41 isolates (from cattle, sheep, and goat) were identified as Fasciola hepatica, while 17 isolates from cattle were identified as Fasciola gigantica. Comparison of the ITS1 and ITS2 sequences showed six and seven single-base substitutions, resulting in segregation of the specimens into two different genotypes. The sequences of COI markers showed seven DNA polymorphic sites for F. hepatica and 35 DNA polymorphic sites for F. gigantica. Morphological diversity of the two species was observed in linear, ratios, and areas measurements. The findings have implications for studying the population genetics, epidemiology, and control of the disease. PMID:25018891
Application of next-generation sequencing for rapid marker development in molecular plant breeding: a case study on anthracnose disease resistance in Lupinus angustifolius L.

PubMed Central

2012-01-01

Background In the last 30 years, a number of DNA fingerprinting methods such as RFLP, RAPD, AFLP, SSR, DArT, have been extensively used in marker development for molecular plant breeding. However, it remains a daunting task to identify highly polymorphic and closely linked molecular markers for a target trait for molecular marker-assisted selection. The next-generation sequencing (NGS) technology is far more powerful than any existing generic DNA fingerprinting methods in generating DNA markers. In this study, we employed a grain legume crop Lupinus angustifolius (lupin) as a test case, and examined the utility of an NGS-based method of RAD (restriction-site associated DNA) sequencing as DNA fingerprinting for rapid, cost-effective marker development tagging a disease resistance gene for molecular breeding. Results Twenty informative plants from a cross of RxS (disease resistant x susceptible) in lupin were subjected to RAD single-end sequencing by multiplex identifiers. The entire RAD sequencing products were resolved in two lanes of the 16-lanes per run sequencing platform Solexa HiSeq2000. A total of 185 million raw reads, approximately 17 Gb of sequencing data, were collected. Sequence comparison among the 20 test plants discovered 8207 SNP markers. Filtration of DNA sequencing data with marker identification parameters resulted in the discovery of 38 molecular markers linked to the disease resistance gene Lanr1. Five randomly selected markers were converted into cost-effective, simple PCR-based markers. Linkage analysis using marker genotyping data and disease resistance phenotyping data on a F8 population consisting of 186 individual plants confirmed that all these five markers were linked to the R gene. 
Two of these newly developed sequence-specific PCR markers, AnSeq3 and AnSeq4, flanked the target R gene at a genetic distance of 0.9 centiMorgan (cM), and are now replacing the markers previously developed by a traditional DNA fingerprinting method for marker-assisted selection in the Australian national lupin breeding program. Conclusions We demonstrated that more than 30 molecular markers linked to a target gene of agronomic trait of interest can be identified from a small portion (1/8) of one sequencing run on HiSeq2000 by applying NGS based RAD sequencing in marker development. The markers developed by the strategy described in this study are all co-dominant SNP markers, which can readily be converted into high throughput multiplex format or low-cost, simple PCR-based markers desirable for large scale marker implementation in plant breeding programs. The high density and closely linked molecular markers associated with a target trait help to overcome a major bottleneck for implementation of molecular markers on a wide range of germplasm in breeding programs. We conclude that application of NGS based RAD sequencing as DNA fingerprinting is a very rapid and cost-effective strategy for marker development in molecular plant breeding. The strategy does not require any prior genome knowledge or molecular information for the species under investigation, and it is applicable to other plant species. PMID:22805587
Application of next-generation sequencing for rapid marker development in molecular plant breeding: a case study on anthracnose disease resistance in Lupinus angustifolius L.

PubMed

Yang, Huaan; Tao, Ye; Zheng, Zequn; Li, Chengdao; Sweetingham, Mark W; Howieson, John G

2012-07-17

In the last 30 years, a number of DNA fingerprinting methods such as RFLP, RAPD, AFLP, SSR, DArT, have been extensively used in marker development for molecular plant breeding. However, it remains a daunting task to identify highly polymorphic and closely linked molecular markers for a target trait for molecular marker-assisted selection. The next-generation sequencing (NGS) technology is far more powerful than any existing generic DNA fingerprinting methods in generating DNA markers. In this study, we employed a grain legume crop Lupinus angustifolius (lupin) as a test case, and examined the utility of an NGS-based method of RAD (restriction-site associated DNA) sequencing as DNA fingerprinting for rapid, cost-effective marker development tagging a disease resistance gene for molecular breeding. Twenty informative plants from a cross of RxS (disease resistant x susceptible) in lupin were subjected to RAD single-end sequencing by multiplex identifiers. The entire RAD sequencing products were resolved in two lanes of the 16-lanes per run sequencing platform Solexa HiSeq2000. A total of 185 million raw reads, approximately 17 Gb of sequencing data, were collected. Sequence comparison among the 20 test plants discovered 8207 SNP markers. Filtration of DNA sequencing data with marker identification parameters resulted in the discovery of 38 molecular markers linked to the disease resistance gene Lanr1. Five randomly selected markers were converted into cost-effective, simple PCR-based markers. Linkage analysis using marker genotyping data and disease resistance phenotyping data on a F8 population consisting of 186 individual plants confirmed that all these five markers were linked to the R gene. 
Two of these newly developed sequence-specific PCR markers, AnSeq3 and AnSeq4, flanked the target R gene at a genetic distance of 0.9 centiMorgan (cM), and are now replacing the markers previously developed by a traditional DNA fingerprinting method for marker-assisted selection in the Australian national lupin breeding program. We demonstrated that more than 30 molecular markers linked to a target gene of agronomic trait of interest can be identified from a small portion (1/8) of one sequencing run on HiSeq2000 by applying NGS based RAD sequencing in marker development. The markers developed by the strategy described in this study are all co-dominant SNP markers, which can readily be converted into high throughput multiplex format or low-cost, simple PCR-based markers desirable for large scale marker implementation in plant breeding programs. The high density and closely linked molecular markers associated with a target trait help to overcome a major bottleneck for implementation of molecular markers on a wide range of germplasm in breeding programs. We conclude that application of NGS based RAD sequencing as DNA fingerprinting is a very rapid and cost-effective strategy for marker development in molecular plant breeding. The strategy does not require any prior genome knowledge or molecular information for the species under investigation, and it is applicable to other plant species.
Terminations of DNA synthesis on 'proflavine and light'-treated phi X174 single-stranded DNA.

PubMed

Piette, J; Calberg-Bacq, C M; Lopez, M; van de Vorst, A

1984-04-05

Bacteriophage phi X174 single-stranded DNA molecules were primed with five different restriction fragments and irradiated with visible light in the presence of proflavine. This photodamaged DNA was used as template for the in vitro complementary chain synthesis by E. coli DNA polymerase I (Klenow fragment). Chain terminations were observed by polyacrylamide gel electrophoresis of the synthesized products and localized by comparison with standard sequencing performed simultaneously on the untreated template. 90% of the chain terminations occurred one nucleotide before a guanine residue in the template strand. More than 80% of the sequenced guanine residues were blocking lesions demonstrating the absence of 'hot-spots' for the photodamaging effect of proflavine. At a defined position, the chain termination frequency increased linearly with the irradiation time and was directly influenced by the proflavine concentration present. An important part of lesions resulted from the action of singlet oxygen produced by excited proflavine as shown by the effect that both NaN3 and 2H2O exerted on the reaction. The induced blocking lesions must be important in vivo since no complete replicative forms could be extracted from cell infected with bacteriophages inactivated by 'proflavine and light' treatment.
Molecular Characterization of the Skate Peripherin/rds Gene: Relationship to Its Orthologues and Paralogues

PubMed Central

Li, Chibo; Ding, Xi-Qin; O’Brien, John; Al-Ubaidi, Muayyad R.

2010-01-01

PURPOSE A great deal of information about functionally significant domains of a protein may be obtained by comparison of primary sequences of gene homologues over a broad phylogenetic base. This study was designed to identify evolutionarily conserved domains of the photoreceptor disc membrane protein peripherin/rds by analysis of the homologue in a primitive vertebrate, the skate. METHODS A skate retinal cDNA library was screened using a mouse peripherin/rds clone. The 5′ and 3′ untranslated regions of the skate peripherin/rds (srds) cDNA were isolated by the rapid amplification of cDNA ends (RACE) approach. The gene structure was characterized by PCR amplification and sequencing of genomic fragments. Northern and Western blot analyses were used to identify srds transcript and protein, respectively. RESULTS A new homologue of peripherin/rds was identified from the skate retinal cDNA library. SRDS is a glycoprotein with a predicted molecular mass of 40.2 kDa. The srds gene consists of two exons and one small intron and transcribes into a single 6-kb message. Phylogenetic analysis places SRDS at the base of peripherin/rds family and near the division of that group and the branch leading to rds-like and rom-1 genes. SRDS protein is 54.5% identical with peripherin/rds across species. Identity is significantly higher (73%) in the intradiscal domains. Sequence comparison revealed the conservation of all residues that have been shown, on mutation, to associate with retinitis pigmentosa and showed conservation of most residues associated with macular dystrophies. Comparison with ROM-1 and other rds-like proteins revealed the presence of a highly conserved domain in the large intradiscal loop. CONCLUSIONS Srds represents the skate orthologue of mammalian peripherin/rds genes. Conservation of most of the residues associated with human retinal diseases indicates that these residues serve important functional roles. 
The high degree of conservation of a short stretch within the large intradiscal loop also suggests an important function for this domain. PMID:12766040
Positive selection and propeptide repeats promote rapid interspecific divergence of a gastropod sperm protein.

PubMed

Hellberg, M E; Moy, G W; Vacquier, V D

2000-03-01

Male-specific proteins have increasingly been reported as targets of positive selection and are of special interest because of the role they may play in the evolution of reproductive isolation. We report the rapid interspecific divergence of cDNA encoding a major acrosomal protein of unknown function (TMAP) of sperm from five species of teguline gastropods. A mitochondrial DNA clock (calibrated by congeneric species divided by the Isthmus of Panama) estimates that these five species diverged 2-10 MYA. Inferred amino acid sequences reveal a propeptide that has diverged rapidly between species. The mature protein has diverged faster still due to high nonsynonymous substitution rates (> 25 nonsynonymous substitutions per site per 10(9) years). cDNA encoding the mature protein (89-100 residues) shows evidence of positive selection (Dn/Ds > 1) for 4 of 10 pairwise species comparisons. cDNA and predicted secondary-structure comparisons suggest that TMAP is neither orthologous nor paralogous to abalone lysin, and thus marks a second, phylogenetically independent, protein subject to strong positive selection in free-spawning marine gastropods. In addition, an internal repeat in one species (Tegula aureotincta) produces a duplicated cleavage site which results in two alternatively processed mature proteins differing by nine amino acid residues. Such alternative processing may provide a mechanism for introducing novel amino acid sequence variation at the amino-termini of proteins. Highly divergent TMAP N-termini from two other tegulines (Tegula regina and Norrisia norrisii) may have originated by such a mechanism.
mtDNA variation of the critically endangered hawksbill turtle (Eretmochelys imbricata) nesting on Iranian islands of the Persian Gulf.

PubMed

Tabib, M; Zolgharnein, H; Mohammadi, M; Salari-Aliabadi, M A; Qasemi, A; Roshani, S; Rajabi-Maham, H; Frootan, F

2011-01-01

Genetic diversity of sea turtles (hawksbill turtle) was studied using sequencing of mitochondrial DNA (mtDNA, D-loop region). Thirty dead embryos were collected from the Kish and Qeshm Islands in the Persian Gulf. Analysis of sequence variation over 890 bp of the mtDNA control region revealed five haplotypes among 30 individuals. This is the first time that Iranian haplotypes have been recorded. Haplotype and nucleotide diversity were 0.77 and 0.001 for Qeshm Island and 0.64 and 0.002 for Kish Island, respectively. Total haplotype diversity was calculated as 0.69, which demonstrates low genetic diversity in this area. The data also indicated very high rates of migration between the populations of these two islands. A comparison of our data with data from previous studies downloaded from a gene bank showed that turtles of the Persian Gulf migrated from the Pacific and the Sea of Oman into this area. On the other hand, evidence of migration from populations to the West was not found.
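The two diversity statistics reported in this abstract can be illustrated with a minimal Python sketch. It assumes Nei's unbiased haplotype diversity and the mean number of pairwise differences per site as the estimators; the abstract does not state which estimators or software the authors used, and the toy sequences below are hypothetical, not data from the study.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Nei's unbiased haplotype diversity: n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    counts = Counter(seqs)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)


def nucleotide_diversity(seqs):
    """Mean number of differences per site over all pairs of aligned, equal-length sequences."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / (len(pairs) * length)


# Hypothetical aligned control-region fragments (illustration only).
toy = ["ACGTACGT", "ACGTACGT", "ACGTACGA", "ACGAACGA"]
h = haplotype_diversity(toy)
pi = nucleotide_diversity(toy)
print(f"haplotype diversity = {h:.3f}, nucleotide diversity = {pi:.3f}")
```

Haplotype diversity depends only on how often each distinct sequence occurs, while nucleotide diversity also reflects how different the sequences are, which is why the two values can differ by orders of magnitude.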
Identification and characterization of the reptilian GnRH-II gene in the leopard gecko, Eublepharis macularius, and its evolutionary considerations.

PubMed

Ikemoto, Tadahiro; Park, Min Kyun

2003-10-16

To elucidate the molecular phylogeny and evolution of a particular peptide, one must analyze not the limited primary amino acid sequences of the low molecular weight mature polypeptide, but rather the sequences of the corresponding precursors from various species. Of all the structural variants of gonadotropin-releasing hormone (GnRH), GnRH-II (chicken GnRH-II, or cGnRH-II) is remarkably conserved without any sequence substitutions among vertebrates, but its precursor sequences vary considerably. We have identified and characterized the full-length complementary DNA (cDNA) encoding the GnRH-II precursor and determined its genomic structure, consisting of four exons and three introns, in a reptilian species, the leopard gecko Eublepharis macularius. This is the first report about the GnRH-II precursor cDNA/gene from reptiles. The deduced leopard gecko prepro-GnRH-II polypeptide had the highest identities with the corresponding polypeptides of amphibians. The GnRH-II precursor mRNA was detected in more than half of the tissues and organs examined. This widespread expression is consistent with the previous findings in several species, though the roles of GnRH outside the hypothalamus-pituitary-gonadal axis remain largely unknown. Molecular phylogenetic analysis combined with sequence comparison showed that the leopard gecko is more similar to fishes and amphibians than to eutherian mammals with respect to the GnRH-II precursor sequence. These results strongly suggest that the divergence of the GnRH-II precursor sequences seen in eutherian mammals may have occurred along with amniote evolution.
Improved coverage of cDNA-AFLP by sequential digestion of immobilized cDNA.

PubMed

Weiberg, Arne; Pöhler, Dirk; Morgenstern, Burkhard; Karlovsky, Petr

2008-10-13

cDNA-AFLP is a transcriptomics technique which does not require prior sequence information and can therefore be used as a gene discovery tool. The method is based on selective amplification of cDNA fragments generated by restriction endonucleases, electrophoretic separation of the products and comparison of the band patterns between treated samples and controls. Unequal distribution of restriction sites used to generate cDNA fragments negatively affects the performance of cDNA-AFLP. Some transcripts are represented by more than one fragment while others escape detection, causing redundancy and reducing the coverage of the analysis, respectively. With the goal of improving the coverage of cDNA-AFLP without increasing its redundancy, we designed a modified cDNA-AFLP protocol. Immobilized cDNA is sequentially digested with several restriction endonucleases and the released DNA fragments are collected in mutually exclusive pools. To investigate the performance of the protocol, the software tool MECS (Multiple Enzyme cDNA-AFLP Simulation) was written in Perl. cDNA-AFLP protocols described in the literature and the new sequential digestion protocol were simulated on sets of cDNA sequences from mouse, human and Arabidopsis thaliana. The redundancy and coverage, the total number of PCR reactions, and the average fragment length were calculated for each protocol and cDNA set. Simulation revealed that sequential digestion of immobilized cDNA followed by the partitioning of released fragments into mutually exclusive pools outperformed other cDNA-AFLP protocols in terms of coverage, redundancy, fragment length, and the total number of PCRs. Primers generating 30 to 70 amplicons per PCR provided the highest fraction of electrophoretically distinguishable fragments suitable for normalization. For the A. thaliana, human and mouse transcriptomes, the use of two marking enzymes and three sequentially applied releasing enzymes for each of the marking enzymes is recommended.
Comparison of the canine and human acid β-galactosidase gene

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ahern-Rindell, A.J.; Kretz, K.A.; O'Brien, J.S.

Several canine cDNA libraries were screened with human β-galactosidase cDNA as probe. Seven positive clones were isolated and sequenced yielding a partial (2060 bp) canine β-galactosidase cDNA with 86% identity to the human β-galactosidase cDNA. Preliminary analysis of a canine genomic library indicated conservation of exon number and size. Analysis by Northern blotting disclosed a single mRNA of 2.4 kb in fibroblasts and liver from normal dogs and dogs affected with GM1 gangliosidosis. Although incomplete, these results indicate canine GM1 gangliosidosis is a suitable animal model of the human disease and should further efforts to devise a gene therapy strategy for its treatment. 20 refs., 2 figs., 1 tab.
Study of DNA binding sites using the Rényi parametric entropy measure.

PubMed

Krishnamachari, A; moy Mandal, Vijnan; Karmeshu

2004-04-07

Shannon's definition of uncertainty or surprisal has been applied extensively to measure the information content of aligned DNA sequences and to characterize DNA binding sites. In contrast to Shannon's uncertainty, this study investigates the applicability and suitability of a parametric uncertainty measure due to Rényi. It is observed that this measure also provides results in agreement with Shannon's measure, pointing to its utility in analysing DNA binding site regions. To facilitate the comparison between these uncertainty measures, a dimensionless quantity called "redundancy" has been employed. It is found that Rényi's measure at low parameter values possesses a better delineating feature of binding sites (of binding regions) than Shannon's measure. The critical value of the parameter is chosen with an outlier criterion.
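To make the comparison between the two uncertainty measures concrete, here is a minimal Python sketch that computes the Shannon entropy and the Rényi entropy of order alpha for each column of a set of aligned sites. The redundancy definition used here (1 minus entropy divided by the maximum entropy for a four-letter alphabet) is an assumption; the abstract names the quantity but does not define it. The aligned sites are hypothetical.

```python
import math
from collections import Counter


def column_probs(column):
    """Relative frequency of each symbol observed in one alignment column."""
    counts = Counter(column)
    n = len(column)
    return [c / n for c in counts.values()]


def shannon_entropy(probs):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def renyi_entropy(probs, alpha):
    """Renyi entropy of order alpha in bits; the limit alpha -> 1 is the Shannon entropy."""
    if alpha == 1:
        return shannon_entropy(probs)
    return math.log2(sum(p ** alpha for p in probs if p > 0)) / (1 - alpha)


def redundancy(entropy, alphabet_size=4):
    """Assumed definition: 1 - H / H_max, with H_max = log2(alphabet size)."""
    return 1 - entropy / math.log2(alphabet_size)


# Hypothetical aligned binding sites (illustration only).
sites = ["TATAAT", "TATGAT", "TACAAT", "TATAAT"]
for pos, column in enumerate(zip(*sites)):
    probs = column_probs(column)
    h_shannon = shannon_entropy(probs)
    h_renyi = renyi_entropy(probs, alpha=0.5)
    print(pos, round(redundancy(h_shannon), 3), round(redundancy(h_renyi), 3))
```

For a fixed distribution the Rényi entropy does not increase as alpha grows, so low alpha values weight rare bases more heavily than Shannon's measure does.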
DNA methylation of retrotransposons, DNA transposons and genes in sugar beet (Beta vulgaris L.).

PubMed

Zakrzewski, Falk; Schmidt, Martin; Van Lijsebettens, Mieke; Schmidt, Thomas

2017-06-01

The methylation of cytosines shapes the epigenetic landscape of plant genomes, coordinates transgenerational epigenetic inheritance, represses the activity of transposable elements (TEs), affects gene expression and, hence, can influence the phenotype. Sugar beet (Beta vulgaris ssp. vulgaris), an important crop that accounts for 30% of worldwide sugar needs, has a relatively small genome size (758 Mbp) consisting of approximately 485 Mbp repetitive DNA (64%), in particular satellite DNA, retrotransposons and DNA transposons. Genome-wide cytosine methylation in the sugar beet genome was studied in leaves and leaf-derived callus with a focus on repetitive sequences, including retrotransposons and DNA transposons, the major groups of repetitive DNA sequences, and compared with gene methylation. Genes showed a specific methylation pattern for CG, CHG (H = A, C, and T) and CHH sites, whereas the TE pattern differed, depending on the TE class (class 1, retrotransposons and class 2, DNA transposons). Along genes and TEs, CG and CHG methylation was higher than that of adjacent genomic regions. In contrast to the relatively low CHH methylation in retrotransposons and genes, the level of CHH methylation in DNA transposons was strongly increased, pointing to a functional role of asymmetric methylation in DNA transposon silencing. Comparison of genome-wide DNA methylation between sugar beet leaves and callus revealed a differential methylation upon tissue culture. Potential epialleles were hypomethylated (lower methylation) at CG and CHG sites in retrotransposons and genes and hypermethylated (higher methylation) at CHH sites in DNA transposons of callus when compared with leaves. © 2017 The Authors The Plant Journal © 2017 John Wiley & Sons Ltd.

Intraspecific differentiation of Paramecium novaurelia strains (Ciliophora, Protozoa) inferred from phylogenetic analysis of ribosomal and mitochondrial DNA variation.

PubMed

Tarcz, Sebastian

2013-01-01

Paramecium novaurelia Beale and Schneller, 1954, was first found in Scotland and is known to occur mainly in Europe, where it is the most common species of the P. aurelia complex. In recent years, two non-European localities have been described: Turkey and the United States of America. This article presents the analysis of intraspecific variability among 25 strains of P. novaurelia with the application of ribosomal and mitochondrial loci (ITS1-5.8S-ITS2, 5' large subunit rDNA (5'LSU rDNA) and cytochrome c oxidase subunit 1 (COI) mtDNA). The mean distance observed for all of the studied P. novaurelia sequence pairs was p=0.008/0.016/0.092 (ITS1-5.8S-ITS2/5'LSU rDNA/COI). Phylogenetic trees (NJ/MP/BI) based on a comparison of all of the analysed sequences show that the studied strains of P. novaurelia form a distinct clade, separate from the P. caudatum outgroup, and are divided into two clusters (A and B) and two branches (C and D). The occurrence of substantial genetic differentiation within P. novaurelia, confirmed by the analysed DNA fragments, indicates a rapid evolution of particular species within the Paramecium genus. Copyright © 2012 Elsevier GmbH. All rights reserved.
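The mean p-distance quoted in this abstract is the average proportion of differing sites over all sequence pairs. A minimal Python sketch follows, assuming pre-aligned sequences of equal length and pairwise deletion of gap positions; the gap handling is an assumption, since the abstract does not describe it, and the toy sequences are hypothetical.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences, ignoring gap positions."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in compared) / len(compared)


def mean_p_distance(seqs):
    """Average p-distance over all unordered pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


# Hypothetical aligned fragments (illustration only).
toy = ["ACGTACGTAC", "ACGTACGTAA", "ACGAACGTAA"]
print(round(mean_p_distance(toy), 4))
```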
Characterization of infectious Murray Valley encephalitis virus derived from a stably cloned genome-length cDNA.

PubMed

Hurrelbrink, R J; Nestorowicz, A; McMinn, P C

1999-12-01

An infectious cDNA clone of Murray Valley encephalitis virus prototype strain 1-51 (MVE-1-51) was constructed by stably inserting genome-length cDNA into the low-copy-number plasmid vector pMC18. Designated pMVE-1-51, the clone consisted of genome-length cDNA of MVE-1-51 under the control of a T7 RNA polymerase promoter. The clone was constructed by using existing components of a cDNA library, in addition to cDNA of the 3' terminus derived by RT-PCR of poly(A)-tailed viral RNA. Upon comparison with other flavivirus sequences, the previously undetermined sequence of the 3' UTR was found to contain elements conserved throughout the genus Flavivirus. RNA transcribed from pMVE-1-51 and subsequently transfected into BHK-21 cells generated infectious virus. The plaque morphology, replication kinetics and antigenic profile of clone-derived virus (CDV-1-51) were similar to those of the parental virus in vitro. Furthermore, the virulence properties of CDV-1-51 and MVE-1-51 (LD(50) values and mortality profiles) were found to be identical in vivo in the mouse model. Through site-directed mutagenesis, the infectious clone should serve as a valuable tool for investigating the molecular determinants of virulence in MVE virus.
Transcriptome Analysis and Comparison of Marmota monax and Marmota himalayana.

PubMed

Liu, Yanan; Wang, Baoju; Wang, Lu; Vikash, Vikash; Wang, Qin; Roggendorf, Michael; Lu, Mengji; Yang, Dongliang; Liu, Jia

2016-01-01

The Eastern woodchuck (Marmota monax) is a classical animal model for studying hepatitis B virus (HBV) infection and hepatocellular carcinoma (HCC) in humans. Recently, we found that Marmota himalayana, an Asian animal species closely related to Marmota monax, is susceptible to woodchuck hepatitis virus (WHV) infection and can be used as a new mammalian model for HBV infection. However, the lack of genomic sequence information of both Marmota models strongly limited their application breadth and depth. To address this major obstacle of the Marmota models, we utilized Illumina RNA-Seq technology to sequence the cDNA libraries of liver and spleen samples of two Marmota monax and four Marmota himalayana. In total, over 13 billion nucleotide bases were sequenced and approximately 1.5 billion clean reads were obtained. Following assembly, 106,496 consensus sequences of Marmota monax and 78,483 consensus sequences of Marmota himalayana were detected. For functional annotation, in total 73,603 Unigenes of Marmota monax and 78,483 Unigenes of Marmota himalayana were identified using different databases (NR, NT, Swiss-Prot, KEGG, COG, GO). The Unigenes were aligned by blastx to protein databases to decide the coding DNA sequences (CDS) and in total 41,247 CDS of Marmota monax and 34,033 CDS of Marmota himalayana were predicted. The single nucleotide polymorphisms (SNPs) and the simple sequence repeats (SSRs) were also analyzed for all Unigenes obtained. Moreover, a large-scale transcriptome comparison was performed and revealed a high similarity in transcriptome sequences between the two marmota species. Our study provides an extensive amount of novel sequence information for Marmota monax and Marmota himalayana. 
This information may serve as a valuable genomics resource for further molecular, developmental and comparative evolutionary studies, as well as for the identification and characterization of functional genes that are involved in WHV infection and HCC development in the woodchuck model.
Transcriptome Analysis and Comparison of Marmota monax and Marmota himalayana

PubMed Central

Wang, Lu; Vikash, Vikash; Wang, Qin; Roggendorf, Michael; Lu, Mengji; Yang, Dongliang; Liu, Jia

2016-01-01

The Eastern woodchuck (Marmota monax) is a classical animal model for studying hepatitis B virus (HBV) infection and hepatocellular carcinoma (HCC) in humans. Recently, we found that Marmota himalayana, an Asian animal species closely related to Marmota monax, is susceptible to woodchuck hepatitis virus (WHV) infection and can be used as a new mammalian model for HBV infection. However, the lack of genomic sequence information of both Marmota models strongly limited their application breadth and depth. To address this major obstacle of the Marmota models, we utilized Illumina RNA-Seq technology to sequence the cDNA libraries of liver and spleen samples of two Marmota monax and four Marmota himalayana. In total, over 13 billion nucleotide bases were sequenced and approximately 1.5 billion clean reads were obtained. Following assembly, 106,496 consensus sequences of Marmota monax and 78,483 consensus sequences of Marmota himalayana were detected. For functional annotation, in total 73,603 Unigenes of Marmota monax and 78,483 Unigenes of Marmota himalayana were identified using different databases (NR, NT, Swiss-Prot, KEGG, COG, GO). The Unigenes were aligned by blastx to protein databases to decide the coding DNA sequences (CDS) and in total 41,247 CDS of Marmota monax and 34,033 CDS of Marmota himalayana were predicted. The single nucleotide polymorphisms (SNPs) and the simple sequence repeats (SSRs) were also analyzed for all Unigenes obtained. Moreover, a large-scale transcriptome comparison was performed and revealed a high similarity in transcriptome sequences between the two marmota species. Our study provides an extensive amount of novel sequence information for Marmota monax and Marmota himalayana. 
This information may serve as a valuable genomics resource for further molecular, developmental and comparative evolutionary studies, as well as for the identification and characterization of functional genes that are involved in WHV infection and HCC development in the woodchuck model. PMID:27806133
Comparison of cancer-associated genetic abnormalities in columnar-lined esophagus tissues with and without goblet cells.

PubMed

Bandla, Santhoshi; Peters, Jeffrey H; Ruff, David; Chen, Shiaw-Min; Li, Chieh-Yuan; Song, Kunchang; Thoms, Kimberly; Litle, Virginia R; Watson, Thomas; Chapurin, Nikita; Lada, Michal; Pennathur, Arjun; Luketich, James D; Peterson, Derick; Dulak, Austin; Lin, Lin; Bass, Adam; Beer, David G; Godfrey, Tony E; Zhou, Zhongren

2014-07-01

To determine and compare the frequency of cancer-associated genetic abnormalities in esophageal metaplasia biopsies with and without goblet cells. Barrett's esophagus is associated with increased risk of esophageal adenocarcinoma (EAC), but the appropriate histologic definition of Barrett's esophagus is debated. Intestinal metaplasia (IM) is defined by the presence of goblet cells whereas nongoblet cell metaplasia (NGM) lacks goblet cells. Both have been implicated in EAC risk but this is controversial. Although IM is known to harbor genetic changes associated with EAC, little is known about NGM. We hypothesized that if NGM and IM confer similar EAC risk, then they would harbor similar genetic aberrations in genes associated with EAC. Ninety frozen NGM, IM, and normal tissues from 45 subjects were studied. DNA copy number abnormalities were identified using microarrays and fluorescence in situ hybridization. Targeted sequencing of all exons from 20 EAC-associated genes was performed on metaplasia biopsies using Ion AmpliSeq DNA sequencing. Frequent copy number abnormalities targeting cancer-associated genes were found in IM whereas no such changes were observed in NGM. In 1 subject, fluorescence in situ hybridization confirmed loss of CDKN2A and amplification of chromosome 8 in IM but not in a nearby NGM biopsy. Targeted sequencing revealed 11 nonsynonymous mutations in 16 IM samples and 2 mutations in 19 NGM samples. This study reports the largest and most comprehensive comparison of DNA aberrations in IM and NGM genomes. Our results show that IM has a much higher frequency of cancer-associated mutations than NGM.
Application of Quaternion in improving the quality of global sequence alignment scores for an ambiguous sequence target in Streptococcus pneumoniae DNA

NASA Astrophysics Data System (ADS)

Lestari, D.; Bustamam, A.; Novianti, T.; Ardaneswari, G.

2017-07-01

A DNA sequence can be defined as a succession of letters representing the order of nucleotides within DNA, drawn from the four base codes adenine (A), guanine (G), cytosine (C), and thymine (T). The precise sequence is determined using DNA sequencing methods and technologies, which have been developed since the 1970s and have matured into advanced, high-throughput platforms. DNA sequencing has greatly accelerated biological and medical research and discovery. In some cases, however, sequencing produces ambiguous results in which it is difficult to determine whether a given base is A, T, G, or C. To address this problem, we introduce an alternative representation of DNA codes, the Quaternion Q = (PA, PT, PG, PC), where PA, PT, PG, and PC are the probabilities that the bases A, T, G, and C appear at the position represented by Q, and PA + PT + PG + PC = 1. Using Quaternion representations, we construct an improved scoring matrix for global sequence alignment by applying a dot product method. This scoring matrix produces higher-quality match and mismatch scores between two DNA base codes. We implemented the Needleman-Wunsch global sequence alignment algorithm in Octave to analyze our target sequence, which contains some ambiguous sequence data. The subject sequences are DNA sequences of the Streptococcus pneumoniae family obtained from GenBank, while the target DNA sequence was received from our collaborator's database. We found that the Quaternion representations improve the quality of the sequence alignment score, and we conclude that the target DNA sequence has maximum similarity with Streptococcus pneumoniae.
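The idea of scoring ambiguous bases through a dot product of probability vectors can be sketched as follows. This is an illustrative Python version, not the authors' Octave implementation: the use of the raw dot product as the substitution score, the linear gap penalty of -1, and the equal-probability vectors assigned to the ambiguity codes R, Y and N are all assumptions made for the example.

```python
# Probability vectors in the order (PA, PT, PG, PC); each sums to 1.
# The equal-probability values for ambiguity codes are assumed for illustration.
QUATERNIONS = {
    "A": (1.0, 0.0, 0.0, 0.0),
    "T": (0.0, 1.0, 0.0, 0.0),
    "G": (0.0, 0.0, 1.0, 0.0),
    "C": (0.0, 0.0, 0.0, 1.0),
    "R": (0.5, 0.0, 0.5, 0.0),      # A or G
    "Y": (0.0, 0.5, 0.0, 0.5),      # C or T
    "N": (0.25, 0.25, 0.25, 0.25),  # any base
}


def dot_score(q1, q2):
    """Substitution score: dot product of two base-probability vectors."""
    return sum(a * b for a, b in zip(q1, q2))


def needleman_wunsch(seq1, seq2, gap=-1.0):
    """Optimal global alignment score with dot-product substitution scores and a linear gap penalty."""
    q1 = [QUATERNIONS[s] for s in seq1]
    q2 = [QUATERNIONS[s] for s in seq2]
    n, m = len(q1), len(q2)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + dot_score(q1[i - 1], q2[j - 1]),  # align two bases
                score[i - 1][j] + gap,                                  # gap in seq2
                score[i][j - 1] + gap,                                  # gap in seq1
            )
    return score[n][m]


print(needleman_wunsch("ACNT", "ACGT"))
```

Under this scheme an ambiguous position contributes partial credit in proportion to the probability that it matches, instead of being scored as a full mismatch.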
A Comparison Between Genotyping-by-sequencing and Array-based Scoring of SNPs for Genomic Prediction Accuracy in Winter Wheat

USDA-ARS's Scientific Manuscript database

The utilization of DNA molecular markers in plant breeding to maximize selection response via marker assisted selection (MAS) and genomic selection (GS) has the potential to revolutionize plant breeding. A key factor affecting GS applicability is the choice of molecular marker platform. Genotypying-...
Cloning of a cDNA encoding bovine mitochondrial NADP(+)-specific isocitrate dehydrogenase and structural comparison with its isoenzymes from different species.

PubMed Central

Huh, T L; Ryu, J H; Huh, J W; Sung, H C; Oh, I U; Song, B J; Veech, R L

1993-01-01

Mitochondrial NADP(+)-specific isocitrate dehydrogenase (IDP) was co-purified with the pyruvate dehydrogenase complex from bovine kidney mitochondria. The determination of its N-terminal 16-amino-acid sequence revealed that it is highly similar to the IDP from yeast. A cDNA clone (1.8 kb long) encoding this protein was isolated from a bovine kidney lambda gt11 cDNA library using a synthetic oligodeoxynucleotide. The deduced protein sequence of this cDNA clone rendered a precursor protein of 452 amino-acid residues (50,830 Da) and a mature protein of 413 amino-acid residues (46,519 Da). It is 100% identical to the internal tryptic peptide sequences of the autologous form from pig heart and 62% similar to that from yeast. However, it shares little similarity with the mitochondrial NAD(+)-specific isoenzyme from yeast. Structural analyses of the deduced proteins of IDP isoenzymes from different species indicated that similarity exists in certain regions, which may represent the common domains for the active sites or coenzyme-binding sites. In Northern-blot analysis, one species of mRNA (about 2.2 kb for both bovine and human) was hybridized with a 32P-labelled cDNA probe. Southern-blot analysis of genomic DNAs verified simple patterns of hybridization with this cDNA. These results strongly indicate that the mitochondrial IDP may be derived from a single gene family which does not appear to be closely related to that of the NAD(+)-specific isoenzyme. PMID:8318002
Complete nucleotide sequence of spring beauty latent virus, a bromovirus infectious to Arabidopsis thaliana.

PubMed

Fujisaki, K; Hagihara, F; Kaido, M; Mise, K; Okuno, T

2003-01-01

Spring beauty latent virus (SBLV), a bromovirus, systemically and efficiently infected Arabidopsis thaliana, whereas the well-studied bromoviruses brome mosaic virus (BMV) and cowpea chlorotic mottle virus (CCMV) did not infect and poorly infected A. thaliana, respectively. We constructed biologically active cDNA clones of SBLV genomic RNAs and determined their complete nucleotide sequences. Interestingly, SBLV RNA3 contains both the box B motif in the intercistronic region, as does BMV, and the subgenomic promoter-like sequence in the 5' noncoding region, as does CCMV. Sequence comparisons of SBLV, BMV, CCMV, and broad bean mottle virus demonstrated that SBLV is closely related to BMV and CCMV.
Assessment of the quality of DNA from various formalin-fixed paraffin-embedded (FFPE) tissues and the use of this DNA for next-generation sequencing (NGS) with no artifactual mutation

PubMed Central

Einaga, Naoki; Yoshida, Akio; Noda, Hiroko; Suemitsu, Masaaki; Nakayama, Yuki; Sakurada, Akihisa; Kawaji, Yoshiko; Yamaguchi, Hiromi; Sasaki, Yasushi; Tokino, Takashi; Esumi, Mariko

2017-01-01

Formalin-fixed, paraffin-embedded (FFPE) tissues used for pathological diagnosis are valuable for studying cancer genomics. In particular, laser-capture microdissection of target cells determined by histopathology combined with FFPE tissue section immunohistochemistry (IHC) enables precise analysis by next-generation sequencing (NGS) of the genetic events occurring in cancer. The result is a new strategy for a pathological tool for cancer diagnosis: ‘microgenomics’. To more conveniently and precisely perform microgenomics, we revealed by systematic analysis the following three details regarding FFPE DNA compared with paired frozen tissue DNA. 1) The best quality of FFPE DNA is obtained by tissue fixation with 10% neutral buffered formalin for 1 day and heat treatment of tissue lysates at 95°C for 30 minutes. 2) IHC staining of FFPE tissues decreases the quantity and quality of FFPE DNA to one-fourth, and antigen retrieval (at 120°C for 15 minutes, pH 6.0) is the major reason for this decrease. 3) FFPE DNA prepared as described herein is sufficient for NGS. For non-mutated tissue specimens, no artifactual mutation occurs during FFPE preparation, as shown by precise comparison of NGS of FFPE DNA and paired frozen tissue DNA followed by validation. These results demonstrate that even FFPE tissues used for routine clinical diagnosis can be utilized to obtain reliable NGS data if appropriate conditions of fixation and validation are applied. PMID:28498833
Bacterial communities in Great Barrier Reef calcareous sediments: Contrasting 16S rDNA libraries from nearshore and outer shelf reefs

NASA Astrophysics Data System (ADS)

Uthicke, S.; McGuire, K.

2007-03-01

Bacterial communities in eight 16S rDNA clone libraries from calcareous sediments were investigated to provide an assessment of the bacterial diversity on sediments of the Great Barrier Reef (GBR) and to investigate differences due to decreased water quality. Sample effort was spread across two locations on each of four coral reefs, with two reefs located nearshore and two reefs on the outer shelf to allow robust statistical comparison of nearshore reefs (subjected to enhanced runoff) and outer shelf reefs (pristine conditions). Out of 221 non-chimeric sequences, 189 (85.5%) were unique and only one sequence occurred in more than one library. Rarefaction analyses and coverage calculations indicated that only a small fraction of the diversity was sampled. Cluster analyses and comparison to published sequences indicated that sequences retrieved belonged to the α, γ and δ subdivisions of the Proteobacteria (6.8, 29.4 and 13.6% of the total, respectively), Cytophaga-Flavobacteria-Bacteroidetes (CFB) group (20.4%), Cyanobacteria (5.4%), Planctomycetaceae (7.7%), Verrucomicrobiaceae (6.8%), and Acidobacteriaceae (2.7%). Analysis of Similarity (ANOSIM, based on grouping all retrieved sequences into 9 phylogenetic groups) indicated that subtle differences do exist in the community composition between nearshore and outer shelf reefs. Similarity percentage analysis (SIMPER) indicated that Acidobacteriaceae and Cyanobacteriaceae were the main contributors to the dissimilarity. A significant difference between bacteria on nearshore and outer shelf reefs also existed on the molecular level (FST = 0.008, p = 0.007 for all samples; 0.006, p = 0.022 when repeated sequences within libraries were removed). Thus, bacterial communities on carbonate sediments investigated were highly diverse and differences in community composition may provide important leads for the search for indicator species or communities for water quality differences.
Large-Scale Concatenation cDNA Sequencing

PubMed Central

Yu, Wei; Andersson, Björn; Worley, Kim C.; Muzny, Donna M.; Ding, Yan; Liu, Wen; Ricafrente, Jennifer Y.; Wentland, Meredith A.; Lennon, Greg; Gibbs, Richard A.

1997-01-01

A total of 100 kb of DNA derived from 69 individual human brain cDNA clones of 0.7–2.0 kb were sequenced by concatenated cDNA sequencing (CCS), whereby multiple individual DNA fragments are sequenced simultaneously in a single shotgun library. The method yielded accurate sequences and a similar efficiency compared with other shotgun libraries constructed from single DNA fragments (>20 kb). Computer analyses were carried out on 65 cDNA clone sequences and their corresponding end sequences to examine both nucleic acid and amino acid sequence similarities in the databases. Thirty-seven clones revealed no DNA database matches, 12 clones generated exact matches (≥98% identity), and 16 clones generated nonexact matches (57%–97% identity) to either known human or other species genes. Of those 28 matched clones, 8 had corresponding end sequences that failed to identify similarities. In a protein similarity search, 27 clone sequences displayed significant matches, whereas only 20 of the end sequences had matches to known protein sequences. Our data indicate that full-length cDNA insert sequences provide significantly more nucleic acid and protein sequence similarity matches than expressed sequence tags (ESTs) for database searching. [All 65 cDNA clone sequences described in this paper have been submitted to the GenBank data library under accession nos. U79240–U79304.] PMID:9110174
The mitochondrial genome of Moniliophthora roreri, the frosty pod rot pathogen of cacao.

PubMed

Costa, Gustavo G L; Cabrera, Odalys G; Tiburcio, Ricardo A; Medrano, Francisco J; Carazzolle, Marcelo F; Thomazella, Daniela P T; Schuster, Stephen C; Carlson, John E; Guiltinan, Mark J; Bailey, Bryan A; Mieczkowski, Piotr; Pereira, Gonçalo A G; Meinhardt, Lyndel W

2012-05-01

In this study, we report the sequence of the mitochondrial (mt) genome of the Basidiomycete fungus Moniliophthora roreri, which is the etiologic agent of frosty pod rot of cacao (Theobroma cacao L.). We also compare it to the mtDNA from the closely-related species Moniliophthora perniciosa, which causes witches' broom disease of cacao. The 94 Kb mtDNA genome of M. roreri has a circular topology and codes for the typical 14 mt genes involved in oxidative phosphorylation. It also codes for both rRNA genes, a ribosomal protein subunit, 13 intronic open reading frames (ORFs), and a full complement of 27 tRNA genes. The conserved genes of M. roreri mtDNA are completely syntenic with homologous genes of the 109 Kb mtDNA of M. perniciosa. As in M. perniciosa, M. roreri mtDNA contains a high number of hypothetical ORFs (28), a remarkable feature that makes Moniliophthoras the largest reservoir of hypothetical ORFs among sequenced fungal mtDNA. Additionally, the mt genome of M. roreri has three free invertron-like linear mt plasmids, one of which is very similar to that previously described as integrated into the main M. perniciosa mtDNA molecule. Moniliophthora roreri mtDNA also has a region of suspected plasmid origin containing 15 hypothetical ORFs distributed in both strands. One of these ORFs is similar to an ORF in the mtDNA gene encoding DNA polymerase in Pleurotus ostreatus. The comparison to M. perniciosa showed that the 15 Kb difference in mtDNA sizes is mainly attributed to a lower abundance of repetitive regions in M. roreri (5.8 Kb vs 20.7 Kb). The most notable differences between M. roreri and M. perniciosa mtDNA are attributed to repeats and regions of plasmid origin. These elements might have contributed to the rapid evolution of mtDNA. Since M. roreri is the second species of the genus Moniliophthora whose mtDNA genome has been sequenced, the data presented here contribute valuable information for understanding the evolution of fungal mt genomes among closely-related species. Crown Copyright © 2012. Published by Elsevier Ltd. All rights reserved.
mtDNA and the Origin of the Icelanders: Deciphering Signals of Recent Population History

PubMed Central

Helgason, Agnar; Sigurðardóttir, Sigrún; Gulcher, Jeffrey R.; Ward, Ryk; Stefánsson, Kári

2000-01-01

Previous attempts to investigate the origin of the Icelanders have provided estimates of ancestry ranging from a 98% British Isles contribution to an 86% Scandinavian contribution. We generated mitochondrial sequence data for 401 Icelandic individuals and compared these data with >2,500 other European sequences from published sources, to determine the probable origins of women who contributed to Iceland’s settlement. Although the mean number of base-pair differences is high in the Icelandic sequences and they are widely distributed in the overall European mtDNA phylogeny, we find a smaller number of distinct mitochondrial lineages, compared with most other European populations. The frequencies of a number of mtDNA lineages in the Icelanders deviate noticeably from those in neighboring populations, suggesting that founder effects and genetic drift may have had a considerable influence on the Icelandic gene pool. This is in accordance with available demographic evidence about Icelandic population history. A comparison with published mtDNA lineages from European populations indicates that, whereas most founding females probably originated from Scandinavia and the British Isles, lesser contributions from other populations may also have taken place. We present a highly resolved phylogenetic network for the Icelandic data, identifying a number of previously unreported mtDNA lineage clusters and providing a detailed depiction of the evolutionary relationships between European mtDNA clusters. Our findings indicate that European populations contain a large number of closely related mitochondrial lineages, many of which have not yet been sampled in the current comparative data set. Consequently, substantial increases in sample sizes that use mtDNA data will be needed to obtain valid estimates of the diverse ancestral mixtures that ultimately gave rise to contemporary populations. PMID:10712214
Comparison of immunohistochemistry, DNA sequencing and allele-specific PCR for the detection of IDH1 mutations in gliomas.

PubMed

Loussouarn, Delphine; Le Loupp, Anne-Gaëlle; Frenel, Jean-Sébastien; Leclair, François; Von Deimling, Andreas; Aumont, Maud; Martin, Stéphane; Campone, Mario; Denis, Marc G

2012-06-01

Previous studies have identified mutations of the isocitrate dehydrogenase 1 (IDH1) gene in more than 70% of World Health Organization (WHO) grade II and III gliomas. The most frequent mutation leads to a specific amino acid change from arginine to histidine at codon 132 (c.395G>A, p.R132H). IDH1 mutated tumors have a better prognosis than IDH1 non-mutated tumors. The aim of our study was to evaluate and compare the methods of mIDH1 R132H immunohistochemistry, allele-specific PCR and DNA sequencing for determination of IDH1 status. We performed a retrospective study of 91 patients with WHO grade II (n=43) and III (n=48) oligodendrogliomas. A fragment of exon 4 spanning the sequence encoding the catalytic domain of IDH1, including codon 132, was amplified and sequenced using standard conditions. Allele-specific amplification was performed using two forward primers with variations in their 3' nucleotides such that each was specific for the wild-type or the mutated variant, and one reverse primer. Immunohistochemistry was performed with mouse monoclonal mIDH1 R132H. DNA was extracted from FFPE sections following macrodissection. IDH1 mutations were found in 55/90 patients (61.1%) by direct sequencing. R132H mutations were found in 47/55 patients (85.4%). The results of the allele-specific PCR positively correlated with those from DNA sequencing. Other mutations (p.R132C, p.R132S and p.R132G) were found by DNA sequencing in 3, 3 and 2 tumors, respectively (8/55 patients, 14.6%). mIDH1 R132H immunostaining was found in the 47 patients presenting the R132H mutation (sensitivity 47/47, 100% for this mutation). None of the tumors presenting a wild-type IDH1 gene were stained (specificity 35/35, 100%). Our results demonstrate that immunohistochemistry using the mIDH1 R132H antibody and allele-specific amplification are highly sensitive techniques to detect the most frequent mutation of the IDH1 gene.
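The sensitivity and specificity quoted in the abstract above follow from simple count ratios. The sketch below is illustrative only: the function names are hypothetical, and the false-negative and false-positive counts of zero are inferred from the abstract's statements that all 47 R132H cases stained and that no wild-type tumor stained.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of mutation-positive cases that the test detects."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of mutation-negative cases that the test leaves unflagged."""
    return true_neg / (true_neg + false_pos)


# Counts taken from the abstract (immunohistochemistry vs. sequencing).
ihc_sensitivity = sensitivity(true_pos=47, false_neg=0)   # 47/47
ihc_specificity = specificity(true_neg=35, false_pos=0)   # 35/35

# Share of sequenced patients carrying any IDH1 mutation (55 of 90).
mutation_rate_pct = round(100 * 55 / 90, 1)

print(ihc_sensitivity, ihc_specificity, mutation_rate_pct)
```

With real validation data the zero counts would rarely hold, so confidence intervals on both ratios would normally be reported alongside the point estimates.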
Synthesis of DNA

DOEpatents

Mariella, Jr., Raymond P.

2008-11-18

A method of synthesizing a desired double-stranded DNA of a predetermined length and of a predetermined sequence. Preselected sequence segments that will complete the desired double-stranded DNA are determined. Preselected segment sequences of DNA that will be used to complete the desired double-stranded DNA are provided. The preselected segment sequences of DNA are assembled to produce the desired double-stranded DNA.
Primary structures of ribosomal proteins from the archaebacterium Halobacterium marismortui and the eubacterium Bacillus stearothermophilus.

PubMed

Arndt, E; Scholzen, T; Krömer, W; Hatakeyama, T; Kimura, M

1991-06-01

Approximately 40 ribosomal proteins from each of Halobacterium marismortui and Bacillus stearothermophilus have been sequenced either by direct protein sequence analysis or by DNA sequence analysis of the appropriate genes. The comparison of the amino acid sequences from the archaebacterium H. marismortui with the available ribosomal proteins from the eubacterial and eukaryotic kingdoms revealed four different groups of proteins: 24 proteins are related to both eubacterial and eukaryotic proteins; eleven proteins are exclusively related to eukaryotic counterparts; for three proteins only eubacterial relatives could be found; and for another three proteins no counterpart could be found. The similarities of the halobacterial ribosomal proteins are in general somewhat higher to their eukaryotic than to their eubacterial counterparts. The comparison of B. stearothermophilus proteins with their E. coli homologues showed that the proteins evolved at different rates. Some proteins are highly conserved with 64-76% identity, others are poorly conserved with only 25-34% identical amino acid residues.
Genetic analysis of Fasciola isolates from cattle in Korea based on second internal transcribed spacer (ITS-2) sequence of nuclear ribosomal DNA.

PubMed

Choe, Se-Eun; Nguyen, Thuy Thi-Dieu; Kang, Tae-Gyu; Kweon, Chang-Hee; Kang, Seung-Won

2011-09-01

Nuclear ribosomal DNA sequence of the second internal transcribed spacer (ITS-2) has been used efficiently to identify the liver fluke species collected from different hosts and various geographic regions. ITS-2 sequences of 19 Fasciola samples collected from Korean native cattle were determined and compared. Sequence comparison including ITS-2 sequences of isolates from this study and reference sequences from Fasciola hepatica and Fasciola gigantica and intermediate Fasciola in Genbank revealed seven identical variable sites of investigated isolates. Among 19 samples, 12 individuals had ITS-2 sequences completely identical to that of pure F. hepatica, five possessed the sequences identical to F. gigantica type, whereas two shared the sequence of both F. hepatica and F. gigantica. No variations in length and nucleotide composition of ITS-2 sequence were observed within isolates that belonged to F. hepatica or F. gigantica. At the position of 218, five Fasciola containing a single-base substitution (C>T) formed a distinct branch inside the F. gigantica-type group which was similar to those of Asian-origin isolates. The phylogenetic tree of the Fasciola spp. based on complete ITS-2 sequences from this study and other representative isolates in different locations clearly showed that pure F. hepatica, F. gigantica type and intermediate Fasciola were observed. The result also provided additional genetic evidence for the existence of three forms of Fasciola isolated from native cattle in Korea by genetic approach using ITS-2 sequence.
Nanopore Technology: A Simple, Inexpensive, Futuristic Technology for DNA Sequencing.

PubMed

Gupta, P D

2016-10-01

In health care, the importance of DNA sequencing is fully established. Sanger capillary electrophoresis DNA sequencing is time-consuming and cumbersome, and hence expensive. Because of its versatility, DNA sequencing has lately become a household name, and there is therefore an urgent need for a simple, fast, inexpensive DNA sequencing technology. Efforts made at the beginning of this century led to the development of nanopore DNA sequencing technology; it is still in its infancy, but it is nevertheless the futuristic technology.
DNA pooling: a comprehensive, multi-stage association analysis of ACSL6 and SIRT5 polymorphisms in schizophrenia.

PubMed

Chowdari, K V; Northup, A; Pless, L; Wood, J; Joo, Y H; Mirnics, K; Lewis, D A; Levitt, P R; Bacanu, S-A; Nimgaonkar, V L

2007-04-01

Many candidate gene association studies have evaluated incomplete, unrepresentative sets of single nucleotide polymorphisms (SNPs), producing non-significant results that are difficult to interpret. Using a rapid, efficient strategy designed to investigate all common SNPs, we tested associations between schizophrenia and two positional candidate genes: ACSL6 (Acyl-Coenzyme A synthetase long-chain family member 6) and SIRT5 (silent mating type information regulation 2 homologue 5). We initially evaluated the utility of DNA sequencing traces to estimate SNP allele frequencies in pooled DNA samples. The mean variances for the DNA sequencing estimates were acceptable and were comparable to other published methods (mean variance: 0.0008, range 0-0.0119). Using pooled DNA samples from cases with schizophrenia/schizoaffective disorder (Diagnostic and Statistical Manual of Mental Disorders edition IV criteria) and controls (n=200, each group), we next sequenced all exons, introns and flanking upstream/downstream sequences for ACSL6 and SIRT5. Among 69 identified SNPs, case-control allele frequency comparisons revealed nine suggestive associations (P<0.2). Each of these SNPs was next genotyped in the individual samples composing the pools. A suggestive association with rs11743803 at ACSL6 remained (allele-wise P=0.02), with diminished evidence in an extended sample (448 cases, 554 controls, P=0.062). In conclusion, we propose a multi-stage method for comprehensive, rapid, efficient and economical genetic association analysis that enables simultaneous SNP detection and allele frequency estimation in large samples. This strategy may be particularly useful for research groups lacking access to high throughput genotyping facilities. Our analyses did not yield convincing evidence for associations of schizophrenia with ACSL6 or SIRT5.

Identification and properties of the largest subunit of the DNA-dependent RNA polymerase of fish lymphocystis disease virus: dramatic difference in the domain organization in the family Iridoviridae.

PubMed

Müller, M; Schnitzler, P; Koonin, E V; Darai, G

1995-05-01

Cytoplasmic DNA viruses encode a DNA-dependent RNA polymerase (DdRP) that is essential for transcription of viral genes. The amino acid sequences of the known largest subunits of DdRPs from different species contain highly conserved regions. Oligonucleotide primers, deduced from two conserved domains (RQP[T/S]LH and NADFDGDE) were used for detecting the corresponding gene of fish lymphocystis disease virus (FLCDV), a member of the family Iridoviridae, which replicates in the cytoplasm of infected cells of flatfish. The gene coding for the largest subunit of the DdRP was identified using a PCR-derived probe. The screening of the complete EcoRI gene library of the viral genome led to the identification of the gene locus of the largest subunit of the DdRP within the EcoRI DNA fragment B (12.4 kbp, 0.034 to 0.165 map units). The nucleotide sequence of a part (8334 bp) of the EcoRI DNA fragment B was determined and a large ORF on the lower strand (ATG = 5787; TAA = 2190) was detected which encodes a protein of 1199 amino acids. Comparison of the amino acid sequences of the largest subunits of the DdRP (RPO1) of FLCDV and Chilo iridescent virus (CIV) revealed a dramatic difference in their domain organization. Unlike the 1051 aa RPO1 of CIV, which lacks the C-terminal domain conserved in eukaryotic, eubacterial and other viral RNA polymerases, the 1199 aa RPO1 of FLCDV is fully collinear with its cellular and viral homologues. Despite this difference, comparative analysis of the amino acid sequences of viral and cellular RNA polymerases suggests a common origin for the largest RNA polymerase subunits of FLCDV and CIV.
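The ORF coordinates and the protein length reported above can be cross-checked with one line of arithmetic. This is a minimal sketch under a stated assumption: both coordinates (ATG = 5787, TAA = 2190) are taken to mark the same base position within their respective codons, a convention the abstract does not spell out. The function name is hypothetical.

```python
def codons_between(start_codon_pos: int, stop_codon_pos: int) -> int:
    """Number of coding codons between a start and a stop codon.

    Assumes both positions refer to the same base within their codon
    (for example the first base), so the span is a multiple of three.
    Works for either strand because only the absolute distance is used.
    """
    span = abs(start_codon_pos - stop_codon_pos)
    if span % 3 != 0:
        raise ValueError("coordinates are not in the same reading frame")
    return span // 3


# Coordinates from the abstract; the ORF lies on the lower strand,
# so the start codon has the larger coordinate.
print(codons_between(5787, 2190))
```

Under that assumption the span of 3597 bases corresponds to 1199 codons, which agrees with the 1199 amino acid protein stated in the abstract.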
Structure, inheritance, and expression of hybrid poplar (Populus trichocarpa x Populus deltoides) phenylalanine ammonia-lyase genes.

PubMed Central

Subramaniam, R; Reinold, S; Molitor, E K; Douglas, C J

1993-01-01

A heterologous probe encoding phenylalanine ammonia-lyase (PAL) was used to identify PAL clones in cDNA libraries made with RNA from young leaf tissue of two Populus deltoides x P. trichocarpa F1 hybrid clones. Sequence analysis of a 2.4-kb cDNA confirmed its identity as a full-length PAL clone. The predicted amino acid sequence is conserved in comparison with that of PAL genes from several other plants. Southern blot analysis of poplar genomic DNA from parental and hybrid individuals, restriction site polymorphism in PAL cDNA clones, and sequence heterogeneity in the 3' ends of several cDNA clones suggested that PAL is encoded by at least two genes that can be distinguished by HindIII restriction site polymorphisms. Clones containing each type of PAL gene were isolated from a poplar genomic library. Analysis of the segregation of PAL-specific HindIII restriction fragment-length polymorphisms demonstrated the existence of two independently segregating PAL loci, one of which was mapped to a linkage group of the poplar genetic map. Developmentally regulated PAL expression in poplar was analyzed using RNA blots. Highest expression was observed in young stems, apical buds, and young leaves. Expression was lower in older stems and undetectable in mature leaves. Cellular localization of PAL expression by in situ hybridization showed very high levels of expression in subepidermal cells of leaves early during leaf development. In stems and petioles, expression was associated with subepidermal cells and vascular tissues. PMID:8108506
The genome-wide DNA sequence specificity of the anti-tumour drug bleomycin in human cells.

PubMed

Murray, Vincent; Chen, Jon K; Tanaka, Mark M

2016-07-01

The cancer chemotherapeutic agent, bleomycin, cleaves DNA at specific sites. For the first time, the genome-wide DNA sequence specificity of bleomycin breakage was determined in human cells. Utilising Illumina next-generation DNA sequencing techniques, over 200 million bleomycin cleavage sites were examined to elucidate the bleomycin genome-wide DNA selectivity. The genome-wide bleomycin cleavage data were analysed by four different methods to determine the cellular DNA sequence specificity of bleomycin strand breakage. For the most highly cleaved DNA sequences, the preferred site of bleomycin breakage was at 5'-GT* dinucleotide sequences (where the asterisk indicates the bleomycin cleavage site), with lesser cleavage at 5'-GC* dinucleotides. This investigation also determined longer bleomycin cleavage sequences, with preferred cleavage at 5'-GT*A and 5'-TGT* trinucleotide sequences, and 5'-TGT*A tetranucleotides. For cellular DNA, the hexanucleotide DNA sequence 5'-RTGT*AY (where R is a purine and Y is a pyrimidine) was the most highly cleaved DNA sequence. It was striking that alternating purine-pyrimidine sequences were highly cleaved by bleomycin. The highest intensity cleavage sites in cellular and purified DNA were very similar although there were some minor differences. Statistical nucleotide frequency analysis indicated a G nucleotide was present at the -3 position (relative to the cleavage site) in cellular DNA but was absent in purified DNA.
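The consensus 5'-RTGT*AY given in the abstract above can be expressed as a pattern and located in a sequence. The following is an illustrative sketch, not the authors' analysis pipeline: the function name and the example sequence are hypothetical, only the given strand is scanned, and the reported cleavage index is the position of the T immediately before the asterisk.

```python
import re

# R = purine (A or G), Y = pyrimidine (C or T); the cleaved nucleotide is
# the second T of the hexamer (offset 3 from the start of the match).
# A lookahead is used so that overlapping hexamers are all reported.
MOTIF = re.compile(r"(?=([AG]TGTA[CT]))")
CLEAVED_OFFSET = 3


def find_rtgtay_sites(seq: str) -> list[tuple[int, str, int]]:
    """Return (start, hexamer, cleaved_index) for each 5'-RTGT*AY match.

    Indices are 0-based positions on the strand as given; the reverse
    strand is not examined in this sketch.
    """
    seq = seq.upper()
    return [
        (m.start(), m.group(1), m.start() + CLEAVED_OFFSET)
        for m in MOTIF.finditer(seq)
    ]


# Hypothetical test sequence containing two overlapping matches.
for site in find_rtgtay_sites("CCGTGTATGTACGG"):
    print(site)
```

A genome-wide analysis would also scan the reverse complement and weight sites by observed cleavage intensity, neither of which is attempted here.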
Centromeres: long intergenic spaces with adaptive features.

PubMed

Kanizay, Lisa; Dawe, R Kelly

2009-08-01

Centromeres are composed of inner kinetochore proteins, which are largely conserved across species, and repetitive DNA, which shows comparatively little sequence conservation. Due to this fundamental paradox the formation and maintenance of centromeres remains largely a mystery. However, it has become increasingly clear that a long-standing balance between epigenetic and genetic control governs the interactions of centromeric DNA and inner kinetochore proteins. The comparison of classical neocentromeres in plants, which are entirely genetic in their mode of operation, and clinical neocentromeres, which are sequence-independent, illustrates the conflict between genetics and epigenetics in regions that control their own transmission to progeny. Tandem repeat arrays present in centromeres may have an origin in meiotic drive or other selfish patterns of evolution, as is the case for the CENP-B box and CENP-B protein in human. In grasses retrotransposons have invaded centromeres to the point of complete domination, consequently breaking genetic regulation at these centromeres. The accumulation of tandem repeats and transposons causes centromeres to expand in size, effectively pushing genes to the sides and opening the centromere to ever fewer constraints on the DNA sequence. On genetic maps centromeres appear as long intergenic spaces that evolve rapidly and apparently without regard to host fitness.
DNA barcoding for effective biodiversity assessment of a hyperdiverse arthropod group: the ants of Madagascar

PubMed Central

Smith, M. Alex; Fisher, Brian L; Hebert, Paul D.N

2005-01-01

The role of DNA barcoding as a tool to accelerate the inventory and analysis of diversity for hyperdiverse arthropods is tested using ants in Madagascar. We demonstrate how DNA barcoding helps address the failure of current inventory methods to rapidly respond to pressing biodiversity needs, specifically in the assessment of richness and turnover across landscapes with hyperdiverse taxa. In a comparison of inventories at four localities in northern Madagascar, patterns of richness were not significantly different when richness was determined using morphological taxonomy (morphospecies) or sequence divergence thresholds (Molecular Operational Taxonomic Unit(s); MOTU). However, sequence-based methods tended to yield greater richness and significantly lower indices of similarity than morphological taxonomy. MOTU determined using our molecular technique were a remarkably local phenomenon—indicative of highly restricted dispersal and/or long-term isolation. In cases where molecular and morphological methods differed in their assignment of individuals to categories, the morphological estimate was always more conservative than the molecular estimate. In those cases where morphospecies descriptions collapsed distinct molecular groups, sequence divergences of 16% (on average) were contained within the same morphospecies. Such high divergences highlight taxa for further detailed genetic, morphological, life history, and behavioral studies. PMID:16214741
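Grouping sequences into MOTU by a sequence divergence threshold, as described above, can be sketched as single-linkage clustering on pairwise distances. Everything specific here is an assumption made for illustration: the abstract does not state the distance measure, the linkage rule, or the threshold, so the uncorrected p-distance, the union-find clustering, the 0.15 cutoff, and the toy sequences are all hypothetical.

```python
def p_distance(a: str, b: str) -> float:
    """Uncorrected proportion of differing sites between aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def cluster_motus(seqs: list[str], threshold: float) -> list[list[int]]:
    """Single-linkage clusters: indices joined when distance <= threshold."""
    parent = list(range(len(seqs)))

    def find(i: int) -> int:
        # Follow parent links to the cluster root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(seqs)):
        for j in range(i + 1, len(seqs)):
            if p_distance(seqs[i], seqs[j]) <= threshold:
                parent[find(j)] = find(i)

    groups: dict[int, list[int]] = {}
    for i in range(len(seqs)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Toy aligned fragments (hypothetical, far shorter than a real barcode).
toy = ["ACGTACGTAC", "ACGTACGTAT", "TTGTACCTAA"]
print(cluster_motus(toy, threshold=0.15))
```

The pairwise loop is quadratic in the number of sequences, which is acceptable for a sketch but would need a faster approach for large inventories.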
Morphological and Molecular Identification of Globodera pallida Associated with Potato in Idaho

PubMed Central

Skantar, A. M.; Handoo, Z. A.; Carta, L. K.; Chitwood, D. J.

2007-01-01

The identity of a newly discovered population of pale potato cyst nematode Globodera pallida associated with potato in eastern Idaho was established by morphological and molecular methods. Morphometrics of cysts and second-stage juveniles were generally within the expected ranges for G. pallida with some variations noted. The Idaho population and paratype material from Epworth, Lincolnshire, England, both showed variations in tail shape, with bluntly rounded to finely pointed tail termini. Compared to literature values for the paratypes, second-stage juveniles of the Idaho population had a somewhat shorter mean body length, and cysts had a slightly higher mean distance from the anus to the nearest edge of the fenestra. PCR-RFLP of the rDNA ITS region, sequence-specific multiplex PCR and DNA sequence comparisons all confirmed the identity of the Idaho population as G. pallida. The ITS rDNA sequence of the Idaho isolate was identical to those from York, England, and the Netherlands. Species-specific primers that can positively identify the tobacco cyst nematode Globodera tabacum were also developed, providing a new assay for distinguishing this species from G. pallida and the golden potato cyst nematode Globodera rostochiensis. PMID:19259482
Comparison of two PCR-based methods and automated DNA sequencing for prop-1 genotyping in Ames dwarf mice.

PubMed

Gerstner, Arpad; DeFord, James H; Papaconstantinou, John

2003-07-25

Ames dwarfism is caused by a homozygous single nucleotide mutation in the pituitary specific prop-1 gene, resulting in combined pituitary hormone deficiency, reduced growth and extended lifespan. Thus, these mice serve as an important model system for endocrinological, aging and longevity studies. Because the phenotype of wild type and heterozygous mice is undistinguishable, it is imperative for successful breeding to accurately genotype these animals. Here we report a novel, yet simple, approach for prop-1 genotyping using PCR-based allele-specific amplification (PCR-ASA). We also compare this method to other potential genotyping techniques, i.e. PCR-based restriction fragment length polymorphism analysis (PCR-RFLP) and fluorescence automated DNA sequencing. We demonstrate that the single-step PCR-ASA has several advantages over the classical PCR-RFLP because the procedure is simple, less expensive and rapid. To further increase the specificity and sensitivity of the PCR-ASA, we introduced a single-base mismatch at the 3' penultimate position of the mutant primer. Our results also reveal that the fluorescence automated DNA sequencing has limitations for detecting a single nucleotide polymorphism in the prop-1 gene, particularly in heterozygotes.
Phylogenetic and Structural Analysis of the Pluripotency Factor Sex-Determining Region Y box2 Gene of Camelus dromedarius (cSox2).

PubMed

Alawad, Abdullah; Alharbi, Sultan; Alhazzaa, Othman; Alagrafi, Faisal; Alkhrayef, Mohammed; Alhamdan, Ziyad; Alenazi, Abdullah; Al-Johi, Hasan; Alanazi, Ibrahim O; Hammad, Mohamed

2016-01-01

Although Sox2 cDNA sequence information is available for many mammals, the Sox2 cDNA of Camelus dromedarius has not yet been characterized. The objective of this study was to sequence and characterize Sox2 cDNA from the brain of C. dromedarius (also known as the Arabian camel). For the first time, a full coding sequence of the Sox2 gene from the brain of C. dromedarius was amplified by reverse transcription PCR and then sequenced using the 3730XL series platform Sequencer (Applied Biosystems). The cDNA sequence displayed an open reading frame of 822 nucleotides, encoding a protein of 273 amino acids. The molecular weight and the isoelectric point of the translated protein were calculated as 29.825 kDa and 10.11, respectively, using bioinformatics analysis. The predicted cSox2 protein sequence exhibited high identity: 99% for Homo sapiens, Mus musculus, Bos taurus, and Vicugna pacos; 98% for Sus scrofa and 93% for Camelus ferus. A 3D structure was built based on the available crystal structure of the HMG-box domain of human stem cell transcription factor Sox2 (PDB: 2LE4) with 81 residues, and with bioinformatics prediction software for the full 273 amino acid residues. The comparison confirms the presence of the HMG-box domain in the cSox2 protein. The orthologous phylogenetic analysis showed that the Sox2 isoform from C. dromedarius was grouped with those of humans, alpacas, cattle, and pigs. We believe that this genetic and structural information will be a helpful source for annotation. Furthermore, Sox2 is one of the transcription factors that contribute to the generation of induced pluripotent stem cells (iPSCs), which in turn will probably help generate camel induced pluripotent stem cells (CiPSCs).
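The relationship between the 822-nucleotide open reading frame and the 273-residue protein reported above is a matter of codon counting. The sketch below assumes that the stated ORF length includes the terminal stop codon, which the abstract does not say explicitly; the function name is hypothetical.

```python
def protein_length(orf_nt: int, includes_stop: bool = True) -> int:
    """Number of amino acids encoded by an ORF of the given length.

    Each codon is three nucleotides; if the ORF length counts the stop
    codon (assumed by default), that codon encodes no residue.
    """
    if orf_nt % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    codons = orf_nt // 3
    return codons - 1 if includes_stop else codons


# 822 nt -> 274 codons -> 273 residues once the stop codon is excluded.
print(protein_length(822))
```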
Bridging two scholarly islands enriches both: COI DNA barcodes for species identification versus human mitochondrial variation for the study of migrations and pathologies.

PubMed

Thaler, David S; Stoeckle, Mark Y

2016-10-01

DNA barcodes for species identification and the analysis of human mitochondrial variation have developed as independent fields even though both are based on sequences from animal mitochondria. This study finds questions within each field that can be addressed by reference to the other. DNA barcodes are based on a 648-bp segment of the mitochondrially encoded cytochrome oxidase I. From most species, this segment is the only sequence available. It is impossible to know whether it fairly represents overall mitochondrial variation. For modern humans, the entire mitochondrial genome is available from thousands of healthy individuals. SNPs in the human mitochondrial genome are evenly distributed across all protein-encoding regions arguing that COI DNA barcode is representative. Barcode variation among related species is largely based on synonymous codons. Data on human mitochondrial variation support the interpretation that most - possibly all - synonymous substitutions in mitochondria are selectively neutral. DNA barcodes confirm reports of a low variance in modern humans compared to nonhuman primates. In addition, DNA barcodes allow the comparison of modern human variance to many other extant animal species. Birds are a well-curated group in which DNA barcodes are coupled with census and geographic data. Putting modern human variation in the context of intraspecies variation among birds shows humans to be a single breeding population of average variance.
Enzymatic Production of Monoclonal Stoichiometric Single-Stranded DNA Oligonucleotides

PubMed Central

Ducani, Cosimo; Kaul, Corinna; Moche, Martin; Shih, William M.; Högberg, Björn

2013-01-01

Single-stranded oligonucleotides are important as research tools, as probes for diagnostics, and in gene therapy. Today, production of oligonucleotides is done via solid-phase synthesis. However, the capabilities of current polymer chemistry are limited in comparison to what can be produced in biological systems. The number of errors in synthetic DNA increases with oligonucleotide length, and sequence diversity can often be a problem. Here, we present the Monoclonal Stoichiometric (MOSIC) method for enzymatic DNA oligonucleotide production. Using this method, we amplify oligonucleotides from clonal templates followed by digestion of a cutter-hairpin, resulting in pools of monoclonal oligonucleotides with precisely controlled relative stoichiometric ratios. We present data where MOSIC oligonucleotides, 14–378 nt long, were prepared either by in vitro rolling-circle amplification, or by amplification in Escherichia coli in the form of phagemid DNA. The formation of a DNA crystal and folding of DNA nanostructures confirmed the scalability, purity and stoichiometry of the produced oligonucleotides. PMID:23727986
Rapid identification and classification of bacteria by 16S rDNA restriction fragment melting curve analyses (RFMCA).

PubMed

Rudi, Knut; Kleiberg, Gro H; Heiberg, Ragnhild; Rosnes, Jan T

2007-08-01

The aim of this work was to evaluate restriction fragment melting curve analyses (RFMCA) as a novel approach for rapid classification of bacteria during food production. RFMCA was evaluated for bacteria isolated from sous vide food products, and raw materials used for sous vide production. We identified four major bacterial groups in the material analysed (cluster I-Streptococcus, cluster II-Carnobacterium/Bacillus, cluster III-Staphylococcus and cluster IV-Actinomycetales). The accuracy of RFMCA was evaluated by comparison with 16S rDNA sequencing. The strains satisfying the RFMCA quality filtering criteria (73%, n=57), with both 16S rDNA sequence information and RFMCA data (n=45) gave identical group assignments with the two methods. RFMCA enabled rapid and accurate classification of bacteria that is database compatible. Potential application of RFMCA in the food or pharmaceutical industry will include development of classification models for the bacteria expected in a given product, and then to build an RFMCA database as a part of the product quality control.
IM-TORNADO: A Tool for Comparison of 16S Reads from Paired-End Libraries

PubMed Central

Jeraldo, Patricio; Kalari, Krishna; Chen, Xianfeng; Bhavsar, Jaysheel; Mangalam, Ashutosh; White, Bryan; Nelson, Heidi; Kocher, Jean-Pierre; Chia, Nicholas

2014-01-01

Motivation 16S rDNA hypervariable tag sequencing has become the de facto method for accessing microbial diversity. Illumina paired-end sequencing, which produces two separate reads for each DNA fragment, has become the platform of choice for this application. However, when the two reads do not overlap, existing computational pipelines analyze data from each read separately and underutilize the information contained in the paired-end reads. Results We created a workflow known as Illinois Mayo Taxon Organization from RNA Dataset Operations (IM-TORNADO) for processing non-overlapping reads while retaining maximal information content. Using synthetic mock datasets, we show that the use of both reads produced answers with greater correlation to those from full length 16S rDNA when looking at taxonomy, phylogeny, and beta-diversity. Availability and Implementation IM-TORNADO is freely available at http://sourceforge.net/projects/imtornado and produces BIOM format output for cross compatibility with other pipelines such as QIIME, mothur, and phyloseq. PMID:25506826
Diversity of the Arabidopsis mitochondrial genome occurs via nuclear-controlled recombination activity.

PubMed

Arrieta-Montiel, Maria P; Shedge, Vikas; Davila, Jaime; Christensen, Alan C; Mackenzie, Sally A

2009-12-01

The plant mitochondrial genome is recombinogenic, with DNA exchange activity controlled to a large extent by nuclear gene products. One nuclear gene, MSH1, appears to participate in suppressing recombination in Arabidopsis at every repeated sequence ranging in size from 108 to 556 bp. Present in a wide range of plant species, these mitochondrial repeats display evidence of successful asymmetric DNA exchange in Arabidopsis when MSH1 is disrupted. Recombination frequency appears to be influenced by repeat sequence homology and size, with larger size repeats corresponding to increased DNA exchange activity. The extensive mitochondrial genomic reorganization of the msh1 mutant produced altered mitochondrial transcription patterns. Comparison of mitochondrial genomes from the Arabidopsis ecotypes C24, Col-0, and Ler suggests that MSH1 activity accounts for most or all of the polymorphisms distinguishing these genomes, producing ecotype-specific stoichiometric changes in each line. Our observations suggest that MSH1 participates in mitochondrial genome evolution by influencing the lineage-specific pattern of mitochondrial genetic variation in higher plants.
Chicken skin virome analyzed by high-throughput sequencing shows a composition highly different from human skin.

PubMed

Denesvre, Caroline; Dumarest, Marine; Rémy, Sylvie; Gourichon, David; Eloit, Marc

2015-10-01

Recent studies show that human skin at homeostasis is a complex ecosystem whose virome includes circular DNA viruses, especially papillomaviruses and polyomaviruses. To determine the chicken skin virome in comparison with the human skin virome, a pooled swab sample from fifteen healthy indoor chickens of five genetic backgrounds was examined for the presence of DNA viruses by high-throughput sequencing (HTS). The results indicate a predominance of herpesviruses from the Mardivirus genus, coming from either vaccinal origin or presumably asymptomatic infection. Despite the high sensitivity of the HTS method used herein to detect small circular DNA viruses, we did not detect any papillomaviruses, polyomaviruses, or circoviruses, indicating that these viruses may not be residents of the chicken skin. The results suggest that the turkey herpesvirus is a resident of chicken skin in vaccinated chickens. This study indicates major differences between the skin viromes of chickens and humans. The origin of this difference remains to be further studied in relation to skin physiology, environment, or virus population dynamics.
CoCoNUT: an efficient system for the comparison and analysis of genomes

PubMed Central

2008-01-01

Background Comparative genomics is the analysis and comparison of genomes from different species. This area of research is driven by the large number of sequenced genomes and heavily relies on efficient algorithms and software to perform pairwise and multiple genome comparisons. Results Most of the software tools available are tailored for one specific task. In contrast, we have developed a novel system CoCoNUT (Computational Comparative geNomics Utility Toolkit) that allows solving several different tasks in a unified framework: (1) finding regions of high similarity among multiple genomic sequences and aligning them, (2) comparing two draft or multi-chromosomal genomes, (3) locating large segmental duplications in large genomic sequences, and (4) mapping cDNA/EST to genomic sequences. Conclusion CoCoNUT is competitive with other software tools w.r.t. the quality of the results. The use of state of the art algorithms and data structures allows CoCoNUT to solve comparative genomics tasks more efficiently than previous tools. With the improved user interface (including an interactive visualization component), CoCoNUT provides a unified, versatile, and easy-to-use software tool for large scale studies in comparative genomics. PMID:19014477
New luminescent mycenoid fungi (Basidiomycota, Agaricales) from São Paulo State, Brazil.

PubMed

Desjardin, Dennis E; Perry, Brian A; Stevani, Cassius V

Four species of mycenoid fungi are reported as luminescent (or putatively luminescent) on the basis of specimens collected from São Paulo State, Brazil. Two of them represent new species (Mycena oculisnymphae, Resinomycena petarensis), and two represent new reports of luminescence in previously described species (M. deformis, M. globulispora). Comprehensive descriptions, illustrations, photographs, and comparisons with phenetically similar species are provided. Sequences of nuc rDNA internal transcribed spacer regions were generated for barcoding purposes and for comparisons with similar species.
Analysis of the ergosterol biosynthesis pathway cloning, molecular characterization and phylogeny of lanosterol 14 α-demethylase (ERG11) gene of Moniliophthora perniciosa.

PubMed

de Oliveira Ceita, Geruza; Vilas-Boas, Laurival Antônio; Castilho, Marcelo Santos; Carazzolle, Marcelo Falsarella; Pirovani, Carlos Priminho; Selbach-Schnadelbach, Alessandra; Gramacho, Karina Peres; Ramos, Pablo Ivan Pereira; Barbosa, Luciana Veiga; Pereira, Gonçalo Amarante Guimarães; Góes-Neto, Aristóteles

2014-10-01

The phytopathogenic fungus Moniliophthora perniciosa (Stahel) Aime & Philips-Mora, causal agent of witches' broom disease of cocoa, causes countless damage to cocoa production in Brazil. Molecular studies have attempted to identify genes that play important roles in fungal survival and virulence. In this study, sequences deposited in the M. perniciosa Genome Sequencing Project database were analyzed to identify potential biological targets. For the first time, the ergosterol biosynthetic pathway in M. perniciosa was studied and the lanosterol 14α-demethylase gene (ERG11) that encodes the main enzyme of this pathway and is a target for fungicides was cloned, characterized molecularly and its phylogeny analyzed. ERG11 genomic DNA and cDNA were characterized and sequence analysis of the ERG11 protein identified highly conserved domains typical of this enzyme, such as SRS1, SRS4, EXXR and the heme-binding region (HBR). Comparison of the protein sequences and phylogenetic analysis revealed that the M. perniciosa enzyme was most closely related to that of Coprinopsis cinerea.
Analysis of the ergosterol biosynthesis pathway cloning, molecular characterization and phylogeny of lanosterol 14 α-demethylase (ERG11) gene of Moniliophthora perniciosa

PubMed Central

de Oliveira Ceita, Geruza; Vilas-Boas, Laurival Antônio; Castilho, Marcelo Santos; Carazzolle, Marcelo Falsarella; Pirovani, Carlos Priminho; Selbach-Schnadelbach, Alessandra; Gramacho, Karina Peres; Ramos, Pablo Ivan Pereira; Barbosa, Luciana Veiga; Pereira, Gonçalo Amarante Guimarães; Góes-Neto, Aristóteles

2014-01-01

The phytopathogenic fungus Moniliophthora perniciosa (Stahel) Aime & Philips-Mora, causal agent of witches’ broom disease of cocoa, causes countless damage to cocoa production in Brazil. Molecular studies have attempted to identify genes that play important roles in fungal survival and virulence. In this study, sequences deposited in the M. perniciosa Genome Sequencing Project database were analyzed to identify potential biological targets. For the first time, the ergosterol biosynthetic pathway in M. perniciosa was studied and the lanosterol 14α-demethylase gene (ERG11) that encodes the main enzyme of this pathway and is a target for fungicides was cloned, characterized molecularly and its phylogeny analyzed. ERG11 genomic DNA and cDNA were characterized and sequence analysis of the ERG11 protein identified highly conserved domains typical of this enzyme, such as SRS1, SRS4, EXXR and the heme-binding region (HBR). Comparison of the protein sequences and phylogenetic analysis revealed that the M. perniciosa enzyme was most closely related to that of Coprinopsis cinerea. PMID:25505843
The flaA locus of Bacillus subtilis is part of a large operon coding for flagellar structures, motility functions, and an ATPase-like polypeptide.

PubMed Central

Albertini, A M; Caramori, T; Crabb, W D; Scoffone, F; Galizzi, A

1991-01-01

We cloned and sequenced 8.3 kb of Bacillus subtilis DNA corresponding to the flaA locus involved in flagellar biosynthesis, motility, and chemotaxis. The DNA sequence revealed the presence of 10 complete and 2 incomplete open reading frames. Comparison of the deduced amino acid sequences to data banks showed similarities of nine of the deduced products to a number of proteins of Escherichia coli and Salmonella typhimurium for which a role in flagellar functioning has been directly demonstrated. In particular, the sequence data suggest that the flaA operon codes for the M-ring protein, components of the motor switch, and the distal part of the basal-body rod. The gene order is remarkably similar to that described for region III of the enterobacterial flagellar regulon. One of the open reading frames was translated into a protein with 48% amino acid identity to S. typhimurium FliI and 29% identity to the beta subunit of E. coli ATP synthase. PMID:1828465
PCR detection of Anaplasma phagocytophilum in goat flocks in an area endemic for tick-borne fever in Switzerland.

PubMed

Silaghi, C; Scheuerle, M C; Friche Passos, L M; Thiel, C; Pfister, K

2011-02-01

Central Switzerland is a highly endemic region for tick-borne fever (TBF) in cattle; however, little is known about A. phagocytophilum in goats. In the present study, 72 animals from six goat flocks (373 EDTA blood samples) in Central Switzerland were analysed for A. phagocytophilum DNA. A real-time PCR targeting the msp2 gene of A. phagocytophilum was performed, and in positive samples the partial 16S rRNA, groEL and msp4 genes were amplified for sequence analysis. Four DNA extracts were positive. Different sequence types on the basis of the amplified genes were found. For comparison, sequences of A. phagocytophilum from 12 cattle (originating from Switzerland and Southern Germany) were analysed. The 16S rRNA gene sequences from cattle were all identical to each other, but the groEL and msp4 genes differed depending on the origin of the cattle samples and differed from the variants from goats. This study clearly provides molecular evidence for the presence of different types of A. phagocytophilum in goat flocks in Switzerland, a fact which deserves more thorough attention in clinical studies.

ESTs and EST-linked polymorphisms for genetic mapping and phylogenetic reconstruction in the guppy, Poecilia reticulata

PubMed Central

Dreyer, Christine; Hoffmann, Margarete; Lanz, Christa; Willing, Eva-Maria; Riester, Markus; Warthmann, Norman; Sprecher, Andrea; Tripathi, Namita; Henz, Stefan R; Weigel, Detlef

2007-01-01

Background The guppy, Poecilia reticulata, is a well-known model organism for studying inheritance and variation of male ornamental traits as well as adaptation to different river habitats. However, genomic resources for studying this important model were not previously widely available. Results With the aim of generating molecular markers for genetic mapping of the guppy, cDNA libraries were constructed from embryos and different adult organs to generate expressed sequence tags (ESTs). About 18,000 ESTs were annotated according to BLASTN and BLASTX results and the sequence information from the 3' UTRs was exploited to generate PCR primers for re-sequencing of genomic DNA from different wild type strains. By comparison of EST-linked genomic sequences from at least four different ecotypes, about 1,700 polymorphisms were identified, representing about 400 distinct genes. Two interconnected MySQL databases were built to organize the ESTs and markers, respectively. A robust phylogeny of the guppy was reconstructed, based on 10 different nuclear genes. Conclusion Our EST and marker databases provide useful tools for genetic mapping and phylogenetic studies of the guppy. PMID:17686157
Molecular Detection, Isolation, and Physiological Characterization of Functionally Dominant Phenol-Degrading Bacteria in Activated Sludge

PubMed Central

Watanabe, Kazuya; Teramoto, Maki; Futamata, Hiroyuki; Harayama, Shigeaki

1998-01-01

DNA was isolated from phenol-digesting activated sludge, and partial fragments of the 16S ribosomal DNA (rDNA) and the gene encoding the largest subunit of multicomponent phenol hydroxylase (LmPH) were amplified by PCR. An analysis of the amplified fragments by temperature gradient gel electrophoresis (TGGE) demonstrated that two major 16S rDNA bands (bands R2 and R3) and two major LmPH gene bands (bands P2 and P3) appeared after the activated sludge became acclimated to phenol. The nucleotide sequences of these major bands were determined. In parallel, bacteria were isolated from the activated sludge by direct plating or by plating after enrichment either in batch cultures or in a chemostat culture. The bacteria isolated were classified into 27 distinct groups by a repetitive extragenic palindromic sequence PCR analysis. The partial nucleotide sequences of 16S rDNAs and LmPH genes of members of these 27 groups were then determined. A comparison of these nucleotide sequences with the sequences of the major TGGE bands indicated that the major bacterial populations, R2 and R3, possessed major LmPH genes P2 and P3, respectively. The dominant populations could be isolated either by direct plating or by chemostat culture enrichment but not by batch culture enrichment. One of the dominant strains (R3), which contained a novel type of LmPH (P3), was closely related to Variovorax paradoxus, and the result of a kinetic analysis of its phenol-oxygenating activity suggested that this strain was the principal phenol digester in the activated sludge. PMID:9797297
Control of artefactual variation in reported inter-sample relatedness during clinical use of a Mycobacterium tuberculosis sequencing pipeline.

PubMed

Wyllie, David H; Sanderson, Nicholas; Myers, Richard; Peto, Tim; Robinson, Esther; Crook, Derrick W; Smith, E Grace; Walker, A Sarah

2018-06-06

Contact tracing requires reliable identification of closely related bacterial isolates. When we noticed the reporting of artefactual variation between M. tuberculosis isolates during routine next generation sequencing of Mycobacterium spp, we investigated its basis in 2,018 consecutive M. tuberculosis isolates. In the routine process used, clinical samples were decontaminated and inoculated into broth cultures; from positive broth cultures DNA was extracted, sequenced, reads mapped, and consensus sequences determined. We investigated the process of consensus sequence determination, which selects the most common nucleotide at each position. Having determined the high-quality read depth and depth of minor variants across 8,006 M. tuberculosis genomic regions, we quantified the relationship between the minor variant depth and the amount of non-Mycobacterial bacterial DNA, which originates from commensal microbes killed during sample decontamination. In the presence of non-Mycobacterial bacterial DNA, we found significant increases in minor variant frequencies of more than 1.5 fold in 242 regions covering 5.1% of the M. tuberculosis genome. Included within these were four high variation regions strongly influenced by the amount of non-Mycobacterial bacterial DNA. Excluding these four regions from pairwise distance comparisons reduced biologically implausible variation from 5.2% to 0% in an independent validation set derived from 226 individuals. Thus, we have demonstrated an approach identifying critical genomic regions contributing to clinically relevant artefactual variation in bacterial similarity searches. The approach described monitors the outputs of the complex multi-step laboratory and bioinformatics process, allows periodic process adjustments, and will have application to quality control of routine bacterial genomics. Copyright © 2018 Wyllie et al.
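The consensus step described in the abstract above (take the most common nucleotide at each position, then look at how much read depth supports anything else) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name `consensus_and_minor_freq` and the toy pileup columns are hypothetical, and each column is assumed to be a non-empty string of high-quality base calls at one genomic position.

```python
from collections import Counter

def consensus_and_minor_freq(pileup_columns):
    """Return the consensus sequence and per-position minor-variant frequency.

    pileup_columns: list of non-empty strings, one per genomic position,
    each holding the base calls of the reads covering that position
    (hypothetical input format chosen for illustration).
    """
    consensus = []
    minor_freqs = []
    for column in pileup_columns:
        counts = Counter(column)
        # Most common base at this position becomes the consensus call.
        base, major_depth = counts.most_common(1)[0]
        depth = len(column)
        consensus.append(base)
        # Fraction of reads that disagree with the consensus call.
        minor_freqs.append((depth - major_depth) / depth)
    return "".join(consensus), minor_freqs

# Toy data: four positions, four reads each (no ties between bases).
columns = ["AAAA", "CCCT", "GGGG", "TTTA"]
seq, minor = consensus_and_minor_freq(columns)
print(seq, minor)
```

Positions whose minor-variant frequency rises with the amount of contaminating DNA are the kind of region the study flags for exclusion from pairwise distance comparisons.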
Serratia aquatilis sp. nov., isolated from drinking water systems.

PubMed

Kämpfer, Peter; Glaeser, Stefanie P

2016-01-01

A cream-white-pigmented, oxidase-negative bacterium (strain 2015-2462-01T), isolated from a drinking water system, was investigated in detail to determine its taxonomic position. Cells of the isolate were rod-shaped and stained Gram-negative. A comparison of the 16S rRNA gene sequence of strain 2015-2462-01T with sequences of the type strains of closely related species of the genus Serratia revealed highest similarity to Serratia fonticola (98.4 %), Serratia proteamaculans (97.8 %), Serratia liquefaciens and Serratia grimesii (both 97.7 %). 16S rRNA gene sequence similarities to all other Serratia species were below 97.4 %. Multilocus sequence analysis (MLSA) on the basis of concatenated partial gyrB, rpoB, infB and atpD gene sequences showed a clear distinction of strain 2015-2462-01T from the type strains of the closest related Serratia species. The fatty acid profile of the strain consisted of C16 : 1 ω7c, C16 : 0; C14 : 0 and C14 : 0 3-OH/iso-C16 : 1 I as major components. DNA-DNA hybridizations between 2015-2462-01T and S. fonticola ATCC 29844T resulted in a relatedness value of 27 % (reciprocal 20 %). This DNA-DNA hybridization result in combination with the MLSA results and the differential biochemical properties indicated that strain 2015-2462-01T represents a novel species of the genus Serratia, for which the name Serratia aquatilis sp. nov. is proposed. The type strain is 2015-2462-01T ( = LMG 29119T = CCM 8626T).
Prediction of constitutive A-to-I editing sites from human transcriptomes in the absence of genomic sequences

PubMed Central

2013-01-01

Background Adenosine-to-inosine (A-to-I) RNA editing is recognized as a cellular mechanism for generating both RNA and protein diversity. Inosine base pairs with cytidine during reverse transcription and therefore appears as guanosine during sequencing of cDNA. Current approaches of RNA editing identification largely depend on the comparison between transcriptomes and genomic DNA (gDNA) sequencing datasets from the same individuals, and it has been challenging to identify editing candidates from transcriptomes in the absence of gDNA information. Results We have developed a new strategy to accurately predict constitutive RNA editing sites from publicly available human RNA-seq datasets in the absence of relevant genomic sequences. Our approach establishes new parameters to increase the ability to map mismatches and to minimize sequencing/mapping errors and unreported genome variations. We identified 695 novel constitutive A-to-I editing sites that appear in clusters (named “editing boxes”) in multiple samples and which exhibit spatial and dynamic regulation across human tissues. Some of these editing boxes are enriched in non-repetitive regions lacking inverted repeat structures and contain an extremely high conversion frequency of As to Is. We validated a number of editing boxes in multiple human cell lines and confirmed that ADAR1 is responsible for the observed promiscuous editing events in non-repetitive regions, further expanding our knowledge of the catalytic substrate of A-to-I RNA editing by ADAR enzymes. Conclusions The approach we present here provides a novel way of identifying A-to-I RNA editing events by analyzing only RNA-seq datasets. This method has allowed us to gain new insights into RNA editing and should also aid in the identification of more constitutive A-to-I editing sites from additional transcriptomes. PMID:23537002
Sequence and Structure Dependent DNA-DNA Interactions

NASA Astrophysics Data System (ADS)

Kopchick, Benjamin; Qiu, Xiangyun

Molecular forces between dsDNA strands are largely dominated by electrostatics and have been extensively studied. Quantitative knowledge has been accumulated on how DNA-DNA interactions are modulated by varied biological constituents such as ions, cationic ligands, and proteins. Despite its central role in biology, the sequence of DNA has not received substantial attention, and "random" DNA sequences are typically used in biophysical studies. However, ~50% of the human genome is composed of non-random-sequence DNAs, particularly repetitive sequences. Furthermore, covalent modifications of DNA such as methylation play key roles in gene functions. Such DNAs with specific sequences or modifications often take on structures other than the canonical B-form. Here we present a series of quantitative measurements of the DNA-DNA forces with the osmotic stress method on different DNA sequences, from short repeats to the most frequent sequences in the genome, and to modifications such as bromination and methylation. We observe peculiar behaviors that appear to be strongly correlated with the incurred structural changes. We speculate on the causes in terms of differences in hydration shell and DNA surface structures.
Anti-telomere antibodies in systemic lupus erythematosus (SLE): a comparison with five antinuclear antibody assays in 430 patients with SLE and other rheumatic diseases.

PubMed

Salonen, E M; Miettinen, A; Walle, T K; Koskenmies, S; Kere, J; Julkunen, H

2004-10-01

To investigate the prevalence and diagnostic significance of antibodies against telomeric DNA in systemic lupus erythematosus (SLE) and other autoimmune rheumatic diseases, and to make comparisons with five conventional anti-DNA or anti-nuclear antibody (ANA) assays. Antibodies to telomeres, which are highly repetitive sequences of DNA (TTAGGG/CCCTAA) at the end of eukaryotic chromosomes, were measured by an enzyme linked immunosorbent assay (ELISA) in 305 patients with SLE and 125 patients with other autoimmune rheumatic diseases (78 rheumatoid arthritis, 32 primary Sjögren's syndrome, eight mixed connective tissue disease, seven miscellaneous rheumatic diseases). Other assays used were two commercial ELISA assays for anti-dsDNA using calf thymus as antigen, Crithidia luciliae immunofluorescence, and radioimmunoassay (RIA) for anti-dsDNA and immunofluorescence using Hep-2 cells for ANA. The prevalence of anti-telomere in SLE was 60%, vs 5% in rheumatoid arthritis and 18% in other autoimmune rheumatic diseases. Specificity of anti-telomere for SLE was 91%; positive and negative predictive values were 95% and 46%, respectively. For anti-dsDNA by two ELISA assays using calf thymus as antigen, sensitivities were 69% and 29% and specificities 66% and 96%, respectively. Other anti-dsDNA assays had low sensitivities (RIA 43%, Crithidia immunofluorescence 13%). The association of anti-telomere with a history of nephritis in patients with SLE was stronger (p = 0.005) than by any other assay (p = 0.006-0.999). The correlations between the different assays were good (p<0.001 for all comparisons). The new ELISA for anti-telomere antibodies using standardised human dsDNA as antigen is a sensitive and highly specific test for SLE.
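For readers comparing the assay figures above, the four reported quantities (sensitivity, specificity, positive and negative predictive value) all derive from a 2x2 table of test result against diagnosis. The sketch below shows the standard definitions; the counts passed in are hypothetical round numbers chosen for illustration and are not the study's data.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard 2x2 diagnostic measures from true/false positives/negatives."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased patients who test positive
        "specificity": tn / (tn + fp),  # non-diseased patients who test negative
        "ppv": tp / (tp + fp),          # positive tests that are true disease
        "npv": tn / (tn + fn),          # negative tests that are truly disease-free
    }

# Hypothetical counts (NOT taken from the abstract).
m = diagnostic_metrics(tp=60, fp=10, fn=40, tn=90)
print(m)
```

Predictive values depend on how many diseased and non-diseased patients are in the cohort, which is why a test can combine a high positive predictive value with a modest negative one.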
Genetic origin of goat populations in Oman revealed by mitochondrial DNA analysis.

PubMed

Al-Araimi, Nasser Ali; Gaafar, Osman Mahgoub; Costa, Vânia; Neira, Agusto Luzuriaga; Al-Atiyat, Raed Mahmoud; Beja-Pereira, Albano

2017-01-01

The Sultanate of Oman has a complex mosaic of livestock species and production systems, but the genetic diversity, demographic history and origins of these Omani animals have not been extensively studied. Goats might constitute one of the most abundant and important domestic livestock species since the Neolithic transition. Here, we examined the genetic diversity, origin, population structure and demographic history of Omani goats. Specifically, we analyzed a 525-bp fragment of the first hypervariable region of the mitochondrial DNA (mtDNA) control region from 69 Omani individuals and compared this fragment with 17 mtDNA sequences from Somalia and Yemen as well as 18 wild goat species and 1,198 previously published goat sequences from neighboring countries. The studied goat breeds show substantial diversity. The haplotype and nucleotide diversities of Omani goats were 0.983 ± 0.006 and 0.0284 ± 0.014, respectively. The phylogenetic analyses allowed us to classify Omani goats into three mtDNA haplogroups (A, B and G): haplogroup A was found to be predominant and widely distributed and accounted for 80% of all samples, and haplogroups B and G exhibited low frequencies. Phylogenetic comparisons with wild goats revealed that five of the native Omani goat populations originate from Capra aegagrus. Furthermore, most comparisons of pairwise population FST values within and between these five Omani goat breeds as well as between Omani goats and nine populations from nearby countries were not significant. These results suggest strong gene flow among goat populations caused by the extensive transport of goats and the frequent movements of human populations in ancient Arabia. The findings improve our understanding of the migration routes of modern goats from their region of domestication into southeastern Arabia and thereby shed light on human migratory and commercial networks during historical times.
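The haplotype and nucleotide diversities quoted above are commonly computed with Nei's estimators: haplotype diversity h = n/(n-1) · (1 − Σ p_i²), and nucleotide diversity as the mean number of pairwise differences per site. The abstract does not state which software or estimator was used, so the following Python sketch is an assumption-laden illustration on toy sequences, with hypothetical function names, and it assumes aligned sequences of equal length with no gaps.

```python
from collections import Counter
from itertools import combinations

def haplotype_diversity(seqs):
    """Nei's haplotype diversity: n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    counts = Counter(seqs)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)

def nucleotide_diversity(seqs):
    """Mean number of pairwise differences per site over all sequence pairs."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    total_diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return total_diffs / (len(pairs) * length)

# Toy alignment (hypothetical, not Omani goat data): 4 sequences, 3 haplotypes.
toy = ["ACGT", "ACGT", "ACGA", "TCGA"]
h = haplotype_diversity(toy)
pi = nucleotide_diversity(toy)
print(round(h, 4), round(pi, 4))
```

A haplotype diversity near 1, as reported for the Omani goats, means that almost every sampled individual carries a distinct control-region haplotype.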
Quantitative comparison of DNA methylation assays for biomarker development and clinical applications.

PubMed

2016-07-01

DNA methylation patterns are altered in numerous diseases and often correlate with clinically relevant information such as disease subtypes, prognosis and drug response. With suitable assays and after validation in large cohorts, such associations can be exploited for clinical diagnostics and personalized treatment decisions. Here we describe the results of a community-wide benchmarking study comparing the performance of all widely used methods for DNA methylation analysis that are compatible with routine clinical use. We shipped 32 reference samples to 18 laboratories in seven different countries. Researchers in those laboratories collectively contributed 21 locus-specific assays for an average of 27 predefined genomic regions, as well as six global assays. We evaluated assay sensitivity on low-input samples and assessed the assays' ability to discriminate between cell types. Good agreement was observed across all tested methods, with amplicon bisulfite sequencing and bisulfite pyrosequencing showing the best all-round performance. Our technology comparison can inform the selection, optimization and use of DNA methylation assays in large-scale validation studies, biomarker development and clinical diagnostics.
DNA rearrangements directed by non-coding RNAs in ciliates

PubMed Central

Mochizuki, Kazufumi

2013-01-01

Extensive programmed rearrangement of DNA, including DNA elimination, chromosome fragmentation, and DNA descrambling, takes place in the newly developed macronucleus during the sexual reproduction of ciliated protozoa. Recent studies have revealed that two distant classes of ciliates use distinct types of non-coding RNAs to regulate such DNA rearrangement events. DNA elimination in Tetrahymena is regulated by small non-coding RNAs that are produced and utilized in an RNAi-related process. It has been proposed that the small RNAs produced from the micronuclear genome are used to identify eliminated DNA sequences by whole-genome comparison between the parental macronucleus and the micronucleus. In contrast, DNA descrambling in Oxytricha is guided by long non-coding RNAs that are produced from the parental macronuclear genome. These long RNAs are proposed to act as templates for the direct descrambling events that occur in the developing macronucleus. Both cases provide useful examples to study epigenetic chromatin regulation by non-coding RNAs. PMID:21956937
A Comparison Study for DNA Motif Modeling on Protein Binding Microarray.

PubMed

Wong, Ka-Chun; Li, Yue; Peng, Chengbin; Wong, Hau-San

2016-01-01

Transcription factor binding sites (TFBSs) are relatively short (5-15 bp) and degenerate. Identifying them is a computationally challenging task. In particular, protein binding microarray (PBM) is a high-throughput platform that can measure the DNA binding preference of a protein in a comprehensive and unbiased manner; for instance, a typical PBM experiment can measure binding signal intensities of a protein to all possible DNA k-mers (k = 8∼10). Since proteins can often bind to DNA with different binding intensities, one of the major challenges is to build TFBS (also known as DNA motif) models which can fully capture the quantitative binding affinity data. To learn DNA motif models from the non-convex objective function landscape, several optimization methods are compared and applied to the PBM motif model building problem. In particular, representative methods from different optimization paradigms have been chosen for modeling performance comparison on hundreds of PBM datasets. The results suggest that the multimodal optimization methods are very effective for capturing the binding preference information from PBM data. In particular, we observe a general performance improvement if choosing di-nucleotide modeling over mono-nucleotide modeling. In addition, the models learned by the best-performing method are applied to two independent applications: PBM probe rotation testing and ChIP-Seq peak sequence prediction, demonstrating its biological applicability.
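To make the scale of "all possible DNA k-mers" and the idea of a mono-nucleotide motif model concrete, the sketch below enumerates k-mers and scores them with a position weight matrix in which each position contributes independently. The matrix values and function names are hypothetical and chosen only for illustration; the optimization methods compared in the study are not reproduced here.

```python
from itertools import product

def all_kmers(k):
    """Generate every DNA k-mer over the alphabet A, C, G, T (4**k in total)."""
    return ("".join(p) for p in product("ACGT", repeat=k))

def pwm_score(kmer, pwm):
    """Mono-nucleotide model: sum independent per-position weights."""
    return sum(pwm[i][base] for i, base in enumerate(kmer))

# Hypothetical 3-position weight matrix favouring the motif "ACG".
pwm = [
    {"A": 1.0, "C": -1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": 1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": 1.0, "T": -1.0},
]

best = max(all_kmers(3), key=lambda kmer: pwm_score(kmer, pwm))
n_8mers = sum(1 for _ in all_kmers(8))
print(best, n_8mers)
```

A di-nucleotide model would instead index weights by adjacent base pairs, which lets it capture dependencies between neighbouring positions at the cost of more parameters.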
Comparison of mitochondrial DNA control region sequence and microsatellite DNA analyses in estimating population structure and gene flow rates in Atlantic sturgeon Acipenser oxyrinchus

USGS Publications Warehouse

Wirgin, I.; Waldman, J.; Stabile, J.; Lubinski, B.; King, T.

2002-01-01

Atlantic sturgeon Acipenser oxyrinchus is large, long-lived, and anadromous with subspecies distributed along the Atlantic (A. oxyrinchus oxyrinchus) and Gulf of Mexico (A. o. desotoi) coasts of North America. Although it is not certain if extirpation of some population units has occurred, because of anthropogenic influences abundances of all populations are low compared with historical levels. Informed management of A. oxyrinchus demands a detailed knowledge of its population structure, levels of genetic diversity, and likelihood to home to natal rivers. We compared the use of mitochondrial DNA (mtDNA) control region sequence and microsatellite nuclear DNA (nDNA) analyses in identifying the stock structure and homing fidelity of Atlantic and Gulf coast populations of A. oxyrinchus. The approaches were concordant in that they revealed moderate to high levels of genetic diversity and suggested that populations of Atlantic sturgeon are highly structured. At least six genetically distinct management units were detected using the two approaches among the rivers surveyed. Mitochondrial DNA sequences revealed a significant cline in haplotype diversity along the Atlantic coast with monomorphism observed in Canadian populations. High levels of nDNA diversity were also observed among populations along the Atlantic coast, including the two Canadian populations, probably resulting from the more rapid rate of mutational and evolutionary change at microsatellite loci. Estimates of gene flow among populations were similar between both approaches with the exception that because of mtDNA monomorphism in Canadian populations, gene flow estimates between them were unobtainable. Analyses of both genomes provided high resolution and confidence in characterizing the population structure of Atlantic sturgeon. 
Microsatellite analysis was particularly informative in delineating population structure in rivers that were recently glaciated and may prove diagnostic in rivers that are geographically proximal along the south Atlantic coast of the US.
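The haplotype diversity mentioned in this abstract is commonly computed with Nei's unbiased estimator, h = n/(n-1) * (1 - sum of squared haplotype frequencies). The abstract does not say which estimator the authors used, so the following is only a minimal sketch under that assumption; the sample data are hypothetical.

```python
from collections import Counter

def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity: n/(n-1) * (1 - sum(p_i^2))."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sampled haplotypes")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)

# Hypothetical samples, not data from the study.
monomorphic = ["A"] * 10                      # one haplotype only
polymorphic = ["A", "A", "B", "B", "C", "C", "D", "D", "E", "E"]

print(haplotype_diversity(monomorphic))       # 0.0: no diversity
print(haplotype_diversity(polymorphic))       # close to 8/9
```

A monomorphic sample gives a diversity of zero, which is why gene flow estimates that depend on mtDNA variation cannot be obtained between populations fixed for the same haplotype.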
Mechanism for DNA transposons to generate introns on genomic scales

PubMed Central

Huff, Jason T.; Zilberman, Daniel; Roy, Scott W.

2017-01-01

Discovered four decades ago, the existence of introns was one of the most unexpected findings in molecular biology [1]. Introns are sequences interrupting genes that must be removed as part of mRNA production. Genome sequencing projects have documented that most eukaryotic genes contain at least one and frequently many introns [2,3]. Comparison of these genomes reveals a history of long evolutionary periods with little intron gain punctuated by episodes of rapid, extensive gain [2,3]. However, no detailed mechanism for such episodic intron generation has been empirically supported on a sufficient scale, despite several proposals [4–8]. Here we show how short non-autonomous DNA transposons independently generated hundreds to thousands of introns in the prasinophyte Micromonas pusilla and the pelagophyte Aureococcus anophagefferens. Each transposon carries one splice site. The other splice site is co-opted from gene sequence duplicated upon transposon insertion, allowing perfect splicing out of RNA. The distributions of sequences that can be co-opted are biased with respect to codons, and phasing of transposon-generated introns is similarly biased. These transposons insert between preexisting nucleosomes, so that multiple nearby insertions generate nucleosome-sized intervening segments. Thus, transposon insertion and sequence co-option may explain the intron phase biases [2] and prevalence of nucleosome-sized exons [9] observed in eukaryotes. Overall, the two independent examples of proliferating elements illustrate a general DNA transposon mechanism plausibly accounting for episodes of rapid, extensive intron gain during eukaryotic evolution [2,3]. PMID:27760113
A comparison of honey bee-collected pollen from working agricultural lands using light microscopy and ITS metabarcoding

USGS Publications Warehouse

Smart, Matthew; Cornman, Robert S.; Iwanowicz, Deborah; McDermott-Kubeczko, Margaret; Pettis, Jeff S; Spivak, Marla S; Otto, Clint R.

2017-01-01

Taxonomic identification of pollen has historically been accomplished via light microscopy but requires specialized knowledge and reference collections, particularly when identification to lower taxonomic levels is necessary. Recently, next-generation sequencing technology has been used as a cost-effective alternative for identifying bee-collected pollen; however, this novel approach has not been tested on a spatially or temporally robust number of pollen samples. Here, we compare pollen identification results derived from light microscopy and DNA sequencing techniques with samples collected from honey bee colonies embedded within a gradient of intensive agricultural landscapes in the Northern Great Plains throughout the 2010–2011 growing seasons. We demonstrate that at all taxonomic levels, DNA sequencing was able to discern a greater number of taxa, and was particularly useful for the identification of infrequently detected species. Importantly, substantial phenological overlap did occur for commonly detected taxa using either technique, suggesting that DNA sequencing is an appropriate, and enhancing, substitutive technique for accurately capturing the breadth of bee-collected species of pollen present across agricultural landscapes. We also show that honey bees located in high and low intensity agricultural settings forage on dissimilar plants, though with overlap of the most abundantly collected pollen taxa. We highlight practical applications of utilizing sequencing technology, including addressing ecological issues surrounding land use, climate change, importance of taxa relative to abundance, and evaluating the impact of conservation program habitat enhancement efforts.
Diversity of Basidiomycetes in Michigan Agricultural Soils

PubMed Central

Lynch, Michael D. J.; Thorn, R. Greg

2006-01-01

We analyzed the communities of soil basidiomycetes in agroecosystems that differ in tillage history at the Kellogg Biological Station Long-Term Ecological Research site near Battle Creek, Michigan. The approach combined soil DNA extraction through a bead-beating method modified to increase recovery of fungal DNA, PCR amplification with basidiomycete-specific primers, cloning and restriction fragment length polymorphism screening of mixed PCR products, and sequencing of unique clones. Much greater diversity was detected than was anticipated in this habitat on the basis of culture-based methods or surveys of fruiting bodies. With “species” defined as organisms yielding PCR products with ≥99% identity in the 5′ 650 bases of the nuclear large-subunit ribosomal DNA, 241 “species” were detected among 409 unique basidiomycete sequences recovered. Almost all major clades of basidiomycetes from basidiomycetous yeasts and other heterobasidiomycetes through polypores and euagarics (gilled mushrooms and relatives) were represented, with a majority from the latter clade. Only 24 of 241 “species” had 99% or greater sequence similarity to named reference sequences in GenBank, and several clades with multiple “species” could not be identified at the genus level by phylogenetic comparisons with named sequences. The total estimated “species” richness for this 11.2-ha site was 367 “species” of basidiomycetes. Since >99% of the study area has not been sampled, the accuracy of our diversity estimate is uncertain. Replication in time and space is required to detect additional diversity and the underlying community structure. PMID:16950900
Repetitive DNA in the pea (Pisum sativum L.) genome: comprehensive characterization using 454 sequencing and comparison to soybean and Medicago truncatula

PubMed Central

Macas, Jiří; Neumann, Pavel; Navrátilová, Alice

2007-01-01

Background Extraordinary size variation of higher plant nuclear genomes is in large part caused by differences in accumulation of repetitive DNA. This makes repetitive DNA of great interest for studying the molecular mechanisms shaping architecture and function of complex plant genomes. However, due to methodological constraints of conventional cloning and sequencing, a global description of repeat composition is available for only a very limited number of higher plants. In order to provide further data required for investigating evolutionary patterns of repeated DNA within and between species, we used a novel approach based on massive parallel sequencing which allowed a comprehensive repeat characterization in our model species, garden pea (Pisum sativum). Results Analysis of 33.3 Mb sequence data resulted in quantification and partial sequence reconstruction of major repeat families occurring in the pea genome with at least thousands of copies. Our results showed that the pea genome is dominated by LTR-retrotransposons, estimated at 140,000 copies/1C. Ty3/gypsy elements are less diverse and accumulated to higher copy numbers than Ty1/copia. This is in part due to a large population of Ogre-like retrotransposons which alone make up over 20% of the genome. In addition to numerous types of mobile elements, we have discovered a set of novel satellite repeats and two additional variants of telomeric sequences. Comparative genome analysis revealed that there are only a few repeat sequences conserved between pea and soybean genomes. On the other hand, all major families of pea mobile elements are well represented in M. truncatula. Conclusion We have demonstrated that even in a species with a relatively large genome like pea, where a single 454-sequencing run provided only 0.77% coverage, the generated sequences were sufficient to reconstruct and analyze major repeat families corresponding to a total of 35–48% of the genome. 
These data provide a starting point for further investigations of legume plant genomes based on their global comparative analysis and for the development of more sophisticated approaches for data mining. PMID:18031571
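The coverage figure quoted in this abstract can be turned into a rough genome-size check. The sketch below assumes that "0.77% coverage" means total sequenced bases divided by haploid (1C) genome size; the function name is illustrative and not taken from the paper.

```python
def implied_genome_size_mb(sequenced_mb, coverage_fraction):
    """Genome size implied by coverage = sequenced bases / genome size."""
    if coverage_fraction <= 0:
        raise ValueError("coverage must be positive")
    return sequenced_mb / coverage_fraction

# Values quoted in the abstract: 33.3 Mb of sequence at 0.77% coverage.
size_mb = implied_genome_size_mb(33.3, 0.0077)
print(f"implied genome size: about {size_mb:.0f} Mb")

# "Over 20% of the genome" for Ogre-like elements would then correspond
# to more than roughly this many Mb (same genome-size assumption).
print(f"20% of that: about {0.20 * size_mb:.0f} Mb")
```

Under that reading, the two numbers imply a genome of roughly 4.3 Gb, which is consistent with the abstract's description of pea as a species with a relatively large genome.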
Successful application of FTA Classic Card technology and use of bacteriophage phi29 DNA polymerase for large-scale field sampling and cloning of complete maize streak virus genomes.

PubMed

Owor, Betty E; Shepherd, Dionne N; Taylor, Nigel J; Edema, Richard; Monjane, Adérito L; Thomson, Jennifer A; Martin, Darren P; Varsani, Arvind

2007-03-01

Leaf samples from 155 maize streak virus (MSV)-infected maize plants were collected from 155 farmers' fields in 23 districts in Uganda in May/June 2005 by leaf-pressing infected samples onto FTA Classic Cards. Viral DNA was successfully extracted from cards stored at room temperature for 9 months. The diversity of 127 MSV isolates was analysed by PCR-generated RFLPs. Six representative isolates having different RFLP patterns and causing either severe, moderate or mild disease symptoms, were chosen for amplification from FTA cards by bacteriophage phi29 DNA polymerase using the TempliPhi system. Full-length genomes were inserted into a cloning vector using a unique restriction enzyme site, and sequenced. The 1.3-kb PCR product amplified directly from FTA-eluted DNA and used for RFLP analysis was also cloned and sequenced. Comparison of cloned whole genome sequences with those of the original PCR products indicated that the correct virus genome had been cloned and that no errors were introduced by the phi29 polymerase. This is the first successful large-scale application of FTA card technology to the field, and illustrates the ease with which large numbers of infected samples can be collected and stored for downstream molecular applications such as diversity analysis and cloning of potentially new virus genomes.
Nucleotide variation in the mitochondrial genome provides evidence for dual routes of postglacial recolonization and genetic recombination in the northeastern brook trout (Salvelinus fontinalis).

PubMed

Pilgrim, B L; Perry, R C; Barron, J L; Marshall, H D

2012-09-26

Levels and patterns of mitochondrial DNA (mtDNA) variation were examined to investigate the population structure and possible routes of postglacial recolonization of the world's northernmost native populations of brook trout (Salvelinus fontinalis), which are found in Labrador, Canada. We analyzed the sequence diversity of a 1960-bp portion of the mitochondrial genome (NADH dehydrogenase 1 gene and part of cytochrome oxidase 1) of 126 fish from 32 lakes distributed throughout seven regions of northeastern Canada. These populations were found to have low levels of mtDNA diversity, a characteristic trait of populations at northern extremes, with significant structuring at the level of the watershed. Upon comparison of northeastern brook trout sequences to the publicly available brook trout whole mitochondrial genome (GenBank AF154850), we infer that the GenBank sequence is from a fish whose mtDNA has recombined with that of Arctic charr (S. alpinus). The haplotype distribution provides evidence of two different postglacial founding groups contributing to present-day brook trout populations in the northernmost part of their range; the evolution of the majority of the haplotypes coincides with the timing of glacier retreat from Labrador. Our results exemplify the strong influence that historical processes such as glaciations have had on shaping the current genetic structure of northern species such as the brook trout.
Integration of adeno-associated virus vectors in CD34+ human hematopoietic progenitor cells after transduction.

PubMed

Fisher-Adams, G; Wong, K K; Podsakoff, G; Forman, S J; Chatterjee, S

1996-07-15

Gene transfer vectors based on adeno-associated virus (AAV) appear promising because of their high transduction frequencies regardless of cell cycle status and ability to integrate into chromosomal DNA. We tested AAV-mediated gene transfer into a panel of human bone marrow or umbilical cord-derived CD34+ hematopoietic progenitor cells, using vectors encoding several transgenes under the control of viral and cellular promoters. Gene transfer was evaluated by (1) chromosomal integration of vector sequences and (2) analysis of transgene expression. Southern hybridization and fluorescence in situ hybridization analysis of transduced CD34 genomic DNA showed the presence of integrated vector sequences in chromosomal DNA in a portion of transduced cells and showed that integrated vector sequences were replicated along with cellular DNA during mitosis. Transgene expression in transduced CD34 cells in suspension cultures and in myeloid colonies differentiating in vitro from transduced CD34 cells approximated that predicted by the multiplicity of transduction. This was true in CD34 cells from different donors, regardless of the transgene or selective pressure. Comparisons of CD34 cell transduction either before or after cytokine stimulation showed similar gene transfer frequencies. Our findings suggest that AAV transduction of CD34+ hematopoietic progenitor cells is efficient, can lead to stable integration in a population of transduced cells, and may therefore provide the basis for safe and efficient ex vivo gene therapy of the hematopoietic system.
Tyrosine Recombinase Retrotransposons and Transposons.

PubMed

Poulter, Russell T M; Butler, Margi I

2015-04-01

Retrotransposons carrying tyrosine recombinases (YR) are widespread in eukaryotes. The first described tyrosine recombinase mobile element, DIRS1, is a retroelement from the slime mold Dictyostelium discoideum. The YR elements are bordered by terminal repeats related to their replication via free circular dsDNA intermediates. Site-specific recombination is believed to integrate the circle without creating duplications of the target sites. Recently a large number of YR retrotransposons have been described, including elements from fungi (mucorales and basidiomycetes), plants (green algae) and a wide range of animals including nematodes, insects, sea urchins, fish, amphibia and reptiles. YR retrotransposons can be divided into three major groups: the DIRS elements, the PAT-like elements and the Ngaro elements. The three groups form distinct clades on phylogenetic trees based on alignments of reverse transcriptase/ribonuclease H (RT/RH) and YR sequences, and also show some structural distinctions. A group of eukaryote DNA transposons, cryptons, also carry tyrosine recombinases. These DNA transposons do not encode a reverse transcriptase. They have been detected in several pathogenic fungi and oomycetes. Sequence comparisons suggest that the crypton YRs are related to those of the YR retrotransposons. We suggest that the YR retrotransposons arose from the combination of a crypton-like YR DNA transposon and the RT/RH encoding sequence of a retrotransposon. This acquisition must have occurred at a very early point in the evolution of eukaryotes.

Application of a reverse dot blot, DNA-DNA hybridization method to quantify host-feeding tendencies of two sibling species in the Anopheles gambiae complex

PubMed Central

Fritz, Megan L; Miller, James R; Bayoh, M Nabie; Vulule, John M; Landgraf, Jeffrey R; Walker, Edward D

2012-01-01

A DNA-DNA hybridization method, reverse dot blot analysis (RDBA), was used for identification of Anopheles gambiae s.s. and An. arabiensis hosts. Of 299 blood-fed and half-gravid An. gambiae s.l. collected from Kisian, Kenya, 244 individuals were identifiable to species; 69.5% were An. arabiensis, and 29.5% were An. gambiae s.s. Host identifications with RDBA were comparable to conventional PCR followed by direct sequencing of amplicons of the vertebrate mitochondrial cytochrome B gene. Of the 174 amplicon-producing samples used for comparison of these two methods, 147 were identifiable by direct sequencing, and 139 of these were also identifiable by RDBA. An. arabiensis blood meals were mostly (>90%) bovine in origin, whereas An. gambiae s.s. fed upon humans >90% of the time. RDBA detected that 2 of 112 An. arabiensis had blood from more than one host species, whereas PCR and direct sequencing did not. Recent insecticide-treated bednet (ITN) use in Kisian has likely caused the shift in the dominant vector species from An. gambiae s.s. to An. arabiensis. RDBA provides an opportunity to study changes in host-feeding by members of the An. gambiae complex as a response to the broadening distribution of vector control measures targeting host-selection behaviors. PMID:24188164
Optimal word sizes for dissimilarity measures and estimation of the degree of dissimilarity between DNA sequences.

PubMed

Wu, Tiee-Jian; Huang, Ying-Hsueh; Li, Lung-An

2005-11-15

Several measures of DNA sequence dissimilarity have been developed. The purpose of this paper is 3-fold. Firstly, we compare the performance of several word-based or alignment-based methods. Secondly, we give a general guideline for choosing the window size and determining the optimal word sizes for several word-based measures at different window sizes. Thirdly, we use a large-scale simulation method to simulate data from the distribution of SK-LD (symmetric Kullback-Leibler discrepancy). These simulated data can be used to estimate the degree of dissimilarity beta between any pair of DNA sequences. Our study shows (1) for whole sequence similarity/dissimilarity identification the window size taken should be as large as possible, but probably not >3000, as restricted by CPU time in practice, (2) for each measure the optimal word size increases with window size, (3) when the optimal word size is used, SK-LD performance is superior in both simulation and real data analysis, (4) the estimate beta-hat of beta based on SK-LD can be used to filter out quickly a large number of dissimilar sequences and speed alignment-based database search for similar sequences and (5) beta-hat is also applicable in local similarity comparison situations. For example, it can help in selecting oligo probes with high specificity and, therefore, has potential in probe design for microarrays. The algorithm SK-LD, estimate beta-hat and simulation software are implemented in MATLAB code, and are available at http://www.stat.ncku.edu.tw/tjwu
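A word-based symmetric Kullback-Leibler measure of the kind described in this abstract can be sketched as follows. This is not the authors' MATLAB implementation: the additive pseudocount, the default word size, and the absence of any normalizing factor are assumptions made here so that the example is self-contained and well defined when a word is missing from one sequence.

```python
from collections import Counter
from itertools import product
from math import log

def word_freqs(seq, k, pseudo=1.0):
    """Frequencies of overlapping k-words over ACGT, with additive smoothing.

    The pseudocount is an assumption of this sketch; it keeps every
    frequency positive so the logarithms below are always defined.
    """
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    words = ["".join(w) for w in product("ACGT", repeat=k)]
    total = sum(counts[w] for w in words) + pseudo * len(words)
    return {w: (counts[w] + pseudo) / total for w in words}

def symmetric_kl(seq_a, seq_b, k=2):
    """KL(p||q) + KL(q||p), written as sum of (p - q) * ln(p / q)."""
    p = word_freqs(seq_a, k)
    q = word_freqs(seq_b, k)
    return sum((p[w] - q[w]) * log(p[w] / q[w]) for w in p)

# Hypothetical sequences for illustration only.
a = "ACGTACGTACGTAACCGGTT"
b = "AAAAAACCCCCCGGGGGGTT"
print(symmetric_kl(a, a))   # 0.0 for identical sequences
print(symmetric_kl(a, b))   # positive for different word compositions
```

Because the measure depends only on word counts, it needs no alignment, which is what makes it usable as a fast pre-filter before an alignment-based search.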
Finding needles in haystacks: linking scientific names, reference specimens and molecular data for Fungi.

PubMed

Schoch, Conrad L; Robbertse, Barbara; Robert, Vincent; Vu, Duong; Cardinali, Gianluigi; Irinyi, Laszlo; Meyer, Wieland; Nilsson, R Henrik; Hughes, Karen; Miller, Andrew N; Kirk, Paul M; Abarenkov, Kessy; Aime, M Catherine; Ariyawansa, Hiran A; Bidartondo, Martin; Boekhout, Teun; Buyck, Bart; Cai, Qing; Chen, Jie; Crespo, Ana; Crous, Pedro W; Damm, Ulrike; De Beer, Z Wilhelm; Dentinger, Bryn T M; Divakar, Pradeep K; Dueñas, Margarita; Feau, Nicolas; Fliegerova, Katerina; García, Miguel A; Ge, Zai-Wei; Griffith, Gareth W; Groenewald, Johannes Z; Groenewald, Marizeth; Grube, Martin; Gryzenhout, Marieka; Gueidan, Cécile; Guo, Liangdong; Hambleton, Sarah; Hamelin, Richard; Hansen, Karen; Hofstetter, Valérie; Hong, Seung-Beom; Houbraken, Jos; Hyde, Kevin D; Inderbitzin, Patrik; Johnston, Peter R; Karunarathna, Samantha C; Kõljalg, Urmas; Kovács, Gábor M; Kraichak, Ekaphan; Krizsan, Krisztina; Kurtzman, Cletus P; Larsson, Karl-Henrik; Leavitt, Steven; Letcher, Peter M; Liimatainen, Kare; Liu, Jian-Kui; Lodge, D Jean; Luangsa-ard, Janet Jennifer; Lumbsch, H Thorsten; Maharachchikumbura, Sajeewa S N; Manamgoda, Dimuthu; Martín, María P; Minnis, Andrew M; Moncalvo, Jean-Marc; Mulè, Giuseppina; Nakasone, Karen K; Niskanen, Tuula; Olariaga, Ibai; Papp, Tamás; Petkovits, Tamás; Pino-Bodas, Raquel; Powell, Martha J; Raja, Huzefa A; Redecker, Dirk; Sarmiento-Ramirez, J M; Seifert, Keith A; Shrestha, Bhushan; Stenroos, Soili; Stielow, Benjamin; Suh, Sung-Oui; Tanaka, Kazuaki; Tedersoo, Leho; Telleria, M Teresa; Udayanga, Dhanushka; Untereiner, Wendy A; Diéguez Uribeondo, Javier; Subbarao, Krishna V; Vágvölgyi, Csaba; Visagie, Cobus; Voigt, Kerstin; Walker, Donald M; Weir, Bevan S; Weiß, Michael; Wijayawardene, Nalin N; Wingfield, Michael J; Xu, J P; Yang, Zhu L; Zhang, Ning; Zhuang, Wen-Ying; Federhen, Scott

2014-01-01

DNA phylogenetic comparisons have shown that morphology-based species recognition often underestimates fungal diversity. Therefore, the need for accurate DNA sequence data, tied to both correct taxonomic names and clearly annotated specimen data, has never been greater. Furthermore, the growing number of molecular ecology and microbiome projects using high-throughput sequencing require fast and effective methods for en masse species assignments. In this article, we focus on selecting and re-annotating a set of marker reference sequences that represent each currently accepted order of Fungi. The particular focus is on sequences from the internal transcribed spacer region in the nuclear ribosomal cistron, derived from type specimens and/or ex-type cultures. Re-annotated and verified sequences were deposited in a curated public database at the National Center for Biotechnology Information (NCBI), namely the RefSeq Targeted Loci (RTL) database, and will be visible during routine sequence similarity searches with NR_prefixed accession numbers. A set of standards and protocols is proposed to improve the data quality of new sequences, and we suggest how type and other reference sequences can be used to improve identification of Fungi. Database URL: http://www.ncbi.nlm.nih.gov/bioproject/PRJNA177353. Published by Oxford University Press 2013. This work is written by US Government employees and is in the public domain in the US.
A High-Throughput Process for the Solid-Phase Purification of Synthetic DNA Sequences

PubMed Central

Grajkowski, Andrzej; Cieślak, Jacek; Beaucage, Serge L.

2017-01-01

An efficient process for the purification of synthetic phosphorothioate and native DNA sequences is presented. The process is based on the use of an aminopropylated silica gel support functionalized with aminooxyalkyl functions to enable capture of DNA sequences through an oximation reaction with the keto function of a linker conjugated to the 5′-terminus of DNA sequences. Deoxyribonucleoside phosphoramidites carrying this linker, as a 5′-hydroxyl protecting group, have been synthesized for incorporation into DNA sequences during the last coupling step of a standard solid-phase synthesis protocol executed on a controlled pore glass (CPG) support. Solid-phase capture of the nucleobase- and phosphate-deprotected DNA sequences released from the CPG support is demonstrated to proceed near quantitatively. Shorter than full-length DNA sequences are first washed away from the capture support; the solid-phase purified DNA sequences are then released from this support upon reaction with tetra-n-butylammonium fluoride in dry dimethylsulfoxide (DMSO) and precipitated in tetrahydrofuran (THF). The purity of solid-phase-purified DNA sequences exceeds 98%. The simulated high-throughput and scalability features of the solid-phase purification process are demonstrated without sacrificing purity of the DNA sequences. PMID:28628204
Heterochromatin and molecular characterization of DsmarMITE transposable element in the beetle Dichotomius schiffleri (Coleoptera: Scarabaeidae).

PubMed

Xavier, Crislaine; Cabral-de-Mello, Diogo Cavalcanti; de Moura, Rita Cássia

2014-12-01

Cytogenetic studies of the Neotropical beetle genus Dichotomius (Scarabaeinae, Coleoptera) have shown dynamism for centromeric constitutive heterochromatin sequences. In the present work we studied the chromosomes and isolated repetitive sequences of Dichotomius schiffleri aiming to contribute to the understanding of coleopteran genome/chromosomal organization. Dichotomius schiffleri presented a conserved karyotype and heterochromatin distribution in comparison to other species of the genus with 2n = 18, biarmed chromosomes, and pericentromeric C-positive blocks. Similarly to heterochromatin distributional patterns, the highly and moderately repetitive DNA fraction (C0t-1 DNA) was detected in pericentromeric areas, contrasting with the euchromatic mapping of an isolated TE (named DsmarMITE). After structural analyses, the DsmarMITE was classified as a non-autonomous element of the type miniature inverted-repeat transposable element (MITE) with terminal inverted repeats similar to Mariner elements of insects from different orders. The euchromatic distribution for DsmarMITE indicates that it does not play a part in the dynamics of constitutive heterochromatin sequences.
Mitochondrial DNA and Y-chromosomal diversity in ancient populations of domestic sheep (Ovis aries) in Finland: comparison with contemporary sheep breeds.

PubMed

Niemi, Marianna; Bläuer, Auli; Iso-Touru, Terhi; Nyström, Veronica; Harjula, Janne; Taavitsainen, Jussi-Pekka; Storå, Jan; Lidén, Kerstin; Kantanen, Juha

2013-01-22

Several molecular and population genetic studies have focused on the native sheep breeds of Finland. In this work, we investigated their ancestral sheep populations from Iron Age, Medieval and Post-Medieval periods by sequencing a partial mitochondrial DNA D-loop and the 5'-promoter region of the SRY gene. We compared the maternal (mitochondrial DNA haplotypes) and paternal (SNP oY1) genetic diversity of ancient sheep in Finland with modern domestic sheep populations in Europe and Asia to study temporal changes in genetic variation and affinities between ancient and modern populations. A 523-bp mitochondrial DNA sequence was successfully amplified for 26 of 36 sheep ancient samples i.e. five, seven and 14 samples representative of Iron Age, Medieval and Post-Medieval sheep, respectively. Genetic diversity was analyzed within the cohorts. This ancient dataset was compared with present-day data consisting of 94 animals from 10 contemporary European breeds and with GenBank DNA sequence data to carry out a haplotype sharing analysis. Among the 18 ancient mitochondrial DNA haplotypes identified, 14 were present in the modern breeds. Ancient haplotypes were assigned to the highly divergent ovine haplogroups A and B, haplogroup B being the major lineage within the cohorts. Only two haplotypes were detected in the Iron Age samples, while the genetic diversity of the Medieval and Post-Medieval cohorts was higher. For three of the ancient DNA samples, Y-chromosome SRY gene sequences were amplified indicating that they originated from rams. The SRY gene of these three ancient ram samples contained SNP G-oY1, which is frequent in modern north-European sheep breeds. Our study did not reveal any sign of major population replacement of native sheep in Finland since the Iron Age. Variations in the availability of archaeological remains may explain differences in genetic diversity estimates and patterns within the cohorts rather than demographic events that occurred in the past. 
Our ancient DNA results fit well with the genetic context of domestic sheep as determined by analyses of modern north-European sheep breeds.
Mitochondrial DNA and Y-chromosomal diversity in ancient populations of domestic sheep (Ovis aries) in Finland: comparison with contemporary sheep breeds

PubMed Central

2013-01-01

Background Several molecular and population genetic studies have focused on the native sheep breeds of Finland. In this work, we investigated their ancestral sheep populations from Iron Age, Medieval and Post-Medieval periods by sequencing a partial mitochondrial DNA D-loop and the 5’-promoter region of the SRY gene. We compared the maternal (mitochondrial DNA haplotypes) and paternal (SNP oY1) genetic diversity of ancient sheep in Finland with modern domestic sheep populations in Europe and Asia to study temporal changes in genetic variation and affinities between ancient and modern populations. Results A 523-bp mitochondrial DNA sequence was successfully amplified for 26 of 36 sheep ancient samples i.e. five, seven and 14 samples representative of Iron Age, Medieval and Post-Medieval sheep, respectively. Genetic diversity was analyzed within the cohorts. This ancient dataset was compared with present-day data consisting of 94 animals from 10 contemporary European breeds and with GenBank DNA sequence data to carry out a haplotype sharing analysis. Among the 18 ancient mitochondrial DNA haplotypes identified, 14 were present in the modern breeds. Ancient haplotypes were assigned to the highly divergent ovine haplogroups A and B, haplogroup B being the major lineage within the cohorts. Only two haplotypes were detected in the Iron Age samples, while the genetic diversity of the Medieval and Post-Medieval cohorts was higher. For three of the ancient DNA samples, Y-chromosome SRY gene sequences were amplified indicating that they originated from rams. The SRY gene of these three ancient ram samples contained SNP G-oY1, which is frequent in modern north-European sheep breeds. Conclusions Our study did not reveal any sign of major population replacement of native sheep in Finland since the Iron Age. 
Variations in the availability of archaeological remains may explain differences in genetic diversity estimates and patterns within the cohorts rather than demographic events that occurred in the past. Our ancient DNA results fit well with the genetic context of domestic sheep as determined by analyses of modern north-European sheep breeds. PMID:23339395
Isolation and Expression Profile of the Ca2+-Activated Chloride Channel-like Membrane Protein 6 Gene in Xenopus laevis

PubMed Central

Lee, Ra Mi; Ryu, Rae Hyung; Jeong, Seong Won; Oh, Soo Jin; Huang, Hue; Han, Jin Soo; Lee, Chi Ho; Lee, C. Justin; Jan, Lily Yeh

2011-01-01

To clone the first anion channel from Xenopus laevis (X. laevis), we isolated a calcium-activated chloride channel (CLCA)-like membrane protein 6 gene (CMP6) in X. laevis. As a first step in gene isolation, an expressed sequence tags database was screened to find the partial cDNA fragment. A putative partial cDNA sequence was obtained by comparison with rat CLCAs identified in our laboratory. First-strand cDNA was synthesized by reverse transcription polymerase chain reaction (RT-PCR) using a specific primer designed for the target cDNA. By repeating the 5' and 3' rapid amplification of cDNA ends (RACE), full-length cDNA was constructed from the cDNA pool. The full-length CMP6 cDNA completed via 5'- and 3'-RACE was 2,940 bp long and had an open reading frame (ORF) of 940 amino acids. The predicted 940-residue polypeptide has four major transmembrane domains and showed about 50% identity with that of rat brain CLCAs in our previously published data. Semi-quantification analysis revealed that CMP6 was most abundantly expressed in small intestine, colon and liver; expression was undetectable in all other tissues. Real-time PCR quantification of the target gene supported this result. Given that CLCA studies have focused on human or murine channels, this finding suggests that a hypothetical protein is an ion channel, an X. laevis CLCA. PMID:21826170
cDNA cloning of the human monocarboxylate transporter 1 and chromosomal localization of the SLC16A1 locus to 1p13.2-p12

DOE Office of Scientific and Technical Information (OSTI.GOV)

Garcia, C.K.; Li, X.; Luna, J.

1994-09-15

Lactate and pyruvate are transported across cell membranes by monocarboxylate transporters (MCTs). Here, the authors use the recently cloned cDNA for hamster MCT1 to isolate cDNA and genomic clones for human MCT1. Comparison of the human and hamster amino acid sequences revealed that the proteins are 86% identical. The gene for human MCT1 (gene symbol, SLC16A1) was localized to human chromosome bands 1p13.2-p12 by PCR analysis of panels of human × rodent cell hybrid lines and by fluorescence chromosomal in situ hybridization. 9 refs., 2 figs.
Ribosomal RNA Genes Contribute to the Formation of Pseudogenes and Junk DNA in the Human Genome.

PubMed

Robicheau, Brent M; Susko, Edward; Harrigan, Amye M; Snyder, Marlene

2017-02-01

Approximately 35% of the human genome can be identified as sequence devoid of a selected-effect function, and not derived from transposable elements or repeated sequences. We provide evidence supporting a known origin for a fraction of this sequence. We show that: 1) highly degraded, but near full length, ribosomal DNA (rDNA) units, including both 45S and Intergenic Spacer (IGS), can be found at multiple sites in the human genome on chromosomes without rDNA arrays, 2) these rDNA sequences have a propensity for being centromere proximal, and 3) sequence at all human functional rDNA array ends is divergent from canonical rDNA to the point that it is pseudogenic. We also show that small sequence strings of rDNA (from 45S + IGS) can be found distributed throughout the genome and are identifiable as an "rDNA-like signal", representing 0.26% of the q-arm of HSA21 and ∼2% of the total sequence of other regions tested. The sizes of sequence strings found in the rDNA-like signal intergrade into the sizes of sequence strings that make up the full-length degrading rDNA units found scattered throughout the genome. We conclude that the displaced and degrading rDNA sequences are likely of a similar origin but represent different stages in their evolution towards random sequence. Collectively, our data suggest that over vast evolutionary time, rDNA arrays contribute to the production of junk DNA. The concept that the production of rDNA pseudogenes is a by-product of concerted evolution represents a previously under-appreciated process; we demonstrate here its importance. © The Author(s) 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Representation of DNA sequences in genetic codon context with applications in exon and intron prediction.

PubMed

Yin, Changchuan

2015-04-01

To apply digital signal processing (DSP) methods to analyze DNA sequences, the sequences must first be mapped into numerical sequences. Thus, effective numerical mappings of DNA sequences play key roles in the effectiveness of DSP-based methods such as exon prediction. Despite numerous mappings of symbolic DNA sequences to numerical series, the existing mapping methods do not include the genetic coding features of DNA sequences. We present a novel numerical representation of DNA sequences using genetic codon context (GCC) in which the numerical values are optimized by simulated annealing to maximize the 3-periodicity signal-to-noise ratio (SNR). The optimized GCC representation is then applied in exon and intron prediction by a Short-Time Fourier Transform (STFT) approach. The results show the GCC method enhances the SNR values of exon sequences and thus increases the accuracy of predicting protein coding regions in genomes compared with the commonly used 4D binary representation. In addition, this study offers a novel way to reveal specific features of DNA sequences by optimizing numerical mappings of symbolic DNA sequences.
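As a rough illustration of the 3-periodicity signal mentioned in this abstract, the sketch below computes an SNR for the commonly used 4D binary (indicator) representation, not the GCC mapping itself. The SNR definition used here (power at frequency N/3 divided by the mean power of the non-DC spectrum) and the function names `power_spectrum` and `period3_snr` are assumptions made for this example; the paper's exact definition may differ.

```python
import cmath

BASES = "ACGT"

def power_spectrum(seq):
    """Summed DFT power of the four binary indicator sequences (one per base)."""
    seq = seq.upper()
    n = len(seq)
    power = [0.0] * n
    for base in BASES:
        # 4D binary mapping: 1 where the base occurs, 0 elsewhere.
        x = [1.0 if c == base else 0.0 for c in seq]
        for k in range(n):
            coeff = sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                        for t in range(n))
            power[k] += abs(coeff) ** 2
    return power

def period3_snr(seq):
    """Assumed SNR: power at k = N/3 over the mean power excluding k = 0."""
    n = len(seq)
    if n < 3 or n % 3 != 0:
        raise ValueError("sequence length must be a positive multiple of 3")
    power = power_spectrum(seq)
    background = sum(power[1:]) / (n - 1)  # mean power without the DC term
    if background == 0.0:
        return 0.0
    return power[n // 3] / background

if __name__ == "__main__":
    # A perfectly codon-periodic toy sequence gives a strong period-3 peak.
    print(period3_snr("ATG" * 30))
```

The direct DFT here is quadratic in sequence length and is meant only for short toy inputs; for real sequences an FFT routine such as `numpy.fft.fft` would be the natural replacement.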
Haplotype Detection from Next-Generation Sequencing in High-Ploidy-Level Species: 45S rDNA Gene Copies in the Hexaploid Spartina maritima

PubMed Central

Boutte, Julien; Aliaga, Benoît; Lima, Oscar; Ferreira de Carvalho, Julie; Ainouche, Abdelkader; Macas, Jiri; Rousseau-Gueutin, Mathieu; Coriton, Olivier; Ainouche, Malika; Salmon, Armel

2015-01-01

Gene and whole-genome duplications are widespread in plant nuclear genomes, resulting in sequence heterogeneity. Identification of duplicated genes may be particularly challenging in highly redundant genomes, especially when there are no diploid parents as a reference. Here, we developed a pipeline to detect the different copies in the ribosomal RNA gene family in the hexaploid grass Spartina maritima from next-generation sequencing (Roche-454) reads. The heterogeneity of the different domains of the highly repeated 45S unit was explored by identifying single nucleotide polymorphisms (SNPs) and assembling reads based on shared polymorphisms. SNPs were validated using comparisons with Illumina sequence data sets and by cloning and Sanger (re)sequencing. Using this approach, 29 validated polymorphisms and 11 validated haplotypes were reported (out of 34 and 20, respectively, that were initially predicted by our program). The rDNA domains of S. maritima have similar lengths as those found in other Poaceae, apart from the 5′-ETS, which is approximately two-times longer in S. maritima. Sequence homogeneity was encountered in coding regions and both internal transcribed spacers (ITS), whereas high intragenomic variability was detected in the intergenic spacer (IGS) and the external transcribed spacer (ETS). Molecular cytogenetic analysis by fluorescent in situ hybridization (FISH) revealed the presence of one pair of 45S rDNA signals on the chromosomes of S. maritima instead of three expected pairs for a hexaploid genome, indicating loss of duplicated homeologous loci through the diploidization process. The procedure developed here may be used at any ploidy level and using different sequencing technologies. PMID:26530424
Genome Fragmentation Is Not Confined to the Peridinin Plastid in Dinoflagellates

PubMed Central

Espelund, Mari; Minge, Marianne A.; Gabrielsen, Tove M.; Nederbragt, Alexander J.; Shalchian-Tabrizi, Kamran; Otis, Christian; Turmel, Monique; Lemieux, Claude; Jakobsen, Kjetill S.

2012-01-01

When plastids are transferred between eukaryote lineages through serial endosymbiosis, their environment changes dramatically. Comparison of dinoflagellate plastids that originated from different algal groups has revealed convergent evolution, suggesting that the host environment mainly influences the evolution of the newly acquired organelle. Recently the plastid genome of the anomalously pigmented dinoflagellate Karlodinium veneficum was found to be a conventional chromosome. To determine if this haptophyte-derived plastid contains additional chromosomal fragments that resemble the minicircles of the peridinin-containing plastids, we have investigated its genome by in-depth sequencing using 454 pyrosequencing technology, PCR and clone library analysis. Sequence analyses show several genes with significantly higher copy numbers than present in the chromosome. These genes are most likely extrachromosomal fragments, and the ones with highest copy numbers include genes encoding the chaperone DnaK (Hsp70), the rubisco large subunit (rbcL), and two tRNAs (trnE and trnM). In addition, some photosystem genes such as psaB, psaA, psbB and psbD are overrepresented. Most of the dnaK and rbcL sequences are found as shortened or fragmented gene sequences, typically missing the 3′-terminal portion. Both dnaK and rbcL are associated with a common sequence element consisting of about 120 bp of highly conserved AT-rich sequence followed by a trnE gene, possibly serving as a control region. Decatenation assays and Southern blot analysis indicate that the extrachromosomal plastid sequences do not have the same organization or lengths as the minicircles of the peridinin dinoflagellates. The fragmentation of the haptophyte-derived plastid genome in K. veneficum is likely a sign of a host-driven process shaping the plastid genomes of dinoflagellates. PMID:22719952
Mitochondrial DNA diversity in the acanthocephalan Prosthenorchis elegans in Colombia based on cytochrome c oxidase I (COI) gene sequence.

PubMed

Falla, Ana Carolina; Brieva, Claudia; Bloor, Paul

2015-12-01

Prosthenorchis elegans is a member of the Phylum Acanthocephala and is an important parasite affecting New World Primates in the wild in South America and in captivity around the world. It is of significant management concern due to its pathogenicity and mode of transmission through intermediate hosts. Current diagnosis of P. elegans is based on the detection of eggs by coprological examination. However, this technique lacks both specificity and sensitivity, since eggs of most members of the genus are morphologically indistinguishable and shed intermittently, making differential diagnosis difficult, and coprological examinations are often negative in animals severely infected at death. We examined sequence variation in 633 bp of mitochondrial DNA (mtDNA) cytochrome c oxidase I (COI) sequence in 37 isolates of P. elegans from New World monkeys (Saguinus leucopus and Cebus albifrons) in Colombia held in rescue centers and from the wild. Intraspecific divergence ranged from 0.0 to 1.6% and was comparable with corresponding values within other species of acanthocephalans. Furthermore, comparisons of patterns of sequence divergence within the Acanthocephala suggest that Prosthenorchis represents a separate genus within the Oligacanthorhynchida. Six distinct haplotypes were identified within P. elegans which grouped into one of two well-supported mtDNA haplogroups. No association between haplogroup/haplotype, holding facility and species was found. This information will help pave the way to the development of molecular-based diagnostic tools for the detection of P. elegans as well as furthering research into the life cycle, intermediate hosts and epidemiological aspects of the species.
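The divergence and haplotype figures in this abstract can be illustrated with a minimal sketch that computes uncorrected pairwise distances (p-distances) and groups identical sequences into haplotypes. The abstract does not say which distance measure was used, so the p-distance is an assumption, and the isolate names and sequences below are hypothetical toy data, not data from the study. For scale, 10 mismatches across 633 aligned sites would correspond to about 1.6%.

```python
from itertools import combinations

def p_distance(a, b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    mismatches = sum(1 for x, y in zip(a, b) if x != y)
    return mismatches / len(a)

def group_haplotypes(isolates):
    """Map each distinct sequence (haplotype) to the isolates carrying it."""
    groups = {}
    for name, seq in isolates.items():
        groups.setdefault(seq, []).append(name)
    return groups

def divergence_range(isolates):
    """Minimum and maximum pairwise p-distance among all isolates."""
    dists = [p_distance(a, b)
             for a, b in combinations(isolates.values(), 2)]
    return min(dists), max(dists)

# Hypothetical toy alignment (illustrative only).
toy = {
    "iso1": "ACGTACGTAC",
    "iso2": "ACGTACGTAC",
    "iso3": "ACGTACGTAT",
    "iso4": "ACTTACGTAT",
}

if __name__ == "__main__":
    print(len(group_haplotypes(toy)), "haplotypes")
    print("divergence range:", divergence_range(toy))
```

This sketch treats every differing character as a mismatch; a real analysis would need a policy for gaps and ambiguous bases and might apply a substitution-model correction.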
Molecular characterization of the ribosomal DNA unit of Sarcocystis singaporensis, Sarcocystis zamani and Sarcocystis zuoi from rodents in Thailand

PubMed Central

WATTHANAKAIWAN, Vichan; SUKMAK, Manakorn; HAMARIT, Kriengsak; KAOLIM, Nongnid; WAJJWALKU, Worawidh; MUANGKRAM, Yuttamol

2017-01-01

Sarcocystis species are heteroxenous cyst-forming coccidian protozoan parasites with a wide host range, including rodents. In this study, Sarcocystis spp. samples were isolated from Bandicota indica, Rattus argentiventer, R. tiomanicus and R. norvegicus across five provinces of Thailand. Two major groups of Sarcocystis cysts were determined in this study: large and small cysts. By sequence comparisons and phylogenetic analyses based on the partial sequences of 28S ribosomal DNA, the large cysts showed the highest identity value (99%) with S. zamani in the GenBank database, while the small cysts could be divided into 2 groups of Sarcocystis: S. singaporensis and presumed S. zuoi. Further analysis of 18S rDNA supported that the 2 isolates (S2 and B6 no.2) identified as S. singaporensis shared a high sequence identity with S. singaporensis in the GenBank database, and the unidentified Sarcocystis (4 isolates, i.e., B6 no.10, B6 no.12, B10 no.4 and B10 no.7) showed 96.3–99.5% identity to S. zuoi and markedly lower identity to other Sarcocystis spp. (≤93%). The results indicated that these four samples should be S. zuoi. In this study, we provide the complete sequences of internal transcribed spacer 1 (ITS1), 5.8S rDNA and internal transcribed spacer 2 (ITS2) of these three Sarcocystis species, and our new primer set could be useful for studying the evolution of Sarcocystis. PMID:28701623
Molecular characterization of the ribosomal DNA unit of Sarcocystis singaporensis, Sarcocystis zamani and Sarcocystis zuoi from rodents in Thailand.

PubMed

Watthanakaiwan, Vichan; Sukmak, Manakorn; Hamarit, Kriengsak; Kaolim, Nongnid; Wajjwalku, Worawidh; Muangkram, Yuttamol

2017-08-18

Sarcocystis species are heteroxenous cyst-forming coccidian protozoan parasites with a wide host range, including rodents. In this study, Sarcocystis spp. samples were isolated from Bandicota indica, Rattus argentiventer, R. tiomanicus and R. norvegicus across five provinces of Thailand. Two major groups of Sarcocystis cysts were determined in this study: large and small cysts. By sequence comparisons and phylogenetic analyses based on the partial sequences of 28S ribosomal DNA, the large cysts showed the highest identity value (99%) with S. zamani in the GenBank database, while the small cysts could be divided into 2 groups of Sarcocystis: S. singaporensis and presumed S. zuoi. Further analysis of 18S rDNA supported that the 2 isolates (S2 and B6 no.2) identified as S. singaporensis shared a high sequence identity with S. singaporensis in the GenBank database, and the unidentified Sarcocystis (4 isolates, i.e., B6 no.10, B6 no.12, B10 no.4 and B10 no.7) showed 96.3-99.5% identity to S. zuoi and markedly lower identity to other Sarcocystis spp. (≤93%). The results indicated that these four samples should be S. zuoi. In this study, we provide the complete sequences of internal transcribed spacer 1 (ITS1), 5.8S rDNA and internal transcribed spacer 2 (ITS2) of these three Sarcocystis species, and our new primer set could be useful for studying the evolution of Sarcocystis.
Development of Genetic Markers in Eucalyptus Species by Target Enrichment and Exome Sequencing

PubMed Central

Dasgupta, Modhumita Ghosh; Dharanishanthi, Veeramuthu; Agarwal, Ishangi; Krutovsky, Konstantin V.

2015-01-01

The advent of next-generation sequencing has facilitated large-scale discovery, validation and assessment of genetic markers for high density genotyping. The present study was undertaken to identify markers in genes supposedly related to wood property traits in three Eucalyptus species. Ninety-four genes involved in xylogenesis were selected for hybridization-probe-based nuclear genomic DNA target enrichment and exome sequencing. Genomic DNA was isolated from the leaf tissues and used for on-array probe hybridization followed by Illumina sequencing. The raw sequence reads were trimmed, high-quality reads were mapped to the E. grandis reference sequence, and single nucleotide variants (SNVs) and insertions/deletions (InDels) were identified across the three species. The average read coverage was 216X and a total of 2294 SNVs and 479 InDels were discovered in E. camaldulensis, 2383 SNVs and 518 InDels in E. tereticornis, and 1228 SNVs and 409 InDels in E. grandis. Additionally, SNV calling and InDel detection were conducted in pair-wise comparisons of E. tereticornis vs. E. grandis, E. camaldulensis vs. E. tereticornis and E. camaldulensis vs. E. grandis. This study presents an efficient, high-throughput method for developing genetic markers for family-based QTL and association analysis in Eucalyptus. PMID:25602379
Muver, a computational framework for accurately calling accumulated mutations.

PubMed

Burkholder, Adam B; Lujan, Scott A; Lavender, Christopher A; Grimm, Sara A; Kunkel, Thomas A; Fargo, David C

2018-05-09

Identification of mutations from next-generation sequencing data typically requires a balance between sensitivity and accuracy. This is particularly true of DNA insertions and deletions (indels), that can impart significant phenotypic consequences on cells but are harder to call than substitution mutations from whole genome mutation accumulation experiments. To overcome these difficulties, we present muver, a computational framework that integrates established bioinformatics tools with novel analytical methods to generate mutation calls with the extremely low false positive rates and high sensitivity required for accurate mutation rate determination and comparison. Muver uses statistical comparison of ancestral and descendant allelic frequencies to identify variant loci and assigns genotypes with models that include per-sample assessments of sequencing errors by mutation type and repeat context. Muver identifies maximally parsimonious mutation pathways that connect these genotypes, differentiating potential allelic conversion events and delineating ambiguities in mutation location, type, and size. Benchmarking with a human gold standard father-son pair demonstrates muver's sensitivity and low false positive rates. In DNA mismatch repair (MMR) deficient Saccharomyces cerevisiae, muver detects multi-base deletions in homopolymers longer than the replicative polymerase footprint at rates greater than predicted for sequential single-base deletions, implying a novel multi-repeat-unit slippage mechanism. Benchmarking results demonstrate the high accuracy and sensitivity achieved with muver, particularly for indels, relative to available tools. Applied to an MMR-deficient Saccharomyces cerevisiae system, muver mutation calls facilitate mechanistic insights into DNA replication fidelity.
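The abstract describes a statistical comparison of ancestral and descendant allelic frequencies but does not name the test. As a stand-in only, and not muver's actual method, the sketch below applies a two-sided Fisher's exact test to a 2×2 table of reference and alternate read counts. The function name and the example read counts are hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]].

    Here rows are (ancestor, descendant) and columns are (reference reads,
    alternate reads); that layout is an assumption made for this example.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x):
        # Hypergeometric probability of x reference reads in the ancestor row.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed.
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))

if __name__ == "__main__":
    # Hypothetical counts: ancestor 50 ref / 0 alt, descendant 25 ref / 25 alt.
    print(fisher_exact_two_sided(50, 0, 25, 25))
```

A small p-value flags a locus whose allele frequencies differ between the two samples; a production caller would also need multiple-testing control and an explicit sequencing-error model, both of which this sketch omits.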
Single-cell genomic sequencing using Multiple Displacement Amplification.

PubMed

Lasken, Roger S

2007-10-01

Single microbial cells can now be sequenced using DNA amplified by the Multiple Displacement Amplification (MDA) reaction. The few femtograms of DNA in a bacterium are amplified into micrograms of high molecular weight DNA suitable for DNA library construction and Sanger sequencing. The MDA-generated DNA also performs well when used directly as template for pyrosequencing by the 454 Life Sciences method. While MDA from single cells loses some of the genomic sequence, this approach will greatly accelerate the pace of sequencing from uncultured microbes. The genetically linked sequences from single cells are also a powerful tool to be used in guiding genomic assembly of shotgun sequences of multiple organisms from environmental DNA extracts (metagenomic sequences).
Large-Scale Collection and Analysis of Full-Length cDNAs from Brachypodium distachyon and Integration with Pooideae Sequence Resources

PubMed Central

Mochida, Keiichi; Uehara-Yamaguchi, Yukiko; Takahashi, Fuminori; Yoshida, Takuhiro; Sakurai, Tetsuya; Shinozaki, Kazuo

2013-01-01

A comprehensive collection of full-length cDNAs is essential for correct structural gene annotation and functional analyses of genes. We constructed a mixed full-length cDNA library from 21 different tissues of Brachypodium distachyon Bd21, and obtained 78,163 high quality expressed sequence tags (ESTs) from both ends of ca. 40,000 clones (including 16,079 contigs). We updated gene structure annotations of Brachypodium genes based on full-length cDNA sequences in comparison with the latest publicly available annotations. About 10,000 non-redundant gene models were supported by full-length cDNAs; ca. 6,000 showed some transcription unit modifications. We also found ca. 580 novel gene models, including 362 newly identified in Bd21. Using the updated transcription start sites, we searched a total of 580 plant cis-motifs in the −3 kb promoter regions and determined a genome-wide Brachypodium promoter architecture. Furthermore, we integrated the Brachypodium full-length cDNAs and updated gene structures with available sequence resources in wheat and barley in a web-accessible database, the RIKEN Brachypodium FL cDNA database. The database represents a “one-stop” information resource for all genomic information in the Pooideae, facilitating functional analysis of genes in this model grass plant and seamless knowledge transfer to the Triticeae crops. PMID:24130698

Is MMTV associated with human breast cancer? Maybe, but probably not.

PubMed

Perzova, Raisa; Abbott, Lynn; Benz, Patricia; Landas, Steve; Khan, Seema; Glaser, Jordan; Cunningham, Coleen K; Poiesz, Bernard

2017-10-13

Conflicting results regarding the association of MMTV with human breast cancer have been reported. Published sequence data have indicated unique MMTV strains in some human samples. However, concerns regarding contamination as a cause of false positive results have persisted. We performed PCR assays for MMTV on human breast cancer cell lines and fresh frozen and formalin fixed normal and malignant human breast epithelial samples. Assays were also performed on peripheral blood mononuclear cells from volunteer blood donors and subjects at risk for human retroviral infections. In addition, assays were performed on DNA samples from wild and laboratory mice. MMTV positive samples from both humans and mice were sequenced and phylogenetically compared. Using PCR under rigorous conditions to prevent and detect "carryover" contamination, we did detect MMTV DNA in human samples, including breast cancer. However, the results were not consistent and seemed to be an artifact. Further experiments indicated that the probable source of false positives was murine DNA, containing endogenous MMTV, present in our building. However, comparison of the newly described MMTV sequences with published data indicates that there are some unique human MMTV sequences in the literature. While we could not confirm the true presence of MMTV in our human breast cancer subjects, the data indicate that further, perhaps more traditional, retroviral studies are warranted to ascertain whether MMTV might rarely be the cause of human breast cancer.
Isolation and sequencing of the gene encoding Sp23, a structural protein of spermatophore of the mealworm beetle, Tenebrio molitor.

PubMed

Feng, X; Happ, G M

1996-11-14

The cDNA for Sp23, a structural protein of the spermatophore of Tenebrio molitor, had been previously cloned and characterized (Paesen, G.C., Schwartz, M.B., Peferoen, M., Weyda, F. and Happ, G.M. (1992a) Amino acid sequence of Sp23, a structure protein of the spermatophore of the mealworm beetle, Tenebrio molitor. J. Biol. Chem. 257, 18852-18857). Using the labeled cDNA for Sp23 as a probe to screen a library of genomic DNA from Tenebrio molitor, we isolated a genomic clone for Sp23. A 5373-base pair (bp) restriction fragment containing the Sp23 gene was sequenced. The coding region is separated by a 55-bp intron which is located close to the translation start site. Three putative ecdysone response elements (EcRE) are identified in the 5' flanking region of the Sp23 gene. Comparison of the flanking regions of the Sp23 gene with those of the D-protein gene expressed in the accessory glands of Tenebrio reveals similar sequences present in the flanking regions of the two genes. The genomic organization of the coding region of the Sp23 gene shares similarities with that of the D-protein gene, three Drosophila accessory gland genes and two Drosophila 20-OH ecdysone-responsive genes.
Skeletal muscle plasticity induced by seasonal acclimatization in carp involves differential expression of rRNA and molecules that epigenetically regulate its synthesis.

PubMed

Fuentes, Eduardo N; Zuloaga, Rodrigo; Nardocci, Gino; Fernandez de la Reguera, Catalina; Simonet, Nicolas; Fumeron, Robinson; Valdes, Juan Antonio; Molina, Alfredo; Alvarez, Marco

2014-01-01

Ribosomal biogenesis controls cellular growth in living organisms, with the rate-limiting step of this process being the transcription of ribosomal DNA (rDNA). Considering that epigenetic mechanisms allow an organism to respond to environmental changes, the expression in muscle of several molecules that regulate epigenetic rRNA synthesis, as well as rDNA transcription, were evaluated during the seasonal acclimatization of the carp. First, the nucleotide sequences encoding the components forming the NoRC (ttf-I, tip5) and eNoSC (sirt1, nml, suv39h1), two chromatin remodeling complexes that silence rRNA synthesis, as well as the sequence of ubf1, a key regulator of rDNA transcription, were obtained. Subsequently the transcriptional regulation of the aforementioned molecules, and other key molecules involved in rRNA synthesis (mh2a1, mh2a2, h2a.z, h2a.z.7, nuc, p80), was assessed. The carp sequences for TTF-I, TIP5, SIRT1, NML, SUV39H1, and UBF1 showed a high conservation of domains and key amino acids in comparison with other fish and higher vertebrates. The mRNA contents in muscle for ttf-I, tip5, sirt1, nml, suv39h1, mh2a1, mh2a.z, and nuc were up-regulated during winter in comparison with summer, whereas the mRNA levels of mh2a2, ubf1, and p80 were down-regulated. Also, the contents of molecules involved in processing the rRNA (snoRNAs) and pRNA, a stabilizer of NoRC complex, were analyzed, finding that these non-coding RNAs were not affected by seasonal acclimatization. These results suggest that variations in the expression of rRNA and the molecules that epigenetically regulate its synthesis are contributing to the muscle plasticity induced by seasonal acclimatization in carp. Copyright © 2014 Elsevier Inc. All rights reserved.
Genetic composition and connectivity of the Antillean manatee (Trichechus manatus manatus) in Panama

USGS Publications Warehouse

Díaz-Ferguson, Edgardo; Hunter, Margaret; Guzmán, Héctor M.

2017-01-01

Genetic diversity and haplotype composition of the West Indian manatee (Trichechus manatus) population from the San San Pond Sak wetland in Bocas del Toro, Panama was studied using a segment of mitochondrial DNA (D-loop). No genetic information has been published to date for Panamanian populations. Due to the secretive behavior and small population size of the species in the area, DNA extraction was conducted from opportunistically collected fecal (N=20), carcass tissue (N=4) and bone (N=4) samples. However, after DNA processing only 10 samples provided good quality DNA for sequencing (3 fecal, 4 tissue and 3 bone samples). We found three haplotypes in total; two of these haplotypes are reported for the first time, J02 (N=3) and J03 (N=4), and one, J01 (N=3), was previously published. Genetic diversity showed similar values to previous studies conducted in other Caribbean regions, with moderate values of nucleotide diversity (π= 0.00152) and haplotype diversity (Hd= 0.57). Connectivity assessment was based on sequence similarity, genetic distance and genetic differentiation between the San San population and other manatee populations previously studied. The J01 haplotype found in the Panamanian population is shared with populations in the Caribbean mainland and the Gulf of Mexico, showing low differentiation, corroborated by an Fst value of 0.0094 between HSSPS and this region. In contrast, comparisons between our sequences and populations in the Eastern Caribbean (South American populations) and North Western Caribbean showed fewer similarities (Fst =0.049 and 0.058, respectively). These results corroborate previous phylogeographic patterns already established for manatee populations and situate Panamanian populations into the Belize and Mexico cluster. In addition, these findings will be a baseline for future studies and comparisons with manatees in other areas of Panama and Central America. These results should be considered to inform management decisions regarding conservation of genetic diversity, future controlled introductions, connectivity and effective population size of the West Indian manatee along the Central American corridor.
Nullomers and High Order Nullomers in Genomic Sequences

PubMed Central

Vergni, Davide; Santoni, Daniele

2016-01-01

A nullomer is an oligomer that does not occur as a subsequence in a given DNA sequence, i.e. it is an absent word of that sequence. The importance of nullomers in several applications, from drug discovery to forensic practice, is now debated in the literature. Here, we investigated the nature of nullomers, whether their absence in genomes has just a statistical explanation or is a peculiar feature of genomic sequences. We introduced an extension of the notion of nullomer, namely high order nullomers, which are nullomers whose mutated sequences are still nullomers. We studied different aspects of them: comparison with nullomers of random sequences, CpG distribution and mean helical rise. In agreement with previous results we found that the number of nullomers in the human genome is much larger than expected by chance. Nevertheless antithetical results were found when considering a random DNA sequence preserving dinucleotide frequencies. The analysis of CpG frequencies in nullomers and high order nullomers revealed, as expected, a high CpG content but it also highlighted a strong dependence of CpG frequencies on the dinucleotide position, suggesting that nullomers have their own peculiar structure and are not simply sequences whose CpG frequency is biased. Furthermore, phylogenetic trees were built on eleven species based on both the similarities between the dinucleotide frequencies and the number of nullomers two species share, showing that nullomers are fairly conserved among close species. Finally the study of mean helical rise of nullomer sequences revealed significantly high mean rise values, reinforcing the hypothesis that those sequences have some peculiar structural features. The obtained results show that nullomers are the consequence of the peculiar structure of DNA (also including biased CpG frequency and CpG islands), so that the hypermutability model, also taking into account CpG islands, seems insufficient to explain the nullomer phenomenon. Finally, high order nullomers could emphasize those features that already make simple nullomers useful in several applications. PMID:27906971
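The two definitions in this abstract can be sketched in a few lines: nullomers are the k-mers absent from a sequence, and high order nullomers are nullomers whose mutated forms are also absent. The sketch below assumes that "mutated" means every single-base substitution and that only the given strand is scanned (no reverse complement); both are assumptions for illustration, and the paper's exact definitions may differ. All function names are hypothetical.

```python
from itertools import product

BASES = "ACGT"

def kmers_present(seq, k):
    """All k-mers over A/C/G/T that occur in the sequence."""
    seq = seq.upper()
    words = (seq[i:i + k] for i in range(len(seq) - k + 1))
    return {w for w in words if set(w) <= set(BASES)}

def nullomers(seq, k):
    """All k-mers that do not occur in the sequence (absent words)."""
    every = {"".join(p) for p in product(BASES, repeat=k)}
    return every - kmers_present(seq, k)

def single_substitutions(word):
    """Yield every word that differs from `word` at exactly one position."""
    for i, original in enumerate(word):
        for base in BASES:
            if base != original:
                yield word[:i] + base + word[i + 1:]

def high_order_nullomers(seq, k):
    """Nullomers whose single-substitution neighbours are all nullomers too."""
    absent = nullomers(seq, k)
    return {w for w in absent
            if all(n in absent for n in single_substitutions(w))}

if __name__ == "__main__":
    toy = "ACGT"  # toy sequence; present 2-mers are AC, CG and GT
    print(sorted(nullomers(toy, 2)))
    print(sorted(high_order_nullomers(toy, 2)))
```

Enumerating all 4^k words is practical only for small k; for the word lengths used on whole genomes a more compact representation would be needed.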
Quick identification of acetic acid bacteria based on nucleotide sequences of the 16S-23S rDNA internal transcribed spacer region and of the PQQ-dependent alcohol dehydrogenase gene.

PubMed

Trcek, Janja

2005-10-01

Acetic acid bacteria (AAB) are well known for oxidizing different ethanol-containing substrates into various types of vinegar. They are also used for production of some biotechnologically important products, such as sorbose and gluconic acids. However, their presence is not always appreciated since certain species also spoil wine, juice, beer and fruits. To be able to follow AAB in all these processes, the species involved must be identified accurately and quickly. Because phenotypic analysis of AAB is inaccurate and very time-consuming, the application of molecular methods is necessary. Since the pairwise comparison among the 16S rRNA gene sequences of AAB shows very high similarity (up to 99.9%), other DNA targets should be used. Our previous studies showed that restriction analysis of the 16S-23S rDNA internal transcribed spacer region is a suitable approach for quick affiliation of an acetic acid bacterium to a distinct group of restriction types and also for quick identification of a potentially novel species of acetic acid bacterium (Trcek & Teuber 2002; Trcek 2002). However, with the exception of two conserved genes, encoding tRNAIle and tRNAAla, the sequences of 16S-23S rDNA are highly divergent among AAB species. For this reason we analyzed in this study a gene encoding PQQ-dependent ADH as a possible DNA target. First we confirmed the expression of subunit I of PQQ-dependent ADH (AdhA) also in Asaia, the only genus of AAB which exhibits little or no ADH activity. Further, we analyzed the partial sequences of adhA among some representative species of the genera Acetobacter, Gluconobacter and Gluconacetobacter. The conserved and variable regions in these sequences made possible the construction of an A. aceti-specific oligonucleotide, the specificity of which was confirmed by PCR using 45 well-defined strains of AAB as DNA templates. The primer was also successfully used in direct identification of A. aceti from homemade cider vinegar as well as for revealing the misclassification of strain IFO 3283 into the species A. aceti.
Combination of the immunization with the sequence close to the consensus sequence and two DNA prime plus one VLP boost generate H5 hemagglutinin specific broad neutralizing antibodies

PubMed Central

Wang, Guiqin; Yin, Renfu; Zhou, Paul; Ding, Zhuang

2017-01-01

Hemagglutinin (HA) head has long been considered able to elicit only a narrow, strain-specific antibody response because it undergoes rapid antigenic drift. However, we previously showed that a heterologous prime-boost strategy, in which mice were primed twice with DNA encoding HA and boosted once with virus-like particles (VLP) from an H5N1 strain A/Thailand/1(KAN)-1/2004 (noted as TH DDV), induced an anti-head broad cross-H5 neutralizing antibody response. To explain why TH DDV immunization could generate such breadth, we systematically compared the neutralization breadth and potency between TH DDV sera and immune sera elicited by TH DDD (three DNA immunizations), TH VVV (three VLP immunizations), TH DV (one DNA prime plus one VLP boost) and TK DDV (plasmid DNA and VLP derived from another H5N1 strain, A/Turkey/65596/2006). Then we determined the antigenic sites (AS) on TH HA head and the key residues of the main antigenic site. Through the comparison of different regimens, we found that the combination of immunization with a sequence close to the consensus sequence and two DNA primes plus one VLP boost enabled TH DDV immunization to generate broad neutralizing antibodies. Antigenic analysis showed that TH DDV, TH DV, TH DDD and TH VVV sera recognize the common antigenic site AS1. Antibodies directed to AS1 contribute the largest proportion of the neutralizing activity of these immune sera. Residues 188 and 193 in AS1 are the key residues responsible for the neutralization breadth of the immune sera. Interestingly, residues 188 and 193 are located in classical antigenic sites but are relatively conserved among the 16 tested strains and 1,663 HA sequences from the NCBI database. Thus, our results strongly indicate that it is feasible to develop broad cross-H5 influenza vaccines against HA head. PMID:28542275
An alternative method for cDNA cloning from surrogate eukaryotic cells transfected with the corresponding genomic DNA.

PubMed

Hu, Lin-Yong; Cui, Chen-Chen; Song, Yu-Jie; Wang, Xiang-Guo; Jin, Ya-Ping; Wang, Ai-Hua; Zhang, Yong

2012-07-01

cDNA is widely used in gene function elucidation and/or transgenics research but often suitable tissues or cells from which to isolate mRNA for reverse transcription are unavailable. Here, an alternative method for cDNA cloning is described and tested by cloning the cDNA of human LALBA (human alpha-lactalbumin) from genomic DNA. First, genomic DNA containing all of the coding exons was cloned from human peripheral blood and inserted into a eukaryotic expression vector. Next, by delivering the plasmids into either 293T or fibroblast cells, surrogate cells were constructed. Finally, the total RNA was extracted from the surrogate cells and cDNA was obtained by RT-PCR. The human LALBA cDNA that was obtained was compared with the corresponding mRNA published in GenBank. The comparison showed that the two sequences were identical. The novel method for cDNA cloning from surrogate eukaryotic cells described here uses well-established techniques that are feasible and simple to use. We anticipate that this alternative method will have widespread applications.
Acquisition of New DNA Sequences After Infection of Chicken Cells with Avian Myeloblastosis Virus

PubMed Central

Shoyab, M.; Baluda, M. A.; Evans, R.

1974-01-01

DNA-RNA hybridization studies between 70S RNA from avian myeloblastosis virus (AMV) and an excess of DNA from (i) AMV-induced leukemic chicken myeloblasts or (ii) a mixture of normal and of congenitally infected K-137 chicken embryos producing avian leukosis viruses revealed the presence of fast- and slow-hybridizing virus-specific DNA sequences. However, the leukemic cells contained twice the level of AMV-specific DNA sequences observed in normal chicken embryonic cells. The fast-reacting sequences were two to three times more numerous in leukemic DNA than in DNA from the mixed embryos. The slow-reacting sequences had a reiteration frequency of approximately 9 and 6, in the two respective systems. Both the fast- and the slow-reacting DNA sequences in leukemic cells exhibited a higher Tm (2 C) than the respective DNA sequences in normal cells. In normal and leukemic cells the slow hybrid sequences appeared to have a Tm which was 2 C higher than that of the fast hybrid sequences. Individual non-virus-producing chicken embryos, either group-specific antigen positive or negative, contained 40 to 100 copies of the fast sequences and 2 to 6 copies of the slowly hybridizing sequences per cell genome. Normal rat cells did not contain DNA that hybridized with AMV RNA, whereas non-virus-producing rat cells transformed by B-77 avian sarcoma virus contained only the slowly reacting sequences. The results demonstrate that leukemic cells transformed by AMV contain new AMV-specific DNA sequences which were not present before infection. PMID:16789139
Phylogeny and evolution of the auks (subfamily Alcinae) based on mitochondrial DNA sequences

USGS Publications Warehouse

Moum, Truls; Johansen, Steinar; Erikstad, Kjell Einar; Piatt, John F.

1994-01-01

The genetic divergence and phylogeny of the auks was assessed by mitochondrial DNA sequence comparisons in a study using 19 of the 22 auk species and two outgroup representatives. We compared more than 500 nucleotides from each of two mitochondrial genes encoding 12S rRNA and the NADH dehydrogenase subunit 6. Divergence times were estimated from transversional substitutions. The dovekie (Alle alle) is related to the razorbill (Alca torda) and the murres (Uria spp). Furthermore, the Xantus's murrelet (Synthliboramphus hypoleucus) and the ancient (Synthliboramphus antiquus) and Japanese murrelets (Synthliboramphus wumizusume) are genetically distinct members of the same main lineage, whereas brachyramphine and synthliboramphine murrelets are not closely related. An early adaptive radiation of six main species groups of auks seems to trace back to Middle Miocene. Later speciation probably involved ecological differentiations and geographical isolations.
A streamlined collecting and preparation protocol for DNA barcoding of Lepidoptera as part of large-scale rapid biodiversity assessment projects, exemplified by the Indonesian Biodiversity Discovery and Information System (IndoBioSys).

PubMed

Schmidt, Olga; Hausmann, Axel; Cancian de Araujo, Bruno; Sutrisno, Hari; Peggie, Djunijanti; Schmidt, Stefan

2017-01-01

Here we present a general collecting and preparation protocol for DNA barcoding of Lepidoptera as part of large-scale rapid biodiversity assessment projects, and a comparison with alternative preserving and vouchering methods. About 98% of the sequenced specimens processed using the present collecting and preparation protocol yielded sequences with more than 500 base pairs. The study is based on the first outcomes of the Indonesian Biodiversity Discovery and Information System (IndoBioSys). IndoBioSys is a German-Indonesian research project that is conducted by the Museum für Naturkunde in Berlin and the Zoologische Staatssammlung München, in close cooperation with the Research Center for Biology - Indonesian Institute of Sciences (RCB-LIPI, Bogor).
Comparison of next generation sequencing technologies for transcriptome characterization

PubMed Central

2009-01-01

Background: We have developed a simulation approach to help determine the optimal mixture of sequencing methods for most complete and cost effective transcriptome sequencing. We compared simulation results for traditional capillary sequencing with "Next Generation" (NG) ultra high-throughput technologies. The simulation model was parameterized using mappings of 130,000 cDNA sequence reads to the Arabidopsis genome (NCBI Accession SRA008180.19). We also generated 454-GS20 sequences and de novo assemblies for the basal eudicot California poppy (Eschscholzia californica) and the magnoliid avocado (Persea americana) using a variety of methods for cDNA synthesis. Results: The Arabidopsis reads tagged more than 15,000 genes, including new splice variants and extended UTR regions. Of the total 134,791 reads (13.8 MB), 119,518 (88.7%) mapped exactly to known exons, while 1,117 (0.8%) mapped to introns, 11,524 (8.6%) spanned annotated intron/exon boundaries, and 3,066 (2.3%) extended beyond the end of annotated UTRs. Sequence-based inference of relative gene expression levels correlated significantly with microarray data. As expected, NG sequencing of normalized libraries tagged more genes than non-normalized libraries, although non-normalized libraries yielded more full-length cDNA sequences. The Arabidopsis data were used to simulate additional rounds of NG and traditional EST sequencing, and various combinations of each. Our simulations suggest a combination of FLX and Solexa sequencing for optimal transcriptome coverage at modest cost. We have also developed ESTcalc (http://fgp.huck.psu.edu/NG_Sims/ngsim.pl), an online webtool, which allows users to explore the results of this study by specifying individualized costs and sequencing characteristics. Conclusion: NG sequencing technologies are a highly flexible set of platforms that can be scaled to suit different project goals. In terms of sequence coverage alone, the NG sequencing is a dramatic advance over capillary-based sequencing, but NG sequencing also presents significant challenges in assembly and sequence accuracy due to short read lengths, method-specific sequencing errors, and the absence of physical clones. These problems may be overcome by hybrid sequencing strategies using a mixture of sequencing methodologies, by new assemblers, and by sequencing more deeply. Sequencing and microarray outcomes from multiple experiments suggest that our simulator will be useful for guiding NG transcriptome sequencing projects in a wide range of organisms. PMID:19646272
BASIC: A Simple and Accurate Modular DNA Assembly Method.

PubMed

Storch, Marko; Casini, Arturo; Mackrow, Ben; Ellis, Tom; Baldwin, Geoff S

2017-01-01

Biopart Assembly Standard for Idempotent Cloning (BASIC) is a simple, accurate, and robust DNA assembly method. The method is based on linker-mediated DNA assembly and provides highly accurate DNA assembly with 99 % correct assemblies for four parts and 90 % correct assemblies for seven parts [1]. The BASIC standard defines a single entry vector for all parts flanked by the same prefix and suffix sequences and its idempotent nature means that the assembled construct is returned in the same format. Once a part has been adapted into the BASIC format it can be placed at any position within a BASIC assembly without the need for reformatting. This allows laboratories to grow comprehensive and universal part libraries and to share them efficiently. The modularity within the BASIC framework is further extended by the possibility of encoding ribosomal binding sites (RBS) and peptide linker sequences directly on the linkers used for assembly. This makes BASIC a highly versatile library construction method for combinatorial part assembly including the construction of promoter, RBS, gene variant, and protein-tag libraries. In comparison with other DNA assembly standards and methods, BASIC offers a simple robust protocol; it relies on a single entry vector, provides for easy hierarchical assembly, and is highly accurate for up to seven parts per assembly round [2].
Translocation-coupled DNA cleavage by the Type ISP restriction-modification enzymes

PubMed Central

Chand, Mahesh Kumar; Nirwan, Neha; Diffin, Fiona M.; van Aelst, Kara; Kulkarni, Manasi; Pernstich, Christian; Szczelkun, Mark D.; Saikrishnan, Kayarat

2015-01-01

Endonucleolytic double-strand DNA break production requires separate strand cleavage events. Although catalytic mechanisms for simple dimeric endonucleases are available, there are many complex nuclease machines which are poorly understood in comparison. Here we studied the single polypeptide Type ISP restriction-modification (RM) enzymes, which cleave random DNA between distant target sites when two enzymes collide following convergent ATP-driven translocation. We report the 2.7 Angstroms resolution X-ray crystal structure of a Type ISP enzyme-DNA complex, revealing that both the helicase-like ATPase and nuclease are unexpectedly located upstream of the direction of translocation, inconsistent with simple nuclease domain-dimerization. Using single-molecule and biochemical techniques, we demonstrate that each ATPase remodels its DNA-protein complex and translocates along DNA without looping it, leading to a collision complex where the nuclease domains are distal. Sequencing of single cleavage events suggests a previously undescribed endonuclease model, where multiple, stochastic strand nicking events combine to produce DNA scission. PMID:26389736
Detection and quantitation of single nucleotide polymorphisms, DNA sequence variations, DNA mutations, DNA damage and DNA mismatches

DOEpatents

McCutchen-Maloney, Sandra L.

2002-01-01

DNA mutation binding proteins alone and as chimeric proteins with nucleases are used with solid supports to detect DNA sequence variations, DNA mutations and single nucleotide polymorphisms. The solid supports may be flow cytometry beads, DNA chips, glass slides or DNA dipsticks. DNA molecules are coupled to solid supports to form DNA-support complexes. Labeled DNA is used with unlabeled DNA mutation binding proteins such as TthMutS to detect DNA sequence variations, DNA mutations and single nucleotide length polymorphisms by binding, which gives an increase in signal. Unlabeled DNA is utilized with labeled chimeras to detect DNA sequence variations, DNA mutations and single nucleotide length polymorphisms by nuclease activity of the chimera, which gives a decrease in signal.
VIP Barcoding: composition vector-based software for rapid species identification based on DNA barcoding.

PubMed

Fan, Long; Hui, Jerome H L; Yu, Zu Guo; Chu, Ka Hou

2014-07-01

Species identification based on short sequences of DNA markers, that is, DNA barcoding, has emerged as an integral part of modern taxonomy. However, software for the analysis of large and multilocus barcoding data sets is scarce. The Basic Local Alignment Search Tool (BLAST) is currently the fastest tool capable of handling large databases (e.g. >5000 sequences), but its accuracy is a concern and it has been criticized for its local optimization. More accurate software currently requires sequence alignment or complex calculations, which are time-consuming when dealing with large data sets during data preprocessing or during the search stage. Therefore, it is imperative to develop a practical program for both accurate and scalable species identification for DNA barcoding. In this context, we present VIP Barcoding: user-friendly software with a graphical user interface for rapid DNA barcoding. It adopts a hybrid, two-stage algorithm. First, an alignment-free composition vector (CV) method is utilized to reduce the search space by screening a reference database. The alignment-based K2P distance nearest-neighbour method is then employed to analyse the smaller data set generated in the first stage. In comparison with other software, we demonstrate that VIP Barcoding has (i) higher accuracy than Blastn and several alignment-free methods and (ii) higher scalability than alignment-based distance methods and character-based methods. These results suggest that this platform is able to deal with both large-scale and multilocus barcoding data with accuracy and can contribute to DNA barcoding for modern taxonomy. VIP Barcoding is free and available at http://msl.sls.cuhk.edu.hk/vipbarcoding/. © 2014 John Wiley & Sons Ltd.
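The two-stage idea described in this abstract (an alignment-free screen followed by a K2P nearest-neighbour search on the shortlist) can be illustrated with a minimal Python sketch. This is not the VIP Barcoding implementation: the function names, the plain k-mer frequency vector (a simplified stand-in for the published composition vector method), the cosine distance, and the assumption that the query and reference sequences are already aligned and of equal length are all illustrative choices made here.

```python
from collections import Counter
from itertools import product
from math import log, sqrt

PURINES = {"A", "G"}

def kmer_vector(seq, k=3):
    """Frequencies of all 4**k DNA k-mers (simplified stand-in for a composition vector)."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = max(sum(counts.values()), 1)
    return [counts["".join(p)] / total for p in product("ACGT", repeat=k)]

def cosine_distance(u, v):
    """1 - cosine similarity; returns 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm if norm else 1.0

def k2p_distance(s1, s2):
    """Kimura 2-parameter distance for two aligned sequences of equal length."""
    pairs = [(a, b) for a, b in zip(s1, s2) if a in "ACGT" and b in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    n = len(pairs)
    # Transition: purine<->purine or pyrimidine<->pyrimidine; transversion otherwise.
    p = sum(1 for a, b in pairs if a != b and (a in PURINES) == (b in PURINES)) / n
    q = sum(1 for a, b in pairs if a != b and (a in PURINES) != (b in PURINES)) / n
    a_term, b_term = 1 - 2 * p - q, 1 - 2 * q
    if a_term <= 0 or b_term <= 0:
        return float("inf")  # sequences too divergent for the K2P correction
    return -0.5 * log(a_term * sqrt(b_term))

def identify(query, references, shortlist=2, k=3):
    """Stage 1: keep the `shortlist` closest references by k-mer profile.
    Stage 2: return the shortlisted reference with the smallest K2P distance."""
    qv = kmer_vector(query, k)
    ranked = sorted(references,
                    key=lambda name: cosine_distance(qv, kmer_vector(references[name], k)))
    candidates = ranked[:shortlist]
    best = min(candidates, key=lambda name: k2p_distance(query, references[name]))
    return best, k2p_distance(query, references[best])

# Hypothetical toy data, not taken from the paper.
references = {
    "sp_A": "ACGTACGTACGTACGTACGT",
    "sp_B": "ACGTACGTACGAACGTACGT",
    "sp_C": "TTTTGGGGCCCCAAAATTTT",
}
print(identify("ACGTACGTACGTACGTACGT", references))
```

The screening stage only has to be cheap and roughly right, since the more expensive alignment-based distance is computed for the shortlisted candidates alone.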
Mitochondrion-to-Chloroplast DNA Transfers and Intragenomic Proliferation of Chloroplast Group II Introns in Gloeotilopsis Green Algae (Ulotrichales, Ulvophyceae).

PubMed

Turmel, Monique; Otis, Christian; Lemieux, Claude

2016-09-19

To probe organelle genome evolution in the Ulvales/Ulotrichales clade, the newly sequenced chloroplast and mitochondrial genomes of Gloeotilopsis planctonica and Gloeotilopsis sarcinoidea (Ulotrichales) were compared with those of Pseudendoclonium akinetum (Ulotrichales) and of the few other green algae previously sampled in the Ulvophyceae. At 105,236 bp, the G. planctonica mitochondrial DNA (mtDNA) is the largest mitochondrial genome reported so far among chlorophytes, whereas the 221,431-bp G. planctonica and 262,888-bp G. sarcinoidea chloroplast DNAs (cpDNAs) are the largest chloroplast genomes analyzed among the Ulvophyceae. Gains of non-coding sequences largely account for the expansion of these genomes. Both Gloeotilopsis cpDNAs lack the inverted repeat (IR) typically found in green plants, indicating that two independent IR losses occurred in the Ulvales/Ulotrichales. Our comparison of the Pseudendoclonium and Gloeotilopsis cpDNAs offered clues regarding the mechanism of IR loss in the Ulotrichales, suggesting that internal sequences from the rDNA operon were differentially lost from the two original IR copies during this process. Our analyses also unveiled a number of genetic novelties. Short mtDNA fragments were discovered in two distinct regions of the G. sarcinoidea cpDNA, providing the first evidence for intracellular inter-organelle gene migration in green algae. We identified for the first time in green algal organelles, group II introns with LAGLIDADG ORFs as well as group II introns inserted into untranslated gene regions. We discovered many group II introns occupying sites not previously documented for the chloroplast genome and demonstrated that a number of them arose by intragenomic proliferation, most likely through retrohoming. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Mitochondrion-to-Chloroplast DNA Transfers and Intragenomic Proliferation of Chloroplast Group II Introns in Gloeotilopsis Green Algae (Ulotrichales, Ulvophyceae)

PubMed Central

Turmel, Monique; Otis, Christian; Lemieux, Claude

2016-01-01

To probe organelle genome evolution in the Ulvales/Ulotrichales clade, the newly sequenced chloroplast and mitochondrial genomes of Gloeotilopsis planctonica and Gloeotilopsis sarcinoidea (Ulotrichales) were compared with those of Pseudendoclonium akinetum (Ulotrichales) and of the few other green algae previously sampled in the Ulvophyceae. At 105,236 bp, the G. planctonica mitochondrial DNA (mtDNA) is the largest mitochondrial genome reported so far among chlorophytes, whereas the 221,431-bp G. planctonica and 262,888-bp G. sarcinoidea chloroplast DNAs (cpDNAs) are the largest chloroplast genomes analyzed among the Ulvophyceae. Gains of non-coding sequences largely account for the expansion of these genomes. Both Gloeotilopsis cpDNAs lack the inverted repeat (IR) typically found in green plants, indicating that two independent IR losses occurred in the Ulvales/Ulotrichales. Our comparison of the Pseudendoclonium and Gloeotilopsis cpDNAs offered clues regarding the mechanism of IR loss in the Ulotrichales, suggesting that internal sequences from the rDNA operon were differentially lost from the two original IR copies during this process. Our analyses also unveiled a number of genetic novelties. Short mtDNA fragments were discovered in two distinct regions of the G. sarcinoidea cpDNA, providing the first evidence for intracellular inter-organelle gene migration in green algae. We identified for the first time in green algal organelles, group II introns with LAGLIDADG ORFs as well as group II introns inserted into untranslated gene regions. We discovered many group II introns occupying sites not previously documented for the chloroplast genome and demonstrated that a number of them arose by intragenomic proliferation, most likely through retrohoming. PMID:27503298
Predicting nuclear gene coalescence from mitochondrial data: the three-times rule.

PubMed

Palumbi, S R; Cipriano, F; Hare, M P

2001-05-01

Coalescence theory predicts when genetic drift at nuclear loci will result in fixation of sequence differences to produce monophyletic gene trees. However, the theory is difficult to apply to particular taxa because it hinges on genetically effective population size, which is generally unknown. Neutral theory also predicts that evolution of monophyly will be four times slower in nuclear than in mitochondrial genes primarily because genetic drift is slower at nuclear loci. Variation in mitochondrial DNA (mtDNA) within and between species has been studied extensively, but can these mtDNA data be used to predict coalescence in nuclear loci? Comparison of neutral theories of coalescence of mitochondrial and nuclear loci suggests a simple rule of thumb. The "three-times rule" states that, on average, most nuclear loci will be monophyletic when the branch length leading to the mtDNA sequences of a species is three times longer than the average mtDNA sequence diversity observed within that species. A test using mitochondrial and nuclear intron data from seven species of whales and dolphins suggests general agreement with predictions of the three-times rule. We define the coalescence ratio as the mitochondrial branch length for a species divided by intraspecific mtDNA diversity. We show that species with high coalescence ratios show nuclear monophyly, whereas species with low ratios have polyphyletic nuclear gene trees. As expected, species with intermediate coalescence ratios show a variety of patterns. Especially at very high or low coalescence ratios, the three-times rule predicts nuclear gene patterns that can help detect the action of selection. The three-times rule may be useful as an empirical benchmark for evaluating evolutionary processes occurring at multiple loci.
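The coalescence ratio defined in this abstract is a single division, and the three-times rule is a threshold on it. The short Python sketch below shows the arithmetic only; the function names and the example values are hypothetical and are not data from the study, and treating the rule as a hard cut-off at 3 is a simplification, since the abstract notes that intermediate ratios show a variety of patterns.

```python
def coalescence_ratio(branch_length, within_species_diversity):
    """Mitochondrial branch length of a species divided by its intraspecific mtDNA diversity.
    Both arguments must be in the same units (e.g. percent sequence divergence)."""
    if within_species_diversity <= 0:
        raise ValueError("within-species diversity must be positive")
    return branch_length / within_species_diversity

def meets_three_times_rule(ratio, threshold=3.0):
    """True when the ratio reaches the threshold at which, on average,
    most nuclear loci are predicted to be monophyletic."""
    return ratio >= threshold

# Hypothetical values in percent sequence divergence.
for label, branch, diversity in [("species X", 6.0, 1.5), ("species Y", 2.0, 1.0)]:
    ratio = coalescence_ratio(branch, diversity)
    print(label, ratio, meets_three_times_rule(ratio))
```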
Nucleotide sequence determination of guinea-pig casein B mRNA reveals homology with bovine and rat alpha s1 caseins and conservation of the non-coding regions of the mRNA.

PubMed Central

Hall, L; Laird, J E; Craig, R K

1984-01-01

Nucleotide sequence analysis of cloned guinea-pig casein B cDNA sequences has identified two casein B variants related to the bovine and rat alpha s1 caseins. Amino acid homology was largely confined to the known bovine or predicted rat phosphorylation sites and within the 'signal' precursor sequence. Comparison of the deduced nucleotide sequence of the guinea-pig and rat alpha s1 casein mRNA species showed greater sequence conservation in the non-coding than in the coding regions, suggesting a functional and possibly regulatory role for the non-coding regions of casein mRNA. The results provide insight into the evolution of the casein genes, and raise questions as to the role of conserved nucleotide sequences within the non-coding regions of mRNA species. PMID:6548375
