dna sequence present: Topics by Science.gov

Sample records for dna sequence present

Quantitative DNA fiber mapping

DOEpatents

Gray, Joe W.; Weier, Heinz-Ulrich G.

1998-01-01

The present invention relates generally to DNA mapping and sequencing technologies. In particular, the present invention provides enhanced methods and compositions for the physical mapping and positional cloning of genomic DNA. The present invention also provides a useful analytical technique to directly map cloned DNA sequences onto individual stretched DNA molecules.
Scar-less multi-part DNA assembly design automation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hillson, Nathan J.

The present invention provides a method of designing an implementation of a DNA assembly. In an exemplary embodiment, the method includes (1) receiving a list of DNA sequence fragments to be assembled together and an order in which to assemble the DNA sequence fragments, (2) designing DNA oligonucleotides (oligos) for each of the DNA sequence fragments, and (3) creating a plan for adding flanking homology sequences to each of the DNA oligos. In an exemplary embodiment, the method includes (1) receiving a list of DNA sequence fragments to be assembled together and an order in which to assemble the DNA sequence fragments, (2) designing DNA oligonucleotides (oligos) for each of the DNA sequence fragments, and (3) creating a plan for adding optimized overhang sequences to each of the DNA oligos.
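The idea of adding flanking homology sequences to ordered fragments can be illustrated with a minimal Python sketch. This is not the patented design method: the function name `add_flanking_homology`, the `flank_len` and `circular` parameters, and the rule of copying flanks directly from neighbouring fragments are all assumptions made for illustration.

```python
def add_flanking_homology(fragments, flank_len=4, circular=False):
    """Extend each fragment with flanks copied from its neighbours.

    Hypothetical helper: the left flank is the tail of the previous
    fragment and the right flank is the head of the next fragment,
    so adjacent designed parts share an overlap of 2 * flank_len bases.
    """
    n = len(fragments)
    extended = []
    for i, frag in enumerate(fragments):
        left = right = ""
        if i > 0 or circular:
            left = fragments[(i - 1) % n][-flank_len:]
        if i < n - 1 or circular:
            right = fragments[(i + 1) % n][:flank_len]
        extended.append(left + frag + right)
    return extended


# Three made-up fragments in their intended assembly order.
parts = ["ATGGCCAAAT", "GGGTTTCCCA", "TTAACCGGTT"]
designed = add_flanking_homology(parts, flank_len=4)
for seq in designed:
    print(seq)
```

For a linear assembly, the first fragment receives no left flank and the last receives no right flank; passing `circular=True` wraps the ends, as would be needed for a plasmid-like product.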
A comprehensive list of cloned human DNA sequences

PubMed Central

Schmidtke, Jörg; Cooper, David N.

1987-01-01

A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:3575113
A comprehensive list of cloned human DNA sequences

PubMed Central

Schmidtke, Jörg; Cooper, David N.

1990-01-01

A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:2333227
A comprehensive list of cloned human DNA sequences

PubMed Central

Schmidtke, Jörg; Cooper, David N.

1988-01-01

A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:3368330
A comprehensive list of cloned human DNA sequences

PubMed Central

Schmidtke, Jörg; Cooper, David N.

1989-01-01

A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:2654889
Short, interspersed, and repetitive DNA sequences in Spiroplasma species.

PubMed

Nur, I; LeBlanc, D J; Tully, J G

1987-03-01

Small fragments of DNA from an 8-kbp plasmid, pRA1, from a plant pathogenic strain of Spiroplasma citri were shown previously to be present in the chromosomal DNA of at least two species of Spiroplasma. We describe here the shot-gun cloning of chromosomal DNA from S. citri Maroc and the identification of two distinct sequences exhibiting homology to pRA1. Further subcloning experiments provided specific molecular probes for the identification of these two sequences in chromosomal DNA from three distinct plant pathogenic species of Spiroplasma. The results of Southern blot hybridization indicated that each of the pRA1-associated sequences is present as multiple copies in short, dispersed, and repetitive sequences in the chromosomes of these three strains. None of the sequences was detectable in chromosomal DNA from an additional nine Spiroplasma strains examined.
A Case Study into Microbial Genome Assembly Gap Sequences and Finishing Strategies.

PubMed

Utturkar, Sagar M; Klingeman, Dawn M; Hurt, Richard A; Brown, Steven D

2017-01-01

This study characterized regions of DNA which remained unassembled by either PacBio or Illumina sequencing technologies for seven bacterial genomes. Two genomes were manually finished using bioinformatics and PCR/Sanger sequencing approaches and regions not assembled by automated software were analyzed. Gaps present within Illumina assemblies mostly correspond to repetitive DNA regions such as multiple rRNA operon sequences. PacBio gap sequences were evaluated for several properties such as GC content, read coverage, gap length, ability to form strong secondary structures, and corresponding annotations. Our hypothesis that strong secondary DNA structures blocked DNA polymerases and contributed to gap sequences was not accepted. PacBio assemblies had few limitations overall and gaps were explained as the cumulative effect of lower than average sequence coverage and repetitive sequences at contig termini. An important aspect of the present study is the compilation of biological features that interfered with assembly and included active transposons, multiple plasmid sequences, phage DNA integration, and large sequence duplication. Our targeted genome finishing approach and systematic evaluation of the unassembled DNA will be useful for others looking to close, finish, and polish microbial genome sequences.
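One of the gap properties named above, GC content, is simple to compute. The following Python sketch is an illustration only, not the authors' pipeline; the function names, the window size, and the example sequence are assumptions.

```python
def gc_content(seq):
    """Fraction of G and C bases in seq (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc(seq, window, step):
    """GC content of each full window of length `window`, moved by `step`."""
    return [
        gc_content(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]


# Made-up example: a GC-rich half followed by an AT-rich half.
example = "GGCCAATT"
print(gc_content(example))         # overall GC fraction
print(windowed_gc(example, 4, 4))  # per-window GC fraction
```

A windowed profile of this kind makes it possible to compare the composition of a gap region with its flanking sequence rather than relying on a single genome-wide average.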
Extending the spectrum of DNA sequences retrieved from ancient bones and teeth

PubMed Central

Glocke, Isabelle; Meyer, Matthias

2017-01-01

The number of DNA fragments surviving in ancient bones and teeth is known to decrease with fragment length. Recent genetic analyses of Middle Pleistocene remains have shown that the recovery of extremely short fragments can prove critical for successful retrieval of sequence information from particularly degraded ancient biological material. Current sample preparation techniques, however, are not optimized to recover DNA sequences from fragments shorter than ∼35 base pairs (bp). Here, we show that much shorter DNA fragments are present in ancient skeletal remains but lost during DNA extraction. We present a refined silica-based DNA extraction method that not only enables efficient recovery of molecules as short as 25 bp but also doubles the yield of sequences from longer fragments due to improved recovery of molecules with single-strand breaks. Furthermore, we present strategies for monitoring inefficiencies in library preparation that may result from co-extraction of inhibitory substances during DNA extraction. The combination of DNA extraction and library preparation techniques described here substantially increases the yield of DNA sequences from ancient remains and provides access to a yet unexploited source of highly degraded DNA fragments. Our work may thus open the door for genetic analyses on even older material. PMID:28408382
Fabrication and characterization of a solid state nanopore with self-aligned carbon nanoelectrodes for molecular detection

NASA Astrophysics Data System (ADS)

Spinney, Patrick; Collins, Scott D.; Howitt, David G.; Smith, Rosemary L.

2012-06-01

Rapid and cost-effective DNA sequencing is a pivotal prerequisite for the genomics era. Many of the recent advances in forensics, medicine, agriculture, taxonomy, and drug discovery have paralleled critical advances in DNA sequencing technology. Nanopore modalities for DNA sequencing have recently surfaced including the electrical interrogation of protein ion channels and/or solid-state nanopores during translocation of DNA. However, to date, most of this work has met with mixed success. In this work, we present a unique nanofabrication strategy that realizes an artificial nanopore articulated with carbon electrodes to sense the current modulations during the transport of DNA through the nanopore. This embodiment overcomes most of the technical difficulties inherent in other artificial nanopore embodiments and presents a versatile platform for the testing of DNA single nucleotide detection. Characterization of the device using gold nanoparticles, silica nanoparticles, lambda dsDNA and 16-mer ssDNA is presented. Although single molecule DNA sequencing is still not demonstrated, the device shows a path towards this goal.
SNP discovery through de novo deep sequencing using the next generation of DNA sequencers

USDA-ARS's Scientific Manuscript database

The production of high volumes of DNA sequence data using new technologies has permitted more efficient identification of single nucleotide polymorphisms in vertebrate genomes. This chapter presented practical methodology for production and analysis of DNA sequence data for SNP discovery....
A Case Study into Microbial Genome Assembly Gap Sequences and Finishing Strategies

DOE Office of Scientific and Technical Information (OSTI.GOV)

Utturkar, Sagar M.; Klingeman, Dawn M.; Hurt, Jr., Richard A.

This study characterized regions of DNA which remained unassembled by either PacBio or Illumina sequencing technologies for seven bacterial genomes. Two genomes were manually finished using bioinformatics and PCR/Sanger sequencing approaches and regions not assembled by automated software were analyzed. Gaps present within Illumina assemblies mostly correspond to repetitive DNA regions such as multiple rRNA operon sequences. PacBio gap sequences were evaluated for several properties such as GC content, read coverage, gap length, ability to form strong secondary structures, and corresponding annotations. Our hypothesis that strong secondary DNA structures blocked DNA polymerases and contributed to gap sequences was not accepted. PacBio assemblies had few limitations overall and gaps were explained as the cumulative effect of lower than average sequence coverage and repetitive sequences at contig termini. An important aspect of the present study is the compilation of biological features that interfered with assembly and included active transposons, multiple plasmid sequences, phage DNA integration, and large sequence duplication. Furthermore, our targeted genome finishing approach and systematic evaluation of the unassembled DNA will be useful for others looking to close, finish, and polish microbial genome sequences.
A Case Study into Microbial Genome Assembly Gap Sequences and Finishing Strategies

DOE PAGES

Utturkar, Sagar M.; Klingeman, Dawn M.; Hurt, Jr., Richard A.; ...

2017-07-18

This study characterized regions of DNA which remained unassembled by either PacBio or Illumina sequencing technologies for seven bacterial genomes. Two genomes were manually finished using bioinformatics and PCR/Sanger sequencing approaches and regions not assembled by automated software were analyzed. Gaps present within Illumina assemblies mostly correspond to repetitive DNA regions such as multiple rRNA operon sequences. PacBio gap sequences were evaluated for several properties such as GC content, read coverage, gap length, ability to form strong secondary structures, and corresponding annotations. Our hypothesis that strong secondary DNA structures blocked DNA polymerases and contributed to gap sequences was not accepted. PacBio assemblies had few limitations overall and gaps were explained as the cumulative effect of lower than average sequence coverage and repetitive sequences at contig termini. An important aspect of the present study is the compilation of biological features that interfered with assembly and included active transposons, multiple plasmid sequences, phage DNA integration, and large sequence duplication. Furthermore, our targeted genome finishing approach and systematic evaluation of the unassembled DNA will be useful for others looking to close, finish, and polish microbial genome sequences.
A Case Study into Microbial Genome Assembly Gap Sequences and Finishing Strategies

PubMed Central

Utturkar, Sagar M.; Klingeman, Dawn M.; Hurt, Richard A.; Brown, Steven D.

2017-01-01

This study characterized regions of DNA which remained unassembled by either PacBio or Illumina sequencing technologies for seven bacterial genomes. Two genomes were manually finished using bioinformatics and PCR/Sanger sequencing approaches and regions not assembled by automated software were analyzed. Gaps present within Illumina assemblies mostly correspond to repetitive DNA regions such as multiple rRNA operon sequences. PacBio gap sequences were evaluated for several properties such as GC content, read coverage, gap length, ability to form strong secondary structures, and corresponding annotations. Our hypothesis that strong secondary DNA structures blocked DNA polymerases and contributed to gap sequences was not accepted. PacBio assemblies had few limitations overall and gaps were explained as the cumulative effect of lower than average sequence coverage and repetitive sequences at contig termini. An important aspect of the present study is the compilation of biological features that interfered with assembly and included active transposons, multiple plasmid sequences, phage DNA integration, and large sequence duplication. Our targeted genome finishing approach and systematic evaluation of the unassembled DNA will be useful for others looking to close, finish, and polish microbial genome sequences. PMID:28769883
Complete mitochondrial genome sequence of a Middle Pleistocene cave bear reconstructed from ultrashort DNA fragments.

PubMed

Dabney, Jesse; Knapp, Michael; Glocke, Isabelle; Gansauge, Marie-Theres; Weihmann, Antje; Nickel, Birgit; Valdiosera, Cristina; García, Nuria; Pääbo, Svante; Arsuaga, Juan-Luis; Meyer, Matthias

2013-09-24

Although an inverse relationship is expected in ancient DNA samples between the number of surviving DNA fragments and their length, ancient DNA sequencing libraries are strikingly deficient in molecules shorter than 40 bp. We find that a loss of short molecules can occur during DNA extraction and present an improved silica-based extraction protocol that enables their efficient retrieval. In combination with single-stranded DNA library preparation, this method enabled us to reconstruct the mitochondrial genome sequence from a Middle Pleistocene cave bear (Ursus deningeri) bone excavated at Sima de los Huesos in the Sierra de Atapuerca, Spain. Phylogenetic reconstructions indicate that the U. deningeri sequence forms an early diverging sister lineage to all Western European Late Pleistocene cave bears. Our results prove that authentic ancient DNA can be preserved for hundreds of thousands of years outside of permafrost. Moreover, the techniques presented enable the retrieval of phylogenetically informative sequences from samples in which virtually all DNA is diminished to fragments shorter than 50 bp.
Complete mitochondrial genome sequence of a Middle Pleistocene cave bear reconstructed from ultrashort DNA fragments

PubMed Central

Dabney, Jesse; Knapp, Michael; Glocke, Isabelle; Gansauge, Marie-Theres; Weihmann, Antje; Nickel, Birgit; Valdiosera, Cristina; García, Nuria; Pääbo, Svante; Arsuaga, Juan-Luis; Meyer, Matthias

2013-01-01

Although an inverse relationship is expected in ancient DNA samples between the number of surviving DNA fragments and their length, ancient DNA sequencing libraries are strikingly deficient in molecules shorter than 40 bp. We find that a loss of short molecules can occur during DNA extraction and present an improved silica-based extraction protocol that enables their efficient retrieval. In combination with single-stranded DNA library preparation, this method enabled us to reconstruct the mitochondrial genome sequence from a Middle Pleistocene cave bear (Ursus deningeri) bone excavated at Sima de los Huesos in the Sierra de Atapuerca, Spain. Phylogenetic reconstructions indicate that the U. deningeri sequence forms an early diverging sister lineage to all Western European Late Pleistocene cave bears. Our results prove that authentic ancient DNA can be preserved for hundreds of thousands of years outside of permafrost. Moreover, the techniques presented enable the retrieval of phylogenetically informative sequences from samples in which virtually all DNA is diminished to fragments shorter than 50 bp. PMID:24019490
Sequence and Structure Dependent DNA-DNA Interactions

NASA Astrophysics Data System (ADS)

Kopchick, Benjamin; Qiu, Xiangyun

Molecular forces between dsDNA strands are largely dominated by electrostatics and have been extensively studied. Quantitative knowledge has been accumulated on how DNA-DNA interactions are modulated by varied biological constituents such as ions, cationic ligands, and proteins. Despite its central role in biology, the sequence of DNA has not received substantial attention and "random" DNA sequences are typically used in biophysical studies. However, ~50% of the human genome is composed of non-random-sequence DNAs, particularly repetitive sequences. Furthermore, covalent modifications of DNA such as methylation play key roles in gene functions. Such DNAs with specific sequences or modifications often take on structures other than the canonical B-form. Here we present a series of quantitative measurements of the DNA-DNA forces with the osmotic stress method on different DNA sequences, from short repeats to the most frequent sequences in the genome, and to modifications such as bromination and methylation. We observe peculiar behaviors that appear to be strongly correlated with the incurred structural changes. We speculate on the causalities in terms of the differences in hydration shell and DNA surface structures.
Inaugural Genomics Automation Congress and the coming deluge of sequencing data.

PubMed

Creighton, Chad J

2010-10-01

Presentations at Select Biosciences's first 'Genomics Automation Congress' (Boston, MA, USA) in 2010 focused on next-generation sequencing and the platforms and methodology around them. The meeting provided an overview of sequencing technologies, both new and emerging. Speakers shared their recent work on applying sequencing to profile cells for various levels of biomolecular complexity, including DNA sequences, DNA copy, DNA methylation, mRNA and microRNA. With sequencing time and costs continuing to drop dramatically, a virtual explosion of very large sequencing datasets is at hand, which will probably present challenges and opportunities for high-level data analysis and interpretation, as well as for information technology infrastructure.
Advances in high throughput DNA sequence data compression.

PubMed

Sardaraz, Muhammad; Tahir, Muhammad; Ikram, Ataul Aziz

2016-06-01

Advances in high throughput sequencing technologies and reduction in cost of sequencing have led to exponential growth in high throughput DNA sequence data. This growth has posed challenges such as storage, retrieval, and transmission of sequencing data. Data compression is used to cope with these challenges. Various methods have been developed to compress genomic and sequencing data. In this article, we present a comprehensive review of compression methods for genome and reads compression. Algorithms are categorized as referential or reference free. Experimental results and comparative analysis of various methods for data compression are presented. Finally, key challenges and research directions in DNA sequence data compression are highlighted.
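As a point of reference for the reference-free category, the simplest baseline is to store each of the four bases in two bits. The Python sketch below illustrates only that baseline and is not one of the reviewed methods; the function names `pack` and `unpack` are hypothetical, and the sketch assumes the input contains only A, C, G and T.

```python
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"


def pack(seq):
    """Pack an ACGT string into bytes at 2 bits per base.

    Returns (length, data); the length is needed to decode because
    leading 'A' bases are indistinguishable from zero padding.
    """
    value = 0
    for base in seq:
        value = (value << 2) | CODE[base]
    n_bytes = (2 * len(seq) + 7) // 8
    return len(seq), value.to_bytes(n_bytes, "big")


def unpack(length, data):
    """Inverse of pack: recover the original ACGT string."""
    value = int.from_bytes(data, "big")
    bases = []
    for shift in range(2 * (length - 1), -2, -2):
        bases.append(BASES[(value >> shift) & 3])
    return "".join(bases)


read = "ACGTTGCA"
length, data = pack(read)
print(len(read), "bases stored in", len(data), "bytes")
print(unpack(length, data))
```

Real sequencing data also contain ambiguous bases, quality scores and read identifiers, none of which this sketch handles.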
DNA Replication Profiling Using Deep Sequencing.

PubMed

Saayman, Xanita; Ramos-Pérez, Cristina; Brown, Grant W

2018-01-01

Profiling of DNA replication during progression through S phase allows a quantitative snap-shot of replication origin usage and DNA replication fork progression. We present a method for using deep sequencing data to profile DNA replication in S. cerevisiae.

Fractal landscape analysis of DNA walks

NASA Technical Reports Server (NTRS)

Peng, C. K.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Sciortino, F.; Simons, M.; Stanley, H. E.

1992-01-01

By mapping nucleotide sequences onto a "DNA walk", we uncovered remarkably long-range power law correlations [Nature 356 (1992) 168] that imply a new scale invariant property of DNA. We found such long-range correlations in intron-containing genes and in non-transcribed regulatory DNA sequences, but not in cDNA sequences or intron-less genes. In this paper, we present more explicit evidence to support our findings.
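A DNA walk of this kind can be sketched in a few lines of Python. The abstract does not state the step rule, so the rule used here (one step up for a pyrimidine, one step down for a purine) is an assumption, as are the function name `dna_walk` and the example sequence.

```python
def dna_walk(seq):
    """Cumulative displacement of a one-dimensional walk along seq.

    Assumed step rule: +1 for a pyrimidine (C or T),
    -1 for a purine (A or G).
    """
    steps = {"C": 1, "T": 1, "A": -1, "G": -1}
    position = 0
    walk = []
    for base in seq.upper():
        position += steps[base]
        walk.append(position)
    return walk


print(dna_walk("CTAG"))  # two steps up, then two steps down
```

Correlation analyses of such walks study how the fluctuation of the displacement grows with the length of the window over which it is measured; that analysis is not shown here.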
DNA fingerprinting, DNA barcoding, and next generation sequencing technology in plants.

PubMed

Sucher, Nikolaus J; Hennell, James R; Carles, Maria C

2012-01-01

DNA fingerprinting of plants has become an invaluable tool in forensic, scientific, and industrial laboratories all over the world. PCR has become part of virtually every variation of the plethora of approaches used for DNA fingerprinting today. DNA sequencing is increasingly used either in combination with or as a replacement for traditional DNA fingerprinting techniques. A prime example is the use of short, standardized regions of the genome as taxon barcodes for biological identification of plants. Rapid advances in "next generation sequencing" (NGS) technology are driving down the cost of sequencing and bringing large-scale sequencing projects into the reach of individual investigators. We present an overview of recent publications that demonstrate the use of "NGS" technology for DNA fingerprinting and DNA barcoding applications.
Representation of DNA sequences in genetic codon context with applications in exon and intron prediction.

PubMed

Yin, Changchuan

2015-04-01

To apply digital signal processing (DSP) methods to analyze DNA sequences, the sequences first must be specially mapped into numerical sequences. Thus, effective numerical mappings of DNA sequences play key roles in the effectiveness of DSP-based methods such as exon prediction. Despite numerous mappings of symbolic DNA sequences to numerical series, the existing mapping methods do not include the genetic coding features of DNA sequences. We present a novel numerical representation of DNA sequences using genetic codon context (GCC) in which the numerical values are optimized by simulated annealing to maximize the 3-periodicity signal to noise ratio (SNR). The optimized GCC representation is then applied in exon and intron prediction by a Short-Time Fourier Transform (STFT) approach. The results show the GCC method enhances the SNR values of exon sequences and thus increases the accuracy of predicting protein coding regions in genomes compared with the commonly used 4D binary representation. In addition, this study offers a novel way to reveal specific features of DNA sequences by optimizing numerical mappings of symbolic DNA sequences.
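The 3-periodicity signal that the abstract refers to can be illustrated with the 4D binary representation it uses as a baseline. The Python sketch below is not the GCC method: it only builds one 0/1 indicator sequence per base and sums the Fourier power at a period of three positions. The function names are hypothetical, and the sketch assumes that "4D binary representation" means one indicator sequence per nucleotide.

```python
import cmath


def binary_indicators(seq):
    """One 0/1 indicator sequence per base (assumed 4D binary mapping)."""
    seq = seq.upper()
    return {b: [1 if s == b else 0 for s in seq] for b in "ACGT"}


def period3_power(seq):
    """Summed Fourier power at frequency 1/3 over the four indicators."""
    total = 0.0
    for indicator in binary_indicators(seq).values():
        # Fourier component at a period of exactly three positions.
        x = sum(
            v * cmath.exp(-2j * cmath.pi * i / 3)
            for i, v in enumerate(indicator)
        )
        total += abs(x) ** 2
    return total


codon_like = "ATG" * 10  # perfectly 3-periodic toy sequence
uniform = "A" * 30       # no 3-periodic structure
print(period3_power(codon_like))
print(period3_power(uniform))
```

The toy periodic sequence gives a large value and the uniform one a value close to zero. An SNR would further divide this power by an average over the rest of the spectrum, which is omitted here.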
A High-Throughput Process for the Solid-Phase Purification of Synthetic DNA Sequences

PubMed Central

Grajkowski, Andrzej; Cieślak, Jacek; Beaucage, Serge L.

2017-01-01

An efficient process for the purification of synthetic phosphorothioate and native DNA sequences is presented. The process is based on the use of an aminopropylated silica gel support functionalized with aminooxyalkyl functions to enable capture of DNA sequences through an oximation reaction with the keto function of a linker conjugated to the 5′-terminus of DNA sequences. Deoxyribonucleoside phosphoramidites carrying this linker, as a 5′-hydroxyl protecting group, have been synthesized for incorporation into DNA sequences during the last coupling step of a standard solid-phase synthesis protocol executed on a controlled pore glass (CPG) support. Solid-phase capture of the nucleobase- and phosphate-deprotected DNA sequences released from the CPG support is demonstrated to proceed near quantitatively. Shorter than full-length DNA sequences are first washed away from the capture support; the solid-phase purified DNA sequences are then released from this support upon reaction with tetra-n-butylammonium fluoride in dry dimethylsulfoxide (DMSO) and precipitated in tetrahydrofuran (THF). The purity of solid-phase-purified DNA sequences exceeds 98%. The simulated high-throughput and scalability features of the solid-phase purification process are demonstrated without sacrificing purity of the DNA sequences. PMID:28628204
The genome-wide DNA sequence specificity of the anti-tumour drug bleomycin in human cells.

PubMed

Murray, Vincent; Chen, Jon K; Tanaka, Mark M

2016-07-01

The cancer chemotherapeutic agent, bleomycin, cleaves DNA at specific sites. For the first time, the genome-wide DNA sequence specificity of bleomycin breakage was determined in human cells. Utilising Illumina next-generation DNA sequencing techniques, over 200 million bleomycin cleavage sites were examined to elucidate the bleomycin genome-wide DNA selectivity. The genome-wide bleomycin cleavage data were analysed by four different methods to determine the cellular DNA sequence specificity of bleomycin strand breakage. For the most highly cleaved DNA sequences, the preferred site of bleomycin breakage was at 5'-GT* dinucleotide sequences (where the asterisk indicates the bleomycin cleavage site), with lesser cleavage at 5'-GC* dinucleotides. This investigation also determined longer bleomycin cleavage sequences, with preferred cleavage at 5'-GT*A and 5'-TGT* trinucleotide sequences, and 5'-TGT*A tetranucleotides. For cellular DNA, the hexanucleotide DNA sequence 5'-RTGT*AY (where R is a purine and Y is a pyrimidine) was the most highly cleaved DNA sequence. It was striking that alternating purine-pyrimidine sequences were highly cleaved by bleomycin. The highest intensity cleavage sites in cellular and purified DNA were very similar although there were some minor differences. Statistical nucleotide frequency analysis indicated a G nucleotide was present at the -3 position (relative to the cleavage site) in cellular DNA but was absent in purified DNA.
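Locating the hexanucleotide 5'-RTGT*AY in a sequence is a small pattern-matching task. The Python sketch below is an illustration, not the analysis used in the study; the function name, the example sequence, and the choice to report the 0-based index of the T marked by the asterisk are assumptions. Only the given strand is scanned.

```python
import re

# R = purine (A or G), Y = pyrimidine (C or T).
# The lookahead lets overlapping occurrences be reported as well.
MOTIF = re.compile(r"(?=([AG]TGTA[CT]))")


def find_motif_sites(seq):
    """Return (motif_start, marked_T_index) pairs, 0-based, for RTGT*AY."""
    return [(m.start(), m.start() + 3) for m in MOTIF.finditer(seq.upper())]


# Made-up sequence containing ATGTAC and GTGTAT.
example = "CCATGTACGGGTGTATAA"
print(find_motif_sites(example))
```

Scanning the reverse complement as well would be needed to cover both strands.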
[Current applications of high-throughput DNA sequencing technology in antibody drug research].

PubMed

Yu, Xin; Liu, Qi-Gang; Wang, Ming-Rong

2012-03-01

Since the publication in 2005 of a high-throughput DNA sequencing technology based on PCR carried out in oil emulsions, high-throughput DNA sequencing platforms have evolved into a robust technology for sequencing genomes and diverse DNA libraries. Antibody libraries with vast numbers of members currently serve as a foundation for discovering novel antibody drugs, and high-throughput DNA sequencing technology makes it possible to rapidly identify functional antibody variants with desired properties. Herein we present a review of current applications of high-throughput DNA sequencing technology in the analysis of antibody library diversity, sequencing of CDR3 regions, identification of potent antibodies based on sequence frequency, discovery of functional genes, and combination with various display technologies, so as to provide an alternative approach to the discovery and development of antibody drugs.
RDNAnalyzer: A tool for DNA secondary structure prediction and sequence analysis.

PubMed

Afzal, Muhammad; Shahid, Ahmad Ali; Shehzadi, Abida; Nadeem, Shahid; Husnain, Tayyab

2012-01-01

RDNAnalyzer is an innovative computer based tool designed for DNA secondary structure prediction and sequence analysis. It can randomly generate a DNA sequence, or the user can upload sequences of their own interest in RAW format. It uses and extends the Nussinov dynamic programming algorithm and has various applications for sequence analysis. It predicts the DNA secondary structure and base pairings. It also provides tools for sequence analyses routinely performed by biological scientists, such as DNA replication, reverse complement generation, transcription, translation, sequence-specific information such as the total number of nucleotide bases and ATGC base contents along with their respective percentages, and a sequence cleaner. RDNAnalyzer is a unique tool developed in Microsoft Visual Studio 2008 using Microsoft Visual C# and Windows Presentation Foundation and provides a user friendly environment for sequence analysis. It is freely available. http://www.cemb.edu.pk/sw.html RDNAnalyzer - Random DNA Analyser, GUI - Graphical user interface, XAML - Extensible Application Markup Language.
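The Nussinov dynamic programming algorithm mentioned above can be sketched in Python. This is the plain textbook recurrence that maximizes the number of nested base pairs, not RDNAnalyzer's extended version; the function name, the restriction to A-T and G-C pairs, and the `min_loop` parameter (minimum number of unpaired bases between two paired positions) are assumptions.

```python
def nussinov_max_pairs(seq, min_loop=3):
    """Maximum number of nested A-T / G-C pairs in a single DNA strand."""
    pairs = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] holds the best pair count for the subsequence seq[i..j].
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: position j stays unpaired
            # case 2: position j pairs with some k, leaving a loop
            # of at least min_loop bases between them
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in pairs:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]


# Made-up hairpin-like sequence: two G-C pairs can close a loop.
print(nussinov_max_pairs("GGGAAATCC"))
```

Recovering the pairing itself, rather than only the count, requires a traceback through the table, which is omitted to keep the sketch short.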
DNA Multiple Sequence Alignment Guided by Protein Domains: The MSA-PAD 2.0 Method.

PubMed

Balech, Bachir; Monaco, Alfonso; Perniola, Michele; Santamaria, Monica; Donvito, Giacinto; Vicario, Saverio; Maggi, Giorgio; Pesole, Graziano

2018-01-01

Multiple sequence alignment (MSA) is a fundamental component in many DNA sequence analyses including metagenomics studies and phylogeny inference. When guided by protein profiles, DNA multiple alignments assume a higher precision and robustness. Here we present details of the use of the upgraded version of MSA-PAD (2.0), which is a DNA multiple sequence alignment framework able to align DNA sequences coding for single/multiple protein domains guided by PFAM or user-defined annotations. MSA-PAD has two alignment strategies, called "Gene" and "Genome," accounting for coding domains order and genomic rearrangements, respectively. Novel options were added to the present version, where the MSA can be guided by protein profiles provided by the user. This allows MSA-PAD 2.0 to run faster and to add custom protein profiles sometimes not present in PFAM database according to the user's interest. MSA-PAD 2.0 is currently freely available as a Web application at https://recasgateway.cloud.ba.infn.it/ .
CROSS-DISCIPLINARY PHYSICS AND RELATED AREAS OF SCIENCE AND TECHNOLOGY: Characteristics of alternating current hopping conductivity in DNA sequences

NASA Astrophysics Data System (ADS)

Ma, Song-Shan; Xu, Hui; Wang, Huan-You; Guo, Rui

2009-08-01

This paper presents a model to describe alternating current (AC) conductivity of DNA sequences, in which DNA is considered as a one-dimensional (1D) disordered system, and electrons transport via hopping between localized states. It finds that AC conductivity in DNA sequences increases as the frequency of the external electric field rises, and it takes the form of $\sigma_{ac}(\omega) \sim \omega^{2} \ln^{2}(1/\omega)$. AC conductivity of DNA sequences also increases with temperature, but the temperature dependence is weak. Meanwhile, the AC conductivity in an off-diagonally correlated case is much larger than that in the uncorrelated case of the Anderson limit at low temperatures, which indicates that the off-diagonal correlations in DNA sequences have a great effect on the AC conductivity, while at high temperature the off-diagonal correlations no longer play a vital role in electric transport. In addition, the proportion of nucleotide pairs p also plays an important role in AC electron transport of DNA sequences. For p < 0.5, the conductivity of the DNA sequence decreases with the increase of p, while for p >= 0.5, the conductivity increases with the increase of p.
Methods for sequencing GC-rich and CCT repeat DNA templates

DOEpatents

Robinson, Donna L.

2007-02-20

The present invention is directed to a PCR-based method of cycle sequencing DNA and other polynucleotide sequences having high CG content and regions of high GC content, and includes for example DNA strands with a high Cytosine and/or Guanosine content and repeated motifs such as CCT repeats.
First Complete Squash leaf curl China virus Genomic Segment DNA-A Sequence from East Timor

PubMed Central

Maina, Solomon; Edwards, Owain R.; de Almeida, Luis; Ximenes, Abel

2017-01-01

ABSTRACT We present here the first complete Squash leaf curl China virus (SLCCV) genomic segment DNA-A sequence from East Timor. It was isolated from a pumpkin plant. When compared with 15 complete SLCCV DNA-A genome sequences from other world regions, it most resembled the Malaysian isolate MC1 sequence. PMID:28619789
G-quadruplex and G-rich sequence stimulate Pif1p-catalyzed downstream duplex DNA unwinding through reducing waiting time at ss/dsDNA junction

PubMed Central

Zhang, Bo; Wu, Wen-Qiang; Liu, Na-Nv; Duan, Xiao-Lei; Li, Ming; Dou, Shuo-Xing; Hou, Xi-Miao; Xi, Xu-Guang

2016-01-01

Alternative DNA structures that deviate from B-form double-stranded DNA such as G-quadruplex (G4) DNA can be formed by G-rich sequences that are widely distributed throughout the human genome. We have previously shown that Pif1p not only unfolds G4, but also unwinds the downstream duplex DNA in a G4-stimulated manner. In the present study, we further characterized the G4-stimulated duplex DNA unwinding phenomenon by means of single-molecule fluorescence resonance energy transfer. It was found that Pif1p did not unwind the partial duplex DNA immediately after unfolding the upstream G4 structure, but rather, it would dwell at the ss/dsDNA junction with a ‘waiting time’. Further studies revealed that the waiting time was in fact related to a protein dimerization process that was sensitive to ssDNA sequence and would become rapid if the sequence is G-rich. Furthermore, we identified that the G-rich sequence, as the G4 structure, equally stimulates duplex DNA unwinding. The present work sheds new light on the molecular mechanism by which G4-unwinding helicase Pif1p resolves physiological G4/duplex DNA structures in cells. PMID:27471032
Homogeneity of the 16S rDNA sequence among geographically disparate isolates of Taylorella equigenitalis

PubMed Central

Matsuda, M; Tazumi, A; Kagawa, S; Sekizuka, T; Murayama, O; Moore, JE; Millar, BC

2006-01-01

Background At present, six accessible sequences of 16S rDNA from Taylorella equigenitalis (T. equigenitalis) are available, and they differ at a few nucleotide positions. It is therefore important to determine these sequences from additional strains in other countries, if possible, in order to clarify any anomalies regarding 16S rDNA sequence heterogeneity. Here, we clone and sequence the approximate full-length 16S rDNA from additional strains of T. equigenitalis isolated in Japan, Australia and France and compare these sequences to the existing published sequences. Results Clarification of any anomalies regarding 16S rDNA sequence heterogeneity of T. equigenitalis was carried out. Cloning, sequencing and comparison of the approximate full-length 16S rDNA from 17 strains of T. equigenitalis isolated in Japan, Australia and France demonstrated nucleotide sequence differences at six loci in the 1,469-nucleotide sequence. Moreover, 12 polymorphic sites occurred among 23 sequences of the 16S rDNA, including the six reference sequences. Conclusion High sequence similarity (99.5% or more) was observed throughout, except from nucleotide positions 138 to 501, where substitutions and deletions were noted. PMID:16398935
Niche and neutral processes both shape community structure in parallelized, aerobic, single carbon-source enrichments

DOE Data Explorer

Flynn, Theodore M.; Koval, Jason C.; Greenwald, Stephanie M.; Owens, Sarah M.; Kemner, Kenneth M.; Antonopoulos, Dionysios A.

2017-01-01

We present DNA sequence data in FASTA-formatted files from aerobic environmental microcosms inoculated with a sole carbon source. DNA sequences are of 16S rRNA genes present in DNA extracted from each microcosm along with the environmental samples (soil, water) used to inoculate them. These samples were sequenced using the Illumina MiSeq platform at the Environmental Sample Preparation and Sequencing Facility at Argonne National Laboratory. This data is compatible with standard microbiome analysis pipelines (e.g., QIIME, mothur, etc.).
Sequence independent amplification of DNA

DOEpatents

Bohlander, S.K.

1998-03-24

The present invention is a rapid sequence-independent amplification procedure (SIA). Even minute amounts of DNA from various sources can be amplified independent of any sequence requirements of the DNA or any a priori knowledge of any sequence characteristics of the DNA to be amplified. This method allows, for example, the sequence independent amplification of microdissected chromosomal material and the reliable construction of high quality fluorescent in situ hybridization (FISH) probes from YACs or from other sources. These probes can be used to localize YACs on metaphase chromosomes but also--with high efficiency--in interphase nuclei. 25 figs.
Sequence independent amplification of DNA

DOEpatents

Bohlander, Stefan K.

1998-01-01

The present invention is a rapid sequence-independent amplification procedure (SIA). Even minute amounts of DNA from various sources can be amplified independent of any sequence requirements of the DNA or any a priori knowledge of any sequence characteristics of the DNA to be amplified. This method allows, for example, the sequence independent amplification of microdissected chromosomal material and the reliable construction of high quality fluorescent in situ hybridization (FISH) probes from YACs or from other sources. These probes can be used to localize YACs on metaphase chromosomes but also--with high efficiency--in interphase nuclei.
Nanopore-CMOS Interfaces for DNA Sequencing

PubMed Central

Magierowski, Sebastian; Huang, Yiyun; Wang, Chengjie; Ghafar-Zadeh, Ebrahim

2016-01-01

DNA sequencers based on nanopore sensors present an opportunity for a significant break from the template-based incumbents of the last forty years. Key advantages ushered by nanopore technology include a simplified chemistry and the ability to interface to CMOS technology. The latter opportunity offers substantial promise for improvement in sequencing speed, size and cost. This paper reviews existing and emerging means of interfacing nanopores to CMOS technology with an emphasis on massively-arrayed structures. It presents this in the context of incumbent DNA sequencing techniques, reviews and quantifies nanopore characteristics and models and presents CMOS circuit methods for the amplification of low-current nanopore signals in such interfaces. PMID:27509529
Nanopore-CMOS Interfaces for DNA Sequencing.

PubMed

Magierowski, Sebastian; Huang, Yiyun; Wang, Chengjie; Ghafar-Zadeh, Ebrahim

2016-08-06

DNA sequencers based on nanopore sensors present an opportunity for a significant break from the template-based incumbents of the last forty years. Key advantages ushered by nanopore technology include a simplified chemistry and the ability to interface to CMOS technology. The latter opportunity offers substantial promise for improvement in sequencing speed, size and cost. This paper reviews existing and emerging means of interfacing nanopores to CMOS technology with an emphasis on massively-arrayed structures. It presents this in the context of incumbent DNA sequencing techniques, reviews and quantifies nanopore characteristics and models and presents CMOS circuit methods for the amplification of low-current nanopore signals in such interfaces.
Separating endogenous ancient DNA from modern day contamination in a Siberian Neandertal

PubMed Central

Skoglund, Pontus; Northoff, Bernd H.; Shunkov, Michael V.; Derevianko, Anatoli P.; Pääbo, Svante; Krause, Johannes; Jakobsson, Mattias

2014-01-01

One of the main impediments for obtaining DNA sequences from ancient human skeletons is the presence of contaminating modern human DNA molecules in many fossil samples and laboratory reagents. However, DNA fragments isolated from ancient specimens show a characteristic DNA damage pattern caused by miscoding lesions that differs from present day DNA sequences. Here, we develop a framework for evaluating the likelihood of a sequence originating from a model with postmortem degradation—summarized in a postmortem degradation score—which allows the identification of DNA fragments that are unlikely to originate from present day sources. We apply this approach to a contaminated Neandertal specimen from Okladnikov Cave in Siberia to isolate its endogenous DNA from modern human contaminants and show that the reconstructed mitochondrial genome sequence is more closely related to the variation of Western Neandertals than what was discernible from previous analyses. Our method opens up the potential for genomic analysis of contaminated fossil material. PMID:24469802
Non-B-DNA structures on the interferon-beta promoter?

PubMed

Robbe, K; Bonnefoy, E

1998-01-01

The high mobility group (HMG) I protein intervenes as an essential factor during the virus-induced expression of the interferon-beta (IFN-beta) gene. It is a non-histone chromatin-associated protein that has the dual capacity of binding to a non-B-DNA structure such as cruciform-DNA as well as to AT-rich B-DNA sequences. In this work we compare the binding affinity of HMGI for a synthetic cruciform-DNA to its binding affinity for the HMGI-binding-site present in the positive regulatory domain II (PRDII) of the IFN-beta promoter. Using gel retardation experiments, we show that HMGI protein binds with at least ten times more affinity to the synthetic cruciform-DNA structure than to the PRDII B-DNA sequence. DNA hairpin sequences are present in both the human and the murine PRDII-DNAs. We discuss in this work the presence of as yet putative non-B-DNA structures in the IFN-beta promoter.

A DNA sequence analysis package for the IBM personal computer.

PubMed Central

Lagrimini, L M; Brentano, S T; Donelson, J E

1984-01-01

We present here a collection of DNA sequence analysis programs, called "PC Sequence" (PCS), which are designed to run on the IBM Personal Computer (PC). These programs are written in IBM PC compiled BASIC and take full advantage of the IBM PC's speed, error handling, and graphics capabilities. For a modest initial expense in hardware any laboratory can use these programs to quickly perform computer analysis on DNA sequences. They are written with the novice user in mind and require very little training or previous experience with computers. Also provided are a text editing program for creating and modifying DNA sequence files and a communications program which enables the PC to communicate with and collect information from mainframe computers and DNA sequence databases. PMID:6546433
RDNAnalyzer: A tool for DNA secondary structure prediction and sequence analysis

PubMed Central

Afzal, Muhammad; Shahid, Ahmad Ali; Shehzadi, Abida; Nadeem, Shahid; Husnain, Tayyab

2012-01-01

RDNAnalyzer is an innovative computer-based tool designed for DNA secondary structure prediction and sequence analysis. It can randomly generate a DNA sequence, or users can upload sequences of interest in RAW format. It uses and extends the Nussinov dynamic programming algorithm and has various applications in sequence analysis. It predicts the DNA secondary structure and base pairings. It also provides tools for sequence analyses routinely performed by biologists, such as DNA replication, reverse complement generation, transcription, translation, sequence-specific information (total number of nucleotide bases, and A, T, G and C base contents with their respective percentages) and a sequence cleaner. RDNAnalyzer is a unique tool developed in Microsoft Visual Studio 2008 using Microsoft Visual C# and Windows Presentation Foundation and provides a user-friendly environment for sequence analysis. It is freely available. Availability http://www.cemb.edu.pk/sw.html Abbreviations RDNAnalyzer - Random DNA Analyser, GUI - Graphical user interface, XAML - Extensible Application Markup Language. PMID:23055611
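For readers unfamiliar with the Nussinov algorithm named in this abstract, the following Python sketch shows the textbook base-pair maximization recurrence together with a reverse complement helper. It is not RDNAnalyzer's code (the tool is written in C# and extends the algorithm in ways the abstract does not detail); the function names and the `min_loop` parameter are assumptions chosen for illustration.

```python
# Watson-Crick pairs allowed in this sketch (DNA only).
PAIRS = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}

def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def nussinov_max_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs in seq (basic Nussinov recurrence).

    min_loop is the minimum number of unpaired bases enclosed by a pair;
    it is a common convention assumed here, not a value from the abstract.
    """
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] holds the optimum for the subsequence seq[i..j] inclusive.
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case 1: base j stays unpaired
            for k in range(i, j - min_loop):  # case 2: base j pairs with k
                if (seq[k], seq[j]) in PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]

print(reverse_complement("ATGC"))
print(nussinov_max_pairs("GGGAAACCC"))
```

The recurrence only counts pairs; recovering the actual structure would require a traceback over the `best` table, which is omitted here to keep the sketch short.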
Assessing Diversity of DNA Structure-Related Sequence Features in Prokaryotic Genomes

PubMed Central

Huang, Yongjie; Mrázek, Jan

2014-01-01

Prokaryotic genomes are diverse in terms of their nucleotide and oligonucleotide composition as well as presence of various sequence features that can affect physical properties of the DNA molecule. We present a survey of local sequence patterns which have a potential to promote non-canonical DNA conformations (i.e. different from standard B-DNA double helix) and interpret the results in terms of relationships with organisms' habitats, phylogenetic classifications, and other characteristics. Our present work differs from earlier similar surveys not only by investigating a wider range of sequence patterns in a large number of genomes but also by using a more realistic null model to assess significant deviations. Our results show that simple sequence repeats and Z-DNA-promoting patterns are generally suppressed in prokaryotic genomes, whereas palindromes and inverted repeats are over-represented. Representation of patterns that promote Z-DNA and intrinsic DNA curvature increases with increasing optimal growth temperature (OGT), and decreases with increasing oxygen requirement. Additionally, representations of close direct repeats, palindromes and inverted repeats exhibit clear negative trends with increasing OGT. The observed relationships with environmental characteristics, particularly OGT, suggest possible evolutionary scenarios of structural adaptation of DNA to particular environmental niches. PMID:24408877
Autonomous replication and addition of telomerelike sequences to DNA microinjected into Paramecium tetraurelia macronuclei.

PubMed Central

Gilley, D; Preer, J R; Aufderheide, K J; Polisky, B

1988-01-01

Paramecium tetraurelia can be transformed by microinjection of cloned serotype A gene sequences into the macronucleus. Transformants are detected by their ability to express serotype A surface antigen from the injected templates. After injection, the DNA is converted from a supercoiled form to a linear form by cleavage at nonrandom sites. The linear form appears to replicate autonomously as a unit-length molecule and is present in transformants at high copy number. The injected DNA is further processed by the addition of Paramecium-type telomeric sequences to the termini of the linear DNA. To examine the fate of injected linear DNA molecules, plasmid pSA14SB DNA containing the A gene was cleaved into two linear pieces, a 14-kilobase (kb) piece containing the A gene and flanking sequences and a 2.2-kb piece consisting of the procaryotic vector. In transformants expressing the A gene, we observed that two linear DNA species were present which correspond to the two species injected. Both species had Paramecium telomerelike sequences added to their termini. For the 2.2-kb DNA, we show that the site of addition of the telomerelike sequences is directly at one terminus and within one nucleotide of the other terminus. These results indicate that injected procaryotic DNA is capable of autonomous replication in Paramecium macronuclei and that telomeric addition in the macronucleus does not require specific recognition sequences. PMID:3211128
The kinetoplast DNA of the Australian trypanosome, Trypanosoma copemani, shares features with Trypanosoma cruzi and Trypanosoma lewisi.

PubMed

Botero, Adriana; Kapeller, Irit; Cooper, Crystal; Clode, Peta L; Shlomai, Joseph; Thompson, R C Andrew

2018-05-17

Kinetoplast DNA (kDNA) is the mitochondrial genome of trypanosomatids. It consists of a few dozen maxicircles and several thousand minicircles, all catenated topologically to form a two-dimensional DNA network. Minicircles are heterogeneous in size and sequence among species. They present one or several conserved regions that contain three highly conserved sequence blocks. CSB-1 (10 bp sequence) and CSB-2 (8 bp sequence) present lower interspecies homology, while CSB-3 (12 bp sequence) or the Universal Minicircle Sequence is conserved within most trypanosomatids. The Universal Minicircle Sequence is located at the replication origin of the minicircles, and is the binding site for the UMS binding protein, a protein involved in trypanosomatid survival and virulence. Here, we describe the structure and organisation of the kDNA of Trypanosoma copemani, a parasite that has been shown to infect mammalian cells and has been associated with the drastic decline of the endangered Australian marsupial, the woylie (Bettongia penicillata). Deep genomic sequencing showed that T. copemani presents two classes of minicircles that share sequence identity and organisation in the conserved sequence blocks with those of Trypanosoma cruzi and Trypanosoma lewisi. A 19,257 bp partial region of the maxicircle of T. copemani that contained the entire coding region was obtained. Comparative analysis of the T. copemani entire maxicircle coding region with the coding regions of T. cruzi and T. lewisi showed they share 71.05% and 71.28% identity, respectively. The shared features in the maxicircle/minicircle organisation and sequence between T. copemani and T. cruzi/T. lewisi suggest similarities in their process of kDNA replication, and are of significance in understanding the evolution of Australian trypanosomes.
Laser Desorption Mass Spectrometry for DNA Sequencing and Analysis

NASA Astrophysics Data System (ADS)

Chen, C. H. Winston; Taranenko, N. I.; Golovlev, V. V.; Isola, N. R.; Allman, S. L.

1998-03-01

Rapid DNA sequencing and/or analysis is critically important for biomedical research. In the past, gel electrophoresis has been the primary tool to achieve DNA analysis and sequencing. However, gel electrophoresis is a time-consuming and labor-intensive process. Recently, we have developed and used laser desorption mass spectrometry (LDMS) to achieve sequencing of ss-DNA longer than 100 nucleotides. With LDMS, we succeeded in sequencing DNA in seconds instead of the hours or days required by gel electrophoresis. In addition to sequencing, we also applied LDMS to the detection of DNA probes for hybridization. LDMS was also used to detect short tandem repeats for forensic applications. Clinical applications for disease diagnosis, such as cystic fibrosis caused by base deletion and point mutation, have also been demonstrated. Experimental details will be presented at the meeting.
Characterization of proviruses cloned from mink cell focus-forming virus-infected cellular DNA.

PubMed Central

Khan, A S; Repaske, R; Garon, C F; Chan, H W; Rowe, W P; Martin, M A

1982-01-01

Two proviruses were cloned from EcoRI-digested DNA extracted from mink cells chronically infected with AKR mink cell focus-forming (MCF) 247 murine leukemia virus (MuLV), using a lambda phage host vector system. One cloned MuLV DNA fragment (designated MCF 1) contained sequences extending 6.8 kilobases from an EcoRI restriction site in the 5' long terminal repeat (LTR) to an EcoRI site located in the envelope (env) region and was indistinguishable by restriction endonuclease mapping for 5.1 kilobases (except for the EcoRI site in the LTR) from the 5' end of AKR ecotropic proviral DNA. The DNA segment extending from 5.1 to 6.8 kilobases contained several restriction sites that were not present in the AKR ecotropic provirus. A 0.5-kilobase DNA segment located at the 3' end of MCF 1 DNA contained sequences which hybridized to a xenotropic env-specific DNA probe but not to labeled ecotropic env-specific DNA. This dual character of MCF 1 proviral DNA was also confirmed by analyzing heteroduplex molecules by electron microscopy. The second cloned proviral DNA (designated MCF 2) was a 6.9-kilobase EcoRI DNA fragment which contained LTR sequences at each end and a 2.0-kilobase deletion encompassing most of the env region. The MCF 2 proviral DNA proved to be a useful reagent for detecting LTRs electron microscopically due to the presence of nonoverlapping, terminally located LTR sequences which effected its circularization with DNAs containing homologous LTR sequences. Nucleotide sequence analysis demonstrated the presence of a 104-base-pair direct repeat in the LTR of MCF 2 DNA. In contrast, only a single copy of the reiterated component of the direct repeat was present in MCF 1 DNA. PMID:6281459
Acquisition of New DNA Sequences After Infection of Chicken Cells with Avian Myeloblastosis Virus

PubMed Central

Shoyab, M.; Baluda, M. A.; Evans, R.

1974-01-01

DNA-RNA hybridization studies between 70S RNA from avian myeloblastosis virus (AMV) and an excess of DNA from (i) AMV-induced leukemic chicken myeloblasts or (ii) a mixture of normal and of congenitally infected K-137 chicken embryos producing avian leukosis viruses revealed the presence of fast- and slow-hybridizing virus-specific DNA sequences. However, the leukemic cells contained twice the level of AMV-specific DNA sequences observed in normal chicken embryonic cells. The fast-reacting sequences were two to three times more numerous in leukemic DNA than in DNA from the mixed embryos. The slow-reacting sequences had a reiteration frequency of approximately 9 and 6, in the two respective systems. Both the fast- and the slow-reacting DNA sequences in leukemic cells exhibited a higher Tm (2 C) than the respective DNA sequences in normal cells. In normal and leukemic cells the slow hybrid sequences appeared to have a Tm which was 2 C higher than that of the fast hybrid sequences. Individual non-virus-producing chicken embryos, either group-specific antigen positive or negative, contained 40 to 100 copies of the fast sequences and 2 to 6 copies of the slowly hybridizing sequences per cell genome. Normal rat cells did not contain DNA that hybridized with AMV RNA, whereas non-virus-producing rat cells transformed by B-77 avian sarcoma virus contained only the slowly reacting sequences. The results demonstrate that leukemic cells transformed by AMV contain new AMV-specific DNA sequences which were not present before infection. PMID:16789139
Protein Crystal Eco R1 Endonuclease-DNA Complex

NASA Technical Reports Server (NTRS)

1998-01-01

Type II restriction enzymes, such as Eco R1 endonuclease, present a unique advantage for the study of sequence-specific recognition because they leave a record of where they have been in the form of the cleaved ends of the DNA sites where they were bound. The differential behavior of a sequence-specific protein at sites of differing base sequence is the essence of sequence specificity; the core question is how these proteins discriminate between different DNA sequences, especially when the two sequences are very similar. Principal Investigator: Dan Carter/New Century Pharmaceuticals
Local alignment of two-base encoded DNA sequence

PubMed Central

Homer, Nils; Merriman, Barry; Nelson, Stanley F

2009-01-01

Background DNA sequence comparison is based on optimal local alignment of two sequences using a similarity score. However, some new DNA sequencing technologies do not directly measure the base sequence, but rather an encoded form, such as the two-base encoding considered here. In order to compare such data to a reference sequence, the data must be decoded into sequence. The decoding is deterministic, but the possibility of measurement errors requires searching among all possible error modes and resulting alignments to achieve an optimal balance of fewer errors versus greater sequence similarity. Results We present an extension of the standard dynamic programming method for local alignment, which simultaneously decodes the data and performs the alignment, maximizing a similarity score based on a weighted combination of errors and edits, and allowing an affine gap penalty. We also present simulations that demonstrate the performance characteristics of our two base encoded alignment method and contrast those with standard DNA sequence alignment under the same conditions. Conclusion The new local alignment algorithm for two-base encoded data has substantial power to properly detect and correct measurement errors while identifying underlying sequence variants, and facilitating genome re-sequencing efforts based on this form of sequence data. PMID:19508732
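The reason decoding and alignment are coupled can be seen in a small Python sketch of two-base encoding. It assumes the widely used color code in which each base is given a 2-bit value (A=0, C=1, G=2, T=3) and each color is the XOR of two adjacent bases; the function names are hypothetical, and the paper's dynamic programming alignment itself is not reproduced here.

```python
BASE_TO_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}
BITS_TO_BASE = "ACGT"

def encode_colors(seq):
    """Encode a base sequence as colors, one per pair of adjacent bases."""
    codes = [BASE_TO_BITS[base] for base in seq]
    return [left ^ right for left, right in zip(codes, codes[1:])]

def decode_colors(first_base, colors):
    """Decode colors back to bases, given the known first base."""
    current = BASE_TO_BITS[first_base]
    bases = [first_base]
    for color in colors:
        current ^= color  # each color is the XOR step to the next base
        bases.append(BITS_TO_BASE[current])
    return "".join(bases)

colors = encode_colors("ATGGC")
print(colors)                      # colors for the clean read
print(decode_colors("A", colors))  # round trip back to the bases

# Simulate one measurement error in the second color.
corrupted = list(colors)
corrupted[1] ^= 1
print(decode_colors("A", corrupted))
```

Because every base is decoded from the previous one, a single wrong color changes all downstream bases in a naive decode. That is why an aligner working on such data benefits from considering color errors during alignment rather than decoding first.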
The number of reduced alignments between two DNA sequences

PubMed Central

2014-01-01

Background In this study we consider DNA sequences as mathematical strings. Total and reduced alignments between two DNA sequences have been considered in the literature to measure their similarity. Results for explicit representations of some alignments have been already obtained. Results We present exact, explicit and computable formulas for the number of different possible alignments between two DNA sequences and a new formula for a class of reduced alignments. Conclusions A unified approach for a wide class of alignments between two DNA sequences has been provided. The formula is computable and, if complemented by software development, will provide a deeper insight into the theory of sequence alignment and give rise to new comparison methods. AMS Subject Classification Primary 92B05, 33C20, secondary 39A14, 65Q30 PMID:24684679
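As background for the counting problem in this abstract, the total number of gapped alignments of two strings, where each column is either a pair of aligned letters or a letter opposite a gap, satisfies a classical three-term recurrence whose values are the Delannoy numbers. The Python sketch below implements only that recurrence; it is not the paper's explicit formula and does not cover its reduced alignments, and the function name is an assumption.

```python
def count_total_alignments(m, n):
    """Count gapped alignments of sequences of lengths m and n.

    The last column of an alignment is a gap in the second sequence,
    a gap in the first sequence, or an aligned pair, which gives
    f(m, n) = f(m-1, n) + f(m, n-1) + f(m-1, n-1), with f(0, n) = f(m, 0) = 1.
    """
    if m < 0 or n < 0:
        raise ValueError("lengths must be non-negative")
    table = [[1] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            table[i][j] = table[i - 1][j] + table[i][j - 1] + table[i - 1][j - 1]
    return table[m][n]

for length in (1, 2, 3, 10):
    print(length, count_total_alignments(length, length))
```

The count depends only on the two lengths, not on the letters, and it grows quickly, which is one motivation for restricting attention to smaller classes of alignments.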
Ancient DNA from Pleistocene fossils: Preservation, recovery, and utility of ancient genetic information for Quaternary research

NASA Astrophysics Data System (ADS)

Yang, Hong

Until recently, recovery and analysis of genetic information encoded in ancient DNA sequences from Pleistocene fossils were impossible. Recent advances in molecular biology have offered technical tools to obtain ancient DNA sequences from well-preserved Quaternary fossils and opened the possibility of directly studying genetic changes in fossil species to address various biological and paleontological questions. Ancient DNA studies involving Pleistocene fossil material and ancient DNA degradation and preservation in Quaternary deposits are reviewed. The molecular technology applied to isolate, amplify, and sequence ancient DNA is also presented. Authentication of ancient DNA sequences and technical problems associated with modern and ancient DNA contamination are discussed. As illustrated in recent studies on ancient DNA from proboscideans, it is apparent that fossil DNA sequence data can shed light on many aspects of Quaternary research such as systematics and phylogeny, conservation biology, evolutionary theory, molecular taphonomy, and forensic sciences. Improvement of molecular techniques and a better understanding of DNA degradation during fossilization are likely to build on current strengths and to overcome existing problems, making fossil DNA data a unique source of information for Quaternary scientists.
Human Chromosome 7: DNA Sequence and Biology

PubMed Central

Scherer, Stephen W.; Cheung, Joseph; MacDonald, Jeffrey R.; Osborne, Lucy R.; Nakabayashi, Kazuhiko; Herbrick, Jo-Anne; Carson, Andrew R.; Parker-Katiraee, Layla; Skaug, Jennifer; Khaja, Razi; Zhang, Junjun; Hudek, Alexander K.; Li, Martin; Haddad, May; Duggan, Gavin E.; Fernandez, Bridget A.; Kanematsu, Emiko; Gentles, Simone; Christopoulos, Constantine C.; Choufani, Sanaa; Kwasnicka, Dorota; Zheng, Xiangqun H.; Lai, Zhongwu; Nusskern, Deborah; Zhang, Qing; Gu, Zhiping; Lu, Fu; Zeesman, Susan; Nowaczyk, Malgorzata J.; Teshima, Ikuko; Chitayat, David; Shuman, Cheryl; Weksberg, Rosanna; Zackai, Elaine H.; Grebe, Theresa A.; Cox, Sarah R.; Kirkpatrick, Susan J.; Rahman, Nazneen; Friedman, Jan M.; Heng, Henry H. Q.; Pelicci, Pier Giuseppe; Lo-Coco, Francesco; Belloni, Elena; Shaffer, Lisa G.; Pober, Barbara; Morton, Cynthia C.; Gusella, James F.; Bruns, Gail A. P.; Korf, Bruce R.; Quade, Bradley J.; Ligon, Azra H.; Ferguson, Heather; Higgins, Anne W.; Leach, Natalia T.; Herrick, Steven R.; Lemyre, Emmanuelle; Farra, Chantal G.; Kim, Hyung-Goo; Summers, Anne M.; Gripp, Karen W.; Roberts, Wendy; Szatmari, Peter; Winsor, Elizabeth J. T.; Grzeschik, Karl-Heinz; Teebi, Ahmed; Minassian, Berge A.; Kere, Juha; Armengol, Lluis; Pujana, Miguel Angel; Estivill, Xavier; Wilson, Michael D.; Koop, Ben F.; Tosi, Sabrina; Moore, Gudrun E.; Boright, Andrew P.; Zlotorynski, Eitan; Kerem, Batsheva; Kroisel, Peter M.; Petek, Erwin; Oscier, David G.; Mould, Sarah J.; Döhner, Hartmut; Döhner, Konstanze; Rommens, Johanna M.; Vincent, John B.; Venter, J. Craig; Li, Peter W.; Mural, Richard J.; Adams, Mark D.; Tsui, Lap-Chee

2010-01-01

DNA sequence and annotation of the entire human chromosome 7, encompassing nearly 158 million nucleotides of DNA and 1917 gene structures, are presented. To generate a higher order description, additional structural features such as imprinted genes, fragile sites, and segmental duplications were integrated at the level of the DNA sequence with medical genetic data, including 440 chromosome rearrangement breakpoints associated with disease. This approach enabled the discovery of candidate genes for developmental diseases including autism. PMID:12690205
An improved model for whole genome phylogenetic analysis by Fourier transform.

PubMed

Yin, Changchuan; Yau, Stephen S-T

2015-10-07

DNA sequence similarity comparison is one of the major steps in computational phylogenetic studies. The sequence comparison of closely related DNA sequences and genomes is usually performed by multiple sequence alignments (MSA). While the MSA method is accurate for some types of sequences, it may produce incorrect results when DNA sequences have undergone rearrangements, as in many bacterial and viral genomes. It is also limited by its computational complexity for comparing large volumes of data. Previously, we proposed an alignment-free method that exploits the full information contents of DNA sequences by Discrete Fourier Transform (DFT), but still with some limitations. Here, we present a significantly improved method for the similarity comparison of DNA sequences by DFT. In this method, we map DNA sequences into 2-dimensional (2D) numerical sequences and then apply DFT to transform the 2D numerical sequences into the frequency domain. In the 2D mapping, the nucleotide composition of a DNA sequence is a determining factor, and the mapping reduces the nucleotide composition bias in the distance measure, thus improving the similarity measure of DNA sequences. To compare the DFT power spectra of DNA sequences with different lengths, we propose an improved even scaling algorithm to extend shorter DFT power spectra to the longest length of the underlying sequences. After even scaling, the spectra have the same dimensionality in the Fourier frequency space, and the Euclidean distances between the full Fourier power spectra of the DNA sequences are used as the dissimilarity metric. The improved DFT method, with increased computational performance by 2D numerical representation, is applicable to DNA sequences of any length range. We assess the accuracy of the improved DFT similarity measure in hierarchical clustering of different DNA sequences including simulated and real datasets. The method yields accurate and reliable phylogenetic trees and demonstrates that the improved DFT dissimilarity measure is an efficient and effective similarity measure of DNA sequences. Due to its high efficiency and accuracy, the proposed DFT similarity measure is successfully applied to phylogenetic analysis of individual genes and large whole bacterial genomes.
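The general pipeline described in this abstract (numerical mapping, DFT power spectrum, length equalization, Euclidean distance) can be sketched in plain Python. Two substitutions are made and should be read as assumptions: the four binary indicator sequences are used instead of the paper's 2D mapping, and simple linear interpolation stands in for the paper's even scaling algorithm, neither of which is specified in the abstract. All function names are hypothetical, and inputs are assumed to be non-empty.

```python
import cmath
import math

def power_spectrum(seq):
    """Summed DFT power spectrum of the four nucleotide indicator sequences."""
    n = len(seq)
    spectrum = [0.0] * n
    for base in "ACGT":
        indicator = [1.0 if ch == base else 0.0 for ch in seq]
        for k in range(n):
            # Naive O(n^2) DFT, adequate for short illustrative inputs.
            coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                        for t, x in enumerate(indicator))
            spectrum[k] += abs(coeff) ** 2
    return spectrum

def stretch(spectrum, length):
    """Linearly interpolate a spectrum onto `length` points.

    This is a stand-in for even scaling, not the paper's algorithm.
    """
    n = len(spectrum)
    if n == length:
        return list(spectrum)
    stretched = []
    for i in range(length):
        pos = i * (n - 1) / (length - 1)
        low = int(math.floor(pos))
        high = min(low + 1, n - 1)
        frac = pos - low
        stretched.append(spectrum[low] * (1 - frac) + spectrum[high] * frac)
    return stretched

def dft_distance(seq_a, seq_b):
    """Euclidean distance between length-equalized power spectra."""
    length = max(len(seq_a), len(seq_b))
    spec_a = stretch(power_spectrum(seq_a), length)
    spec_b = stretch(power_spectrum(seq_b), length)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(spec_a, spec_b)))

print(dft_distance("ACGTACGT", "ACGTACGT"))
print(dft_distance("ACGTACGT", "AAAAAA"))
```

A pairwise matrix of such distances could then be passed to a hierarchical clustering routine to build a tree, which is the role the dissimilarity measure plays in the abstract. For genome-scale inputs a fast Fourier transform would replace the naive DFT loop.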
Complete mtDNA sequencing reveals mutations m.9185T>C and m.13513G>A in three patients with Leigh syndrome.

PubMed

Pelnena, Dita; Burnyte, Birute; Jankevics, Eriks; Lace, Baiba; Dagyte, Evelina; Grigalioniene, Kristina; Utkus, Algirdas; Krumina, Zita; Rozentale, Jolanta; Adomaitiene, Irina; Stavusis, Janis; Pliss, Liana; Inashkina, Inna

2017-12-12

The most common mitochondrial disorder in children is Leigh syndrome, which is a progressive and genetically heterogeneous neurodegenerative disorder caused by mutations in nuclear genes or mitochondrial DNA (mtDNA). In the present study, a novel and robust method of complete mtDNA sequencing, which allows amplification of the whole mitochondrial genome, was tested. Complete mtDNA sequencing was performed in a cohort of patients with suspected mitochondrial mutations. Patients from Latvia and Lithuania (n = 92 and n = 57, respectively) referred by clinical geneticists were included. The de novo point mutations m.9185T>C and m.13513G>A, respectively, were detected in two patients with lactic acidosis and neurodegenerative lesions. In one patient with neurodegenerative lesions, the mutation m.9185T>C was identified. These mutations are associated with Leigh syndrome. The present data suggest that full-length mtDNA sequencing is recommended as a supplement to nuclear gene testing and enzymatic assays to enhance mitochondrial disease diagnostics.
Detection of herpes simplex virus-specific DNA sequences in latently infected mice and in humans.

PubMed

Efstathiou, S; Minson, A C; Field, H J; Anderson, J R; Wildy, P

1986-02-01

Herpes simplex virus-specific DNA sequences have been detected by Southern hybridization analysis in both central and peripheral nervous system tissues of latently infected mice. We have detected virus-specific sequences corresponding to the junction fragment but not the genomic termini, an observation first made by Rock and Fraser (Nature [London] 302:523-525, 1983). This "endless" herpes simplex virus DNA is both qualitatively and quantitatively stable in mouse neural tissue analyzed over a 4-month period. In addition, examination of DNA extracted from human trigeminal ganglia has shown herpes simplex virus DNA to be present in an "endless" form similar to that found in the mouse model system. Further restriction enzyme analysis of latently infected mouse brainstem and human trigeminal DNA has shown that this "endless" herpes simplex virus DNA is present in all four isomeric configurations.
Detection of herpes simplex virus-specific DNA sequences in latently infected mice and in humans.

PubMed Central

Efstathiou, S; Minson, A C; Field, H J; Anderson, J R; Wildy, P

1986-01-01

Herpes simplex virus-specific DNA sequences have been detected by Southern hybridization analysis in both central and peripheral nervous system tissues of latently infected mice. We have detected virus-specific sequences corresponding to the junction fragment but not the genomic termini, an observation first made by Rock and Fraser (Nature [London] 302:523-525, 1983). This "endless" herpes simplex virus DNA is both qualitatively and quantitatively stable in mouse neural tissue analyzed over a 4-month period. In addition, examination of DNA extracted from human trigeminal ganglia has shown herpes simplex virus DNA to be present in an "endless" form similar to that found in the mouse model system. Further restriction enzyme analysis of latently infected mouse brainstem and human trigeminal DNA has shown that this "endless" herpes simplex virus DNA is present in all four isomeric configurations. Images PMID:3003377
SeqCompress: an algorithm for biological sequence compression.

PubMed

Sardaraz, Muhammad; Tahir, Muhammad; Ikram, Ataul Aziz; Bajwa, Hassan

2014-10-01

The growth of Next Generation Sequencing technologies presents significant research challenges, specifically the design of bioinformatics tools that handle massive amounts of data efficiently. The cost of storing biological sequence data has become a noticeable proportion of the total cost of generation and analysis. In particular, the increase in DNA sequencing rate is significantly outstripping the rate of increase in disk storage capacity, and may exceed the available storage capacity. It is essential to develop algorithms that handle large data sets via better memory management. This article presents a DNA sequence compression algorithm, SeqCompress, that copes with the space complexity of biological sequences. The algorithm is based on lossless data compression and uses a statistical model as well as arithmetic coding to compress DNA sequences. The proposed algorithm is compared with recent specialized compression tools for biological sequences. Experimental results show that the proposed algorithm has better compression gain than other existing algorithms. Copyright © 2014 Elsevier Inc. All rights reserved.
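To illustrate how a statistical model and arithmetic coding fit together, the sketch below estimates the size an ideal arithmetic coder would approach under a simple adaptive context model: each base costs about -log2(p) bits, where p is the model's predicted probability. This is not SeqCompress itself. The order-k context model, the Laplace smoothing, and the function name are assumptions chosen for illustration, and the input is assumed to contain only uppercase A, C, G, T.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"


def ideal_code_length_bits(seq, order=2):
    """Estimate the arithmetic-coded size of seq in bits.

    Uses an adaptive order-`order` context model with Laplace smoothing
    (every base starts with a count of 1 in every context). The returned
    value is the sum of -log2(p) over all bases, which an arithmetic
    coder driven by the same model would closely approach.
    """
    counts = defaultdict(lambda: {base: 1 for base in ALPHABET})
    total_bits = 0.0
    for i, base in enumerate(seq):
        context = seq[max(0, i - order):i]
        table = counts[context]
        prob = table[base] / sum(table.values())
        total_bits += -math.log2(prob)
        table[base] += 1  # update the model after coding the base
    return total_bits
```

Dividing the result by `len(seq)` gives bits per base, which can be compared against the 2 bits per base of a fixed-width encoding.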
Cloning and sequence analysis of complementary DNA encoding an aberrantly rearranged human T-cell gamma chain.

PubMed Central

Dialynas, D P; Murre, C; Quertermous, T; Boss, J M; Leiden, J M; Seidman, J G; Strominger, J L

1986-01-01

Complementary DNA (cDNA) encoding a human T-cell gamma chain has been cloned and sequenced. At the junction of the variable and joining regions, there is an apparent deletion of two nucleotides in the human cDNA sequence relative to the murine gamma-chain cDNA sequence, resulting simultaneously in the generation of an in-frame stop codon and in a translational frameshift. For this reason, the sequence presented here encodes an aberrantly rearranged human T-cell gamma chain. There are several surprising differences between the deduced human and murine gamma-chain amino acid sequences. These include poor homology in the variable region, poor homology in a discrete segment of the constant region precisely bounded by the expected junctions of exon CII, and the presence in the human sequence of five potential sites for N-linked glycosylation. Images PMID:3458221
Sequencing historical specimens: successful preparation of small specimens with low amounts of degraded DNA.

PubMed

Sproul, John S; Maddison, David R

2017-11-01

Despite advances that allow DNA sequencing of old museum specimens, sequencing small-bodied, historical specimens can be challenging and unreliable as many contain only small amounts of fragmented DNA. Dependable methods to sequence such specimens are especially critical if the specimens are unique. We attempt to sequence small-bodied (3-6 mm) historical specimens (including nomenclatural types) of beetles that have been housed, dried, in museums for 58-159 years, and for which few or no suitable replacement specimens exist. To better understand ideal approaches of sample preparation and produce preparation guidelines, we compared different library preparation protocols using low amounts of input DNA (1-10 ng). We also explored low-cost optimizations designed to improve library preparation efficiency and sequencing success of historical specimens with minimal DNA, such as enzymatic repair of DNA. We report successful sample preparation and sequencing for all historical specimens despite our low-input DNA approach. We provide a list of guidelines related to DNA repair, bead handling, reducing adapter dimers and library amplification. We present these guidelines to facilitate more economical use of valuable DNA and enable more consistent results in projects that aim to sequence challenging, irreplaceable historical specimens. © 2017 John Wiley & Sons Ltd.

Nature and distribution of feline sarcoma virus nucleotide sequences.

PubMed Central

Frankel, A E; Gilbert, J H; Porzig, K J; Scolnick, E M; Aaronson, S A

1979-01-01

The genomes of three independent isolates of feline sarcoma virus (FeSV) were compared by molecular hybridization techniques. Using complementary DNAs prepared from two strains, SM- and ST-FeSV, common complementary DNAs were selected by sequential hybridization to FeSV and feline leukemia virus RNAs. These DNAs were shown to be highly related among the three independent sarcoma virus isolates. FeSV-specific complementary DNAs were prepared by selection for hybridization by the homologous FeSV RNA and against hybridization by feline leukemia virus RNA. Sarcoma virus-specific sequences of SM-FeSV were shown to differ from those of either ST- or GA-FeSV strains, whereas ST-FeSV-specific DNA shared extensive sequence homology with GA-FeSV. By molecular hybridization, each set of FeSV-specific sequences was demonstrated to be present in normal cat cellular DNA in approximately one copy per haploid genome and was conserved throughout Felidae. In contrast, FeSV-common sequences were present in multiple DNA copies and were found only in Mediterranean cats. The present results are consistent with the concept that each FeSV strain has arisen by a mechanism involving recombination between feline leukemia virus and cat cellular DNA sequences, the latter represented within the cat genome in a manner analogous to that of a cellular gene. PMID:225544
An Evolutionary Classification of Genomic Function

PubMed Central

Graur, Dan; Zheng, Yichen; Azevedo, Ricardo B.R.

2015-01-01

The pronouncements of the ENCODE Project Consortium regarding “junk DNA” exposed the need for an evolutionary classification of genomic elements according to their selected-effect function. In the classification scheme presented here, we divide the genome into “functional DNA,” that is, DNA sequences that have a selected-effect function, and “rubbish DNA,” that is, sequences that do not. Functional DNA is further subdivided into “literal DNA” and “indifferent DNA.” In literal DNA, the order of nucleotides is under selection; in indifferent DNA, only the presence or absence of the sequence is under selection. Rubbish DNA is further subdivided into “junk DNA” and “garbage DNA.” Junk DNA neither contributes to nor detracts from the fitness of the organism and, hence, evolves under selective neutrality. Garbage DNA, on the other hand, decreases the fitness of its carriers. Garbage DNA exists in the genome only because natural selection is neither omnipotent nor instantaneous. Each of these four functional categories can be 1) transcribed and translated, 2) transcribed but not translated, or 3) not transcribed. The affiliation of a DNA segment to a particular functional category may change during evolution: Functional DNA may become junk DNA, junk DNA may become garbage DNA, rubbish DNA may become functional DNA, and so on; however, determining the functionality or nonfunctionality of a genomic sequence must be based on its present status rather than on its potential to change (or not to change) in the future. Changes in functional affiliation are divided into pseudogenes, Lazarus DNA, zombie DNA, and Jekyll-to-Hyde DNA. PMID:25635041
[Whole Genome Sequencing of Human mtDNA Based on Ion Torrent PGM™ Platform].

PubMed

Cao, Y; Zou, K N; Huang, J P; Ma, K; Ping, Y

2017-08-01

To analyze and detect the whole genome sequence of human mitochondrial DNA (mtDNA) by Ion Torrent PGM™ platform and to study the differences of mtDNA sequence in different tissues. Samples were collected from 6 unrelated individuals by forensic postmortem examination, including chest blood, hair, costicartilage, nail, skeletal muscle and oral epithelium. Amplification of whole genome sequence of mtDNA was performed by 4 pairs of primer. Libraries were constructed with Ion Shear™ Plus Reagents kit and Ion Plus Fragment Library kit. Whole genome sequencing of mtDNA was performed using Ion Torrent PGM™ platform. Sanger sequencing was used to determine the heteroplasmy positions and the mutation positions on HVⅠ region. The whole genome sequence of mtDNA was amplified successfully from all samples. Six unrelated individuals belonged to 6 different haplotypes. Different tissues in one individual showed heteroplasmy differences. The heteroplasmy positions and the mutation positions on HVⅠ region were verified by Sanger sequencing. After a consistency check by the Kappa method, it was found that the results of mtDNA sequence had a high consistency in different tissues. The testing method used in the present study for sequencing the whole genome sequence of human mtDNA can detect heteroplasmy differences between tissues, and its results show good consistency. The results provide guidance for the further applications of mtDNA in forensic science. Copyright© by the Editorial Department of Journal of Forensic Medicine
Using complementary DNA from MyoD-transduced fibroblasts to sequence large muscle genes.

PubMed

Waddell, Leigh B; Monnier, Nicole; Cooper, Sandra T; North, Kathryn N; Clarke, Nigel F

2011-08-01

Large muscle genes are often sequenced using complementary DNA (cDNA) made from muscle messenger RNA (mRNA) to reduce the cost and workload associated with sequencing from genomic DNA. Two potential barriers are the availability of a frozen muscle biopsy, and difficulties in detecting nonsense mutations due to nonsense-mediated mRNA decay (NMD). We present patient examples showing that use of MyoD-transduced fibroblasts as a source of muscle-specific mRNA overcomes these potential difficulties in sequencing large muscle-related genes. Copyright © 2011 Wiley Periodicals, Inc.
GENESUS: a two-step sequence design program for DNA nanostructure self-assembly.

PubMed

Tsutsumi, Takanobu; Asakawa, Takeshi; Kanegami, Akemi; Okada, Takao; Tahira, Tomoko; Hayashi, Kenshi

2014-01-01

DNA has been recognized as an ideal material for bottom-up construction of nanometer scale structures by self-assembly. The generation of sequences optimized for unique self-assembly (GENESUS) program reported here is a straightforward method for generating sets of strand sequences optimized for self-assembly of arbitrarily designed DNA nanostructures by a generate-candidates-and-choose-the-best strategy. A scalable procedure to prepare single-stranded DNA having arbitrary sequences is also presented. Strands for the assembly of various structures were designed and successfully constructed, validating both the program and the procedure.
Future collaborations between NEON and the U.S. EPA: linking molecular genomics for bioassessment with national ecological data sets

EPA Science Inventory

Molecular taxonomic techniques such as DNA barcoding offer interesting new capabilities for studying community biodiversity for applications like biological monitoring. Beyond DNA barcoding, new DNA sequencing technologies (i.e. Next-Generation Sequencing) present even greater po...
DNA barcode goes two-dimensions: DNA QR code web server.

PubMed

Liu, Chang; Shi, Linchun; Xu, Xiaolan; Li, Huan; Xing, Hang; Liang, Dong; Jiang, Kun; Pang, Xiaohui; Song, Jingyuan; Chen, Shilin

2012-01-01

The DNA barcoding technology uses a standard region of DNA sequence for species identification and discovery. At present, "DNA barcode" actually refers to DNA sequences, which are not amenable to information storage, recognition, and retrieval. Our aim is to identify the best symbology that can represent DNA barcode sequences in practical applications. A comprehensive set of sequences for five DNA barcode markers ITS2, rbcL, matK, psbA-trnH, and CO1 was used as the test data. Fifty-three different types of one-dimensional and ten two-dimensional barcode symbologies were compared based on different criteria, such as coding capacity, compression efficiency, and error detection ability. The quick response (QR) code was found to have the largest coding capacity and relatively high compression ratio. To facilitate the further usage of QR code-based DNA barcodes, a web server was developed and is accessible at http://qrfordna.dnsalias.org. The web server allows users to retrieve the QR code for a species of interest, convert a DNA sequence to and from a QR code, and perform species identification based on local and global sequence similarities. In summary, the first comprehensive evaluation of various barcode symbologies has been carried out. The QR code has been found to be the most appropriate symbology for DNA barcode sequences. A web server has also been constructed to allow biologists to utilize QR codes in practical DNA barcoding applications.
Assessing the Fidelity of Ancient DNA Sequences Amplified From Nuclear Genes

PubMed Central

Binladen, Jonas; Wiuf, Carsten; Gilbert, M. Thomas P.; Bunce, Michael; Barnett, Ross; Larson, Greger; Greenwood, Alex D.; Haile, James; Ho, Simon Y. W.; Hansen, Anders J.; Willerslev, Eske

2006-01-01

To date, the field of ancient DNA has relied almost exclusively on mitochondrial DNA (mtDNA) sequences. However, a number of recent studies have reported the successful recovery of ancient nuclear DNA (nuDNA) sequences, thereby allowing the characterization of genetic loci directly involved in phenotypic traits of extinct taxa. It is well documented that postmortem damage in ancient mtDNA can lead to the generation of artifactual sequences. However, as yet no one has thoroughly investigated the damage spectrum in ancient nuDNA. By comparing clone sequences from 23 fossil specimens, recovered from environments ranging from permafrost to desert, we demonstrate the presence of miscoding lesion damage in both the mtDNA and nuDNA, resulting in insertion of erroneous bases during amplification. Interestingly, no significant differences in the frequency of miscoding lesion damage are recorded between mtDNA and nuDNA despite great differences in cellular copy numbers. For both mtDNA and nuDNA, we find significant positive correlations between total sequence heterogeneity and the rates of type 1 transitions (adenine → guanine and thymine → cytosine) and type 2 transitions (cytosine → thymine and guanine → adenine), respectively. Type 2 transitions are by far the most dominant and increase relative to those of type 1 with damage load. The results suggest that the deamination of cytosine (and 5-methyl cytosine) to uracil (and thymine) is the main cause of miscoding lesions in both ancient mtDNA and nuDNA sequences. We argue that the problems presented by postmortem damage, as well as problems with contamination from exogenous sources of conserved nuclear genes, allelic variation, and the reliance on single nucleotide polymorphisms, call for great caution in studies relying on ancient nuDNA sequences. PMID:16299392
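The two transition classes named in this abstract can be tallied with a short sketch like the one below. It assumes each clone sequence has already been aligned, without gaps, to a reference or consensus sequence of the same length. The function name and the "other" bucket are illustrative assumptions, not part of the study's method.

```python
# Transition classes as defined in the abstract:
#   type 1: adenine -> guanine, thymine -> cytosine
#   type 2: cytosine -> thymine, guanine -> adenine
TYPE1 = {("A", "G"), ("T", "C")}
TYPE2 = {("C", "T"), ("G", "A")}


def count_transitions(reference, clone):
    """Count type 1, type 2, and other substitutions in clone vs reference.

    Both sequences must be gap-free and of equal length (an assumption
    of this sketch).
    """
    if len(reference) != len(clone):
        raise ValueError("sequences must have the same length")
    counts = {"type1": 0, "type2": 0, "other": 0}
    for ref_base, obs_base in zip(reference.upper(), clone.upper()):
        if ref_base == obs_base:
            continue
        change = (ref_base, obs_base)
        if change in TYPE1:
            counts["type1"] += 1
        elif change in TYPE2:
            counts["type2"] += 1
        else:
            counts["other"] += 1
    return counts
```

Summing these counts over all clones from a specimen gives the per-class rates that the abstract correlates with total sequence heterogeneity.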
Divergent nuclear 18S rDNA paralogs in a turkey coccidium, Eimeria meleagrimitis, complicate molecular systematics and identification.

PubMed

El-Sherry, Shiem; Ogedengbe, Mosun E; Hafeez, Mian A; Barta, John R

2013-07-01

Multiple 18S rDNA sequences were obtained from two single-oocyst-derived lines of each of Eimeria meleagrimitis and Eimeria adenoeides. After analysing the 15 new 18S rDNA sequences from two lines of E. meleagrimitis and 17 new sequences from two lines of E. adenoeides, there were clear indications that divergent, paralogous 18S rDNA copies existed within the nuclear genome of E. meleagrimitis. In contrast, mitochondrial cytochrome c oxidase subunit I (COI) partial sequences from all lines of a particular Eimeria sp. were identical and, in phylogenetic analyses, COI sequences clustered unambiguously in monophyletic and highly-supported clades specific to individual Eimeria sp. Phylogenetic analysis of the new 18S rDNA sequences from E. meleagrimitis showed that they formed two distinct clades: Type A with four new sequences; and Type B with nine new sequences; both Types A and B sequences were obtained from each of the single-oocyst-derived lines of E. meleagrimitis. Together these rDNA types formed a well-supported E. meleagrimitis clade. Types A and B 18S rDNA sequences from E. meleagrimitis had a mean sequence identity of only 97.4% whereas mean sequence identity within types was 99.1-99.3%. The observed intraspecific sequence divergence among E. meleagrimitis 18S rDNA sequence types was even higher (approximately 2.6%) than the interspecific sequence divergence present between some well-recognized species such as Eimeria tenella and Eimeria necatrix (1.1%). Our observations suggest that, unlike COI sequences, 18S rDNA sequences are not reliable molecular markers to be used alone for species identification with coccidia, although 18S rDNA sequences have clear utility for phylogenetic reconstruction of apicomplexan parasites at the genus and higher taxonomic ranks. Copyright © 2013. Published by Elsevier Ltd.
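The identity and divergence figures quoted above are complementary: a mean identity of 97.4% corresponds to a divergence of about 2.6%. A minimal sketch of a pairwise identity calculation follows. It assumes the two sequences are already aligned and that columns containing a gap are ignored. How the authors handled gaps is not stated in the abstract, so that choice and the function name are assumptions.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned sequences, skipping gap columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # assumed convention: gap columns are not scored
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared
```

Percent divergence is then `100.0 - percent_identity(a, b)`.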
Genomic signal processing methods for computation of alignment-free distances from DNA sequences.

PubMed

Borrayo, Ernesto; Mendizabal-Ruiz, E Gerardo; Vélez-Pérez, Hugo; Romo-Vázquez, Rebeca; Mendizabal, Adriana P; Morales, J Alejandro

2014-01-01

Genomic signal processing (GSP) refers to the use of digital signal processing (DSP) tools for analyzing genomic data such as DNA sequences. A possible application of GSP that has not been fully explored is the computation of the distance between a pair of sequences. In this work we present GAFD, a novel GSP alignment-free distance computation method. We introduce a DNA sequence-to-signal mapping function based on the employment of doublet values, which increases the number of possible amplitude values for the generated signal. Additionally, we explore the use of three DSP distance metrics as descriptors for categorizing DNA signal fragments. Our results indicate the feasibility of employing GAFD for computing sequence distances and the use of descriptors for characterizing DNA fragments.
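The idea of a doublet-based sequence-to-signal mapping can be sketched as follows. With single nucleotides a signal can take only four amplitude values, whereas overlapping doublets allow sixteen. The specific amplitude assigned to each doublet here is a hypothetical choice, since the abstract does not give GAFD's actual values. The function name is also an assumption.

```python
# Hypothetical base indices. The doublet amplitude is 4 * first + second,
# giving the 16 distinct values 0..15. GAFD's real amplitudes may differ.
BASE_INDEX = {"A": 0, "C": 1, "G": 2, "T": 3}


def doublet_signal(seq):
    """Map a DNA sequence to a numeric signal using overlapping doublets."""
    seq = seq.upper()
    return [
        4 * BASE_INDEX[first] + BASE_INDEX[second]
        for first, second in zip(seq, seq[1:])
    ]
```

A sequence of length n yields a signal of length n - 1, which can then be fed to standard DSP distance metrics.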
Genomic Signal Processing Methods for Computation of Alignment-Free Distances from DNA Sequences

PubMed Central

Borrayo, Ernesto; Mendizabal-Ruiz, E. Gerardo; Vélez-Pérez, Hugo; Romo-Vázquez, Rebeca; Mendizabal, Adriana P.; Morales, J. Alejandro

2014-01-01

Genomic signal processing (GSP) refers to the use of digital signal processing (DSP) tools for analyzing genomic data such as DNA sequences. A possible application of GSP that has not been fully explored is the computation of the distance between a pair of sequences. In this work we present GAFD, a novel GSP alignment-free distance computation method. We introduce a DNA sequence-to-signal mapping function based on the employment of doublet values, which increases the number of possible amplitude values for the generated signal. Additionally, we explore the use of three DSP distance metrics as descriptors for categorizing DNA signal fragments. Our results indicate the feasibility of employing GAFD for computing sequence distances and the use of descriptors for characterizing DNA fragments. PMID:25393409
Rational design of DNA sequences for nanotechnology, microarrays and molecular computers using Eulerian graphs.

PubMed

Pancoska, Petr; Moravek, Zdenek; Moll, Ute M

2004-01-01

Nucleic acids are molecules of choice for both established and emerging nanoscale technologies. These technologies benefit from large functional densities of 'DNA processing elements' that can be readily manufactured. To achieve the desired functionality, polynucleotide sequences are currently designed by a process that involves tedious and laborious filtering of potential candidates against a series of requirements and parameters. Here, we present a complete novel methodology for the rapid rational design of large sets of DNA sequences. This method allows for the direct implementation of very complex and detailed requirements for the generated sequences, thus avoiding 'brute force' filtering. At the same time, these sequences have narrow distributions of melting temperatures. The molecular part of the design process can be done without computer assistance, using an efficient 'human engineering' approach by drawing a single blueprint graph that represents all generated sequences. Moreover, the method eliminates the necessity for extensive thermodynamic calculations. Melting temperature can be calculated only once (or not at all). In addition, the isostability of the sequences is independent of the selection of a particular set of thermodynamic parameters. Applications are presented for DNA sequence designs for microarrays, universal microarray zip sequences and electron transfer experiments.
A molecular model for illegitimate recombination in Bacillus subtilis.

PubMed

Temeyer, K B; Hopkins, K M; Chapman, L F

1991-01-01

The recombinant DNA junctions at which pUB110 and Bacillus subtilis chromosomal DNA were joined to form the plasmid pKBT1 were cloned and sequenced. From the sequencing data we conclude that the pUB110 sequence is intact in the pair of cloned pKBT1 fragments and pTL12 sequences are not present. A molecular model for the formation of pKBT1 based on structural motifs characteristic of the joint sites is presented.
DNA Shape Dominates Sequence Affinity in Nucleosome Formation

NASA Astrophysics Data System (ADS)

Freeman, Gordon S.; Lequieu, Joshua P.; Hinckley, Daniel M.; Whitmer, Jonathan K.; de Pablo, Juan J.

2014-10-01

Nucleosomes provide the basic unit of compaction in eukaryotic genomes, and the mechanisms that dictate their position at specific locations along a DNA sequence are of central importance to genetics. In this Letter, we employ molecular models of DNA and proteins to elucidate various aspects of nucleosome positioning. In particular, we show how DNA's histone affinity is encoded in its sequence-dependent shape, including subtle deviations from the ideal straight B-DNA form and local variations of minor groove width. By relying on high-precision simulations of the free energy of nucleosome complexes, we also demonstrate that, depending on DNA's intrinsic curvature, histone binding can be dominated by bending interactions or electrostatic interactions. More generally, the results presented here explain how sequence, manifested as the shape of the DNA molecule, dominates molecular recognition in the problem of nucleosome positioning.
Impact of Lateral Transfers on the Genomes of Lepidoptera

PubMed Central

Drezen, Jean-Michel; Josse, Thibaut; Bézier, Annie; Gauthier, Jérémy; Huguet, Elisabeth

2017-01-01

Transfer of DNA sequences between species regardless of their evolutionary distance is very common in bacteria, but evidence that horizontal gene transfer (HGT) also occurs in multicellular organisms has been accumulating in the past few years. The actual extent of this phenomenon is underestimated due to frequent sequence filtering of “alien” DNA before genome assembly. However, recent studies based on genome sequencing have revealed, and experimentally verified, the presence of foreign DNA sequences in the genetic material of several species of Lepidoptera. Large DNA viruses, such as baculoviruses and the symbiotic viruses of parasitic wasps (bracoviruses), have the potential to mediate these transfers in Lepidoptera. In particular, using ultra-deep sequencing, newly integrated transposons have been identified within baculovirus genomes. Bacterial genes have also been acquired by genomes of Lepidoptera, as in other insects and nematodes. In addition, insertions of bracovirus sequences were present in the genomes of certain moth and butterfly lineages; these likely correspond to rearrangements of ancient integrations. The viral genes present in these sequences, sometimes of hymenopteran origin, have been co-opted by lepidopteran species to confer some protection against pathogens. PMID:29120392
DNABIT Compress - Genome compression algorithm.

PubMed

Rajarajeswari, Pothuraju; Apparao, Allam

2011-01-22

Data compression is concerned with how information is organized in data. Efficient storage means removal of redundancy from the data being stored in the DNA molecule. Data compression algorithms remove redundancy and are used to understand biologically important molecules. We present a compression algorithm, "DNABIT Compress", for DNA sequences, based on a novel algorithm of assigning binary bits to smaller segments of DNA bases in order to compress both repetitive and non-repetitive DNA sequences. Our proposed algorithm achieves the best compression ratio for DNA sequences for larger genomes. Significantly better compression results show that the "DNABIT Compress" algorithm is the best among the remaining compression algorithms. While achieving the best compression ratios for DNA sequences (genomes), our new DNABIT Compress algorithm significantly improves on the running time of all previous DNA compression programs. Assigning binary bits (Unique BIT CODE) to (Exact Repeats, Reverse Repeats) fragments of a DNA sequence is also a unique concept introduced in this algorithm for the first time in DNA compression. The proposed algorithm could achieve a compression ratio as low as 1.58 bits/base, whereas the best existing methods could not achieve a ratio below 1.72 bits/base.
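For context on the bits-per-base figures, the sketch below shows the plain fixed-width baseline of 2 bits per base, against which ratios such as 1.58 or 1.72 bits/base are improvements. This is not the DNABIT Compress algorithm, whose bit-code assignments for repeats are not described in the abstract. The code table and function names are illustrative assumptions, and the input is assumed to contain only uppercase A, C, G, T.

```python
# Fixed 2-bit code per base: the naive baseline, not DNABIT Compress.
CODES = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
BASES = "ACGT"


def pack(seq):
    """Pack a DNA string into bytes at 2 bits per base.

    Returns (packed_bytes, base_count). The count is needed to decode,
    because the final byte may contain padding bits.
    """
    value = 0
    for base in seq:
        value = (value << 2) | CODES[base]
    n_bytes = (2 * len(seq) + 7) // 8
    return value.to_bytes(n_bytes, "big"), len(seq)


def unpack(packed, base_count):
    """Inverse of pack: recover the DNA string from the packed bytes."""
    value = int.from_bytes(packed, "big")
    bases = []
    for i in range(base_count):
        shift = 2 * (base_count - 1 - i)
        bases.append(BASES[(value >> shift) & 0b11])
    return "".join(bases)
```

For example, `pack("GATTACA")` stores seven bases in two bytes, and `unpack` restores the original string from those bytes and the base count.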
cgDNAweb: a web interface to the cgDNA sequence-dependent coarse-grain model of double-stranded DNA.

PubMed

De Bruin, Lennart; Maddocks, John H

2018-06-14

The sequence-dependent statistical mechanical properties of fragments of double-stranded DNA are believed to be pertinent to its biological function at length scales from a few base pairs (or bp) to a few hundreds of bp, e.g. indirect read-out protein binding sites, nucleosome positioning sequences, phased A-tracts, etc. In turn, the equilibrium statistical mechanics behaviour of DNA depends upon its ground state configuration, or minimum free energy shape, as well as on its fluctuations as governed by its stiffness (in an appropriate sense). We here present cgDNAweb, which provides browser-based interactive visualization of the sequence-dependent ground states of double-stranded DNA molecules, as predicted by the underlying cgDNA coarse-grain rigid-base model of fragments with arbitrary sequence. The cgDNAweb interface is specifically designed to facilitate comparison between ground state shapes of different sequences. The server is freely available at cgDNAweb.epfl.ch with no login requirement.
Mapping Simple Repeated DNA Sequences in Heterochromatin of Drosophila Melanogaster

PubMed Central

Lohe, A. R.; Hilliker, A. J.; Roberts, P. A.

1993-01-01

Heterochromatin in Drosophila has unusual genetic, cytological and molecular properties. Highly repeated DNA sequences (satellites) are the principal component of heterochromatin. Using probes from cloned satellites, we have constructed a chromosome map of 10 highly repeated, simple DNA sequences in heterochromatin of mitotic chromosomes of Drosophila melanogaster. Despite extensive sequence homology among some satellites, chromosomal locations could be distinguished by stringent in situ hybridizations for each satellite. Only two of the localizations previously determined using gradient-purified bulk satellite probes are correct. Eight new satellite localizations are presented, providing a megabase-level chromosome map of one-quarter of the genome. Five major satellites each exhibit a multichromosome distribution, and five minor satellites hybridize to single sites on the Y chromosome. Satellites closely related in sequence are often located near one another on the same chromosome. About 80% of Y chromosome DNA is composed of nine simple repeated sequences, in particular (AAGAC)(n) (8 Mb), (AAGAG)(n) (7 Mb) and (AATAT)(n) (6 Mb). Similarly, more than 70% of the DNA in chromosome 2 heterochromatin is composed of five simple repeated sequences. We have also generated a high resolution map of satellites in chromosome 2 heterochromatin, using a series of translocation chromosomes whose breakpoints in heterochromatin were ordered by N-banding. Finally, staining and banding patterns of heterochromatic regions are correlated with the locations of specific repeated DNA sequences. The basis for the cytochemical heterogeneity in banding appears to depend exclusively on the different satellite DNAs present in heterochromatin. PMID:8375654
DNA capture and next-generation sequencing can recover whole mitochondrial genomes from highly degraded samples for human identification

PubMed Central

2013-01-01

Background: Mitochondrial DNA (mtDNA) typing can be a useful aid for identifying people from compromised samples when nuclear DNA is too damaged, degraded or below detection thresholds for routine short tandem repeat (STR)-based analysis. Standard mtDNA typing, focused on PCR amplicon sequencing of the control region (HVS I and HVS II), is limited by the resolving power of this short sequence, which misses up to 70% of the variation present in the mtDNA genome. Methods: We used in-solution hybridisation-based DNA capture (using DNA capture probes prepared from modern human mtDNA) to recover mtDNA from post-mortem human remains in which the majority of DNA is both highly fragmented (<100 base pairs in length) and chemically damaged. The method ‘immortalises’ the finite quantities of DNA in valuable extracts as DNA libraries, which is followed by the targeted enrichment of endogenous mtDNA sequences and characterisation by next-generation sequencing (NGS). Results: We sequenced whole mitochondrial genomes for human identification from samples where standard nuclear STR typing produced only partial profiles or demonstrably failed and/or where standard mtDNA hypervariable region sequences lacked resolving power. Multiple rounds of enrichment can substantially improve coverage and sequencing depth of mtDNA genomes from highly degraded samples. The application of this method has led to the reliable mitochondrial sequencing of human skeletal remains from unidentified World War Two (WWII) casualties approximately 70 years old and from archaeological remains (up to 2,500 years old). Conclusions: This approach has potential applications in forensic science, historical human identification cases, archived medical samples, kinship analysis and population studies. In particular the methodology can be applied to any case, involving human or non-human species, where whole mitochondrial genome sequences are required to provide the highest level of maternal lineage discrimination. Multiple rounds of in-solution hybridisation-based DNA capture can retrieve whole mitochondrial genome sequences from even the most challenging samples. PMID:24289217
DNA Barcode Goes Two-Dimensions: DNA QR Code Web Server

PubMed Central

Li, Huan; Xing, Hang; Liang, Dong; Jiang, Kun; Pang, Xiaohui; Song, Jingyuan; Chen, Shilin

2012-01-01

The DNA barcoding technology uses a standard region of DNA sequence for species identification and discovery. At present, “DNA barcode” actually refers to DNA sequences, which are not amenable to information storage, recognition, and retrieval. Our aim is to identify the best symbology that can represent DNA barcode sequences in practical applications. A comprehensive set of sequences for five DNA barcode markers (ITS2, rbcL, matK, psbA-trnH, and CO1) was used as the test data. Fifty-three different types of one-dimensional and ten two-dimensional barcode symbologies were compared based on different criteria, such as coding capacity, compression efficiency, and error detection ability. The quick response (QR) code was found to have the largest coding capacity and a relatively high compression ratio. To facilitate the further usage of QR code-based DNA barcodes, a web server was developed and is accessible at http://qrfordna.dnsalias.org. The web server allows users to retrieve the QR code for a species of interest, convert a DNA sequence to and from a QR code, and perform species identification based on local and global sequence similarities. In summary, the first comprehensive evaluation of various barcode symbologies has been carried out. The QR code has been found to be the most appropriate symbology for DNA barcode sequences. A web server has also been constructed to allow biologists to utilize QR codes in practical DNA barcoding applications. PMID:22574113

i-rDNA: alignment-free algorithm for rapid in silico detection of ribosomal gene fragments from metagenomic sequence data sets.

PubMed

Mohammed, Monzoorul Haque; Ghosh, Tarini Shankar; Chadaram, Sudha; Mande, Sharmila S

2011-11-30

Obtaining accurate estimates of microbial diversity using rDNA profiling is the first step in most metagenomics projects. Consequently, most metagenomic projects spend considerable amounts of time, money and manpower for experimentally cloning, amplifying and sequencing the rDNA content in a metagenomic sample. In the second step, the entire genomic content of the metagenome is extracted, sequenced and analyzed. Since DNA sequences obtained in this second step also contain rDNA fragments, rapid in silico identification of these rDNA fragments would drastically reduce the cost, time and effort of current metagenomic projects by entirely bypassing the experimental steps of primer based rDNA amplification, cloning and sequencing. In this study, we present an algorithm called i-rDNA that can facilitate the rapid detection of 16S rDNA fragments from amongst millions of sequences in metagenomic data sets with high detection sensitivity. Performance evaluation with data sets/database variants simulating typical metagenomic scenarios indicates the significantly high detection sensitivity of i-rDNA. Moreover, i-rDNA can process a million sequences in less than an hour on a simple desktop with modest hardware specifications. In addition to the speed of execution, high sensitivity and low false positive rate, the utility of the algorithmic approach discussed in this paper is immense given that it would help in bypassing the entire experimental step of primer-based rDNA amplification, cloning and sequencing. Application of this algorithmic approach would thus drastically reduce the cost, time and human efforts invested in all metagenomic projects. A web-server for the i-rDNA algorithm is available at http://metagenomics.atc.tcs.com/i-rDNA/
Application of discrete Fourier inter-coefficient difference for assessing genetic sequence similarity.

PubMed

King, Brian R; Aburdene, Maurice; Thompson, Alex; Warres, Zach

2014-01-01

Digital signal processing (DSP) techniques for biological sequence analysis continue to grow in popularity due to the inherent digital nature of these sequences. DSP methods have demonstrated early success for detection of coding regions in a gene. Recently, these methods are being used to establish DNA gene similarity. We present the inter-coefficient difference (ICD) transformation, a novel extension of the discrete Fourier transformation, which can be applied to any DNA sequence. The ICD method is a mathematical, alignment-free DNA comparison method that generates a genetic signature for any DNA sequence that is used to generate relative measures of similarity among DNA sequences. We demonstrate our method on a set of insulin genes obtained from an evolutionarily wide range of species, and on a set of avian influenza viral sequences, which represents a set of highly similar sequences. We compare phylogenetic trees generated using our technique against trees generated using traditional alignment techniques for similarity and demonstrate that the ICD method produces a highly accurate tree without requiring an alignment prior to establishing sequence similarity.
Myopathic mtDNA Depletion Syndrome Due to Mutation in TK2 Gene.

PubMed

Martín-Hernández, Elena; García-Silva, María Teresa; Quijada-Fraile, Pilar; Rodríguez-García, María Elena; Rivera, Henry; Hernández-Laín, Aurelio; Coca-Robinot, David; Fernández-Toral, Joaquín; Arenas, Joaquín; Martín, Miguel A; Martínez-Azorín, Francisco

2017-01-01

Whole-exome sequencing was used to identify the disease gene(s) in a Spanish girl with failure to thrive, muscle weakness, mild facial weakness, elevated creatine kinase, deficiency of mitochondrial complex III and depletion of mtDNA. With whole-exome sequencing data, it was possible to obtain the whole mtDNA sequence and rule out any pathogenic variant in this genome. The analysis of the whole exome uncovered a homozygous pathogenic mutation in the thymidine kinase 2 gene (TK2; NM_004614.4:c.323C>T, p.T108M). TK2 mutations have been identified mainly in patients with the myopathic form of mtDNA depletion syndromes. This patient presents an atypical TK2-related myopathic form of mtDNA depletion syndromes, because despite having a very low content of mtDNA (<20%), she presents a slower and less severe evolution of the disease. In conclusion, our data confirm the role of the TK2 gene in mtDNA depletion syndromes and expand the phenotypic spectrum.
Compressing DNA sequence databases with coil.

PubMed

White, W Timothy J; Hendy, Michael D

2008-05-20

Publicly available DNA sequence databases such as GenBank are large, and are growing at an exponential rate. The sheer volume of data being dealt with presents serious storage and data communications problems. Currently, sequence data is usually kept in large "flat files," which are then compressed using standard Lempel-Ziv (gzip) compression - an approach which rarely achieves good compression ratios. While much research has been done on compressing individual DNA sequences, surprisingly little has focused on the compression of entire databases of such sequences. In this study we introduce the sequence database compression software coil. We have designed and implemented a portable software package, coil, for compressing and decompressing DNA sequence databases based on the idea of edit-tree coding. coil is geared towards achieving high compression ratios at the expense of execution time and memory usage during compression - the compression time represents a "one-off investment" whose cost is quickly amortised if the resulting compressed file is transmitted many times. Decompression requires little memory and is extremely fast. We demonstrate a 5% improvement in compression ratio over state-of-the-art general-purpose compression tools for a large GenBank database file containing Expressed Sequence Tag (EST) data. Finally, coil can efficiently encode incremental additions to a sequence database. coil presents a compelling alternative to conventional compression of flat files for the storage and distribution of DNA sequence databases having a narrow distribution of sequence lengths, such as EST data. Increasing compression levels for databases having a wide distribution of sequence lengths is a direction for future work.
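The gzip baseline mentioned in the abstract above can be illustrated with a small sketch. It is not coil's edit-tree coding, only a way to measure how many bits per base standard Lempel-Ziv compression spends on a flat text of bases. The two input sequences are synthetic and hypothetical; real database files also contain headers and annotations.

```python
import gzip
import random


def gzip_bits_per_base(seq: str) -> float:
    """Compressed size in bits divided by the number of bases."""
    compressed = gzip.compress(seq.encode("ascii"), compresslevel=9)
    return 8 * len(compressed) / len(seq)


# Hypothetical test data: one sequence with no structure, one highly repetitive.
rng = random.Random(0)
random_dna = "".join(rng.choice("ACGT") for _ in range(100_000))
repetitive_dna = "ACGTTGCA" * 12_500

print(f"random:     {gzip_bits_per_base(random_dna):.2f} bits/base")
print(f"repetitive: {gzip_bits_per_base(repetitive_dna):.2f} bits/base")
```

A sequence without repeats cannot be stored in fewer than about 2 bits per base by any lossless method, so a general-purpose compressor gains little on it; the repetitive sequence compresses far better. This is one plausible reading of why gzip "rarely achieves good compression ratios" on sequence data.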
Compressing DNA sequence databases with coil

PubMed Central

White, W Timothy J; Hendy, Michael D

2008-01-01

Background Publicly available DNA sequence databases such as GenBank are large, and are growing at an exponential rate. The sheer volume of data being dealt with presents serious storage and data communications problems. Currently, sequence data is usually kept in large "flat files," which are then compressed using standard Lempel-Ziv (gzip) compression – an approach which rarely achieves good compression ratios. While much research has been done on compressing individual DNA sequences, surprisingly little has focused on the compression of entire databases of such sequences. In this study we introduce the sequence database compression software coil. Results We have designed and implemented a portable software package, coil, for compressing and decompressing DNA sequence databases based on the idea of edit-tree coding. coil is geared towards achieving high compression ratios at the expense of execution time and memory usage during compression – the compression time represents a "one-off investment" whose cost is quickly amortised if the resulting compressed file is transmitted many times. Decompression requires little memory and is extremely fast. We demonstrate a 5% improvement in compression ratio over state-of-the-art general-purpose compression tools for a large GenBank database file containing Expressed Sequence Tag (EST) data. Finally, coil can efficiently encode incremental additions to a sequence database. Conclusion coil presents a compelling alternative to conventional compression of flat files for the storage and distribution of DNA sequence databases having a narrow distribution of sequence lengths, such as EST data. Increasing compression levels for databases having a wide distribution of sequence lengths is a direction for future work. PMID:18489794
Genetic diversity based on 28S rDNA sequences among populations of Culex quinquefasciatus collected at different locations in Tamil Nadu, India.

PubMed

Sakthivelkumar, S; Ramaraj, P; Veeramani, V; Janarthanan, S

2015-09-01

The aim of the present study was to detect any genetic variability among populations of Culex quinquefasciatus, which would be a valuable tool in the management of mosquito control programmes. In the present study, populations of Cx. quinquefasciatus collected at different locations in Tamil Nadu were analyzed for their genetic variation based on 28S rDNA D2 region nucleotide sequences. A high degree of genetic polymorphism was detected in the sequences of the D2 region of 28S rDNA on the basis of the predicted secondary structures, in spite of high nucleotide sequence similarity. The findings based on secondary structure using rDNA sequences suggested the existence of a complex genotypic diversity in the Cx. quinquefasciatus populations collected at different locations of Tamil Nadu, India. This complexity in genetic diversity within a single mosquito species collected at different locations is considered an important factor influencing the vector potential of these mosquitoes.
Method for introducing unidirectional nested deletions

DOEpatents

Dunn, John J.; Quesada, Mark A.; Randesi, Matthew

2001-01-01

Disclosed is a method for the introduction of unidirectional deletions in a cloned DNA segment in the context of a cloning vector which contains an f1 endonuclease recognition sequence adjacent to the insertion site of the DNA segment. Also disclosed is a method for producing single-stranded DNA probes utilizing the same cloning vector. An optimal vector, PZIP is described. Methods for introducing unidirectional deletions into a terminal location of a cloned DNA sequence which is inserted into the vector of the present invention are also disclosed. These methods are useful for introducing deletions into either or both ends of a cloned DNA insert, for high throughput sequencing of any DNA of interest.
Extreme-Depth Re-sequencing of Mitochondrial DNA Finds No Evidence of Paternal Transmission in Humans.

PubMed

Pyle, Angela; Hudson, Gavin; Wilson, Ian J; Coxhead, Jonathan; Smertenko, Tania; Herbert, Mary; Santibanez-Koref, Mauro; Chinnery, Patrick F

2015-05-01

Recent reports have questioned the accepted dogma that mammalian mitochondrial DNA (mtDNA) is strictly maternally inherited. In humans, the argument hinges on detecting a signature of inter-molecular recombination in mtDNA sequences sampled at the population level, inferring a paternal source for the mixed haplotypes. However, interpreting these data is fraught with difficulty, and direct experimental evidence is lacking. Using extreme-high depth mtDNA re-sequencing up to ~1.2 million-fold coverage, we find no evidence that paternal mtDNA haplotypes are transmitted to offspring in humans, thus excluding a simple dilution mechanism for uniparental transmission of mtDNA present in all healthy individuals. Our findings indicate that an active mechanism eliminates paternal mtDNA which likely acts at the molecular level.
Extreme-Depth Re-sequencing of Mitochondrial DNA Finds No Evidence of Paternal Transmission in Humans

PubMed Central

Pyle, Angela; Hudson, Gavin; Wilson, Ian J.; Coxhead, Jonathan; Smertenko, Tania; Herbert, Mary; Santibanez-Koref, Mauro; Chinnery, Patrick F.

2015-01-01

Recent reports have questioned the accepted dogma that mammalian mitochondrial DNA (mtDNA) is strictly maternally inherited. In humans, the argument hinges on detecting a signature of inter-molecular recombination in mtDNA sequences sampled at the population level, inferring a paternal source for the mixed haplotypes. However, interpreting these data is fraught with difficulty, and direct experimental evidence is lacking. Using extreme-high depth mtDNA re-sequencing up to ~1.2 million-fold coverage, we find no evidence that paternal mtDNA haplotypes are transmitted to offspring in humans, thus excluding a simple dilution mechanism for uniparental transmission of mtDNA present in all healthy individuals. Our findings indicate that an active mechanism eliminates paternal mtDNA which likely acts at the molecular level. PMID:25973765
Cellulases and coding sequences

DOEpatents

Li, Xin-Liang; Ljungdahl, Lars G.; Chen, Huizhong

2001-02-20

The present invention provides three fungal cellulases, their coding sequences, recombinant DNA molecules comprising the cellulase coding sequences, recombinant host cells and methods for producing same. The present cellulases are from Orpinomyces PC-2.
Cellulases and coding sequences

DOEpatents

Li, Xin-Liang; Ljungdahl, Lars G.; Chen, Huizhong

2001-01-01

The present invention provides three fungal cellulases, their coding sequences, recombinant DNA molecules comprising the cellulase coding sequences, recombinant host cells and methods for producing same. The present cellulases are from Orpinomyces PC-2.
Pulling out the 1%: Whole-Genome Capture for the Targeted Enrichment of Ancient DNA Sequencing Libraries

PubMed Central

Carpenter, Meredith L.; Buenrostro, Jason D.; Valdiosera, Cristina; Schroeder, Hannes; Allentoft, Morten E.; Sikora, Martin; Rasmussen, Morten; Gravel, Simon; Guillén, Sonia; Nekhrizov, Georgi; Leshtakov, Krasimir; Dimitrova, Diana; Theodossiev, Nikola; Pettener, Davide; Luiselli, Donata; Sandoval, Karla; Moreno-Estrada, Andrés; Li, Yingrui; Wang, Jun; Gilbert, M. Thomas P.; Willerslev, Eske; Greenleaf, William J.; Bustamante, Carlos D.

2013-01-01

Most ancient specimens contain very low levels of endogenous DNA, precluding the shotgun sequencing of many interesting samples because of cost. Ancient DNA (aDNA) libraries often contain <1% endogenous DNA, with the majority of sequencing capacity taken up by environmental DNA. Here we present a capture-based method for enriching the endogenous component of aDNA sequencing libraries. By using biotinylated RNA baits transcribed from genomic DNA libraries, we are able to capture DNA fragments from across the human genome. We demonstrate this method on libraries created from four Iron Age and Bronze Age human teeth from Bulgaria, as well as bone samples from seven Peruvian mummies and a Bronze Age hair sample from Denmark. Prior to capture, shotgun sequencing of these libraries yielded an average of 1.2% of reads mapping to the human genome (including duplicates). After capture, this fraction increased substantially, with up to 59% of reads mapped to human and enrichment ranging from 6- to 159-fold. Furthermore, we maintained coverage of the majority of regions sequenced in the precapture library. Intersection with the 1000 Genomes Project reference panel yielded an average of 50,723 SNPs (range 3,062–147,243) for the postcapture libraries sequenced with 1 million reads, compared with 13,280 SNPs (range 217–73,266) for the precapture libraries, increasing resolution in population genetic analyses. Our whole-genome capture approach makes it less costly to sequence aDNA from specimens containing very low levels of endogenous DNA, enabling the analysis of larger numbers of samples. PMID:24568772
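The fold-enrichment figures quoted above follow from a simple ratio: the fraction of reads mapping to the human genome after capture divided by the fraction before capture. The sketch below shows that arithmetic. The per-library fractions are hypothetical values chosen for illustration, since the abstract reports only the average precapture fraction and the overall ranges.

```python
def fold_enrichment(pre_fraction: float, post_fraction: float) -> float:
    """Ratio of the post-capture to the pre-capture fraction of mapped reads."""
    if pre_fraction <= 0:
        raise ValueError("pre-capture fraction must be positive")
    return post_fraction / pre_fraction


# Hypothetical libraries: (fraction mapped before capture, fraction after capture).
libraries = {
    "library_A": (0.004, 0.59),
    "library_B": (0.030, 0.18),
}

for name, (pre, post) in libraries.items():
    print(f"{name}: {fold_enrichment(pre, post):.1f}-fold")
```

The same post-capture fraction implies very different enrichment depending on how little endogenous DNA the library held to begin with, which is why the reported range is so wide.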
Circular replication-associated protein encoding DNA viruses identified in the faecal matter of various animals in New Zealand.

PubMed

Steel, Olivia; Kraberger, Simona; Sikorski, Alyssa; Young, Laura M; Catchpole, Ryan J; Stevens, Aaron J; Ladley, Jenny J; Coray, Dorien S; Stainton, Daisy; Dayaram, Anisha; Julian, Laurel; van Bysterveldt, Katherine; Varsani, Arvind

2016-09-01

In recent years, innovations in molecular techniques and sequencing technologies have resulted in a rapid expansion in the number of known viral sequences, in particular those with circular replication-associated protein (Rep)-encoding single-stranded (CRESS) DNA genomes. CRESS DNA viruses are present in the virome of many ecosystems and are known to infect a wide range of organisms. A large number of the recently identified CRESS DNA viruses cannot be classified into any known viral families, indicating that the current view of CRESS DNA viral sequence space is greatly underestimated. Animal faecal matter has proven to be a particularly useful source for sampling CRESS DNA viruses in an ecosystem, as it is cost-effective and non-invasive. In this study a viral metagenomic approach was used to explore the diversity of CRESS DNA viruses present in the faeces of domesticated and wild animals in New Zealand. Thirty-eight complete CRESS DNA viral genomes and two circular molecules (that may be defective molecules or single components of multicomponent genomes) were identified from forty-nine individual animal faecal samples. Based on shared genome organisations and sequence similarities, eighteen of the isolates were classified as gemycircularviruses and twelve isolates were classified as smacoviruses. The remaining eight isolates lack significant sequence similarity with any members of known CRESS DNA virus groups. This research adds significantly to our knowledge of CRESS DNA viral diversity in New Zealand, emphasising the prevalence of CRESS DNA viruses in nature, and reinforcing the suggestion that a large proportion of CRESS DNA viruses are yet to be identified. Copyright © 2016 Elsevier B.V. All rights reserved.
An algebraic hypothesis about the primeval genetic code architecture.

PubMed

Sánchez, Robersy; Grau, Ricardo

2009-09-01

A plausible architecture of an ancient genetic code is derived from an extended base triplet vector space over the Galois field of the extended base alphabet {D,A,C,G,U}, where symbol D represents one or more hypothetical bases with unspecific pairings. We hypothesized that the high degeneration of a primeval genetic code with five bases and the gradual origin and improvement of a primeval DNA repair system could make possible the transition from ancient to modern genetic codes. Our results suggest that the Watson-Crick base pairings G≡C and A=U and the non-specific base pairing of the hypothetical ancestral base D, used to define the sum and product operations, are enough features to determine the coding constraints of the primeval and the modern genetic code, as well as the transition from the former to the latter. Geometrical and algebraic properties of this vector space reveal that the present codon assignment of the standard genetic code could be induced from a primeval codon assignment. Besides, the Fourier spectrum of the extended DNA genome sequences derived from the multiple sequence alignment suggests that the so-called period-3 property of the present coding DNA sequences could also exist in the ancient coding DNA sequences. The phylogenetic analyses achieved with metrics defined in the N-dimensional vector space (B^3)^N of DNA sequences and with the new evolutionary model presented here also suggest that an ancient DNA coding sequence with five or more bases does not contradict the expected evolutionary history.
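The period-3 property mentioned above can be made concrete with a minimal sketch. It uses the common indicator-sequence approach on the standard four-base alphabet, not the extended five-base vector space of the paper, and the toy "codon-like" sequence is a hypothetical construction in which each codon position draws from a different set of bases.

```python
import cmath
import random


def spectrum_power(seq: str, k: int) -> float:
    """Sum over A, C, G, T of |DFT coefficient k|^2 of each base's 0/1 indicator sequence."""
    n = len(seq)
    total = 0.0
    for base in "ACGT":
        coeff = sum(
            cmath.exp(-2j * cmath.pi * k * pos / n)
            for pos, b in enumerate(seq)
            if b == base
        )
        total += abs(coeff) ** 2
    return total


def period3_ratio(seq: str) -> float:
    """Power at frequency n/3 divided by the mean power at the other frequencies below n/2."""
    n = len(seq) - len(seq) % 3          # trim so that n/3 is an integer frequency
    seq = seq[:n]
    peak = spectrum_power(seq, n // 3)
    others = [spectrum_power(seq, k) for k in range(1, n // 2) if k != n // 3]
    return peak / (sum(others) / len(others))


# Hypothetical toy data: position-dependent base usage mimics codon bias.
rng = random.Random(1)
codon_like = "".join(
    rng.choice("AG") + rng.choice("ACGT") + rng.choice("CT") for _ in range(100)
)
letters = list(codon_like)
rng.shuffle(letters)                      # same composition, periodicity destroyed
shuffled = "".join(letters)

print(f"codon-like: {period3_ratio(codon_like):.1f}")
print(f"shuffled:   {period3_ratio(shuffled):.1f}")
```

A ratio well above 1 indicates a spectral peak at period 3. The shuffled control has the same base composition, so any difference between the two values comes from the positional structure alone.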
Isolation and characterization of DNA from archaeological bone.

PubMed

Hagelberg, E; Clegg, J B

1991-04-22

DNA was extracted from human and animal bones recovered from archaeological sites and mitochondrial DNA sequences were amplified from the extracts using the polymerase chain reaction. Evidence is presented that the amplified sequences are authentic and do not represent contamination by extraneous DNA. The results show that significant amounts of genetic information can survive for long periods in bone, and have important implications for evolutionary genetics, anthropology and forensic science.
Using Playing Cards to Simulate a Molecular Clock

ERIC Educational Resources Information Center

Westerling, Karin E.

2008-01-01

Changes in DNA base-repair may serve as an indicator of the time elapsed since divergence from a common ancestor. DNA sequences can now be analyzed. The simulation presented in this article allows students to observe the accumulation of changes in a randomly mutating sequence of playing cards. The cards are analogous to DNA nucleotide or protein…
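The classroom simulation described above can also be sketched in code. This is a minimal analogue under stated assumptions, not the article's card procedure: a 52-position sequence of bases stands in for the deck, and each generation changes one randomly chosen position. The sequence length, number of generations and random seed are arbitrary illustrative choices.

```python
import random


def mutate(seq: list, rng: random.Random) -> None:
    """Replace one randomly chosen position with a different base, in place."""
    pos = rng.randrange(len(seq))
    seq[pos] = rng.choice([b for b in "ACGT" if b != seq[pos]])


def differences(a: list, b: list) -> int:
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


rng = random.Random(42)
ancestor = [rng.choice("ACGT") for _ in range(52)]
lineage = list(ancestor)

for generation in range(1, 21):
    mutate(lineage, rng)
    if generation % 5 == 0:
        print(generation, differences(ancestor, lineage))
```

The observed difference count can never exceed the number of generations and falls below it whenever the same position is hit more than once, which is the saturation effect that complicates real molecular clock estimates.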
Toward a new paradigm of DNA writing using a massively parallel sequencing platform and degenerate oligonucleotide

PubMed Central

Hwang, Byungjin; Bang, Duhee

2016-01-01

All synthetic DNA materials require prior programming of the building blocks of the oligonucleotide sequences. The development of a programmable microarray platform provides cost-effective and time-efficient solutions in the field of data storage using DNA. However, the scalability of the synthesis is not on par with the accelerating sequencing capacity. Here, we report on a new paradigm of generating genetic material (writing) using a degenerate oligonucleotide and optomechanical retrieval method that leverages sequencing (reading) throughput to generate the desired number of oligonucleotides. As a proof of concept, we demonstrate the feasibility of our concept in digital information storage in DNA. In simulation, the ability to store data is expected to exponentially increase with increase in degenerate space. The present study highlights the major framework change in conventional DNA writing paradigm as a sequencer itself can become a potential source of making genetic materials. PMID:27876825
Toward a new paradigm of DNA writing using a massively parallel sequencing platform and degenerate oligonucleotide.

PubMed

Hwang, Byungjin; Bang, Duhee

2016-11-23

All synthetic DNA materials require prior programming of the building blocks of the oligonucleotide sequences. The development of a programmable microarray platform provides cost-effective and time-efficient solutions in the field of data storage using DNA. However, the scalability of the synthesis is not on par with the accelerating sequencing capacity. Here, we report on a new paradigm of generating genetic material (writing) using a degenerate oligonucleotide and optomechanical retrieval method that leverages sequencing (reading) throughput to generate the desired number of oligonucleotides. As a proof of concept, we demonstrate the feasibility of our concept in digital information storage in DNA. In simulation, the ability to store data is expected to exponentially increase with increase in degenerate space. The present study highlights the major framework change in conventional DNA writing paradigm as a sequencer itself can become a potential source of making genetic materials.
A laboratory information management system for DNA barcoding workflows.

PubMed

Vu, Thuy Duong; Eberhardt, Ursula; Szöke, Szániszló; Groenewald, Marizeth; Robert, Vincent

2012-07-01

This paper presents a laboratory information management system for DNA sequences (LIMS) created and based on the needs of a DNA barcoding project at the CBS-KNAW Fungal Biodiversity Centre (Utrecht, the Netherlands). DNA barcoding is a global initiative for species identification through simple DNA sequence markers. We aim at generating barcode data for all strains (or specimens) included in the collection (currently ca. 80 k). The LIMS has been developed to better manage large amounts of sequence data and to keep track of the whole experimental procedure. The system has allowed us to classify strains more efficiently as the quality of sequence data has improved, and as a result, up-to-date taxonomic names have been given to strains and more accurate correlation analyses have been carried out.
UV-Visible Spectroscopy-Based Quantification of Unlabeled DNA Bound to Gold Nanoparticles.

PubMed

Baldock, Brandi L; Hutchison, James E

2016-12-20

DNA-functionalized gold nanoparticles have been increasingly applied as sensitive and selective analytical probes and biosensors. The DNA ligands bound to a nanoparticle dictate its reactivity, making it essential to know the type and number of DNA strands bound to the nanoparticle surface. Existing methods used to determine the number of DNA strands per gold nanoparticle (AuNP) require that the sequences be fluorophore-labeled, which may affect the DNA surface coverage and reactivity of the nanoparticle and/or require specialized equipment and other fluorophore-containing reagents. We report a UV-visible-based method to conveniently and inexpensively determine the number of DNA strands attached to AuNPs of different core sizes. When this method is used in tandem with a fluorescence dye assay, it is possible to determine the ratio of two unlabeled sequences of different lengths bound to AuNPs. Two sizes of citrate-stabilized AuNPs (5 and 12 nm) were functionalized with mixtures of short (5 base) and long (32 base) disulfide-terminated DNA sequences, and the ratios of sequences bound to the AuNPs were determined using the new method. The long DNA sequence was present as a lower proportion of the ligand shell than in the ligand exchange mixture, suggesting it had a lower propensity to bind the AuNPs than the short DNA sequence. The ratio of DNA sequences bound to the AuNPs was not the same for the large and small AuNPs, which suggests that the radius of curvature had a significant influence on the assembly of DNA strands onto the AuNPs.

DNABIT Compress – Genome compression algorithm

PubMed Central

Rajarajeswari, Pothuraju; Apparao, Allam

2011-01-01

Data compression is concerned with how information is organized in data. Efficient storage means removal of redundancy from the data being stored in the DNA molecule. Data compression algorithms remove redundancy and are used to understand biologically important molecules. We present a compression algorithm, “DNABIT Compress”, for DNA sequences, based on a novel scheme of assigning binary bits to smaller segments of DNA bases in order to compress both repetitive and non-repetitive DNA sequences. Our proposed algorithm achieves the best compression ratio for DNA sequences for larger genomes. Significantly better compression results show that the “DNABIT Compress” algorithm performs best among the compression algorithms compared. While achieving the best compression ratios for DNA sequences (genomes), our new DNABIT Compress algorithm significantly improves on the running time of all previous DNA compression programs. Assigning binary bits (a unique bit code) to fragments of DNA sequence (exact repeats, reverse repeats) is also a concept introduced in this algorithm for the first time in DNA compression. The proposed algorithm achieves a compression ratio as low as 1.58 bits/base, whereas the existing best methods could not achieve a ratio below 1.72 bits/base. PMID:21383923
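To put the bits-per-base figures above in context, the sketch below shows the plain fixed-width baseline of 2 bits per base. It is not the DNABIT Compress algorithm, whose bit-code assignment for repeats is not specified in the abstract; it only illustrates the reference point that repeat-aware methods try to beat. The function names and the example sequence are hypothetical.

```python
CODE = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
BASES = "ACGT"


def pack(seq: str) -> bytes:
    """Pack four bases into each byte, two bits per base, most significant bits first."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODE[base]
        byte <<= 2 * (4 - len(chunk))    # left-align a short final chunk
        out.append(byte)
    return bytes(out)


def unpack(data: bytes, length: int) -> str:
    """Inverse of pack; `length` is needed to drop padding in the last byte."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BASES[(byte >> shift) & 0b11])
    return "".join(bases[:length])


seq = "GATTACAGATTACA"
packed = pack(seq)
assert unpack(packed, len(seq)) == seq
print(f"{8 * len(packed) / len(seq):.2f} bits/base including padding")
```

Fixed-width packing approaches exactly 2 bits per base on long sequences, so ratios such as 1.58 or 1.72 bits/base are only reachable by exploiting redundancy such as exact and reverse repeats.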
Mapping the binding site of aflatoxin B1 in DNA: systematic analysis of the reactivity of aflatoxin B1 with guanines in different DNA sequences

DOE Office of Scientific and Technical Information (OSTI.GOV)

Benasutti, M.; Ejadi, S.; Whitlow, M.D.

The mutagenic and carcinogenic chemical aflatoxin B1 (AFB1) reacts almost exclusively at the N(7)-position of guanine following activation to its reactive form, the 8,9-epoxide (AFB1 oxide). In general, N(7)-guanine adducts yield DNA strand breaks when heated in base, a property that serves as the basis for the Maxam-Gilbert DNA sequencing reaction specific for guanine. Using DNA sequencing methods, other workers have shown that AFB1 oxide gives strand breaks at positions of guanines; however, the guanine bands varied in intensity. This phenomenon has been used to infer that AFB1 oxide prefers to react with guanines in some sequence contexts more than in others and has been referred to as sequence specificity of binding. Herein, data on the reaction of AFB1 oxide with several synthetic DNA polymers with different sequences are presented, and (following hydrolysis) adduct levels are determined by high-pressure liquid chromatography. These results reveal that for AFB1 oxide (1) the N(7)-guanine adduct is the major adduct found in all of the DNA polymers, (2) adduct levels vary in different sequences, and, thus, sequence specificity is also observed by this more direct method, and (3) the intensity of bands in DNA sequencing gels is likely to reflect adduct levels formed at the N(7)-position of guanine. Knowing this, a reinvestigation of the reactivity of guanines in different DNA sequences using DNA sequencing methods was undertaken. Methods are developed to determine that the X (5'-side) base and the Y (3'-side) base are most influential in determining guanine reactivity. These rules in conjunction with molecular modeling studies were used to assess the binding sites that might be utilized by AFB1 oxide in its reaction with DNA.
Conserved Sequences at the Origin of Adenovirus DNA Replication

PubMed Central

Stillman, Bruce W.; Topp, William C.; Engler, Jeffrey A.

1982-01-01

The origin of adenovirus DNA replication lies within an inverted sequence repetition at either end of the linear, double-stranded viral DNA. Initiation of DNA replication is primed by a deoxynucleoside that is covalently linked to a protein, which remains bound to the newly synthesized DNA. We demonstrate that virion-derived DNA-protein complexes from five human adenovirus serological subgroups (A to E) can act as a template for both the initiation and the elongation of DNA replication in vitro, using nuclear extracts from adenovirus type 2 (Ad2)-infected HeLa cells. The heterologous template DNA-protein complexes were not as active as the homologous Ad2 DNA, most probably due to inefficient initiation by Ad2 replication factors. In an attempt to identify common features which may permit this replication, we have also sequenced the inverted terminal repeated DNA from human adenovirus serotypes Ad4 (group E), Ad9 and Ad10 (group D), and Ad31 (group A), and we have compared these to previously determined sequences from Ad2 and Ad5 (group C), Ad7 (group B), and Ad12 and Ad18 (group A) DNA. In all cases, the sequence around the origin of DNA replication can be divided into two structural domains: a proximal A · T-rich region which is partially conserved among these serotypes, and a distal G · C-rich region which is less well conserved. The G · C-rich region contains sequences similar to sequences present in papovavirus replication origins. The two domains may reflect a dual mechanism for initiation of DNA replication: adenovirus-specific protein priming of replication, and subsequent utilization of this primer by host replication factors for completion of DNA synthesis. PMID:7143575
Enzyme-free detection and quantification of double-stranded nucleic acids.

PubMed

Feuillie, Cécile; Merheb, Maxime Mohamad; Gillet, Benjamin; Montagnac, Gilles; Hänni, Catherine; Daniel, Isabelle

2012-08-01

We have developed a fully enzyme-free SERRS hybridization assay for specific detection of double-stranded DNA sequences. Although all DNA detection methods ranging from PCR to high-throughput sequencing rely on enzymes, this method is unique for being totally non-enzymatic. The efficiency of enzymatic processes is affected by alterations, modifications, and/or quality of DNA. For instance, a limitation of most DNA polymerases is their inability to process DNA damaged by blocking lesions. As a result, enzymatic amplification and sequencing of degraded DNA often fail. In this study we succeeded in detecting and quantifying, within a mixture, relative amounts of closely related double-stranded DNA sequences from Rupicapra rupicapra (chamois) and Capra hircus (goat). The non-enzymatic SERRS assay presented here is the cornerstone of a promising approach to overcome the failure of DNA polymerase when DNA is too degraded or when the concentration of polymerase inhibitors is too high. It is the first time double-stranded DNA has been detected with a truly non-enzymatic SERRS-based method. This non-enzymatic, inexpensive, rapid assay is therefore a breakthrough in nucleic acid detection.
Spreadsheet-based program for alignment of overlapping DNA sequences.

PubMed

Anbazhagan, R; Gabrielson, E

1999-06-01

Molecular biology laboratories frequently face the challenge of aligning small overlapping DNA sequences derived from a long DNA segment. Here, we present a short program that can be used to adapt Excel spreadsheets as a tool for aligning DNA sequences, regardless of their orientation. The program runs on any Windows or Macintosh operating system computer with Excel 97 or Excel 98. The program is available for use as an Excel file, which can be downloaded from the BioTechniques Web site. Upon execution, the program opens a specially designed customized workbook and is capable of identifying overlapping regions between two sequence fragments and displaying the sequence alignment. It also performs a number of specialized functions such as recognition of restriction enzyme cutting sites and CpG island mapping without costly specialized software.
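The overlap detection described in this abstract can be illustrated with a minimal Python sketch. This is not the authors' Excel program: the function names, the exact-match criterion, and the `min_len` threshold are assumptions chosen for illustration. It tries the second fragment in both orientations and in either order, which is one simple way to align fragments "regardless of their orientation".

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def suffix_prefix_overlap(left: str, right: str, min_len: int = 4) -> int:
    """Length of the longest suffix of `left` that exactly equals a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_len - 1, -1):
        if left[-size:] == right[:size]:
            return size
    return 0


def best_overlap(a: str, b: str, min_len: int = 4):
    """Try b in both orientations and in either order; return (length, description)."""
    a = a.upper()
    b = b.upper()
    candidates = []
    for label, other in (("forward", b), ("reverse-complement", reverse_complement(b))):
        candidates.append((suffix_prefix_overlap(a, other, min_len), f"a then b ({label})"))
        candidates.append((suffix_prefix_overlap(other, a, min_len), f"b ({label}) then a"))
    # Keep the arrangement with the longest exact overlap.
    return max(candidates, key=lambda item: item[0])


# Hypothetical fragments: the second one was read from the opposite strand.
fragment_a = "ATGCGTACGTTAGC"
fragment_b = reverse_complement("TTAGCCGGATAA")
print(best_overlap(fragment_a, fragment_b))
```

Exact matching keeps the sketch short; real reads contain errors, so a practical tool would tolerate mismatches.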
A common deletion in two gamma ray induced rat pulmonary tumor cell lines.

PubMed

Van Klaveren, P; De Bruijne, J; Van der Winden, H; Kal, H B; Bentvelzen, P

1994-01-01

Subtraction hybridization was performed on normal WAG/Rij rat DNA with DNA from a syngeneic Ir-192 induced pulmonary tumor cell line L37. The residual DNA was amplified by means of sequence-independent PCR. This procedure yielded a sequence, of which multiple copies are present in normal rat DNA. In the tumor line L37 two restriction fragments hybridizing with this repeat sequence are lacking. In another Ir-192 induced pulmonary tumor line, L33, one of these fragments was also lacking. This indicates a common deletion in the two tumor lines.
Basic quantitative polymerase chain reaction using real-time fluorescence measurements.

PubMed

Ares, Manuel

2014-10-01

This protocol uses quantitative polymerase chain reaction (qPCR) to measure the number of DNA molecules containing a specific contiguous sequence in a sample of interest (e.g., genomic DNA or cDNA generated by reverse transcription). The sample is subjected to fluorescence-based PCR amplification and, theoretically, during each cycle, two new duplex DNA molecules are produced for each duplex DNA molecule present in the sample. The progress of the reaction during PCR is evaluated by measuring the fluorescence of dsDNA-dye complexes in real time. In the early cycles, DNA duplication is not detected because inadequate amounts of DNA are made. At a certain threshold cycle, DNA-dye complexes double each cycle for 8-10 cycles, until the DNA concentration becomes so high and the primer concentration so low that the reassociation of the product strands blocks efficient synthesis of new DNA and the reaction plateaus. There are two types of measurements: (1) the relative change of the target sequence compared to a reference sequence and (2) the determination of molecule number in the starting sample. The first requires a reference sequence, and the second requires a sample of the target sequence with known numbers of the molecules of sequence to generate a standard curve. By identifying the threshold cycle at which a sample first begins to accumulate DNA-dye complexes exponentially, an estimation of the numbers of starting molecules in the sample can be extrapolated. © 2014 Cold Spring Harbor Laboratory Press.
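The standard-curve measurement described in this abstract can be sketched in Python as follows. The data are synthetic, not from the protocol: they assume ideal doubling each cycle, so the threshold cycle falls by one for every two-fold increase in starting molecules. The constant `IDEAL_INTERCEPT` and all function names are hypothetical.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of threshold cycle (Ct) against log10(starting copies)."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amplification_efficiency(slope):
    """Fractional gain per cycle implied by the slope; 1.0 means perfect doubling."""
    return 10 ** (-1 / slope) - 1


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate starting copies for an unknown sample."""
    return 10 ** ((ct - intercept) / slope)


# Synthetic standards (assumption): ideal doubling, Ct = 40 - log2(copies).
IDEAL_INTERCEPT = 40.0
STANDARD_COPIES = [1e2, 1e3, 1e4, 1e5, 1e6]
STANDARD_CT = [IDEAL_INTERCEPT - math.log2(c) for c in STANDARD_COPIES]

slope, intercept = fit_standard_curve(STANDARD_COPIES, STANDARD_CT)
print(f"slope = {slope:.3f}, efficiency = {amplification_efficiency(slope):.3f}")
print(f"estimated copies at Ct 30: {copies_from_ct(30.0, slope, intercept):.0f}")
```

With real data the fitted slope usually departs from the ideal value of about -3.32, and the efficiency computed from it indicates how close the reaction comes to doubling per cycle.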
Molecular dynamics studies on the DNA-binding process of ERG.

PubMed

Beuerle, Matthias G; Dufton, Neil P; Randi, Anna M; Gould, Ian R

2016-11-15

The ETS family of transcription factors regulate gene targets by binding to a core GGAA DNA-sequence. The ETS factor ERG is required for homeostasis and lineage-specific functions in endothelial cells, some subset of haemopoietic cells and chondrocytes; its ectopic expression is linked to oncogenesis in multiple tissues. To date details of the DNA-binding process of ERG including DNA-sequence recognition outside the core GGAA-sequence are largely unknown. We combined available structural and experimental data to perform molecular dynamics simulations to study the DNA-binding process of ERG. In particular we were able to reproduce the ERG DNA-complex with a DNA-binding simulation starting in an unbound configuration with a final root-mean-square-deviation (RMSD) of 2.1 Å to the core ETS domain DNA-complex crystal structure. This allowed us to elucidate the relevance of amino acids involved in the formation of the ERG DNA-complex and to identify Arg385 as a novel key residue in the DNA-binding process. Moreover we were able to show that water-mediated hydrogen bonds are present between ERG and DNA in our simulations and that those interactions have the potential to achieve sequence recognition outside the GGAA core DNA-sequence. The methodology employed in this study shows the promising capabilities of modern molecular dynamics simulations in the field of protein DNA-interactions.
Unique nucleotide sequence-guided assembly of repetitive DNA parts for synthetic biology applications

DOE Office of Scientific and Technical Information (OSTI.GOV)

Torella, JP; Lienert, F; Boehm, CR

2014-08-07

Recombination-based DNA construction methods, such as Gibson assembly, have made it possible to easily and simultaneously assemble multiple DNA parts, and they hold promise for the development and optimization of metabolic pathways and functional genetic circuits. Over time, however, these pathways and circuits have become more complex, and the increasing need for standardization and insulation of genetic parts has resulted in sequence redundancies (for example, repeated terminator and insulator sequences) that complicate recombination-based assembly. We and others have recently developed DNA assembly methods, which we refer to collectively as unique nucleotide sequence (UNS)-guided assembly, in which individual DNA parts are flanked with UNSs to facilitate the ordered, recombination-based assembly of repetitive sequences. Here we present a detailed protocol for UNS-guided assembly that enables researchers to convert multiple DNA parts into sequenced, correctly assembled constructs, or into high-quality combinatorial libraries in only 2-3 d. If the DNA parts must be generated from scratch, an additional 2-5 d are necessary. This protocol requires no specialized equipment and can easily be implemented by a student with experience in basic cloning techniques.
Unique nucleotide sequence (UNS)-guided assembly of repetitive DNA parts for synthetic biology applications

PubMed Central

Torella, Joseph P.; Lienert, Florian; Boehm, Christian R.; Chen, Jan-Hung; Way, Jeffrey C.; Silver, Pamela A.

2016-01-01

Recombination-based DNA construction methods, such as Gibson assembly, have made it possible to easily and simultaneously assemble multiple DNA parts and hold promise for the development and optimization of metabolic pathways and functional genetic circuits. Over time, however, these pathways and circuits have become more complex, and the increasing need for standardization and insulation of genetic parts has resulted in sequence redundancies — for example repeated terminator and insulator sequences — that complicate recombination-based assembly. We and others have recently developed DNA assembly methods that we refer to collectively as unique nucleotide sequence (UNS)-guided assembly, in which individual DNA parts are flanked with UNSs to facilitate the ordered, recombination-based assembly of repetitive sequences. Here we present a detailed protocol for UNS-guided assembly that enables researchers to convert multiple DNA parts into sequenced, correctly-assembled constructs, or into high-quality combinatorial libraries in only 2–3 days. If the DNA parts must be generated from scratch, an additional 2–5 days are necessary. This protocol requires no specialized equipment and can easily be implemented by a student with experience in basic cloning techniques. PMID:25101822
Three 3D graphical representations of DNA primary sequences based on the classifications of DNA bases and their applications.

PubMed

Xie, Guosen; Mo, Zhongxi

2011-01-21

In this article, we introduce three 3D graphical representations of DNA primary sequences, which we call RY-curve, MK-curve and SW-curve, based on three classifications of the DNA bases. The advantages of our representations are that (i) these 3D curves are strictly non-degenerate and there is no loss of information when transferring a DNA sequence to its mathematical representation and (ii) the coordinates of every node on these 3D curves have clear biological implication. Two applications of these 3D curves are presented: (a) a simple formula is derived to calculate the content of the four bases (A, G, C and T) from the coordinates of nodes on the curves; and (b) a 12-component characteristic vector is constructed to compare similarity among DNA sequences from different species based on the geometrical centers of the 3D curves. As examples, we examine similarity among the coding sequences of the first exon of beta-globin gene from eleven species and validate similarity of cDNA sequences of beta-globin gene from eight species. Copyright © 2010 Elsevier Ltd. All rights reserved.
Ancient genomics

PubMed Central

Der Sarkissian, Clio; Allentoft, Morten E.; Ávila-Arcos, María C.; Barnett, Ross; Campos, Paula F.; Cappellini, Enrico; Ermini, Luca; Fernández, Ruth; da Fonseca, Rute; Ginolhac, Aurélien; Hansen, Anders J.; Jónsson, Hákon; Korneliussen, Thorfinn; Margaryan, Ashot; Martin, Michael D.; Moreno-Mayar, J. Víctor; Raghavan, Maanasa; Rasmussen, Morten; Velasco, Marcela Sandoval; Schroeder, Hannes; Schubert, Mikkel; Seguin-Orlando, Andaine; Wales, Nathan; Gilbert, M. Thomas P.; Willerslev, Eske; Orlando, Ludovic

2015-01-01

The past decade has witnessed a revolution in ancient DNA (aDNA) research. Although the field's focus was previously limited to mitochondrial DNA and a few nuclear markers, whole genome sequences from the deep past can now be retrieved. This breakthrough is tightly connected to the massive sequence throughput of next generation sequencing platforms and the ability to target short and degraded DNA molecules. Many ancient specimens previously unsuitable for DNA analyses because of extensive degradation can now successfully be used as source materials. Additionally, the analytical power obtained by increasing the number of sequence reads to billions effectively means that contamination issues that have haunted aDNA research for decades, particularly in human studies, can now be efficiently and confidently quantified. At present, whole genomes have been sequenced from ancient anatomically modern humans, archaic hominins, ancient pathogens and megafaunal species. Those have revealed important functional and phenotypic information, as well as unexpected adaptation, migration and admixture patterns. As such, the field of aDNA has entered the new era of genomics and has provided valuable information when testing specific hypotheses related to the past. PMID:25487338
Normalization of environmental metagenomic DNA enhances the discovery of under-represented microbial community members.

PubMed

Ramond, J-B; Makhalanyane, T P; Tuffin, M I; Cowan, D A

2015-04-01

Normalization is a procedure classically employed to detect rare sequences in cellular expression profiles (i.e. cDNA libraries). Here, we present a normalization protocol involving the direct treatment of extracted environmental metagenomic DNA with S1 nuclease, referred to as normalization of metagenomic DNA: NmDNA. We demonstrate that NmDNA, prior to post hoc PCR-based experiments (16S rRNA gene T-RFLP fingerprinting and clone library), increased the diversity of sequences retrieved from environmental microbial communities by detection of rarer sequences. This approach could be used to enhance the resolution of detection of ecologically relevant rare members in environmental microbial assemblages and therefore is promising in enabling a better understanding of ecosystem functioning. This study is the first testing 'normalization' on environmental metagenomic DNA (mDNA). The aim of this procedure was to improve the identification of rare phylotypes in environmental communities. Using hypoliths as model systems, we present evidence that this post-mDNA extraction molecular procedure substantially enhances the detection of less common phylotypes and could even lead to the discovery of novel microbial genotypes within a given environment. © 2014 The Society for Applied Microbiology.
Decoding DNA labels by melting curve analysis using real-time PCR.

PubMed

Balog, József A; Fehér, Liliána Z; Puskás, László G

2017-12-01

Synthetic DNA has been used as an authentication code for a diverse number of applications. However, existing decoding approaches are based on either DNA sequencing or the determination of DNA length variations. Here, we present a simple alternative protocol for labeling different objects using a small number of short DNA sequences that differ in their melting points. Code amplification and decoding can be done in two steps using quantitative PCR (qPCR). To obtain a DNA barcode with high complexity, we defined 8 template groups, each having 4 different DNA templates, yielding 15^8 (>2.5 billion) combinations of different individual melting temperature (Tm) values and corresponding ID codes. The reproducibility and specificity of the decoding was confirmed by using the most complex template mixture, which had 32 different products in 8 groups with different Tm values. The industrial applicability of our protocol was also demonstrated by labeling a drone with an oil-based paint containing a predefined DNA code, which was then successfully decoded. The method presented here consists of a simple code system based on a small number of synthetic DNA sequences and a cost-effective, rapid decoding protocol using a few qPCR reactions, enabling a wide range of authentication applications.
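The code-space figure quoted in this abstract (more than 2.5 billion combinations from 8 groups of 4 templates) can be checked with a short Python sketch. It assumes, as an interpretation not stated in the abstract, that each group contributes any non-empty subset of its 4 templates, which gives 2^4 - 1 = 15 states per group. The names below are illustrative.

```python
from itertools import combinations

TEMPLATES_PER_GROUP = 4
GROUPS = 8


def nonempty_subsets(n: int) -> int:
    """Count the non-empty subsets of n templates by explicit enumeration."""
    return sum(1 for k in range(1, n + 1) for _ in combinations(range(n), k))


# Assumption: every group must contain at least one template.
states_per_group = nonempty_subsets(TEMPLATES_PER_GROUP)
total_codes = states_per_group ** GROUPS

print(states_per_group)  # 15
print(total_codes)       # 2562890625
```

Under this assumption the total is 15 raised to the 8th power, or 2,562,890,625, which is consistent with the ">2.5 billion" stated in the abstract.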
Sequence dependence of electron-induced DNA strand breakage revealed by DNA nanoarrays

PubMed Central

Keller, Adrian; Rackwitz, Jenny; Cauët, Emilie; Liévin, Jacques; Körzdörfer, Thomas; Rotaru, Alexandru; Gothelf, Kurt V.; Besenbacher, Flemming; Bald, Ilko

2014-01-01

The electronic structure of DNA is determined by its nucleotide sequence, which is for instance exploited in molecular electronics. Here we demonstrate that the DNA strand breakage induced by low-energy electrons (18 eV) also depends on the nucleotide sequence. To determine the absolute cross sections for electron-induced single strand breaks in specific 13-mer oligonucleotides, we used atomic force microscopy analysis of DNA origami-based DNA nanoarrays. We investigated the DNA sequences 5′-TT(XYX)₃TT with X = A, G, C and Y = T, BrU (5-bromouracil) and found absolute strand break cross sections between 2.66 × 10^-14 cm^2 and 7.06 × 10^-14 cm^2. The highest cross section was found for 5′-TT(ATA)₃TT and 5′-TT(ABrUA)₃TT, respectively. BrU is a radiosensitizer that has been discussed for use in cancer radiation therapy. The replacement of T by BrU in the investigated DNA sequences leads to a slight increase of the absolute strand break cross sections, resulting in sequence-dependent enhancement factors between 1.14 and 1.66. Nevertheless, the variation of strand break cross sections due to the specific nucleotide sequence is considerably higher. Thus, the present results suggest the development of targeted radiosensitizers for cancer radiation therapy. PMID:25487346
High-resolution characterization of sequence signatures due to non-random cleavage of cell-free DNA.

PubMed

Chandrananda, Dineika; Thorne, Natalie P; Bahlo, Melanie

2015-06-17

High-throughput sequencing of cell-free DNA fragments found in human plasma has been used to non-invasively detect fetal aneuploidy, monitor organ transplants and investigate tumor DNA. However, many biological properties of this extracellular genetic material remain unknown. Research that further characterizes circulating DNA could substantially increase its diagnostic value by allowing the application of more sophisticated bioinformatics tools that lead to an improved signal to noise ratio in the sequencing data. In this study, we investigate various features of cell-free DNA in plasma using deep-sequencing data from two pregnant women (>70X, >50X) and compare them with matched cellular DNA. We utilize a descriptive approach to examine how the biological cleavage of cell-free DNA affects different sequence signatures such as fragment lengths, sequence motifs at fragment ends and the distribution of cleavage sites along the genome. We show that the size distributions of these cell-free DNA molecules are dependent on their autosomal and mitochondrial origin as well as the genomic location within chromosomes. DNA mapping to particular microsatellites and alpha repeat elements displays unique size signatures. By correlating the mapping locations of these fragments with ENCODE annotation of chromatin organization, we show that cell-free fragments occur in clusters along the genome, localize to nucleosomal arrays and are preferentially cleaved at linker regions. Our work further demonstrates that cell-free autosomal DNA cleavage is sequence dependent. The region spanning up to 10 positions on either side of the DNA cleavage site shows a consistent pattern of preference for specific nucleotides. This sequence motif is present in cleavage sites localized to nucleosomal cores and linker regions but is absent in nucleosome-free mitochondrial DNA. These background signals in cell-free DNA sequencing data stem from the non-random biological cleavage of these fragments. This sequence structure can be harnessed to improve bioinformatics algorithms, in particular for CNV and structural variant detection. Descriptive measures for cell-free DNA features developed here could also be used in biomarker analysis to monitor the changes that occur during different pathological conditions.
Dynamics and control of DNA sequence amplification

DOE Office of Scientific and Technical Information (OSTI.GOV)

Marimuthu, Karthikeyan; Chakrabarti, Raj (Division of Fundamental Research, PMC Advanced Technology, Mount Laurel, New Jersey 08054)

2014-10-28

DNA amplification is the process of replication of a specified DNA sequence in vitro through time-dependent manipulation of its external environment. A theoretical framework for determination of the optimal dynamic operating conditions of DNA amplification reactions, for any specified amplification objective, is presented based on first-principles biophysical modeling and control theory. Amplification of DNA is formulated as a problem in control theory with optimal solutions that can differ considerably from strategies typically used in practice. Using the Polymerase Chain Reaction as an example, sequence-dependent biophysical models for DNA amplification are cast as control systems, wherein the dynamics of the reaction are controlled by a manipulated input variable. Using these control systems, we demonstrate that there exists an optimal temperature cycling strategy for geometric amplification of any DNA sequence and formulate optimal control problems that can be used to derive the optimal temperature profile. Strategies for the optimal synthesis of the DNA amplification control trajectory are proposed. Analogous methods can be used to formulate control problems for more advanced amplification objectives corresponding to the design of new types of DNA amplification reactions.
Mesoscopic modeling of DNA denaturation rates: Sequence dependence and experimental comparison

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dahlen, Oda; Erp, Titus S. van

Using rare event simulation techniques, we calculated DNA denaturation rate constants for a range of sequences and temperatures for the Peyrard-Bishop-Dauxois (PBD) model with two different parameter sets. We studied a larger variety of sequences compared to previous studies that only consider DNA homopolymers and DNA sequences containing an equal amount of weak AT- and strong GC-base pairs. Our results show that, contrary to previous findings, an even distribution of the strong GC-base pairs does not always result in the fastest possible denaturation. In addition, we applied an adaptation of the PBD model to study hairpin denaturation for which experimental data are available. This is the first quantitative study in which dynamical results from the mesoscopic PBD model have been compared with experiments. Our results show that present parameterized models, although giving good results regarding thermodynamic properties, overestimate denaturation rates by orders of magnitude. We believe that our dynamical approach is, therefore, an important tool for verifying DNA models and for developing next generation models that have higher predictive power than present ones.
Detection of Merkel Cell Polyomavirus DNA in Serum Samples of Healthy Blood Donors

PubMed Central

Mazzoni, Elisa; Rotondo, John C.; Marracino, Luisa; Selvatici, Rita; Bononi, Ilaria; Torreggiani, Elena; Touzé, Antoine; Martini, Fernanda; Tognon, Mauro G.

2017-01-01

Merkel cell polyomavirus (MCPyV) has been detected in 80% of Merkel cell carcinomas (MCC). In the host, the MCPyV reservoir remains elusive. MCPyV DNA sequences were revealed in blood donor buffy coats. In this study, MCPyV DNA sequences were investigated in the sera (n = 190) of healthy blood donors. Two MCPyV DNA sequences, coding for the viral oncoprotein large T antigen (LT), were investigated using polymerase chain reaction (PCR) methods and DNA sequencing. Circulating MCPyV sequences were detected in sera with a prevalence of 2.6% (5/190), at low-DNA viral load, which is in the range of 1–4 and 1–5 copies/μl by real-time PCR and droplet digital PCR, respectively. DNA sequencing carried out in the five MCPyV-positive samples indicated that the two MCPyV LT sequences which were analyzed belong to the MKL-1 strain. Circulating MCPyV LT sequences are present in blood donor sera. MCPyV-positive samples from blood donors could represent a potential vehicle for MCPyV infection in receivers, whereas an increase in viral load may occur with multiple blood transfusions. In certain patient conditions, such as immune-depression/suppression, additional disease or old age, transfusion of MCPyV-positive samples could be an additional risk factor for MCC onset. PMID:29238698
Previously unknown and highly divergent ssDNA viruses populate the oceans.

PubMed

Labonté, Jessica M; Suttle, Curtis A

2013-11-01

Single-stranded DNA (ssDNA) viruses are economically important pathogens of plants and animals, and are widespread in oceans; yet, the diversity and evolutionary relationships among marine ssDNA viruses remain largely unknown. Here we present the results from a metagenomic study of composite samples from temperate (Saanich Inlet, 11 samples; Strait of Georgia, 85 samples) and subtropical (46 samples, Gulf of Mexico) seawater. Most sequences (84%) had no evident similarity to sequenced viruses. In total, 608 putative complete genomes of ssDNA viruses were assembled, almost doubling the number of ssDNA viral genomes in databases. These comprised 129 genetically distinct groups, each represented by at least one complete genome that had no recognizable similarity to each other or to other virus sequences. Given that the seven recognized families of ssDNA viruses have considerable sequence homology within them, this suggests that many of these genetic groups may represent new viral families. Moreover, nearly 70% of the sequences were similar to one of these genomes, indicating that most of the sequences could be assigned to a genetically distinct group. Most sequences fell within 11 well-defined gene groups, each sharing a common gene. Some of these encoded putative replication and coat proteins that had similarity to sequences from viruses infecting eukaryotes, suggesting that these were likely from viruses infecting eukaryotic phytoplankton and zooplankton.

'DNA Strider': a 'C' program for the fast analysis of DNA and protein sequences on the Apple Macintosh family of computers.

PubMed Central

Marck, C

1988-01-01

DNA Strider is a new integrated DNA and Protein sequence analysis program written in the C language for the Macintosh Plus, SE and II computers. It has been designed as an easy to learn and use program as well as a fast and efficient tool for the day-to-day sequence analysis work. The program consists of a multi-window sequence editor and of various DNA and Protein analysis functions. The editor may use 4 different types of sequences (DNA, degenerate DNA, RNA and one-letter coded protein) and can handle simultaneously 6 sequences of any type up to 32.5 kB each. Negative numbering of the bases is allowed for DNA sequences. All classical restriction and translation analysis functions are present and can be performed in any order on any open sequence or part of a sequence. The main feature of the program is that the same analysis function can be repeated several times on different sequences, thus generating multiple windows on the screen. Many graphic capabilities have been incorporated, such as a graphic restriction map, a hydrophobicity profile and the CAI (codon adaptation index) plot according to Sharp and Li. The restriction sites search uses a newly designed fast hexamer look-ahead algorithm. Typical runtime for the search of all sites with a library of 130 restriction endonucleases is 1 second per 10,000 bases. The circular graphic restriction map of the pBR322 plasmid can therefore be computed from its sequence and displayed on the Macintosh Plus screen within 2 seconds, and its multiline restriction map obtained in a scrolling window within 5 seconds. PMID:2832831
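A restriction site search of the kind mentioned in this abstract can be illustrated with a minimal Python sketch. It uses a plain substring scan, not the hexamer look-ahead algorithm of DNA Strider, whose details the abstract does not give. The three-enzyme library, the function name, and the test sequence are assumptions for illustration; the recognition sequences shown are the standard ones for these enzymes.

```python
# Small illustrative library; the program described above used about 130 enzymes.
ENZYMES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "HindIII": "AAGCTT"}


def find_sites(sequence, enzymes=ENZYMES, circular=False):
    """Return 1-based start positions of each recognition site in `sequence`."""
    seq = sequence.upper()
    hits = {}
    for name, site in enzymes.items():
        # For a circular molecule, append the first len(site) - 1 bases so that
        # sites spanning the origin are also found.
        search = seq + seq[: len(site) - 1] if circular else seq
        positions = []
        start = search.find(site)
        while start != -1:
            positions.append(start + 1)
            start = search.find(site, start + 1)
        hits[name] = positions
    return hits


# Hypothetical 20-base sequence with an EcoRI site that spans the origin.
plasmid = "TTCAAGCTTGGATCCAAGAA"
print(find_sites(plasmid, circular=True))
print(find_sites(plasmid, circular=False))
```

Because these three recognition sequences are palindromic, scanning one strand is enough; non-palindromic sites would also need a scan of the reverse complement.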
Programmable RNA recognition and cleavage by CRISPR/Cas9.

PubMed

O'Connell, Mitchell R; Oakes, Benjamin L; Sternberg, Samuel H; East-Seletsky, Alexandra; Kaplan, Matias; Doudna, Jennifer A

2014-12-11

The CRISPR-associated protein Cas9 is an RNA-guided DNA endonuclease that uses RNA-DNA complementarity to identify target sites for sequence-specific double-stranded DNA (dsDNA) cleavage. In its native context, Cas9 acts on DNA substrates exclusively because both binding and catalysis require recognition of a short DNA sequence, known as the protospacer adjacent motif (PAM), next to and on the strand opposite the twenty-nucleotide target site in dsDNA. Cas9 has proven to be a versatile tool for genome engineering and gene regulation in a large range of prokaryotic and eukaryotic cell types, and in whole organisms, but it has been thought to be incapable of targeting RNA. Here we show that Cas9 binds with high affinity to single-stranded RNA (ssRNA) targets matching the Cas9-associated guide RNA sequence when the PAM is presented in trans as a separate DNA oligonucleotide. Furthermore, PAM-presenting oligonucleotides (PAMmers) stimulate site-specific endonucleolytic cleavage of ssRNA targets, similar to PAM-mediated stimulation of Cas9-catalysed DNA cleavage. Using specially designed PAMmers, Cas9 can be specifically directed to bind or cut RNA targets while avoiding corresponding DNA sequences, and we demonstrate that this strategy enables the isolation of a specific endogenous messenger RNA from cells. These results reveal a fundamental connection between PAM binding and substrate selection by Cas9, and highlight the utility of Cas9 for programmable transcript recognition without the need for tags.
Programmable RNA recognition and cleavage by CRISPR/Cas9

PubMed Central

O’Connell, Mitchell R.; Oakes, Benjamin L.; Sternberg, Samuel H.; East-Seletsky, Alexandra; Kaplan, Matias; Doudna, Jennifer A.

2014-01-01

The CRISPR-associated protein Cas9 is an RNA-guided DNA endonuclease that uses RNA:DNA complementarity to identify target sites for sequence-specific double-stranded DNA (dsDNA) cleavage. In its native context, Cas9 acts on DNA substrates exclusively because both binding and catalysis require recognition of a short DNA sequence, the protospacer adjacent motif (PAM), next to and on the strand opposite the 20-nucleotide target site in dsDNA. Cas9 has proven to be a versatile tool for genome engineering and gene regulation in many cell types and organisms, but it has been thought to be incapable of targeting RNA. Here we show that Cas9 binds with high affinity to single-stranded RNA (ssRNA) targets matching the Cas9-associated guide RNA sequence when the PAM is presented in trans as a separate DNA oligonucleotide. Furthermore, PAM-presenting oligonucleotides (PAMmers) stimulate site-specific endonucleolytic cleavage of ssRNA targets, similar to PAM-mediated stimulation of Cas9-catalyzed DNA cleavage. Using specially designed PAMmers, Cas9 can be specifically directed to bind or cut RNA targets while avoiding corresponding DNA sequences, and we demonstrate that this strategy enables the isolation of a specific endogenous mRNA from cells. These results reveal a fundamental connection between PAM binding and substrate selection by Cas9, and highlight the utility of Cas9 for programmable and tagless transcript recognition. PMID:25274302
Divergence, differential methylation and interspersion of melon satellite DNA sequences.

PubMed Central

Shmookler Reis, R; Timmis, J N; Ingle, J

1981-01-01

Melon (Cucumis melo) satellite DNA consists of two components, Q and S, each with a buoyant density in CsCl of 1.707 g/ml, but differing by 9 °C in "melting" temperature. These physical properties appear to be in contradiction, since both depend on G + C content. In order to resolve this anomaly, base compositions were directly determined for isolated fractions. The low-"melting" component S contains 41.8% G + C, with 6% of C present as 5-methylcytosine, whereas Q DNA contains 54% G + C, with 41% of C methylated. Analyses of restriction site loss agreed well with the direct determinations of methylation and divergence, and indicated some clustering of methylated sites in Q DNA. Analysis of restricted main-band DNA by hybridization with RNA complementary to Q satellite DNA ("Southern transfer") showed satellite Q tandem arrays interspersed in DNA of main-band density. Sequence divergence and extent of methylation did not appear to depend on whether a repeat array was present as satellite or interspersed in main-band DNA. Hybridization in situ indicated considerable heterogeneity in the genomic proportion of the Q-DNA sequences in melon fruit nuclei, implying over- and under-representation consistent with extensive unequal recombination in satellite Q tandem arrays. The cucumber, Cucumis sativus, contains less than 8% as much Q-homologous DNA per genome as the melon, suggesting rapid evolutionary gain or loss of these tandem repeat sequences. PMID:6172117
Phylogenetic Network for European mtDNA

PubMed Central

Finnilä, Saara; Lehtonen, Mervi S.; Majamaa, Kari

2001-01-01

The sequence in the first hypervariable segment (HVS-I) of the control region has been used as a source of evolutionary information in most phylogenetic analyses of mtDNA. Population genetic inference would benefit from a better understanding of the variation in the mtDNA coding region, but, thus far, complete mtDNA sequences have been rare. We determined the nucleotide sequence in the coding region of mtDNA from 121 Finns, by conformation-sensitive gel electrophoresis and subsequent sequencing and by direct sequencing of the D loop. Furthermore, 71 sequences from our previous reports were included, so that the samples represented all the mtDNA haplogroups present in the Finnish population. We found a total of 297 variable sites in the coding region, which allowed the compilation of unambiguous phylogenetic networks. The D loop harbored 104 variable sites, and, in most cases, these could be localized within the coding-region networks, without discrepancies. Interestingly, many homoplasies were detected in the coding region. Nucleotide variation in the rRNA and tRNA genes was 6%, and that in the third nucleotide positions of structural genes amounted to 22% of that in the HVS-I. The complete networks enabled the relationships between the mtDNA haplogroups to be analyzed. Phylogenetic networks based on the entire coding-region sequence in mtDNA provide a rich source for further population genetic studies, and complete sequences make it easier to differentiate between disease-causing mutations and rare polymorphisms. PMID:11349229
The case for the continuing use of the revised Cambridge Reference Sequence (rCRS) and the standardization of notation in human mitochondrial DNA studies.

PubMed

Bandelt, Hans-Jürgen; Kloss-Brandstätter, Anita; Richards, Martin B; Yao, Yong-Gang; Logan, Ian

2014-02-01

Since the determination in 1981 of the sequence of the human mitochondrial DNA (mtDNA) genome, the Cambridge Reference Sequence (CRS) has been used as the reference sequence to annotate mtDNA in molecular anthropology, forensic science and medical genetics. The CRS was eventually upgraded to the revised version (rCRS) in 1999. This reference sequence is a convenient device for recording mtDNA variation, although it has often been misunderstood as a wild-type (WT) or consensus sequence by medical geneticists. Recently, there has been a proposal to replace the rCRS with the so-called Reconstructed Sapiens Reference Sequence (RSRS). Even if it had been estimated accurately, the RSRS would be a cumbersome substitute for the rCRS, as the new proposal fuses--and thus confuses--the two distinct concepts of ancestral lineage and reference point for human mtDNA. Instead, we prefer to maintain the rCRS and to report mtDNA profiles by employing the hitherto predominant circumfix style. Tree diagrams could display mutations by using either the profile notation (in conventional short forms where appropriate) or in a root-upwards way with two suffixes indicating ancestral and derived nucleotides. This would guard against misunderstandings about reporting mtDNA variation. It is therefore neither necessary nor sensible to change the present reference sequence, the rCRS, in any way. The proposed switch to RSRS would inevitably lead to notational chaos, mistakes and misinterpretations.
DNA-PK assay

DOEpatents

Anderson, Carl W.; Connelly, Margery A.

2004-10-12

The present invention provides a method for detecting DNA-activated protein kinase (DNA-PK) activity in a biological sample. The method includes contacting a biological sample with a detectably-labeled phosphate donor and a synthetic peptide substrate defined by the following features to provide specific recognition and phosphorylation by DNA-PK: (1) a phosphate-accepting amino acid pair which may include serine-glutamine (Ser-Gln) (SQ), threonine-glutamine (Thr-Gln) (TQ), glutamine-serine (Gln-Ser) (QS), or glutamine-threonine (Gln-Thr) (QT); (2) enhancer amino acids which may include glutamic acid or glutamine immediately adjacent at the amino- or carboxyl- side of the amino acid pair and forming an amino acid pair-enhancer unit; (3) a first spacer sequence at the amino terminus of the amino acid pair-enhancer unit; (4) a second spacer sequence at the carboxyl terminus of the amino acid pair-enhancer unit, which spacer sequences may include any combination of amino acids that does not provide a phosphorylation site consensus sequence motif; and, (5) a tag moiety, which may be an amino acid sequence or another chemical entity that permits separating the synthetic peptide from the phosphate donor. A composition and a kit for the detection of DNA-PK activity are also provided. Methods for detecting DNA, protein phosphatases and substances that alter the activity of DNA-PK are also provided. The present invention also provides a method of monitoring protein kinase and DNA-PK activity in living cells. A composition and a kit for monitoring protein kinase activity in vitro and a composition and a kit for monitoring DNA-PK activities in living cells are also provided. A method for identifying agents that alter protein kinase activity in vitro and a method for identifying agents that alter DNA-PK activity in living cells are also provided.
Single-molecule analysis of DNA cross-links using nanopore technology

NASA Astrophysics Data System (ADS)

Wolna, Anna H.

The alpha-hemolysin (alpha-HL) protein ion channel is a potential next-generation sequencing platform that has been extensively used to study nucleic acids at a single-molecule level. After applying a potential across a lipid bilayer, the imbedded alpha-HL allows monitoring of the duration and current levels of DNA translocation and immobilization. Because this method does not require DNA amplification prior to sequencing, all the DNA damage present in the cell at any given time will be present during the sequencing experiment. The goal of this research is to determine if these damage sites give distinguishable current levels beyond those observed for the canonical nucleobases. Because DNA cross-links are one of the most prevalent types of DNA damage occurring in vivo, the blockage current levels were determined for thymine-dimers, guanine(C8)-thymine(N3) cross-links and platinum adducts. All of these cross-links give a different blockage current level compared to the undamaged strands when immobilized in the ion channel, and they all can easily translocate across the alpha-HL channel. Additionally, the alpha-HL nanopore technique presents a unique opportunity to study the effects of DNA cross-links, such as thymine-dimers, on the secondary structure of DNA G-quadruplexes folded from the human telomere sequence. Using this single-molecule nanopore technique we can detect subtle structural differences that cannot be easily addressed using conventional methods. The human telomere plays crucial roles in maintaining genome stability. In the presence of suitable cations, the repetitive 5'-TTAGGG human telomere sequence can fold into G-quadruplexes that adopt the hybrid fold in vivo. The telomere sequence is hypersensitive to UV-induced thymine-dimer (T=T) formation, and yet the presence of thymine dimers does not cause telomere shortening. 
The potential structural disruption and thermodynamic stability of the T=T-containing natural telomere sequences were studied to understand how this damage is tolerated in telomeric DNA. The alpha-HL experiments determined that T=Ts disrupt double-chain reversal loop formation but are well tolerated in edgewise and diagonal loops of the hybrid G-quadruplexes. These studies demonstrated the power of the alpha-HL ion channel to analyze DNA modifications and secondary structures at a single-molecule level.
Streamlining the Design-to-Build Transition with Build-Optimization Software Tools.

PubMed

Oberortner, Ernst; Cheng, Jan-Fang; Hillson, Nathan J; Deutsch, Samuel

2017-03-17

Scaling-up capabilities for the design, build, and test of synthetic biology constructs holds great promise for the development of new applications in fuels, chemical production, or cellular-behavior engineering. Construct design is an essential component in this process; however, not every designed DNA sequence can be readily manufactured, even using state-of-the-art DNA synthesis methods. Current biological computer-aided design and manufacture tools (bioCAD/CAM) do not adequately consider the limitations of DNA synthesis technologies when generating their outputs. Designed sequences that violate DNA synthesis constraints may require substantial sequence redesign or lead to price-premiums and temporal delays, which adversely impact the efficiency of the DNA manufacturing process. We have developed a suite of build-optimization software tools (BOOST) to streamline the design-build transition in synthetic biology engineering workflows. BOOST incorporates knowledge of DNA synthesis success determinants into the design process to output ready-to-build sequences, preempting the need for sequence redesign. The BOOST web application is available at https://boost.jgi.doe.gov and its Application Program Interfaces (API) enable integration into automated, customized DNA design processes. The herein presented results highlight the effectiveness of BOOST in reducing DNA synthesis costs and timelines.
Insights on genome size evolution from a miniature inverted repeat transposon driving a satellite DNA.

PubMed

Scalvenzi, Thibault; Pollet, Nicolas

2014-12-01

The genome size in eukaryotes does not correlate well with the number of genes they contain. We can observe this so-called C-value paradox in amphibian species. By analyzing an amphibian genome we asked how repetitive DNA can impact genome size and architecture. We describe here our discovery of a Tc1/mariner miniature inverted-repeat transposon family present in Xenopus frogs. These transposons named miDNA4 are unique since they contain a satellite DNA motif. We found that miDNA4 measured 331 bp, contained 25 bp long inverted terminal repeat sequences and a sequence motif of 119 bp present as a unique copy or as an array of 2-47 copies. We characterized the structure, dynamics, impact and evolution of the miDNA4 family and its satellite DNA in Xenopus frog genomes. This led us to propose a model for the evolution of these two repeated sequences and how they can synergize to increase genome size.
SPlinted Ligation Adapter Tagging (SPLAT), a novel library preparation method for whole genome bisulphite sequencing

PubMed Central

Manlig, Erika; Wahlberg, Per

2017-01-01

Sodium bisulphite treatment of DNA combined with next generation sequencing (NGS) is a powerful combination for the interrogation of genome-wide DNA methylation profiles. Library preparation for whole genome bisulphite sequencing (WGBS) is challenging due to side effects of the bisulphite treatment, which leads to extensive DNA damage. Recently, a new generation of methods for bisulphite sequencing library preparation have been devised. They are based on initial bisulphite treatment of the DNA, followed by adaptor tagging of single stranded DNA fragments, and enable WGBS using low quantities of input DNA. In this study, we present a novel approach for quick and cost effective WGBS library preparation that is based on splinted adaptor tagging (SPLAT) of bisulphite-converted single-stranded DNA. Moreover, we validate SPLAT against three commercially available WGBS library preparation techniques, two of which are based on bisulphite treatment prior to adaptor tagging and one is a conventional WGBS method. PMID:27899585
On site DNA barcoding by nanopore sequencing

PubMed Central

Menegon, Michele; Cantaloni, Chiara; Rodriguez-Prieto, Ana; Centomo, Cesare; Abdelfattah, Ahmed; Rossato, Marzia; Bernardi, Massimo; Xumerle, Luciano; Loader, Simon; Delledonne, Massimo

2017-01-01

Biodiversity research is becoming increasingly dependent on genomics, which allows the unprecedented digitization and understanding of the planet’s biological heritage. The use of genetic markers, i.e., DNA barcoding, has proved to be a powerful tool in species identification. However, full exploitation of this approach is hampered by the high sequencing costs and the absence of equipped facilities in biodiversity-rich countries. In the present work, we developed a portable sequencing laboratory based on the portable DNA sequencer from Oxford Nanopore Technologies, the MinION. Complementary laboratory equipment and reagents were selected to be used in remote and tough environmental conditions. The performance of the MinION sequencer and the portable laboratory was tested for DNA barcoding in a mimicking tropical environment, as well as in a remote rainforest of Tanzania lacking electricity. Despite the relatively high sequencing error-rate of the MinION, the development of a suitable pipeline for data analysis allowed the accurate identification of different species of vertebrates including amphibians, reptiles and mammals. In situ sequencing of a wild frog allowed us to rapidly identify the species captured, thus confirming that effective DNA barcoding in the field is possible. These results open new perspectives for real-time, on-site DNA sequencing, thus potentially increasing opportunities for the understanding of biodiversity in areas lacking conventional laboratory facilities. PMID:28977016
[Structural organization of 5S ribosomal DNA of Rosa rugosa].

PubMed

Tynkevych, Iu O; Volkov, R A

2014-01-01

In order to clarify the molecular organization of the genomic region encoding 5S rRNA in the diploid species Rosa rugosa, several 5S rDNA repeated units were cloned and sequenced. Analysis of the obtained sequences revealed that only one length variant of 5S rDNA repeated units, which contains intact promoter elements in the intergenic spacer region (IGS) and appears to be transcriptionally active, is present in the genome. Additionally, a limited number of 5S rDNA pseudogenes lacking a portion of coding sequence and the complete IGS was detected. A high level of sequence similarity (from 93.7 to 97.5%) between the IGS of major 5S rDNA variants of East Asian R. rugosa and North American R. nitida was found, indicating comparatively recent divergence of these species.
Highly Iterated Palindromic Sequences (HIPs) and Their Relationship to DNA Methyltransferases

PubMed Central

Elhai, Jeff

2015-01-01

The sequence GCGATCGC (Highly Iterated Palindrome, HIP1) is commonly found in high frequency in cyanobacterial genomes. An important clue to its function may be the presence of two orphan DNA methyltransferases that recognize internal sequences GATC and CGATCG. An examination of genomes from 97 cyanobacteria, both free-living and obligate symbionts, showed that there are exceptional cases in which HIP1 is at a low frequency or nearly absent. In some of these cases, it appears to have been replaced by a different GC-rich palindromic sequence, alternate HIPs. When HIP1 is at a high frequency, GATC- and CGATCG-specific methyltransferases are generally present in the genome. When an alternate HIP is at high frequency, a methyltransferase specific for that sequence is present. The pattern of 1-nt deviations from HIP1 sequences is biased towards the first and last nucleotides, i.e., those that distinguish CGATCG from HIP1. Taken together, the results point to a role of DNA methylation in the creation or functioning of HIP sites. A model is presented that postulates the existence of a GmeC-dependent mismatch repair system whose activity creates and maintains HIP sequences. PMID:25789551
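The palindromic property of HIP1 and its internal sites, and the notion of motif frequency in a genome, can be checked with a few lines of code. The following is a minimal sketch, not the analysis used in the study: the helper names are hypothetical, occurrences are counted with overlaps allowed, and frequency is expressed per kilobase as one plausible normalization.

```python
def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA string."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def is_palindromic(site):
    """A DNA site is palindromic when it equals its own reverse complement."""
    return site.upper() == reverse_complement(site)


def count_overlapping(genome, motif):
    """Count occurrences of motif in genome, allowing overlapping matches."""
    genome, motif = genome.upper(), motif.upper()
    count, pos = 0, genome.find(motif)
    while pos != -1:
        count += 1
        pos = genome.find(motif, pos + 1)
    return count


def per_kb(genome, motif):
    """Motif occurrences per 1000 nt of the given sequence."""
    return 1000.0 * count_overlapping(genome, motif) / len(genome)


if __name__ == "__main__":
    for site in ("GCGATCGC", "GATC", "CGATCG"):
        print(site, is_palindromic(site))
    print(count_overlapping("GCGATCGCGATCGC", "GCGATCGC"))
```

Because a palindromic site reads the same on both strands, counting on one strand is sufficient for HIP1; a non-palindromic motif would need its reverse complement counted as well.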
Highly Iterated Palindromic Sequences (HIPs) and Their Relationship to DNA Methyltransferases.

PubMed

Elhai, Jeff

2015-03-17

The sequence GCGATCGC (Highly Iterated Palindrome, HIP1) is commonly found in high frequency in cyanobacterial genomes. An important clue to its function may be the presence of two orphan DNA methyltransferases that recognize internal sequences GATC and CGATCG. An examination of genomes from 97 cyanobacteria, both free-living and obligate symbionts, showed that there are exceptional cases in which HIP1 is at a low frequency or nearly absent. In some of these cases, it appears to have been replaced by a different GC-rich palindromic sequence, alternate HIPs. When HIP1 is at a high frequency, GATC- and CGATCG-specific methyltransferases are generally present in the genome. When an alternate HIP is at high frequency, a methyltransferase specific for that sequence is present. The pattern of 1-nt deviations from HIP1 sequences is biased towards the first and last nucleotides, i.e., those that distinguish CGATCG from HIP1. Taken together, the results point to a role of DNA methylation in the creation or functioning of HIP sites. A model is presented that postulates the existence of a GmeC-dependent mismatch repair system whose activity creates and maintains HIP sequences.
Presence of a consensus DNA motif at nearby DNA sequence of the mutation susceptible CG nucleotides.

PubMed

Chowdhury, Kaushik; Kumar, Suresh; Sharma, Tanu; Sharma, Ankit; Bhagat, Meenakshi; Kamai, Asangla; Ford, Bridget M; Asthana, Shailendra; Mandal, Chandi C

2018-01-10

Complexity in tissues affected by cancer arises from somatic mutations and epigenetic modifications in the genome. The mutation susceptible hotspots present within the genome indicate a non-random nature and/or a position specific selection of mutation. An association exists between the occurrence of mutations and epigenetic DNA methylation. This study is primarily aimed at determining mutation status, and identifying a signature for predicting mutation prone zones of tumor suppressor (TS) genes. Nearby sequences from the top five positions having a higher mutation frequency in each gene of 42 TS genes were selected from the COSMIC database and were considered as mutation prone zones. The conserved motifs present in the mutation prone DNA fragments were identified. Molecular docking studies were done to determine putative interactions between the identified conserved motifs and enzyme methyltransferase DNMT1. Collective analysis of 42 TS genes found GC as the most commonly replaced and AT as the most commonly formed residues after mutation. Analysis of the top 5 mutated positions of each gene (210 DNA segments for 42 TS genes) identified that CG nucleotides of the amino acid codons (e.g., Arginine) are most susceptible to mutation, and found a consensus DNA "T/AGC/GAGGA/TG" sequence present in these mutation prone DNA segments. Similar to TS genes, analysis of 54 oncogenes not only found CG nucleotides of the amino acid Arg as the most susceptible to mutation, but also identified the presence of similar consensus DNA motifs in the mutation prone DNA fragments (270 DNA segments for 54 oncogenes) of oncogenes. Docking studies depicted that, upon binding of DNMT1 methylates to this consensus DNA motif (C residues of CpG islands), mutation was likely to occur. 
Thus, this study proposes that DNMT1 mediated methylation in chromosomal DNA may decrease if a foreign DNA segment containing this consensus sequence along with CG nucleotides is exogenously introduced to dividing cancer cells.
1-deoxy-d-xylulose-5-phosphate reductoisomerases and method of use

DOEpatents

Croteau, Rodney B.; Lange, Bernd M.

2001-01-01

The present invention relates to isolated DNA sequences which code for the expression of plant 1-deoxy-D-xylulose-5-phosphate reductoisomerase protein, such as the sequence presented in SEQ ID NO:1 which encodes a 1-deoxy-D-xylulose-5-phosphate reductoisomerase protein from peppermint (Mentha x piperita). Additionally, the present invention relates to isolated plant 1-deoxy-D-xylulose-5-phosphate reductoisomerase protein. In other aspects, the present invention is directed to replicable recombinant cloning vehicles comprising a nucleic acid sequence which codes for a plant 1-deoxy-D-xylulose-5-phosphate reductoisomerase, to modified host cells transformed, transfected, infected and/or injected with a recombinant cloning vehicle and/or DNA sequence of the invention.
1-deoxy-D-xylulose-5-phosphate reductoisomerases, and methods of use

DOEpatents

Croteau, Rodney B.; Lange, Bernd M.

2002-07-16

The present invention relates to isolated DNA sequences which code for the expression of plant 1-deoxy-D-xylulose-5-phosphate reductoisomerase protein, such as the sequence presented in SEQ ID NO:1 which encodes a 1-deoxy-D-xylulose-5-phosphate reductoisomerase protein from peppermint (Mentha x piperita). Additionally, the present invention relates to isolated plant 1-deoxy-D-xylulose-5-phosphate reductoisomerase protein. In other aspects, the present invention is directed to replicable recombinant cloning vehicles comprising a nucleic acid sequence which codes for a plant 1-deoxy-D-xylulose-5-phosphate reductoisomerase, to modified host cells transformed, transfected, infected and/or injected with a recombinant cloning vehicle and/or DNA sequence of the invention.
Coinfection of Fusobacterium nucleatum and Actinomyces israelii in Mastoiditis Diagnosed by Next-Generation DNA Sequencing

PubMed Central

Hoogestraat, Daniel R.; Abbott, April N.; SenGupta, Dhruba J.; Cummings, Lisa A.; Butler-Wu, Susan M.; Stephens, Karen; Cookson, Brad T.; Hoffman, Noah G.

2014-01-01

Some bacterial infections involve potentially complex mixtures of species that can now be distinguished using next-generation DNA sequencing. We present a case of mastoiditis where Gram stain, culture, and molecular diagnosis were nondiagnostic or discrepant. Next-generation sequencing implicated coinfection of Fusobacterium nucleatum and Actinomyces israelii, resolving these diagnostic discrepancies. PMID:24574281
Characterization of the Complete Mitochondrial Genome Sequence of Spirometra erinaceieuropaei (Cestoda: Diphyllobothriidae) from China

PubMed Central

Liu, Guo-Hua; Li, Chun; Li, Jia-Yuan; Zhou, Dong-Hui; Xiong, Rong-Chuan; Lin, Rui-Qing; Zou, Feng-Cai; Zhu, Xing-Quan

2012-01-01

Sparganosis, caused by the plerocercoid larvae of members of the genus Spirometra, can cause a significant public health problem and considerable economic losses. In the present study, the complete mitochondrial DNA (mtDNA) sequence of Spirometra erinaceieuropaei from China was determined, characterized and compared with that of S. erinaceieuropaei from Japan. The gene arrangement in the mt genome sequences of S. erinaceieuropaei from China and Japan is identical. The identity of the mt genomes was 99.1% between S. erinaceieuropaei from China and Japan, and the complete mtDNA sequence of S. erinaceieuropaei from China is slightly shorter (2 bp) than that from Japan. Phylogenetic analysis of S. erinaceieuropaei with other representative cestodes using two different computational algorithms [Bayesian inference (BI) and maximum likelihood (ML)] based on concatenated amino acid sequences of 12 protein-coding genes revealed that S. erinaceieuropaei is closely related to Diphyllobothrium spp., supporting classification based on morphological features. The present study determined the complete mtDNA sequence of S. erinaceieuropaei from China, which provides novel genetic markers for studying the population genetics and molecular epidemiology of S. erinaceieuropaei in humans and animals. PMID:22553464

High-Throughput Analysis of T-DNA Location and Structure Using Sequence Capture.

PubMed

Inagaki, Soichi; Henry, Isabelle M; Lieberman, Meric C; Comai, Luca

2015-01-01

Agrobacterium-mediated transformation of plants with T-DNA is used both to introduce transgenes and for mutagenesis. Conventional approaches used to identify the genomic location and the structure of the inserted T-DNA are laborious, and high-throughput methods using next-generation sequencing are being developed to address these problems. Here, we present a cost-effective approach that uses sequence capture targeted to the T-DNA borders to select genomic DNA fragments containing T-DNA-genome junctions, followed by Illumina sequencing to determine the location and junction structure of T-DNA insertions. Multiple probes can be mixed so that transgenic lines transformed with different T-DNA types can be processed simultaneously, using a simple, index-based pooling approach. We also developed a simple bioinformatic tool to find sequence read pairs that span the junction between the genome and T-DNA or any foreign DNA. We analyzed 29 transgenic lines of Arabidopsis thaliana, each containing inserts from 4 different T-DNA vectors. We determined the location of T-DNA insertions in 22 lines, 4 of which carried multiple insertion sites. Additionally, our analysis uncovered a high frequency of unconventional and complex T-DNA insertions, highlighting the need for high-throughput methods for T-DNA localization and structural characterization. Transgene insertion events have to be fully characterized prior to use as commercial products. Our method greatly facilitates the first step of this characterization of transgenic plants by providing an efficient screen for the selection of promising lines.
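The idea of finding read pairs that span a genome/T-DNA junction can be sketched as a simple filter over aligned pairs. This is not the authors' tool: it assumes the reads were aligned to a combined reference (genome plus T-DNA vector), that each pair has already been reduced to the reference name and position of its two mates, and the names `ReadPair`, `junction_pairs` and `genomic_anchor` are hypothetical.

```python
from collections import namedtuple

# One aligned read pair; ref1/ref2 are reference names, None if a mate is unmapped.
ReadPair = namedtuple("ReadPair", "name ref1 pos1 ref2 pos2")


def junction_pairs(pairs, tdna_refs):
    """Keep pairs in which exactly one mate maps to a T-DNA reference."""
    tdna_refs = set(tdna_refs)
    hits = []
    for pair in pairs:
        if pair.ref1 is None or pair.ref2 is None:
            continue  # both mates must be mapped to call a junction
        if (pair.ref1 in tdna_refs) != (pair.ref2 in tdna_refs):
            hits.append(pair)
    return hits


def genomic_anchor(pair, tdna_refs):
    """Return (chromosome, position) of the mate that maps to the genome."""
    if pair.ref1 in tdna_refs:
        return (pair.ref2, pair.pos2)
    return (pair.ref1, pair.pos1)


if __name__ == "__main__":
    tdna = {"pTDNA"}  # placeholder name for the vector sequence
    pairs = [
        ReadPair("r1", "Chr1", 1000, "pTDNA", 50),
        ReadPair("r2", "Chr1", 2000, "Chr1", 2200),
        ReadPair("r3", "pTDNA", 10, "pTDNA", 200),
    ]
    for hit in junction_pairs(pairs, tdna):
        print(hit.name, genomic_anchor(hit, tdna))
```

Clustering the genomic anchors of many such pairs would then point to candidate insertion sites; that step is omitted here.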
Environment and Structure Influence in DNA Conduction

NASA Technical Reports Server (NTRS)

Adessi, C.; Walch, S.; Anantram, M. P.; Biegel, Bryan (Technical Monitor)

2002-01-01

Results for transmission through the poly(G) DNA molecule are presented. We show that (i) periodically arranged sodium counter-ions in close proximity to dry DNA give rise to a new conduction channel and aperiodicity in the counter-ion sequence can lead to a significant reduction in conduction, (ii) modification of the rise of B-DNA induces a change in the width of the transmission window, and (iii) specifically designed sequences are predicted to show intrinsic resonant tunneling behavior.
Virus-specific DNA sequences present in cells which carry the herpes simplex virus thymidine kinase gene.

PubMed

Minson, A C; Darby, G K; Wildy, P

1979-11-01

Two independently derived cell lines which carry the herpes simplex type 2 thymidine kinase gene have been examined for the presence of HSV-2-specific DNA sequences. Both cell lines contained 1 to 3 copies per cell of a sequence lying within map co-ordinates 0.2 to 0.4 of the HSV-2 genome. Revertant cells, which contained no detectable thymidine kinase, did not contain this DNA sequence. The failure of EcoR1-restricted HSV-2 DNA to act as a donor of the thymidine kinase gene in transformation experiments suggests that the gene lies close to the EcoR1 restriction site within this sequence at a map position of approx. 0.3. The HSV-2 kinase gene is therefore approximately co-linear with the HSV-1 gene.
Distribution and sequence homogeneity of an abundant satellite DNA in the beetle, Tenebrio molitor.

PubMed Central

Davis, C A; Wyatt, G R

1989-01-01

The mealworm beetle, Tenebrio molitor, contains an unusually abundant and homogeneous satellite DNA which constitutes up to 60% of its genome. The satellite DNA is shown to be present in all of the chromosomes by in situ hybridization. 18 dimers of the repeat unit were cloned and sequenced. The consensus sequence is 142 nt long and lacks any internal repeat structure. Monomers of the sequence are very similar, showing on average a 2% divergence from the calculated consensus. Variant nucleotides are scattered randomly throughout the sequence although some variants are more common than others. Neighboring repeat units are no more alike than randomly chosen ones. The results suggest that some mechanism, perhaps gene conversion, is acting to maintain the homogeneity of the satellite DNA despite its abundance and distribution on all of the chromosomes. PMID:2762148
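The two quantities reported above, a calculated consensus and the average divergence of monomers from it, can be illustrated as follows. This is a minimal sketch under simplifying assumptions: the monomers are taken to be already aligned and of equal length (no gaps), the consensus is a per-column majority vote, and divergence is the fraction of mismatching positions. The function names are hypothetical and the toy sequences are not data from the study.

```python
from collections import Counter


def consensus(monomers):
    """Majority-vote consensus of equal-length, pre-aligned sequences."""
    length = len(monomers[0])
    if any(len(m) != length for m in monomers):
        raise ValueError("monomers must be aligned to the same length")
    # zip(*monomers) yields one alignment column at a time.
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*monomers))


def divergence(monomer, reference):
    """Fraction of positions at which monomer differs from reference."""
    mismatches = sum(1 for a, b in zip(monomer, reference) if a != b)
    return mismatches / len(reference)


def mean_divergence(monomers):
    """Average divergence of each monomer from the majority consensus."""
    cons = consensus(monomers)
    return sum(divergence(m, cons) for m in monomers) / len(monomers)


if __name__ == "__main__":
    toy = ["ACGTACGT", "ACGTACGT", "ACGAACGT", "ACGTACGA"]
    print(consensus(toy))
    print(mean_divergence(toy))
```

Comparing the divergence between neighboring monomers with that between randomly chosen monomers would use the same `divergence` function on pairs rather than against the consensus.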
High Resolution Size Analysis of Fetal DNA in the Urine of Pregnant Women by Paired-End Massively Parallel Sequencing

PubMed Central

Tsui, Nancy B. Y.; Jiang, Peiyong; Chow, Katherine C. K.; Su, Xiaoxi; Leung, Tak Y.; Sun, Hao; Chan, K. C. Allen; Chiu, Rossa W. K.; Lo, Y. M. Dennis

2012-01-01

Background: Fetal DNA in maternal urine, if present, would be a valuable source of fetal genetic material for noninvasive prenatal diagnosis. However, the existence of fetal DNA in maternal urine has remained controversial. The issue is due to the lack of appropriate technology to robustly detect the potentially highly degraded fetal DNA in maternal urine. Methodology: We have used massively parallel paired-end sequencing to investigate cell-free DNA molecules in maternal urine. Catheterized urine samples were collected from seven pregnant women during the third trimester of pregnancies. We detected fetal DNA by identifying sequenced reads that contained fetal-specific alleles of the single nucleotide polymorphisms. The sizes of individual urinary DNA fragments were deduced from the alignment positions of the paired reads. We measured the fractional fetal DNA concentration as well as the size distributions of fetal and maternal DNA in maternal urine. Principal Findings: Cell-free fetal DNA was detected in five of the seven maternal urine samples, with the fractional fetal DNA concentrations ranging from 1.92% to 4.73%. Fetal DNA became undetectable in maternal urine after delivery. The total urinary cell-free DNA molecules were less intact when compared with plasma DNA. Urinary fetal DNA fragments were very short, and the most dominant fetal sequences were between 29 bp and 45 bp in length. Conclusions: With the use of massively parallel sequencing, we have confirmed the existence of transrenal fetal DNA in maternal urine, and have shown that urinary fetal DNA was heavily degraded. PMID:23118982
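Deducing a fragment's size from the alignment positions of its two reads amounts to taking the distance between the outermost aligned coordinates. The sketch below illustrates this under stated assumptions: coordinates are 0-based, the forward read contributes the fragment's leftmost start, the reverse read contributes its rightmost end (exclusive), and both mates map to the same chromosome. The function names and the toy coordinates are hypothetical.

```python
from collections import Counter


def fragment_size(fwd_start, rev_end):
    """Size in bp of the fragment spanned by a properly oriented read pair.

    fwd_start: 0-based leftmost aligned position of the forward read.
    rev_end:   0-based exclusive end of the reverse read's alignment.
    """
    if rev_end <= fwd_start:
        raise ValueError("reverse read must end downstream of the forward start")
    return rev_end - fwd_start


def size_histogram(pairs):
    """Count fragments per size from an iterable of (fwd_start, rev_end)."""
    return Counter(fragment_size(start, end) for start, end in pairs)


if __name__ == "__main__":
    toy_pairs = [(100, 135), (200, 235), (300, 345), (400, 429)]
    hist = size_histogram(toy_pairs)
    for size in sorted(hist):
        print(size, hist[size])
```

For fragments shorter than the read length, as the very short sizes reported above suggest, the two reads overlap almost completely; the outer-coordinate calculation still applies provided adapter sequence has been trimmed before alignment.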
More of an art than a science: Using microbial DNA sequences to compose music

DOE PAGES

Larsen, Peter E.

2016-03-01

Bacteria are everywhere. Microbial ecology is emerging as a critical field for understanding the relationships between these ubiquitous bacterial communities, the environment, and human health. Next generation DNA sequencing technology provides us a powerful tool to indirectly observe the communities by sequencing and analyzing all of the bacterial DNA present in an environment. The results of the DNA sequencing experiments can generate gigabytes to terabytes of information, however, making it difficult for the citizen scientist to grasp and the educator to convey this data. Here, we present a method for interpreting massive amounts of microbial ecology data as musical performances, easily generated on any computer and using only commonly available or freely available software and the ‘Microbial Bebop’ algorithm. Furthermore, using this approach citizen scientists and biology educators can sonify complex data in a fun and interactive format, making it easier to communicate both the importance and the excitement of exploring the planet earth’s largest ecosystem.
More of an art than a science: Using microbial DNA sequences to compose music

DOE Office of Scientific and Technical Information (OSTI.GOV)

Larsen, Peter E.

Bacteria are everywhere. Microbial ecology is emerging as a critical field for understanding the relationships between these ubiquitous bacterial communities, the environment, and human health. Next generation DNA sequencing technology provides us a powerful tool to indirectly observe the communities by sequencing and analyzing all of the bacterial DNA present in an environment. The results of the DNA sequencing experiments can generate gigabytes to terabytes of information, however, making it difficult for the citizen scientist to grasp and the educator to convey this data. Here, we present a method for interpreting massive amounts of microbial ecology data as musical performances, easily generated on any computer and using only commonly available or freely available software and the ‘Microbial Bebop’ algorithm. Furthermore, using this approach citizen scientists and biology educators can sonify complex data in a fun and interactive format, making it easier to communicate both the importance and the excitement of exploring the planet earth’s largest ecosystem.
Microgravity

NASA Image and Video Library

1998-12-01

Type II restriction enzymes, such as EcoRI endonuclease, present a unique advantage for the study of sequence-specific recognition because they leave a record of where they have been in the form of the cleaved ends of the DNA sites where they were bound. The differential behavior of a sequence-specific protein at sites of differing base sequence is the essence of the sequence-specificity; the core question is how these proteins discriminate between different DNA sequences, especially when the two sequences are very similar. Principal Investigator: Dan Carter/New Century Pharmaceuticals
Underwound DNA under Tension: Structure, Elasticity, and Sequence-Dependent Behaviors

NASA Astrophysics Data System (ADS)

Sheinin, Maxim Y.; Forth, Scott; Marko, John F.; Wang, Michelle D.

2011-09-01

DNA melting under torsion plays an important role in a wide variety of cellular processes. In the present Letter, we have investigated DNA melting at the single-molecule level using an angular optical trap. By directly measuring force, extension, torque, and angle of DNA, we determined the structural and elastic parameters of torsionally melted DNA. Our data reveal that under moderate forces, the melted DNA assumes a left-handed structure as opposed to an open bubble conformation and is highly torsionally compliant. We have also discovered that at low forces melted DNA properties are highly dependent on DNA sequence. These results provide a more comprehensive picture of the global DNA force-torque phase diagram.
Identification of DNA-binding proteins by combining auto-cross covariance transformation and ensemble learning.

PubMed

Liu, Bin; Wang, Shanyi; Dong, Qiwen; Li, Shumin; Liu, Xuan

2016-04-20

DNA-binding proteins play a pivotal role in various intra- and extra-cellular activities ranging from DNA replication to gene expression control. With the rapid development of next-generation sequencing techniques, the number of protein sequences is increasing at an unprecedented rate. Thus it is necessary to develop computational methods to identify the DNA-binding proteins based only on the protein sequence information. In this study, a novel method called iDNA-KACC is presented, which combines the Support Vector Machine (SVM) and the auto-cross covariance transformation. The protein sequences are first converted into a profile-based protein representation, and then converted into a series of fixed-length vectors by the auto-cross covariance transformation with Kmer composition. The sequence order effect can be effectively captured by this scheme. These vectors are then fed into the SVM to discriminate the DNA-binding proteins from the non-DNA-binding ones. iDNA-KACC achieves an overall accuracy of 75.16% and a Matthews correlation coefficient of 0.5 in a rigorous jackknife test. Its performance is further improved by employing an ensemble learning approach, and the improved predictor is called iDNA-KACC-EL. Experimental results on an independent dataset show that iDNA-KACC-EL outperforms all the other state-of-the-art predictors, indicating that it would be a useful computational tool for DNA-binding protein identification.
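As a rough illustration of how an auto-cross covariance transformation can turn profiles of different lengths into vectors of one fixed length, here is a minimal Python sketch. It is not the iDNA-KACC implementation: the function name `acc_transform`, the normalisation by the number of summed pairs, and the toy two-column profile are assumptions made for the example, and the Kmer composition step mentioned in the abstract is not modelled.

```python
def acc_transform(profile, max_lag):
    """Auto-cross covariance features for a profile (list of rows, one per residue).

    Hypothetical helper for illustration only. For every lag 1..max_lag and every
    ordered column pair (a, b) it averages centered[j][a] * centered[j + lag][b].
    Pairs with a == b are auto-covariances, the rest are cross-covariances.
    The output length is n_cols * n_cols * max_lag, whatever the sequence length.
    """
    length = len(profile)
    n_cols = len(profile[0])
    if not 0 < max_lag < length:
        raise ValueError("max_lag must be between 1 and len(profile) - 1")

    # Center each column on its mean over the whole sequence.
    means = [sum(row[c] for row in profile) / length for c in range(n_cols)]
    centered = [[row[c] - means[c] for c in range(n_cols)] for row in profile]

    features = []
    for lag in range(1, max_lag + 1):
        span = length - lag  # number of (j, j + lag) position pairs
        for a in range(n_cols):
            for b in range(n_cols):
                total = sum(centered[j][a] * centered[j + lag][b] for j in range(span))
                features.append(total / span)
    return features


# Toy profiles with 2 columns (a real profile would have 20, one per amino acid).
short_profile = [[1, 0], [0, 1], [1, 0], [0, 1]]
long_profile = [[1, 0], [0, 1], [1, 1], [0, 0], [1, 0], [0, 1], [1, 1]]
print(len(acc_transform(short_profile, 2)), len(acc_transform(long_profile, 2)))
```

Both toy profiles yield vectors of the same length, which is the property that lets sequences of unequal length be fed to a classifier such as an SVM.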
Genome-wide evidence for local DNA methylation spreading from small RNA-targeted sequences in Arabidopsis.

PubMed

Ahmed, Ikhlak; Sarazin, Alexis; Bowler, Chris; Colot, Vincent; Quesneville, Hadi

2011-09-01

Transposable elements (TEs) and their relics play major roles in genome evolution. However, mobilization of TEs is usually deleterious and strongly repressed. In plants and mammals, this repression is typically associated with DNA methylation, but the relationship between this epigenetic mark and TE sequences has not been investigated systematically. Here, we present an improved annotation of TE sequences and use it to analyze genome-wide DNA methylation maps obtained at single-nucleotide resolution in Arabidopsis. We show that although the majority of TE sequences are methylated, ∼26% are not. Moreover, a significant fraction of TE sequences densely methylated at CG, CHG and CHH sites (where H = A, T or C) have no or few matching small interfering RNA (siRNAs) and are therefore unlikely to be targeted by the RNA-directed DNA methylation (RdDM) machinery. We provide evidence that these TE sequences acquire DNA methylation through spreading from adjacent siRNA-targeted regions. Further, we show that although both methylated and unmethylated TE sequences located in euchromatin tend to be more abundant closer to genes, this trend is least pronounced for methylated, siRNA-targeted TE sequences located 5' to genes. Based on these and other findings, we propose that spreading of DNA methylation through promoter regions explains at least in part the negative impact of siRNA-targeted TE sequences on neighboring gene expression.
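The three methylation contexts named in this abstract (CG, CHG and CHH, where H = A, T or C) can be told apart from the two bases that follow a cytosine. The following Python sketch shows that classification for the forward strand only; the function names are hypothetical and nothing here reproduces the authors' annotation pipeline.

```python
from collections import Counter


def cytosine_context(seq, pos):
    """Return 'CG', 'CHG' or 'CHH' for the cytosine at seq[pos] (forward strand).

    Returns None when the context cannot be decided, for example too close to
    the end of the sequence or followed by an ambiguous base. Illustrative only.
    """
    seq = seq.upper()
    if seq[pos] != "C":
        raise ValueError("position does not hold a cytosine")
    following = seq[pos + 1:pos + 3]
    if following[:1] == "G":
        return "CG"
    if len(following) < 2 or any(base not in "ACGT" for base in following):
        return None
    # The first following base is now known to be H (A, C or T).
    return "CHG" if following[1] == "G" else "CHH"


def count_contexts(seq):
    """Count the context of every forward-strand cytosine in seq."""
    counts = Counter()
    for pos, base in enumerate(seq.upper()):
        if base == "C":
            counts[cytosine_context(seq, pos)] += 1
    return counts


print(count_contexts("ACGTCAGCTTC"))
```

A full analysis would also scan the reverse strand (cytosines pairing with forward-strand guanines), which this sketch leaves out for brevity.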
APE1 incision activity at abasic sites in tandem repeat sequences.

PubMed

Li, Mengxia; Völker, Jens; Breslauer, Kenneth J; Wilson, David M

2014-05-29

Repetitive DNA sequences, such as those present in microsatellites and minisatellites, telomeres, and trinucleotide repeats (linked to fragile X syndrome, Huntington disease, etc.), account for nearly 30% of the human genome. These domains exhibit enhanced susceptibility to oxidative attack to yield base modifications, strand breaks, and abasic sites; have a propensity to adopt non-canonical DNA forms modulated by the positions of the lesions; and, when not properly processed, can contribute to genome instability that underlies aging and disease development. Knowledge on the repair efficiencies of DNA damage within such repetitive sequences is therefore crucial for understanding the impact of such domains on genomic integrity. In the present study, using strategically designed oligonucleotide substrates, we determined the ability of human apurinic/apyrimidinic endonuclease 1 (APE1) to cleave at apurinic/apyrimidinic (AP) sites in a collection of tandem DNA repeat landscapes involving telomeric and CAG/CTG repeat sequences. Our studies reveal the differential influence of domain sequence, conformation, and AP site location/relative positioning on the efficiency of APE1 binding and strand incision. Intriguingly, our data demonstrate that APE1 endonuclease efficiency correlates with the thermodynamic stability of the DNA substrate. We discuss how these results have both predictive and mechanistic consequences for understanding the success and failure of repair protein activity associated with such oxidatively sensitive, conformationally plastic/dynamic repetitive DNA domains. Published by Elsevier Ltd.
High-resolution biophysical analysis of the dynamics of nucleosome formation

PubMed Central

Hatakeyama, Akiko; Hartmann, Brigitte; Travers, Andrew; Nogues, Claude; Buckle, Malcolm

2016-01-01

We describe a biophysical approach that enables changes in the structure of DNA to be followed during nucleosome formation in in vitro reconstitution with either the canonical “Widom” sequence or a judiciously mutated sequence. The rapid non-perturbing photochemical analysis presented here provides ‘snapshots’ of the DNA configuration at any given moment in time during nucleosome formation under a very broad range of reaction conditions. Changes in DNA photochemical reactivity upon protein binding are interpreted as being mainly induced by alterations in individual base pair roll angles. The results strengthen the importance of the role of an initial (H3/H4)2 histone tetramer-DNA interaction and highlight the modulation of this early event by the DNA sequence. (H3/H4)2 binding precedes and dictates subsequent H2A/H2B-DNA interactions, which are less affected by the DNA sequence, leading to the final octameric nucleosome. Overall, our results provide a novel, exciting way to investigate those biophysical properties of DNA that constitute a crucial component in nucleosome formation and stabilization. PMID:27263658
Determination of a mutational spectrum

DOEpatents

Thilly, William G.; Keohavong, Phouthone

1991-01-01

A method of resolving (physically separating) mutant DNA from nonmutant DNA and a method of defining or establishing a mutational spectrum or profile of alterations present in nucleic acid sequences from a sample to be analyzed, such as a tissue or body fluid. The present method is based on the fact that it is possible, through the use of DGGE, to separate nucleic acid sequences which differ by only a single base change and on the ability to detect the separate mutant molecules. The present invention, in another aspect, relates to a method for determining a mutational spectrum in a DNA sequence of interest present in a population of cells. The method of the present invention is useful as a diagnostic or analytical tool in forensic science in assessing environmental and/or occupational exposures to potentially genetically toxic materials (also referred to as potential mutagens); in biotechnology, particularly in the study of the relationship between the amino acid sequence of enzymes and other biologically-active proteins or protein-containing substances and their respective functions; and in determining the effects of drugs, cosmetics and other chemicals for which toxicity data must be obtained.
Phylogenetic Position of a Copper Age Sheep (Ovis aries) Mitochondrial DNA

PubMed Central

Olivieri, Cristina; Ermini, Luca; Rizzi, Ermanno; Corti, Giorgio; Luciani, Stefania; Marota, Isolina; De Bellis, Gianluca; Rollo, Franco

2012-01-01

Background: Sheep (Ovis aries) were domesticated in the Fertile Crescent region about 9,000-8,000 years ago. Currently, few mitochondrial (mt) DNA studies are available on archaeological sheep. In particular, no data on archaeological European sheep are available. Methodology/Principal Findings: Here we describe the first portion of mtDNA sequence of a Copper Age European sheep. DNA was extracted from hair shafts which were part of the clothes of the so-called Tyrolean Iceman or Ötzi (5,350 - 5,100 years before present). Mitochondrial DNA (a total of 2,429 base pairs, encompassing a portion of the control region, tRNAPhe, a portion of the 12S rRNA gene, and the whole cytochrome B gene) was sequenced using a mixed sequencing procedure based on PCR amplification and 454 sequencing of pooled amplification products. We have compared the sequence with the corresponding sequence of 334 extant lineages. Conclusions/Significance: A phylogenetic network based on a new cladistic notation for the mitochondrial diversity of domestic sheep shows that Ötzi's sheep falls within haplogroup B, thus demonstrating that sheep belonging to this haplogroup were already present in the Alps more than 5,000 years ago. On the other hand, the lineage of Ötzi's sheep is defined by two transitions (16147 and 16440) which, assembled together, define a motif that has not yet been identified in modern sheep populations. PMID:22457789
An integrated PCR colony hybridization approach to screen cDNA libraries for full-length coding sequences.

PubMed

Pollier, Jacob; González-Guzmán, Miguel; Ardiles-Diaz, Wilson; Geelen, Danny; Goossens, Alain

2011-01-01

cDNA-Amplified Fragment Length Polymorphism (cDNA-AFLP) is a commonly used technique for genome-wide expression analysis that does not require prior sequence knowledge. Typically, quantitative expression data and sequence information are obtained for a large number of differentially expressed gene tags. However, most of the gene tags do not correspond to full-length (FL) coding sequences, which is a prerequisite for subsequent functional analysis. A medium-throughput screening strategy, based on integration of polymerase chain reaction (PCR) and colony hybridization, was developed that allows in parallel screening of a cDNA library for FL clones corresponding to incomplete cDNAs. The method was applied to screen for the FL open reading frames of a selection of 163 cDNA-AFLP tags from three different medicinal plants, leading to the identification of 109 (67%) FL clones. Furthermore, the protocol allows for the use of multiple probes in a single hybridization event, thus significantly increasing the throughput when screening for rare transcripts. The presented strategy offers an efficient method for the conversion of incomplete expressed sequence tags (ESTs), such as cDNA-AFLP tags, to FL-coding sequences.
Partial nucleotide sequences, and routine typing by polymerase chain reaction-restriction fragment length polymorphism, of the brown trout (Salmo trutta) lactate dehydrogenase, LDH-C1*90 and *100 alleles.

PubMed

McMeel, O M; Hoey, E M; Ferguson, A

2001-01-01

The cDNA nucleotide sequences of the lactate dehydrogenase alleles LDH-C1*90 and *100 of brown trout (Salmo trutta) were found to differ at position 308 where an A is present in the *100 allele but a G is present in the *90 allele. This base substitution results in an amino acid change from aspartic acid at position 82 in the LDH-C1 100 allozyme to a glycine in the 90 allozyme. Since aspartic acid has a net negative charge whilst glycine is uncharged, this is consistent with the electrophoretic observation that the LDH-C1 100 allozyme has a more anodal mobility relative to the LDH-C1 90 allozyme. Based on alignment of the cDNA sequence with the mouse genomic sequence, a local primer set was designed, incorporating the variable position, and was found to give very good amplification with brown trout genomic DNA. Sequencing of this fragment confirmed the difference in both homozygous and heterozygous individuals. Digestion of the polymerase chain reaction products with BslI, a restriction enzyme specific for the site difference, gave one, two and three fragments for the two homozygotes and the heterozygote, respectively, following electrophoretic separation. This provides a DNA-based means of routine screening of the highly informative LDH-C1* polymorphism in brown trout population genetic studies. Primer sets presented could be used to sequence cDNA of other LDH* genes of brown trout and other species.
DNA Base-Calling from a Nanopore Using a Viterbi Algorithm

PubMed Central

Timp, Winston; Comer, Jeffrey; Aksimentiev, Aleksei

2012-01-01

Nanopore-based DNA sequencing is the most promising third-generation sequencing method. It has superior read length, speed, and sample requirements compared with state-of-the-art second-generation methods. However, base-calling still presents substantial difficulty because the resolution of the technique is limited compared with the measured signal/noise ratio. Here we demonstrate a method to decode 3-bp-resolution nanopore electrical measurements into a DNA sequence using a Hidden Markov model. This method shows tremendous potential for accuracy (∼98%), even with a poor signal/noise ratio. PMID:22677395
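To make the decoding idea concrete, here is a small, generic Viterbi decoder in Python. It is a sketch under stated assumptions, not the model from the paper: the hidden states are single bases rather than the 3-bp windows described above, the "current levels" are four made-up discrete symbols, and every probability is invented for the example.

```python
import math


def viterbi(observations, states, log_start, log_trans, log_emit):
    """Most probable hidden-state path for a discrete HMM, in log space."""
    # best[t][s]: best log-probability of any path that ends in state s at step t.
    best = [{s: log_start[s] + log_emit[s][observations[0]] for s in states}]
    back = []  # back[t][s]: predecessor of s on that best path
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] + log_trans[p][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_emit[s][obs]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)

    # Trace back from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Toy model (all values are assumptions): each base mostly emits "its own" level.
STATES = ["A", "C", "G", "T"]
LEVELS = ["low", "mid", "high", "top"]
OWN_LEVEL = dict(zip(STATES, ["low", "mid", "high", "top"]))
log_start = {s: math.log(0.25) for s in STATES}
log_trans = {p: {s: math.log(0.25) for s in STATES} for p in STATES}
log_emit = {
    s: {lvl: math.log(0.7 if lvl == OWN_LEVEL[s] else 0.1) for lvl in LEVELS}
    for s in STATES
}

signal = ["low", "high", "mid", "top"]
print("".join(viterbi(signal, STATES, log_start, log_trans, log_emit)))
```

Working in log space avoids numerical underflow on long reads. With uniform transitions, as in this toy model, the decoder simply picks the most likely base for each measurement; a realistic model would use non-uniform transitions between overlapping multi-base states.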
Application of a time-dependent coalescence process for inferring the history of population size changes from DNA sequence data.

PubMed

Polanski, A; Kimmel, M; Chakraborty, R

1998-05-12

Distribution of pairwise differences of nucleotides from data on a sample of DNA sequences from a given segment of the genome has been used in the past to draw inferences about the past history of population size changes. However, all earlier methods assume a given model of population size changes (such as sudden expansion), parameters of which (e.g., time and amplitude of expansion) are fitted to the observed distributions of nucleotide differences among pairwise comparisons of all DNA sequences in the sample. Our theory indicates that for any time-dependent population size, N(tau) (in which time tau is counted backward from present), a time-dependent coalescence process yields the distribution, p(tau), of the time of coalescence between two DNA sequences randomly drawn from the population. Prediction of p(tau) and N(tau) requires the use of an inverse Laplace transform known to be unstable. Nevertheless, simulated data obtained from three models of monotone population change (stepwise, exponential, and logistic) indicate that the pattern of a past population size change leaves its signature on the pattern of DNA polymorphism. Application of the theory to the published mtDNA sequences indicates that the current mtDNA sequence variation is not inconsistent with a logistic growth of the human population.
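For readers who want to see how a population size history N(tau) maps to a coalescence time density p(tau), the sketch below uses the standard coalescent relation p(tau) = (1/N(tau)) exp(-integral from 0 to tau of du/N(u)). The abstract does not print this formula, so the expression, the haploid scaling (time in generations, pairwise coalescence rate 1/N), and the logistic parameters are assumptions of this example rather than details taken from the paper.

```python
import math


def coalescence_density(pop_size, tau, steps=1000):
    """p(tau) for two lineages under a time-dependent population size.

    pop_size: function giving N(u) at u generations before present (assumed haploid).
    The integral of 1/N(u) from 0 to tau is evaluated with the trapezoidal rule.
    """
    h = tau / steps
    rates = [1.0 / pop_size(i * h) for i in range(steps + 1)]
    integral = h * (sum(rates) - 0.5 * (rates[0] + rates[-1]))
    return rates[-1] * math.exp(-integral)


def logistic_history(tau, n_now=1e6, n_ancient=1e3, midpoint=2000.0, rate=0.005):
    """Hypothetical logistic growth, written backward in time from the present."""
    return n_ancient + (n_now - n_ancient) / (1.0 + math.exp(rate * (tau - midpoint)))


for tau in (500, 2000, 4000):
    print(tau, coalescence_density(logistic_history, tau))
```

For a constant population the function reduces to the familiar exponential density exp(-tau/N)/N, which is a convenient sanity check.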
Herpesvirus papio: state and properties of intracellular viral DNA in baboon lymphoblastoid cell lines.

PubMed

Falk, L; Lindahl, T; Bjursell, G; Klein, G

1979-07-15

Herpesvirus papio (HVP) is an indigenous B-lymphotropic virus of baboons (Papio sp.) present in latent form in baboon lymphoblastoid cell lines. It shares cross-reacting viral capsid and early antigens with the Epstein-Barr virus (EBV), and HVP DNA and EBV DNA show partial sequence homology. EBV-specific complementary RNA was employed here as a probe to investigate the physical state of the HVP DNA component in baboon lymphoblastoid cells after fractionation of cellular DNA by density gradient centrifugation. Five virus-producing cultures contained both free and integrated HVP DNA sequences while one non-producing cell line had two or three viral genome equivalents per cell in an apparently integrated form. Further analysis of one virus-producing line showed that the free HVP DNA fraction was composed of both linear and circular viral DNA. Contour length measurements of HVP circular DNA molecules by electron microscopy revealed that they were similar in length to the EBV circular DNA present in human lymphoblastoid cells.

Amino Acid Racemization and the Preservation of Ancient DNA

NASA Technical Reports Server (NTRS)

Poinar, Hendrik N.; Hoss, Matthias

1996-01-01

The extent of racemization of aspartic acid, alanine, and leucine provides criteria for assessing whether ancient tissue samples contain endogenous DNA. In samples in which the D/L ratio of aspartic acid exceeds 0.08, ancient DNA sequences could not be retrieved. Paleontological finds from which DNA sequences purportedly millions of years old have been reported show extensive racemization, and the amino acids present are mainly contaminants. An exception is the amino acids in some insects preserved in amber.
Bacillus pumilus SAFR-032 isolate

NASA Technical Reports Server (NTRS)

Venkateswaran, Kasthuri J. (Inventor)

2007-01-01

The present invention relates to discovery and isolation of a biologically pure culture of a Bacillus pumilus SAFR-032 isolate with UV sterilization resistant properties. This novel strain has been characterized on the basis of phenotypic traits, 16S rDNA sequence analysis and DNA-DNA hybridization. According to the results of these analyses, this strain belongs to the genus Bacillus. The GenBank accession number for the 16S rDNA sequence of the Bacillus pumilus SAFR-032 isolate is AY167879.
Lichenase and coding sequences

DOEpatents

Li, Xin-Liang; Ljungdahl, Lars G.; Chen, Huizhong

2000-08-15

The present invention provides a fungal lichenase, i.e., an endo-1,3-1,4-β-D-glucanohydrolase, its coding sequence, recombinant DNA molecules comprising the lichenase coding sequences, recombinant host cells and methods for producing same. The present lichenase is from Orpinomyces PC-2.
2D-dynamic representation of DNA sequences as a graphical tool in bioinformatics

NASA Astrophysics Data System (ADS)

Bielińska-Wąż, D.; Wąż, P.

2016-10-01

2D-dynamic representation of DNA sequences is briefly reviewed. Some new examples of 2D-dynamic graphs which are the graphical tool of the method are shown. Using the examples of the complete genome sequences of the Zika virus it is shown that the present method can be applied for the study of the evolution of viral genomes.
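A walk-based 2D graph of a DNA sequence can be sketched in a few lines of Python. The abstract does not spell out the construction, so the unit steps used here (A = (-1, 0), G = (1, 0), C = (0, 1), T = (0, -1)), the rule that a unit mass is deposited at the end of every step, and the helper names are all assumptions made for illustration.

```python
from collections import Counter

# Assumed base-to-step assignment; other conventions are possible.
STEPS = {"A": (-1, 0), "G": (1, 0), "C": (0, 1), "T": (0, -1)}


def dynamic_graph(sequence):
    """Walk through the sequence and accumulate a unit mass at each visited point."""
    masses = Counter()
    x = y = 0
    for base in sequence.upper():
        if base not in STEPS:
            continue  # skip ambiguous symbols such as N
        dx, dy = STEPS[base]
        x, y = x + dx, y + dy
        masses[(x, y)] += 1
    return masses


def center_of_mass(masses):
    """Mass-weighted mean position, one simple numerical descriptor of the graph."""
    total = sum(masses.values())
    cx = sum(x * m for (x, _), m in masses.items()) / total
    cy = sum(y * m for (_, y), m in masses.items()) / total
    return cx, cy


graph = dynamic_graph("ACGTAACG")
print(dict(graph), center_of_mass(graph))
```

Because revisited points gain mass instead of being overwritten, two different sequences that trace the same outline can still be distinguished by their mass distributions.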
Chromosome ends: different sequences may provide conserved functions.

PubMed

Louis, Edward J; Vershinin, Alexander V

2005-07-01

The structures of specific chromosome regions, centromeres and telomeres, present a number of puzzles. As functions performed by these regions are ubiquitous and essential, their DNA, proteins and chromatin structure are expected to be conserved. Recent studies of centromeric DNA from human, Drosophila and plant species have demonstrated that a hidden universal centromere-specific sequence is highly unlikely. The DNA of telomeres is more conserved consisting of a tandemly repeated 6-8 bp Arabidopsis-like sequence in a majority of organisms as diverse as protozoan, fungi, mammals and plants. However, there are alternatives to short DNA repeats at the ends of chromosomes and for telomere elongation by telomerase. Here we focus on the similarities and diversity that exist among the structural elements, DNA sequences and proteins, that make up terminal domains (telomeres and subtelomeres), and how organisms use these in different ways to fulfil the functions of end-replication and end-protection. Copyright (c) 2005 Wiley Periodicals, Inc.
Evidence for a Complex Class of Nonadenylated mRNA in Drosophila

PubMed Central

Zimmerman, J. Lynn; Fouts, David L.; Manning, Jerry E.

1980-01-01

The amount, by mass, of poly(A+) mRNA present in the polyribosomes of third-instar larvae of Drosophila melanogaster, and the relative contribution of the poly(A+) mRNA to the sequence complexity of total polysomal RNA, has been determined. Selective removal of poly(A+) mRNA from total polysomal RNA by use of either oligo-dT-cellulose, or poly(U)-sepharose affinity chromatography, revealed that only 0.15% of the mass of the polysomal RNA was present as poly(A+) mRNA. The present study shows that this RNA hybridized at saturation with 3.3% of the single-copy DNA in the Drosophila genome. After correction for asymmetric transcription and reactability of the DNA, 7.4% of the single-copy DNA in the Drosophila genome is represented in larval poly(A+) mRNA. This corresponds to 6.73 x 10^6 nucleotides of mRNA coding sequences, or approximately 5,384 diverse RNA sequences of average size 1,250 nucleotides. However, total polysomal RNA hybridizes at saturation to 10.9% of the single-copy DNA sequences. After correcting this value for asymmetric transcription and tracer DNA reactability, 24% of the single-copy DNA in Drosophila is represented in total polysomal RNA. This corresponds to 2.18 x 10^7 nucleotides of RNA coding sequences or 17,440 diverse RNA molecules of size 1,250 nucleotides. This value is 3.2 times greater than that observed for poly(A+) mRNA, and indicates that ≃69% of the polysomal RNA sequence complexity is contributed by nonadenylated RNA. Furthermore, if the number of different structural genes represented in total polysomal RNA is ≃1.7 x 10^4, then the number of genes expressed in third-instar larvae exceeds the number of chromomeres in Drosophila by about a factor of three. This numerology indicates that the number of chromomeres observed in polytene chromosomes does not reflect the number of structural gene sequences in the Drosophila genome. PMID:6777246
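The arithmetic in this abstract can be retraced in a few lines of Python. All input numbers come from the abstract itself; the variable names and the derived estimate of single-copy DNA size are only a worked check, not additional data.

```python
AVERAGE_RNA_LENGTH = 1250  # nucleotides, as stated in the abstract

poly_a_complexity = 6.73e6  # nucleotides represented in poly(A+) mRNA
total_complexity = 2.18e7   # nucleotides represented in total polysomal RNA

# Complexity divided by average length gives the number of distinct sequences.
poly_a_species = poly_a_complexity / AVERAGE_RNA_LENGTH
total_species = total_complexity / AVERAGE_RNA_LENGTH

# Ratio of the two complexities and the share not accounted for by poly(A+) mRNA.
fold_difference = total_complexity / poly_a_complexity
nonadenylated_share = 1 - poly_a_complexity / total_complexity

# Both corrected hybridization values should imply a similar single-copy DNA size.
single_copy_from_poly_a = poly_a_complexity / 0.074
single_copy_from_total = total_complexity / 0.24

print(f"poly(A+) species: {poly_a_species:.0f}")
print(f"total species:    {total_species:.0f}")
print(f"fold difference:  {fold_difference:.1f}")
print(f"nonadenylated:    {nonadenylated_share:.0%}")
print(f"single-copy DNA:  {single_copy_from_poly_a:.3g} vs {single_copy_from_total:.3g} nt")
```

The two independent estimates of single-copy DNA size agree to within about 1%, which suggests the reported percentages and nucleotide counts are internally consistent.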
From cheek swabs to consensus sequences: an A to Z protocol for high-throughput DNA sequencing of complete human mitochondrial genomes

PubMed Central

2014-01-01

Background: Next-generation DNA sequencing (NGS) technologies have made huge impacts in many fields of biological research, but especially in evolutionary biology. One area where NGS has shown potential is for high-throughput sequencing of complete mtDNA genomes (of humans and other animals). Despite the increasing use of NGS technologies and a better appreciation of their importance in answering biological questions, there remain significant obstacles to the successful implementation of NGS-based projects, especially for new users. Results: Here we present an ‘A to Z’ protocol for obtaining complete human mitochondrial (mtDNA) genomes – from DNA extraction to consensus sequence. Although designed for use on humans, this protocol could also be used to sequence small, organellar genomes from other species, and also nuclear loci. This protocol includes DNA extraction, PCR amplification, fragmentation of PCR products, barcoding of fragments, sequencing using the 454 GS FLX platform, and a complete bioinformatics pipeline (primer removal, reference-based mapping, output of coverage plots and SNP calling). Conclusions: All steps in this protocol are designed to be straightforward to implement, especially for researchers who are undertaking next-generation sequencing for the first time. The molecular steps are scalable to large numbers (hundreds) of individuals and all steps post-DNA extraction can be carried out in 96-well plate format. Also, the protocol has been assembled so that individual ‘modules’ can be swapped out to suit available resources. PMID:24460871
Improved multiple displacement amplification (iMDA) and ultraclean reagents.

PubMed

Motley, S Timothy; Picuri, John M; Crowder, Chris D; Minich, Jeremiah J; Hofstadler, Steven A; Eshoo, Mark W

2014-06-06

Next-generation sequencing sample preparation requires nanogram to microgram quantities of DNA; however, many relevant samples are comprised of only a few cells. Genomic analysis of these samples requires a whole genome amplification method that is unbiased and free of exogenous DNA contamination. To address these challenges we have developed protocols for the production of DNA-free consumables including reagents and have improved upon multiple displacement amplification (iMDA). A specialized ethylene oxide treatment was developed that renders free DNA and DNA present within Gram positive bacterial cells undetectable by qPCR. To reduce DNA contamination in amplification reagents, a combination of ion exchange chromatography, filtration, and lot testing protocols was developed. Our multiple displacement amplification protocol employs a second strand-displacing DNA polymerase, improved buffers, improved reaction conditions and DNA-free reagents. The iMDA protocol, when used in combination with DNA-free laboratory consumables and reagents, significantly improved efficiency and accuracy of amplification and sequencing of specimens with moderate to low levels of DNA. The sensitivity and specificity of sequencing of amplified DNA prepared using iMDA was compared to that of DNA obtained with two commercial whole genome amplification kits using 10 fg (~1-2 bacterial cells worth) of bacterial genomic DNA as a template. Analysis showed >99% of the iMDA reads mapped to the template organism whereas only 0.02% of the reads from the commercial kits mapped to the template. To assess the ability of iMDA to achieve balanced genomic coverage, a non-stochastic amount of bacterial genomic DNA (1 pg) was amplified and sequenced, and data obtained were compared to sequencing data obtained directly from genomic DNA. The iMDA DNA and genomic DNA sequencing had comparable coverage: 99.98% of the reference genome at ≥1X coverage and 99.9% at ≥5X coverage while maintaining both balance and representation of the genome. The iMDA protocol, in combination with DNA-free laboratory consumables, significantly improved the ability to sequence specimens with low levels of DNA. iMDA has broad utility in metagenomics, diagnostics, ancient DNA analysis, pre-implantation embryo screening, single-cell genomics, whole genome sequencing of unculturable organisms, and forensic applications for both human and microbial targets.
DNA nanomapping using CRISPR-Cas9 as a programmable nanoparticle.

PubMed

Mikheikin, Andrey; Olsen, Anita; Leslie, Kevin; Russell-Pavier, Freddie; Yacoot, Andrew; Picco, Loren; Payton, Oliver; Toor, Amir; Chesney, Alden; Gimzewski, James K; Mishra, Bud; Reed, Jason

2017-11-21

Progress in whole-genome sequencing using short-read (e.g., <150 bp), next-generation sequencing technologies has reinvigorated interest in high-resolution physical mapping to fill technical gaps that are not well addressed by sequencing. Here, we report two technical advances in DNA nanotechnology and single-molecule genomics: (1) we describe a labeling technique (CRISPR-Cas9 nanoparticles) for high-speed AFM-based physical mapping of DNA and (2) the first successful demonstration of using DVD optics to image DNA molecules with high-speed AFM. As a proof of principle, we used this new "nanomapping" method to detect and map precisely BCL2-IGH translocations present in lymph node biopsies of follicular lymphoma patients. This HS-AFM "nanomapping" technique can be complementary to both sequencing and other physical mapping approaches.
Nanopore Kinetic Proofreading of DNA Sequences

NASA Astrophysics Data System (ADS)

Ling, Xinsheng Sean

The concept of DNA sequencing using the time dependence of the nanopore ionic current was proposed in 1996 by Kasianowicz, Brandin, Branton, and Deamer (KBBD). The KBBD concept has generated a tremendous amount of interest in the past decade. In this talk, I will review the current understanding of the DNA "translocation" dynamics and how it can be described by Schrödinger's 1915 paper on the first-passage-time distribution function. Schrödinger's distribution function can be used to give a rigorous criterion for achieving nanopore DNA sequencing, which turns out to be identical to that of gel electrophoresis used by Sanger in the first-generation Sanger method. A nanopore DNA sequencing technology also requires discrimination of bases with high accuracy. I will describe a solid-state nanopore sandwich structure that can function as a proofreading device capable of discriminating between correct and incorrect hybridization probes with an accuracy rivaling that of high-fidelity DNA polymerases. The latest results from Nanjing will be presented. This work is supported by the China 1000-Talent Program at Southeast University, Nanjing, China.
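The first-passage-time distribution mentioned in this abstract has a closed form for one-dimensional drift plus diffusion toward an absorbing boundary: f(t) = L / sqrt(4 pi D t^3) * exp(-(L - v t)^2 / (4 D t)). The abstract does not give the formula or any parameters, so the symbols L (distance), v (drift velocity) and D (diffusion constant), the numbers used, and the helper names below are assumptions chosen for illustration.

```python
import math


def first_passage_density(t, distance, velocity, diffusion):
    """Probability density of first reaching `distance` at time t (drift-diffusion)."""
    if t <= 0:
        return 0.0
    prefactor = distance / math.sqrt(4.0 * math.pi * diffusion * t ** 3)
    exponent = -((distance - velocity * t) ** 2) / (4.0 * diffusion * t)
    return prefactor * math.exp(exponent)


def norm_and_mean(distance, velocity, diffusion, t_max, steps=20000):
    """Trapezoidal estimates of the total probability and the mean passage time."""
    h = t_max / steps
    total = mean = 0.0
    for i in range(1, steps + 1):  # the density is zero at t = 0
        t = i * h
        weight = 0.5 if i == steps else 1.0
        f = first_passage_density(t, distance, velocity, diffusion)
        total += weight * f * h
        mean += weight * t * f * h
    return total, mean


# Arbitrary units: L = 1, v = 1, D = 0.1.
total, mean = norm_and_mean(1.0, 1.0, 0.1, t_max=20.0)
print(f"total probability ~ {total:.4f}, mean passage time ~ {mean:.4f}")
```

The mean passage time equals L/v, while the spread of the distribution grows with D. Comparing that spread with the time separation between successive events is one way to think about a resolution criterion, although the specific criterion used in the talk is not stated in the abstract.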
mtDNA sequence diversity of Hazara ethnic group from Pakistan.

PubMed

Rakha, Allah; Fatima; Peng, Min-Sheng; Adan, Atif; Bi, Rui; Yasmin, Memona; Yao, Yong-Gang

2017-09-01

The present study was undertaken to investigate mitochondrial DNA (mtDNA) control region sequences of Hazaras from Pakistan, so as to generate an mtDNA reference database for forensic casework in Pakistan and to analyze the phylogenetic relationship of this particular ethnic group with geographically proximal populations. Complete mtDNA control region (nt 16024-576) sequences were generated through Sanger sequencing for 319 Hazara individuals from Quetta, Baluchistan. The population sample set showed a total of 189 distinct haplotypes, belonging mainly to West Eurasian (51.72%), East & Southeast Asian (29.78%) and South Asian (18.50%) haplogroups. Compared with other populations from Pakistan, the Hazara population had a relatively high haplotype diversity (0.9945) and a lower random match probability (0.0085). The dataset has been incorporated into the EMPOP database under accession number EMP00680. The data herein comprise the largest, and likely most thoroughly examined, control region mtDNA dataset from Hazaras of Pakistan. Copyright © 2017 Elsevier B.V. All rights reserved.
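Haplotype diversity and random match probability are commonly computed from haplotype frequencies as RMP = sum of p_i^2 and H = n/(n - 1) * (1 - sum of p_i^2). The abstract does not state which estimator the authors used, so treating these as the formulas behind the reported values is an assumption, and the haplotype labels below are invented.

```python
from collections import Counter


def diversity_stats(haplotypes):
    """Return (haplotype diversity, random match probability) for a sample.

    haplotypes: one label per sampled individual.
    RMP is the sum of squared haplotype frequencies; diversity applies the
    usual n / (n - 1) small-sample correction (assumed estimator).
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two individuals")
    counts = Counter(haplotypes)
    rmp = sum((count / n) ** 2 for count in counts.values())
    diversity = n / (n - 1) * (1.0 - rmp)
    return diversity, rmp


# Invented toy sample: four individuals, three distinct haplotypes.
diversity, rmp = diversity_stats(["H1", "H1", "H2", "H3"])
print(f"diversity = {diversity:.4f}, random match probability = {rmp:.4f}")

# Cross-check with the abstract: n = 319 and RMP = 0.0085.
print(f"implied diversity = {319 / 318 * (1 - 0.0085):.4f}")
```

Plugging the reported random match probability into this formula gives roughly 0.9946, close to the reported 0.9945; the small gap is compatible with rounding of the published figures.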
DNA barcoding Indian freshwater fishes.

PubMed

Lakra, Wazir Singh; Singh, M; Goswami, Mukunda; Gopalakrishnan, A; Lal, K K; Mohindra, V; Sarkar, U K; Punia, P P; Singh, K V; Bhatt, J P; Ayyappan, S

2016-11-01

DNA barcoding is a promising technique for species identification using a short mitochondrial DNA sequence of the cytochrome c oxidase I (COI) gene. In the present study, DNA barcodes were generated from 72 species of freshwater fish covering the Orders Cypriniformes, Siluriformes, Perciformes, Synbranchiformes, and Osteoglossiformes, representing 50 genera and 19 families. All the samples were collected from diverse sites except the species endemic to a particular location. Species were represented by multiple specimens in the great majority of the barcoded species. A total of 284 COI sequences were generated. After amplification and sequencing of a 700 base pair fragment of COI, primers were trimmed, which invariably generated a 655 base pair barcode sequence. The average Kimura two-parameter (K2P) distances within species, genera, families, and orders were 0.40%, 9.60%, 13.10%, and 17.16%, respectively. DNA barcodes discriminated congeneric species without any confusion. The study strongly validated the efficiency of COI as an ideal marker for DNA barcoding of Indian freshwater fishes.
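The Kimura two-parameter distance between two aligned sequences is d = -0.5 * ln((1 - 2P - Q) * sqrt(1 - 2Q)), where P is the proportion of transitions and Q the proportion of transversions. The following Python sketch computes it for a pair of toy sequences; the function name and the choice to skip gap or ambiguous positions are assumptions of the example, and the study's own software is not named in the abstract.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura two-parameter distance between two aligned DNA sequences.

    Positions where either sequence has a gap or an ambiguous base are skipped.
    math.log raises ValueError if the sequences are too divergent for the model.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable positions")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1.0 - 2.0 * p - q) * math.sqrt(1.0 - 2.0 * q))


# Toy 10-bp alignment with one transition (A -> G) at the first position.
print(f"{k2p_distance('AAAAAAAAAA', 'GAAAAAAAAA'):.4f}")
```

Averaging such pairwise distances within species, genera, families and orders yields summary values of the kind reported above.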
Amplification of the major satellite DNA family (FA-SAT) in a cat fibrosarcoma might be related to chromosomal instability.

PubMed

Santos, Sara; Chaves, Raquel; Adega, Filomena; Bastos, Estela; Guedes-Pinto, Henrique

2006-01-01

Most mammalian chromosomes have satellite DNA sequences located at or near the centromeres, organized in arrays of variable size and higher order structure. The implications of these specific repetitive DNA sequences and their organization for centromere function remain unclear. In contrast to most mammalian species, the domestic cat seems to have the major satellite DNA family (FA-SAT) localized primarily at the telomeres and secondarily at the centromeres of the chromosomes. In the present work, we analyzed chromosome preparations from a fibrosarcoma, in comparison with nontumor cells (epithelial tissue) from the same individual, by in situ hybridization of the FA-SAT cat satellite DNA family. This repetitive sequence was found to be amplified in the cat tumor chromosomes analyzed. The amplification of these satellite DNA sequences in the cat chromosomes with variable number and appearance (marker chromosomes) is discussed and might be related to mitotic instability, which could explain the complex patterns of chromosome aberrations detected in the fibrosarcoma analyzed.
Identification and characterization of a DnaJ gene from red alga Pyropia yezoensis (Bangiales, Rhodophyta)

NASA Astrophysics Data System (ADS)

Liu, Jiao; Li, Xianchao; Tang, Xuexi; Zhou, Bin

2016-03-01

Members of the DnaJ family are proteins that play a pivotal role in various cellular processes, such as protein folding, protein transport and cellular responses to stress. In the present study, we identified and characterized the full-length DnaJ cDNA sequence from expressed sequence tags of Pyropia yezoensis (PyDnaJ) via rapid amplification of cDNA ends. This cDNA encoded a protein of 429 amino acids, which shared high sequence similarity with other identified DnaJ proteins, such as a heat shock protein 40/DnaJ from Pyropia haitanensis. The relative mRNA expression level of PyDnaJ was investigated using real-time PCR to determine its specific expression during the algal life cycle and during desiccation. The relative mRNA expression level in sporophytes was higher than that in gametophytes and significantly increased during the whole desiccation process. These results indicate that PyDnaJ is an authentic member of the DnaJ family in plants and red algae and might play a pivotal role in mitigating damage to P. yezoensis during desiccation.
JICST Factual Database JICST DNA Database

NASA Astrophysics Data System (ADS)

Shirokizawa, Yoshiko; Abe, Atsushi

The Japan Information Center of Science and Technology (JICST) started an online DNA database service in October 1988. This database is composed of the EMBL Nucleotide Sequence Library and the Genetic Sequence Data Bank. The authors outline the database system, data items and search commands. Examples of retrieval sessions are presented.
Mutation detection using automated fluorescence-based sequencing.

PubMed

Montgomery, Kate T; Iartchouck, Oleg; Li, Li; Perera, Anoja; Yassin, Yosuf; Tamburino, Alex; Loomis, Stephanie; Kucherlapati, Raju

2008-04-01

The development of high-throughput DNA sequencing techniques has made direct DNA sequencing of PCR-amplified genomic DNA a rapid and economical approach to the identification of polymorphisms that may play a role in disease. Point mutations as well as small insertions or deletions are readily identified by DNA sequencing. The mutations may be heterozygous (occurring in one allele while the other allele retains the normal sequence) or homozygous (occurring in both alleles). Sequencing alone cannot discriminate between true homozygosity and apparent homozygosity caused by the loss of one allele through a large deletion. In this unit, strategies are presented for using PCR amplification and automated fluorescence-based sequencing to identify sequence variation. The size of the project and laboratory preference and experience will dictate how the data are managed and which software tools are used for analysis. A high-throughput protocol is given that has been used to search for mutations in over 200 different genes at the Harvard Medical School - Partners Center for Genetics and Genomics (HPCGG, http://www.hpcgg.org/). Copyright 2008 by John Wiley & Sons, Inc.
Biomolecule Sequencer: Next-Generation DNA Sequencing Technology for In-Flight Environmental Monitoring, Research, and Beyond

NASA Technical Reports Server (NTRS)

Smith, David J.; Burton, Aaron; Castro-Wallace, Sarah; John, Kristen; Stahl, Sarah E.; Dworkin, Jason Peter; Lupisella, Mark L.

2016-01-01

On the International Space Station (ISS), technologies capable of rapid microbial identification and disease diagnostics are not currently available. NASA still relies upon sample return for comprehensive, molecular-based sample characterization. Next-generation DNA sequencing is a powerful approach for identifying microorganisms in air, water, and surfaces onboard spacecraft. The Biomolecule Sequencer payload, manifested to SpaceX-9 and scheduled on the Increment 4748 research plan (June 2016), will assess the functionality of a commercially-available next-generation DNA sequencer in the microgravity environment of ISS. The MinION device from Oxford Nanopore Technologies (Oxford, UK) measures picoamp changes in electrical current dependent on nucleotide sequences of the DNA strand migrating through nanopores in the system. The hardware is exceptionally small (9.5 x 3.2 x 1.6 cm), lightweight (120 grams), and powered only by a USB connection. For the ISS technology demonstration, the Biomolecule Sequencer will be powered by a Microsoft Surface Pro3. Ground-prepared samples containing lambda bacteriophage, Escherichia coli, and mouse genomic DNA, will be launched and stored frozen on the ISS until experiment initiation. Immediately prior to sequencing, a crew member will collect and thaw frozen DNA samples, connect the sequencer to the Surface Pro3, inject thawed samples into a MinION flow cell, and initiate sequencing. At the completion of the sequencing run, data will be downlinked for ground analysis. Identical, synchronous ground controls will be used for data comparisons to determine sequencer functionality, run-time sequence, current dynamics, and overall accuracy. 
We will present our latest results from the ISS flight experiment, the first time DNA has ever been sequenced in space, and discuss the many potential applications of the Biomolecule Sequencer for environmental monitoring, medical diagnostics, higher fidelity and more adaptable Space Biology Human Research Program investigations, and even life detection experiments for astrobiology missions.
Existence of host-related DNA sequences in the schistosome genome.

PubMed

Iwamura, Y; Irie, Y; Kominami, R; Nara, T; Yasuraoka, K

1991-06-01

DNA sequences homologous to the mouse intracisternal A particle and endogenous type C retrovirus were detected in the DNAs of Schistosoma japonicum adults and S. mansoni eggs. Furthermore, other kinds of repetitive sequences in the host genome such as mouse type 1 Alu sequence (B1), mouse type 2 Alu sequence (B2) and mo-2 sequence, a mouse mini-satellite, were also detected in the DNAs from adults and eggs of S. japonicum and eggs of S. mansoni. Almost all of the sequences described above were absent in the DNAs of S. mansoni adults. The DNA fingerprints of schistosomes, using the mo-2 sequence, were indistinguishable from each other and resembled those of their murine hosts. Moreover, the mo-2 sequence was hypermethylated in the DNAs of schistosomes and its amount varied among them. These facts indicate that host-related sequences are actually present in schistosomes and that the mo-2 repetitive sequence probably exists extrachromosomally.
Alternative DNA structure formation in the mutagenic human c-MYC promoter

PubMed Central

del Mundo, Imee Marie A.; Zewail-Foote, Maha; Kerwin, Sean M.

2017-01-01

Mutation ‘hotspot’ regions in the genome are susceptible to genetic instability, implicating them in diseases. These hotspots are not random and often co-localize with DNA sequences potentially capable of adopting alternative DNA structures (non-B DNA, e.g. H-DNA and G4-DNA), which have been identified as endogenous sources of genomic instability. There are regions that contain overlapping sequences that may form more than one non-B DNA structure. The extent to which one structure impacts the formation/stability of another, within the sequence, is not fully understood. To address this issue, we investigated the folding preferences of oligonucleotides from a chromosomal breakpoint hotspot in the human c-MYC oncogene containing both potential G4-forming and H-DNA-forming elements. We characterized the structures formed in the presence of G4-DNA-stabilizing K+ ions or H-DNA-stabilizing Mg2+ ions using multiple techniques. We found that under conditions favorable for H-DNA formation, a stable intramolecular triplex DNA structure predominated; whereas, under K+-rich, G4-DNA-forming conditions, a plurality of unfolded and folded species were present. Thus, within a limited region containing sequences with the potential to adopt multiple structures, only one structure predominates under a given condition. The predominance of H-DNA implicates this structure in the instability associated with the human c-MYC oncogene. PMID:28334873
Modeling the integration of bacterial rRNA fragments into the human cancer genome.

PubMed

Sieber, Karsten B; Gajer, Pawel; Dunning Hotopp, Julie C

2016-03-21

Cancer is a disease driven by the accumulation of genomic alterations, including the integration of exogenous DNA into the human somatic genome. We previously identified in silico evidence of DNA fragments from a Pseudomonas-like bacterium integrating into the 5'-UTR of four proto-oncogenes in stomach cancer sequencing data. The functional and biological consequences of these bacterial DNA integrations remain unknown. Modeling of these integrations suggests that the previously identified sequences cover most of the sequence flanking the junction between the bacterial and human DNA. Further examination of these reads reveals that these integrations are rich in guanine nucleotides and the integrated bacterial DNA may have complex transcript secondary structures. The models presented here lay the foundation for future experiments to test if bacterial DNA integrations alter the transcription of the human genes.

Evidence of birth-and-death evolution of 5S rRNA gene in Channa species (Teleostei, Perciformes).

PubMed

Barman, Anindya Sundar; Singh, Mamta; Singh, Rajeev Kumar; Lal, Kuldeep Kumar

2016-12-01

In higher eukaryotes, the minor rDNA family codes for 5S rRNA, is arranged in tandem arrays, and comprises a highly conserved 120 bp coding sequence with a variable non-transcribed spacer (NTS). Initially, the 5S rDNA repeats were considered to have evolved by concerted evolution. However, some recent reports, including studies of teleost fishes, suggest that the evolution of 5S rDNA repeats does not fit the concerted evolution model and that the evolution of the 5S rDNA family may be explained by a birth-and-death evolution model. In order to study the mode of evolution of 5S rDNA repeats in Perciformes fish species, the nucleotide sequence and molecular organization of 5S rDNA in five species of the genus Channa were analyzed in the present study. Molecular analyses revealed several variants of 5S rDNA repeats (four types of NTS), and networks created by a neighbor-net algorithm for each type of sequence (I, II, III and IV) did not show clear clustering in a species-specific manner. The stable secondary structure was predicted, and upstream and downstream conserved regulatory elements were characterized. Sequence analyses also showed the presence of two putative pseudogenes in Channa marulius. The present study supports the view that 5S rDNA repeats in the genus Channa evolved under a birth-and-death process.
Isolation and characterization of full-length cDNA clones coding for cholinesterase from fetal human tissues

DOE Office of Scientific and Technical Information (OSTI.GOV)

Prody, C.A.; Zevin-Sonkin, D.; Gnatt, A.

1987-06-01

To study the primary structure and regulation of human cholinesterases, oligodeoxynucleotide probes were prepared according to a consensus peptide sequence present in the active site of both human serum pseudocholinesterase and Torpedo electric organ true acetylcholinesterase. Using these probes, the authors isolated several cDNA clones from λgt10 libraries of fetal brain and liver origins. These include 2.4-kilobase cDNA clones that code for a polypeptide containing a putative signal peptide and the N-terminal, active site, and C-terminal peptides of human BtChoEase, suggesting that they code either for BtChoEase itself or for a very similar but distinct fetal form of cholinesterase. In RNA blots of poly(A)+ RNA from the cholinesterase-producing fetal brain and liver, these cDNAs hybridized with a single 2.5-kilobase band. Blot hybridization to human genomic DNA revealed that these fetal BtChoEase cDNA clones hybridize with DNA fragments with a total length of 17.5 kilobases, and signal intensities indicated that these sequences are not present in many copies. Both the cDNA-encoded protein and its nucleotide sequence display striking homology to parallel sequences published for Torpedo AcChoEase. These findings demonstrate extensive homologies between the fetal BtChoEase encoded by these clones and other cholinesterases of various forms and species.
Joint Estimation of Contamination, Error and Demography for Nuclear DNA from Ancient Humans

PubMed Central

Slatkin, Montgomery

2016-01-01

When sequencing an ancient DNA sample from a hominin fossil, DNA from present-day humans involved in excavation and extraction will be sequenced along with the endogenous material. This type of contamination is problematic for downstream analyses as it will introduce a bias towards the population of the contaminating individual(s). Quantifying the extent of contamination is a crucial step as it allows researchers to account for possible biases that may arise in downstream genetic analyses. Here, we present an MCMC algorithm to co-estimate the contamination rate, sequencing error rate and demographic parameters—including drift times and admixture rates—for an ancient nuclear genome obtained from human remains, when the putative contaminating DNA comes from present-day humans. We assume we have a large panel representing the putative contaminant population (e.g. European, East Asian or African). The method is implemented in a C++ program called ‘Demographic Inference with Contamination and Error’ (DICE). We applied it to simulations and genome data from ancient Neanderthals and modern humans. With reasonable levels of genome sequence coverage (>3X), we find we can recover accurate estimates of all these parameters, even when the contamination rate is as high as 50%. PMID:27049965
Integrated in silico and biological validation of the blocking effect of Cot-1 DNA on Microarray-CGH.

PubMed

Kang, Seung-Hui; Park, Chan Hee; Jeung, Hei Cheul; Kim, Ki-Yeol; Rha, Sun Young; Chung, Hyun Cheol

2007-06-01

In array-CGH, various factors may act as variables influencing the result of experiments. Among them, Cot-1 DNA, which has been used as a repetitive sequence-blocking agent, may become an artifact-inducing factor in BAC array-CGH. To identify the effect of Cot-1 DNA on Microarray-CGH experiments, Cot-1 DNA was labeled directly and Microarray-CGH experiments were performed. The results confirmed that probes which hybridized more completely with Cot-1 DNA had a higher sequence similarity to the Alu element. Further, in the sex-mismatched Microarray-CGH experiments, the variation and intensity in the fluorescent signal were reduced in the high intensity probe group in which probes were better hybridized with Cot-1 DNA. Otherwise, those of the low intensity probe group showed no alterations regardless of Cot-1 DNA. These results confirmed by in silico methods that Cot-1 DNA could block repetitive sequences in gDNA and probes. In addition, it was confirmed biologically that the blocking effect of Cot-1 DNA could be presented via its repetitive sequences, especially Alu elements. Thus, in contrast to BAC-array CGH, the use of Cot-1 DNA is advantageous in controlling experimental variation in Microarray-CGH.
A universal colorimetry for nucleic acids and aptamer-specific ligands detection based on DNA hybridization amplification.

PubMed

Li, Shuang; Shang, Xinxin; Liu, Jia; Wang, Yujie; Guo, Yingshu; You, Jinmao

2017-07-01

We present a universal amplified colorimetric method for detecting nucleic acid targets or aptamer-specific ligand targets based on the gold nanoparticle-DNA (GNP-DNA) hybridization chain reaction (HCR). The universal arrays consisted of a capture probe and hairpin DNA-GNP. First, the capture probe specifically recognized the target and released the initiator sequence. Then dispersed hairpin DNA-modified GNPs were cross-linked to form aggregates through HCR events triggered by the initiator sequence. As the aggregates accumulate, a significant red-to-purple color change can be easily visualized by the naked eye. We used an miRNA target sequence (miRNA-203) and an aptamer-specific ligand (ATP) as target molecules for this proof-of-concept experiment. The initiator sequence (DNA2) was released from the capture probe (MNP/DNA1/2 conjugates) under the strong competitiveness of miRNA-203. Hairpin DNAs (H1 and H2) can hybridize with each other with the help of initiator DNA2 to form GNP-H1/GNP-H2 aggregates. The absorption ratio (A620/A520) values of the solutions were a sensitive function of miRNA-203 concentration over the range from 1.0 × 10⁻¹¹ M to 9.0 × 10⁻¹⁰ M, and concentrations as low as 1.0 × 10⁻¹¹ M could be detected. At the same time, the solution changed color from light wine red to purple and then to light blue. For ATP, the initiator sequence (5'-end of DNA3) was released from the capture probe (DNA3) under the strong binding between aptamer and ATP. The present colorimetric method for specific detection of ATP exhibited good sensitivity, and 1.0 × 10⁻⁸ M ATP could be detected. The proposed strategy also showed good performance for qualitative and quantitative analysis of intracellular nucleic acids and aptamer-specific ligands. Copyright © 2017 Elsevier Inc. All rights reserved.
SAM: String-based sequence search algorithm for mitochondrial DNA database queries

PubMed Central

Röck, Alexander; Irwin, Jodi; Dür, Arne; Parsons, Thomas; Parson, Walther

2011-01-01

The analysis of the haploid mitochondrial (mt) genome has numerous applications in forensic and population genetics, as well as in disease studies. Although mtDNA haplotypes are usually determined by sequencing, they are rarely reported as a nucleotide string. Traditionally they are presented in a difference-coded position-based format relative to the corrected version of the first sequenced mtDNA. This convention requires recommendations for standardized sequence alignment that is known to vary between scientific disciplines, even between laboratories. As a consequence, database searches that are vital for the interpretation of mtDNA data can suffer from biased results when query and database haplotypes are annotated differently. In the forensic context that would usually lead to underestimation of the absolute and relative frequencies. To address this issue we introduce SAM, a string-based search algorithm that converts query and database sequences to position-free nucleotide strings and thus eliminates the possibility that identical sequences will be missed in a database query. The mere application of a BLAST algorithm would not be a sufficient remedy as it uses a heuristic approach and does not address properties specific to mtDNA, such as phylogenetically stable but also rapidly evolving insertion and deletion events. The software presented here provides additional flexibility to incorporate phylogenetic data, site-specific mutation rates, and other biologically relevant information that would refine the interpretation of mitochondrial DNA data. The manuscript is accompanied by freeware and example data sets that can be used to evaluate the new software (http://stringvalidation.org). PMID:21056022
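The annotation problem described in the abstract above can be illustrated with a toy sketch. This is not the SAM algorithm or its software: it only shows how two different difference-coded annotations of the same molecule collapse to one nucleotide string once they are applied to the reference. The notation handled here (`73G` for a substitution, `315.1C` for an insertion after position 315, `249del` for a deletion) is a simplified assumption, as are the function name and the seven-base toy reference.

```python
import re

# Simplified difference notation (an assumption for this sketch):
#   "<pos><base>"        substitution, e.g. "73G"
#   "<pos>.<k><base>"    k-th inserted base after <pos>, e.g. "315.1C"
#   "<pos>del"           deletion of the base at <pos>
_DIFF = re.compile(r"^(\d+)(?:\.(\d+))?([ACGT]|del)$")


def to_nucleotide_string(reference, differences):
    """Apply difference-coded annotations to a reference (1-based positions)."""
    substitutions, deletions, insertions = {}, set(), {}
    for diff in differences:
        match = _DIFF.match(diff)
        if match is None:
            raise ValueError(f"unrecognized difference: {diff!r}")
        pos, ins_index, change = int(match.group(1)), match.group(2), match.group(3)
        if ins_index is not None:
            if change == "del":
                raise ValueError(f"insertion cannot be a deletion: {diff!r}")
            insertions.setdefault(pos, []).append((int(ins_index), change))
        elif change == "del":
            deletions.add(pos)
        else:
            substitutions[pos] = change
    out = []
    for pos, base in enumerate(reference, start=1):
        if pos not in deletions:
            out.append(substitutions.get(pos, base))
        # Inserted bases follow the reference position, in index order.
        out.extend(b for _, b in sorted(insertions.get(pos, [])))
    return "".join(out)


# Toy reference with a C homopolymer: losing one C can be annotated
# at either end of the run, yet both annotations give the same string.
reference = "ACCCCCT"
left_aligned = to_nucleotide_string(reference, ["2del"])
right_aligned = to_nucleotide_string(reference, ["6del"])
```

A position-based comparison would treat `2del` and `6del` as different haplotypes, whereas comparing the resulting strings treats them as identical, which is the kind of missed match the abstract describes.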
Cloning and expression of UDP-glucose: flavonoid 7-O-glucosyltransferase from hairy root cultures of Scutellaria baicalensis.

PubMed

Hirotani, M; Kuroda, R; Suzuki, H; Yoshikawa, T

2000-05-01

A cDNA encoding UDP-glucose: baicalein 7-O-glucosyltransferase (UBGT) was isolated from a cDNA library from hairy root cultures of Scutellaria baicalensis Georgi probed with a partial-length cDNA clone of a UDP-glucose: flavonoid 3-O-glucosyltransferase (UFGT) from grape (Vitis vinifera L.). The heterologous probe contained a glucosyltransferase consensus amino acid sequence which was also present in the Scutellaria cDNA clones. The complete nucleotide sequence of the 1688-bp cDNA insert was determined and the deduced amino acid sequences are presented. The nucleotide sequence analysis of UBGT revealed an open reading frame encoding a polypeptide of 476 amino acids with a calculated molecular mass of 53,094 Da. The reaction product for baicalein and UDP-glucose catalyzed by recombinant UBGT in Escherichia coli was identified as authentic baicalein 7-O-glucoside using high-performance liquid chromatography and proton nuclear magnetic resonance spectroscopy. The enzyme activities of recombinant UBGT expressed in E. coli were also detected towards flavonoids such as baicalein, wogonin, apigenin, scutellarein, 7,4'-dihydroxyflavone and kaempferol, and phenolic compounds. The accumulation of UBGT mRNA in hairy roots was in response to wounding or salicylic acid treatments.
Microbes, metagenomes and marine mammals: enabling the next generation of scientist to enter the genomic era

PubMed Central

2013-01-01

Background: The revolution in DNA sequencing technology continues unabated, and is affecting all aspects of the biological and medical sciences. The training and recruitment of the next generation of researchers who are able to use and exploit the new technology is severely lacking and potentially negatively influencing research and development efforts to advance genome biology. Here we present a cross-disciplinary course that provides undergraduate students with practical experience in running a next generation sequencing instrument through to the analysis and annotation of the generated DNA sequences. Results: Many labs across the world are installing next generation sequencing technology and we show that the undergraduate students produced quality sequence data and were excited to participate in cutting edge research. The students conducted the work flow from DNA extraction, library preparation, running the sequencing instrument, to the extraction and analysis of the data. They sequenced microbes, metagenomes, and a marine mammal, the Californian sea lion, Zalophus californianus. The students met sequencing quality controls, had no detectable contamination in the targeted DNA sequences, provided publication quality data, and became part of an international collaboration to investigate carcinomas in carnivores. Conclusions: Students learned important skills for their future education and career opportunities, and a perceived increase in students’ ability to conduct independent scientific research was measured. DNA sequencing is rapidly expanding in the life sciences. Teaching undergraduates to use the latest technology to sequence genomic DNA ensures they are ready to meet the challenges of the genomic era and allows them to participate in annotating the tree of life. PMID:24007365
Parallel gene analysis with allele-specific padlock probes and tag microarrays

PubMed Central

Banér, Johan; Isaksson, Anders; Waldenström, Erik; Jarvius, Jonas; Landegren, Ulf; Nilsson, Mats

2003-01-01

Parallel, highly specific analysis methods are required to take advantage of the extensive information about DNA sequence variation and of expressed sequences. We present a scalable laboratory technique suitable to analyze numerous target sequences in multiplexed assays. Sets of padlock probes were applied to analyze single nucleotide variation directly in total genomic DNA or cDNA for parallel genotyping or gene expression analysis. All reacted probes were then co-amplified and identified by hybridization to a standard tag oligonucleotide array. The technique was illustrated by analyzing normal and pathogenic variation within the Wilson disease-related ATP7B gene, both at the level of DNA and RNA, using allele-specific padlock probes. PMID:12930977
Switching bonds in a DNA gel: an all-DNA vitrimer.

PubMed

Romano, Flavio; Sciortino, Francesco

2015-02-20

We design an all-DNA system that behaves like vitrimers, innovative plastics with self-healing and stress-releasing properties. The DNA sequences are engineered to self-assemble first into tetra- and bifunctional units which, upon further cooling, bind to each other forming a fully bonded network gel. An innovative design of the binding regions of the DNA sequences, exploiting a double toehold-mediated strand displacement, generates a network gel which is able to reshuffle its bonds, retaining at all times full bonding. As in vitrimers, the rate of bond switching can be controlled via a thermally activated catalyst, which in the present design is very short DNA strands.
A long PCR–based approach for DNA enrichment prior to next-generation sequencing for systematic studies

PubMed Central

Uribe-Convers, Simon; Duke, Justin R.; Moore, Michael J.; Tank, David C.

2014-01-01

• Premise of the study: We present an alternative approach for molecular systematic studies that combines long PCR and next-generation sequencing. Our approach can be used to generate templates from any DNA source for next-generation sequencing. Here we test our approach by amplifying complete chloroplast genomes, and we present a set of 58 potentially universal primers for angiosperms to do so. Additionally, this approach is likely to be particularly useful for nuclear and mitochondrial regions. • Methods and Results: Chloroplast genomes of 30 species across angiosperms were amplified to test our approach. Amplification success varied depending on whether PCR conditions were optimized for a given taxon. To further test our approach, some amplicons were sequenced on an Illumina HiSeq 2000. • Conclusions: Although here we tested this approach by sequencing plastomes, long PCR amplicons could be generated using DNA from any genome, expanding the possibilities of this approach for molecular systematic studies. PMID:25202592
Annotation, submission and screening of repetitive elements in Repbase: RepbaseSubmitter and Censor.

PubMed

Kohany, Oleksiy; Gentles, Andrew J; Hankus, Lukasz; Jurka, Jerzy

2006-10-25

Repbase is a reference database of eukaryotic repetitive DNA, which includes prototypic sequences of repeats and basic information described in annotations. Updating and maintenance of the database requires specialized tools, which we have created and made available for use with Repbase, and which may be useful as a template for other curated databases. We describe the software tools RepbaseSubmitter and Censor, which are designed to facilitate updating and screening the content of Repbase. RepbaseSubmitter is a Java-based interface for formatting and annotating Repbase entries. It eliminates many common formatting errors, and automates actions such as calculation of sequence lengths and composition, thus facilitating curation of Repbase sequences. In addition, it has several features for predicting protein coding regions in sequences; searching and including PubMed references in Repbase entries; and searching the NCBI taxonomy database for correct inclusion of species information and taxonomic position. Censor is a tool to rapidly identify repetitive elements by comparison to known repeats. It uses WU-BLAST for speed and sensitivity, and can conduct DNA-DNA, DNA-protein, or translated DNA-translated DNA searches of genomic sequence. Defragmented output includes a map of repeats present in the query sequence, with the options to report masked query sequence(s), repeat sequences found in the query, and alignments. Censor and RepbaseSubmitter are available as both web-based services and downloadable versions. They can be found at http://www.girinst.org/repbase/submission.html (RepbaseSubmitter) and http://www.girinst.org/censor/index.php (Censor).
High-throughput analysis of T-DNA location and structure using sequence capture

DOE Office of Scientific and Technical Information (OSTI.GOV)

Inagaki, Soichi; Henry, Isabelle M.; Lieberman, Meric C.

Agrobacterium-mediated transformation of plants with T-DNA is used both to introduce transgenes and for mutagenesis. Conventional approaches used to identify the genomic location and the structure of the inserted T-DNA are laborious, and high-throughput methods using next-generation sequencing are being developed to address these problems. Here, we present a cost-effective approach that uses sequence capture targeted to the T-DNA borders to select genomic DNA fragments containing T-DNA–genome junctions, followed by Illumina sequencing to determine the location and junction structure of T-DNA insertions. Multiple probes can be mixed so that transgenic lines transformed with different T-DNA types can be processed simultaneously, using a simple, index-based pooling approach. We also developed a simple bioinformatic tool to find sequence read pairs that span the junction between the genome and T-DNA or any foreign DNA. We analyzed 29 transgenic lines of Arabidopsis thaliana, each containing inserts from 4 different T-DNA vectors. We determined the location of T-DNA insertions in 22 lines, 4 of which carried multiple insertion sites. Additionally, our analysis uncovered a high frequency of unconventional and complex T-DNA insertions, highlighting the need for high-throughput methods for T-DNA localization and structural characterization. Transgene insertion events have to be fully characterized prior to use as commercial products. As a result, our method greatly facilitates the first step of this characterization of transgenic plants by providing an efficient screen for the selection of promising lines.
High-throughput analysis of T-DNA location and structure using sequence capture

DOE PAGES

Inagaki, Soichi; Henry, Isabelle M.; Lieberman, Meric C.; ...

2015-10-07

Agrobacterium-mediated transformation of plants with T-DNA is used both to introduce transgenes and for mutagenesis. Conventional approaches used to identify the genomic location and the structure of the inserted T-DNA are laborious, and high-throughput methods using next-generation sequencing are being developed to address these problems. Here, we present a cost-effective approach that uses sequence capture targeted to the T-DNA borders to select genomic DNA fragments containing T-DNA–genome junctions, followed by Illumina sequencing to determine the location and junction structure of T-DNA insertions. Multiple probes can be mixed so that transgenic lines transformed with different T-DNA types can be processed simultaneously, using a simple, index-based pooling approach. We also developed a simple bioinformatic tool to find sequence read pairs that span the junction between the genome and T-DNA or any foreign DNA. We analyzed 29 transgenic lines of Arabidopsis thaliana, each containing inserts from 4 different T-DNA vectors. We determined the location of T-DNA insertions in 22 lines, 4 of which carried multiple insertion sites. Additionally, our analysis uncovered a high frequency of unconventional and complex T-DNA insertions, highlighting the need for high-throughput methods for T-DNA localization and structural characterization. Transgene insertion events have to be fully characterized prior to use as commercial products. As a result, our method greatly facilitates the first step of this characterization of transgenic plants by providing an efficient screen for the selection of promising lines.
Background sequence characteristics influence the occurrence and severity of disease-causing mtDNA mutations

PubMed Central

Wei, Wei; Hudson, Gavin

2017-01-01

Inherited mitochondrial DNA (mtDNA) mutations have emerged as a common cause of human disease, with mutations occurring multiple times in the world population. The clinical presentation of three pathogenic mtDNA mutations is strongly associated with a background mtDNA haplogroup, but it is not clear whether this is limited to a handful of examples or is a more general phenomenon. To address this, we determined the characteristics of 30,506 mtDNA sequences sampled globally. After performing several quality control steps, we ascribed an established pathogenicity score to the major alleles for each sequence. The mean pathogenicity score for known disease-causing mutations was significantly different between mtDNA macro-haplogroups. Several mutations were observed across all haplogroup backgrounds, whereas others were only observed on specific clades. In some instances this reflected a founder effect, but in others, the mutation recurred but only within the same phylogenetic cluster. Sequence diversity estimates showed that disease-causing mutations were more frequent on young sequences, and genomes with two or more disease-causing mutations were more common than expected by chance. These findings implicate the mtDNA background more generally in recurrent mutation events that have been purified through natural selection in older populations. This provides an explanation for the low frequency of mtDNA disease reported in specific ethnic groups. PMID:29253894
Lactobacillus heilongjiangensis sp. nov., isolated from Chinese pickle.

PubMed

Gu, Chun Tao; Li, Chun Yan; Yang, Li Jie; Huo, Gui Cheng

2013-11-01

A Gram-stain-positive bacterial strain, S4-3(T), was isolated from traditional pickle in Heilongjiang Province, China. The bacterium was characterized by a polyphasic approach, including 16S rRNA gene sequence analysis, pheS gene sequence analysis, rpoA gene sequence analysis, dnaK gene sequence analysis, fatty acid methyl ester (FAME) analysis, determination of DNA G+C content, DNA-DNA hybridization and an analysis of phenotypic features. Strain S4-3(T) showed 97.9-98.7 % 16S rRNA gene sequence similarities, 84.4-94.1 % pheS gene sequence similarities and 94.4-96.9 % rpoA gene sequence similarities to the type strains of Lactobacillus nantensis, Lactobacillus mindensis, Lactobacillus crustorum, Lactobacillus futsaii, Lactobacillus farciminis and Lactobacillus kimchiensis. dnaK gene sequence similarities between S4-3(T) and Lactobacillus nantensis LMG 23510(T), Lactobacillus mindensis LMG 21932(T), Lactobacillus crustorum LMG 23699(T), Lactobacillus futsaii JCM 17355(T) and Lactobacillus farciminis LMG 9200(T) were 95.4, 91.5, 90.4, 91.7 and 93.1 %, respectively. Based upon the data obtained in the present study, a novel species, Lactobacillus heilongjiangensis sp. nov., is proposed and the type strain is S4-3(T) ( = LMG 26166(T) = NCIMB 14701(T)).
The barley EST DNA Replication and Repair Database (bEST-DRRD) as a tool for the identification of the genes involved in DNA replication and repair.

PubMed

Gruszka, Damian; Marzec, Marek; Szarejko, Iwona

2012-06-14

The high level of conservation of genes that regulate DNA replication and repair indicates that they may serve as a source of information on the origin and evolution of the species and makes them a reliable system for the identification of cross-species homologs. Studies that had been conducted to date shed light on the processes of DNA replication and repair in bacteria, yeast and mammals. However, there is still much to be learned about the process of DNA damage repair in plants. These studies, which were conducted mainly using bioinformatics tools, enabled the list of genes that participate in various pathways of DNA repair in Arabidopsis thaliana (L.) Heynh to be outlined; however, information regarding these mechanisms in crop plants is still very limited. A similar, functional approach is particularly difficult for a species whose complete genomic sequences are still unavailable. One of the solutions is to apply ESTs (Expressed Sequence Tags) as the basis for gene identification. For the construction of the barley EST DNA Replication and Repair Database (bEST-DRRD), presented here, the Arabidopsis nucleotide and protein sequences involved in DNA replication and repair were used to browse for and retrieve the deposited sequences, derived from four barley (Hordeum vulgare L.) sequence databases, including the "Barley Genome version 0.05" database (encompassing ca. 90% of barley coding sequences) and from two databases covering the complete genomes of two monocot models: Oryza sativa L. and Brachypodium distachyon L. in order to identify homologous genes. Sequences of the categorised Arabidopsis queries are used for browsing the repositories, which are located on the ViroBLAST platform. The bEST-DRRD is currently used in our project during the identification and validation of the barley genes involved in DNA repair. 
The presented database provides information about the Arabidopsis genes involved in DNA replication and repair, their expression patterns and models of protein interactions. It was designed and established to provide an open-access tool for the identification of monocot homologs of known Arabidopsis genes that are responsible for DNA-related processes. The barley genes identified in the project are currently being analysed to validate their function.
Synchronization of DNA array replication kinetics

NASA Astrophysics Data System (ADS)

Manturov, Alexey O.; Grigoryev, Anton V.

2016-04-01

In the present work we discuss the features of DNA replication kinetics in the case of multiple simultaneously elongating DNA fragments. The interaction between replicating DNA fragments is mediated by free protons, which are released at every nucleotide attachment at the free end of an elongating DNA fragment. There is therefore feedback between the free proton concentration and DNA polymerase activity, which appears as a dependence of the elongation rate. We develop a numerical model based on a cellular automaton that can simulate the elongation stage (growth of DNA strands) of the DNA elongation process under the conditions described above, and we study the possibility of synchronization of DNA polymerase movement. The numerical results can be useful for detecting DNA polymerase movement and visualizing the elongation process in the case of massive DNA replication, e.g., under PCR conditions, or for evaluating DNA "sequencing by synthesis" devices.
Detection of Cytosine methylation in ancient DNA from five native american populations using bisulfite sequencing.

PubMed

Smith, Rick W A; Monroe, Cara; Bolnick, Deborah A

2015-01-01

While cytosine methylation has been widely studied in extant populations, relatively few studies have analyzed methylation in ancient DNA. Most existing studies of epigenetic marks in ancient DNA have inferred patterns of methylation in highly degraded samples using post-mortem damage to cytosines as a proxy for cytosine methylation levels. However, this approach limits the inference of methylation compared with direct bisulfite sequencing, the current gold standard for analyzing cytosine methylation at single nucleotide resolution. In this study, we used direct bisulfite sequencing to assess cytosine methylation in ancient DNA from the skeletal remains of 30 Native Americans ranging in age from approximately 230 to 4500 years before present. Unmethylated cytosines were converted to uracils by treatment with sodium bisulfite, bisulfite products of a CpG-rich retrotransposon were pyrosequenced, and C-to-T ratios were quantified for a single CpG position. We found that cytosine methylation is readily recoverable from most samples, given adequate preservation of endogenous nuclear DNA. In addition, our results indicate that the precision of cytosine methylation estimates is inversely correlated with aDNA preservation, such that samples of low DNA concentration show higher variability in measures of percent methylation than samples of high DNA concentration. In particular, samples in this study with a DNA concentration above 0.015 ng/μL generated the most consistent measures of cytosine methylation. This study presents evidence of cytosine methylation in a large collection of ancient human remains, and indicates that it is possible to analyze epigenetic patterns in ancient populations using direct bisulfite sequencing approaches.
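The quantification step described above rests on simple arithmetic: after bisulfite conversion, a methylated cytosine still reads as C while an unmethylated one reads as T, so the methylated fraction at a CpG position is C / (C + T). The Python sketch below illustrates this under stated assumptions; the function names and the replicate values in the usage note are hypothetical and not taken from the study.

```python
from statistics import pstdev
from typing import Sequence


def percent_methylation(c_signal: float, t_signal: float) -> float:
    """Percent methylation at one CpG from C and T signal intensities
    measured after bisulfite conversion (methylated C stays C)."""
    total = c_signal + t_signal
    if total <= 0:
        raise ValueError("C and T signals must sum to a positive value")
    return 100.0 * c_signal / total


def replicate_spread(percentages: Sequence[float]) -> float:
    """Population standard deviation across replicate measurements,
    used here as a simple stand-in for measurement variability."""
    return pstdev(percentages)
```

For example, `percent_methylation(3.0, 1.0)` gives 75.0, and comparing `replicate_spread` between low- and high-concentration extracts is one way to look at the variability the abstract describes.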
Mitochondrial sequence analysis for forensic identification using pyrosequencing technology.

PubMed

Andréasson, H; Asp, A; Alderborn, A; Gyllensten, U; Allen, M

2002-01-01

Over recent years, requests for mtDNA analysis in the field of forensic medicine have notably increased, and the results of such analyses have proved to be very useful in forensic cases where nuclear DNA analysis cannot be performed. Traditionally, mtDNA has been analyzed by DNA sequencing of the two hypervariable regions, HVI and HVII, in the D-loop. DNA sequence analysis using the conventional Sanger sequencing is very robust but time consuming and labor intensive. By contrast, mtDNA analysis based on the pyrosequencing technology provides fast and accurate results from the human mtDNA present in many types of evidence materials in forensic casework. The assay has been developed to determine polymorphic sites in the mitochondrial D-loop as well as the coding region to further increase the discrimination power of mtDNA analysis. The pyrosequencing technology for analysis of mtDNA polymorphisms has been tested with regard to sensitivity, reproducibility, and success rate when applied to control samples and actual casework materials. The results show that the method is very accurate and sensitive; the results are easily interpreted and provide a high success rate on casework samples. The panel of pyrosequencing reactions for the mtDNA polymorphisms were chosen to result in an optimal discrimination power in relation to the number of bases determined.

DNA base-calling from a nanopore using a Viterbi algorithm.

PubMed

Timp, Winston; Comer, Jeffrey; Aksimentiev, Aleksei

2012-05-16

Nanopore-based DNA sequencing is the most promising third-generation sequencing method. It has superior read length, speed, and sample requirements compared with state-of-the-art second-generation methods. However, base-calling still presents substantial difficulty because the resolution of the technique is limited compared with the measured signal/noise ratio. Here we demonstrate a method to decode 3-bp-resolution nanopore electrical measurements into a DNA sequence using a Hidden Markov model. This method shows tremendous potential for accuracy (~98%), even with a poor signal/noise ratio. Copyright © 2012 Biophysical Society. Published by Elsevier Inc. All rights reserved.
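For readers unfamiliar with Viterbi decoding, the following Python sketch shows the general technique on a toy hidden Markov model. It is an illustration only: the paper's model works on 3-bp-resolution electrical measurements, whereas this example assumes two hidden bases and two discrete observation symbols ("low" and "high" current), with made-up probabilities.

```python
import math
from typing import Dict, List, Sequence


def viterbi(
    obs: Sequence[str],
    states: Sequence[str],
    log_start: Dict[str, float],
    log_trans: Dict[str, Dict[str, float]],
    log_emit: Dict[str, Dict[str, float]],
) -> List[str]:
    """Most probable hidden-state path for obs, computed in log space."""
    if not obs:
        return []
    # best[t][s]: log-probability of the best path that ends in state s at time t
    best = [{s: log_start[s] + log_emit[s][obs[0]] for s in states}]
    back: List[Dict[str, str]] = [{}]
    for t in range(1, len(obs)):
        best.append({})
        back.append({})
        for s in states:
            prev = max(states, key=lambda p: best[t - 1][p] + log_trans[p][s])
            best[t][s] = best[t - 1][prev] + log_trans[prev][s] + log_emit[s][obs[t]]
            back[t][s] = prev
    # Trace back from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


# Toy model (all numbers are illustrative assumptions).
STATES = ["A", "G"]
LOG_START = {s: math.log(0.5) for s in STATES}
LOG_TRANS = {p: {s: math.log(0.5) for s in STATES} for p in STATES}
LOG_EMIT = {
    "A": {"low": math.log(0.9), "high": math.log(0.1)},
    "G": {"low": math.log(0.2), "high": math.log(0.8)},
}
decoded = viterbi(["low", "high", "high"], STATES, LOG_START, LOG_TRANS, LOG_EMIT)
```

Working in log space avoids numerical underflow on long reads, which matters far more for real signal traces than for this three-symbol toy input.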
Informational structure of genetic sequences and nature of gene splicing

NASA Astrophysics Data System (ADS)

Trifonov, E. N.

1991-10-01

Only about 1/20 of DNA of higher organisms codes for proteins, by means of classical triplet code. The rest of DNA sequences is largely silent, with unclear functions, if any. The triplet code is not the only code (message) carried by the sequences. There are three levels of molecular communication, where the same sequence "talks" to various biomolecules, while having, respectively, three different appearances: DNA, RNA and protein. Since the molecular structures and, hence, sequence specific preferences of these are substantially different, the original DNA sequence has to carry simultaneously three types of sequence patterns (codes, messages), thus being a composite structure in which one and the same letter (nucleotide) is frequently involved in several overlapping codes of different nature. This multiplicity and overlapping of the codes is a unique feature of the Gnomic, the language of genetic sequences. The coexisting codes have to be degenerate in various degrees to allow an optimal and concerted performance of all the encoded functions. There is an obvious conflict between the best possible performance of a given function and necessity to compromise the quality of a given sequence pattern in favor of other patterns. It appears that the major role of various changes in the sequences on their "ontogenetic" way from DNA to RNA to protein, like RNA editing and splicing, or protein post-translational modifications, is to resolve such conflicts. New data are presented strongly indicating that the gene splicing is such a device to resolve the conflict between the code of DNA folding in chromatin and the triplet code for protein synthesis.
Fragile sites, dysfunctional telomere and chromosome fusions: What is 5S rDNA role?

PubMed

Barros, Alain Victor; Wolski, Michele Andressa Vier; Nogaroto, Viviane; Almeida, Mara Cristina; Moreira-Filho, Orlando; Vicari, Marcelo Ricardo

2017-04-15

Repetitive DNA regions are known as fragile chromosomal sites which present a high flexibility and low stability. Our focus was to characterize fragile sites in 5S rDNA regions. The Ancistrus sp. species shows a diploid number of 50 and an indicative Robertsonian fusion at chromosomal pair 1. Two sequences of 5S rDNA were identified: 5S.1 rDNA and 5S.2 rDNA. The first sequence gathers the structures necessary for gene expression and shows a functional secondary structure prediction. In contrast, the 5S.2 rDNA sequence does not contain the upstream sequences that are required for expression; furthermore, its structure prediction reveals a nonfunctional ribosomal RNA. The chromosomal mapping revealed several 5S.1 and 5S.2 rDNA clusters. In addition, the 5S.2 rDNA clusters were found in the proximal regions of acrocentric and metacentric chromosomes. The pair 1 5S.2 rDNA cluster is co-located with interstitial telomeric sites (ITS). Our results indicate that its clusters are hotspots for chromosomal breaks. During the meiotic prophase bouquet arrangement, double strand breaks (DSBs) at proximal 5S.2 rDNA of acrocentric chromosomes could lead to homologous and non-homologous repair mechanisms such as Robertsonian fusions. Still, ITS sites provide chromosomal instability, resulting in telomeric recombination via TRF2 shelterin protein and a series of breakage-fusion-bridge cycles. Our proposal is that 5S rDNA-derived sequences act as chromosomal fragile sites in association with some chromosomal rearrangements of Loricariidae. Copyright © 2017 Elsevier B.V. All rights reserved.
Conductance of Dry DNA: Role of Environment

NASA Technical Reports Server (NTRS)

Anantram, M. P.; Adessi, Ch.; Walch, S.

2003-01-01

This paper presents viewgraphs on the conductance of dry DNA and its effect on the surrounding environment. The topics include: 1) Approach; 2) Influence of Counter Ions; 3) Conductance Versus DNA Length; 4) Intrinsic Resonant Tunneling in Engineered DNA Sequence; and 5) Transmission Versus Energy.
Variation of 45S rDNA intergenic spacers in Arabidopsis thaliana.

PubMed

Havlová, Kateřina; Dvořáčková, Martina; Peiro, Ramon; Abia, David; Mozgová, Iva; Vansáčová, Lenka; Gutierrez, Crisanto; Fajkus, Jiří

2016-11-01

Approximately seven hundred 45S rRNA genes (rDNA) in the Arabidopsis thaliana genome are organised in two 4 Mbp-long arrays of tandem repeats arranged in head-to-tail fashion separated by an intergenic spacer (IGS). These arrays make up 5 % of the A. thaliana genome. IGS are rapidly evolving sequences and frequent rearrangements inside the rDNA loci have generated considerable interspecific and even intra-individual variability which allows to distinguish among otherwise highly conserved rRNA genes. The IGS has not been comprehensively described despite its potential importance in regulation of rDNA transcription and replication. Here we describe the detailed sequence variation in the complete IGS of A. thaliana WT plants and provide the reference/consensus IGS sequence, as well as genomic DNA analysis. We further investigate mutants dysfunctional in chromatin assembly factor-1 (CAF-1) (fas1 and fas2 mutants), which are known to have a reduced number of rDNA copies, and plant lines with restored CAF-1 function (segregated from a fas1xfas2 genetic background) showing major rDNA rearrangements. The systematic rDNA loss in CAF-1 mutants leads to the decreased variability of the IGS and to the occurrence of distinct IGS variants. We present for the first time a comprehensive and representative set of complete IGS sequences, obtained by conventional cloning and by Pacific Biosciences sequencing. Our data expands the knowledge of the A. thaliana IGS sequence arrangement and variability, which has not been available in full and in detail until now. This is also the first study combining IGS sequencing data with RFLP analysis of genomic DNA.
Partial characterization of normal and Haemophilus influenzae-infected mucosal complementary DNA libraries in chinchilla middle ear mucosa.

PubMed

Kerschner, Joseph E; Erdos, Geza; Hu, Fen Ze; Burrows, Amy; Cioffi, Joseph; Khampang, Pawjai; Dahlgren, Margaret; Hayes, Jay; Keefe, Randy; Janto, Benjamin; Post, J Christopher; Ehrlich, Garth D

2010-04-01

We sought to construct and partially characterize complementary DNA (cDNA) libraries prepared from the middle ear mucosa (MEM) of chinchillas to better understand pathogenic aspects of infection and inflammation, particularly with respect to leukotriene biogenesis and response. Chinchilla MEM was harvested from controls and after middle ear inoculation with nontypeable Haemophilus influenzae. RNA was extracted to generate cDNA libraries. Randomly selected clones were subjected to sequence analysis to characterize the libraries and to provide DNA sequence for phylogenetic analyses. Reverse transcription-polymerase chain reaction of the RNA pools was used to generate cDNA sequences corresponding to genes associated with leukotriene biosynthesis and metabolism. Sequence analysis of 921 randomly selected clones from the uninfected MEM cDNA library produced approximately 250,000 nucleotides of almost entirely novel sequence data. Searches of the GenBank database with the Basic Local Alignment Search Tool provided for identification of 515 unique genes expressed in the MEM and not previously described in chinchillas. In almost all cases, the chinchilla cDNA sequences displayed much greater homology to human or other primate genes than with rodent species. Genes associated with leukotriene metabolism were present in both normal and infected MEM. Based on both phylogenetic comparisons and gene expression similarities with humans, chinchilla MEM appears to be an excellent model for the study of middle ear inflammation and infection. The higher degree of sequence similarity between chinchillas and humans compared to chinchillas and rodents was unexpected. The cDNA libraries from normal and infected chinchilla MEM will serve as useful molecular tools in the study of otitis media and should yield important information with respect to middle ear pathogenesis.
Partial Characterization of Normal and Haemophilus influenzae–Infected Mucosal Complementary DNA Libraries in Chinchilla Middle Ear Mucosa

PubMed Central

Kerschner, Joseph E.; Erdos, Geza; Hu, Fen Ze; Burrows, Amy; Cioffi, Joseph; Khampang, Pawjai; Dahlgren, Margaret; Hayes, Jay; Keefe, Randy; Janto, Benjamin; Post, J. Christopher; Ehrlich, Garth D.

2010-01-01

Objectives: We sought to construct and partially characterize complementary DNA (cDNA) libraries prepared from the middle ear mucosa (MEM) of chinchillas to better understand pathogenic aspects of infection and inflammation, particularly with respect to leukotriene biogenesis and response. Methods: Chinchilla MEM was harvested from controls and after middle ear inoculation with nontypeable Haemophilus influenzae. RNA was extracted to generate cDNA libraries. Randomly selected clones were subjected to sequence analysis to characterize the libraries and to provide DNA sequence for phylogenetic analyses. Reverse transcription–polymerase chain reaction of the RNA pools was used to generate cDNA sequences corresponding to genes associated with leukotriene biosynthesis and metabolism. Results: Sequence analysis of 921 randomly selected clones from the uninfected MEM cDNA library produced approximately 250,000 nucleotides of almost entirely novel sequence data. Searches of the GenBank database with the Basic Local Alignment Search Tool provided for identification of 515 unique genes expressed in the MEM and not previously described in chinchillas. In almost all cases, the chinchilla cDNA sequences displayed much greater homology to human or other primate genes than with rodent species. Genes associated with leukotriene metabolism were present in both normal and infected MEM. Conclusions: Based on both phylogenetic comparisons and gene expression similarities with humans, chinchilla MEM appears to be an excellent model for the study of middle ear inflammation and infection. The higher degree of sequence similarity between chinchillas and humans compared to chinchillas and rodents was unexpected. The cDNA libraries from normal and infected chinchilla MEM will serve as useful molecular tools in the study of otitis media and should yield important information with respect to middle ear pathogenesis. PMID:20433028
The role of DNA repair in herpesvirus pathogenesis.

PubMed

Brown, Jay C

2014-10-01

In cells latently infected with a herpesvirus, the viral DNA is present in the cell nucleus, but it is not extensively replicated or transcribed. In this suppressed state the virus DNA is vulnerable to mutagenic events that affect the host cell and have the potential to destroy the virus' genetic integrity. Despite the potential for genetic damage, however, herpesvirus sequences are well conserved after reactivation from latency. To account for this apparent paradox, I have tested the idea that host cell-encoded mechanisms of DNA repair are able to control genetic damage to latent herpesviruses. Studies were focused on homologous recombination-dependent DNA repair (HR). Methods of DNA sequence analysis were employed to scan herpesvirus genomes for DNA features able to activate HR. Analyses were carried out with a total of 39 herpesvirus DNA sequences, a group that included viruses from the alpha-, beta- and gamma-subfamilies. The results showed that all 39 genome sequences were enriched in two or more of the eight recombination-initiating features examined. The results were interpreted to indicate that HR can stabilize latent herpesvirus genomes. The results also showed, unexpectedly, that repair-initiating DNA features differed in alpha- compared to gamma-herpesviruses. Whereas inverted and tandem repeats predominated in alpha-herpesviruses, gamma-herpesviruses were enriched in short, GC-rich initiation sequences such as CCCAG and depleted in repeats. In alpha-herpesviruses, repair-initiating repeat sequences were found to be concentrated in a specific region (the S segment) of the genome while repair-initiating short sequences were distributed more uniformly in gamma-herpesviruses. The results suggest that repair pathways are activated differently in alpha- compared to gamma-herpesviruses. Copyright © 2014. Published by Elsevier Inc.
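Scanning a genome for a short initiation sequence such as CCCAG can be sketched in a few lines of Python. The abstract does not say how the sequence analysis was implemented, so the functions below are hypothetical helpers that simply count overlapping occurrences of a motif on both strands of an uppercase or lowercase DNA string; assessing enrichment would additionally require a background expectation, which is not modeled here.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA string (A, C, G, T)."""
    complement = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(complement)[::-1]


def count_overlapping(text: str, pattern: str) -> int:
    """Count occurrences of pattern in text, allowing overlaps."""
    count = 0
    start = text.find(pattern)
    while start != -1:
        count += 1
        start = text.find(pattern, start + 1)
    return count


def count_motif_both_strands(seq: str, motif: str) -> int:
    """Occurrences of motif on the given strand plus the opposite strand."""
    seq, motif = seq.upper(), motif.upper()
    forward = count_overlapping(seq, motif)
    rc_motif = reverse_complement(motif)
    # A palindromic motif would otherwise be counted twice at each site.
    reverse = 0 if rc_motif == motif else count_overlapping(seq, rc_motif)
    return forward + reverse
```

Dividing the count by the genome length in kilobases gives a density that can be compared between genomes of different sizes.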
Functional specificity of a Hox protein mediated by the recognition of minor groove structure.

PubMed

Joshi, Rohit; Passner, Jonathan M; Rohs, Remo; Jain, Rinku; Sosinsky, Alona; Crickmore, Michael A; Jacob, Vinitha; Aggarwal, Aneel K; Honig, Barry; Mann, Richard S

2007-11-02

The recognition of specific DNA-binding sites by transcription factors is a critical yet poorly understood step in the control of gene expression. Members of the Hox family of transcription factors bind DNA by making nearly identical major groove contacts via the recognition helices of their homeodomains. In vivo specificity, however, often depends on extended and unstructured regions that link Hox homeodomains to a DNA-bound cofactor, Extradenticle (Exd). Using a combination of structure determination, computational analysis, and in vitro and in vivo assays, we show that Hox proteins recognize specific Hox-Exd binding sites via residues located in these extended regions that insert into the minor groove but only when presented with the correct DNA sequence. Our results suggest that these residues, which are conserved in a paralog-specific manner, confer specificity by recognizing a sequence-dependent DNA structure instead of directly reading a specific DNA sequence.
Complete genome sequence of a new begomovirus associated with yellow mosaic disease of Hemidesmus indicus in India.

PubMed

Reddy, M Sreekanth; Kanakala, S; Srinivas, K P; Hema, M; Malathi, V G; Sreenivasulu, P

2014-05-01

The complete DNA A genome of a virus isolate associated with yellow mosaic disease of a medicinal plant, Hemidesmus indicus, from India was cloned and sequenced. The length of DNA A was 2825 nucleotides, 35 nucleotides longer than the unit genome of monopartite begomoviruses. Comparison of the nucleotide sequence of DNA A of the virus isolate with those of other begomoviruses showed maximum sequence identity of 69 % to DNA A of ageratum yellow vein China virus (AYVCNV; AJ558120) and 68 % with tomato yellow leaf curl virus- LBa4 (TYLCV; EF185318), and it formed a distinct clade in phylogenetic analysis. The genome organization of the present virus isolate was found to be similar to that of Old World monopartite begomoviruses. The genome was considered to be monopartite, because association of DNA B and β satellite DNA components was not detected. Based on its sequence identity (<70 %) to all other begomoviruses known to date and ICTV (International Committee on Taxonomy of Viruses) species demarcating criteria (<89 % identity), it is considered a member of a novel begomovirus species, and the tentative name "Hemidesmus yellow mosaic virus" (HeYMV) is proposed.
Trading genes along the silk road: mtDNA sequences and the origin of central Asian populations.

PubMed Central

Comas, D; Calafell, F; Mateu, E; Pérez-Lezaun, A; Bosch, E; Martínez-Arias, R; Clarimon, J; Facchini, F; Fiori, G; Luiselli, D; Pettener, D; Bertranpetit, J

1998-01-01

Central Asia is a vast region at the crossroads of different habitats, cultures, and trade routes. Little is known about the genetics and the history of the population of this region. We present the analysis of mtDNA control-region sequences in samples of the Kazakh, the Uighurs, the lowland Kirghiz, and the highland Kirghiz, which we have used to address both the population history of the region and the possible selective pressures that high altitude has on mtDNA genes. Central Asian mtDNA sequences present features intermediate between European and eastern Asian sequences, in several parameters-such as the frequencies of certain nucleotides, the levels of nucleotide diversity, mean pairwise differences, and genetic distances. Several hypotheses could explain the intermediate position of central Asia between Europe and eastern Asia, but the most plausible would involve extensive levels of admixture between Europeans and eastern Asians in central Asia, possibly enhanced during the Silk Road trade and clearly after the eastern and western Eurasian human groups had diverged. Lowland and highland Kirghiz mtDNA sequences are very similar, and the analysis of molecular variance has revealed that the fraction of mitochondrial genetic variance due to altitude is not significantly different from zero. Thus, it seems unlikely that altitude has exerted a major selective pressure on mitochondrial genes in central Asian populations. PMID:9837835
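Two of the summary statistics mentioned above, mean pairwise differences and nucleotide diversity, have simple definitions that a short Python sketch can make concrete. The functions are illustrative and assume pre-aligned sequences of equal length with no gap handling; they are not the software used in the study.

```python
from itertools import combinations
from typing import Sequence


def mean_pairwise_differences(seqs: Sequence[str]) -> float:
    """Average number of differing sites over all pairs of aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    diffs = [
        sum(a != b for a, b in zip(x, y))  # Hamming distance of one pair
        for x, y in combinations(seqs, 2)
    ]
    return sum(diffs) / len(diffs)


def nucleotide_diversity(seqs: Sequence[str]) -> float:
    """Mean pairwise differences per site (often written as pi)."""
    return mean_pairwise_differences(seqs) / len(seqs[0])
```

For the three toy sequences `["ACGT", "ACGA", "TCGA"]`, the pairwise differences are 1, 2 and 1, so the mean is 4/3 and the per-site diversity is 1/3.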
Comparison of microbial DNA enrichment tools for metagenomic whole genome sequencing.

PubMed

Thoendel, Matthew; Jeraldo, Patricio R; Greenwood-Quaintance, Kerryl E; Yao, Janet Z; Chia, Nicholas; Hanssen, Arlen D; Abdel, Matthew P; Patel, Robin

2016-08-01

Metagenomic whole genome sequencing for detection of pathogens in clinical samples is an exciting new area for discovery and clinical testing. A major barrier to this approach is the overwhelming ratio of human to pathogen DNA in samples with low pathogen abundance, which is typical of most clinical specimens. Microbial DNA enrichment methods offer the potential to relieve this limitation by improving this ratio. Two commercially available enrichment kits, the NEBNext Microbiome DNA Enrichment Kit and the Molzym MolYsis Basic kit, were tested for their ability to enrich for microbial DNA from resected arthroplasty component sonicate fluids from prosthetic joint infections or uninfected sonicate fluids spiked with Staphylococcus aureus. Using spiked uninfected sonicate fluid there was a 6-fold enrichment of bacterial DNA with the NEBNext kit and 76-fold enrichment with the MolYsis kit. Metagenomic whole genome sequencing of sonicate fluid revealed 13- to 85-fold enrichment of bacterial DNA using the NEBNext enrichment kit. The MolYsis approach achieved 481- to 9580-fold enrichment, resulting in 7 to 59% of sequencing reads being from the pathogens known to be present in the samples. These results demonstrate the usefulness of these tools when testing clinical samples with low microbial burden using next generation sequencing. Copyright © 2016 Elsevier B.V. All rights reserved.
Molecular identification of Mango, Mangifera indica L.var. totupura

PubMed Central

Jagarlamudi, Sankar; G, Rosaiah; Kurapati, Ravi Kumar; Pinnamaneni, Rajasekhar

2011-01-01

Mango (Mangifera indica), belonging to the Anacardiaceae family, is a fruit that grows in tropical regions. It is considered the King of fruits. The present work was taken up to identify a tool for identifying the mango species at the molecular level. The chloroplast trnL-F region was amplified from extracted total genomic DNA using the polymerase chain reaction (PCR) and sequenced. Sequence of the dominant DGGE band revealed that Mangifera indica in tested leaves was Mangifera indica (100% similarity to the ITS sequences of Mangifera indica). This sequence was deposited in NCBI with the accession no. GQ927757. Abbreviations: AFLP - Amplified fragment length polymorphism, cpDNA - Chloroplast DNA, DGGE - Denaturing gradient gel electrophoresis, DNA - Deoxyribonucleic acid, EDTA - Ethylenediamine tetraacetic acid, HCl - Hydrochloric acid, ISSR - Inter simple sequence repeats, ITS - Internal transcribed spacer, MATAB - Methyl Ammonium Bromide, Na2SO3 - Sodium sulphite, NaCl - Sodium chloride, NCBI - National Centre for Biotechnology Information, PCR - Polymerase chain reaction, PEG - Polyethylene glycol, RAPD - Randomly amplified polymorphic DNA, trnL-F - Transfer RNA genes start codon- termination codon. PMID:21423885
Long interspersed repeated DNA (LINE) causes polymorphism at the rat insulin 1 locus.

PubMed

Lakshmikumaran, M S; D'Ambrosio, E; Laimins, L A; Lin, D T; Furano, A V

1985-09-01

The insulin 1, but not the insulin 2, locus is polymorphic (i.e., exhibits allelic variation) in rats. Restriction enzyme analysis and hybridization studies showed that the polymorphic region is 2.2 kilobases upstream of the insulin 1 coding region and is due to the presence or absence of an approximately 2.7-kilobase repeated DNA element. DNA sequence determination showed that this DNA element is a member of a long interspersed repeated DNA family (LINE) that is highly repeated (greater than 50,000 copies) and highly transcribed in the rat. Although the presence or absence of LINE sequences at the insulin 1 locus occurs in both the homozygous and heterozygous states, LINE-containing insulin 1 alleles are more prevalent in the rat population than are alleles without LINEs. Restriction enzyme analysis of the LINE-containing alleles indicated that at least two versions of the LINE sequence may be present at the insulin 1 locus in different rats. Either repeated transposition of LINE sequences or gene conversion between the resident insulin 1 LINE and other sequences in the genome are possible explanations for this.
Alternative DNA structure formation in the mutagenic human c-MYC promoter.

PubMed

Del Mundo, Imee Marie A; Zewail-Foote, Maha; Kerwin, Sean M; Vasquez, Karen M

2017-05-05

Mutation 'hotspot' regions in the genome are susceptible to genetic instability, implicating them in diseases. These hotspots are not random and often co-localize with DNA sequences potentially capable of adopting alternative DNA structures (non-B DNA, e.g. H-DNA and G4-DNA), which have been identified as endogenous sources of genomic instability. There are regions that contain overlapping sequences that may form more than one non-B DNA structure. The extent to which one structure impacts the formation/stability of another, within the sequence, is not fully understood. To address this issue, we investigated the folding preferences of oligonucleotides from a chromosomal breakpoint hotspot in the human c-MYC oncogene containing both potential G4-forming and H-DNA-forming elements. We characterized the structures formed in the presence of G4-DNA-stabilizing K+ ions or H-DNA-stabilizing Mg2+ ions using multiple techniques. We found that under conditions favorable for H-DNA formation, a stable intramolecular triplex DNA structure predominated; whereas, under K+-rich, G4-DNA-forming conditions, a plurality of unfolded and folded species were present. Thus, within a limited region containing sequences with the potential to adopt multiple structures, only one structure predominates under a given condition. The predominance of H-DNA implicates this structure in the instability associated with the human c-MYC oncogene. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
In vitro excision of adeno-associated virus DNA from recombinant plasmids: Isolation of an enzyme fraction from HeLa cells that cleaves DNA at poly(G) sequences

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gottlieb, J.; Muzyczka, N.

1988-06-01

When circular recombinant plasmids containing adeno-associated virus (AAV) DNA sequences are transfected into human cells, the AAV provirus is rescued. Using these circular AAV plasmids as substrates, the authors isolated an enzyme fraction from HeLa cell nuclear extracts that excises intact AAV DNA in vitro from vector DNA and produces linear DNA products. The recognition signal for the enzyme is a polypurine-polypyrimidine sequence which is at least 9 residues long and rich in G·C base pairs. Such sequences are present in AAV recombinant plasmids as part of the first 15 base pairs of the AAV terminal repeat and in some cases as the result of cloning the AAV genome by G·C tailing. The isolated enzyme fraction does not have significant endonucleolytic activity on single-stranded or double-stranded DNA. Plasmid DNA that is transfected into tissue culture cells is cleaved in vivo to produce a pattern of DNA fragments similar to that seen with purified enzyme in vitro. The activity has been called endo R for rescue, and its behavior suggests that it may have a role in recombination of cellular chromosomes.
Isolation and characterization of full-length cDNA clones coding for cholinesterase from fetal human tissues.

PubMed Central

Prody, C A; Zevin-Sonkin, D; Gnatt, A; Goldberg, O; Soreq, H

1987-01-01

To study the primary structure and regulation of human cholinesterases, oligodeoxynucleotide probes were prepared according to a consensus peptide sequence present in the active site of both human serum pseudocholinesterase (BtChoEase; EC 3.1.1.8) and Torpedo electric organ "true" acetylcholinesterase (AcChoEase; EC 3.1.1.7). Using these probes, we isolated several cDNA clones from lambda gt10 libraries of fetal brain and liver origins. These include 2.4-kilobase cDNA clones that code for a polypeptide containing a putative signal peptide and the N-terminal, active site, and C-terminal peptides of human BtChoEase, suggesting that they code either for BtChoEase itself or for a very similar but distinct fetal form of cholinesterase. In RNA blots of poly(A)+ RNA from the cholinesterase-producing fetal brain and liver, these cDNAs hybridized with a single 2.5-kilobase band. Blot hybridization to human genomic DNA revealed that these fetal BtChoEase cDNA clones hybridize with DNA fragments of the total length of 17.5 kilobases, and signal intensities indicated that these sequences are not present in many copies. Both the cDNA-encoded protein and its nucleotide sequence display striking homology to parallel sequences published for Torpedo AcChoEase. These findings demonstrate extensive homologies between the fetal BtChoEase encoded by these clones and other cholinesterases of various forms and species. Images PMID:3035536
Algorithms for optimizing cross-overs in DNA shuffling.

PubMed

He, Lu; Friedman, Alan M; Bailey-Kellogg, Chris

2012-03-21

DNA shuffling generates combinatorial libraries of chimeric genes by stochastically recombining parent genes. The resulting libraries are subjected to large-scale genetic selection or screening to identify those chimeras with favorable properties (e.g., enhanced stability or enzymatic activity). While DNA shuffling has been applied quite successfully, it is limited by its homology-dependent, stochastic nature. Consequently, it is used only with parents of sufficient overall sequence identity, and provides no control over the resulting chimeric library. This paper presents efficient methods to extend the scope of DNA shuffling to handle significantly more diverse parents and to generate more predictable, optimized libraries. Our CODNS (cross-over optimization for DNA shuffling) approach employs polynomial-time dynamic programming algorithms to select codons for the parental amino acids, allowing for zero or a fixed number of conservative substitutions. We first present efficient algorithms to optimize the local sequence identity or the nearest-neighbor approximation of the change in free energy upon annealing, objectives that were previously optimized by computationally-expensive integer programming methods. We then present efficient algorithms for more powerful objectives that seek to localize and enhance the frequency of recombination by producing "runs" of common nucleotides either overall or according to the sequence diversity of the resulting chimeras. We demonstrate the effectiveness of CODNS in choosing codons and allocating substitutions to promote recombination between parents targeted in earlier studies: two GAR transformylases (41% amino acid sequence identity), two very distantly related DNA polymerases, Pol X and β (15%), and beta-lactamases of varying identity (26-47%). 
Our methods provide the protein engineer with a new approach to DNA shuffling that supports substantially more diverse parents, is more deterministic, and generates more predictable and more diverse chimeric libraries.
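The simplest objective named in the abstract above, local sequence identity, can be illustrated with a short sketch. This is not the CODNS dynamic programming method: it is a greedy, per-position choice that assumes the two parent proteins are already aligned without gaps, and it uses a deliberately partial codon table. The function names `matches` and `choose_codons` and the example peptides are hypothetical.

```python
from itertools import product

# Partial standard codon table (only the amino acids used in the example).
CODONS = {
    "L": ["CTT", "CTC", "CTA", "CTG", "TTA", "TTG"],
    "I": ["ATT", "ATC", "ATA"],
    "S": ["TCT", "TCC", "TCA", "TCG", "AGT", "AGC"],
    "R": ["CGT", "CGC", "CGA", "CGG", "AGA", "AGG"],
    "M": ["ATG"],
    "V": ["GTT", "GTC", "GTA", "GTG"],
}


def matches(a, b):
    """Number of positions at which two equal-length strings agree."""
    return sum(x == y for x, y in zip(a, b))


def choose_codons(parent_a, parent_b):
    """Pick, position by position, the codon pair sharing the most nucleotides.

    Returns the two coding sequences and the total number of identical
    nucleotide positions. Assumes gap-free, equal-length aligned parents.
    """
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must be aligned to the same length")
    dna_a, dna_b, identity = [], [], 0
    for aa_a, aa_b in zip(parent_a, parent_b):
        best = max(product(CODONS[aa_a], CODONS[aa_b]),
                   key=lambda pair: matches(*pair))
        dna_a.append(best[0])
        dna_b.append(best[1])
        identity += matches(*best)
    return "".join(dna_a), "".join(dna_b), identity


seq_a, seq_b, shared = choose_codons("LSM", "IRV")
print(seq_a, seq_b, shared)
```

Because each position is treated independently, this sketch cannot reward runs of common nucleotides across codon boundaries; the abstract describes dynamic programming for those more powerful objectives.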
TRX-LOGOS - a graphical tool to demonstrate DNA information content dependent upon backbone dynamics in addition to base sequence.

PubMed

Fortin, Connor H; Schulze, Katharina V; Babbitt, Gregory A

2015-01-01

It is now widely accepted that DNA sequences defining DNA-protein interactions functionally depend upon local biophysical features of the DNA backbone that are important in defining sites of binding interaction in the genome (e.g. DNA shape, charge and intrinsic dynamics). However, these physical features of the DNA polymer are not directly apparent when analyzing and viewing Shannon information content calculated at single nucleobases in a traditional sequence logo plot. Thus, sequence logo plots are severely limited in that they convey no explicit information regarding the structural dynamics of the DNA backbone, a feature often critical to binding specificity. We present TRX-LOGOS, an R software package and Perl wrapper code that interfaces the JASPAR database for computational regulatory genomics. TRX-LOGOS extends the traditional sequence logo plot to include Shannon information content calculated with regard to the dinucleotide-based BI-BII conformation shifts in phosphate linkages on the DNA backbone, thereby adding a visual measure of intrinsic DNA flexibility that can be critical for many DNA-protein interactions. TRX-LOGOS is available as an R graphics module offered both at SourceForge and as a download supplement at this journal. To demonstrate the general utility of TRX logo plots, we first calculated the information content for 416 Saccharomyces cerevisiae transcription factor binding sites functionally confirmed in the Yeastract database and matched to previously published yeast genomic alignments. We discovered that flanking regions contain significantly more information content at phosphate linkages than can be observed at nucleobases. We also examined broader transcription factor classifications defined by the JASPAR database, and discovered that many general signatures of transcription factor binding are locally more information rich at the level of DNA backbone dynamics than nucleobase sequence.
We used TRX-logos in combination with MEGA 6.0 software for molecular evolutionary genetics analysis to visually compare the human Forkhead box/FOX protein evolution to its binding site evolution. We also compared the DNA binding signatures of human TP53 tumor suppressor determined by two different laboratory methods (SELEX and ChIP-seq). Further analysis of the entire yeast genome, center aligned at the start codon, also revealed a distinct sequence-independent 3 bp periodic pattern in information content, present only in coding regions, and perhaps indicative of the non-random organization of the genetic code. TRX-LOGOS is useful in any situation in which important information content in DNA can be better visualized at the positions of phosphate linkages (i.e. dinucleotides) where the dynamic properties of the DNA backbone function to facilitate DNA-protein interaction.
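The Shannon information content that a traditional sequence logo displays at each nucleobase position can be sketched as follows. This is a minimal illustration in Python, not the TRX-LOGOS R package: it computes log2(4) minus the column entropy for a set of aligned binding sites, omits any small-sample correction, and does not model BI-BII backbone states. The function name `information_content` and the example sites are hypothetical.

```python
import math
from collections import Counter


def information_content(aligned_sites, alphabet="ACGT"):
    """Per-column information content (bits) for equal-length aligned sites.

    Each column scores log2(len(alphabet)) minus its Shannon entropy, so a
    fully conserved DNA column scores 2 bits and a uniform column scores 0.
    """
    if not aligned_sites:
        raise ValueError("at least one site is required")
    length = len(aligned_sites[0])
    if any(len(site) != length for site in aligned_sites):
        raise ValueError("all sites must have the same length")

    max_bits = math.log2(len(alphabet))
    scores = []
    for col in range(length):
        counts = Counter(site[col] for site in aligned_sites)
        total = sum(counts[symbol] for symbol in alphabet)
        if total == 0:
            raise ValueError(f"column {col} has no symbols from the alphabet")
        entropy = 0.0
        for symbol in alphabet:
            p = counts[symbol] / total
            if p > 0:
                entropy -= p * math.log2(p)
        scores.append(max_bits - entropy)
    return scores


sites = ["ACGT", "ACGA", "ACGC", "ACGG"]
print(information_content(sites))
```

The same calculation applies to any discrete alphabet, so a column of conformational state labels could in principle be scored the same way by passing a different `alphabet`; how TRX-LOGOS derives those states is not specified here.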
DNA–DNA kissing complexes as a new tool for the assembly of DNA nanostructures

PubMed Central

Barth, Anna; Kobbe, Daniela; Focke, Manfred

2016-01-01

Kissing-loop annealing of nucleic acids occurs in nature in several viruses and in prokaryotic replication, among other circumstances. Nucleobases of two nucleic acid strands (loops) interact with each other, although the two strands cannot wrap around each other completely because of the adjacent double-stranded regions (stems). In this study, we exploited DNA kissing-loop interaction for nanotechnological application. We functionalized the vertices of DNA tetrahedrons with DNA stem-loop sequences. The complementary loop sequence design allowed the hybridization of different tetrahedrons via kissing-loop interaction, which might be further exploited for nanotechnology applications like cargo transport and logical elements. Importantly, we were able to manipulate the stability of those kissing-loop complexes based on the choice and concentration of cations, the temperature and the number of complementary loops per tetrahedron either at the same or at different vertices. Moreover, variations in loop sequences allowed the characterization of necessary sequences within the loop as well as additional stability control of the kissing complexes. Therefore, the properties of the presented nanostructures make them an important tool for DNA nanotechnology. PMID:26773051

A statistical model for investigating binding probabilities of DNA nucleotide sequences using microarrays.

PubMed

Lee, Mei-Ling Ting; Bulyk, Martha L; Whitmore, G A; Church, George M

2002-12-01

There is considerable scientific interest in knowing the probability that a site-specific transcription factor will bind to a given DNA sequence. Microarray methods provide an effective means for assessing the binding affinities of a large number of DNA sequences as demonstrated by Bulyk et al. (2001, Proceedings of the National Academy of Sciences, USA 98, 7158-7163) in their study of the DNA-binding specificities of Zif268 zinc fingers using microarray technology. In a follow-up investigation, Bulyk, Johnson, and Church (2002, Nucleic Acids Research 30, 1255-1261) studied the interdependence of nucleotides on the binding affinities of transcription proteins. Our article is motivated by this pair of studies. We present a general statistical methodology for analyzing microarray intensity measurements reflecting DNA-protein interactions. The log probability of a protein binding to a DNA sequence on an array is modeled using a linear ANOVA model. This model is convenient because it employs familiar statistical concepts and procedures and also because it is effective for investigating the probability structure of the binding mechanism.
DNA Extraction Protocols for Whole-Genome Sequencing in Marine Organisms.

PubMed

Panova, Marina; Aronsson, Henrik; Cameron, R Andrew; Dahl, Peter; Godhe, Anna; Lind, Ulrika; Ortega-Martinez, Olga; Pereyra, Ricardo; Tesson, Sylvie V M; Wrange, Anna-Lisa; Blomberg, Anders; Johannesson, Kerstin

2016-01-01

The marine environment harbors a large proportion of the total biodiversity on this planet, including the majority of the earth's different phyla and classes. Studying the genomes of marine organisms can bring interesting insights into genome evolution. Today, almost all marine organismal groups are understudied with respect to their genomes. One potential reason is that extraction of high-quality DNA in sufficient amounts is challenging for many marine species. This is due to high polysaccharide content, polyphenols and other secondary metabolites that will inhibit downstream DNA library preparations. Consequently, protocols developed for vertebrates and plants do not always perform well for invertebrates and algae. In addition, many marine species have large population sizes and, as a consequence, highly variable genomes. Thus, to facilitate the sequence read assembly process during genome sequencing, it is desirable to obtain enough DNA from a single individual, which is a challenge in many species of invertebrates and algae. Here, we present DNA extraction protocols for seven marine species (four invertebrates, two algae, and a marine yeast), optimized to provide sufficient DNA quality and yield for de novo genome sequencing projects.
msgbsR: An R package for analysing methylation-sensitive restriction enzyme sequencing data.

PubMed

Mayne, Benjamin T; Leemaqz, Shalem Y; Buckberry, Sam; Rodriguez Lopez, Carlos M; Roberts, Claire T; Bianco-Miotto, Tina; Breen, James

2018-02-01

Genotyping-by-sequencing (GBS) or restriction-site associated DNA marker sequencing (RAD-seq) is a practical and cost-effective method for analysing large genomes from high diversity species. This method of sequencing, coupled with methylation-sensitive enzymes (often referred to as methylation-sensitive restriction enzyme sequencing or MRE-seq), is an effective tool to study DNA methylation in parts of the genome that are inaccessible in other sequencing techniques or are not annotated in microarray technologies. Current software tools do not fulfil all methylation-sensitive restriction sequencing assays for determining differences in DNA methylation between samples. To fill this computational need, we present msgbsR, an R package that contains tools for the analysis of methylation-sensitive restriction enzyme sequencing experiments. msgbsR can be used to identify and quantify read counts at methylated sites directly from alignment files (BAM files) and enables verification of restriction enzyme cut sites with the correct recognition sequence of the individual enzyme. In addition, msgbsR assesses DNA methylation based on read coverage, similar to RNA sequencing experiments, rather than methylation proportion and is a useful tool in analysing differential methylation on large populations. The package is fully documented and available freely online as a Bioconductor package ( https://bioconductor.org/packages/release/bioc/html/msgbsR.html ).
454 Pyrosequencing to Describe Microbial Eukaryotic Community Composition, Diversity and Relative Abundance: A Test for Marine Haptophytes

PubMed Central

Egge, Elianne; Bittner, Lucie; Andersen, Tom; Audic, Stéphane; de Vargas, Colomban; Edvardsen, Bente

2013-01-01

Next generation sequencing of ribosomal DNA is increasingly used to assess the diversity and structure of microbial communities. Here we test the ability of 454 pyrosequencing to detect the number of species present, and assess the relative abundance in terms of cell numbers and biomass of protists in the phylum Haptophyta. We used a mock community consisting of equal number of cells of 11 haptophyte species and compared targeting DNA and RNA/cDNA, and two different V4 SSU rDNA haptophyte-biased primer pairs. Further, we tested four different bioinformatic filtering methods to reduce errors in the resulting sequence dataset. With sequencing depth of 11000–20000 reads and targeting cDNA with Haptophyta specific primers Hap454 we detected all 11 species. A rarefaction analysis of expected number of species recovered as a function of sampling depth suggested that minimum 1400 reads were required here to recover all species in the mock community. Relative read abundance did not correlate to relative cell numbers. Although the species represented with the largest biomass was also proportionally most abundant among the reads, there was generally a weak correlation between proportional read abundance and proportional biomass of the different species, both with DNA and cDNA as template. The 454 sequencing generated considerable spurious diversity, and more with cDNA than DNA as template. With initial filtering based only on match with barcode and primer we observed 100-fold more operational taxonomic units (OTUs) at 99% similarity than the number of species present in the mock community. Filtering based on quality scores, or denoising with PyroNoise resulted in ten times more OTU99% than the number of species. Denoising with AmpliconNoise reduced the number of OTU99% to match the number of species present in the mock community. Based on our analyses, we propose a strategy to more accurately depict haptophyte diversity using 454 pyrosequencing. PMID:24069303
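The rarefaction analysis mentioned above asks how many species are expected to be recovered at a given sampling depth. A minimal sketch of the classical hypergeometric rarefaction formula follows; it is one standard way to compute this quantity and is not necessarily the software the authors used. The function name `expected_species` and the read counts are hypothetical, not data from the study.

```python
from math import comb


def expected_species(read_counts, depth):
    """Expected number of species seen when `depth` reads are drawn
    without replacement from a sample with the given per-species counts.

    For each species with c reads out of N in total, the chance of missing
    it entirely is C(N - c, depth) / C(N, depth).
    """
    total = sum(read_counts)
    if not 0 <= depth <= total:
        raise ValueError("depth must be between 0 and the total read count")
    all_draws = comb(total, depth)
    return sum(1 - comb(total - c, depth) / all_draws for c in read_counts)


# Hypothetical, uneven community of 11 species.
counts = [5000, 3000, 1500, 800, 400, 200, 100, 50, 20, 10, 5]
for depth in (100, 1000, 5000):
    print(depth, round(expected_species(counts, depth), 2))
```

Plotting this value against depth gives the rarefaction curve; the depth at which it approaches the true species count depends heavily on how rare the least abundant species is.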
SPRi-based biosensing platforms for detection of specific DNA sequences using thiolate and dithiocarbamate assemblies

NASA Astrophysics Data System (ADS)

Drozd, Marcin; Pietrzak, Mariusz D.; Malinowska, Elżbieta

2018-05-01

The presented study covers the development and examination of the analytical performance of surface plasmon resonance-based (SPR) DNA biosensors dedicated to the detection of a model target oligonucleotide sequence. To this end, various strategies for immobilizing DNA probes on gold transducers were tested. Besides the typical approaches, chemisorption of thiolated ssDNA (DNA-thiol) and physisorption of non-functionalized oligonucleotides, a relatively new method based on chemisorption of dithiocarbamate-functionalized ssDNA (DNA-DTC) was applied for the first time to prepare a DNA-based SPR biosensor. Special emphasis was put on the correlation between the method of DNA immobilization and the composition of the obtained receptor layer. The studies focused on examining the capability of the developed receptor layers to interact with both target DNA and DNA-functionalized AuNPs. It was found that the detection limit of the target DNA sequence (27 nb length) depends on the strategy of probe immobilization and the backfilling method, and in the best case it amounted to 0.66 nM. Moreover, the application of ssDNA-functionalized gold nanoparticles (AuNPs) as plasmonic labels for secondary enhancement of the SPR response is presented. The influence of the spatial organization and surface density of a receptor layer on its ability to interact with DNA-functionalized AuNPs is discussed. Owing to the best compatibility of receptors immobilized via DTC chemisorption, 1.47 ± 0.4 × 10¹² molecules·cm⁻² (with the calculated area occupied by a single nanoparticle label of 132.7 nm²), DNA chemisorption based on DTCs is identified as especially promising for DNA biosensors utilizing indirect detection in competitive assays.
Genetics, structure, and prevalence of FP967 (CDC Triffid) T-DNA in flax.

PubMed

Young, Lester; Hammerlindl, Joseph; Babic, Vivijan; McLeod, Jamille; Sharpe, Andrew; Matsalla, Chad; Bekkaoui, Faouzi; Marquess, Leigh; Booker, Helen M

2015-01-01

The detection of T-DNA from a genetically modified flaxseed line (FP967, formally CDC Triffid) in a shipment of Canadian flaxseed exported to Europe resulted in a large decrease in the amount of flax planted in Canada. The Canadian flaxseed industry undertook major changes to ensure the removal of FP967 from the supply chain. This study aimed to resolve the genetics and structure of the FP967 transfer DNA (T-DNA). The FP967 T-DNA is thought to be inserted at a single genomic locus. The junction between the T-DNA and genomic DNA consisted of two inverted Right Borders with no Left Border (LB) flanking genomic DNA sequences recovered. This information was used to develop an event-specific quantitative PCR (qPCR) assay. This assay and an existing assay specific to the T-DNA construct were used to determine the genetics and prevalence of the FP967 T-DNA. These data supported the hypothesis that the T-DNA is present at a single location in the genome. The FP967 T-DNA is present at a low level (between 0.01 and 0.1%) in breeder seed lots from 2009 and 2010. However, none of the 11,000 and 16,000 lines selected for advancement through the Flax Breeding Program in 2010 and 2011, respectively, tested positive for the FP967 T-DNA. Most of the FP967 T-DNA sequence was resolved via PCR cloning and next generation sequencing. A 3,720 bp duplication of an internal portion of the T-DNA (including a Right Border) was discovered between the flanking genomic DNA and the LB. An event-specific assay, SAT2-LB, was developed for the junction between this repeat and the LB.
SPRi-Based Biosensing Platforms for Detection of Specific DNA Sequences Using Thiolate and Dithiocarbamate Assemblies.

PubMed

Drozd, Marcin; Pietrzak, Mariusz D; Malinowska, Elżbieta

2018-01-01

The presented study covers the development and examination of the analytical performance of surface plasmon resonance-based (SPR) DNA biosensors dedicated to the detection of a model target oligonucleotide sequence. To this end, various strategies for immobilizing DNA probes on gold transducers were tested. Besides the typical approaches, chemisorption of thiolated ssDNA (DNA-thiol) and physisorption of non-functionalized oligonucleotides, a relatively new method based on chemisorption of dithiocarbamate-functionalized ssDNA (DNA-DTC) was applied for the first time to prepare a DNA-based SPR biosensor. Special emphasis was put on the correlation between the method of DNA immobilization and the composition of the obtained receptor layer. The studies focused on examining the capability of the developed receptor layers to interact with both target DNA and DNA-functionalized AuNPs. It was found that the detection limit of the target DNA sequence (27 nb length) depends on the strategy of probe immobilization and the backfilling method, and in the best case it amounted to 0.66 nM. Moreover, the application of ssDNA-functionalized gold nanoparticles (AuNPs) as plasmonic labels for secondary enhancement of the SPR response is presented. The influence of the spatial organization and surface density of a receptor layer on its ability to interact with DNA-functionalized AuNPs is discussed. Owing to the best compatibility of receptors immobilized via DTC chemisorption, 1.47 ± 0.4 × 10¹² molecules·cm⁻² (with the calculated area occupied by a single nanoparticle label of ~132.7 nm²), DNA chemisorption based on DTCs is identified as especially promising for DNA biosensors utilizing indirect detection in competitive assays.
New tool to assemble repetitive regions using next-generation sequencing data

NASA Astrophysics Data System (ADS)

Kuśmirek, Wiktor; Nowak, Robert M.; Neumann, Łukasz

2017-08-01

Next generation sequencing techniques produce a large amount of sequencing data. Some parts of the genome are composed of repetitive DNA sequences, which are very problematic for existing genome assemblers. We propose a modification of the DNA assembly algorithm that uses the relative frequency of reads to properly reconstruct repetitive sequences. The new approach was implemented and tested; as a demonstration of the capability of our software, we present some results for model organisms. The implementation uses a three-layer software architecture, in which the presentation layer, data processing layer, and data storage layer are kept separate. Source code, as well as a demo application with a web interface and additional data, is available at the project web page: http://dnaasm.sourceforge.net.
Nanopore sequencing technology: a new route for the fast detection of unauthorized GMO.

PubMed

Fraiture, Marie-Alice; Saltykova, Assia; Hoffman, Stefan; Winand, Raf; Deforce, Dieter; Vanneste, Kevin; De Keersmaecker, Sigrid C J; Roosens, Nancy H C

2018-05-21

In order to strengthen the current genetically modified organism (GMO) detection system for unauthorized GMO, we have recently developed a new workflow based on DNA walking to amplify unknown sequences surrounding a known DNA region. This DNA walking is performed on transgenic elements, commonly found in GMO, that were earlier detected by real-time PCR (qPCR) screening. Previously, we have demonstrated the ability of this approach to detect unauthorized GMO via the identification of unique transgene flanking regions and the unnatural associations of elements from the transgenic cassette. In the present study, we investigate the feasibility to integrate the described workflow with the MinION Next-Generation-Sequencing (NGS). The MinION sequencing platform can provide long read-lengths and deal with heterogenic DNA libraries, allowing for rapid and efficient delivery of sequences of interest. In addition, the ability of this NGS platform to characterize unauthorized and unknown GMO without any a priori knowledge has been assessed.
Phylogenetic analysis of Sicilian goats reveals a new mtDNA lineage.

PubMed

Sardina, M T; Ballester, M; Marmi, J; Finocchiaro, R; van Kaam, J B C H M; Portolano, B; Folch, J M

2006-08-01

The mitochondrial hypervariable region 1 (HVR1) sequence of 67 goats belonging to the Girgentana, Maltese and Derivata di Siria breeds was partially sequenced in order to present the first phylogenetic characterization of Sicilian goat breeds. These sequences were compared with published sequences of Indian and Pakistani domestic goats and wild goats. Mitochondrial lineage A was observed in most of the Sicilian goats. However, three Girgentana haplotypes were highly divergent from the Capra hircus clade, indicating that a new mtDNA lineage in domestic goats was found.
Alternative polyadenylation of the gene transcripts encoding a rat DNA polymerase beta.

PubMed

Konopiński, R; Nowak, R; Siedlecki, J A

1996-10-17

Rat cells produce two different transcripts of DNA polymerase beta (beta-Pol). The low-molecular-weight transcript (1.4 kb) was already sequenced. We report here the cloning and sequencing of the full-length cDNA, corresponding to the high-molecular-weight (HMW) transcript (4.0 kb) of beta-Pol. Sequence data strongly suggest that both transcripts are produced from a single gene by alternative polyadenylation. The HMW transcript contains the entire 1.4 kb transcript sequence and additional 2.2 kb on the 3' end. The 3' UTR of the HMW transcript contains some regulatory sequences which are not present in the 1.4-kb transcript. The A + U-rich fragment and (GU)21 sequence are believed to influence the stability of the mRNA. The functional significance of the A-rich region locally destabilizing double-stranded secondary structure remains unknown.
Evolution of EF-hand calcium-modulated proteins. III. Exon sequences confirm most dendrograms based on protein sequences: calmodulin dendrograms show significant lack of parallelism

NASA Technical Reports Server (NTRS)

Nakayama, S.; Kretsinger, R. H.

1993-01-01

In the first report in this series we presented dendrograms based on 152 individual proteins of the EF-hand family. In the second we used sequences from 228 proteins, containing 835 domains, and showed that eight of the 29 subfamilies are congruent and that the EF-hand domains of the remaining 21 subfamilies have diverse evolutionary histories. In this study we have computed dendrograms within and among the EF-hand subfamilies using the encoding DNA sequences. In most instances the dendrograms based on protein and on DNA sequences are very similar. Significant differences between protein and DNA trees for calmodulin remain unexplained. In our fourth report we evaluate the sequences and the distribution of introns within the EF-hand family and conclude that exon shuffling did not play a significant role in its evolution.
Modular probes for enriching and detecting complex nucleic acid sequences

NASA Astrophysics Data System (ADS)

Wang, Juexiao Sherry; Yan, Yan Helen; Zhang, David Yu

2017-12-01

Complex DNA sequences are difficult to detect and profile, but are important contributors to human health and disease. Existing hybridization probes lack the capability to selectively bind and enrich hypervariable, long or repetitive sequences. Here, we present a generalized strategy for constructing modular hybridization probes (M-Probes) that overcomes these challenges. We demonstrate that M-Probes can tolerate sequence variations of up to 7 nt at prescribed positions while maintaining single nucleotide sensitivity at other positions. M-Probes are also shown to be capable of sequence-selectively binding a continuous DNA sequence of more than 500 nt. Furthermore, we show that M-Probes can detect genes with triplet repeats exceeding a programmed threshold. As a demonstration of this technology, we have developed a hybrid capture method to determine the exact triplet repeat expansion number in the Huntington's gene of genomic DNA using quantitative PCR.
Enhancing the detection of barcoded reads in high throughput DNA sequencing data by controlling the false discovery rate.

PubMed

Buschmann, Tilo; Zhang, Rong; Brash, Douglas E; Bystrykh, Leonid V

2014-08-07

DNA barcodes are short unique sequences used to label DNA or RNA-derived samples in multiplexed deep sequencing experiments. During the demultiplexing step, barcodes must be detected and their position identified. In some cases (e.g., with PacBio SMRT), the position of the barcode and DNA context is not well defined. Many reads start inside the genomic insert so that adjacent primers might be missed. The matter is further complicated by coincidental similarities between barcode sequences and reference DNA. Therefore, a robust strategy is required in order to detect barcoded reads and avoid a large number of false positives or negatives. For mass inference problems such as this one, false discovery rate (FDR) methods are powerful and balanced solutions. Since existing FDR methods cannot be applied to this particular problem, we present an adapted FDR method that is suitable for the detection of barcoded reads as well as suggest possible improvements. In our analysis, barcode sequences showed high rates of coincidental similarities with the Mus musculus reference DNA. This problem became more acute when the length of the barcode sequence decreased and the number of barcodes in the set increased. The method presented in this paper controls the tail area-based false discovery rate to distinguish between barcoded and unbarcoded reads. This method helps to establish the highest acceptable minimal distance between reads and barcode sequences. In a proof of concept experiment we correctly detected barcodes in 83% of the reads with a precision of 89%. Sensitivity improved to 99% at 99% precision when the adjacent primer sequence was incorporated in the analysis. The analysis was further improved using a paired end strategy. Following an analysis of the data for sequence variants induced in the Atp1a1 gene of C57BL/6 murine melanocytes by ultraviolet light and conferring resistance to ouabain, we found no evidence of cross-contamination of DNA material between samples.
Our method offers a proper quantitative treatment of the problem of detecting barcoded reads in a noisy sequencing environment. It is based on false discovery rate statistics, which allow a proper trade-off between sensitivity and precision to be chosen.
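The idea of choosing the highest acceptable read-to-barcode distance under a tail-area FDR bound can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of precomputed minimal distances, the null sample (distances obtained from sequences known to carry no barcode) and the conservative default `pi0 = 1.0` are all assumptions made for illustration.

```python
def tail_fdr(observed, null, threshold, pi0=1.0):
    """Estimate the tail-area FDR when every read whose minimal distance
    to the barcode set is <= threshold is called 'barcoded'.

    observed: minimal distances for the real reads (hypothetical input)
    null:     minimal distances for unbarcoded sequences (hypothetical input)
    pi0:      assumed proportion of unbarcoded reads (1.0 is conservative)
    """
    obs_frac = sum(d <= threshold for d in observed) / len(observed)
    null_frac = sum(d <= threshold for d in null) / len(null)
    if obs_frac == 0:
        return 0.0  # nothing is called, so there are no false discoveries
    return min(1.0, pi0 * null_frac / obs_frac)


def max_acceptable_distance(observed, null, alpha=0.05, pi0=1.0):
    """Return the largest observed distance whose estimated tail-area FDR
    stays at or below alpha, or None if no threshold qualifies."""
    best = None
    for t in sorted(set(observed)):
        if tail_fdr(observed, null, t, pi0) <= alpha:
            best = t
    return best
```

Raising `alpha` admits larger distances (more sensitivity, less precision), which is the trade-off the abstract refers to.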
Rapid and Easy Protocol for Quantification of Next-Generation Sequencing Libraries.

PubMed

Hawkins, Steve F C; Guest, Paul C

2018-01-01

The emergence of next-generation sequencing (NGS) over the last 10 years has increased the efficiency of DNA sequencing in terms of speed, ease, and price. However, the exact quantification of an NGS library is crucial in order to obtain good data on sequencing platforms developed by the current market leader Illumina. Different approaches for DNA quantification are available currently and the most commonly used are based on analysis of the physical properties of the DNA through spectrophotometric or fluorometric methods. Although these methods are technically simple, they do not allow exact quantification as can be achieved using a real-time quantitative PCR (qPCR) approach. A qPCR protocol for DNA quantification with applications in NGS library preparation studies is presented here. This can be applied in various fields of study such as medical disorders resulting from nutritional programming disturbances.
Early history of European domestic cattle as revealed by ancient DNA.

PubMed

Bollongino, R; Edwards, C J; Alt, K W; Burger, J; Bradley, D G

2006-03-22

We present an extensive ancient DNA analysis of mainly Neolithic cattle bones sampled from archaeological sites along the route of Neolithic expansion, from Turkey to North-Central Europe and Britain. We place this first reasonable population sample of Neolithic cattle mitochondrial DNA sequence diversity in context to illustrate the continuity of haplotype variation patterns from the first European domestic cattle to the present. Interestingly, the dominant Central European pattern, a starburst phylogeny around the modal sequence, T3, has a Neolithic origin, and the reduced diversity within this cluster in the ancient samples accords with their shorter history of post-domestic accumulation of mutation.
Development of Active DNA Control Technique for DNA Sequencer With a Solid-state Nanopore

NASA Astrophysics Data System (ADS)

Akahori, Rena; Harada, Kunio; Goto, Yusuke; Yanagi, Itaru; Yokoi, Takahide; Oura, Takeshi; Shibahara, Masashi; Takeda, Ken-Ichi

We have developed a technique that can control the arbitrary speeds of DNA passing through a solid-state nanopore of a DNA sequencer. For this active DNA control technique, we used a DNA-immobilized Si probe, larger than the membrane with a nanopore, and used a piezoelectric actuator and stepper motor to drive the probe. This probe enables a user to adjust the relative position between the nanopore and DNA immobilized on the probe without the need for precise lateral control. In this presentation, we demonstrate how DNA (block copolymer ([(dT)25-(dC)25-(dA)50]m)), immobilized on the probe, slid through a nanopore and was pulled out using the active DNA control technique. As the DNA-immobilized probe was being pulled out, we obtained various ion-current signal levels corresponding to the number of different nucleotides in a single strand of DNA.
Quantitative analysis and prediction of G-quadruplex forming sequences in double-stranded DNA

PubMed Central

Kim, Minji; Kreig, Alex; Lee, Chun-Ying; Rube, H. Tomas; Calvert, Jacob; Song, Jun S.; Myong, Sua

2016-01-01

G-quadruplex (GQ) is a four-stranded DNA structure that can be formed in guanine-rich sequences. GQ structures have been proposed to regulate diverse biological processes including transcription, replication, translation and telomere maintenance. Recent studies have demonstrated the existence of GQ DNA in live mammalian cells and a significant number of potential GQ forming sequences in the human genome. We present a systematic and quantitative analysis of GQ folding propensity on a large set of 438 GQ forming sequences in double-stranded DNA by integrating fluorescence measurement, single-molecule imaging and computational modeling. We find that short minimum loop length and the thymine base are two main factors that lead to high GQ folding propensity. Linear and Gaussian process regression models further validate that the GQ folding potential can be predicted with high accuracy based on the loop length distribution and the nucleotide content of the loop sequences. Our study provides important new parameters that can inform the evaluation and classification of putative GQ sequences in the human genome. PMID:27095201
Mitochondrial DNA mutations in single human blood cells.

PubMed

Yao, Yong-Gang; Kajigaya, Sachiko; Young, Neal S

2015-09-01

Determination of mitochondrial DNA (mtDNA) sequences from extremely small amounts of DNA extracted from tissue of limited amounts and/or degraded samples is frequently employed in medical, forensic, and anthropologic studies. Polymerase chain reaction (PCR) amplification followed by DNA cloning is a routine method, especially to examine heteroplasmy of mtDNA mutations. In this review, we compare the mtDNA mutation patterns detected by three different sequencing strategies. Cloning and sequencing methods that are based on PCR amplification of DNA extracted from either single cells or pooled cells yield a high frequency of mutations, partly due to the artifacts introduced by PCR and/or the DNA cloning process. Direct sequencing of PCR product which has been amplified from DNA in individual cells is able to detect the low levels of mtDNA mutations present within a cell. We further summarize the findings in our recent studies that utilized this single cell method to assay mtDNA mutation patterns in different human blood cells. Our data show that many somatic mutations observed in the end-stage differentiated cells are found in hematopoietic stem cells (HSCs) and progenitors within the CD34(+) cell compartment. Accumulation of mtDNA variations in the individual CD34(+) cells is affected by both aging and family genetic background. Granulocytes harbor higher numbers of mutations compared with the other cells, such as CD34(+) cells and lymphocytes. Serial assessment of mtDNA mutations in a population of single CD34(+) cells obtained from the same donor over time suggests stability of some somatic mutations. CD34(+) cell clones from a donor marked by specific mtDNA somatic mutations can be found in the recipient after transplantation. The significance of these findings is discussed in terms of the lineage tracing of HSCs, aging effect on accumulation of mtDNA mutations and the usage of mtDNA sequence in forensic identification. Copyright © 2015 Elsevier B.V. All rights reserved.
Separation and parallel sequencing of the genomes and transcriptomes of single cells using G&T-seq.

PubMed

Macaulay, Iain C; Teng, Mabel J; Haerty, Wilfried; Kumar, Parveen; Ponting, Chris P; Voet, Thierry

2016-11-01

Parallel sequencing of a single cell's genome and transcriptome provides a powerful tool for dissecting genetic variation and its relationship with gene expression. Here we present a detailed protocol for G&T-seq, a method for separation and parallel sequencing of genomic DNA and full-length polyA(+) mRNA from single cells. We provide step-by-step instructions for the isolation and lysis of single cells; the physical separation of polyA(+) mRNA from genomic DNA using a modified oligo-dT bead capture and the respective whole-transcriptome and whole-genome amplifications; and library preparation and sequence analyses of these amplification products. The method allows the detection of thousands of transcripts in parallel with the genetic variants captured by the DNA-seq data from the same single cell. G&T-seq differs from other currently available methods for parallel DNA and RNA sequencing from single cells, as it involves physical separation of the DNA and RNA and does not require bespoke microfluidics platforms. The process can be implemented manually or through automation. When performed manually, paired genome and transcriptome sequencing libraries from eight single cells can be produced in ∼3 d by researchers experienced in molecular laboratory work. For users with experience in the programming and operation of liquid-handling robots, paired DNA and RNA libraries from 96 single cells can be produced in the same time frame. Sequence analysis and integration of single-cell G&T-seq DNA and RNA data requires a high level of bioinformatics expertise and familiarity with a wide range of informatics tools.

Transcriptome analysis by strand-specific sequencing of complementary DNA

PubMed Central

Parkhomchuk, Dmitri; Borodina, Tatiana; Amstislavskiy, Vyacheslav; Banaru, Maria; Hallen, Linda; Krobitsch, Sylvia; Lehrach, Hans; Soldatov, Alexey

2009-01-01

High-throughput complementary DNA sequencing (RNA-Seq) is a powerful tool for whole-transcriptome analysis, supplying information about a transcript's expression level and structure. However, it is difficult to determine the polarity of transcripts, and therefore identify which strand is transcribed. Here, we present a simple cDNA sequencing protocol that preserves information about a transcript's direction. Using Saccharomyces cerevisiae and mouse brain transcriptomes as models, we demonstrate that knowing the transcript's orientation allows more accurate determination of the structure and expression of genes. It also helps to identify new genes and enables studying promoter-associated and antisense transcription. The transcriptional landscapes we obtained are available online. PMID:19620212
Transcriptome analysis by strand-specific sequencing of complementary DNA.

PubMed

Parkhomchuk, Dmitri; Borodina, Tatiana; Amstislavskiy, Vyacheslav; Banaru, Maria; Hallen, Linda; Krobitsch, Sylvia; Lehrach, Hans; Soldatov, Alexey

2009-10-01

High-throughput complementary DNA sequencing (RNA-Seq) is a powerful tool for whole-transcriptome analysis, supplying information about a transcript's expression level and structure. However, it is difficult to determine the polarity of transcripts, and therefore identify which strand is transcribed. Here, we present a simple cDNA sequencing protocol that preserves information about a transcript's direction. Using Saccharomyces cerevisiae and mouse brain transcriptomes as models, we demonstrate that knowing the transcript's orientation allows more accurate determination of the structure and expression of genes. It also helps to identify new genes and enables studying promoter-associated and antisense transcription. The transcriptional landscapes we obtained are available online.
The full mitochondrial genome sequence of Raillietina tetragona from chicken (Cestoda: Davaineidae).

PubMed

Liang, Jian-Ying; Lin, Rui-Qing

2016-11-01

In the present study, the complete mitochondrial DNA (mtDNA) sequence of Raillietina tetragona was sequenced and its gene content and genome organization were compared with those of other tapeworms. The complete mt genome sequence of R. tetragona is 14,444 bp in length. It contains 12 protein-coding genes, two ribosomal RNA genes, 22 transfer RNA genes, and two non-coding regions. All genes are transcribed in the same direction and have a nucleotide composition high in A and T. The A + T content of the complete mt genome is 71.4% for R. tetragona. The R. tetragona mt genome sequence provides a novel mtDNA marker for studying the molecular epidemiology and population genetics of Raillietina and has implications for the molecular diagnosis of chicken cestodosis caused by Raillietina.
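An A + T percentage such as the one reported above is a simple base count over the genome sequence. The sketch below shows one way to compute it; the function name and the decision to ignore ambiguous bases such as N are assumptions for illustration, not details taken from the study.

```python
def at_content(seq):
    """Return the A + T percentage of a DNA sequence.

    Only unambiguous bases (A, C, G, T) are counted; ambiguity codes
    such as N are ignored (an assumption of this sketch).
    """
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    at = sum(base in "AT" for base in counted)
    return 100.0 * at / len(counted)
```

For a full mitochondrial genome the same function would be applied to the sequence string read from a FASTA file.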
The TGA codons are present in the open reading frame of selenoprotein P cDNA

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hill, K.E.; Lloyd, R.S.; Read, R.

1991-03-11

The TGA codon in DNA has been shown to direct incorporation of selenocysteine into protein. Several proteins from bacteria and animals contain selenocysteine in their primary structures. Each of the cDNA clones of these selenoproteins contains one TGA codon in the open reading frame which corresponds to the selenocysteine in the protein. A cDNA clone for selenoprotein P (SeP), obtained from a γZAP rat liver library, was sequenced by the dideoxy termination method. The correct reading frame was determined by comparison of the deduced amino acid sequence with the amino acid sequence of several peptides from SeP. Using SeP labelled with ⁷⁵Se in vivo, the selenocysteine content of the peptides was verified by the collection of carboxymethylated ⁷⁷Se-selenocysteine as it eluted from the amino acid analyzer and determination of the radioactivity contained in the collected samples. Ten TGA codons are present in the open reading frame of the cDNA. Peptide fragmentation studies and the deduced sequence indicate that selenium-rich regions are located close to the carboxy terminus. Nine of the 10 selenocysteines are located in the terminal 26% of the sequence with four in the terminal 15 amino acids. The deduced sequence codes for a protein of 385 amino acids. Cleavage of the signal peptide gives the mature protein with 366 amino acids and a calculated mol wt of 41,052 Da. Searches of PIR and SWISSPROT protein databases revealed no similarity with glutathione peroxidase or other selenoproteins.
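Counting TGA codons in an open reading frame only makes sense in frame, because out-of-frame TGA triplets do not encode anything. The following minimal sketch makes that distinction explicit; the function name and the 1-based codon numbering are illustrative choices, and the example sequences in the usage note are invented rather than taken from the SeP cDNA.

```python
def in_frame_tga_positions(orf):
    """Return the 1-based codon positions of in-frame TGA codons.

    orf: DNA string starting at the first base of the reading frame;
         its length must be a multiple of three.
    """
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = [orf[i:i + 3] for i in range(0, len(orf), 3)]
    return [index + 1 for index, codon in enumerate(codons) if codon == "TGA"]
```

For example, `in_frame_tga_positions("ATGTGAGCTTGATAA")` reports codons 2 and 4, whereas `"ATGATGAAATAA"` contains the triplet TGA twice but never in frame, so the result is empty.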
RNA from the 5' end of the R2 retrotransposon controls R2 protein binding to and cleavage of its DNA target site.

PubMed

Christensen, Shawn M; Ye, Junqiang; Eickbush, Thomas H

2006-11-21

Non-LTR retrotransposons insert into eukaryotic genomes by target-primed reverse transcription (TPRT), a process in which cleaved DNA targets are used to prime reverse transcription of the element's RNA transcript. Many of the steps in the integration pathway of these elements can be characterized in vitro for the R2 element because of the rigid sequence specificity of R2 for both its DNA target and its RNA template. R2 retrotransposition involves identical subunits of the R2 protein bound to different DNA sequences upstream and downstream of the insertion site. The key determinant regulating which DNA-binding conformation the protein adopts was found to be a 320-nt RNA sequence from near the 5' end of the R2 element. In the absence of this 5' RNA the R2 protein binds DNA sequences upstream of the insertion site, cleaves the first DNA strand, and conducts TPRT when RNA containing the 3' untranslated region of the R2 transcript is present. In the presence of the 320-nt 5' RNA, the R2 protein binds DNA sequences downstream of the insertion site. Cleavage of the second DNA strand by the downstream subunit does not appear to occur until after the 5' RNA is removed from this subunit. We postulate that the removal of the 5' RNA normally occurs during reverse transcription, and thus provides a critical temporal link to first- and second-strand DNA cleavage in the R2 retrotransposition reaction.
The wheat cytochrome oxidase subunit II gene has an intron insert and three radical amino acid changes relative to maize

PubMed Central

Bonen, Linda; Boer, Poppo H.; Gray, Michael W.

1984-01-01

We have determined the sequence of the wheat mitochondrial gene for cytochrome oxidase subunit II (COII) and find that its derived protein sequence differs from that of maize at only three amino acid positions. Unexpectedly, all three replacements are non-conservative ones. The wheat COII gene has a highly-conserved intron at the same position as in maize, but the wheat intron is 1.5 times longer because of an insert relative to its maize counterpart. Hybridization analysis of mitochondrial DNA from rye, pea, broad bean and cucumber indicates strong sequence conservation of COII coding sequences among all these higher plants. However, only rye and maize mitochondrial DNA show homology with wheat COII intron sequences and rye alone with intron-insert sequences. We find that a sequence identical to the region of the 5' exon corresponding to the transmembrane domain of the COII protein is present at a second genomic location in wheat mitochondria. These variations in COII gene structure and size, as well as the presence of repeated COII sequences, illustrate, at the DNA sequence level, factors which contribute to higher plant mitochondrial DNA diversity and complexity. PMID:16453565
Contamination of sequence databases with adaptor sequences

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yoshikawa, Takeo; Sanders, A.R.; Detera-Wadleigh, S.D.

Because of the exponential increase in the amount of DNA sequences being added to the public databases on a daily basis, it has become imperative to identify sources of contamination rapidly. Previously, contaminations of sequence databases have been reported to alert the scientific community to the problem. These contaminations can be divided into two categories. The first category comprises host sequences that have been difficult for submitters to manage or control. Examples include anomalous sequences derived from Escherichia coli, which are inserted into the chromosomes (and plasmids) of the bacterial hosts. Insertion sequences are highly mobile and are capable of transposing themselves into plasmids during cloning manipulation. Another example of the first category is the infection with yeast genomic DNA or with bacterial DNA of some commercially available cDNA libraries from Clontech. The second category of database contamination is due to the inadvertent inclusion of nonhost sequences. This category includes incorporation of cloning-vector sequences and multicloning sites in the database submission. M13-derived artifacts have been common, since M13-based vectors have been widely used for subcloning DNA fragments. Recognizing this problem, the National Center for Biotechnology Information (NCBI) started to screen, in April 1994, all sequences directly submitted to GenBank, against a set of vector data retrieved from GenBank by use of key-word searches, such as "vector." In this report, we present evidence for another sequence artifact that is widespread but that, to our knowledge, has not yet been reported. 11 refs., 1 tab.
Fine Dissection of Human Mitochondrial DNA Haplogroup HV Lineages Reveals Paleolithic Signatures from European Glacial Refugia

PubMed Central

Sarno, Stefania; Sevini, Federica; Vianello, Dario; Tamm, Erika; Metspalu, Ene; van Oven, Mannis; Hübner, Alexander; Sazzini, Marco; Franceschi, Claudio; Pettener, Davide; Luiselli, Donata

2015-01-01

Genetic signatures from the Paleolithic inhabitants of Eurasia can be traced from the early divergent mitochondrial DNA lineages still present in contemporary human populations. Previous studies already suggested a pre-Neolithic diffusion of mitochondrial haplogroup HV*(xH,V) lineages, a relatively rare class of mtDNA types that includes parallel branches mainly distributed across Europe and West Asia with a certain degree of structure. Up till now, variation within haplogroup HV was addressed mainly by analyzing sequence data from the mtDNA control region, except for specific sub-branches, such as HV4 or the widely distributed haplogroups H and V. In this study, we present a revised HV topology based on full mtDNA genome data, and we include a comprehensive dataset consisting of 316 complete mtDNA sequences including 60 new samples from the Italian peninsula, a previously underrepresented geographic area. We highlight points of instability in the particular topology of this haplogroup, reconstructed with BEAST-generated trees and networks. We also confirm a major lineage expansion that probably followed the Late Glacial Maximum and preceded Neolithic population movements. We finally observe that Italy harbors a reservoir of mtDNA diversity, with deep-rooting HV lineages often related to sequences present in the Caucasus and the Middle East. The resulting hypothesis of a glacial refugium in Southern Italy has implications for the understanding of late Paleolithic population movements and is discussed within the archaeological cultural shifts that occurred over the entire continent. PMID:26640946
Relatively well preserved DNA is present in the crystal aggregates of fossil bones

PubMed Central

Salamon, Michal; Tuross, Noreen; Arensburg, Baruch; Weiner, Steve

2005-01-01

DNA from fossil human bones could provide invaluable information about population migrations, genetic relations between different groups and the spread of diseases. The use of ancient DNA from bones to study the genetics of past populations is, however, very often compromised by the altered and degraded state of preservation of the extracted material. The universally observed postmortem degradation, together with the real possibility of contamination with modern human DNA, makes the acquisition of reliable data, from humans in particular, very difficult. We demonstrate that relatively well preserved DNA is occluded within clusters of intergrown bone crystals that are resistant to disaggregation by the strong oxidant NaOCl. We obtained reproducible authentic sequences from both modern and ancient animal bones, including humans, from DNA extracts of crystal aggregates. The treatment with NaOCl also minimizes the possibility of modern DNA contamination. We thus demonstrate the presence of a privileged niche within fossil bone, which contains DNA in a better state of preservation than the DNA present in the total bone. This counterintuitive approach to extracting relatively well preserved DNA from bones significantly improves the chances of obtaining authentic ancient DNA sequences, especially from human bones. PMID:16162675
Few mitochondrial DNA sequences are inserted into the turkey (Meleagris gallopavo) nuclear genome: evolutionary analyses and informativity in the domestic lineage.

PubMed

Schiavo, G; Strillacci, M G; Ribani, A; Bovo, S; Roman-Ponce, S I; Cerolini, S; Bertolini, F; Bagnato, A; Fontanesi, L

2018-06-01

Mitochondrial DNA (mtDNA) insertions have been detected in the nuclear genome of many eukaryotes. These sequences are pseudogenes originated by horizontal transfer of mtDNA fragments into the nuclear genome, producing nuclear DNA sequences of mitochondrial origin (numt). In this study we determined the frequency and distribution of mtDNA-originated pseudogenes in the turkey (Meleagris gallopavo) nuclear genome. The turkey reference genome (Turkey_2.01) was aligned with the reference linearized mtDNA sequence using last. A total of 32 numt sequences (corresponding to 18 numt regions derived by unique insertional events) were identified in the turkey nuclear genome (size ranging from 66 to 1415 bp; identity against the modern turkey mtDNA corresponding region ranging from 62% to 100%). Numts were distributed in nine chromosomes and in one scaffold. They derived from parts of 10 mtDNA protein-coding genes, ribosomal genes, the control region and 10 tRNA genes. Seven numt regions reported in the turkey genome were identified in orthologous positions in the Gallus gallus genome and therefore were present in the ancestral genome that in the Cretaceous originated the lineages of the modern crown Galliformes. Five recently integrated turkey numts were validated by PCR in 168 turkeys of six different domestic populations. None of the analysed numts were polymorphic (i.e. absence of the inserted sequence, as reported in numts of recent integration in other species), suggesting that the reticulate speciation model is not useful for explaining the origin of the domesticated turkey lineage. © 2018 Stichting International Foundation for Animal Genetics.
Selective DNA demethylation by fusion of TDG with a sequence-specific DNA-binding domain

PubMed Central

Gregory, David J.; Mikhaylova, Lyudmila; Fedulov, Alexey V.

2012-01-01

Our ability to selectively manipulate gene expression by epigenetic means is limited, as there is no approach for targeted reactivation of epigenetically silenced genes, in contrast to what is available for selective gene silencing. We aimed to develop a tool for selective transcriptional activation by DNA demethylation. Here we present evidence that direct targeting of thymine-DNA-glycosylase (TDG) to specific sequences in the DNA can result in local DNA demethylation at potential regulatory sequences and lead to enhanced gene induction. When TDG was fused to a well-characterized DNA-binding domain [the Rel-homology domain (RHD) of NFκB], we observed decreased DNA methylation and increased transcriptional response to unrelated stimulus of inducible nitric oxide synthase (NOS2). The effect was not seen for control genes lacking either RHD-binding sites or high levels of methylation, nor in control mock-transduced cells. Specific reactivation of epigenetically silenced genes may thus be achievable by this approach, which provides a broadly useful strategy to further our exploration of biological mechanisms and to improve control over the epigenome. PMID:22419066
DNA microdevice for electrochemical detection of Escherichia coli 0157:H7 molecular markers.

PubMed

Berganza, J; Olabarria, G; García, R; Verdoy, D; Rebollo, A; Arana, S

2007-04-15

An electrochemical DNA sensor based on the hybridization recognition of a single-stranded DNA (ssDNA) probe immobilized onto a gold electrode to its complementary ssDNA is presented. The DNA probe is bound to the gold electrode surface by using self-assembled monolayer (SAM) technology. An optimized mixed SAM with a blocking molecule preventing the nonspecific adsorption on the electrode surface has been prepared. In this paper, a DNA biosensor is designed by means of the immobilization of a single-stranded DNA probe on an electrochemical transducer surface to recognize specifically Escherichia coli (E. coli) 0157:H7 complementary target DNA sequence via cyclic voltammetry experiments. The 21-mer DNA probe including a C6 alkanethiol group at the 5' phosphate end has been synthesized to form the SAM onto the gold surface through the gold-sulfur bond. The goal of this paper has been to design, characterise and optimise an electrochemical DNA sensor. In order to investigate the oligonucleotide probe immobilization and the hybridization detection, experiments with different concentrations of DNA and mismatch sequences have been performed. This microdevice has demonstrated the suitability of oligonucleotide self-assembled monolayers (SAMs) on gold as an immobilization method. The DNA probes deposited on the gold surface have been functional and able to detect changes in the base sequence of a 21-mer oligonucleotide.
Fascioliasis transmission by Lymnaea neotropica confirmed by nuclear rDNA and mtDNA sequencing in Argentina.

PubMed

Mera y Sierra, Roberto; Artigas, Patricio; Cuervo, Pablo; Deis, Erika; Sidoti, Laura; Mas-Coma, Santiago; Bargues, Maria Dolores

2009-12-03

Fascioliasis is widespread in livestock in Argentina. Among activities included in a long-term initiative to ascertain which are the fascioliasis areas of most concern, studies were performed in a recreational farm, including liver fluke infection in different domestic animal species, classification of the lymnaeid vector and verification of natural transmission of fascioliasis by identification of the intramolluscan trematode larval stages found in naturally infected snails. The high prevalences in the domestic animals appeared related to only one lymnaeid species present. Lymnaeid and trematode classification was verified by means of nuclear ribosomal DNA and mitochondrial DNA marker sequencing. Complete sequences of 18S rRNA gene and rDNA ITS-2 and ITS-1, and a fragment of the mtDNA cox1 gene demonstrate that the Argentinian lymnaeid belongs to the species Lymnaea neotropica. Redial larval stages found in a L. neotropica specimen were ascribed to Fasciola hepatica after analysis of the complete ITS-1 sequence. The finding of L. neotropica is the first of this lymnaeid species not only in Argentina but also in Southern Cone countries. The total absence of nucleotide differences between the sequences of specimens from Argentina and the specimens from the Peruvian type locality at the levels of rDNA 18S, ITS-2 and ITS-1, and the only one mutation at the mtDNA cox1 gene suggest a very recent spread. The ecological characteristics of this lymnaeid, living in small, superficial water collections frequented by livestock, suggest that it may be carried from one place to another by remaining in dried mud stuck to the feet of transported animals. The presence of L. neotropica adds pronounced complexity to the transmission and epidemiology of fascioliasis in Argentina, due to the great difficulties in distinguishing, by traditional malacological methods, between the three similar lymnaeid species of the controversial Galba/Fossaria group present in this country: L. viatrix, Galba truncatula and L. neotropica. It also poses a problem with regard to the use, for lymnaeid vector species discrimination, of several molecular techniques which do not show sufficient accuracy, as those relying on the 18S rRNA gene or parts of it, because both L. neotropica and L. viatrix present identical 18S sequence.
Mitochondrial Genome Sequences of Nematocera (Lower Diptera): Evidence of Rearrangement following a Complete Genome Duplication in a Winter Crane Fly

PubMed Central

Beckenbach, Andrew T.

2012-01-01

The complete mitochondrial DNA sequences of eight representatives of lower Diptera, suborder Nematocera, along with nearly complete sequences from two other species, are presented. These taxa represent eight families not previously represented by complete mitochondrial DNA sequences. Most of the sequences retain the ancestral dipteran mitochondrial gene arrangement, while one sequence, that of the midge Arachnocampa flava (family Keroplatidae), has an inversion of the trnE gene. The most unusual result is the extensive rearrangement of the mitochondrial genome of a winter crane fly, Paracladura trichoptera (family Trichocera). The pattern of rearrangement indicates that the mechanism of rearrangement involved a tandem duplication of the entire mitochondrial genome, followed by random and nonrandom loss of one copy of each gene. Another winter crane fly retains the ancestral dipteran gene arrangement. A preliminary mitochondrial phylogeny of the Diptera is also presented. PMID:22155689
CAPRRESI: Chimera Assembly by Plasmid Recovery and Restriction Enzyme Site Insertion.

PubMed

Santillán, Orlando; Ramírez-Romero, Miguel A; Dávila, Guillermo

2017-06-25

Here, we present chimera assembly by plasmid recovery and restriction enzyme site insertion (CAPRRESI). CAPRRESI benefits from many strengths of the original plasmid recovery method and introduces restriction enzyme digestion to ease DNA ligation reactions (required for chimera assembly). For this protocol, users clone wildtype genes into the same plasmid (pUC18 or pUC19). After the in silico selection of amino acid sequence regions where chimeras should be assembled, users obtain all the synonym DNA sequences that encode them. Ad hoc Perl scripts enable users to determine all synonym DNA sequences. After this step, another Perl script searches for restriction enzyme sites on all synonym DNA sequences. This in silico analysis is also performed using the ampicillin resistance gene (ampR) found on pUC18/19 plasmids. Users design oligonucleotides inside synonym regions to disrupt wildtype and ampR genes by PCR. After obtaining and purifying complementary DNA fragments, restriction enzyme digestion is accomplished. Chimera assembly is achieved by ligating appropriate complementary DNA fragments. pUC18/19 vectors are selected for CAPRRESI because they offer technical advantages, such as small size (2,686 base pairs), high copy number, advantageous sequencing reaction features, and commercial availability. The usage of restriction enzymes for chimera assembly eliminates the need for DNA polymerases yielding blunt-ended products. CAPRRESI is a fast and low-cost method for fusing protein-coding genes.
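The in silico steps described above (enumerating every synonymous DNA sequence for a peptide region, then searching those sequences for restriction enzyme sites) can be sketched as follows. The protocol itself uses ad hoc Perl scripts; this Python version is only an illustrative stand-in, the function names are hypothetical, and it assumes the standard genetic code and a search on the coding strand only.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (third position varies fastest); '*' marks stop codons.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODONS_FOR = {}
for codon, aa in zip(("".join(c) for c in product(BASES, repeat=3)), AMINO):
    CODONS_FOR.setdefault(aa, []).append(codon)


def synonymous_sequences(peptide):
    """Return every DNA sequence that encodes the given peptide."""
    choices = [CODONS_FOR[aa] for aa in peptide.upper()]
    return ["".join(combo) for combo in product(*choices)]


def sequences_with_site(peptide, site):
    """Return the synonymous sequences containing a recognition site."""
    site = site.upper()
    return [s for s in synonymous_sequences(peptide) if site in s]
```

For the dipeptide Glu-Phe, `sequences_with_site("EF", "GAATTC")` returns the single sequence carrying an EcoRI site. The number of synonymous sequences grows multiplicatively with peptide length, so a sketch like this is only practical for short regions.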
The 5S rDNA in two Abracris grasshoppers (Ommatolampidinae: Acrididae): molecular and chromosomal organization.

PubMed

Bueno, Danilo; Palacios-Gimenez, Octavio Manuel; Martí, Dardo Andrea; Mariguela, Tatiane Casagrande; Cabral-de-Mello, Diogo Cavalcanti

2016-08-01

The 5S ribosomal DNA (rDNA) sequences are subject to dynamic evolution at chromosomal and molecular levels, evolving in a concerted and/or birth-and-death fashion. Among grasshoppers, the chromosomal location for this sequence was established for some species, but little molecular information was obtained to infer evolutionary patterns. Here, we integrated data from chromosomal and nucleotide sequence analysis for 5S rDNA in two Abracris species aiming to identify evolutionary dynamics. For both species, two arrays were identified, a larger sequence (named type-I) that consisted of the entire 5S rDNA gene plus NTS (non-transcribed spacer) and a smaller one (named type-II) with a truncated 5S rDNA gene plus a short NTS that was considered a pseudogene. For type-I sequences, the gene-corresponding region contained the internal control region and poly-T motif, and the NTS presented partial transposable elements. Between the species, nucleotide differences for type-I were noticed, while type-II was identical, suggesting pseudogenization in a common ancestor. From a chromosomal point of view, type-II was placed in one bivalent, while type-I occurred in multiple copies in distinct chromosomes. In Abracris, the evolution of 5S rDNA was apparently influenced by the chromosomal distribution of clusters (single or multiple location), resulting in a mixed mechanism integrating concerted and birth-and-death evolution depending on the unit.
Rapid and reliable high-throughput methods of DNA extraction for use in barcoding and molecular systematics of mushrooms.

PubMed

Dentinger, Bryn T M; Margaritescu, Simona; Moncalvo, Jean-Marc

2010-07-01

We present two methods for DNA extraction from fresh and dried mushrooms that are adaptable to high-throughput sequencing initiatives, such as DNA barcoding. Our results show that these protocols yield ∼85% sequencing success from recently collected materials. Tests with both recent (<2 years) and older (>100 years) specimens reveal that older collections have low success rates and may be an inefficient resource for populating a barcode database. However, our method of extracting DNA from herbarium samples using a small amount of tissue is reliable and could be used for important historical specimens. The application of these protocols greatly reduces the time, and therefore the cost, of generating DNA sequences from mushrooms and other fungi compared with traditional extraction methods. The efficiency of these methods illustrates that standardization and streamlining of sample processing should be shifted from the laboratory to the field. © 2009 Blackwell Publishing Ltd.
ParTIES: a toolbox for Paramecium interspersed DNA elimination studies.

PubMed

Denby Wilkes, Cyril; Arnaiz, Olivier; Sperling, Linda

2016-02-15

Developmental DNA elimination occurs in a wide variety of multicellular organisms, but ciliates are the only single-celled eukaryotes in which this phenomenon has been reported. Despite considerable interest in ciliates as models for DNA elimination, no standard methods for identification and characterization of the eliminated sequences are currently available. We present the Paramecium Toolbox for Interspersed DNA Elimination Studies (ParTIES), designed for Paramecium species, that (i) identifies eliminated sequences, (ii) measures their presence in a sequencing sample and (iii) detects rare elimination polymorphisms. ParTIES is multi-threaded Perl software available at https://github.com/oarnaiz/ParTIES. ParTIES is distributed under the GNU General Public Licence v3. © The Author 2015. Published by Oxford University Press.
Unraveling systematic inventory of Echinops (Asteraceae) with special reference to nrDNA ITS sequence-based molecular typing of Echinops abuzinadianus.

PubMed

Ali, M A; Al-Hemaid, F M; Lee, J; Hatamleh, A A; Gyulai, G; Rahman, M O

2015-10-02

The present study explored the systematic inventory of Echinops L. (Asteraceae) of Saudi Arabia, with special reference to the molecular typing of Echinops abuzinadianus Chaudhary, an endemic species to Saudi Arabia, based on the internal transcribed spacer (ITS) sequences (ITS1-5.8S-ITS2) of nuclear ribosomal DNA. A sequence similarity search using BLAST and a phylogenetic analysis of the ITS sequence of E. abuzinadianus revealed a high level of sequence similarity with E. glaberrimus DC. (section Ritropsis). The novel primary sequence and the secondary structure of ITS2 of E. abuzinadianus could potentially be used for molecular genotyping.
Repatriation and Identification of Finnish World War II Soldiers

PubMed Central

Palo, Jukka U.; Hedman, Minttu; Söderholm, Niklas; Sajantila, Antti

2007-01-01

Aim: To present a summary of the organization, field search, repatriation, forensic anthropological examination, and DNA analysis for the purpose of identification of Finnish soldiers with unresolved fate in World War II. Methods: Field searches were organized, executed, and financed by the Ministry of Education and the Association for Cherishing the Memory of the Dead of the War. Anthropological examination conducted on human remains retrieved in the field searches was used to establish the minimum number of individuals and to describe skeletal diseases, treatment, anomalies, or injuries. DNA tests were performed by extracting DNA from powdered bones and from blood samples from relatives. Mitochondrial DNA (mtDNA) sequence comparisons, together with circumstantial evidence, were used to connect the remains to the putative family members. Results: At present, the skeletal remains of about a thousand soldiers have been found and repatriated. In forensic anthropological examination, several injuries related to death were documented. For the total of 181 bone samples, mtDNA HVR-1 and HVR-2 sequences were successfully obtained for 167 (92.3%) and 148 (81.8%) of the samples, respectively. Five samples yielded no reliable sequence data. Our data suggest that mtDNA is preserved for at least 60 years in the boreal acidic soil. The quality of the obtained mtDNA sequence data varied depending on the sample bone type, with long compact bones (femur, tibia and humerus) having a significantly better success rate (90.0%) than other bones (51.2%). Conclusion: Although more than 60 years have passed since World War II, our experience is that resolving the fate of soldiers missing in action is still of utmost importance for people who lost their relatives in the war. Although cultural and individual differences may exist, our experience presented here gives a good perspective on the importance of individual identification performed by forensic professionals. PMID:17696308

PuLSE: Quality control and quantification of peptide sequences explored by phage display libraries.

PubMed

Shave, Steven; Mann, Stefan; Koszela, Joanna; Kerr, Alastair; Auer, Manfred

2018-01-01

The design of highly diverse phage display libraries is based on the assumption that DNA bases are incorporated at similar rates within the randomized sequence. As library complexity increases and expected copy numbers of unique sequences decrease, the exploration of library space becomes sparser and the presence of truly random sequences becomes critical. We present the program PuLSE (Phage Library Sequence Evaluation) as a tool for assessing randomness and therefore diversity of phage display libraries. PuLSE runs on a collection of sequence reads in the fastq file format and generates tables profiling the library in terms of unique DNA sequence counts and positions, translated peptide sequences, and normalized 'expected' occurrences from base to residue codon frequencies. The output allows at-a-glance quantitative quality control of a phage library in terms of sequence coverage both at the DNA base and translated protein residue level, which has been missing from toolsets and literature. The open source program PuLSE is available in two formats, a C++ source code package for compilation and integration into existing bioinformatics pipelines and precompiled binaries for ease of use.
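The kind of profiling described above (unique DNA sequence counts, per-position base usage, and translated peptide counts from fastq reads) can be illustrated with a short Python sketch. This is not PuLSE itself, which is a C++ program; the function names are hypothetical, and the sketch assumes plain four-line fastq records, frame-0 translation, and the standard genetic code.

```python
import io
from collections import Counter

# Standard genetic code in TCAG ordering (assumed).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def read_fastq_sequences(handle):
    """Yield the sequence line of each four-line fastq record."""
    for i, line in enumerate(handle):
        if i % 4 == 1:
            yield line.strip().upper()


def translate(dna):
    """Translate in frame 0; codons with unknown bases become 'X'."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, usable, 3))


def profile_library(handle):
    """Return (unique DNA counts, peptide counts, per-position base counts)."""
    dna_counts = Counter(read_fastq_sequences(handle))
    peptide_counts = Counter()
    position_counts = []
    for seq, n in dna_counts.items():
        peptide_counts[translate(seq)] += n
        for pos, base in enumerate(seq):
            if pos == len(position_counts):
                position_counts.append(Counter())
            position_counts[pos][base] += n
    return dna_counts, peptide_counts, position_counts


if __name__ == "__main__":
    demo = io.StringIO(
        "@r1\nGCTGCA\n+\nIIIIII\n"
        "@r2\nGCCGCA\n+\nIIIIII\n"
        "@r3\nGCTGCA\n+\nIIIIII\n"
    )
    dna, peptides, positions = profile_library(demo)
    print(dna, peptides, positions[2])
```

A strongly skewed base count at a randomized position would be a first hint that the incorporation-rate assumption does not hold for that library.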
Complete chloroplast genome and 45S nrDNA sequences of the medicinal plant species Glycyrrhiza glabra and Glycyrrhiza uralensis.

PubMed

Kang, Sang-Ho; Lee, Jeong-Hoon; Lee, Hyun Oh; Ahn, Byoung Ohg; Won, So Youn; Sohn, Seong-Han; Kim, Jung Sun

2017-10-06

Glycyrrhiza uralensis and G. glabra, members of the Fabaceae, are medicinally important species that are native to Asia and Europe. Extracts from these plants are widely used as natural sweeteners because of their much greater sweetness than sucrose. In this study, the three complete chloroplast genomes and five 45S nuclear ribosomal (nr)DNA sequences of these two licorice species and an interspecific hybrid are presented. The chloroplast genomes of G. glabra, G. uralensis and G. glabra × G. uralensis were 127,895 bp, 127,716 bp and 127,939 bp, respectively. The three chloroplast genomes harbored 110 annotated genes, including 76 protein-coding genes, 30 tRNA genes and 4 rRNA genes. The 45S nrDNA sequences were either 5,947 or 5,948 bp in length. Glycyrrhiza glabra and G. glabra × G. uralensis showed two types of nrDNA, while G. uralensis contained a single type. The complete 45S nrDNA sequence unit contains 18S rRNA, ITS1, 5.8S rRNA, ITS2 and 26S rRNA. We identified simple sequence repeat and tandem repeat sequences. We also developed four reliable markers for analysis of Glycyrrhiza diversity authentication.
Strawberry disease lesions in rainbow trout from southern Idaho are associated with DNA from a Rickettsia-like organism.

PubMed

Lloyd, Sonja J; LaPatra, Scott E; Snekvik, Kevin R; St-Hilaire, Sophie; Cain, Kenneth D; Call, Douglas R

2008-11-20

Strawberry disease (SD) in the USA is a skin disorder of unknown etiology that occurs in rainbow trout Oncorhynchus mykiss and is characterized by bright red inflammatory lesions. To identify a candidate bacterial agent responsible for SD, we constructed 16S rDNA libraries from 7 SD lesion samples and 2 apparently healthy skin samples from SD-affected fish. A 16S rDNA sequence highly similar to members of the order Rickettsiales was present in 3 lesion libraries at 1%, 32% and 54% prevalence, but this sequence was not found in either healthy tissue library. Based on phylogenetic analysis, this Rickettsia-like organism (RLO) sequence is most closely related to 16S rDNA sequences of bacteria that may form a novel lineage within the Rickettsiales. We used nested PCR assays to screen 25 SD-affected fish for RLO or Flavobacterium psychrophilum DNA. Sixteen lesion samples were positive for the RLO sequence and 4 of the matched healthy samples were positive resulting in a significant association between SD lesions and presence of RLO DNA. While F. psychrophilum is reportedly associated with 'cold water strawberry disease' in the UK, we found no significant association between SD lesions and the presence of F. psychrophilum DNA. The statistical association between SD lesions and presence of RLO DNA is not proof of etiology, but these data suggest that RLO may play a role in SD in southern Idaho, USA.
Effective DNA Inhibitors of Cathepsin G by In Vitro Selection

PubMed Central

Gatto, Barbara; Vianini, Elena; Lucatello, Lorena; Sissi, Claudia; Moltrasio, Danilo; Pescador, Rodolfo; Porta, Roberto; Palumbo, Manlio

2008-01-01

Cathepsin G (CatG) is a chymotrypsin-like protease released upon degranulation of neutrophils. In several inflammatory and ischaemic diseases the impaired balance between CatG and its physiological inhibitors leads to tissue destruction and platelet aggregation. Inhibitors of CatG are suitable for the treatment of inflammatory diseases and procoagulant conditions. DNA released upon the death of neutrophils at injury sites binds CatG. Moreover, short DNA fragments are more inhibitory than genomic DNA. Defibrotide, a single-stranded polydeoxyribonucleotide with antithrombotic effect, is also a potent CatG inhibitor. Given the above experimental evidence, we employed a selection protocol to assess whether DNA inhibition of CatG may be ascribed to specific sequences present in defibrotide DNA. A Selex protocol was applied to identify the single-stranded DNA sequences exhibiting the highest affinity for CatG, the diversity of a combinatorial pool of oligodeoxyribonucleotides being a good representation of the complexity found in defibrotide. Biophysical and biochemical studies confirmed that the selected sequences bind tightly to the target enzyme and also efficiently inhibit its catalytic activity. Sequence analysis carried out to unveil a motif responsible for CatG recognition showed a recurrence of alternating TG repeats in the selected CatG binders, adopting an extended conformation that grants maximal interaction with the highly charged protein surface. This unprecedented finding is validated by our results showing high affinity and inhibition of CatG by specific DNA sequences of variable length designed to maximally reduce pairing/folding interactions. PMID:19325843
Optimized mtDNA Control Region Primer Extension Capture Analysis for Forensically Relevant Samples and Highly Compromised mtDNA of Different Age and Origin

PubMed Central

Eduardoff, Mayra; Xavier, Catarina; Strobl, Christina; Casas-Vargas, Andrea; Parson, Walther

2017-01-01

The analysis of mitochondrial DNA (mtDNA) has proven useful in forensic genetics and ancient DNA (aDNA) studies, where specimens are often highly compromised and DNA quality and quantity are low. In forensic genetics, the mtDNA control region (CR) is commonly sequenced using established Sanger-type Sequencing (STS) protocols involving fragment sizes down to approximately 150 base pairs (bp). Recent developments include Massively Parallel Sequencing (MPS) of (multiplex) PCR-generated libraries using the same amplicon sizes. Molecular genetic studies on archaeological remains that harbor more degraded aDNA have pioneered alternative approaches to target mtDNA, such as capture hybridization and primer extension capture (PEC) methods followed by MPS. These assays target smaller mtDNA fragment sizes (down to 50 bp or less), and have proven to be substantially more successful in obtaining useful mtDNA sequences from these samples compared to electrophoretic methods. Here, we present the modification and optimization of a PEC method, earlier developed for sequencing the Neanderthal mitochondrial genome, with forensic applications in mind. Our approach was designed for a more sensitive enrichment of the mtDNA CR in a single tube assay and short laboratory turnaround times, thus complying with forensic practices. We characterized the method using sheared, high quantity mtDNA (six samples), and tested challenging forensic samples (n = 2) as well as compromised solid tissue samples (n = 15) up to 8 kyrs of age. The PEC MPS method produced reliable and plausible mtDNA haplotypes that were useful in the forensic context. It yielded plausible data in samples that did not provide results with STS and other MPS techniques. We addressed the issue of contamination by including four generations of negative controls, and discuss the results in the forensic context. We finally offer perspectives for future research to enable the validation and accreditation of the PEC MPS method for final implementation in forensic genetic laboratories. PMID:28934125
Long interspersed repeated DNA (LINE) causes polymorphism at the rat insulin 1 locus.

PubMed Central

Lakshmikumaran, M S; D'Ambrosio, E; Laimins, L A; Lin, D T; Furano, A V

1985-01-01

The insulin 1, but not the insulin 2, locus is polymorphic (i.e., exhibits allelic variation) in rats. Restriction enzyme analysis and hybridization studies showed that the polymorphic region is 2.2 kilobases upstream of the insulin 1 coding region and is due to the presence or absence of an approximately 2.7-kilobase repeated DNA element. DNA sequence determination showed that this DNA element is a member of a long interspersed repeated DNA family (LINE) that is highly repeated (greater than 50,000 copies) and highly transcribed in the rat. Although the presence or absence of LINE sequences at the insulin 1 locus occurs in both the homozygous and heterozygous states, LINE-containing insulin 1 alleles are more prevalent in the rat population than are alleles without LINEs. Restriction enzyme analysis of the LINE-containing alleles indicated that at least two versions of the LINE sequence may be present at the insulin 1 locus in different rats. Either repeated transposition of LINE sequences or gene conversion between the resident insulin 1 LINE and other sequences in the genome are possible explanations for this. PMID:3016521
Sequential addition of short DNA oligos in DNA-polymerase-based synthesis reactions

DOEpatents

Gardner, Shea N; Mariella, Jr., Raymond P; Christian, Allen T; Young, Jennifer A; Clague, David S

2013-06-25

A method of preselecting a multiplicity of DNA sequence segments that will comprise the DNA molecule of user-defined sequence, separating the DNA sequence segments temporally, and combining the multiplicity of DNA sequence segments with at least one polymerase enzyme wherein the multiplicity of DNA sequence segments join to produce the DNA molecule of user-defined sequence. Sequence segments may be of length n, where n is an odd integer. In one embodiment the length of desired hybridizing overlap is specified by the user and the sequences and the protocol for combining them are guided by computational (bioinformatics) predictions. In one embodiment sequence segments are combined from multiple reading frames to span the same region of a sequence, so that multiple desired hybridizations may occur with different overlap lengths.
Structural basis of DNA target recognition by the B3 domain of Arabidopsis epigenome reader VAL1

PubMed Central

Sasnauskas, Giedrius; Kauneckaitė, Kotryna; Siksnys, Virginijus

2018-01-01

Arabidopsis thaliana requires a prolonged period of cold exposure during winter to initiate flowering in a process termed vernalization. Exposure to cold induces epigenetic silencing of the FLOWERING LOCUS C (FLC) gene by Polycomb group (PcG) proteins. A key role in this epigenetic switch is played by transcriptional repressors VAL1 and VAL2, which specifically recognize Sph/RY DNA sequences within FLC via B3 DNA binding domains, and mediate recruitment of PcG silencing machinery. To understand the structural mechanism of site-specific DNA recognition by VAL1, we have solved the crystal structure of VAL1 B3 domain (VAL1-B3) bound to a 12 bp oligoduplex containing the canonical Sph/RY DNA sequence 5′-CATGCA-3′/5′-TGCATG-3′. We find that VAL1-B3 makes H-bonds and van der Waals contacts to DNA bases of all six positions of the canonical Sph/RY element. In agreement with the structure, in vitro DNA binding studies show that VAL1-B3 does not tolerate substitutions at any position of the 5′-TGCATG-3′ sequence. The VAL1-B3–DNA structure presented here provides a structural model for understanding the specificity of plant B3 domains interacting with the Sph/RY and other DNA sequences. PMID:29660015
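The two strands of the canonical Sph/RY element quoted above, 5′-CATGCA-3′ and 5′-TGCATG-3′, are reverse complements of each other, so a scan for the element has to consider both orientations. The Python sketch below illustrates this with exact matching only, in line with the reported intolerance to substitutions; the function names are hypothetical and the example sequence is invented.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(seq, motif="CATGCA"):
    """Return sorted 0-based start positions (top-strand coordinates)
    where the motif occurs on either strand, exact matches only."""
    hits = set()
    for pattern in (motif, reverse_complement(motif)):
        start = seq.find(pattern)
        while start != -1:
            hits.add(start)
            start = seq.find(pattern, start + 1)  # allow overlapping hits
    return sorted(hits)


if __name__ == "__main__":
    print(reverse_complement("CATGCA"))
    print(find_sites("AACATGCATGTT"))
```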
Sequential addition of short DNA oligos in DNA-polymerase-based synthesis reactions

DOEpatents

Gardner, Shea N [San Leandro, CA]; Mariella, Jr., Raymond P.; Christian, Allen T [Tracy, CA]; Young, Jennifer A [Berkeley, CA]; Clague, David S [Livermore, CA]

2011-01-18

A method of fabricating a DNA molecule of user-defined sequence. The method comprises the steps of preselecting a multiplicity of DNA sequence segments that will comprise the DNA molecule of user-defined sequence, separating the DNA sequence segments temporally, and combining the multiplicity of DNA sequence segments with at least one polymerase enzyme wherein the multiplicity of DNA sequence segments join to produce the DNA molecule of user-defined sequence. Sequence segments may be of length n, where n is an even or odd integer. In one embodiment the length of desired hybridizing overlap is specified by the user and the sequences and the protocol for combining them are guided by computational (bioinformatics) predictions. In one embodiment sequence segments are combined from multiple reading frames to span the same region of a sequence, so that multiple desired hybridizations may occur with different overlap lengths. In one embodiment starting sequence fragments are of different lengths, n, n+1, n+2, etc.
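The idea of covering a user-defined sequence with segments of length n that share a user-specified hybridizing overlap can be sketched as follows. This is a simplified illustration and not the patented procedure: it works on a single strand, uses one fixed segment length, and ignores strand alternation, melting temperatures, and the computational predictions mentioned above. All function names are hypothetical.

```python
def overlapping_segments(sequence, n, overlap):
    """Split sequence into segments of length n, where consecutive
    segments share `overlap` bases. The last segment may be shorter."""
    if not 0 <= overlap < n:
        raise ValueError("overlap must satisfy 0 <= overlap < n")
    step = n - overlap
    segments = []
    for start in range(0, len(sequence), step):
        segments.append(sequence[start:start + n])
        if start + n >= len(sequence):
            break  # this segment already reaches the end of the target
    return segments


def reassemble(segments, overlap):
    """Rebuild the target by dropping the shared prefix of each
    segment after the first."""
    if not segments:
        return ""
    return segments[0] + "".join(seg[overlap:] for seg in segments[1:])


if __name__ == "__main__":
    target = "ACGTACGTACG"
    parts = overlapping_segments(target, n=4, overlap=2)
    print(parts)
    print(reassemble(parts, overlap=2) == target)
```

Supporting segments of different lengths (n, n+1, n+2, and so on), as in one of the embodiments, would require passing a list of lengths instead of a single n.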
Methylsorb: a simple method for quantifying DNA methylation using DNA-gold affinity interactions.

PubMed

Sina, Abu Ali Ibn; Carrascosa, Laura G; Palanisamy, Ramkumar; Rauf, Sakandar; Shiddiky, Muhammad J A; Trau, Matt

2014-10-21

The analysis of DNA methylation is becoming increasingly important both in the clinic and also as a research tool to unravel key epigenetic molecular mechanisms in biology. Current methodologies for the quantification of regional DNA methylation (i.e., the average methylation over a region of DNA in the genome) are largely affected by comprehensive DNA sequencing methodologies which tend to be expensive, tedious, and time-consuming for many applications. Herein, we report an alternative DNA methylation detection method referred to as "Methylsorb", which is based on the inherent affinity of DNA bases to the gold surface (i.e., the trend of the affinity interactions is adenine > cytosine ≥ guanine > thymine). Since the degree of gold-DNA affinity interaction is highly sequence dependent, it provides a new capability to detect DNA methylation by simply monitoring the relative adsorption of bisulfite-treated DNA sequences onto a gold chip. Because the selective physical adsorption of DNA fragments to gold enables a direct read-out of regional DNA methylation, the current requirement for DNA sequencing is obviated. To demonstrate the utility of this method, we present data on the regional methylation status of two CpG clusters located in the EN1 and MIR200B genes in MCF7 and MDA-MB-231 cells. The methylation status of these regions was obtained from the change in relative mass on the gold surface with respect to the relative adsorption of an unmethylated DNA source, and this was detected using surface plasmon resonance (SPR) in a label-free and real-time manner. We anticipate that the simplicity of this method, combined with the high level of accuracy for identifying the methylation status of cytosines in DNA, could find broad application in biology and diagnostics.
Nucleotide sequence analysis establishes the role of endogenous murine leukemia virus DNA segments in formation of recombinant mink cell focus-forming murine leukemia viruses.

PubMed Central

Khan, A S

1984-01-01

The sequence of 363 nucleotides near the 3' end of the pol gene and 564 nucleotides from the 5' terminus of the env gene in an endogenous murine leukemia viral (MuLV) DNA segment, cloned from AKR/J mouse DNA and designated as A-12, was obtained. For comparison, the nucleotide sequence in an analogous portion of AKR mink cell focus-forming (MCF) 247 MuLV provirus was also determined. Sequence features unique to MCF247 MuLV DNA in the 3' pol and 5' env regions were identified by comparison with nucleotide sequences in analogous regions of NFS-Th-1 xenotropic and AKR ecotropic MuLV proviruses. These included (i) an insertion of 12 base pairs encoding four amino acids located 60 base pairs from the 3' terminus of the pol gene and immediately preceding the env gene, (ii) the deletion of 12 base pairs (encoding four amino acids) and the insertion of 3 base pairs (encoding one amino acid) in the 5' portion of the env gene, and (iii) single base substitutions resulting in 2 MCF247-specific amino acids in the 3' pol and 23 in the 5' env regions. Nucleotide sequence comparison involving the 3' pol and 5' env regions of AKR MCF247, NFS xenotropic, and AKR ecotropic MuLV proviruses with the cloned endogenous MuLV DNA indicated that MCF247 proviral DNA sequences were conserved in the cloned endogenous MuLV proviral segment. In fact, total nucleotide sequence identity existed between the endogenous MuLV DNA and the MCF247 MuLV provirus in the 3' portion of the pol gene. In the 5' env region, only 4 of 564 nucleotides were different, resulting in three amino acid changes between AKR MCF247 MuLV DNA and the endogenous MuLV DNA present in clone A-12. In addition, nucleotide sequence comparison indicated that Moloney- and Friend-MCF MuLVs were also highly related in the 3' pol and 5' env regions to the cloned endogenous MuLV DNA. These results establish the role of endogenous MuLV DNA segments in generation of recombinant MCF viruses. PMID:6328017
Molecular authentication of Radix Puerariae Lobatae and Radix Puerariae Thomsonii by ITS and 5S rRNA spacer sequencing.

PubMed

Sun, Ye; Shaw, Pang-Chui; Fung, Kwok-Pui

2007-01-01

In the present study, we examined nuclear DNA sequences in an attempt to reveal the relationships between Pueraria lobata (Willd.) Ohwi, P. thomsonii Benth., and P. montana (Lour.) Merr. We found that internal transcribed spacer (ITS) sequences of nuclear ribosomal DNA are highly divergent in P. lobata and P. thomsonii, and four types of ITS with different lengths are found in the two species. On the other hand, DNA sequences of the 5S rRNA gene spacer are highly conserved across multiple copies in P. lobata and P. thomsonii. They could be used to identify P. lobata, P. thomsonii, and P. montana of this complex, and may serve as a useful tool in medical authentication of Radix Puerariae Lobatae and Radix Puerariae Thomsonii.
Serovar distribution of a DNA sequence involved in the antigenic relationship between Leptospira and equine cornea.

PubMed

Lucchesi, Paula M A; Parma, Alberto E; Arroyo, Guillermo H

2002-01-01

Horses infected with Leptospira present several clinical disorders, one of them being recurrent uveitis. A common endpoint of equine recurrent uveitis is blindness. Serovar pomona has often been incriminated, although others have also been reported. An antigenic relationship between this bacterium and equine cornea has been described in previous studies. A leptospiral DNA fragment that encodes cross-reacting epitopes was previously cloned and expressed in Escherichia coli. A region of that DNA fragment was subcloned and sequenced. Samples of leptospiral DNA from several sources were analysed by PCR with two primer pairs designed to amplify that region. Reference strains from serovars canicola, icterohaemorrhagiae, pomona, pyrogenes, wolffi, bataviae, sentot, hebdomadis and hardjo rendered products of the expected sizes with both pairs of primers. The specific DNA region was also amplified from isolates from Argentina belonging to serogroups Canicola and Pomona. Both L. biflexa serovar patoc and L. borgpetersenii serovar tarassovi rendered a negative result. The DNA sequence related to the antigen mimicry with equine cornea was not exclusively found in serovar pomona as it was also detected in several strains of Leptospira belonging to different serovars. The results obtained with L. biflexa serovar patoc strain Patoc I and L. borgpetersenii serovar tarassovi strain Perepelicin suggest that this sequence is not present in these strains, which belong to different genomospecies than those which gave positive results. This is an interesting finding since L. biflexa comprises nonpathogenic strains and serovar tarassovi has not been associated clinically with equine uveitis.
Performance evaluation of a mitogenome capture and Illumina sequencing protocol using non-probative, case-type skeletal samples: Implications for the use of a positive control in a next-generation sequencing procedure.

PubMed

Marshall, Charla; Sturk-Andreaggi, Kimberly; Daniels-Higginbotham, Jennifer; Oliver, Robert Sean; Barritt-Ross, Suzanne; McMahon, Timothy P

2017-11-01

Next-generation ancient DNA technologies have the potential to assist in the analysis of degraded DNA extracted from forensic specimens. Mitochondrial genome (mitogenome) sequencing, specifically, may be of benefit to samples that fail to yield forensically relevant genetic information using conventional PCR-based techniques. This report summarizes the Armed Forces Medical Examiner System's Armed Forces DNA Identification Laboratory's (AFMES-AFDIL) performance evaluation of a Next-Generation Sequencing protocol for degraded and chemically treated past accounting samples. The procedure involves hybridization capture for targeted enrichment of mitochondrial DNA, massively parallel sequencing using Illumina chemistry, and an automated bioinformatic pipeline for forensic mtDNA profile generation. A total of 22 non-probative samples and associated controls were processed in the present study, spanning a range of DNA quantity and quality. Data were generated from over 100 DNA libraries by ten DNA analysts over the course of five months. The results show that the mitogenome sequencing procedure is reliable and robust, sensitive to low template (one ng control DNA) as well as degraded DNA, and specific to the analysis of the human mitogenome. Haplotypes were overall concordant between NGS replicates and with previously generated Sanger control region data. Due to the inherent risk for contamination when working with low-template, degraded DNA, a contamination assessment was performed. The consumables were shown to be void of human DNA contaminants and suitable for forensic use. Reagent blanks and negative controls were analyzed to determine the background signal of the procedure. This background signal was then used to set analytical and reporting thresholds, which were designated at 4.0X (limit of detection) and 10.0X (limit of quantitation) average coverage across the mitogenome, respectively. Nearly all human samples exceeded the reporting threshold, although coverage was reduced in chemically treated samples, resulting in a ∼58% passing rate for these poor-quality samples. A concordance assessment demonstrated the reliability of the NGS data when compared to known Sanger profiles. One case sample was shown to be mixed with a co-processed sample and two reagent blanks indicated the presence of DNA above the analytical threshold. This contamination was attributed to sequencing crosstalk from simultaneously sequenced high-quality samples, including the positive control. Overall, this study demonstrated that hybridization capture and Illumina sequencing provide a viable method for mitogenome sequencing of degraded and chemically treated skeletal DNA samples, yet may require alternative measures of quality control. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.
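The two coverage thresholds reported above (4.0X analytical, 10.0X reporting) lend themselves to a simple classification step. The Python sketch below is only an illustration under stated assumptions: the function names and category labels are hypothetical, the sample names and coverage values are invented, and treating the thresholds as inclusive is an assumption the abstract does not settle.

```python
ANALYTICAL_THRESHOLD = 4.0   # limit of detection, average coverage (X)
REPORTING_THRESHOLD = 10.0   # limit of quantitation, average coverage (X)


def classify_coverage(average_coverage):
    """Label a library by its average mitogenome coverage.
    Thresholds are treated as inclusive (an assumption)."""
    if average_coverage >= REPORTING_THRESHOLD:
        return "reportable"
    if average_coverage >= ANALYTICAL_THRESHOLD:
        return "detected"
    return "background"


def flag_blanks(blank_coverages):
    """Return names of reagent blanks or negative controls whose
    coverage reaches the analytical threshold."""
    return sorted(name for name, cov in blank_coverages.items()
                  if cov >= ANALYTICAL_THRESHOLD)


if __name__ == "__main__":
    # Invented example values, not data from the study.
    print(classify_coverage(57.3))
    print(flag_blanks({"RB1": 0.8, "RB2": 4.6, "NEG": 0.1}))
```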
Protein Science by DNA Sequencing: How Advances in Molecular Biology Are Accelerating Biochemistry.

PubMed

Higgins, Sean A; Savage, David F

2018-01-09

A fundamental goal of protein biochemistry is to determine the sequence-function relationship, but the vastness of sequence space makes comprehensive evaluation of this landscape difficult. However, advances in DNA synthesis and sequencing now allow researchers to assess the functional impact of every single mutation in many proteins, but challenges remain in library construction and the development of general assays applicable to a diverse range of protein functions. This Perspective briefly outlines the technical innovations in DNA manipulation that allow massively parallel protein biochemistry and then summarizes the methods currently available for library construction and the functional assays of protein variants. Areas in need of future innovation are highlighted with a particular focus on assay development and the use of computational analysis with machine learning to effectively traverse the sequence-function landscape. Finally, applications in the fundamentals of protein biochemistry, disease prediction, and protein engineering are presented.
Sequence analysis of the 5.8S ribosomal DNA and internal transcribed spacers (ITS1 and ITS2) from five species of the Oxalis tuberosa alliance.

PubMed

Tosto, D S; Hopp, H E

1996-01-01

The internal transcribed spacer regions (ITS1 and ITS2) of the 18S-25S nuclear ribosomal DNA sequence and the intervening 5.8S region from five species of the genus Oxalis were amplified by polymerase chain reaction and subjected to direct DNA sequencing. On the basis of cytogenetic studies, some species of this genus were postulated to be related by the number of chromosomes. Sequence homologies in the ITS1, 5.8S and ITS2 among species are in good agreement with previous relationships established on the basis of chromosome numbers. We also identified a highly conserved sequence of six bp in the ITS1, reported to be present in a wide range of flowering plants, but not in the Oxalidaceae family to which the genus Oxalis belongs.
Characterization of Trichuris trichiura from humans and T. suis from pigs in China using internal transcribed spacers of nuclear ribosomal DNA.

PubMed

Liu, G H; Zhou, W; Nisbet, A J; Xu, M J; Zhou, D H; Zhao, G H; Wang, S K; Song, H Q; Lin, R Q; Zhu, X Q

2014-03-01

Trichuris trichiura and Trichuris suis parasitize (at the adult stage) the caeca of humans and pigs, respectively, causing trichuriasis. Despite these parasites being of human and animal health significance, causing considerable socio-economic losses globally, little is known of the molecular characteristics of T. trichiura and T. suis from China. In the present study, the entire first and second internal transcribed spacer (ITS-1 and ITS-2) regions of nuclear ribosomal DNA (rDNA) of T. trichiura and T. suis from China were amplified by polymerase chain reaction (PCR), the representative amplicons were cloned and sequenced, and sequence variation in the ITS rDNA was examined. The ITS rDNA sequences for the T. trichiura and T. suis samples were 1222-1267 bp and 1339-1353 bp in length, respectively. Sequence analysis revealed that the ITS-1, 5.8S and ITS-2 rDNAs of both whipworms were 600-627 bp and 655-661 bp, 154 bp, and 468-486 bp and 530-538 bp in size, respectively. Sequence variation in ITS rDNA within and among T. trichiura and T. suis was examined. Excluding nucleotide variations in the simple sequence repeats, the intra-species sequence variation in the ITS-1 was 0.2-1.7% within T. trichiura, and 0-1.5% within T. suis. For ITS-2 rDNA, the intra-species sequence variation was 0-1.3% within T. trichiura and 0.2-1.7% within T. suis. The inter-species sequence differences between the two whipworms were 60.7-65.3% for ITS-1 and 59.3-61.5% for ITS-2. These results demonstrated that the ITS rDNA sequences provide additional genetic markers for the characterization and differentiation of the two whipworms. These data should be useful for studying the epidemiology and population genetics of T. trichiura and T. suis, as well as for the diagnosis of trichuriasis in humans and pigs.
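The length figures in the abstract above can be cross-checked with simple arithmetic: the full ITS rDNA length should equal the sum of the ITS-1, 5.8S and ITS-2 lengths. The following is a minimal sketch of that check; the numbers are taken from the abstract, while the dictionary layout and the function name `total_range` are illustrative assumptions and not part of the original study.

```python
# Region length ranges in bp (min, max), copied from the abstract above.
REGIONS = {
    "T. trichiura": {"ITS-1": (600, 627), "5.8S": (154, 154), "ITS-2": (468, 486)},
    "T. suis": {"ITS-1": (655, 661), "5.8S": (154, 154), "ITS-2": (530, 538)},
}

# Total ITS rDNA length ranges in bp, as reported in the abstract.
REPORTED_TOTAL = {"T. trichiura": (1222, 1267), "T. suis": (1339, 1353)}


def total_range(parts):
    """Sum the minimum and the maximum lengths over all regions."""
    low = sum(lo for lo, _ in parts.values())
    high = sum(hi for _, hi in parts.values())
    return low, high


for species, parts in REGIONS.items():
    # Print the computed range next to the reported range for comparison.
    print(species, total_range(parts), REPORTED_TOTAL[species])
```

For both species, the summed per-region extremes agree with the reported total ranges, so the abstract's figures are internally consistent.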
Homology between DNA polymerases of poxviruses, herpesviruses, and adenoviruses: nucleotide sequence of the vaccinia virus DNA polymerase gene.

PubMed Central

Earl, P L; Jones, E V; Moss, B

1986-01-01

A 5400-base-pair segment of the vaccinia virus genome was sequenced and an open reading frame of 938 codons was found precisely where the DNA polymerase had been mapped by transfer of a phosphonoacetate-resistance marker. A single nucleotide substitution changing glycine at position 347 to aspartic acid accounts for the drug resistance of the mutant vaccinia virus. The 5' end of the DNA polymerase mRNA was located 80 base pairs before the methionine codon initiating the open reading frame. Correspondence between the predicted Mr 108,577 polypeptide and the 110,000 purified enzyme indicates that little or no proteolytic processing occurs. Extensive homology, extending over 435 amino acids, was found upon comparing the DNA polymerase of vaccinia virus and DNA polymerase of Epstein-Barr virus. A highly conserved sequence of 14 amino acids in the carboxyl-terminal regions of the above DNA polymerases is also present at a similar location in adenovirus DNA polymerase. This structure, which is predicted to form a turn flanked by beta-pleated sheets, may form part of an essential binding or catalytic site that accounts for its presence in DNA polymerases of poxviruses, herpesviruses, and adenoviruses. PMID:3012524
Colony-PCR Is a Rapid Method for DNA Amplification of Hyphomycetes

PubMed Central

Walch, Georg; Knapp, Maria; Rainer, Georg; Peintner, Ursula

2016-01-01

Fungal pure cultures identified with both classical morphological methods and through barcoding sequences are a basic requirement for reliable reference sequences in public databases. Improved techniques for an accelerated DNA barcode reference library construction will result in considerably improved sequence databases covering a wider taxonomic range. Fast, cheap, and reliable methods for obtaining DNA sequences from fungal isolates are, therefore, a valuable tool for the scientific community. Direct colony PCR was already successfully established for yeasts, but has not been evaluated for a wide range of anamorphic soil fungi up to now, and a direct amplification protocol for hyphomycetes without tissue pre-treatment has not been published so far. Here, we present a colony PCR technique directly from fungal hyphae without previous DNA extraction or other prior manipulation. Seven hundred eighty-eight fungal strains from 48 genera were tested with a success rate of 86%. PCR success varied considerably: DNA of fungi belonging to the genera Cladosporium, Geomyces, Fusarium, and Mortierella could be amplified with high success. DNA of soil-borne yeasts was always successfully amplified. Absidia, Mucor, Trichoderma, and Penicillium isolates had noticeably lower PCR success. PMID:29376929
Analysis and Functional Annotation of an Expressed Sequence Tag Collection for Tropical Crop Sugarcane

PubMed Central

Vettore, André L.; da Silva, Felipe R.; Kemper, Edson L.; Souza, Glaucia M.; da Silva, Aline M.; Ferro, Maria Inês T.; Henrique-Silva, Flavio; Giglioti, Éder A.; Lemos, Manoel V.F.; Coutinho, Luiz L.; Nobrega, Marina P.; Carrer, Helaine; França, Suzelei C.; Bacci, Maurício; Goldman, Maria Helena S.; Gomes, Suely L.; Nunes, Luiz R.; Camargo, Luis E.A.; Siqueira, Walter J.; Van Sluys, Marie-Anne; Thiemann, Otavio H.; Kuramae, Eiko E.; Santelli, Roberto V.; Marino, Celso L.; Targon, Maria L.P.N.; Ferro, Jesus A.; Silveira, Henrique C.S.; Marini, Danyelle C.; Lemos, Eliana G.M.; Monteiro-Vitorello, Claudia B.; Tambor, José H.M.; Carraro, Dirce M.; Roberto, Patrícia G.; Martins, Vanderlei G.; Goldman, Gustavo H.; de Oliveira, Regina C.; Truffi, Daniela; Colombo, Carlos A.; Rossi, Magdalena; de Araujo, Paula G.; Sculaccio, Susana A.; Angella, Aline; Lima, Marleide M.A.; de Rosa, Vicente E.; Siviero, Fábio; Coscrato, Virginia E.; Machado, Marcos A.; Grivet, Laurent; Di Mauro, Sonia M.Z.; Nobrega, Francisco G.; Menck, Carlos F.M.; Braga, Marilia D.V.; Telles, Guilherme P.; Cara, Frank A.A.; Pedrosa, Guilherme; Meidanis, João; Arruda, Paulo

2003-01-01

To contribute to our understanding of the genome complexity of sugarcane, we undertook a large-scale expressed sequence tag (EST) program. More than 260,000 cDNA clones were partially sequenced from 26 standard cDNA libraries generated from different sugarcane tissues. After the processing of the sequences, 237,954 high-quality ESTs were identified. These ESTs were assembled into 43,141 putative transcripts. Of the assembled sequences, 35.6% presented no matches with existing sequences in public databases. A global analysis of the whole SUCEST data set indicated that 14,409 assembled sequences (33% of the total) contained at least one cDNA clone with a full-length insert. Annotation of the 43,141 assembled sequences associated almost 50% of the putative identified sugarcane genes with protein metabolism, cellular communication/signal transduction, bioenergetics, and stress responses. Inspection of the translated assembled sequences for conserved protein domains revealed 40,821 amino acid sequences with 1415 Pfam domains. Reassembling the consensus sequences of the 43,141 transcripts revealed a 22% redundancy in the first assembling. This indicated that possibly 33,620 unique genes had been identified and indicated that >90% of the sugarcane expressed genes were tagged. PMID:14613979

Full-Length Venom Protein cDNA Sequences from Venom-Derived mRNA: Exploring Compositional Variation and Adaptive Multigene Evolution

PubMed Central

Modahl, Cassandra M.; Mackessy, Stephen P.

2016-01-01

Envenomation of humans by snakes is a complex and continuously evolving medical emergency, and treatment is made that much more difficult by the diverse biochemical composition of many venoms. Venomous snakes and their venoms also provide models for the study of molecular evolutionary processes leading to adaptation and genotype-phenotype relationships. To compare venom complexity and protein sequences, venom gland transcriptomes are assembled, which usually requires the sacrifice of snakes for tissue. However, toxin transcripts are also present in venoms, offering the possibility of obtaining cDNA sequences directly from venom. This study provides evidence that unknown full-length venom protein transcripts can be obtained from the venoms of multiple species from all major venomous snake families. These unknown venom protein cDNAs are obtained by the use of primers designed from conserved signal peptide sequences within each venom protein superfamily. This technique was used to assemble a partial venom gland transcriptome for the Middle American Rattlesnake (Crotalus simus tzabcan) by amplifying sequences for phospholipases A2, serine proteases, C-lectins, and metalloproteinases from within venom. Phospholipase A2 sequences were also recovered from the venoms of several rattlesnakes and an elapid snake (Pseudechis porphyriacus), and three-finger toxin sequences were recovered from multiple rear-fanged snake species, demonstrating that the three major clades of advanced snakes (Elapidae, Viperidae, Colubridae) have stable mRNA present in their venoms. These cDNA sequences from venom were then used to explore potential activities derived from protein sequence similarities and evolutionary histories within these large multigene superfamilies. Venom-derived sequences can also be used to aid in characterizing venoms that lack proteomic profiles and identify sequence characteristics indicating specific envenomation profiles. 
This approach, requiring only venom, provides access to cDNA sequences in the absence of living specimens, even from commercial venom sources, to evaluate important regional differences in venom composition and to study snake venom protein evolution. PMID:27280639
The D1-D2 region of the large subunit ribosomal DNA as barcode for ciliates.

PubMed

Stoeck, T; Przybos, E; Dunthorn, M

2014-05-01

Ciliates are a major evolutionary lineage within the alveolates, which are distributed in nearly all habitats on our planet and are an essential component for ecosystem function, processes and stability. Accurate identification of these unicellular eukaryotes through, for example, microscopy or mating type reactions is reserved to few specialists. To satisfy the demand for a DNA barcode for ciliates, which meets the standard criteria for DNA barcodes defined by the Consortium for the Barcode of Life (CBOL), we here evaluated the D1-D2 region of the ribosomal DNA large subunit (LSU-rDNA). Primer universality for the phylum Ciliophora was tested in silico with available database sequences as well as in the laboratory with 73 ciliate species, which represented nine of 12 ciliate classes. Primers tested in this study were successful for all tested classes. To test the ability of the D1-D2 region to resolve conspecific and congeneric sequence divergence, 63 Paramecium strains were sampled from 24 mating species. The average conspecific D1-D2 variation was 0.18%, whereas congeneric sequence divergence averaged 4.83%. In pairwise genetic distance analyses, we identified a D1-D2 sequence divergence of <0.6% as an ideal threshold to discriminate Paramecium species. Using this definition, only 3.8% of all conspecific and 3.9% of all congeneric sequence comparisons had the potential of false assignments. Neighbour-joining analyses inferred monophyly for all taxa but for two Paramecium octaurelia strains. Here, we present a protocol for easy DNA amplification of single cells and voucher deposition. In conclusion, the presented data pinpoint the D1-D2 region as an excellent candidate for an official CBOL barcode for ciliated protists. © 2013 John Wiley & Sons Ltd.
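The threshold rule described in the abstract above (a D1-D2 divergence below 0.6% indicating the same Paramecium species) can be sketched as follows. This is a minimal illustration, assuming an uncorrected p-distance on two pre-aligned sequences; the abstract does not state which distance measure was used, and the function names `p_distance` and `same_species` are hypothetical.

```python
THRESHOLD = 0.006  # 0.6% divergence, the value reported in the abstract


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Assumption: uncorrected p-distance; alignment columns with a gap ('-')
    in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def same_species(seq_a, seq_b, threshold=THRESHOLD):
    """Apply the divergence threshold to one pairwise comparison."""
    return p_distance(seq_a, seq_b) < threshold


# Toy data (not real D1-D2 sequences): 5 differences in 1000 sites.
reference = "A" * 1000
variant = "C" * 5 + "A" * 995
print(p_distance(reference, variant), same_species(reference, variant))
```

A fixed threshold is simple to apply, but as the abstract notes, a small share of conspecific and congeneric comparisons would still be assigned incorrectly under it.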
DETECTION AND COMPARISON OF GIARDIAVIRUS (GLV) FROM DIFFERENT ASSEMBLAGES OF GIARDIA DUODENALIS

USDA-ARS's Scientific Manuscript database

Five assemblages of Giardia were identified from cysts in cattle, dog, cat, sheep, and reindeer feces using ribosomal DNA (rDNA) sequencing. Assemblage A was present in cattle and reindeer feces, Assemblages C and D were present in dog feces, Assemblage E was present in cattle and sheep feces, and ...
Probability of coding of a DNA sequence: an algorithm to predict translated reading frames from their thermodynamic characteristics.

PubMed Central

Tramontano, A; Macchiato, M F

1986-01-01

An algorithm to determine the probability that a reading frame codes for a protein is presented. It is based on the results of our previous studies on the thermodynamic characteristics of a translated reading frame. We also develop a prediction procedure to distinguish between coding and non-coding reading frames. The procedure is based on the characteristics of the putative product of the DNA sequence and not on periodicity characteristics of the sequence, so the prediction is not biased by the presence of overlapping translated reading frames or by the presence of translated reading frames on the complementary DNA strand. PMID:3753761
Structural changes induced by binding of the high-mobility group I protein to a mouse satellite DNA sequence.

PubMed Central

Slama-Schwok, A; Zakrzewska, K; Léger, G; Leroux, Y; Takahashi, M; Käs, E; Debey, P

2000-01-01

Using spectroscopic methods, we have studied the structural changes induced in both protein and DNA upon binding of the High-Mobility Group I (HMG-I) protein to a 21-bp sequence derived from mouse satellite DNA. We show that these structural changes depend on the stoichiometry of the protein/DNA complexes formed, as determined by Job plots derived from experiments using pyrene-labeled duplexes. Circular dichroism and melting temperature experiments extended in the far ultraviolet range show that while native HMG-I is mainly random coiled in solution, it adopts a beta-turn conformation upon forming a 1:1 complex in which the protein first binds to one of two dA.dT stretches present in the duplex. HMG-I structure in the 1:1 complex is dependent on the sequence of its DNA target. A 3:1 HMG-I/DNA complex can also form and is characterized by a small increase in the DNA natural bend and/or compaction coupled to a change in the protein conformation, as determined from fluorescence resonance energy transfer (FRET) experiments. In addition, a peptide corresponding to an extended DNA-binding domain of HMG-I induces an ordered condensation of DNA duplexes. Based on the constraints derived from pyrene excimer measurements, we present a model of these nucleated structures. Our results illustrate an extreme case of protein structure induced by DNA conformation that may bear on the evolutionary conservation of the DNA-binding motifs of HMG-I. We discuss the functional relevance of the structural flexibility of HMG-I associated with the nature of its DNA targets and the implications of the binding stoichiometry for several aspects of chromatin structure and gene regulation. PMID:10777751
Genome Calligrapher: A Web Tool for Refactoring Bacterial Genome Sequences for de Novo DNA Synthesis.

PubMed

Christen, Matthias; Deutsch, Samuel; Christen, Beat

2015-08-21

Recent advances in synthetic biology have resulted in an increasing demand for the de novo synthesis of large-scale DNA constructs. Any process improvement that enables fast and cost-effective streamlining of digitized genetic information into fabricable DNA sequences holds great promise to study, mine, and engineer genomes. Here, we present Genome Calligrapher, a computer-aided design web tool intended for whole genome refactoring of bacterial chromosomes for de novo DNA synthesis. By applying a neutral recoding algorithm, Genome Calligrapher optimizes GC content and removes obstructive DNA features known to interfere with the synthesis of double-stranded DNA and the higher order assembly into large DNA constructs. Subsequent bioinformatics analysis revealed that synthesis constraints are prevalent among bacterial genomes. However, a low level of codon replacement is sufficient for refactoring bacterial genomes into easy-to-synthesize DNA sequences. To test the algorithm, 168 kb of synthetic DNA comprising approximately 20 percent of the synthetic essential genome of the cell-cycle bacterium Caulobacter crescentus was streamlined and then ordered from a commercial supplier of low-cost de novo DNA synthesis. The successful assembly into eight 20 kb segments indicates that the Genome Calligrapher algorithm can be efficiently used to refactor difficult-to-synthesize DNA. Genome Calligrapher is broadly applicable to recode biosynthetic pathways, DNA sequences, and whole bacterial genomes, thus offering new opportunities to use synthetic biology tools to explore the functionality of microbial diversity. The Genome Calligrapher web tool can be accessed at https://christenlab.ethz.ch/GenomeCalligrapher.
DNA-Encoded Solid-Phase Synthesis: Encoding Language Design and Complex Oligomer Library Synthesis.

PubMed

MacConnell, Andrew B; McEnaney, Patrick J; Cavett, Valerie J; Paegel, Brian M

2015-09-14

The promise of exploiting combinatorial synthesis for small molecule discovery remains unfulfilled due primarily to the "structure elucidation problem": the back-end mass spectrometric analysis that significantly restricts one-bead-one-compound (OBOC) library complexity. The very molecular features that confer binding potency and specificity, such as stereochemistry, regiochemistry, and scaffold rigidity, are conspicuously absent from most libraries because isomerism introduces mass redundancy and diverse scaffolds yield uninterpretable MS fragmentation. Here we present DNA-encoded solid-phase synthesis (DESPS), comprising parallel compound synthesis in organic solvent and aqueous enzymatic ligation of unprotected encoding dsDNA oligonucleotides. Computational encoding language design yielded 148 thermodynamically optimized sequences with Hamming string distance ≥ 3 and total read length <100 bases for facile sequencing. Ligation is efficient (70% yield), specific, and directional over 6 encoding positions. A series of isomers served as a testbed for DESPS's utility in split-and-pool diversification. Single-bead quantitative PCR detected 9 × 10^4 molecules/bead and sequencing allowed for elucidation of each compound's synthetic history. We applied DESPS to the combinatorial synthesis of a 75,645-member OBOC library containing scaffold, stereochemical and regiochemical diversity using mixed-scale resin (160-μm quality control beads and 10-μm screening beads). Tandem DNA sequencing/MALDI-TOF MS analysis of 19 quality control beads showed excellent agreement (<1 ppt) between DNA sequence-predicted mass and the observed mass. DESPS synergistically unites the advantages of solid-phase synthesis and DNA encoding, enabling single-bead structural elucidation of complex compounds and synthesis using reactions normally considered incompatible with unprotected DNA.
The widespread availability of inexpensive oligonucleotide synthesis, enzymes, DNA sequencing, and PCR make implementation of DESPS straightforward, and may prompt the chemistry community to revisit the synthesis of more complex and diverse libraries.
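The Hamming distance constraint mentioned in the abstract above (encoding sequences at pairwise distance ≥ 3) can be illustrated with a short sketch. A minimum distance of 3 means that a single substitution error in a read still leaves it closer to the intended code than to any other. The greedy selection below is an assumption made purely for illustration; it does not reproduce the authors' thermodynamic optimization, and all function names are hypothetical.

```python
from itertools import combinations, product


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(codes):
    """Smallest Hamming distance over all pairs in a code set."""
    return min(hamming(a, b) for a, b in combinations(codes, 2))


def greedy_code_set(candidates, min_distance=3):
    """Keep each candidate that is far enough from every code kept so far.

    Assumption: a simple greedy filter, not the published design procedure.
    """
    kept = []
    for candidate in candidates:
        if all(hamming(candidate, code) >= min_distance for code in kept):
            kept.append(candidate)
    return kept


# Toy example: select 5-base codes from all 4^5 possible sequences.
all_five_mers = ["".join(p) for p in product("ACGT", repeat=5)]
codes = greedy_code_set(all_five_mers, min_distance=3)
print(len(codes), min_pairwise_distance(codes))
```

A real encoding language would add further filters (for example melting temperature or secondary structure) before or after the distance check; those are omitted here.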
Variation in the number of nucleoli and incomplete homogenization of 18S ribosomal DNA sequences in leaf cells of the cultivated Oriental ginseng (Panax ginseng Meyer).

PubMed

Chelomina, Galina N; Rozhkovan, Konstantin V; Voronova, Anastasia N; Burundukova, Olga L; Muzarok, Tamara I; Zhuravlev, Yuri N

2016-04-01

Wild ginseng, Panax ginseng Meyer, is an endangered species of medicinal plants. In the present study, we analyzed variations within the ribosomal DNA (rDNA) cluster to gain insight into the genetic diversity of the Oriental ginseng, P. ginseng, under artificial cultivation. The roots of wild P. ginseng plants were sampled from a nonprotected natural population of the Russian Far East. The slides were prepared from leaf tissues using the squash technique for cytogenetic analysis. The 18S rDNA sequences were cloned and sequenced. The distribution of nucleotide diversity, recombination events, and interspecific phylogenies for the total 18S rDNA sequence data set was also examined. In mesophyll cells, mononucleolar nuclei were estimated to be dominant (75.7%), while the remaining nuclei contained two to four nucleoli. Among the analyzed 18S rDNA clones, 20% were identical to the 18S rDNA sequence of P. ginseng from Japan, and other clones differed by one to six substitutions. The nucleotide polymorphism was more pronounced at positions 440-640 bp, and was distributed in variable regions, expansion segments, and conservative elements of the core structure. The phylogenetic analysis confirmed conspecificity of ginseng plants cultivated in different regions, with two fixed mutations between P. ginseng and other species. This study identified evidence of intragenomic nucleotide polymorphism in the 18S rDNA sequences of P. ginseng. These data suggest that, in cultivated plants, the observed genome instability may influence the synthesis of biologically active compounds, which are widely used in traditional medicine.
Variation in the number of nucleoli and incomplete homogenization of 18S ribosomal DNA sequences in leaf cells of the cultivated Oriental ginseng (Panax ginseng Meyer)

PubMed Central

Chelomina, Galina N.; Rozhkovan, Konstantin V.; Voronova, Anastasia N.; Burundukova, Olga L.; Muzarok, Tamara I.; Zhuravlev, Yuri N.

2015-01-01

Background: Wild ginseng, Panax ginseng Meyer, is an endangered species of medicinal plants. In the present study, we analyzed variations within the ribosomal DNA (rDNA) cluster to gain insight into the genetic diversity of the Oriental ginseng, P. ginseng, under artificial cultivation. Methods: The roots of wild P. ginseng plants were sampled from a nonprotected natural population of the Russian Far East. The slides were prepared from leaf tissues using the squash technique for cytogenetic analysis. The 18S rDNA sequences were cloned and sequenced. The distribution of nucleotide diversity, recombination events, and interspecific phylogenies for the total 18S rDNA sequence data set was also examined. Results: In mesophyll cells, mononucleolar nuclei were estimated to be dominant (75.7%), while the remaining nuclei contained two to four nucleoli. Among the analyzed 18S rDNA clones, 20% were identical to the 18S rDNA sequence of P. ginseng from Japan, and other clones differed by one to six substitutions. The nucleotide polymorphism was more pronounced at positions 440–640 bp, and was distributed in variable regions, expansion segments, and conservative elements of the core structure. The phylogenetic analysis confirmed conspecificity of ginseng plants cultivated in different regions, with two fixed mutations between P. ginseng and other species. Conclusion: This study identified evidence of intragenomic nucleotide polymorphism in the 18S rDNA sequences of P. ginseng. These data suggest that, in cultivated plants, the observed genome instability may influence the synthesis of biologically active compounds, which are widely used in traditional medicine. PMID:27158239
Design and characterization of a nanopore-coupled polymerase for single-molecule DNA sequencing by synthesis on an electrode array

PubMed Central

Stranges, P. Benjamin; Palla, Mirkó; Kalachikov, Sergey; Nivala, Jeff; Dorwart, Michael; Trans, Andrew; Kumar, Shiv; Porel, Mintu; Chien, Minchen; Tao, Chuanjuan; Morozova, Irina; Li, Zengmin; Shi, Shundi; Aberra, Aman; Arnold, Cleoma; Yang, Alexander; Aguirre, Anne; Harada, Eric T.; Korenblum, Daniel; Pollard, James; Bhat, Ashwini; Gremyachinskiy, Dmitriy; Bibillo, Arek; Chen, Roger; Davis, Randy; Russo, James J.; Fuller, Carl W.; Roever, Stefan; Ju, Jingyue; Church, George M.

2016-01-01

Scalable, high-throughput DNA sequencing is a prerequisite for precision medicine and biomedical research. Recently, we presented a nanopore-based sequencing-by-synthesis (Nanopore-SBS) approach, which used a set of nucleotides with polymer tags that allow discrimination of the nucleotides in a biological nanopore. Here, we designed and covalently coupled a DNA polymerase to an α-hemolysin (αHL) heptamer using the SpyCatcher/SpyTag conjugation approach. These porin–polymerase conjugates were inserted into lipid bilayers on a complementary metal oxide semiconductor (CMOS)-based electrode array for high-throughput electrical recording of DNA synthesis. The designed nanopore construct successfully detected the capture of tagged nucleotides complementary to a DNA base on a provided template. We measured over 200 tagged-nucleotide signals for each of the four bases and developed a classification method to uniquely distinguish them from each other and background signals. The probability of falsely identifying a background event as a true capture event was less than 1.2%. In the presence of all four tagged nucleotides, we observed sequential additions in real time during polymerase-catalyzed DNA synthesis. Single-polymerase coupling to a nanopore, in combination with the Nanopore-SBS approach, can provide the foundation for a low-cost, single-molecule, electronic DNA-sequencing platform. PMID:27729524
Structure of homeodomain-leucine zipper/DNA complexes studied using hydroxyl radical cleavage of DNA and methylation interference.

PubMed

Tron, Adriana E; Comelli, Raúl N; Gonzalez, Daniel H

2005-12-27

Homeodomain-leucine zipper (HD-Zip) proteins, unlike most homeodomain proteins, bind a pseudopalindromic DNA sequence as dimers. We have investigated the structure of the DNA complexes formed by two HD-Zip proteins with different nucleotide preferences at the central position of the binding site using footprinting and interference methods. The results indicate that the respective complexes are not symmetric, with the strand bearing a central purine (top strand) showing higher protection around the central region and the bottom strand protected toward the 3' end. Binding to a sequence with a nonpreferred central base pair produces a decrease in protection in either the top or the bottom strand, depending upon the protein. Modeling studies derived from the complex formed by the monomeric Antennapedia homeodomain with DNA indicate that in the HD-Zip/DNA complex the recognition helix of one of the monomers is displaced within the major groove respective to the other one. This monomer seems to lose contacts with a part of the recognition sequence upon binding to the nonpreferred site. The results show that the structure of the complex formed by HD-Zip proteins with DNA is dependent upon both protein intrinsic characteristics and the nucleotides present at the central position of the recognition sequence.
Simultaneous identification of DNA and RNA viruses present in pig faeces using process-controlled deep sequencing.

PubMed

Sachsenröder, Jana; Twardziok, Sven; Hammerl, Jens A; Janczyk, Pawel; Wrede, Paul; Hertwig, Stefan; Johne, Reimar

2012-01-01

Animal faeces comprise a community of many different microorganisms including bacteria and viruses. Only scarce information is available about the diversity of viruses present in the faeces of pigs. Here we describe a protocol, which was optimized for the purification of the total fraction of viral particles from pig faeces. The genomes of the purified DNA and RNA viruses were simultaneously amplified by PCR and subjected to deep sequencing followed by bioinformatic analyses. The efficiency of the method was monitored using a process control consisting of three bacteriophages (T4, M13 and MS2) with different morphology and genome types. Defined amounts of the bacteriophages were added to the sample and their abundance was assessed by quantitative PCR during the preparation procedure. The procedure was applied to a pooled faecal sample of five pigs. From this sample, 69,613 sequence reads were generated. All of the added bacteriophages were identified by sequence analysis of the reads. In total, 7.7% of the reads showed significant sequence identities with published viral sequences. They mainly originated from bacteriophages (73.9%) and mammalian viruses (23.9%); 0.8% of the sequences showed identities to plant viruses. The most abundant detected porcine viruses were kobuvirus, rotavirus C, astrovirus, enterovirus B, sapovirus and picobirnavirus. In addition, sequences with identities to the chimpanzee stool-associated circular ssDNA virus were identified. Whole genome analysis indicates that this virus, tentatively designated as pig stool-associated circular ssDNA virus (PigSCV), represents a novel pig virus. The established protocol enables the simultaneous detection of DNA and RNA viruses in pig faeces including the identification of so far unknown viruses. It may be applied in studies investigating aetiology, epidemiology and ecology of diseases. 
The implemented process control serves as quality control, ensures comparability of the method and may be used for further method optimization.
Sequence-specific binding of counterions to B-DNA

PubMed Central

Denisov, Vladimir P.; Halle, Bertil

2000-01-01

Recent studies by x-ray crystallography, NMR, and molecular simulations have suggested that monovalent counterions can penetrate deeply into the minor groove of B form DNA. Such groove-bound ions potentially could play an important role in AT-tract bending and groove narrowing, thereby modulating DNA function in vivo. To address this issue, we report here 23Na magnetic relaxation dispersion measurements on oligonucleotides, including difference experiments with the groove-binding drug netropsin. The exquisite sensitivity of this method to ions in long-lived and intimate association with DNA allows us to detect sequence-specific sodium ion binding in the minor groove AT tract of three B-DNA dodecamers. The sodium ion occupancy is only a few percent, however, and therefore is not likely to contribute importantly to the ensemble of B-DNA structures. We also report results of ion competition experiments, indicating that potassium, rubidium, and cesium ions bind to the minor groove with similarly weak affinity as sodium ions, whereas ammonium ion binding is somewhat stronger. The present findings are discussed in the light of previous NMR and diffraction studies of sequence-specific counterion binding to DNA. PMID:10639130
Application of Quaternion in improving the quality of global sequence alignment scores for an ambiguous sequence target in Streptococcus pneumoniae DNA

NASA Astrophysics Data System (ADS)

Lestari, D.; Bustamam, A.; Novianti, T.; Ardaneswari, G.

2017-07-01

A DNA sequence can be defined as a succession of letters representing the order of nucleotides within DNA, using the four DNA base codes adenine (A), guanine (G), cytosine (C), and thymine (T). The precise code of a sequence is determined using DNA sequencing methods and technologies, which have been developed since the 1970s and have now become advanced, high-throughput technologies. DNA sequencing has greatly accelerated biological and medical research and discovery. However, in some cases DNA sequencing produces ambiguous results, making it difficult to determine whether a given code is A, T, G, or C. To solve this problem, in this study we introduce another representation of DNA codes, namely the Quaternion Q = (PA, PT, PG, PC), where PA, PT, PG, PC are the probabilities that the bases A, T, G, C appear in Q, and PA + PT + PG + PC = 1. Using Quaternion representations, we are able to construct an improved scoring matrix for global sequence alignment by applying a dot product method. This scoring matrix produces better-quality match and mismatch scores between two DNA base codes. In our implementation, we applied the Needleman-Wunsch global sequence alignment algorithm using Octave to analyze our target sequence, which contains some ambiguous sequence data. The subject sequences are DNA sequences of Streptococcus pneumoniae families obtained from GenBank, while the target DNA sequence was received from our collaborator's database. We found that the Quaternion representations improve the quality of the sequence alignment score, and we conclude that the target DNA sequence has maximum similarity with Streptococcus pneumoniae.
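The scoring idea in the abstract above can be sketched in a few lines: each base code is a four-component probability vector Q = (PA, PT, PG, PC), and the similarity of two codes is the dot product of their vectors. The sketch below is written in Python rather than the Octave used by the authors, and it rests on several assumptions that are not in the abstract: ambiguous codes get uniform probabilities over their possible bases, the dot product is linearly rescaled between a mismatch and a match score, and the gap penalty is linear. All names (`QUATERNION`, `substitution_score`, `needleman_wunsch_score`) are illustrative.

```python
# Probability vectors in the order (PA, PT, PG, PC).
# Assumption: ambiguous codes use uniform probabilities over allowed bases.
QUATERNION = {
    "A": (1.0, 0.0, 0.0, 0.0),
    "T": (0.0, 1.0, 0.0, 0.0),
    "G": (0.0, 0.0, 1.0, 0.0),
    "C": (0.0, 0.0, 0.0, 1.0),
    "R": (0.5, 0.0, 0.5, 0.0),      # A or G
    "Y": (0.0, 0.5, 0.0, 0.5),      # T or C
    "N": (0.25, 0.25, 0.25, 0.25),  # any base
}


def substitution_score(x, y, match=1.0, mismatch=-1.0):
    """Score two base codes from the dot product of their vectors.

    Assumption: the dot product (0..1) is mapped linearly so that identical
    unambiguous bases score `match` and different ones score `mismatch`.
    """
    dot = sum(p * q for p, q in zip(QUATERNION[x], QUATERNION[y]))
    return mismatch + (match - mismatch) * dot


def needleman_wunsch_score(s, t, gap=-2.0):
    """Optimal global alignment score with a linear gap penalty."""
    prev = [j * gap for j in range(len(t) + 1)]
    for i in range(1, len(s) + 1):
        curr = [i * gap] + [0.0] * len(t)
        for j in range(1, len(t) + 1):
            curr[j] = max(
                prev[j - 1] + substitution_score(s[i - 1], t[j - 1]),  # align
                prev[j] + gap,                                         # gap in t
                curr[j - 1] + gap,                                     # gap in s
            )
        prev = curr
    return prev[len(t)]


print(needleman_wunsch_score("ACGT", "ACGT"))
print(needleman_wunsch_score("ACGT", "ACNT"))
```

Under these assumptions an ambiguous position contributes a score between a full mismatch and a full match, rather than being penalized as a definite mismatch, which is the kind of graded scoring the abstract describes.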
50 years of DNA ‘Breathing’: Reflections on Old and New Approaches

PubMed Central

von Hippel, Peter H.; Johnson, Neil P.; Marcus, Andrew H.

2015-01-01

Summary The coding sequences for genes, and much other regulatory information involved in genome expression, are located ‘inside’ the DNA duplex. Thus the ‘macromolecular machines’ that read-out this information from the base sequence of the DNA must somehow access the DNA ‘interior’. Double-stranded (ds) DNA is a highly structured and cooperatively stabilized system at physiological temperatures, but is also only marginally stable and undergoes a cooperative ‘melting phase transition’ at temperatures not far above physiological. Furthermore, due to its length and heterogeneous sequence, with AT-rich segments being less stable than GC-rich segments, the DNA genome ‘melts’ in a multistate fashion. Therefore the DNA genome must also manifest thermally driven structural (‘breathing’) fluctuations at physiological temperatures that should reflect the heterogeneity of the dsDNA stability near the melting temperature. Thus many of the breathing fluctuations of dsDNA are likely also to be sequence dependent, and could well contain information that should be ‘readable’ and useable by regulatory proteins and protein complexes in site-specific binding reactions involving dsDNA ‘opening’. Our laboratory has been involved in studying the breathing fluctuations of duplex DNA for about 50 years. In this ‘Reflections’ article we present a relatively chronological overview of these studies, starting with the use of simple chemical probes (such as hydrogen exchange, formaldehyde and simple DNA ‘melting’ proteins) to examine the local stability of the dsDNA structure, and culminating in sophisticated spectroscopic approaches that can be used to monitor the breathing-dependent interactions of regulatory complexes with their duplex DNA targets in ‘real time’. PMID:23840028
The Organization of Repetitive DNA in the Genomes of Amazonian Lizard Species in the Family Teiidae.

PubMed

Carvalho, Natalia D M; Pinheiro, Vanessa S S; Carmo, Edson J; Goll, Leonardo G; Schneider, Carlos H; Gross, Maria C

2015-01-01

Repetitive DNA is the largest fraction of the eukaryote genome and comprises tandem and dispersed sequences. It presents variations in relation to its composition, number of copies, distribution, dynamics, and genome organization, and participates in the evolutionary diversification of different vertebrate species. Repetitive sequences are usually located in the heterochromatin of centromeric and telomeric regions of chromosomes, contributing to chromosomal structures. Therefore, the aim of this study was to physically map repetitive DNA sequences (5S rDNA, telomeric sequences, tropomyosin gene 1, and retroelements Rex1 and SINE) of mitotic chromosomes of Amazonian species of teiids (Ameiva ameiva, Cnemidophorus sp. 1, Kentropyx calcarata, Kentropyx pelviceps, and Tupinambis teguixin) to understand their genome organization and karyotype evolution. The mapping of repetitive sequences revealed a distinct pattern in Cnemidophorus sp. 1, whereas the other species showed all sequences interspersed in the heterochromatic region. Physical mapping of the tropomyosin 1 gene was performed for the first time in lizards and showed that in addition to being functional, this gene has a structural function similar to the mapped repetitive elements as it is located preferentially in centromeric regions and termini of chromosomes. © 2016 S. Karger AG, Basel.
Characterization of Fasciola samples by ITS of rDNA sequences revealed the existence of Fasciola hepatica and Fasciola gigantica in Yunnan Province, China.

PubMed

Shu, Fan-Fan; Lv, Rui-Qing; Zhang, Yi-Fang; Duan, Gang; Wu, Ding-Yu; Li, Bi-Feng; Yang, Jian-Fa; Zou, Feng-Cai

2012-08-01

On mainland China, liver flukes of Fasciola spp. (Digenea: Fasciolidae) can cause serious acute and chronic morbidity in numerous species of mammals such as sheep, goats, cattle, and humans. The objective of the present study was to examine the taxonomic identity of Fasciola species in Yunnan province by sequences of the first and second internal transcribed spacers (ITS-1 and ITS-2) of nuclear ribosomal DNA (rDNA). The ITS rDNA was amplified from 10 samples representing Fasciola species in cattle from 2 geographical locations in Yunnan Province, by polymerase chain reaction (PCR), and the products were sequenced directly. The lengths of the ITS-1 and ITS-2 sequences were 422 and 361-362 base pairs, respectively, for all samples sequenced. Using ITS sequences, 2 Fasciola species were revealed, namely Fasciola hepatica and Fasciola gigantica. This is the first demonstration of F. gigantica in cattle in Yunnan Province, China using a molecular approach; our findings have implications for studying the population genetic characterization of the Chinese Fasciola species and for the prevention and control of Fasciola spp. in this province.
Diff-seq: A high throughput sequencing-based mismatch detection assay for DNA variant enrichment and discovery

PubMed Central

Karas, Vlad O; Sinnott-Armstrong, Nicholas A; Varghese, Vici; Shafer, Robert W; Greenleaf, William J; Sherlock, Gavin

2018-01-01

Abstract Much of the within species genetic variation is in the form of single nucleotide polymorphisms (SNPs), typically detected by whole genome sequencing (WGS) or microarray-based technologies. However, WGS produces mostly uninformative reads that perfectly match the reference, while microarrays require genome-specific reagents. We have developed Diff-seq, a sequencing-based mismatch detection assay for SNP discovery without the requirement for specialized nucleic-acid reagents. Diff-seq leverages the Surveyor endonuclease to cleave mismatched DNA molecules that are generated after cross-annealing of a complex pool of DNA fragments. Sequencing libraries enriched for Surveyor-cleaved molecules result in increased coverage at the variant sites. Diff-seq detected all mismatches present in an initial test substrate, with specific enrichment dependent on the identity and context of the variation. Application to viral sequences resulted in increased observation of variant alleles in a biologically relevant context. Diff-Seq has the potential to increase the sensitivity and efficiency of high-throughput sequencing in the detection of variation. PMID:29361139
ATP hydrolysis provides functions that promote rejection of pairings between different copies of long repeated sequences

PubMed Central

Danilowicz, Claudia; Hermans, Laura; Coljee, Vincent; Prévost, Chantal

2017-01-01

Abstract During DNA recombination and repair, RecA family proteins must promote rapid joining of homologous DNA. Repeated sequences with >100 base pair lengths occupy more than 1% of bacterial genomes; however, commitment to strand exchange was believed to occur after testing ∼20–30 bp. If that were true, pairings between different copies of long repeated sequences would usually become irreversible. Our experiments reveal that in the presence of ATP hydrolysis even 75 bp sequence-matched strand exchange products remain quite reversible. Experiments also indicate that when ATP hydrolysis is present, flanking heterologous dsDNA regions increase the reversibility of sequence matched strand exchange products with lengths up to ∼75 bp. Results of molecular dynamics simulations provide insight into how ATP hydrolysis destabilizes strand exchange products. These results inspired a model that shows how pairings between long repeated sequences could be efficiently rejected even though most homologous pairings form irreversible products. PMID:28854739
Structure of the highly repeated, long interspersed DNA family (LINE or L1Rn) of the rat.

PubMed Central

D'Ambrosio, E; Waitzkin, S D; Witney, F R; Salemme, A; Furano, A V

1986-01-01

We present the DNA sequence of a 6.7-kilobase member of the rat long interspersed repeated DNA family (LINE or L1Rn). This member (LINE 3) is flanked by a perfect 14-base-pair (bp) direct repeat and is a full-length, or close-to-full-length, member of this family. LINE 3 contains an approximately 100-bp A-rich right end, a number of long (greater than 400-bp) open reading frames, and a ca. 200-bp G + C-rich (ca. 60%) cluster near each terminus. Comparison of the LINE 3 sequence with the sequence of about one-half of another member, which we also present, as well as restriction enzyme analysis of the genomic copies of this family, indicates that in length and overall structure LINE 3 is quite typical of the 40,000 or so other genomic members of this family which would account for as much as 10% of the rat genome. Therefore, the rat LINE family is relatively homogeneous, which contrasts with the heterogeneous LINE families in primates and mice. Transcripts corresponding to the entire LINE sequence are abundant in the nuclear RNA of rat liver. The characteristics of the rat LINE family are discussed with respect to the possible function and evolution of this family of DNA sequences. PMID:3023845

Large-Scale Concatenation cDNA Sequencing

PubMed Central

Yu, Wei; Andersson, Björn; Worley, Kim C.; Muzny, Donna M.; Ding, Yan; Liu, Wen; Ricafrente, Jennifer Y.; Wentland, Meredith A.; Lennon, Greg; Gibbs, Richard A.

1997-01-01

A total of 100 kb of DNA derived from 69 individual human brain cDNA clones of 0.7–2.0 kb were sequenced by concatenated cDNA sequencing (CCS), whereby multiple individual DNA fragments are sequenced simultaneously in a single shotgun library. The method yielded accurate sequences and a similar efficiency compared with other shotgun libraries constructed from single DNA fragments (>20 kb). Computer analyses were carried out on 65 cDNA clone sequences and their corresponding end sequences to examine both nucleic acid and amino acid sequence similarities in the databases. Thirty-seven clones revealed no DNA database matches, 12 clones generated exact matches (≥98% identity), and 16 clones generated nonexact matches (57%–97% identity) to either known human or other species genes. Of those 28 matched clones, 8 had corresponding end sequences that failed to identify similarities. In a protein similarity search, 27 clone sequences displayed significant matches, whereas only 20 of the end sequences had matches to known protein sequences. Our data indicate that full-length cDNA insert sequences provide significantly more nucleic acid and protein sequence similarity matches than expressed sequence tags (ESTs) for database searching. [All 65 cDNA clone sequences described in this paper have been submitted to the GenBank data library under accession nos. U79240–U79304.] PMID:9110174
Phylogenetic analysis of mtDNA lineages in South American mummies.

PubMed

Monsalve, M V; Cardenas, F; Guhl, F; Delaney, A D; Devine, D V

1996-07-01

Some studies of mtDNA propose that contemporary Amerindians have descended from four haplotype groups, each defined by specific sets of polymorphisms. One recent study also found evidence of other potential founder haplotypes. We wanted to determine whether the four haplotypes in modern populations were also present in ancient South American aboriginals. We subjected mtDNA from Colombian mummies (470 to 1849 AD) to PCR amplification and restriction endonuclease analysis. The mtDNA D-loop region was surveyed for sequence variation by restriction analysis and a segment of this region was sequenced for each mummy to characterize the haplotypes. Our mummies exhibited three of the four major characteristic haplotypes of Amerindian populations defined by four markers. With sequence data obtained in the ancient samples and published data on contemporary Amerindians it was possible to infer the origin of these six mummies.
Discrete Ramanujan transform for distinguishing the protein coding regions from other regions.

PubMed

Hua, Wei; Wang, Jiasong; Zhao, Jian

2014-01-01

Based on the study of the Ramanujan sum and Ramanujan coefficient, this paper proposes the concepts of the discrete Ramanujan transform and spectrum. Using the Voss numerical representation, one maps a symbolic DNA strand to a numerical DNA sequence and deduces the discrete Ramanujan spectrum of the numerical DNA sequence. It is well known that the discrete Fourier power spectrum of a protein coding sequence exhibits 3-base periodicity, a feature widely used for DNA sequence analysis with the discrete Fourier transform. That analysis uses the signal-to-noise ratio at frequency N/3 as a criterion, where N is the length of the sequence. The results presented in this paper show that the property of 3-base periodicity can only be identified as a prominent spike of the discrete Ramanujan spectrum at period 3 for the protein coding regions. The signal-to-noise ratio for the discrete Ramanujan spectrum is defined for numerical measurement. Therefore, the discrete Ramanujan spectrum and the signal-to-noise ratio of a DNA sequence can be used for distinguishing the protein coding regions from the noncoding regions. All the exon and intron sequences in whole chromosomes 1, 2, 3 and 4 of Caenorhabditis elegans have been tested, and the histograms and tables from the computational results illustrate the reliability of our method. In addition, theoretical analysis shows that the algorithm for calculating the discrete Ramanujan spectrum has lower computational complexity and higher computational accuracy. The computational experiments show that using the discrete Ramanujan spectrum to classify different DNA sequences is a fast and effective method. Copyright © 2014 Elsevier Ltd. All rights reserved.
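The Fourier baseline that this abstract compares against (Voss indicator sequences, total power at frequency N/3, signal-to-noise ratio) can be sketched in plain Python as follows. This illustrates the well-known DFT criterion only, not the paper's discrete Ramanujan transform. Taking the noise level as the mean power over frequencies 1 to N-1 is an assumption, and the function names are hypothetical.

```python
import cmath


def voss_indicators(seq):
    """Voss representation: one binary indicator sequence per base."""
    return {b: [1.0 if c == b else 0.0 for c in seq] for b in "ACGT"}


def power_at(seq, k):
    """Total DFT power at frequency index k, summed over the four indicators."""
    n = len(seq)
    total = 0.0
    for u in voss_indicators(seq).values():
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, x in enumerate(u))
        total += abs(coeff) ** 2
    return total


def snr_period3(seq):
    """Power at frequency N/3 divided by the mean power over k = 1..N-1."""
    seq = seq.upper()
    seq = seq[: len(seq) - len(seq) % 3]  # truncate so that N/3 is an integer
    n = len(seq)
    if n < 3:
        raise ValueError("need at least three bases")
    spectrum = [power_at(seq, k) for k in range(1, n)]
    mean_power = sum(spectrum) / len(spectrum)
    return spectrum[n // 3 - 1] / mean_power
```

A perfectly codon-periodic toy sequence such as "ATG" repeated 20 times concentrates all non-DC power at frequencies N/3 and 2N/3, giving a large ratio, whereas a period-2 sequence such as "AC" repeated 30 times gives a ratio near zero. The direct summation is quadratic in N and intended for short illustrative inputs only.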
Synthesis of DNA

DOEpatents

Mariella, Jr., Raymond P.

2008-11-18

A method of synthesizing a desired double-stranded DNA of a predetermined length and of a predetermined sequence. Preselected sequence segments that will complete the desired double-stranded DNA are determined. Preselected segment sequences of DNA that will be used to complete the desired double-stranded DNA are provided. The preselected segment sequences of DNA are assembled to produce the desired double-stranded DNA.
A new and fast method for preparing high quality lambda DNA suitable for sequencing.

PubMed Central

Manfioletti, G; Schneider, C

1988-01-01

A method is described for the rapid purification of high quality lambda DNA. The method can be used from either liquid or plate lysates and on a small scale or a large scale. It relies on the preadsorption of all polyanions present in the lysate to an "insoluble" anion-exchange matrix (DEAE or TEAE). Phage particles are then disrupted by combined treatment with EDTA/proteinase K and the resulting DNA is precipitated by the addition of the cationic detergent cetyl (or hexadecyl)-trimethyl ammonium bromide-CTAB ("soluble" anion-exchange matrix). The precipitated CTAB-DNA complex is then exchanged to Na-DNA and ethanol precipitated. The resultant purified DNA is suitable for enzymatic reactions and provides a high quality template for dideoxy-sequence analysis. PMID:2966928
Nanopore Technology: A Simple, Inexpensive, Futuristic Technology for DNA Sequencing.

PubMed

Gupta, P D

2016-10-01

In health care, the importance of DNA sequencing has been fully established. Sanger's capillary electrophoresis DNA sequencing methodology is time-consuming and cumbersome, and hence more expensive. Lately, because of its versatility, DNA sequencing has become a household name, and there is therefore an urgent need for a simple, fast, inexpensive DNA sequencing technology. At the beginning of this century efforts were made, and nanopore DNA sequencing technology was developed; it is still in its infancy, but it is nevertheless the futuristic technology.
Phylogenetic analysis of Demodex caprae based on mitochondrial 16S rDNA sequence.

PubMed

Zhao, Ya-E; Hu, Li; Ma, Jun-Xian

2013-11-01

Demodex caprae infests the hair follicles and sebaceous glands of goats worldwide, which not only seriously impairs goat farming, but also causes a big economic loss. However, there are few reports on the DNA level of D. caprae. To reveal the taxonomic position of D. caprae within the genus Demodex, the present study conducted phylogenetic analysis of D. caprae based on mt16S rDNA sequence data. D. caprae adults and eggs were obtained from a skin nodule of the goat suffering demodicidosis. The mt16S rDNA sequences of individual mite were amplified using specific primers, and then cloned, sequenced, and aligned. The sequence divergence, genetic distance, and transition/transversion rate were computed, and the phylogenetic trees in Demodex were reconstructed. Results revealed the 339-bp partial sequences of six D. caprae isolates were obtained, and the sequence identity was 100% among isolates. The pairwise divergences between D. caprae and Demodex canis or Demodex folliculorum or Demodex brevis were 22.2-24.0%, 24.0-24.9%, and 22.9-23.2%, respectively. The corresponding average genetic distances were 2.840, 2.926, and 2.665, and the average transition/transversion rates were 0.70, 0.55, and 0.54, respectively. The divergences, genetic distances, and transition/transversion rates of D. caprae versus the other three species all reached interspecies level. The five phylogenetic trees all presented that D. caprae clustered with D. brevis first, and then with D. canis, D. folliculorum, and Demodex injai in sequence. In conclusion, D. caprae is an independent species, and it is closer to D. brevis than to D. canis, D. folliculorum, or D. injai.
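The pairwise quantities reported in this abstract (sequence divergence and transition/transversion rate) can be illustrated for two pre-aligned sequences with a short Python sketch. It computes only the uncorrected proportion of differing sites. The model-based genetic distances quoted in the abstract are not reproduced here, and the function name and the choice to skip gapped or ambiguous columns are assumptions.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def compare_aligned(a, b):
    """Return (divergence, transitions, transversions, ts/tv ratio).

    Columns containing anything other than A, C, G, or T in either
    sequence (gaps, ambiguity codes) are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    ts = tv = compared = 0
    for x, y in zip(a.upper(), b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue
        compared += 1
        if x == y:
            continue
        # A<->G and C<->T are transitions; every other change is a transversion.
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            ts += 1
        else:
            tv += 1
    divergence = (ts + tv) / compared if compared else 0.0
    ratio = ts / tv if tv else None  # undefined when there are no transversions
    return divergence, ts, tv, ratio
```

For example, `compare_aligned("AAAA", "AGCA")` finds one transition (A to G) and one transversion (A to C) among four compared sites.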
Threading DNA through nanopores for biosensing applications

NASA Astrophysics Data System (ADS)

Fyta, Maria

2015-07-01

This review outlines the recent achievements in the field of nanopore research. Nanopores are typically used in single-molecule experiments and are believed to have a high potential to realize an ultra-fast and very cheap genome sequencer. Here, the various types of nanopore materials, ranging from biological to 2D nanopores, are discussed together with their advantages and disadvantages. These nanopores can utilize different protocols to read out the DNA nucleobases. Although the first nanopore devices have reached the market, many still have issues which do not allow a full realization of a nanopore sequencer able to sequence the human genome in about a day. Ways to control the DNA, its dynamics and speed as the biomolecule translocates the nanopore in order to increase the signal-to-noise ratio in the reading-out process are examined in this review. Finally, the advantages, as well as the drawbacks in distinguishing the DNA nucleotides, i.e., the genetic information, are presented in view of their importance in the field of nanopore sequencing.
Using long ssDNA polynucleotides to amplify STRs loci in degraded DNA samples

PubMed Central

Pérez Santángelo, Agustín; Corti Bielsa, Rodrigo M.; Sala, Andrea; Ginart, Santiago; Corach, Daniel

2017-01-01

Obtaining informative short tandem repeat (STR) profiles from degraded DNA samples is a challenging task usually undermined by locus or allele dropouts and peak-height imbalances observed in capillary electrophoresis (CE) electropherograms, especially for those markers with large amplicon sizes. We hereby show that the current STR assays may be greatly improved for the detection of genetic markers in degraded DNA samples by using long single stranded DNA polynucleotides (ssDNA polynucleotides) as surrogates for PCR primers. These long primers allow a closer annealing to the repeat sequences, thereby reducing the length of the template required for the amplification in fragmented DNA samples, while at the same time rendering amplicons of larger sizes suitable for multiplex assays. We also demonstrate that the annealing of long ssDNA polynucleotides does not need to be fully complementary in the 5’ region of the primers, thus allowing for the design of practically any long primer sequence for developing new multiplex assays. Furthermore, genotyping of intact DNA samples could also benefit from utilizing long primers since their close annealing to the target STR sequences may overcome wrong profiling generated by insertions/deletions present between the STR region and the annealing site of the primers. Additionally, long ssDNA polynucleotides might be utilized in multiplex PCR assays for other types of degraded or fragmented DNA, e.g. circulating, cell-free DNA (ccfDNA). PMID:29099837
Using mobile sequencers in an academic classroom

PubMed Central

Zaaijer, Sophie; Erlich, Yaniv

2016-01-01

The advent of mobile DNA sequencers has made it possible to generate DNA sequencing data outside of laboratories and genome centers. Here, we report our experience of using the MinION, a mobile sequencer, in a 13-week academic course for undergraduate and graduate students. The course consisted of theoretical sessions that presented fundamental topics in genomics and several applied hackathon sessions. In these hackathons, the students used MinION sequencers to generate and analyze their own data and gain hands-on experience in the topics discussed in the theoretical classes. The manuscript describes the structure of our class, the educational material, and the lessons we learned in the process. We hope that the knowledge and material presented here will provide the community with useful tools to help educate future generations of genome scientists. DOI: http://dx.doi.org/10.7554/eLife.14258.001 PMID:27054412
Analysis of the DNA sequence of a 15,500 bp fragment near the left telomere of chromosome XV from Saccharomyces cerevisiae reveals a putative sugar transporter, a carboxypeptidase homologue and two new open reading frames.

PubMed

Gamo, F J; Lafuente, M J; Casamayor, A; Ariño, J; Aldea, M; Casas, C; Herrero, E; Gancedo, C

1996-06-15

We report the sequence of a 15.5 kb DNA segment located near the left telomere of chromosome XV of Saccharomyces cerevisiae. The sequence contains nine open reading frames (ORFs) longer than 300 bp. Three of them are internal to other ones. One corresponds to the gene LGT3 that encodes a putative sugar transporter. Three adjacent ORFs were separated by two stop codons in frame. These ORFs presented homology with the gene CPS1 that encodes carboxypeptidase S. The stop codons were not found in the same sequence derived from another yeast strain. Two other ORFs without significant homology in databases were also found. One of them, O0420, is very rich in serine and threonine and presents a series of repeated or similar amino acid stretches along the sequence.
Visual ModuleOrganizer: a graphical interface for the detection and comparative analysis of repeat DNA modules

PubMed Central

2014-01-01

Background DNA repeats, such as transposable elements, minisatellites and palindromic sequences, are abundant in sequences and have been shown to have significant and functional roles in the evolution of the host genomes. In a previous study, we introduced the concept of a repeat DNA module, a flexible motif present in at least two occurrences in the sequences. This concept was embedded into ModuleOrganizer, a tool allowing the detection of repeat modules in a set of sequences. However, its implementation remains difficult for larger sequences. Results Here we present Visual ModuleOrganizer, a Java graphical interface that enables a new and optimized version of the ModuleOrganizer tool. To implement this version, it was recoded in C++ with compressed suffix tree data structures. This leads to less memory usage (at least a 120-fold decrease on average) and reduces the computation time by at least a factor of four during the module detection process in large sequences. The Visual ModuleOrganizer interface allows users to easily choose ModuleOrganizer parameters and to graphically display the results. Moreover, Visual ModuleOrganizer dynamically handles graphical results through four main parameters: gene annotations, overlapping modules with known annotations, location of the module in a minimal number of sequences, and the minimal length of the modules. As a case study, the analysis of FoldBack4 sequences clearly demonstrated that our tools can be extended to comparative and evolutionary analyses of any repeat sequence elements in a set of genomic sequences. With the increasing number of sequences available in public databases, it is now possible to perform comparative analyses of repeated DNA modules in a graphic and friendly manner within a reasonable time period. Availability Visual ModuleOrganizer interface and the new version of the ModuleOrganizer tool are freely available at: http://lcb.cnrs-mrs.fr/spip.php?rubrique313. PMID:24678954
Quantum sequencing: opportunities and challenges

NASA Astrophysics Data System (ADS)

di Ventra, Massimiliano

Personalized or precision medicine refers to the ability to tailor drugs to the specific genome and transcriptome of each individual. It is, however, not yet feasible due to the high costs and slow speed of present DNA sequencing methods. I will discuss a sequencing protocol that requires the measurement of the distributions of transverse tunneling currents during the translocation of single-stranded DNA into nanochannels. I will show that such a quantum sequencing approach can reach unprecedented speeds, without requiring any chemical preparation, amplification or labeling. I will discuss recent experiments that support these theoretical predictions, the advantages of this approach over other sequencing methods, and stress the challenges that need to be overcome to render it commercially viable.
Entropic Profiler – detection of conservation in genomes using information theory

PubMed Central

Fernandes, Francisco; Freitas, Ana T; Almeida, Jonas S; Vinga, Susana

2009-01-01

Background In the last decades, with the successive availability of whole genome sequences, many research efforts have been made to mathematically model DNA. Entropic Profiles (EP) were proposed recently as a new measure of continuous entropy of genome sequences. EP represent local information plots related to DNA randomness and are based on information theory and statistical concepts. They express the weighed relative abundance of motifs for each position in genomes. Their study is very relevant because under- or over-represented segments are often associated with significant biological meaning. Findings The Entropic Profiler application presented here is a new tool designed to detect and extract under- and over-represented DNA segments in genomes by using EP. It computes EP very efficiently by resorting to improved algorithms and data structures, which include modified suffix trees. Available through a web interface and as downloadable source code, it allows users to study positions and to search for motifs inside the whole sequence or within a specified range. DNA sequences can be entered from different sources, including FASTA files and pre-loaded examples, or by resuming previously saved work. Besides the EP value plots, p-values and z-scores for each motif are also computed, along with the Chaos Game Representation of the sequence. Conclusion EP are directly related to the statistical significance of motifs and can be considered as a new method to extract and classify significant regions in genomes and estimate local scales in DNA. The present implementation establishes an efficient and useful tool for whole genome analysis. PMID:19416538
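The Chaos Game Representation mentioned in this abstract can be sketched in a few lines of Python. The corner assignment below (A, C, G, T at the corners of the unit square) and the starting point at the centre are one common convention and are assumptions here. The abstract does not state the layout the tool uses, and the EP computation itself is not reproduced.

```python
# Assumed corner layout on the unit square.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def chaos_game(seq):
    """Return one (x, y) point per base.

    Each point lies halfway between the previous point and the corner
    assigned to the current base. Symbols other than A, C, G, T raise KeyError.
    """
    x, y = 0.5, 0.5  # start at the centre of the square
    points = []
    for base in seq.upper():
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points
```

Because every step halves the distance to a corner, all occurrences of a given k-letter suffix land in the same sub-square of side 2^-k, which is why the density of a CGR plot reflects motif abundance.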
PISMA: A Visual Representation of Motif Distribution in DNA Sequences.

PubMed

Alcántara-Silva, Rogelio; Alvarado-Hermida, Moisés; Díaz-Contreras, Gibrán; Sánchez-Barrios, Martha; Carrera, Samantha; Galván, Silvia Carolina

2017-01-01

Because the graphical presentation and analysis of motif distribution can provide insights for experimental hypothesis, PISMA aims at identifying motifs on DNA sequences, counting and showing them graphically. The motif length ranges from 2 to 10 bases, and the DNA sequences range up to 10 kb. The motif distribution is shown as a bar-code-like, as a gene-map-like, and as a transcript scheme. We obtained graphical schemes of the CpG site distribution from 91 human papillomavirus genomes. Also, we present 2 analyses: one of DNA motifs associated with either methylation-resistant or methylation-sensitive CpG islands and another analysis of motifs associated with exosome RNA secretion. PISMA is developed in Java; it is executable in any type of hardware and in diverse operating systems. PISMA is freely available to noncommercial users. The English version and the User Manual are provided in Supplementary Files 1 and 2, and a Spanish version is available at www.biomedicas.unam.mx/wp-content/software/pisma.zip and www.biomedicas.unam.mx/wp-content/pdf/manual/pisma.pdf.
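The kind of motif search and bar-code-like display described in this abstract can be illustrated with a small Python sketch. It is not PISMA's code (PISMA is written in Java). The function names are hypothetical, counting overlapping occurrences is an assumption, and only the 2 to 10 base motif length limit is taken from the abstract.

```python
def motif_positions(seq, motif):
    """Return the 0-based start positions of every (overlapping) occurrence."""
    if not 2 <= len(motif) <= 10:
        raise ValueError("motif length must be between 2 and 10 bases")
    seq, motif = seq.upper(), motif.upper()
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)  # step by one to allow overlaps
    return positions


def barcode(seq, motif):
    """Text bar code: '|' where the motif starts, '.' elsewhere."""
    marks = ["."] * len(seq)
    for p in motif_positions(seq, motif):
        marks[p] = "|"
    return "".join(marks)
```

For instance, `barcode("ACGCGCG", "CG")` marks the three CpG sites at positions 1, 3, and 5 as `.|.|.|.`.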
PISMA: A Visual Representation of Motif Distribution in DNA Sequences

PubMed Central

Alcántara-Silva, Rogelio; Alvarado-Hermida, Moisés; Díaz-Contreras, Gibrán; Sánchez-Barrios, Martha; Carrera, Samantha; Galván, Silvia Carolina

2017-01-01

Background: Because the graphical presentation and analysis of motif distribution can provide insights for experimental hypothesis, PISMA aims at identifying motifs on DNA sequences, counting and showing them graphically. The motif length ranges from 2 to 10 bases, and the DNA sequences range up to 10 kb. The motif distribution is shown as a bar-code–like, as a gene-map–like, and as a transcript scheme. Results: We obtained graphical schemes of the CpG site distribution from 91 human papillomavirus genomes. Also, we present 2 analyses: one of DNA motifs associated with either methylation-resistant or methylation-sensitive CpG islands and another analysis of motifs associated with exosome RNA secretion. Availability and Implementation: PISMA is developed in Java; it is executable in any type of hardware and in diverse operating systems. PISMA is freely available to noncommercial users. The English version and the User Manual are provided in Supplementary Files 1 and 2, and a Spanish version is available at www.biomedicas.unam.mx/wp-content/software/pisma.zip and www.biomedicas.unam.mx/wp-content/pdf/manual/pisma.pdf. PMID:28469418
cDNA cloning of the human peroxisomal enoyl-CoA hydratase: 3-Hydroxyacyl-CoA dehydrogenase bifunctional enzyme and localization to chromosome 3q26.3-3q28: A free left Alu arm is inserted in the 3′ noncoding region

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hoefler, G.; Forstner, M.; Hulla, W.

1994-01-01

Enoyl-CoA hydratase:3-hydroxyacyl-CoA dehydrogenase bifunctional enzyme is one of the four enzymes of the peroxisomal β-oxidation pathway. Here, the authors report the full-length human cDNA sequence and the localization of the corresponding gene on chromosome 3q26.3-3q28. The cDNA sequence spans 3779 nucleotides with an open reading frame of 2169 nucleotides. The tripeptide SKL at the carboxy terminus, known to serve as a peroxisomal targeting signal, is present. DNA sequence comparison of the coding region showed an 80% homology between human and rat bifunctional enzyme cDNA. The 3′ noncoding sequence contains 117 nucleotides homologous to an Alu repeat. Based on sequence comparison, they propose that these nucleotides are a free left Alu arm with 86% homology to the Alu-J family. RNA analysis shows one band with highest intensity in liver and kidney. This cDNA will allow in-depth studies of molecular defects in patients with defective peroxisomal bifunctional enzyme. Moreover, it will also provide a means for studying the regulation of peroxisomal β-oxidation in humans. 33 refs., 5 figs.
Statistical and linguistic features of DNA sequences

NASA Technical Reports Server (NTRS)

Havlin, S.; Buldyrev, S. V.; Goldberger, A. L.; Mantegna, R. N.; Peng, C. K.; Simons, M.; Stanley, H. E.

1995-01-01

We present evidence supporting the idea that the DNA sequence in genes containing noncoding regions is correlated, and that the correlation is remarkably long range--indeed, base pairs thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the gene. We resolve the problem of the "non-stationary" feature of the sequence of base pairs by applying a new algorithm called Detrended Fluctuation Analysis (DFA). We address the claim of Voss that there is no difference in the statistical properties of coding and noncoding regions of DNA by systematically applying the DFA algorithm, as well as standard FFT analysis, to all eukaryotic DNA sequences (33 301 coding and 29 453 noncoding) in the entire GenBank database. We describe a simple model to account for the presence of long-range power-law correlations which is based upon a generalization of the classic Levy walk. Finally, we describe briefly some recent work showing that the noncoding sequences have certain statistical features in common with natural languages. Specifically, we adapt to DNA the Zipf approach to analyzing linguistic texts, and the Shannon approach to quantifying the "redundancy" of a linguistic text in terms of a measurable entropy function. We suggest that noncoding regions in plants and invertebrates may display a smaller entropy and larger redundancy than coding regions, further supporting the possibility that noncoding regions of DNA may carry biological information.
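The Detrended Fluctuation Analysis named in this abstract can be sketched in plain Python as follows. The abstract does not give the details, so several choices are assumptions: the DNA walk steps +1 for a pyrimidine and -1 for a purine, windows do not overlap, a straight line is fitted by least squares in each window, and the exponent is the slope of log F(n) against log n. The function names are hypothetical.

```python
import math


def dna_walk(seq):
    """Cumulative walk: +1 for a pyrimidine (C, T), -1 for a purine (A, G)."""
    walk, total = [], 0.0
    for base in seq.upper():
        total += 1.0 if base in "CT" else -1.0
        walk.append(total)
    return walk


def fluctuation(walk, n):
    """RMS deviation of the walk from a per-window least-squares line."""
    if n < 3:
        raise ValueError("window size must be at least 3")
    xs = range(n)
    x_mean = (n - 1) / 2.0
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sq_sum, count = 0.0, 0
    for start in range(0, len(walk) - n + 1, n):  # non-overlapping windows
        window = walk[start:start + n]
        y_mean = sum(window) / n
        slope = sum((x - x_mean) * (y - y_mean)
                    for x, y in zip(xs, window)) / sxx
        for x, y in zip(xs, window):
            sq_sum += (y - (y_mean + slope * (x - x_mean))) ** 2
            count += 1
    if count == 0:
        raise ValueError("sequence is shorter than the window size")
    return math.sqrt(sq_sum / count)


def dfa_exponent(seq, sizes):
    """Slope of log F(n) versus log n over the given window sizes."""
    walk = dna_walk(seq)
    pts = [(math.log(n), math.log(fluctuation(walk, n))) for n in sizes]
    mx = sum(p[0] for p in pts) / len(pts)
    my = sum(p[1] for p in pts) / len(pts)
    return (sum((lx - mx) * (ly - my) for lx, ly in pts)
            / sum((lx - mx) ** 2 for lx, _ in pts))
```

An exponent near 0.5 corresponds to an uncorrelated sequence, while values clearly above 0.5 over a wide range of window sizes indicate the long-range power-law correlations that the abstract reports for noncoding regions. Removing the local trend in each window is what lets the method cope with the "non-stationary" character of the walk.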
Secondary structure prediction for complete rDNA sequences (18S, 5.8S, and 28S rDNA) of Demodex folliculorum, and comparison of divergent domains structures across Acari.

PubMed

Zhao, Ya-E; Wang, Zheng-Hang; Xu, Yang; Wu, Li-Ping; Hu, Li

2013-10-01

According to base pairing, the rRNA folds into corresponding secondary structures, which contain additional phylogenetic information. On the basis of sequencing for complete rDNA sequences (18S, ITS1, 5.8S, ITS2 and 28S rDNA) of Demodex, we predicted the secondary structure of the complete rDNA sequence (18S, 5.8S, and 28S rDNA) of Demodex folliculorum, which was in concordance with that of the main arthropod lineages in past studies. Together with the sequence data from GenBank, we also predicted the secondary structures of divergent domains in SSU rRNA of 51 species and in LSU rRNA of 43 species from four superfamilies in Acari (Cheyletoidea, Tetranychoidea, Analgoidea and Ixodoidea). The multiple alignment among the four superfamilies in Acari showed that insertions from Tetranychoidea SSU rRNA formed two newly proposed helices, and helix c3-2b of LSU rRNA was absent in Demodex (Cheyletoidea) taxa. Generally speaking, LSU rRNA presented more remarkable differences than SSU rRNA did, mainly in D2, D3, D5, D7a, D7b, D8 and D10.
Molecular studies on larvae of Pseudoterranova parasite of Trichiurus lepturus Linnaeus, 1758 and Pomatomus saltatrix (Linnaeus, 1766) off Brazilian waters.

PubMed

Borges, Juliana N; Cunha, Luiz F G; Miranda, Daniele F; Monteiro-Neto, Cassiano; Santos, Cláudia P

2015-12-01

Pseudoterranova larvae parasitizing cutlassfish Trichiurus lepturus and bluefish Pomatomus saltatrix from the Southwest Atlantic coast of Brazil were studied in this work by morphological, ultrastructural and molecular approaches. The genetic analyses were performed for the ITS2 intergenic region specific for Pseudoterranova decipiens, the partial 28S (LSU) of ribosomal DNA and the mtDNA cox-1 region. We obtained results for the 28S region and mtDNA cox-1, which were amplified using the polymerase chain reaction and sequenced to evaluate the phylogenetic relationships between sequences of this study and sequences from GenBank. The morphological profile indicated that all nine specimens collected from both fish were L3 larvae of Pseudoterranova sp. The genetic profile confirmed the generic level, but due to the absence of similar sequences for adult parasites in GenBank for the regions amplified, it was not possible to identify them to the species level. The sequences obtained showed 89% similarity with Pseudoterranova decipiens (28S sequences) and Contracaecum osculatum B (mtDNA cox-1). The low similarity, together with the failure of amplification with the primer specific for P. decipiens, leads us to conclude that our sequences do not belong to the P. decipiens complex.

BiRen: predicting enhancers with a deep-learning-based model using the DNA sequence alone.

PubMed

Yang, Bite; Liu, Feng; Ren, Chao; Ouyang, Zhangyi; Xie, Ziwei; Bo, Xiaochen; Shu, Wenjie

2017-07-01

Enhancer elements are noncoding stretches of DNA that play key roles in controlling gene expression programmes. Despite major efforts to develop accurate enhancer prediction methods, identifying enhancer sequences continues to be a challenge in the annotation of mammalian genomes. One of the major issues is the lack of large, sufficiently comprehensive and experimentally validated enhancers for humans or other species. Thus, the development of computational methods based on limited experimentally validated enhancers and deciphering the transcriptional regulatory code encoded in the enhancer sequences is urgent. We present a deep-learning-based hybrid architecture, BiRen, which predicts enhancers using the DNA sequence alone. Our results demonstrate that BiRen can learn common enhancer patterns directly from the DNA sequence and exhibits superior accuracy, robustness and generalizability in enhancer prediction relative to other state-of-the-art enhancer predictors based on sequence characteristics. Our BiRen will enable researchers to acquire a deeper understanding of the regulatory code of enhancer sequences. Our BiRen method can be freely accessed at https://github.com/wenjiegroup/BiRen. Contact: shuwj@bmi.ac.cn or boxc@bmi.ac.cn. Supplementary data are available at Bioinformatics online.
Biochemical Characterization of a Mycobacteriophage Derived DnaB Ortholog Reveals New Insight into the Evolutionary Origin of DnaB Helicases

PubMed Central

Bhowmik, Priyanka; Das Gupta, Sujoy K.

2015-01-01

The bacterial replicative helicases known as DnaB are considered to be members of the RecA superfamily. All members of this superfamily, including DnaB, have a conserved C-terminal domain, known as the RecA core. We unearthed a series of mycobacteriophage encoded proteins in which the RecA core domain alone was present. These proteins were phylogenetically related to each other and formed a distinct clade within the RecA superfamily. A mycobacteriophage encoded protein, Wildcat Gp80, that roots deep in the DnaB family was found to possess a core domain having significant sequence homology (Expect value < 10^-5) with members of this novel cluster. This indicated that Wildcat Gp80, and by extrapolation, other members of the DnaB helicase family, may have evolved from a single domain RecA core polypeptide belonging to this novel group. Biochemical investigations confirmed that Wildcat Gp80 was a helicase. Surprisingly, our investigations also revealed that a thioredoxin tagged truncated version of the protein in which the N-terminal sequences were removed was fully capable of supporting helicase activity, although its ATP dependence properties were different. DnaB helicase activity is thus primarily a function of the RecA core, although additional N-terminal sequences may be necessary for fine tuning its activity and stability. Based on sequence comparison and biochemical studies we propose that DnaB helicases may have evolved from single domain RecA core proteins having helicase activities of their own, through the incorporation of additional N-terminal sequences. PMID:26237048
Characterization of kinetoplast DNA from Phytomonas serpens.

PubMed

Sá-Carvalho, D; Perez-Morga, D; Traub-Cseko, Y M

1993-01-01

The restriction enzyme digestion of kinetoplast DNA from four Phytomonas serpens isolates shows an overall similar band pattern. One minicircle from isolate 30T was cloned and sequenced, showing low levels of homology but the same general features and organization as described for minicircles of other trypanosomatids. Extensive regions of the minicircle are composed of G and T on the H strand. These regions are very repetitive and similar to regions in a minicircle of Crithidia oncopelti and to telomeric sequences of Saccharomyces cerevisiae. Conserved Sequence Block 3, present in all trypanosomatids, is one nucleotide different from the consensus in P. serpens and provides a basis to differentiate P. serpens from other trypanosomatids. Electron microscopy of kinetoplast DNA revealed a network with organization similar to other trypanosomatids, and the measurement of minicircles confirmed the size of about 1.45 kb of the sequenced minicircle.
A DNA Mini-Barcoding System for Authentication of Processed Fish Products.

PubMed

Shokralla, Shadi; Hellberg, Rosalee S; Handy, Sara M; King, Ian; Hajibabaei, Mehrdad

2015-10-30

Species substitution is a form of seafood fraud for the purpose of economic gain. DNA barcoding utilizes species-specific DNA sequence information for specimen identification. Previous work has established the usability of short DNA sequences (mini-barcodes) for identification of specimens harboring degraded DNA. This study aims at establishing a DNA mini-barcoding system for all fish species commonly used in processed fish products in North America. Six mini-barcode primer pairs targeting short (127-314 bp) fragments of the cytochrome c oxidase I (CO1) DNA barcode region were developed by examining over 8,000 DNA barcodes from species in the U.S. Food and Drug Administration (FDA) Seafood List. The mini-barcode primer pairs were then tested against 44 processed fish products representing a range of species and product types. Of the 44 products, 41 (93.2%) could be identified at the species or genus level. The greatest mini-barcoding success rate found with an individual primer pair was 88.6%, compared to the 20.5% success rate achieved by the full-length DNA barcode primers. Overall, this study presents a mini-barcoding system that can be used to identify a wide range of fish species in commercial products and may be utilized in high throughput DNA sequencing for authentication of heavily processed fish products.
Role of DNA conformation & energetic insights in Msx-1-DNA recognition as revealed by molecular dynamics studies on specific and nonspecific complexes.

PubMed

Kachhap, Sangita; Singh, Balvinder

2015-01-01

In most homeodomain-DNA complexes, glutamine or lysine is present at the 50th position and interacts with the 5th and 6th nucleotides of the core recognition region. Molecular dynamics simulations were carried out on the Msx-1-DNA complex (Q50-TG) and its variant complexes: the specific complex (Q50K-CC) and the nonspecific complexes with a mutation in the DNA (Q50-CC) or in the protein (Q50K-TG). Analysis of protein-DNA interactions and of DNA structure in the specific and nonspecific complexes shows that amino acid residues use the sequence-dependent shape of DNA to interact. The binding free energies of all four complexes were analysed to define the role of the amino acid residue at the 50th position in binding strength, considering the effect of variation in the DNA on the stability of protein-DNA complexes. The order of stability of the protein-DNA complexes shows that specific complexes are more stable than nonspecific ones. Decomposition analysis shows that N-terminal amino acid residues contribute most to the binding free energy of protein-DNA complexes. Among the specific protein-DNA complexes, K50 contributes more than Q50 towards the binding free energy in the respective complexes. The sequence dependence of the local conformation of DNA enables Q50/Q50K to make hydrogen bonds with nucleotide(s) of DNA. Changes in the amino acid sequence of the protein are accommodated and stabilized around the TAAT core region of DNA having variation in nucleotides.
Molecular organization and phylogenetic analysis of 5S rDNA in crustaceans of the genus Pollicipes reveal birth-and-death evolution and strong purifying selection.

PubMed

Perina, Alejandra; Seoane, David; González-Tizón, Ana M; Rodríguez-Fariña, Fernanda; Martínez-Lage, Andrés

2011-10-17

In higher eukaryotes, the 5S ribosomal DNA (5S rDNA) is organized in tandem arrays with repeat units that consist of a transcribed region (5S) and a variable nontranscribed spacer (NTS). Until recently the 5S rDNA was thought to be subject to concerted evolution; however, in several taxa, sequence divergence levels between the 5S and the NTS were found to be higher than expected under this model. Accordingly, many studies have shown that birth-and-death processes and selection can drive the evolution of 5S rDNA. Analyses of 5S rDNA evolution have found several 5S rDNA types in the genome, with low levels of nucleotide variation in the 5S and a highly divergent spacer region. Molecular organization and nucleotide sequence of the 5S ribosomal DNA multigene family (5S rDNA) were investigated in three Pollicipes species in an evolutionary context. The nucleotide sequence variation revealed that several 5S rDNA variants occur in Pollicipes genomes. They are clustered in up to seven different types based on differences in their nontranscribed spacers (NTS). Five different units of 5S rDNA were characterized in P. pollicipes and two different units in P. elegans and P. polymerus. Analysis of these sequences showed that identical types were shared among species and that two pseudogenes were present. We predicted the secondary structure and characterized the upstream and downstream conserved elements. Phylogenetic analysis showed an among-species clustering pattern of 5S rDNA types. These results suggest that the evolution of Pollicipes 5S rDNA is driven by birth-and-death processes with strong purifying selection.
Molecular organization and phylogenetic analysis of 5S rDNA in crustaceans of the genus Pollicipes reveal birth-and-death evolution and strong purifying selection

PubMed Central

2011-01-01

Background: In higher eukaryotes, the 5S ribosomal DNA (5S rDNA) is organized in tandem arrays with repeat units that consist of a transcribed region (5S) and a variable nontranscribed spacer (NTS). Until recently the 5S rDNA was thought to be subject to concerted evolution; however, in several taxa, sequence divergence levels between the 5S and the NTS were found to be higher than expected under this model. Accordingly, many studies have shown that birth-and-death processes and selection can drive the evolution of 5S rDNA. Analyses of 5S rDNA evolution have found several 5S rDNA types in the genome, with low levels of nucleotide variation in the 5S and a highly divergent spacer region. Molecular organization and nucleotide sequence of the 5S ribosomal DNA multigene family (5S rDNA) were investigated in three Pollicipes species in an evolutionary context. Results: The nucleotide sequence variation revealed that several 5S rDNA variants occur in Pollicipes genomes. They are clustered in up to seven different types based on differences in their nontranscribed spacers (NTS). Five different units of 5S rDNA were characterized in P. pollicipes and two different units in P. elegans and P. polymerus. Analysis of these sequences showed that identical types were shared among species and that two pseudogenes were present. We predicted the secondary structure and characterized the upstream and downstream conserved elements. Phylogenetic analysis showed an among-species clustering pattern of 5S rDNA types. Conclusions: These results suggest that the evolution of Pollicipes 5S rDNA is driven by birth-and-death processes with strong purifying selection. PMID:22004418
Pairagon: a highly accurate, HMM-based cDNA-to-genome aligner.

PubMed

Lu, David V; Brown, Randall H; Arumugam, Manimozhiyan; Brent, Michael R

2009-07-01

The most accurate way to determine the intron-exon structures in a genome is to align spliced cDNA sequences to the genome. Thus, cDNA-to-genome alignment programs are a key component of most annotation pipelines. The scoring system used to choose the best alignment is a primary determinant of alignment accuracy, while heuristics that prevent consideration of certain alignments are a primary determinant of runtime and memory usage. Both accuracy and speed are important considerations in choosing an alignment algorithm, but scoring systems have received much less attention than heuristics. We present Pairagon, a pair hidden Markov model based cDNA-to-genome alignment program, as the most accurate aligner for sequences with high- and low-identity levels. We conducted a series of experiments testing alignment accuracy with varying sequence identity. We first created 'perfect' simulated cDNA sequences by splicing the sequences of exons in the reference genome sequences of fly and human. The complete reference genome sequences were then mutated to various degrees using a realistic mutation simulator and the perfect cDNAs were aligned to them using Pairagon and 12 other aligners. To validate these results with natural sequences, we performed cross-species alignment using orthologous transcripts from human, mouse and rat. We found that aligner accuracy is heavily dependent on sequence identity. For sequences with 100% identity, Pairagon achieved accuracy levels of >99.6%, with one quarter of the errors of any other aligner. Furthermore, for human/mouse alignments, which are only 85% identical, Pairagon achieved 87% accuracy, higher than any other aligner. Pairagon source and executables are freely available at http://mblab.wustl.edu/software/pairagon/
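The simulation design described in this abstract (splice exons into a "perfect" cDNA, then mutate the genome) can be sketched as follows. This is only an illustration under stated assumptions: exon coordinates are 0-based half-open intervals on the forward strand, and the uniform substitution-only `mutate` function is a simple stand-in for the realistic mutation simulator mentioned in the record, whose details are not given here. All names are hypothetical.

```python
import random

def splice_exons(genome, exons):
    """Concatenate exon intervals (0-based, half-open) into a 'perfect' cDNA."""
    return "".join(genome[start:end] for start, end in exons)

def mutate(seq, rate, rng):
    """Substitute each base with probability `rate` (assumed uniform model)."""
    out = []
    for base in seq:
        if rng.random() < rate:
            # Pick one of the three other bases so the position really changes.
            out.append(rng.choice([b for b in "ACGT" if b != base]))
        else:
            out.append(base)
    return "".join(out)

def identity(a, b):
    """Fraction of matching positions between two equal-length sequences."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

# Usage: build a cDNA from a toy genome, then mutate the genome by about 15%.
rng = random.Random(1)
genome = "AAACCCGGGTTT" * 50
cdna = splice_exons(genome, [(0, 30), (60, 90)])
diverged_genome = mutate(genome, 0.15, rng)
```

Because the cDNA is taken from the unmutated genome, the true exon boundaries are known, which is what lets alignment accuracy be scored at a chosen identity level.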
Uncovering the Ancestry of B Chromosomes in Moenkhausia sanctaefilomenae (Teleostei, Characidae)

PubMed Central

Utsunomia, Ricardo; Silva, Duílio Mazzoni Zerbinato de Andrade; Ruiz-Ruano, Francisco J.; Araya-Jaime, Cristian; Pansonato-Alves, José Carlos; Scacchetti, Priscilla Cardim; Hashimoto, Diogo Teruo; Oliveira, Claudio; Trifonov, Vladmir A.; Porto-Foresti, Fábio; Camacho, Juan Pedro M.; Foresti, Fausto

2016-01-01

B chromosomes constitute a heterogeneous mixture of genomic parasites that are sometimes derived intraspecifically from the standard genome of the host species, but result from interspecific hybridization in other cases. The mode of origin determines the DNA content, with the B chromosomes showing high similarity with the A genome in the first case, but presenting higher similarity with a different species in the second. The characid fish Moenkhausia sanctaefilomenae harbours highly invasive B chromosomes, which are present in all populations analyzed to date in the Parana and Tietê rivers. To investigate the origin of these B chromosomes, we analyzed two natural populations: one carrying B chromosomes and the other lacking them, using a combination of molecular cytogenetic techniques, nucleotide sequence analysis and high-throughput sequencing (Illumina HiSeq2000). Our results showed that i) B chromosomes have not yet reached the Paranapanema River basin; ii) B chromosomes are mitotically unstable; iii) there are two types of B chromosomes, the most frequent of which is lightly C-banded (similar to euchromatin in A chromosomes) (B1), while the other is darkly C-banded (heterochromatin-like) (B2); iv) the two B types contain the same tandem repeat DNA sequences (18S ribosomal DNA, H3 histone genes, MS3 and MS7 satellite DNA), with a higher content of 18S rDNA in the heterochromatic variant; v) all of these repetitive DNAs are present together only in the paracentromeric region of autosome pair no. 6, suggesting that the B chromosomes are derived from this A chromosome; vi) the two B chromosome variants show MS3 sequences that are highly divergent from each other and from the 0B genome, although the B2-derived sequences exhibit higher similarity with the 0B genome (this suggests an independent origin of the two B variants, with the less frequent, B2 type presumably being younger); and vii) the dN/dS ratio for the H3.2 histone gene is almost 4–6 times higher for B chromosomes than for A chromosome sequences, suggesting that purifying selection is relaxed for the DNA sequences located on the B chromosomes, presumably because they are mostly inactive. PMID:26934481
Molecular cloning and nucleotide sequence of a transforming gene detected by transfection of chicken B-cell lymphoma DNA

NASA Astrophysics Data System (ADS)

Goubin, Gerard; Goldman, Debra S.; Luce, Judith; Neiman, Paul E.; Cooper, Geoffrey M.

1983-03-01

A transforming gene detected by transfection of chicken B-cell lymphoma DNA has been isolated by molecular cloning. It is homologous to a conserved family of sequences present in normal chicken and human DNAs but is not related to transforming genes of acutely transforming retroviruses. The nucleotide sequence of the cloned transforming gene suggests that it encodes a protein that is partially homologous to the amino terminus of transferrin and related proteins although only about one tenth the size of transferrin.
Base pairing among three cis-acting sequences contributes to template switching during hepadnavirus reverse transcription.

PubMed

Liu, Ning; Tian, Ru; Loeb, Daniel D

2003-02-18

Synthesis of the relaxed-circular (RC) DNA genome of hepadnaviruses requires two template switches during plus-strand DNA synthesis: primer translocation and circularization. Although primer translocation and circularization use different donor and acceptor sequences, and are distinct temporally, they share the common theme of switching from one end of the minus-strand template to the other end. Studies of duck hepatitis B virus have indicated that, in addition to the donor and acceptor sequences, three other cis-acting sequences, named 3E, M, and 5E, are required for the synthesis of RC DNA by contributing to primer translocation and circularization. The mechanism by which 3E, M, and 5E act was not known. We present evidence that these sequences function by base pairing with each other within the minus-strand template. 3E base-pairs with one portion of M (M3) and 5E base-pairs with an adjacent portion of M (M5). We found that disrupting base pairing between 3E and M3 and between 5E and M5 inhibited primer translocation and circularization. More importantly, restoring base pairing with mutant sequences restored the production of RC DNA. These results are consistent with the model that, within duck hepatitis B virus capsids, the ends of the minus-strand template are juxtaposed via base pairing to facilitate the two template switches during plus-strand DNA synthesis.
Sequencing degraded DNA from non-destructively sampled museum specimens for RAD-tagging and low-coverage shotgun phylogenetics.

PubMed

Tin, Mandy Man-Ying; Economo, Evan Philip; Mikheyev, Alexander Sergeyevich

2014-01-01

Ancient and archival DNA samples are valuable resources for the study of diverse historical processes. In particular, museum specimens provide access to biotas distant in time and space, and can provide insights into ecological and evolutionary changes over time. However, archival specimens are difficult to handle; they are often fragile and irreplaceable, and typically contain only short segments of denatured DNA. Here we present a set of tools for processing such samples for state-of-the-art genetic analysis. First, we report a protocol for minimally destructive DNA extraction of insect museum specimens, which produced sequenceable DNA from all of the samples assayed. The 11 specimens analyzed had fragmented DNA, rarely exceeding 100 bp in length, and could not be amplified by conventional PCR targeting the mitochondrial cytochrome oxidase I gene. Our approach made these samples amenable to analysis with commonly used next-generation sequencing-based molecular analytic tools, including RAD-tagging and shotgun genome re-sequencing. First, we used museum ant specimens from three species, each with its own reference genome, for RAD-tag mapping. We were able to use the degraded DNA sequences, which were sequenced in full, to identify duplicate reads and filter them prior to base calling. Second, we re-sequenced six Hawaiian Drosophila species, with millions of years of divergence, but with only a single available reference genome. Despite a shallow coverage of 0.37 ± 0.42 per base, we could recover a sufficient number of overlapping SNPs to fully resolve the species tree, which was consistent with earlier karyotypic studies, and previous molecular studies, at least in the regions of the tree that these studies could resolve. Although developed for use with degraded DNA, all of these techniques are readily applicable to more recent tissue, and are suitable for liquid handling automation.
Comparative study of IDH1 mutations in gliomas by immunohistochemistry and DNA sequencing.

PubMed

Agarwal, Shipra; Sharma, Mehar Chand; Jha, Prerana; Pathak, Pankaj; Suri, Vaishali; Sarkar, Chitra; Chosdol, Kunzang; Suri, Ashish; Kale, Shashank Sharad; Mahapatra, Ashok Kumar; Jha, Pankaj

2013-06-01

Mutations involving isocitrate dehydrogenase 1 (IDH1) occur in a high proportion of diffuse gliomas, with implications for diagnosis and prognosis. About 90% involve exon 4 at codon 132, replacing amino acid arginine with histidine (R132H). Rarer ones include R132C, R132S, R132G, R132L, R132V, and R132P. Most authors have used DNA-based methods to assess IDH1 status. Preliminary studies comparing immunohistochemistry (IHC) with IDH1-R132H mutation-specific antibodies have shown concordance with DNA sequencing and no cross-reactivity with wild-type IDH1 or other mutant proteins. The present study compares results of IHC with DNA sequencing in diffuse gliomas. Fifty diffuse gliomas with frozen tissue samples for DNA sequencing and adequate tissue in paraffin blocks for IHC using IDH1-R132H specific antibody were assessed for IDH1 mutations. Concordance of findings between IHC and DNA sequencing was noted in 88% (44/50) of cases. All 6 cases with discrepancy were immunopositive with DIA-H09 antibody. While in 3 of these 6 cases DNA sequencing failed to reveal any mutations, the R132L (arginine replaced by leucine) mutation was found in the remaining 3 cases. Interestingly, of the immunopositive cases, 46.6% (14/30) showed immunostaining in only a fraction of tumor cells. IHC is an easy and quick method of detecting IDH1-R132H mutations, but there may be some discrepancies between IHC and DNA sequencing. Although there were no false-negative cases, cross-reactivity with IDH1-R132L was seen in 3, a finding not reported thus far. Because IHC is more widely available than genetic testing, cross-reactivity and staining heterogeneity may have a bearing on its use in detecting IDH1-R132H mutation in gliomas.
Comparative study of IDH1 mutations in gliomas by immunohistochemistry and DNA sequencing

PubMed Central

Agarwal, Shipra; Sharma, Mehar Chand; Jha, Prerana; Pathak, Pankaj; Suri, Vaishali; Sarkar, Chitra; Chosdol, Kunzang; Suri, Ashish; Kale, Shashank Sharad; Mahapatra, Ashok Kumar; Jha, Pankaj

2013-01-01

Background: Mutations involving isocitrate dehydrogenase 1 (IDH1) occur in a high proportion of diffuse gliomas, with implications for diagnosis and prognosis. About 90% involve exon 4 at codon 132, replacing amino acid arginine with histidine (R132H). Rarer ones include R132C, R132S, R132G, R132L, R132V, and R132P. Most authors have used DNA-based methods to assess IDH1 status. Preliminary studies comparing immunohistochemistry (IHC) with IDH1-R132H mutation-specific antibodies have shown concordance with DNA sequencing and no cross-reactivity with wild-type IDH1 or other mutant proteins. The present study compares results of IHC with DNA sequencing in diffuse gliomas. Materials and methods: Fifty diffuse gliomas with frozen tissue samples for DNA sequencing and adequate tissue in paraffin blocks for IHC using IDH1-R132H specific antibody were assessed for IDH1 mutations. Results: Concordance of findings between IHC and DNA sequencing was noted in 88% (44/50) of cases. All 6 cases with discrepancy were immunopositive with DIA-H09 antibody. While in 3 of these 6 cases DNA sequencing failed to reveal any mutations, the R132L (arginine replaced by leucine) mutation was found in the remaining 3 cases. Interestingly, of the immunopositive cases, 46.6% (14/30) showed immunostaining in only a fraction of tumor cells. Conclusions: IHC is an easy and quick method of detecting IDH1-R132H mutations, but there may be some discrepancies between IHC and DNA sequencing. Although there were no false-negative cases, cross-reactivity with IDH1-R132L was seen in 3, a finding not reported thus far. Because IHC is more widely available than genetic testing, cross-reactivity and staining heterogeneity may have a bearing on its use in detecting IDH1-R132H mutation in gliomas. PMID:23486690
Comprehensive restriction enzyme lists to update any DNA sequence computer program.

PubMed

Raschke, E

1993-04-01

Restriction enzyme lists are presented for the practical working geneticist to update any DNA computer program. These lists combine formerly scattered information and contain all presently known restriction enzymes with a unique recognition sequence, a cut site, or methylation (in)sensitivity. The lists are in the shortest possible form to also be functional with small DNA computer programs, and will produce clear restriction maps without any redundancy or loss of information. The lists discern between commercial and noncommercial enzymes, and prototype enzymes and different isoschizomers are cross-referenced. Differences in general methylation sensitivities and (in)sensitivities against Dam and Dcm methylases of Escherichia coli are indicated. Commercial methylases and intron-encoded endonucleases are included. An address list is presented to contact commercial suppliers. The lists are constantly updated and available in electronic form as pure US ASCII files, and in formats for the DNA computer programs DNA-Strider for Apple Macintosh, and DNAsis for IBM personal computers or compatibles via e-mail from the internet address: NETSERV@EMBL-HEIDELBERG.DE by sending only the message HELP RELIBRARY.
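How a DNA program might consume such an enzyme list can be sketched briefly. The one-entry table below is a hypothetical stand-in for the lists described in this record, not their actual file format; it uses EcoRI (recognition sequence GAATTC, top strand cut after the first G). The sketch assumes a linear molecule and searches the top strand only, which is sufficient for palindromic sites.

```python
import re

# Hypothetical enzyme table: name -> (recognition sequence, top-strand cut offset).
ENZYMES = {"EcoRI": ("GAATTC", 1)}

# IUPAC nucleotide codes mapped to regex character classes (subset).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]",
    "M": "[AC]", "K": "[GT]", "N": "[ACGT]",
}

def cut_positions(seq, site, offset):
    """Return top-strand cut coordinates for every (possibly overlapping) site."""
    # A lookahead lets overlapping occurrences of the site all be reported.
    pattern = "(?=" + "".join(IUPAC[code] for code in site) + ")"
    return [m.start() + offset for m in re.finditer(pattern, seq.upper())]

def digest(seq, enzyme):
    """Split a linear sequence into fragments at the enzyme's cut positions."""
    site, offset = ENZYMES[enzyme]
    bounds = [0] + cut_positions(seq, site, offset) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]
```

Fragment lengths from `digest` are what a restriction map would plot; keeping one entry per unique recognition sequence, as the record describes, avoids reporting the same cut twice for isoschizomers.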
Quantum-Sequencing: Fast electronic single DNA molecule sequencing

NASA Astrophysics Data System (ADS)

Casamada Ribot, Josep; Chatterjee, Anushree; Nagpal, Prashant

2014-03-01

A major goal of third-generation sequencing technologies is to develop a fast, reliable, enzyme-free, high-throughput and cost-effective, single-molecule sequencing method. Here, we present the first demonstration of a unique "electronic fingerprint" of all nucleotides (A, G, T, C), with single-molecule DNA sequencing, using Quantum-tunneling Sequencing (Q-Seq) at room temperature. We show that the electronic state of the nucleobases shifts depending on the pH, with the most distinct states identified at acidic pH. We also demonstrate identification of single nucleotide modifications (methylation here). Using these unique electronic fingerprints (or tunneling data), we report a partial sequence of the beta-lactamase (bla) gene, which encodes resistance to beta-lactam antibiotics, with over 95% success rate. These results highlight the potential of Q-Seq as a robust technique for next-generation sequencing.
DNAAlignEditor: DNA alignment editor tool

PubMed Central

Sanchez-Villeda, Hector; Schroeder, Steven; Flint-Garcia, Sherry; Guill, Katherine E; Yamasaki, Masanori; McMullen, Michael D

2008-01-01

Background: With advances in DNA re-sequencing methods and Next-Generation parallel sequencing approaches, there has been a large increase in genomic efforts to define and analyze the sequence variability present among individuals within a species. For very polymorphic species such as maize, this has led to a need for intuitive, user-friendly software that aids the biologist, often with naïve programming capability, in tracking, editing, displaying, and exporting multiple individual sequence alignments. To fill this need we have developed a novel DNA alignment editor. Results: We have generated a nucleotide sequence alignment editor (DNAAlignEditor) that provides an intuitive, user-friendly interface for manual editing of multiple sequence alignments with functions for input, editing, and output of sequence alignments. The color-coding of nucleotide identity and the display of associated quality scores aid in the manual alignment editing process. DNAAlignEditor works as a client/server tool having two main components: a relational database that collects the processed alignments and a user interface connected to the database through universal data access connectivity drivers. DNAAlignEditor can be used either as a stand-alone application or as a network application with multiple users concurrently connected. Conclusion: We anticipate that this software will be of general interest to biologists and population geneticists in editing DNA sequence alignments and analyzing natural sequence variation regardless of species, and will be particularly useful for manual alignment editing of sequences in species with high levels of polymorphism. PMID:18366684
Whole-comparative genomic hybridization in domestic sheep (Ovis aries) breeds.

PubMed

Dávila-Rodríguez, M I; Cortés-Gutiérrez, E I; López-Fernández, C; Pita, M; Mezzanotte, R; Gosálvez, J

2009-01-01

Whole-comparative genomic hybridization (W-CGH) allows identification of chromosomal polymorphisms related to highly repetitive DNA sequences localized in constitutive heterochromatin. Such polymorphisms are detected establishing competition between genomic DNAs in an in situ hybridization environment without subtraction of highly repetitive DNA sequences, when comparing two species from closely related taxa (same species, sub-species, or breeds) or somewhat related taxa. This experimental approach was applied to investigating differences in highly repetitive sequences of three sheep breeds (Castellana, Ojalada, and Assaf). To this end, W-CGH was carried out using mouflon (sheep ancestor) chromosomes as a common target to co-hybridize equimolar quantities of two genomic DNAs obtained from either Castellana, Ojalada or Assaf sheep breeds. The results showed that the amount of constitutive heterochromatin is greater in all pericentromeric heterochromatin regions of acrocentric chromosomes than in metacentric or sex chromosomes. Additionally, when W-CGH was performed using DNAs from the Iberian breeds Castellana and Ojalada, chromosomal pericentromeric regions revealed quantitatively and qualitatively a presence of DNA families similar to that obtained from any of the above-cited breeds. On the contrary, when the DNA used in W-CGH experiments was obtained from Assaf, as compared to either Castellana or Ojalada, two different pericentromeric DNA families of highly repetitive sequences could be detected. Lastly, sex chromosomes were shown to be homogeneous among all breeds and thus revealed no detectable constitutive heterochromatin. W-CGH results were confirmed using DNA breakage detection-FISH experiments (DBD-FISH) carried out on lymphocytes. As a whole, the results showed that two different repetitive DNA families are present in the pericentromeric heterochromatin of the sheep breeds studied here. 
Additionally, they suggest a differential presence of these distinct repetitive DNA families in Castellana and Ojalada breeds as compared to the Assaf breed. Finally, the results of W-CGH after using mouflon as the targeted chromosomes also show that the two DNA families are present in the ancestor. Copyright 2009 S. Karger AG, Basel.
Serovar distribution of a DNA sequence involved in the antigenic relationship between Leptospira and equine cornea

PubMed Central

Lucchesi, Paula MA; Parma, Alberto E; Arroyo, Guillermo H

2002-01-01

Background Horses infected with Leptospira present several clinical disorders, one of them being recurrent uveitis. A common endpoint of equine recurrent uveitis is blindness. Serovar pomona has often been incriminated, although others have also been reported. An antigenic relationship between this bacterium and equine cornea has been described in previous studies. A leptospiral DNA fragment that encodes cross-reacting epitopes was previously cloned and expressed in Escherichia coli. Results A region of that DNA fragment was subcloned and sequenced. Samples of leptospiral DNA from several sources were analysed by PCR with two primer pairs designed to amplify that region. Reference strains from serovars canicola, icterohaemorrhagiae, pomona, pyrogenes, wolffi, bataviae, sentot, hebdomadis and hardjo rendered products of the expected sizes with both pairs of primers. The specific DNA region was also amplified from isolates from Argentina belonging to serogroups Canicola and Pomona. Both L. biflexa serovar patoc and L. borgpetersenii serovar tarassovi rendered a negative result. Conclusions The DNA sequence related to the antigen mimicry with equine cornea was not exclusively found in serovar pomona as it was also detected in several strains of Leptospira belonging to different serovars. The results obtained with L. biflexa serovar patoc strain Patoc I and L. borgpetersenii serovar tarassovi strain Perepelicin suggest that this sequence is not present in these strains, which belong to different genomospecies than those which gave positive results. This is an interesting finding since L. biflexa comprises nonpathogenic strains and serovar tarassovi has not been associated clinically with equine uveitis. PMID:11869455
On the roles of repetitive DNA elements in the context of a unified genomic-epigenetic system.

PubMed

von Sternberg, Richard

2002-12-01

Repetitive DNA sequences comprise a substantial portion of most eukaryotic and some prokaryotic chromosomes. Despite nearly forty years of research, the functions of various sequence families as a whole and their monomer units remain largely unknown. The inability to map specific functional roles onto many repetitive DNA elements (REs), coupled with the taxon-specificity of sequence families, have led many to speculate that these genomic components are "selfish" replicators generating genomic "junk." The purpose of this paper is to critically examine the selfishness, evolutionary effects, and functionality of REs. First, a brief overview of the range of ideas pertaining to RE function is presented. Second, the argument is presented that the selfish DNA "hypothesis" is actually a narrative scheme, that it serves to protect neo-Darwinian assumptions from criticism, and that this story is untestable and therefore not a hypothesis. Third, attempts to synthesize the selfish DNA concept with complex systems models of the genome and RE functionality are critiqued. Fourth, the supposed connection between RE-induced mutations and macroevolutionary events are stated to be at variance with empirical evidence and theoretical considerations. Hypotheses that base phylogenetic transitions in repetitive sequence changes thus remain speculative. Fifth and finally, the case is made for viewing REs as integrally functional components of chromosomes, genomes, and cells. It is argued throughout that a new conceptual framework is needed for understanding the roles of repetitive DNA in genomic/epigenetic systems, and that neo-Darwinian "narratives" have been the primary obstacle to elucidating the effects of these enigmatic components of chromosomes.

Development of SCAR Markers for the DNA-Based Detection of the Asian Long-Horned Beetle; Anoplophora glabripennis (Motschulsky)

Treesearch

Damodar R. Kethidi; David B. Roden; Tim R. Ladd; Peter J. Krell; Arthur Ratnakaran; Qili Feng

2003-01-01

DNA markers were identified for the molecular detection of the Asian long-horned beetle (ALB), Anoplophora glabripennis (Mot.), based on sequence characterized amplified regions (SCARs) derived from random amplified polymorphic DNA (RAPD) fragments. A 2,740-bp DNA fragment that was present only in ALB and not in other Cerambycids was identified after...
Chiral pathways in DNA dinucleotides using gradient optimized refinement along metastable borders

NASA Astrophysics Data System (ADS)

Romano, Pablo; Guenza, Marina

We present a study of DNA breathing fluctuations using Markov state models (MSM) with our novel refinement procedure. MSM have become a favored method of building kinetic models; however, their accuracy has always depended on using a significant number of microstates, making the method costly. We present a method which optimizes macrostates by refining borders with respect to the gradient along the free energy surface. As the separation between macrostates contains the highest discretization errors, this method corrects for errors produced by limited microstate sampling. Using our refined MSM methods, we investigate DNA breathing fluctuations, thermally induced conformational changes in native B-form DNA. We ran several microsecond MD simulations of DNA dinucleotides of varying sequences, to include sequence and polarity effects, and analyzed them with our refined MSM to investigate the conformational pathways inherent in the unstacking of DNA bases. Our kinetic analysis has shown preferential chirality in unstacking pathways that may be critical in how proteins interact with single stranded regions of DNA. These breathing dynamics can help elucidate the connection between conformational changes and key mechanisms within protein-DNA recognition. NSF Chemistry Division (Theoretical Chemistry), the Division of Physics (Condensed Matter: Material Theory), XSEDE.
A family of long intergenic non-coding RNA genes in human chromosomal region 22q11.2 carry a DNA translocation breakpoint/AT-rich sequence

PubMed Central

2018-01-01

FAM230C, a long intergenic non-coding RNA (lincRNA) gene in human chromosome 13 (chr13), is a member of the lincRNA genes termed family with sequence similarity 230. An analysis using bioinformatics search tools and alignment programs was undertaken to determine properties of FAM230C and its related genes. Results reveal that the DNA translocation element, the Translocation Breakpoint Type A (TBTA) sequence, which consists of satellite DNA, Alu elements, and AT-rich sequences, is embedded in the FAM230C gene. Eight lincRNA genes related to FAM230C also carry the TBTA sequences. These genes were formed from a large segment of the 3’ half of the FAM230C sequence duplicated in chr22, and are located specifically in regions of low copy repeats (LCR22s), in or close to the 22q11.2 region. 22q11.2 is a chromosomal segment that undergoes a high rate of DNA translocation and is prone to genetic deletions. FAM230C-related genes present in other chromosomes do not carry the TBTA motif and were formed from the 5’ half region of the FAM230C sequence. These findings identify a high specificity in lincRNA gene formation by gene sequence duplication in different chromosomes. PMID:29668722
Characterization of bacterial diversity in pulque, a traditional Mexican alcoholic fermented beverage, as determined by 16S rDNA analysis.

PubMed

Escalante, Adelfo; Rodríguez, María Elena; Martínez, Alfredo; López-Munguía, Agustín; Bolívar, Francisco; Gosset, Guillermo

2004-06-15

The bacterial diversity in pulque, a traditional Mexican alcoholic fermented beverage, was studied in 16S rDNA clone libraries from three pulque samples. Sequenced clones identified as Lactobacillus acidophilus, Lactobacillus strain ASF360, L. kefir, L. acetotolerans, L. hilgardii, L. plantarum, Leuconostoc pseudomesenteroides, Microbacterium arborescens, Flavobacterium johnsoniae, Acetobacter pomorium, Gluconobacter oxydans, and Hafnia alvei, were detected for the first time in pulque. Identity of 16S rDNA sequenced clones showed that bacterial diversity present among pulque samples is dominated by Lactobacillus species (80.97%). Seventy-eight clones exhibited less than 95% of relatedness to NCBI database sequences, which may indicate the presence of new species in pulque samples.
High-speed all-optical DNA local sequence alignment based on a three-dimensional artificial neural network.

PubMed

Maleki, Ehsan; Babashah, Hossein; Koohi, Somayyeh; Kavehvash, Zahra

2017-07-01

This paper presents an optical processing approach for exploring a large number of genome sequences. Specifically, we propose an optical correlator for global alignment and an extended moiré matching technique for local analysis of spatially coded DNA, whose output is fed to a novel three-dimensional artificial neural network for local DNA alignment. All-optical implementation of the proposed 3D artificial neural network is developed and its accuracy is verified in Zemax. Thanks to its parallel processing capability, the proposed structure performs local alignment of 4 million sequences of 150 base pairs in a few seconds, which is much faster than its electrical counterparts, such as the basic local alignment search tool.
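For readers unfamiliar with the local alignment problem that the abstract above accelerates optically, the following is a minimal software sketch of the classic Smith-Waterman scoring recurrence. It is not the authors' optical method or their neural network; the function name and the scoring values (match, mismatch, gap) are illustrative assumptions.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between strings a and b.

    Scoring values are hypothetical defaults chosen for illustration.
    Only two rows of the dynamic-programming table are kept in memory.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Extend the diagonal with a match or a mismatch.
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```

The cost of this recurrence grows with the product of the two sequence lengths, which is the kind of workload that heuristic tools and parallel hardware aim to reduce.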
The Porcelain Crab Transcriptome and PCAD, the Porcelain Crab Microarray and Sequence Database

DOE Office of Scientific and Technical Information (OSTI.GOV)

Tagmount, Abderrahmane; Wang, Mei; Lindquist, Erika

2010-01-27

Background: With the emergence of a completed genome sequence of the freshwater crustacean Daphnia pulex, construction of genomic-scale sequence databases for additional crustacean sequences is important for comparative genomics and annotation. Porcelain crabs, genus Petrolisthes, have been powerful crustacean models for environmental and evolutionary physiology with respect to thermal adaptation and understanding responses of marine organisms to climate change. Here, we present a large-scale EST sequencing and cDNA microarray database project for the porcelain crab Petrolisthes cinctipes. Methodology/Principal Findings: A set of ~30K unique sequences (UniSeqs) representing ~19K clusters were generated from ~98K high quality ESTs from a set of tissue specific non-normalized and mixed-tissue normalized cDNA libraries from the porcelain crab Petrolisthes cinctipes. Homology for each UniSeq was assessed using BLAST, InterProScan, GO and KEGG database searches. Approximately 66 percent of the UniSeqs had homology in at least one of the databases. All EST and UniSeq sequences along with annotation results and coordinated cDNA microarray datasets have been made publicly accessible at the Porcelain Crab Array Database (PCAD), a feature-enriched version of the Stanford and Longhorn Array Databases. Conclusions/Significance: The EST project presented here represents the third largest sequencing effort for any crustacean, and the largest effort for any crab species. Our assembly and clustering results suggest that our porcelain crab EST data set is equally diverse to the much larger EST set generated in the Daphnia pulex genome sequencing project, and thus will be an important resource to the Daphnia research community. Our homology results support the pancrustacea hypothesis and suggest that Malacostraca may be ancestral to Branchiopoda and Hexapoda. Our results also suggest that our cDNA microarrays cover as much of the transcriptome as can reasonably be captured in EST library sequencing approaches, and thus represent a rich resource for studies of environmental genomics.
[Identification and phylogenetic analysis of one strain of Lactobacillus delbrueckii subsp. bulgaricus separated from yoghourt].

PubMed

Wang, Chuan; Zhang, Chaowu; Pei, Xiaofang; Liu, Hengchuan

2007-11-01

To support its further application and study, a strain of Lactobacillus delbrueckii subsp. bulgaricus (wch9901), isolated from yoghourt and previously identified by phenotypic characterization, was identified by 16S rDNA sequencing and analyzed phylogenetically. The 16S rDNA of wch9901 was amplified using wch9901 genomic DNA as the template and conserved sequences of the 16S rDNA as primers. The amplified 16S rDNA was inserted into the cloning vector pGEM-T using T4 DNA ligase to construct the recombinant plasmid pGEM-wch9901 16S rDNA. The recombinant plasmid was verified by restriction enzyme digestion, and the verified plasmid was submitted to a sequencing company for DNA sequencing. The nucleotide sequence was searched against GenBank with BLAST, and a phylogenetic tree was constructed by the neighbor-joining distance method using MEGA 3.1 software. The blastn results showed that the homology between the 16S rDNA of wch9901 and the 16S rDNA of Lactobacillus delbrueckii subsp. bulgaricus strains was higher than 96%. On the phylogenetic tree, wch9901 formed a separate branch located between the Lactobacillus delbrueckii subsp. bulgaricus LGM2 branch and another branch composed of the Lactobacillus delbrueckii subsp. bulgaricus DL2 cluster and the Lactobacillus delbrueckii subsp. bulgaricus JSQ cluster. The distance between the wch9901 branch and the Lactobacillus delbrueckii subsp. bulgaricus LGM2 branch was the smallest. In conclusion, wch9901 belongs to Lactobacillus delbrueckii subsp. bulgaricus and shows the closest evolutionary relationship to Lactobacillus delbrueckii subsp. bulgaricus LGM2.
Preparation of Low-Input and Ligation-Free ChIP-seq Libraries Using Template-Switching Technology.

PubMed

Bolduc, Nathalie; Lehman, Alisa P; Farmer, Andrew

2016-10-10

Chromatin immunoprecipitation (ChIP) followed by high-throughput sequencing (ChIP-seq) has become the gold standard for mapping of transcription factors and histone modifications throughout the genome. However, for ChIP experiments involving few cells or targeting low-abundance transcription factors, the small amount of DNA recovered makes ligation of adapters very challenging. In this unit, we describe a ChIP-seq workflow that can be applied to small cell numbers, including a robust single-tube and ligation-free method for preparation of sequencing libraries from sub-nanogram amounts of ChIP DNA. An example ChIP protocol is first presented, resulting in selective enrichment of DNA-binding proteins and cross-linked DNA fragments immobilized on beads via an antibody bridge. This is followed by a protocol for fast and easy cross-linking reversal and DNA recovery. Finally, we describe a fast, ligation-free library preparation protocol, featuring DNA SMART technology, resulting in samples ready for Illumina sequencing. © 2016 by John Wiley & Sons, Inc. Copyright © 2016 John Wiley & Sons, Inc.
Ribosomal RNA Genes Contribute to the Formation of Pseudogenes and Junk DNA in the Human Genome.

PubMed

Robicheau, Brent M; Susko, Edward; Harrigan, Amye M; Snyder, Marlene

2017-02-01

Approximately 35% of the human genome can be identified as sequence devoid of a selected-effect function, and not derived from transposable elements or repeated sequences. We provide evidence supporting a known origin for a fraction of this sequence. We show that: 1) highly degraded, but near full length, ribosomal DNA (rDNA) units, including both 45S and Intergenic Spacer (IGS), can be found at multiple sites in the human genome on chromosomes without rDNA arrays, 2) that these rDNA sequences have a propensity for being centromere proximal, and 3) that sequence at all human functional rDNA array ends is divergent from canonical rDNA to the point that it is pseudogenic. We also show that small sequence strings of rDNA (from 45S + IGS) can be found distributed throughout the genome and are identifiable as an "rDNA-like signal", representing 0.26% of the q-arm of HSA21 and ∼2% of the total sequence of other regions tested. The size of sequence strings found in the rDNA-like signal intergrade into the size of sequence strings that make up the full-length degrading rDNA units found scattered throughout the genome. We conclude that the displaced and degrading rDNA sequences are likely of a similar origin but represent different stages in their evolution towards random sequence. Collectively, our data suggests that over vast evolutionary time, rDNA arrays contribute to the production of junk DNA. The concept that the production of rDNA pseudogenes is a by-product of concerted evolution represents a previously under-appreciated process; we demonstrate here its importance. © The Author(s) 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
DNA breathing dynamics distinguish binding from nonbinding consensus sites for transcription factor YY1 in cells.

PubMed

Alexandrov, Boian S; Fukuyo, Yayoi; Lange, Martin; Horikoshi, Nobuo; Gelev, Vladimir; Rasmussen, Kim Ø; Bishop, Alan R; Usheva, Anny

2012-11-01

The genome-wide mapping of the major gene expression regulators, the transcription factors (TFs) and their DNA binding sites, is of great importance for describing cellular behavior and phenotypic diversity. Presently, the methods for prediction of genomic TF binding produce a large number of false positives, most likely due to insufficient description of the physicochemical mechanisms of protein-DNA binding. Growing evidence suggests that, in the cell, the double-stranded DNA (dsDNA) is subject to local transient strand separations (breathing) that contribute to genomic functions. By using site-specific chromatin immunoprecipitations, gel shifts, BIOBASE data, and our model that accurately describes the melting behavior and breathing dynamics of dsDNA, we report a specific DNA breathing profile found at YY1 binding sites in cells. We find that the genomic flanking sequence variations and SNPs may exert long-range effects on DNA dynamics and predetermine YY1 binding. The ubiquitous TF YY1 has a fundamental role in essential biological processes by activating, initiating or repressing transcription depending upon the sequence context it binds. We anticipate that consensus binding sequences together with the related DNA dynamics profile may significantly improve the accuracy of predicting genomic TF binding sites and TF binding-related functional SNPs.
Clinical utility of circulating tumor DNA for molecular assessment in pancreatic cancer.

PubMed

Takai, Erina; Totoki, Yasushi; Nakamura, Hiromi; Morizane, Chigusa; Nara, Satoshi; Hama, Natsuko; Suzuki, Masami; Furukawa, Eisaku; Kato, Mamoru; Hayashi, Hideyuki; Kohno, Takashi; Ueno, Hideki; Shimada, Kazuaki; Okusaka, Takuji; Nakagama, Hitoshi; Shibata, Tatsuhiro; Yachida, Shinichi

2015-12-16

Pancreatic ductal adenocarcinoma (PDAC) remains one of the most lethal malignancies. The genomic landscape of the PDAC genome features four frequently mutated genes (KRAS, CDKN2A, TP53, and SMAD4) and dozens of candidate driver genes altered at low frequency, including potential clinical targets. Circulating cell-free DNA (cfDNA) is a promising resource to detect and monitor molecular characteristics of tumors. In the present study, we determined the mutational status of KRAS in plasma cfDNA using multiplex picoliter-droplet digital PCR in 259 patients with PDAC. We constructed a novel modified SureSelect-KAPA-Illumina platform and an original panel of 60 genes. We then performed targeted deep sequencing of cfDNA and matched germline DNA samples in 48 patients who had ≥1% mutant allele frequencies of KRAS in plasma cfDNA. Importantly, potentially targetable somatic mutations were identified in 14 of 48 patients (29.2%) examined by targeted deep sequencing of cfDNA. We also analyzed somatic copy number alterations based on the targeted sequencing data using our in-house algorithm, and potentially targetable amplifications were detected. Assessment of mutations and copy number alterations in plasma cfDNA may provide a prognostic and diagnostic tool to assist decisions regarding optimal therapeutic strategies for PDAC patients.
Inhibition of HMGA2 binding to DNA by netropsin

PubMed Central

Miao, Yi; Cui, Tengjiao; Leng, Fenfei; Wilson, W. David

2008-01-01

The design of small synthetic molecules that can be used to affect gene expression is an area of active interest for development of agents in therapeutic and biotechnology applications. Many compounds that target the minor groove in AT sequences in DNA are well characterized and are promising reagents for use as modulators of protein-DNA complexes. The mammalian high mobility group transcriptional factor, HMGA2, also targets the DNA minor groove and plays critical roles in disease processes from cancer to obesity. Biosensor-surface plasmon resonance methods were used to monitor HMGA2 binding to target sites on immobilized DNA and a competition assay for inhibition of the HMGA2-DNA complex was designed. HMGA2 binds strongly to the DNA through AT hook domains with KD values of 20 - 30 nM depending on the DNA sequence. The well-characterized minor groove binder, netropsin, was used to develop and test the assay. The compound has two binding sites in the protein-DNA interaction sequence and this provides an advantage for inhibition. An equation for analysis of results when the inhibitor has two binding sites in the biopolymer recognition surface is presented with the results. The assay provides a platform for discovery of HMGA2 inhibitors. PMID:18023407
Chromosome evolution in the Thermotogales: large-scale inversions and strain diversification of CRISPR sequences.

PubMed

DeBoy, Robert T; Mongodin, Emmanuel F; Emerson, Joanne B; Nelson, Karen E

2006-04-01

In the present study, the chromosomes of two members of the Thermotogales were compared. A whole-genome alignment of Thermotoga maritima MSB8 and Thermotoga neapolitana NS-E has revealed numerous large-scale DNA rearrangements, most of which are associated with CRISPR DNA repeats and/or tRNA genes. These DNA rearrangements do not include the putative origin of DNA replication but move within the same replichore, i.e., the same replicating half of the chromosome (delimited by the replication origin and terminus). Based on cumulative GC skew analysis, both the T. maritima and T. neapolitana lineages contain one or two major inverted DNA segments. Also, based on PCR amplification and sequence analysis of the DNA joints that are associated with the major rearrangements, the overall chromosome architecture was found to be conserved at most DNA joints for other strains of T. neapolitana. Taken together, the results from this analysis suggest that the observed chromosomal rearrangements in the Thermotogales likely occurred by successive inversions after their divergence from a common ancestor and before strain diversification. Finally, sequence analysis shows that size polymorphisms in the DNA joints associated with CRISPRs can be explained by expansion and possibly contraction of the DNA repeat and spacer unit, providing a tool for discerning the relatedness of strains from different geographic locations.
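The cumulative GC skew analysis mentioned in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names and the window size are assumptions. In many bacterial chromosomes the cumulative curve tends to reach its minimum near the replication origin and its maximum near the terminus, so a large inverted segment may appear as a stretch where the slope of the curve is locally reversed.

```python
def cumulative_gc_skew(sequence, window=1000):
    """Cumulative GC skew, one value per non-overlapping window.

    The window size is a hypothetical default; real analyses choose it
    to suit the genome size.
    """
    seq = sequence.upper()
    cumulative = []
    total = 0.0
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        # Skew of one window is (G - C) / (G + C); use 0 if it has no G or C.
        skew = (g - c) / (g + c) if (g + c) else 0.0
        total += skew
        cumulative.append(total)
    return cumulative


def skew_extremes(cumulative):
    """Window indices of the minimum and maximum of the cumulative curve."""
    min_idx = min(range(len(cumulative)), key=cumulative.__getitem__)
    max_idx = max(range(len(cumulative)), key=cumulative.__getitem__)
    return min_idx, max_idx
```

Multiplying a returned window index by the window size gives an approximate chromosome coordinate for the corresponding extreme.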
Barcoding of fresh water fishes from Pakistan.

PubMed

Karim, Asma; Iqbal, Asad; Akhtar, Rehan; Rizwan, Muhammad; Amar, Ali; Qamar, Usman; Jahan, Shah

2016-07-01

DNA barcoding is a taxonomic method that uses small genetic markers in organisms' mitochondrial DNA (mtDNA) for identification of particular species. It uses sequence diversity in a 658-base pair fragment near the 5' end of the mitochondrial cytochrome c oxidase subunit 1 (CO1) gene as a tool for species identification. DNA barcoding is a more accurate and reliable method than morphological identification. It is equally useful in juvenile and adult stages of fishes. The present study was conducted to genetically identify three farm fish species of Pakistan (Cyprinus carpio, Cirrhinus mrigala, and Ctenopharyngodon idella), all of which belong to the family Cyprinidae. The CO1 gene was amplified, and the PCR products were sequenced and analyzed with bioinformatic software. Conspecific, congeneric, and confamilial K2P nucleotide divergence was estimated. From these findings, it was concluded that the CO1 gene sequence may serve as a milestone for the identification of related species at the molecular level.
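The K2P (Kimura 2-parameter) divergence referred to in the abstract above is computed from the proportions of transitions (P) and transversions (Q) between two aligned sequences as d = -0.5 ln((1 - 2P - Q) sqrt(1 - 2Q)). The sketch below is a minimal illustration of that formula; the function name and the handling of gaps and ambiguous bases are assumptions, not the study's software.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        # A change within purines or within pyrimidines is a transition.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    # for the model (argument not positive).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

Applied pairwise within a species, within a genus, and within a family, distances of this kind give the conspecific, congeneric, and confamilial divergence levels that barcoding studies compare.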
EMPOP-quality mtDNA control region sequences from Kashmiri of Azad Jammu & Kashmir, Pakistan.

PubMed

Rakha, Allah; Peng, Min-Sheng; Bi, Rui; Song, Jiao-Jiao; Salahudin, Zeenat; Adan, Atif; Israr, Muhammad; Yao, Yong-Gang

2016-11-01

The mitochondrial DNA (mtDNA) control region (nucleotide positions 16024-576) sequences were generated by Sanger sequencing for 317 self-identified Kashmiris from all districts of Azad Jammu & Kashmir, Pakistan. The population sample set showed a total of 251 haplotypes, with a relatively high haplotype diversity (0.9977) and a low random match probability (0.54%). The matrilineal lineages belonged to three different phylogeographic origins: Western Eurasian (48.9%), South Asian (47.0%) and East Asian (4.1%). The data from the present study were compared with previous data from Pakistan and other worldwide populations (Central Asia, Western Asia, and East & Southeast Asia). The dataset is made available through EMPOP under accession number EMP00679 and will serve as an mtDNA reference database in forensic casework in Pakistan. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
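The summary statistics quoted in the abstract above can be illustrated with a short sketch. It assumes the definitions commonly used in forensic mtDNA studies: random match probability as the sum of squared haplotype frequencies, and haplotype diversity as n/(n - 1) times one minus that sum. The abstract does not state which formulas were used, and the function name is hypothetical.

```python
from collections import Counter


def haplotype_stats(haplotypes):
    """Return (number of haplotypes, haplotype diversity, random match probability).

    `haplotypes` holds one haplotype label per sampled individual.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two individuals")
    counts = Counter(haplotypes)
    # Sum of squared haplotype frequencies.
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    random_match_probability = sum_sq
    # Sample-size-corrected diversity (assumed definition, see lead-in).
    diversity = n / (n - 1) * (1 - sum_sq)
    return len(counts), diversity, random_match_probability
```

Under these assumed definitions, a random match probability of 0.0054 with n = 317 gives a diversity of about 0.9977, which agrees with the values reported in the abstract.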
Myopathic mtDNA Depletion Syndrome Due to Mutation in TK2 Gene.

PubMed

Martín-Hernández, Elena; García-Silva, María Teresa; Quijada-Fraile, Pilar; Rodríguez-García, María Elena; Hernández-Laín, Aurelio; Coca-Robinot, David; Rivera, Henry; Fernández-Toral, Joaquín; Arenas, Joaquín; Martín, MiguelÁngel; Martínez-Azorín, Francisco

2016-02-29

Whole-exome sequencing (WES) was used to identify the disease gene(s) in a Spanish girl with failure to thrive, muscle weakness, mild facial weakness, elevated creatine kinase (CK), deficiency of mitochondrial complex III and depletion of mtDNA. With WES data, it was possible to obtain the whole mtDNA sequence and rule out any pathogenic variant in this genome. The analysis of the whole exome uncovered a homozygous pathogenic mutation in the thymidine kinase 2 gene (TK2; NM_004614.4:c.323C>T, p.T108M). TK2 mutations have been identified mainly in patients with the myopathic form of mtDNA depletion syndromes (MDS). This patient presents an atypical TK2-related myopathic form of MDS, because despite having a very low content of mtDNA (<20%), she presents a slower and less severe evolution of the disease. In conclusion, our data confirm the role of the TK2 gene in MDS and expand the phenotypic spectrum.
Oligonucleotide Sensor Based on Selective Capture of Upconversion Nanoparticles Triggered by Target-Induced DNA Interstrand Ligand Reaction

PubMed Central

2017-01-01

We present a sensor that exploits the phenomenon of upconversion luminescence to detect the presence of specific sequences of small oligonucleotides such as miRNAs among others. The sensor is based on NaYF4:Yb,Er@SiO2 nanoparticles functionalized with ssDNA that contain azide groups on the 3′ ends. In the presence of a target sequence, interstrand ligation is possible via the click-reaction between one azide of the upconversion probe and a DBCO-ssDNA-biotin probe present in the solution. As a result of this specific and selective process, biotin is covalently attached to the surface of the upconversion nanoparticles. The presence of biotin on the surface of the nanoparticles allows their selective capture on a streptavidin-coated support, giving a luminescent signal proportional to the amount of target strands present in the test samples. With the aim of studying the analytical properties of the sensor, total RNA samples were extracted from healthy mosquitoes and were spiked-in with a specific target sequence at different concentrations. The result of these experiments revealed that the sensor was able to detect 10^-17 moles per well (100 fM) of the target sequence in mixtures containing 100 ng of total RNA per well. A similar limit of detection was found for spiked human serum samples, demonstrating the suitability of the sensor for detecting specific sequences of small oligonucleotides under real conditions. In contrast, in the presence of noncomplementary sequences or sequences having mismatches, the luminescent signal was negligible or conspicuously reduced. PMID:28332400
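The detection limit in the abstract above is given both as an amount (moles per well) and as a concentration (100 fM). The two are related by amount = concentration × volume, as the sketch below shows. The 100 µL well volume is an assumption inferred from the two reported figures, not a value stated in the abstract, and the function names are hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)


def moles_in_well(concentration_molar, volume_litres):
    """Amount of target in a well: concentration (mol/L) times volume (L)."""
    return concentration_molar * volume_litres


def molecules_in_well(concentration_molar, volume_litres):
    """Number of target strands corresponding to that amount."""
    return moles_in_well(concentration_molar, volume_litres) * AVOGADRO


# Assumed well volume of 100 microlitres; 100 fM = 100e-15 mol/L.
amount = moles_in_well(100e-15, 100e-6)
strands = molecules_in_well(100e-15, 100e-6)
```

With these inputs the amount is on the order of 10^-17 moles, which corresponds to roughly six million target strands per well.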
Isolation of a cDNA Encoding a Granule-Bound 152-Kilodalton Starch-Branching Enzyme in Wheat

PubMed Central

Båga, Monica; Nair, Ramesh B.; Repellin, Anne; Scoles, Graham J.; Chibbar, Ravindra N.

2000-01-01

Screening of a wheat (Triticum aestivum) cDNA library for starch-branching enzyme I (SBEI) genes combined with 5′-rapid amplification of cDNA ends resulted in isolation of a 4,563-bp composite cDNA, Sbe1c. Based on sequence alignment to characterized SBEI cDNA clones isolated from plants, the SBEIc predicted from the cDNA sequence was produced with a transit peptide directing the polypeptide into plastids. Furthermore, the predicted mature form of SBEIc was much larger (152 kD) than previously characterized plant SBEI (80–100 kD) and contained a partial duplication of SBEI sequences. The first SBEI domain showed high amino acid similarity to a 74-kD wheat SBEI-like protein that is inactive as a branching enzyme when expressed in Escherichia coli. The second SBEI domain on SBEIc was identical in sequence to a functional 87-kD SBEI produced in the wheat endosperm. Immunoblot analysis of proteins produced in developing wheat kernels demonstrated that the 152-kD SBEIc was, in contrast to the 87- to 88-kD SBEI, preferentially associated with the starch granules. Proteins similar in size and recognized by wheat SBEI antibodies were also present in Triticum monococcum, Triticum tauschii, and Triticum turgidum subsp. durum. PMID:10982440
Single-molecule study of thymidine glycol and i-motif through the alpha-hemolysin ion channel

NASA Astrophysics Data System (ADS)

He, Lidong

Nanopore-based devices have emerged as a single-molecule detection and analysis tool for a wide range of applications. By electrophoretically driving DNA molecules across a nanosized pore, a wealth of information can be obtained, including unfolding kinetics and DNA-protein interactions. This single-molecule method has the potential to sequence kilobase-length DNA polymers without amplification or labeling, approaching "the third generation" genome sequencing for around $1000 within 24 hours. alpha-Hemolysin biological nanopores have the advantages of excellent stability, low noise level, and amenability to precise site-directed mutagenesis for engineering this protein nanopore. The first work presented in this thesis established the current signal of the thymidine glycol lesion in DNA oligomers through an immobilization experiment. The thymidine glycol enantiomers were differentiated from each other by different current blockage levels. Also, the effect of bulky hydrophobic adducts on the current blockage was investigated. Secondly, the alpha-hemolysin nanopore was used to study the human telomere i-motif and RET oncogene i-motif at a single-molecule level. In Chapter 3, it was demonstrated that the alpha-hemolysin nanopore can differentiate an i-motif form and single-strand DNA form at different pH values based on the same sequence. In addition, it shows potential to differentiate the folding topologies generated from the same DNA sequence.
Single-cell genomic sequencing using Multiple Displacement Amplification.

PubMed

Lasken, Roger S

2007-10-01

Single microbial cells can now be sequenced using DNA amplified by the Multiple Displacement Amplification (MDA) reaction. The few femtograms of DNA in a bacterium are amplified into micrograms of high molecular weight DNA suitable for DNA library construction and Sanger sequencing. The MDA-generated DNA also performs well when used directly as template for pyrosequencing by the 454 Life Sciences method. While MDA from single cells loses some of the genomic sequence, this approach will greatly accelerate the pace of sequencing from uncultured microbes. The genetically linked sequences from single cells are also a powerful tool to be used in guiding genomic assembly of shotgun sequences of multiple organisms from environmental DNA extracts (metagenomic sequences).

Structural analysis of the rDNA intergenic spacer of Brassica nigra: evolutionary divergence of the spacers of the three diploid Brassica species.

PubMed

Bhatia, S; Singh Negi, M; Lakshmikumaran, M

1996-11-01

EcoRI restriction of the B. nigra rDNA recombinants, isolated from a lambda genomic library, showed that the 3.9-kb fragment corresponded to the Intergenic Spacer (IGS), which was sequenced and found to be 3,928 bp in size. Sequence and dot-matrix analyses showed that the organization of the B. nigra rDNA IGS was typical of most rDNA spacers, consisting of a central repetitive region and flanking unique sequences on either side. The repetitive region was composed of two repeat families: RF 'A' and RF 'B'. The B. nigra RF 'A' consisted of a tandem array of three full-length copies of a 106-bp sequence element. RF 'B' was composed of 66 tandemly repeated elements. Each 'B' element was only 21 bp in size and this is the smallest repeat unit identified in plant rDNA to date. The putative transcription initiation site (TIS) was identified as nucleotide position 3,110. Based on the sequence analysis it was suggested that the present organization of the repeat families was generated by successive cycles of deletions and amplifications and was being maintained by homogenization processes such as gene conversion and crossing-over. A detailed comparison of the rDNA IGS sequences of the three diploid Brassica species (namely, B. nigra, B. campestris, and B. oleracea) was carried out. First, comparisons revealed that B. campestris and B. oleracea were close to each other as the repeat families in both showed high sequence homology between each other. Second, the repeat elements in both the species were organized in an interspersed manner. Third, a 52-bp sequence, present just downstream of the repeats in B. campestris, was found to be identical to the B. oleracea repeats, thereby suggesting a common progenitor. On the other hand, in B. nigra no interspersion pattern of organization of repeats was observed. Further, the B. nigra RF 'A' was identified as distinct from the repeat families of B. campestris and B. oleracea.
Based on this analysis, it was suggested that during speciation B. campestris and B. oleracea evolved in one lineage whereas B. nigra diverged into a separate lineage. The comparative analysis of the IGS helped in identifying not only conserved ancestral sequence motifs of possible functional significance such as promoters and enhancers, but also sequences which showed variation between the three diploid species and were therefore identified as species-specific sequences.
Nuclear 28S rDNA phylogeny supports the basal placement of Noctiluca scintillans (Dinophyceae; Noctilucales) in dinoflagellates.

PubMed

Ki, Jang-Seu

2010-05-01

Noctiluca scintillans (Macartney) Kofoid et Swezy, 1921 is an unarmoured heterotrophic dinoflagellate with a global distribution, and has been considered one of the ancestral taxa among dinoflagellates. Recently, 18S rDNA, actin, alpha-, beta-tubulin, and Hsp90-based phylogenies have shown the basal position of the noctilucids. However, the relationships of dinoflagellates in the basal lineages are still controversial. Although the nuclear rDNA (e.g. 18S, ITS-5.8S, and 28S) contains much genetic information, DNA sequences of N. scintillans rDNA molecules have not yet been sufficiently characterized. Here the author sequenced a long-range nuclear rDNA, spanning from the 18S to the D5 region of the 28S rDNA, of N. scintillans. The present N. scintillans had a nearly identical genotype (>99.0% similarity) compared to other Noctiluca sequences from different geographic origins. Nucleotide divergence in the partial 28S rDNA was significantly high (p<0.05) as compared to the 18S rDNA, demonstrating that the information from 28S rDNA is more variable. The 28S rDNA phylogeny of 17 selected dinoflagellates, two perkinsids, and two apicomplexans as outgroups showed that N. scintillans and Oxyrrhis marina formed a clade that diverged separately from core dinoflagellates. Copyright (c) 2009 Elsevier GmbH. All rights reserved.
Genome Partitioner: A web tool for multi-level partitioning of large-scale DNA constructs for synthetic biology applications.

PubMed

Christen, Matthias; Del Medico, Luca; Christen, Heinz; Christen, Beat

2017-01-01

Recent advances in lower-cost DNA synthesis techniques have enabled new innovations in the field of synthetic biology. Still, efficient design and higher-order assembly of genome-scale DNA constructs remains a labor-intensive process. Given the complexity, computer assisted design tools that fragment large DNA sequences into fabricable DNA blocks are needed to pave the way towards streamlined assembly of biological systems. Here, we present the Genome Partitioner software implemented as a web-based interface that permits multi-level partitioning of genome-scale DNA designs. Without the need for specialized computing skills, biologists can submit their DNA designs to a fully automated pipeline that generates the optimal retrosynthetic route for higher-order DNA assembly. To test the algorithm, we partitioned a 783 kb Caulobacter crescentus genome design. We validated the partitioning strategy by assembling a 20 kb test segment encompassing a difficult to synthesize DNA sequence. Successful assembly from 1 kb subblocks into the 20 kb segment highlights the effectiveness of the Genome Partitioner for reducing synthesis costs and timelines for higher-order DNA assembly. The Genome Partitioner is broadly applicable to translate DNA designs into ready to order sequences that can be assembled with standardized protocols, thus offering new opportunities to harness the diversity of microbial genomes for synthetic biology applications. The Genome Partitioner web tool can be accessed at https://christenlab.ethz.ch/GenomePartitioner.
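The abstract above describes partitioning a 783 kb design into a 20 kb test segment built from 1 kb subblocks. A minimal sketch of that two-level idea follows, assuming naive fixed-size cuts. The function names and the fixed sizes are illustrative assumptions; the actual tool is described as computing an optimal retrosynthetic route, which this sketch does not attempt.

```python
def split_interval(start, end, size):
    """Split the half-open interval [start, end) into consecutive pieces of at most `size` bases."""
    return [(s, min(s + size, end)) for s in range(start, end, size)]


def two_level_partition(design_length, segment_size=20_000, block_size=1_000):
    """Map each segment (start, end) to the list of subblocks it is built from.

    Hypothetical helper: sizes are fixed and no sequence features are considered.
    """
    plan = {}
    for segment in split_interval(0, design_length, segment_size):
        plan[segment] = split_interval(segment[0], segment[1], block_size)
    return plan


if __name__ == "__main__":
    plan = two_level_partition(783_000)
    print(len(plan), "segments")
```

A real partitioner would also have to place block boundaries so that adjacent blocks share assembly overlaps and avoid hard-to-synthesize regions; none of that is modelled here.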
Structural analysis of DNA binding by C.Csp231I, a member of a novel class of R-M controller proteins regulating gene expression

DOE Office of Scientific and Technical Information (OSTI.GOV)

Shevtsov, M. B.; Streeter, S. D.; Thresh, S.-J.

2015-02-01

The structure of the new class of controller proteins (exemplified by C.Csp231I) in complex with its 21 bp DNA-recognition sequence is presented, and the molecular basis of sequence recognition in this class of proteins is discussed. An unusual extended spacer between the dimer binding sites suggests a novel interaction between the two C-protein dimers. In a wide variety of bacterial restriction–modification systems, a regulatory ‘controller’ protein (or C-protein) is required for effective transcription of its own gene and for transcription of the endonuclease gene found on the same operon. We have recently turned our attention to a new class of controller proteins (exemplified by C.Csp231I) that have quite novel features, including a much larger DNA-binding site with an 18 bp (∼60 Å) spacer between the two palindromic DNA-binding sequences and a very different recognition sequence from the canonical GACT/AGTC. Using X-ray crystallography, the structure of the protein in complex with its 21 bp DNA-recognition sequence was solved to 1.8 Å resolution, and the molecular basis of sequence recognition in this class of proteins was elucidated. An unusual aspect of the promoter sequence is the extended spacer between the dimer binding sites, suggesting a novel interaction between the two C-protein dimers when bound to both recognition sites correctly spaced on the DNA. A U-bend model is proposed for this tetrameric complex, based on the results of gel-mobility assays, hydrodynamic analysis and the observation of key contacts at the interface between dimers in the crystal.
CDSbank: taxonomy-aware extraction, selection, renaming and formatting of protein-coding DNA or amino acid sequences.

PubMed

Hazes, Bart

2014-02-28

Protein-coding DNA sequences and their corresponding amino acid sequences are routinely used to study relationships between sequence, structure, function, and evolution. The rapidly growing size of sequence databases increases the power of such comparative analyses but it makes it more challenging to prepare high quality sequence data sets with control over redundancy, quality, completeness, formatting, and labeling. Software tools for some individual steps in this process exist but manual intervention remains a common and time consuming necessity. CDSbank is a database that stores both the protein-coding DNA sequence (CDS) and amino acid sequence for each protein annotated in Genbank. CDSbank also stores Genbank feature annotation, a flag to indicate incomplete 5' and 3' ends, full taxonomic data, and a heuristic to rank the scientific interest of each species. This rich information allows fully automated data set preparation with a level of sophistication that aims to meet or exceed manual processing. Defaults ensure ease of use for typical scenarios while allowing great flexibility when needed. Access is via a free web server at http://hazeslab.med.ualberta.ca/CDSbank/. CDSbank presents a user-friendly web server to download, filter, format, and name large sequence data sets. Common usage scenarios can be accessed via pre-programmed default choices, while optional sections give full control over the processing pipeline. Particular strengths are: extract protein-coding DNA sequences just as easily as amino acid sequences, full access to taxonomy for labeling and filtering, awareness of incomplete sequences, and the ability to take one protein sequence and extract all synonymous CDS or identical protein sequences in other species. Finally, CDSbank can also create labeled property files to, for instance, annotate or re-label phylogenetic trees.
Fine Tuning Gene Expression: The Epigenome

PubMed Central

Mohtat, Davoud; Susztak, Katalin

2011-01-01

An epigenetic trait is a stably inherited phenotype resulting from changes in a chromosome without alterations in the DNA sequence. Epigenetic modifications, such as DNA methylation, together with covalent modification of histones, are thought to alter chromatin density and accessibility of the DNA to cellular machinery, thereby modulating the transcriptional potential of the underlying DNA sequence. Because epigenetic marks are under environmental influence, epigenetics provides an added layer of variation that might mediate the relationship between genotype and internal and external environmental factors. Integration of our knowledge in genetics, epigenomics and genomics with the use of systems biology tools may present investigators with powerful new tools to study many complex human diseases such as kidney disease. PMID:21044758
Bacillus odysseyi isolate

NASA Technical Reports Server (NTRS)

La Duc, Myron Thomas (Inventor); Venkateswaran, Kasthuri (Inventor)

2007-01-01

The present invention relates to discovery and isolation of a biologically pure culture of a Bacillus odysseyi isolate with high adherence and sterilization-resistant properties. B. odysseyi is a round-spore-forming Bacillus species that produces an exosporium. This novel species has been characterized on the basis of phenotypic traits, 16S rDNA sequence analysis and DNA-DNA hybridization. According to the results of these analyses, this strain belongs to the genus Bacillus and the type strain is 34hs-1ᵀ (= ATCC PTA-4993ᵀ = NRRL B-30641ᵀ = NBRC 100172ᵀ). The GenBank accession number for the 16S rDNA sequence of strain 34hs-1ᵀ is AF526913.
Direct uptake and degradation of DNA by lysosomes

PubMed Central

Fujiwara, Yuuki; Kikuchi, Hisae; Aizawa, Shu; Furuta, Akiko; Hatanaka, Yusuke; Konya, Chiho; Uchida, Kenko; Wada, Keiji; Kabuta, Tomohiro

2013-01-01

Lysosomes contain various hydrolases that can degrade proteins, lipids, nucleic acids and carbohydrates. We recently discovered “RNautophagy,” an autophagic pathway in which RNA is directly taken up by lysosomes and degraded. A lysosomal membrane protein, LAMP2C, a splice variant of LAMP2, binds to RNA and acts as a receptor for this pathway. In the present study, we show that DNA is also directly taken up by lysosomes and degraded. Like RNautophagy, this autophagic pathway, which we term “DNautophagy,” is dependent on ATP. The cytosolic sequence of LAMP2C also directly interacts with DNA, and LAMP2C functions as a receptor for DNautophagy, in addition to RNautophagy. Similarly to RNA, DNA binds to the cytosolic sequences of fly and nematode LAMP orthologs. Together with the findings of our previous study, our present findings suggest that RNautophagy and DNautophagy are evolutionarily conserved systems in Metazoa. PMID:23839276
Investigations of Escherichia coli promoter sequences with artificial neural networks: New signals discovered upstream of the transcriptional startpoint

DOE Office of Scientific and Technical Information (OSTI.GOV)

Pedersen, A.G.; Engelbrecht, J.

1995-12-31

In this paper we present a novel method for using the learning ability of a neural network as a measure of information in local regions of input data. Using the method to analyze Escherichia coli promoters, we discover all previously described signals, and furthermore find new signals that are regularly spaced along the promoter region. The spacing of all signals corresponds to the helical periodicity of DNA, meaning that the signals are all present on the same face of the DNA helix in the promoter region. This is consistent with a model where the RNA polymerase contacts the promoter on one side of the DNA, and suggests that the regions important for promoter recognition may include more positions on the DNA than usually assumed. We furthermore analyze the E. coli promoters by calculating the Kullback-Leibler distance, and by constructing sequence logos.
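The Kullback-Leibler distance mentioned in the abstract above can be illustrated per alignment column. The sketch below is an assumption-laden example, not the authors' procedure: it takes equal-length, pre-aligned sequences containing only A, C, G and T, and compares each column with a uniform background. The function name and the toy sequences are hypothetical.

```python
import math
from collections import Counter


def kl_per_position(seqs, background=None):
    """Kullback-Leibler divergence (in bits) of each alignment column from a background.

    Assumes all sequences have the same length and use only the letters A, C, G, T.
    """
    if background is None:
        background = {base: 0.25 for base in "ACGT"}  # uniform background (assumption)
    result = []
    for i in range(len(seqs[0])):
        counts = Counter(s[i] for s in seqs)
        total = sum(counts.values())
        kl = 0.0
        for base, n in counts.items():
            p = n / total
            kl += p * math.log2(p / background[base])  # bases with zero count contribute 0
        result.append(kl)
    return result


if __name__ == "__main__":
    toy = ["TATAAT", "TATGAT", "TACAAT", "TATAAT"]  # made-up -10-like hexamers
    print([round(x, 3) for x in kl_per_position(toy)])
```

With a uniform background this quantity equals 2 bits minus the column's Shannon entropy, which is the letter-stack height used in sequence logos when no small-sample correction is applied.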
ACCELERATED EVOLUTION OF LAND SNAILS MANDARINA IN THE OCEANIC BONIN ISLANDS: EVIDENCE FROM MITOCHONDRIAL DNA SEQUENCES.

PubMed

Chiba, Satoshi

1999-04-01

An endemic land snail genus Mandarina of the oceanic Bonin (Ogasawara) Islands shows exceptionally rapid evolution not only of morphological and ecological traits, but of DNA sequence. A phylogenetic relationship based on mitochondrial DNA (mtDNA) sequences suggests that morphological differences equivalent to the differences between families were produced between Mandarina and its ancestor during the Pleistocene. The inferred phylogeny shows that species with similar morphologies and life habitats appeared repeatedly and independently in different lineages and islands at different times. Sequential adaptive radiations occurred in different islands of the Bonin Islands and species occupying arboreal, semiarboreal, and terrestrial habitat arose independently in each island. Because of a close relationship between shell morphology and life habitat, independent evolution of the same life habitat in different islands created species possessing the same shell morphology in different islands and lineages. This rapid evolution produced some incongruences between phylogenetic relationship and species taxonomy. Levels of sequence divergence of mtDNA among the species of Mandarina are extremely high. The maximum levels of sequence divergence at 16S and 12S ribosomal RNA sequences within Mandarina are 18.7% and 17.7%, respectively, and this suggests that evolution of mtDNA of Mandarina is extremely rapid, more than 20 times faster than the standard rate in other animals. The present examination reveals that evolution of morphological and ecological traits occurs at extremely high rates in the time of adaptive radiation, especially in fragmented environments. © 1999 The Society for the Study of Evolution.
Gel-seq: A Method for Simultaneous Sequencing Library Preparation of DNA and RNA Using Hydrogel Matrices.

PubMed

Hoople, Gordon D; Richards, Andrew; Wu, Yan; Pisano, Albert P; Zhang, Kun

2018-03-26

The ability to amplify and sequence either DNA or RNA from small starting samples has only been achieved in the last five years. Unfortunately, the standard protocols for generating genomic or transcriptomic libraries are incompatible and researchers must choose whether to sequence DNA or RNA for a particular sample. Gel-seq solves this problem by enabling researchers to simultaneously prepare libraries for both DNA and RNA starting with 100 - 1000 cells using a simple hydrogel device. This paper presents a detailed approach for the fabrication of the device as well as the biological protocol to generate paired libraries. We designed Gel-seq so that it could be easily implemented by other researchers; many genetics labs already have the necessary equipment to reproduce the Gel-seq device fabrication. Our protocol employs commonly-used kits for both whole-transcript amplification (WTA) and library preparation, which are also likely to be familiar to researchers already versed in generating genomic and transcriptomic libraries. Our approach allows researchers to bring to bear the power of both DNA and RNA sequencing on a single sample without splitting and with negligible added cost.
Haplogroup relationships between domestic and wild sheep resolved using a mitogenome panel.

PubMed

Meadows, J R S; Hiendleder, S; Kijas, J W

2011-04-01

Five haplogroups have been identified in domestic sheep through global surveys of mitochondrial (mt) sequence variation; however, these group classifications are often based on small fragments of the complete mtDNA sequence: the partial control region or the cytochrome B gene. This study presents the complete mitogenome from representatives of each haplogroup identified in domestic sheep, plus a sample of their wild relatives. Comparison of the sequence successfully resolved the relationships between each haplogroup and provided insight into the relationship with wild sheep. The five haplogroups were characterised as branching independently, a radiation that shared a common ancestor 920,000 ± 190,000 years ago based on protein coding sequence. The utility of various mtDNA components to inform the true relationship between sheep was also examined with Bayesian, maximum likelihood and partitioned Bremer support analyses. The control region was found to be the mtDNA component that contributed the highest amount of support to the tree generated using the complete data set. This study provides the nucleus of a mtDNA mitogenome panel, which can be used to assess additional mitogenomes and serve as a reference set to evaluate small fragments of the mtDNA.
Haplogroup relationships between domestic and wild sheep resolved using a mitogenome panel

PubMed Central

Meadows, J R S; Hiendleder, S; Kijas, J W

2011-01-01

Five haplogroups have been identified in domestic sheep through global surveys of mitochondrial (mt) sequence variation; however, these group classifications are often based on small fragments of the complete mtDNA sequence: the partial control region or the cytochrome B gene. This study presents the complete mitogenome from representatives of each haplogroup identified in domestic sheep, plus a sample of their wild relatives. Comparison of the sequence successfully resolved the relationships between each haplogroup and provided insight into the relationship with wild sheep. The five haplogroups were characterised as branching independently, a radiation that shared a common ancestor 920 000±190 000 years ago based on protein coding sequence. The utility of various mtDNA components to inform the true relationship between sheep was also examined with Bayesian, maximum likelihood and partitioned Bremer support analyses. The control region was found to be the mtDNA component that contributed the highest amount of support to the tree generated using the complete data set. This study provides the nucleus of a mtDNA mitogenome panel, which can be used to assess additional mitogenomes and serve as a reference set to evaluate small fragments of the mtDNA. PMID:20940734
The Gene Construction Kit: a new computer program for manipulating and presenting DNA constructs.

PubMed

Gross, R H

1990-06-01

The Gene Construction Kit is a new tool for manipulating and displaying DNA sequence information. Constructs can be displayed either graphically or as formatted sequence. Segments of DNA can be cut out with restriction enzymes and pasted into other sites. The program keeps track of staggered ends and notifies the user of incompatibilities and offers a choice of ligation options. Each segment of a construct can have its own defined thickness, pattern, direction and color. The sequence listing can be displayed in any font and style in user defined grouping. Nucleotide positions can be displayed as can restriction sites and protein sequences. The DNA can be displayed as either single- or double-stranded. Restriction sites can be readily marked. Alternative views of the DNA can be maintained and the history of the construct automatically stored. Gel electrophoresis patterns can be generated and can be used in cloning project design. Extensive comments can be stored with the construct and can be searched rapidly for key words. High quality illustrations showing multiple editable constructs with added graphics and text information can be generated for slides, posters or publication.
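The cut-and-paste operation described in the abstract above can be sketched for the simplest case: digesting a linear sequence with one enzyme. The example uses EcoRI, which recognizes GAATTC and cuts after the first G on the top strand. The function names are hypothetical, only the top strand is tracked, and the staggered ends the program is said to keep track of are not modelled.

```python
def find_sites(seq, site):
    """Return the start index of every (possibly overlapping) occurrence of `site` in `seq`."""
    positions = []
    i = seq.find(site)
    while i != -1:
        positions.append(i)
        i = seq.find(site, i + 1)
    return positions


def digest_linear(seq, site="GAATTC", cut_offset=1):
    """Cut a linear top-strand sequence at each recognition site.

    `cut_offset` is the number of bases into the site where the top strand is cut
    (1 for EcoRI: G^AATTC). Returns the fragments in 5' to 3' order.
    """
    cuts = [p + cut_offset for p in find_sites(seq, site)]
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    for fragment in digest_linear("AAGAATTCTTGAATTCGG"):
        print(len(fragment), fragment)
```

Fragment lengths from such a digest are what a simulated gel pattern would be built from; checking end compatibility before ligation would additionally need the bottom-strand cut position.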
Fragmentation of contaminant and endogenous DNA in ancient samples determined by shotgun sequencing; prospects for human palaeogenomics.

PubMed

García-Garcerà, Marc; Gigli, Elena; Sanchez-Quinto, Federico; Ramirez, Oscar; Calafell, Francesc; Civit, Sergi; Lalueza-Fox, Carles

2011-01-01

Despite the successful retrieval of genomes from past remains, the prospects for human palaeogenomics remain unclear because of the difficulty of distinguishing contaminant from endogenous DNA sequences. Previous sequence data generated on high-throughput sequencing platforms indicate that fragmentation of ancient DNA sequences is a characteristic trait primarily arising due to depurination processes that create abasic sites leading to DNA breaks. METHODOLOGY/PRINCIPAL FINDINGS: To investigate whether this pattern is present in ancient remains from a temperate environment, we have 454-FLX pyrosequenced different samples dated between 5,500 and 49,000 years ago: a bone from an extinct goat (Myotragus balearicus) that was treated with a depurinating agent (bleach), an Iberian lynx bone not subjected to any treatment, a human Neolithic sample from Barcelona (Spain), and a Neandertal sample from the El Sidrón site (Asturias, Spain). The efficiency of retrieval of endogenous sequences is below 1% in all cases. We have used the non-human samples to identify human sequences (0.35 and 1.4%, respectively) that we positively know are contaminants. We observed that bleach treatment appears to create a depurination-associated fragmentation pattern in resulting contaminant sequences that is indistinguishable from previously described endogenous sequences. Furthermore, the nucleotide composition pattern observed in 5' and 3' ends of contaminant sequences is much more complex than the flat pattern previously described in some Neandertal contaminants. Although much research on samples with known contaminant histories is needed, our results suggest that endogenous and contaminant sequences cannot be distinguished by the fragmentation pattern alone.
Lineage divergence detected in the malaria vector Anopheles marajoara (Diptera: Culicidae) in Amazonian Brazil

PubMed Central

2010-01-01

Background Cryptic species complexes are common among anophelines. Previous phylogenetic analysis based on the complete mtDNA COI gene sequences detected paraphyly in the Neotropical malaria vector Anopheles marajoara. The "Folmer region" detects a single taxon using a 3% divergence threshold. Methods To test the paraphyletic hypothesis and examine the utility of the Folmer region, genealogical trees based on a concatenated (white + 3' COI sequences) dataset and pairwise differentiation of COI fragments were examined. The population structure and demographic history were based on partial COI sequences for 294 individuals from 14 localities in Amazonian Brazil. 109 individuals from 12 localities were sequenced for the nDNA white gene, and 57 individuals from 11 localities were sequenced for the ribosomal DNA (rDNA) internal transcribed spacer 2 (ITS2). Results Distinct A. marajoara lineages were detected by combined genealogical analysis and were also supported among COI haplotypes using a median joining network and AMOVA, with time since divergence during the Pleistocene (<100,000 ya). COI sequences at the 3' end were more variable, demonstrating significant pairwise differentiation (3.82%) compared to the more moderate 2.92% detected by the Folmer region. Lineage 1 was present in all localities, whereas lineage 2 was restricted mainly to the west. Mismatch distributions for both lineages were bimodal, likely due to multiple colonization events and spatial expansion (~798 - 81,045 ya). There appears to be gene flow within, not between lineages, and a partial barrier was detected near Rio Jari in Amapá state, separating western and eastern populations. In contrast, both nDNA data sets (white gene sequences with or without the retention of the 4th intron, and ITS2 sequences and length) detected a single A. marajoara lineage. 
Conclusions Strong support for combined data with significant differentiation detected in the COI and absent in the nDNA suggest that the divergence is recent, and detectable only by the faster evolving mtDNA. A within subgenus threshold of >2% may be more appropriate among sister taxa in cryptic anopheline complexes than the standard 3%. Differences in demographic history and climatic changes may have contributed to mtDNA lineage divergence in A. marajoara. PMID:20929572
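The divergence thresholds discussed in the abstract above (3% versus >2%) can be made concrete with an uncorrected pairwise distance. This is a minimal sketch under stated assumptions: the sequences are already aligned, positions with a gap or N are skipped, and the uncorrected p-distance stands in for whatever distance measure the authors actually used. The function names and toy sequences are hypothetical.

```python
def p_distance(a, b, skip="-N"):
    """Uncorrected pairwise distance: fraction of compared aligned positions that differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for x, y in zip(a, b):
        if x in skip or y in skip:
            continue  # ignore gaps and ambiguous bases
        compared += 1
        if x != y:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return differences / compared


def exceeds_threshold(a, b, threshold=0.03):
    """True if the two sequences differ by more than `threshold` (default: the 3% cut-off)."""
    return p_distance(a, b) > threshold


if __name__ == "__main__":
    print(p_distance("ACGTACGTAC", "ACGTACGTAT"))
```

Under this measure, the 2.92% reported for the Folmer region falls below a 3% threshold but above a 2% one, which is the contrast the conclusions draw.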
Function-Based Algorithms for Biological Sequences

ERIC Educational Resources Information Center

Mohanty, Pragyan Sheela P.

2015-01-01

Two problems at two different abstraction levels of computational biology are studied. At the molecular level, efficient pattern matching algorithms in DNA sequences are presented. For gene order data, an efficient data structure is presented capable of storing all gene re-orderings in a systematic manner. A common characteristic of presented…
Identification of high-specificity H-NS binding site in LEE5 promoter of enteropathogenic Escherichia coli (EPEC).

PubMed

Bhat, Abhay Prasad; Shin, Minsang; Choy, Hyon E

2014-07-01

Histone-like nucleoid structuring protein (H-NS) is a small but abundant protein present in enteric bacteria and is involved in compaction of the DNA and regulation of transcription. Recent reports have suggested that H-NS binds to a specific AT-rich DNA sequence rather than to intrinsically curved DNA in a sequence-independent manner. We detected two high-specificity H-NS binding sites in the LEE5 promoter of EPEC centered at -110 and -138, which were close to the proposed consensus H-NS binding motif. To identify the H-NS binding sequence in the LEE5 promoter, we took a random mutagenesis approach and found that mutations at around -138 were specifically defective in regulation by H-NS. It was concluded that H-NS exerts maximum repression via the specific sequence at around -138 and subsequently contacts a subunit of RNAP through oligomerization.
Site-Specific Integration of Foreign DNA into Minimal Bacterial and Human Target Sequences Mediated by a Conjugative Relaxase

PubMed Central

Agúndez, Leticia; González-Prieto, Coral; Machón, Cristina; Llosa, Matxalen

2012-01-01

Background Bacterial conjugation is a mechanism for horizontal DNA transfer between bacteria which requires cell to cell contact, usually mediated by self-transmissible plasmids. A protein known as relaxase is responsible for the processing of DNA during bacterial conjugation. TrwC, the relaxase of conjugative plasmid R388, is also able to catalyze site-specific integration of the transferred DNA into a copy of its target, the origin of transfer (oriT), present in a recipient plasmid. This reaction confers on TrwC a high biotechnological potential as a tool for genomic engineering. Methodology/Principal Findings We have characterized this reaction by conjugal mobilization of a suicide plasmid to a recipient cell with an oriT-containing plasmid, selecting for the cointegrates. Proteins TrwA and IHF enhanced integration frequency. TrwC could also catalyze integration when expressed from the recipient cell. Both Y18 and Y26 catalytic tyrosyl residues were essential to perform the reaction, while TrwC DNA helicase activity was dispensable. The target DNA could be reduced to 17 bp encompassing TrwC nicking and binding sites. Two human genomic sequences resembling the 17 bp segment were accepted as targets for TrwC-mediated site-specific integration. TrwC could also integrate the incoming DNA molecule into an oriT copy present in the recipient chromosome. Conclusions/Significance The results support a model for TrwC-mediated site-specific integration. This reaction may allow R388 to integrate into the genome of non-permissive hosts upon conjugative transfer. Also, the ability to act on target sequences present in the human genome underscores the biotechnological potential of conjugative relaxase TrwC as a site-specific integrase for genomic modification of human cells. PMID:22292089
DNA viewed as an out-of-equilibrium structure

NASA Astrophysics Data System (ADS)

Provata, A.; Nicolis, C.; Nicolis, G.

2014-05-01

The complexity of the primary structure of human DNA is explored using methods from nonequilibrium statistical mechanics, dynamical systems theory, and information theory. A collection of statistical analyses is performed on the DNA data and the results are compared with sequences derived from different stochastic processes. The use of χ² tests shows that DNA cannot be described as a low-order Markov chain of order up to r = 6. Although detailed balance seems to hold at the level of a binary alphabet, it fails when all four base pairs are considered, suggesting spatial asymmetry and irreversibility. Furthermore, the block entropy does not increase linearly with the block size, reflecting the long-range nature of the correlations in the human genomic sequences. To probe locally the spatial structure of the chain, we study the exit distances from a specific symbol, the distribution of recurrence distances, and the Hurst exponent, all of which show power law tails and long-range characteristics. These results suggest that human DNA can be viewed as a nonequilibrium structure maintained in its state through interactions with a constantly changing environment. Based solely on the exit distance distribution accounting for the nonequilibrium statistics and using the Monte Carlo rejection sampling method, we construct a model DNA sequence. This method allows us to keep both long- and short-range statistical characteristics of the native DNA data. The model sequence presents the same characteristic exponents as the natural DNA but fails to capture spatial correlations and point-to-point details.
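
The block-entropy statement in this abstract can be illustrated with a short Python sketch. This is not the authors' code: the function name `block_entropy` and the choice of overlapping blocks are assumptions made here for illustration. For an independent, identically distributed sequence the entropy of length-n blocks grows linearly in n, so a departure from linear growth is one sign of correlations. Estimates from finite sequences become biased once the number of possible blocks approaches the sequence length.

```python
import math
from collections import Counter


def block_entropy(seq: str, n: int) -> float:
    """Shannon entropy, in bits, of the overlapping length-n blocks of seq.

    Hypothetical helper for illustration; the abstract does not say how
    blocks were counted (overlapping vs. non-overlapping).
    """
    if n <= 0 or n > len(seq):
        raise ValueError("block size must be between 1 and len(seq)")
    # Count every overlapping window of length n.
    counts = Counter(seq[i:i + n] for i in range(len(seq) - n + 1))
    total = sum(counts.values())
    # H_n = -sum p * log2(p) over the observed blocks.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    toy = "ACGTTGCAACGTACGTTTGACA"  # toy sequence, not real genomic data
    for n in range(1, 5):
        print(n, round(block_entropy(toy, n), 3))
```

Plotting `block_entropy(seq, n)` against `n` for a real sequence and for a shuffled copy of it is one simple way to see the kind of deviation the abstract describes.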

DNA viewed as an out-of-equilibrium structure.

PubMed

Provata, A; Nicolis, C; Nicolis, G

2014-05-01

The complexity of the primary structure of human DNA is explored using methods from nonequilibrium statistical mechanics, dynamical systems theory, and information theory. A collection of statistical analyses is performed on the DNA data and the results are compared with sequences derived from different stochastic processes. The use of χ² tests shows that DNA cannot be described as a low-order Markov chain of order up to r = 6. Although detailed balance seems to hold at the level of a binary alphabet, it fails when all four base pairs are considered, suggesting spatial asymmetry and irreversibility. Furthermore, the block entropy does not increase linearly with the block size, reflecting the long-range nature of the correlations in the human genomic sequences. To probe locally the spatial structure of the chain, we study the exit distances from a specific symbol, the distribution of recurrence distances, and the Hurst exponent, all of which show power law tails and long-range characteristics. These results suggest that human DNA can be viewed as a nonequilibrium structure maintained in its state through interactions with a constantly changing environment. Based solely on the exit distance distribution accounting for the nonequilibrium statistics and using the Monte Carlo rejection sampling method, we construct a model DNA sequence. This method allows us to keep both long- and short-range statistical characteristics of the native DNA data. The model sequence presents the same characteristic exponents as the natural DNA but fails to capture spatial correlations and point-to-point details.
Toward a mtDNA locus-specific mutation database using the LOVD platform.

PubMed

Elson, Joanna L; Sweeney, Mary G; Procaccio, Vincent; Yarham, John W; Salas, Antonio; Kong, Qing-Peng; van der Westhuizen, Francois H; Pitceathly, Robert D S; Thorburn, David R; Lott, Marie T; Wallace, Douglas C; Taylor, Robert W; McFarland, Robert

2012-09-01

The Human Variome Project (HVP) is a global effort to collect and curate all human genetic variation affecting health. Mutations of mitochondrial DNA (mtDNA) are an important cause of neurogenetic disease in humans; however, identification of the pathogenic mutations responsible can be problematic. In this article, we provide explanations as to why and suggest how such difficulties might be overcome. We put forward a case in support of a new Locus Specific Mutation Database (LSDB) implemented using the Leiden Open-source Variation Database (LOVD) system that will not only list primary mutations, but also present the evidence supporting their role in disease. Critically, we feel that this new database should have the capacity to store information on the observed phenotypes alongside the genetic variation, thereby facilitating our understanding of the complex and variable presentation of mtDNA disease. LOVD supports fast queries of both seen and hidden data and allows storage of sequence variants from high-throughput sequence analysis. The LOVD platform will allow construction of a secure mtDNA database; one that can fully utilize currently available data, as well as that being generated by high-throughput sequencing, to link genotype with phenotype enhancing our understanding of mitochondrial disease, with a view to providing better prognostic information. © 2012 Wiley Periodicals, Inc.
Toward a mtDNA Locus-Specific Mutation Database Using the LOVD Platform

PubMed Central

Elson, Joanna L.; Sweeney, Mary G.; Procaccio, Vincent; Yarham, John W.; Salas, Antonio; Kong, Qing-Peng; van der Westhuizen, Francois H.; Pitceathly, Robert D.S.; Thorburn, David R.; Lott, Marie T.; Wallace, Douglas C.; Taylor, Robert W.; McFarland, Robert

2015-01-01

The Human Variome Project (HVP) is a global effort to collect and curate all human genetic variation affecting health. Mutations of mitochondrial DNA (mtDNA) are an important cause of neurogenetic disease in humans; however, identification of the pathogenic mutations responsible can be problematic. In this article, we provide explanations as to why and suggest how such difficulties might be overcome. We put forward a case in support of a new Locus Specific Mutation Database (LSDB) implemented using the Leiden Open-source Variation Database (LOVD) system that will not only list primary mutations, but also present the evidence supporting their role in disease. Critically, we feel that this new database should have the capacity to store information on the observed phenotypes alongside the genetic variation, thereby facilitating our understanding of the complex and variable presentation of mtDNA disease. LOVD supports fast queries of both seen and hidden data and allows storage of sequence variants from high-throughput sequence analysis. The LOVD platform will allow construction of a secure mtDNA database; one that can fully utilize currently available data, as well as that being generated by high-throughput sequencing, to link genotype with phenotype enhancing our understanding of mitochondrial disease, with a view to providing better prognostic information. PMID:22581690
Detection of Low-Copy-Number Genomic DNA Sequences in Individual Bacterial Cells by Using Peptide Nucleic Acid-Assisted Rolling-Circle Amplification and Fluorescence In Situ Hybridization

PubMed Central

Smolina, Irina; Lee, Charles; Frank-Kamenetskii, Maxim

2007-01-01

An approach is proposed for in situ detection of short signature DNA sequences present in single copies per bacterial genome. The site is locally opened by peptide nucleic acids, and a circular oligonucleotide is assembled. The amplicon generated by rolling circle amplification is detected by hybridization with fluorescently labeled decorator probes. PMID:17293504
Influence of quasi-specific sites on kinetics of target DNA search by a sequence-specific DNA-binding protein.

PubMed

Kemme, Catherine A; Esadze, Alexandre; Iwahara, Junji

2015-11-10

Functions of transcription factors require formation of specific complexes at particular sites in cis-regulatory elements of genes. However, chromosomal DNA contains numerous sites that are similar to the target sequences recognized by transcription factors. The influence of such "quasi-specific" sites on functions of the transcription factors is not well understood at present by experimental means. In this work, using fluorescence methods, we have investigated the influence of quasi-specific DNA sites on the efficiency of target location by the zinc finger DNA-binding domain of the inducible transcription factor Egr-1, which recognizes a 9 bp sequence. By stopped-flow assays, we measured the kinetics of Egr-1's association with a target site on 143 bp DNA in the presence of various competitor DNAs, including nonspecific and quasi-specific sites. The presence of quasi-specific sites on competitor DNA significantly decelerated the target association by the Egr-1 protein. The impact of the quasi-specific sites depended strongly on their affinity, their concentration, and the degree of their binding to the protein. To quantitatively describe the kinetic impact of the quasi-specific sites, we derived an analytical form of the apparent kinetic rate constant for the target association and used it for fitting to the experimental data. Our kinetic data with calf thymus DNA as a competitor suggested that there are millions of high-affinity quasi-specific sites for Egr-1 among the 3 billion bp of genomic DNA. This study quantitatively demonstrates that naturally abundant quasi-specific sites on DNA can considerably impede the target search processes of sequence-specific DNA-binding proteins.
Influence of Quasi-Specific Sites on Kinetics of Target DNA Search by a Sequence-Specific DNA-Binding Protein

PubMed Central

2015-01-01

Functions of transcription factors require formation of specific complexes at particular sites in cis-regulatory elements of genes. However, chromosomal DNA contains numerous sites that are similar to the target sequences recognized by transcription factors. The influence of such “quasi-specific” sites on functions of the transcription factors is not well understood at present by experimental means. In this work, using fluorescence methods, we have investigated the influence of quasi-specific DNA sites on the efficiency of target location by the zinc finger DNA-binding domain of the inducible transcription factor Egr-1, which recognizes a 9 bp sequence. By stopped-flow assays, we measured the kinetics of Egr-1’s association with a target site on 143 bp DNA in the presence of various competitor DNAs, including nonspecific and quasi-specific sites. The presence of quasi-specific sites on competitor DNA significantly decelerated the target association by the Egr-1 protein. The impact of the quasi-specific sites depended strongly on their affinity, their concentration, and the degree of their binding to the protein. To quantitatively describe the kinetic impact of the quasi-specific sites, we derived an analytical form of the apparent kinetic rate constant for the target association and used it for fitting to the experimental data. Our kinetic data with calf thymus DNA as a competitor suggested that there are millions of high-affinity quasi-specific sites for Egr-1 among the 3 billion bp of genomic DNA. This study quantitatively demonstrates that naturally abundant quasi-specific sites on DNA can considerably impede the target search processes of sequence-specific DNA-binding proteins. PMID:26502071
Array-based detection of genetic alterations associated with disease

DOEpatents

Pinkel, Daniel; Albertson, Donna G.; Gray, Joe W.

2017-09-05

The present invention relates to DNA sequences from regions of copy number change on chromosome 20. The sequences can be used in hybridization methods for the identification of chromosomal abnormalities associated with various diseases.
Array-based detection of genetic alterations associated with disease

DOEpatents

Pinkel, Daniel; Albertson, Donna G.; Gray, Joe W.

2007-09-11

The present invention relates to DNA sequences from regions of copy number change on chromosome 20. The sequences can be used in hybridization methods for the identification of chromosomal abnormalities associated with various diseases.
Brownian dynamics simulations of sequence-dependent duplex denaturation in dynamically superhelical DNA

NASA Astrophysics Data System (ADS)

Mielke, Steven P.; Grønbech-Jensen, Niels; Krishnan, V. V.; Fink, William H.; Benham, Craig J.

2005-09-01

The topological state of DNA in vivo is dynamically regulated by a number of processes that involve interactions with bound proteins. In one such process, the tracking of RNA polymerase along the double helix during transcription, restriction of rotational motion of the polymerase and associated structures, generates waves of overtwist downstream and undertwist upstream from the site of transcription. The resulting superhelical stress is often sufficient to drive double-stranded DNA into a denatured state at locations such as promoters and origins of replication, where sequence-specific duplex opening is a prerequisite for biological function. In this way, transcription and other events that actively supercoil the DNA provide a mechanism for dynamically coupling genetic activity with regulatory and other cellular processes. Although computer modeling has provided insight into the equilibrium dynamics of DNA supercoiling, to date no model has appeared for simulating sequence-dependent DNA strand separation under the nonequilibrium conditions imposed by the dynamic introduction of torsional stress. Here, we introduce such a model and present results from an initial set of computer simulations in which the sequences of dynamically superhelical, 147 base pair DNA circles were systematically altered in order to probe the accuracy with which the model can predict location, extent, and time of stress-induced duplex denaturation. The results agree both with well-tested statistical mechanical calculations and with available experimental information. Additionally, we find that sites susceptible to denaturation show a propensity for localizing to supercoil apices, suggesting that base sequence determines locations of strand separation not only through the energetics of interstrand interactions, but also by influencing the geometry of supercoiling.
Touch imprint cytology with massively parallel sequencing (TIC-seq): a simple and rapid method to snapshot genetic alterations in tumors.

PubMed

Amemiya, Kenji; Hirotsu, Yosuke; Goto, Taichiro; Nakagomi, Hiroshi; Mochizuki, Hitoshi; Oyama, Toshio; Omata, Masao

2016-12-01

Identifying genetic alterations in tumors is critical for molecular targeting of therapy. In the clinical setting, formalin-fixed paraffin-embedded (FFPE) tissue is usually employed for genetic analysis. However, DNA extracted from FFPE tissue is often not suitable for analysis because of its low levels and poor quality. Additionally, FFPE sample preparation is time-consuming. To provide early treatment for cancer patients, a more rapid and robust method is required for precision medicine. We present a simple method for genetic analysis, called touch imprint cytology combined with massively parallel sequencing (touch imprint cytology [TIC]-seq), to detect somatic mutations in tumors. We prepared FFPE tissues and TIC specimens from tumors in nine lung cancer patients and one patient with breast cancer. We found that the quality and quantity of TIC DNA was higher than that of FFPE DNA, which requires microdissection to enrich DNA from target tissues. Targeted sequencing using a next-generation sequencer obtained sufficient sequence data using TIC DNA. Most (92%) somatic mutations in lung primary tumors were found to be consistent between TIC and FFPE DNA. We also applied TIC DNA to primary and metastatic tumor tissues to analyze tumor heterogeneity in a breast cancer patient, and showed that common and distinct mutations among primary and metastatic sites could be classified into two distinct histological subtypes. TIC-seq is an alternative and feasible method to analyze genomic alterations in tumors by simply touching the cut surface of specimens to slides. © 2016 The Authors. Cancer Medicine published by John Wiley & Sons Ltd.
Brownian dynamics simulations of sequence-dependent duplex denaturation in dynamically superhelical DNA.

PubMed

Mielke, Steven P; Grønbech-Jensen, Niels; Krishnan, V V; Fink, William H; Benham, Craig J

2005-09-22

The topological state of DNA in vivo is dynamically regulated by a number of processes that involve interactions with bound proteins. In one such process, the tracking of RNA polymerase along the double helix during transcription, restriction of rotational motion of the polymerase and associated structures, generates waves of overtwist downstream and undertwist upstream from the site of transcription. The resulting superhelical stress is often sufficient to drive double-stranded DNA into a denatured state at locations such as promoters and origins of replication, where sequence-specific duplex opening is a prerequisite for biological function. In this way, transcription and other events that actively supercoil the DNA provide a mechanism for dynamically coupling genetic activity with regulatory and other cellular processes. Although computer modeling has provided insight into the equilibrium dynamics of DNA supercoiling, to date no model has appeared for simulating sequence-dependent DNA strand separation under the nonequilibrium conditions imposed by the dynamic introduction of torsional stress. Here, we introduce such a model and present results from an initial set of computer simulations in which the sequences of dynamically superhelical, 147 base pair DNA circles were systematically altered in order to probe the accuracy with which the model can predict location, extent, and time of stress-induced duplex denaturation. The results agree both with well-tested statistical mechanical calculations and with available experimental information. Additionally, we find that sites susceptible to denaturation show a propensity for localizing to supercoil apices, suggesting that base sequence determines locations of strand separation not only through the energetics of interstrand interactions, but also by influencing the geometry of supercoiling.
Detection and quantitation of single nucleotide polymorphisms, DNA sequence variations, DNA mutations, DNA damage and DNA mismatches

DOEpatents

McCutchen-Maloney, Sandra L.

2002-01-01

DNA mutation binding proteins alone and as chimeric proteins with nucleases are used with solid supports to detect DNA sequence variations, DNA mutations and single nucleotide polymorphisms. The solid supports may be flow cytometry beads, DNA chips, glass slides or DNA dipsticks. DNA molecules are coupled to solid supports to form DNA-support complexes. Labeled DNA is used with unlabeled DNA mutation binding proteins such as TthMutS to detect DNA sequence variations, DNA mutations and single nucleotide length polymorphisms by binding, which gives an increase in signal. Unlabeled DNA is utilized with labeled chimeras to detect DNA sequence variations, DNA mutations and single nucleotide length polymorphisms by nuclease activity of the chimera, which gives a decrease in signal.
High-Resolution Whole-Genome Sequencing Reveals That Specific Chromatin Domains from Most Human Chromosomes Associate with Nucleoli

PubMed Central

van Koningsbruggen, Silvana; Gierliński, Marek; Schofield, Pietá; Martin, David; Barton, Geoffey J.; Ariyurek, Yavuz; den Dunnen, Johan T.

2010-01-01

The nuclear space is mostly occupied by chromosome territories and nuclear bodies. Although this organization of chromosomes affects gene function, relatively little is known about the role of nuclear bodies in the organization of chromosomal regions. The nucleolus is the best-studied subnuclear structure and forms around the rRNA repeat gene clusters on the acrocentric chromosomes. In addition to rDNA, other chromatin sequences also surround the nucleolar surface and may even loop into the nucleolus. These additional nucleolar-associated domains (NADs) have not been well characterized. We present here a whole-genome, high-resolution analysis of chromatin endogenously associated with nucleoli. We have used a combination of three complementary approaches, namely fluorescence comparative genome hybridization, high-throughput deep DNA sequencing and photoactivation combined with time-lapse fluorescence microscopy. The data show that specific sequences from most human chromosomes, in addition to the rDNA repeat units, associate with nucleoli in a reproducible and heritable manner. NADs have in common a high density of AT-rich sequence elements, low gene density and a statistically significant enrichment in transcriptionally repressed genes. Unexpectedly, both the direct DNA sequencing and fluorescence photoactivation data show that certain chromatin loci can specifically associate with either the nucleolus, or the nuclear envelope. PMID:20826608
High-resolution whole-genome sequencing reveals that specific chromatin domains from most human chromosomes associate with nucleoli.

PubMed

van Koningsbruggen, Silvana; Gierlinski, Marek; Schofield, Pietá; Martin, David; Barton, Geoffey J; Ariyurek, Yavuz; den Dunnen, Johan T; Lamond, Angus I

2010-11-01

The nuclear space is mostly occupied by chromosome territories and nuclear bodies. Although this organization of chromosomes affects gene function, relatively little is known about the role of nuclear bodies in the organization of chromosomal regions. The nucleolus is the best-studied subnuclear structure and forms around the rRNA repeat gene clusters on the acrocentric chromosomes. In addition to rDNA, other chromatin sequences also surround the nucleolar surface and may even loop into the nucleolus. These additional nucleolar-associated domains (NADs) have not been well characterized. We present here a whole-genome, high-resolution analysis of chromatin endogenously associated with nucleoli. We have used a combination of three complementary approaches, namely fluorescence comparative genome hybridization, high-throughput deep DNA sequencing and photoactivation combined with time-lapse fluorescence microscopy. The data show that specific sequences from most human chromosomes, in addition to the rDNA repeat units, associate with nucleoli in a reproducible and heritable manner. NADs have in common a high density of AT-rich sequence elements, low gene density and a statistically significant enrichment in transcriptionally repressed genes. Unexpectedly, both the direct DNA sequencing and fluorescence photoactivation data show that certain chromatin loci can specifically associate with either the nucleolus, or the nuclear envelope.
DYZ1 arrays show sequence variation between the monozygotic males

PubMed Central

2014-01-01

Background Monozygotic twins (MZT) are an important resource for genetic studies in the context of normal and diseased genomes. In the present study we used DYZ1, a satellite fraction present in the form of tandem arrays on the long arm of the human Y chromosome, as a tool to uncover sequence variations between the monozygotic males. Results We detected copy number variation and frequent insertions and deletions within the sequences of DYZ1 arrays among all three sets of twins used in the present study. MZT1b showed loss of 35 bp compared to that in 1a, whereas 2a showed loss of 31 bp compared to that in 2b. Similarly, 3b showed a 10 bp insertion compared to that in 3a. MZT1a germline DNA showed loss of 5 bp and 1b blood DNA showed loss of 26 bp compared to that of 1a blood and 1b germline DNA, respectively. Of the 69 restriction sites detected in DYZ1 arrays, MboII, BsrI, TspEI and TaqI enzymes showed frequent loss and/or gain among all 3 pairs studied. The MZT1 pair showed loss/gain of VspI, BsrDI, AgsI, PleI, TspDTI, TspEI, TfiI and TaqI restriction sites in both blood and germline DNA. All three sets of MZT showed differences in the number of DYZ1 copies. FISH signals reflected somatic mosaicism of the DYZ1 copies across the cells. Conclusions DYZ1 showed both sequence and copy number variation between the MZT males. Sequence variation was also observed between germline and blood DNA samples of the same individual in at least one set of samples. The results suggest that DYZ1 faithfully records all the genetic changes occurring after twinning, which may be ascribed to environmental factors. PMID:24495361
Methylation patterns of repetitive DNA sequences in germ cells of Mus musculus.

PubMed

Sanford, J; Forrester, L; Chapman, V; Chandley, A; Hastie, N

1984-03-26

The major and the minor satellite sequences of Mus musculus were undermethylated in both sperm and oocyte DNAs relative to the amount of undermethylation observed in adult somatic tissue DNA. This hypomethylation was specific for satellite sequences in sperm DNA. Dispersed repetitive and low copy sequences show a high degree of methylation in sperm DNA; however, a dispersed repetitive sequence was undermethylated in oocyte DNA. This finding suggests a difference in the amount of total genomic DNA methylation between sperm and oocyte DNA. The methylation levels of the minor satellite sequences did not change during spermiogenesis, and were not associated with the onset of meiosis or a specific stage in sperm development.
Denisovans, Melanesians, Europeans, and Neandertals: The Confusion of DNA Assumptions and the Biological Species Concept.

PubMed

Caldararo, Niccolo

2016-08-01

A number of recent articles have appeared on the Denisova fossil remains and attempts to produce DNA sequences from them. One of these recently appeared in Science by Vernot et al. (Science 352:235-239, 2016). We would like to advance an alternative interpretation of the data presented. One concerns the problem of contamination/degradation of the determined DNA sequenced. Just as the publication of the first Neandertal sequence included an interpretation that argued that Neandertals had not contributed any genes to modern humans, the Denisovan interpretation has considerable influence on ideas regarding human evolution. The new papers, however, confuse established ideas concerning the nature of species, as well as the use of terms like premodern, Archaic Homo, and Homo heidelbergensis. Examination of these problems presents a solution by means of reinterpreting the results. Given the claims for gene transfer among a number of Mid Pleistocene hominids, it may be time to reexamine the idea of anagenesis in hominid evolution.
A New Challenge for Compression Algorithms: Genetic Sequences.

ERIC Educational Resources Information Center

Grumbach, Stephane; Tahi, Fariza

1994-01-01

Analyzes the properties of genetic sequences that cause the failure of classical algorithms used for data compression. A lossless algorithm, which compresses the information contained in DNA and RNA sequences by detecting regularities such as palindromes, is presented. This algorithm combines substitutional and statistical methods and appears to…
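
As an illustration of the kind of regularity mentioned in this abstract, the Python sketch below finds reverse-complement palindromes in a DNA string. It assumes that "palindrome" is meant in the molecular-biology sense (a stretch that equals its own reverse complement); the abstract does not spell this out, and the function names and the `min_len` parameter are hypothetical. It only detects the pattern and is not the compression algorithm described in the record.

```python
# Watson-Crick complement for the four DNA bases.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an ACGT string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def find_palindromes(seq: str, min_len: int = 4) -> list[tuple[int, int]]:
    """Return (start, length) for each maximal reverse-complement palindrome.

    Such palindromes always have even length, so it is enough to expand
    outward from every gap between two adjacent bases.
    """
    hits = []
    for center in range(1, len(seq)):
        left, right = center - 1, center
        # Extend while the outer bases pair with each other.
        while (left >= 0 and right < len(seq)
               and seq[left] == COMPLEMENT.get(seq[right])):
            left -= 1
            right += 1
        length = right - left - 1
        if length >= min_len:
            hits.append((left + 1, length))
    return hits


if __name__ == "__main__":
    # GAATTC (the EcoRI recognition site) equals its own reverse complement.
    print(find_palindromes("CCGAATTCAA"))
```

A compressor could, in principle, replace the second half of such a region with a pointer to the first half; whether and how the published algorithm does this is not stated in the abstract.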
Genes from the 20Q13 amplicon and their uses

DOEpatents

Gray, Joe; Collins, Colin; Hwang, Soo-in; Godfrey, Tony; Kowbel, David; Rommens, Johanna

1999-01-01

The present invention relates to cDNA sequences from a region of amplification on chromosome 20 associated with disease. The sequences can be used in hybridization methods for the identification of chromosomal abnormalities associated with various diseases. The sequences can also be used for treatment of diseases.
Process of labeling specific chromosomes using recombinant repetitive DNA

DOEpatents

Moyzis, R.K.; Meyne, J.

1988-02-12

Chromosome preferential nucleotide sequences are first determined from a library of recombinant DNA clones having families of repetitive sequences. Library clones are identified with a low homology with a sequence of repetitive DNA families to which the first clones respectively belong and variant sequences are then identified by selecting clones having a pattern of hybridization with genomic DNA dissimilar to the hybridization pattern shown by the respective families. In another embodiment, variant sequences are selected from a sequence of a known repetitive DNA family. The selected variant sequence is classified as chromosome specific, chromosome preferential, or chromosome nonspecific. Sequences which are classified as chromosome preferential are further sequenced and regions are identified having a low homology with other regions of the chromosome preferential sequence or with known sequences of other family members and consensus sequences of the repetitive DNA families for the chromosome preferential sequences. The selected low homology regions are then hybridized with chromosomes to determine those low homology regions hybridized with a specific chromosome under normal stringency conditions.

mtDNA-Server: next-generation sequencing data analysis of human mitochondrial DNA in the cloud.

PubMed

Weissensteiner, Hansi; Forer, Lukas; Fuchsberger, Christian; Schöpf, Bernd; Kloss-Brandstätter, Anita; Specht, Günther; Kronenberg, Florian; Schönherr, Sebastian

2016-07-08

Next generation sequencing (NGS) allows investigating mitochondrial DNA (mtDNA) characteristics such as heteroplasmy (i.e. intra-individual sequence variation) to a higher level of detail. While several pipelines for analyzing heteroplasmies exist, issues in usability, accuracy of results and interpreting final data limit their usage. Here we present mtDNA-Server, a scalable web server for the analysis of mtDNA studies of any size with a special focus on usability as well as reliable identification and quantification of heteroplasmic variants. The mtDNA-Server workflow includes parallel read alignment, heteroplasmy detection, artefact or contamination identification, variant annotation as well as several quality control metrics, often neglected in current mtDNA NGS studies. All computational steps are parallelized with Hadoop MapReduce and executed graphically with Cloudgene. We validated the underlying heteroplasmy and contamination detection model by generating four artificial sample mix-ups on two different NGS devices. Our evaluation data shows that mtDNA-Server detects heteroplasmies and artificial recombinations down to the 1% level with perfect specificity and outperforms existing approaches regarding sensitivity. mtDNA-Server is currently able to analyze the 1000G Phase 3 data (n = 2,504) in less than 5 h and is freely accessible at https://mtdna-server.uibk.ac.at. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
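
To make the "1% level" in this abstract concrete, here is a minimal Python sketch that computes the minor allele fraction at one position from per-base read counts and compares it with a threshold. It is a simplification written for illustration, not mtDNA-Server's detection model, which the abstract does not describe in detail; the function names, the input format and the default threshold are all assumptions.

```python
from typing import Mapping


def minor_allele_fraction(base_counts: Mapping[str, int]) -> float:
    """Fraction of reads supporting the second most common base at a site."""
    total = sum(base_counts.values())
    if total == 0:
        return 0.0
    ranked = sorted(base_counts.values(), reverse=True)
    second = ranked[1] if len(ranked) > 1 else 0
    return second / total


def is_heteroplasmic(base_counts: Mapping[str, int],
                     threshold: float = 0.01) -> bool:
    """Flag a site whose minor allele fraction reaches the threshold.

    Hypothetical rule: a real pipeline would also account for sequencing
    error, strand bias and coverage before calling a variant.
    """
    return minor_allele_fraction(base_counts) >= threshold


if __name__ == "__main__":
    site = {"A": 990, "G": 10}  # made-up read counts for one position
    print(minor_allele_fraction(site), is_heteroplasmic(site))
```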
Highly-sensitive microRNA detection based on bio-bar-code assay and catalytic hairpin assembly two-stage amplification.

PubMed

Tang, Songsong; Gu, Yuan; Lu, Huiting; Dong, Haifeng; Zhang, Kai; Dai, Wenhao; Meng, Xiangdan; Yang, Fan; Zhang, Xueji

2018-04-03

Herein, a highly-sensitive microRNA (miRNA) detection strategy was developed by combining bio-bar-code assay (BBA) with catalytic hairpin assembly (CHA). In the proposed system, two nanoprobes of magnetic nanoparticles functionalized with DNA probes (MNPs-DNA) and gold nanoparticles with numerous barcode DNA (AuNPs-DNA) were designed. In the presence of target miRNA, the MNP-DNA and AuNP-DNA hybridized with target miRNA to form a "sandwich" structure. After "sandwich" structures were separated from the solution by the magnetic field and dehybridized by high temperature, the barcode DNA sequences were released by dissolving AuNPs. The released barcode DNA sequences triggered the toehold strand displacement assembly of two hairpin probes, leading to recycling of the barcode DNA sequences and producing numerous fluorescent CHA products for miRNA detection. Under the optimal experimental conditions, the proposed two-stage amplification system could sensitively detect target miRNA ranging from 10 pM to 10 aM with a limit of detection (LOD) down to 97.9 zM. It displayed good capability to discriminate single-base and three-base mismatches due to the unique sandwich structure. Notably, it presented good feasibility for selective multiplexed detection of various combinations of synthetic miRNA sequences and miRNAs extracted from different cell lysates, which were in agreement with the traditional polymerase chain reaction analysis. The two-stage amplification strategy may have significant implications for biological detection and clinical diagnosis. Copyright © 2017 Elsevier B.V. All rights reserved.
Sequence-based prediction of protein-binding sites in DNA: comparative study of two SVM models.

PubMed

Park, Byungkyu; Im, Jinyong; Tuvshinjargal, Narankhuu; Lee, Wook; Han, Kyungsook

2014-11-01

As many structures of protein-DNA complexes have become known in recent years, several computational methods have been developed to predict DNA-binding sites in proteins. However, the inverse problem (i.e., predicting protein-binding sites in DNA) has received much less attention. One of the reasons is that the differences between the interaction propensities of nucleotides are much smaller than those between amino acids. Another reason is that DNA exhibits less diverse sequence patterns than protein. Therefore, predicting protein-binding DNA nucleotides is much harder than predicting DNA-binding amino acids. We computed the interaction propensity (IP) of nucleotide triplets with amino acids using an extensive dataset of protein-DNA complexes, and developed two support vector machine (SVM) models that predict protein-binding nucleotides from sequence data alone. One SVM model predicts protein-binding nucleotides using DNA sequence data alone, and the other SVM model predicts protein-binding nucleotides using both DNA and protein sequences. In a 10-fold cross-validation with 1519 DNA sequences, the SVM model that uses DNA sequence data only predicted protein-binding nucleotides with an accuracy of 67.0%, an F-measure of 67.1%, and a Matthews correlation coefficient (MCC) of 0.340. With an independent dataset of 181 DNAs that were not used in training, it achieved an accuracy of 66.2%, an F-measure of 66.3% and an MCC of 0.324. Another SVM model that uses both DNA and protein sequences achieved an accuracy of 69.6%, an F-measure of 69.6%, and an MCC of 0.383 in a 10-fold cross-validation with 1519 DNA sequences and 859 protein sequences. With an independent dataset of 181 DNAs and 143 proteins, it showed an accuracy of 67.3%, an F-measure of 66.5% and an MCC of 0.329. Both in cross-validation and independent testing, the second SVM model that used both DNA and protein sequence data showed better performance than the first model that used DNA sequence data. To the best of our knowledge, this is the first attempt to predict protein-binding nucleotides in a given DNA sequence from the sequence data alone. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
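The three figures of merit quoted in the abstract above (accuracy, F-measure and MCC) can all be derived from a binary confusion matrix. The sketch below uses the standard textbook definitions; the abstract does not say exactly how the authors computed their F-measure (for instance, whether it was averaged over classes), so this should be read as an illustration and not as their evaluation code. The function name and the example counts are hypothetical.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return (accuracy, f_measure, mcc) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F-measure: harmonic mean of precision and recall.
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    # Matthews correlation coefficient; defined as 0 when a margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, f_measure, mcc


# Hypothetical counts of binding / non-binding nucleotides.
acc, f1, mcc = binary_metrics(tp=50, fp=10, tn=30, fn=10)
print(f"accuracy={acc:.3f} F-measure={f1:.3f} MCC={mcc:.3f}")
```

MCC is informative here because it stays near zero for a classifier that merely guesses the majority class, whereas accuracy alone can look acceptable on imbalanced data.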
Rapid discrimination of sequences flanking and within T-DNA insertions in the Arabidopsis genome.

PubMed

Ponce, M R; Quesada, V; Micol, J L

1998-05-01

An improvement to previous methods for recovering Arabidopsis thaliana genomic DNA flanking T-DNA insertions is presented that allows for the avoidance of some of the cloning difficulties caused by the concatameric nature of T-DNA inserts. The principle of the procedure is to categorize by size restriction fragments of mutant DNA, produced in separate digestions with NdeI and Bst1107I. Given that the sites for these two enzymes are contiguous within the pGV3850:1003 T-DNA construct, the restriction fragments obtained fall into two categories: those showing identical size in both digestions, which correspond to sequences internal to T-DNA concatamers; and those of different sizes, that contain the junctions between plant DNA and the T-DNA insert. Such a criterion makes it possible to easily distinguish the digestion products corresponding to internal T-DNA parts, which do not deserve further attention, and those which presumably include a segment of the locus of interest. Discrimination between restriction fragments of genomic mutant DNA can be made on rescued plasmids, inverse PCR amplification products or bands in a genomic blot.
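The size-comparison criterion described above can be sketched in a few lines: fragments whose size is the same in the NdeI and Bst1107I digestions are treated as internal to T-DNA concatamers, and the remainder as candidate junction fragments. This is only an illustration of the sorting logic; the function name, the tolerance parameter and the fragment sizes are hypothetical, and real gel-derived sizes would need a non-zero tolerance.

```python
def classify_fragments(ndei_sizes, bst1107i_sizes, tolerance=0):
    """Split fragment sizes (bp) into internal and candidate junction fragments."""
    internal, junction = [], []
    unmatched = list(bst1107i_sizes)
    for size in ndei_sizes:
        # Look for a fragment of the same size in the other digestion.
        match = next((s for s in unmatched if abs(s - size) <= tolerance), None)
        if match is None:
            junction.append(("NdeI", size))
        else:
            unmatched.remove(match)
            internal.append(size)
    # Whatever is left from the second digestion has no same-size partner.
    junction.extend(("Bst1107I", s) for s in unmatched)
    return internal, junction


internal, junction = classify_fragments([4200, 6100, 2500], [4200, 3300, 2500])
print("internal:", internal)
print("junction candidates:", junction)
```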
Mitochondrial DNA (mtDNA) variants in the European haplogroups HV, JT, and U do not have a major role in schizophrenia.

PubMed

Torrell, Helena; Salas, Antonio; Abasolo, Nerea; Morén, Constanza; Garrabou, Glòria; Valero, Joaquín; Alonso, Yolanda; Vilella, Elisabet; Costas, Javier; Martorell, Lourdes

2014-10-01

It has been reported that certain genetic factors involved in schizophrenia could be located in the mitochondrial DNA (mtDNA). Therefore, we hypothesized that mtDNA mutations and/or variants would be present in schizophrenia patients and may be related to schizophrenia characteristics and mitochondrial function. This study was performed in three steps: (1) identification of pathogenic mutations and variants in 14 schizophrenia patients with an apparent maternal inheritance of the disease by sequencing the entire mtDNA; (2) case-control association study of 23 variants identified in step 1 (16 missense, 3 rRNA, and 4 tRNA variants) in 495 patients and 615 controls, and (3) analyses of the associated variants according to the clinical, psychopathological, and neuropsychological characteristics and according to the oxidative and enzymatic activities of the mitochondrial respiratory chain. We did not identify pathogenic mtDNA mutations in the 14 sequenced patients. Two known variants were nominally associated with schizophrenia and were further studied. The MT-RNR2 1811A > G variant likely does not play a major role in schizophrenia, as it was not associated with clinical, psychopathological, or neuropsychological variables, and the MT-ATP6 9110T > C p.Ile195Thr variant did not result in differences in the oxidative and enzymatic functions of the mitochondrial respiratory chain. The patients with apparent maternal inheritance of schizophrenia did not exhibit any mutations in their mtDNA. The variants nominally associated with schizophrenia in the present study were not related either to phenotypic characteristics or to mitochondrial function. We did not find evidence pointing to a role for mtDNA sequence variation in schizophrenia. © 2014 Wiley Periodicals, Inc.
Enlightenment of Yeast Mitochondrial Homoplasmy: Diversified Roles of Gene Conversion

PubMed Central

Ling, Feng; Mikawa, Tsutomu; Shibata, Takehiko

2011-01-01

Mitochondria have their own genomic DNA. Unlike the nuclear genome, each cell contains hundreds to thousands of copies of mitochondrial DNA (mtDNA). The copies of mtDNA tend to have heterogeneous sequences, due to the high frequency of mutagenesis, but are quickly homogenized within a cell (“homoplasmy”) during vegetative cell growth or through a few sexual generations. Heteroplasmy is strongly associated with mitochondrial diseases, diabetes and aging. Recent studies revealed that the yeast cell has the machinery to homogenize mtDNA, using a common DNA processing pathway with gene conversion; i.e., both genetic events are initiated by a double-stranded break, which is processed into 3′ single-stranded tails. One of the tails is base-paired with the complementary sequence of the recipient double-stranded DNA to form a D-loop (homologous pairing), in which repair DNA synthesis is initiated to restore the sequence lost by the breakage. Gene conversion generates sequence diversity, depending on the divergence between the donor and recipient sequences, especially when it occurs among a number of copies of a DNA sequence family with some sequence variations, such as in immunoglobulin diversification in chicken. MtDNA can be regarded as a sequence family, in which the members tend to be diversified by a high frequency of spontaneous mutagenesis. Thus, it would be interesting to determine why and how double-stranded breakage and D-loop formation induce sequence homogenization in mitochondria and sequence diversification in nuclear DNA. We will review the mechanisms and roles of mtDNA homoplasmy, in contrast to nuclear gene conversion, which diversifies gene and genome sequences, to provide clues toward understanding how the common DNA processing pathway results in such divergent outcomes. PMID:24710143
Detecting very low allele fraction variants using targeted DNA sequencing and a novel molecular barcode-aware variant caller.

PubMed

Xu, Chang; Nezami Ranjbar, Mohammad R; Wu, Zhong; DiCarlo, John; Wang, Yexun

2017-01-03

Detection of DNA mutations at very low allele fractions with high accuracy will significantly improve the effectiveness of precision medicine for cancer patients. To achieve this goal through next generation sequencing, researchers need a detection method that 1) captures rare mutation-containing DNA fragments efficiently in the mix of abundant wild-type DNA; 2) sequences the DNA library extensively to deep coverage; and 3) distinguishes low level true variants from amplification and sequencing errors with high accuracy. Targeted enrichment using PCR primers provides researchers with a convenient way to achieve deep sequencing for a small, yet most relevant region using benchtop sequencers. Molecular barcoding (or indexing) provides a unique solution for reducing sequencing artifacts analytically. Although different molecular barcoding schemes have been reported in recent literature, most variant calling has been done on limited targets, using simple custom scripts. The analytical performance of barcode-aware variant calling can be significantly improved by incorporating advanced statistical models. We present here a highly efficient, simple and scalable enrichment protocol that integrates molecular barcodes in multiplex PCR amplification. In addition, we developed smCounter, an open source, generic, barcode-aware variant caller based on a Bayesian probabilistic model. smCounter was optimized and benchmarked on two independent read sets with SNVs and indels at 5 and 1% allele fractions. Variants were called with very good sensitivity and specificity within coding regions. We demonstrated that we can accurately detect somatic mutations with allele fractions as low as 1% in coding regions using our enrichment protocol and variant caller.
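To make the idea of barcode-aware counting concrete, the sketch below collapses reads that share a molecular barcode into one consensus base and then computes the allele fraction over barcodes instead of over raw reads. It is a deliberately simple majority-vote scheme and is not smCounter's Bayesian model, which the abstract does not specify; the function name, the agreement threshold and the example observations are hypothetical.

```python
from collections import Counter, defaultdict


def barcode_allele_fraction(observations, variant_base, min_agreement=0.8):
    """Estimate the variant allele fraction at one site from (barcode, base) pairs."""
    by_barcode = defaultdict(list)
    for barcode, base in observations:
        by_barcode[barcode].append(base)

    consensus = []
    for bases in by_barcode.values():
        base, count = Counter(bases).most_common(1)[0]
        # Keep a barcode only if its reads agree well enough; disagreement
        # within one barcode suggests an amplification or sequencing error.
        if count / len(bases) >= min_agreement:
            consensus.append(base)

    if not consensus:
        return 0.0
    return consensus.count(variant_base) / len(consensus)


reads = [
    ("bc1", "A"), ("bc1", "A"), ("bc1", "A"),
    ("bc2", "T"), ("bc2", "T"),
    ("bc3", "A"), ("bc3", "A"), ("bc3", "T"),  # discordant barcode, dropped
    ("bc4", "A"),
]
print(barcode_allele_fraction(reads, "T"))
```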
Complete mitochondrial genome sequence of the common bean anthracnose pathogen Colletotrichum lindemuthianum.

PubMed

Gutiérrez, Pablo; Alzate, Juan; Yepes, Mauricio Salazar; Marín, Mauricio

2016-01-01

Colletotrichum lindemuthianum is the causal agent of anthracnose in common bean (Phaseolus vulgaris), one of the most limiting factors for this crop in South and Central America. In this work, the mitochondrial sequence of a Colombian isolate of C. lindemuthianum obtained from a common bean plant (var. Cargamanto) with anthracnose symptoms is presented. The mtDNA codes for 13 proteins of the respiratory chain, 1 ribosomal protein, 2 homing endonucleases, 2 ribosomal RNAs and 28 tRNAs. This is the first report of a complete mtDNA genome sequence from C. lindemuthianum.
Base pairing among three cis-acting sequences contributes to template switching during hepadnavirus reverse transcription

PubMed Central

Liu, Ning; Tian, Ru; Loeb, Daniel D.

2003-01-01

Synthesis of the relaxed-circular (RC) DNA genome of hepadnaviruses requires two template switches during plus-strand DNA synthesis: primer translocation and circularization. Although primer translocation and circularization use different donor and acceptor sequences, and are distinct temporally, they share the common theme of switching from one end of the minus-strand template to the other end. Studies of duck hepatitis B virus have indicated that, in addition to the donor and acceptor sequences, three other cis-acting sequences, named 3E, M, and 5E, are required for the synthesis of RC DNA by contributing to primer translocation and circularization. The mechanism by which 3E, M, and 5E act was not known. We present evidence that these sequences function by base pairing with each other within the minus-strand template. 3E base-pairs with one portion of M (M3) and 5E base-pairs with an adjacent portion of M (M5). We found that disrupting base pairing between 3E and M3 and between 5E and M5 inhibited primer translocation and circularization. More importantly, restoring base pairing with mutant sequences restored the production of RC DNA. These results are consistent with the model that, within duck hepatitis B virus capsids, the ends of the minus-strand template are juxtaposed via base pairing to facilitate the two template switches during plus-strand DNA synthesis. PMID:12578983
Hybridization chain reaction-based instantaneous derivatization technology for chemiluminescence detection of specific DNA sequences.

PubMed

Wang, Xin; Lau, Choiwan; Kai, Masaaki; Lu, Jianzhong

2013-05-07

We propose here a new amplifying strategy that uses hybridization chain reaction (HCR) to detect specific sequences of DNA, where stable DNA monomers assemble on the magnetic beads only upon exposure to a target DNA. Briefly, in the HCR process, two complementary stable species of hairpins coexist in solution until the introduction of initiator reporter strands triggers a cascade of hybridization events that yield nicked double helices analogous to alternating copolymers. Moreover, a "sandwich-type" detection strategy is employed in our design. Magnetic beads, which are functionalized with capture DNA, are reacted with the target, and sandwiched with the above nicked double helices. Then, chemiluminescence (CL) detection proceeds via an instantaneous derivatization reaction between a specific CL reagent, 3,4,5-trimethoxylphenylglyoxal (TMPG), and the guanine nucleotides within the target DNA, reporter strands and DNA monomers for the generation of light. Our results clearly show that the amplification detection of specific sequences of DNA achieves a better performance (e.g. wide linear response range, low detection limit, and high specificity) as compared to the traditional sandwich type (capture/target/reporter) assays. Upon modification, the approach presented could be extended to detect other types of targets. We believe that this simple technique is promising for improving medical diagnosis and treatment.
Structural insight into the specificity of the B3 DNA-binding domains provided by the co-crystal structure of the C-terminal fragment of BfiI restriction enzyme

PubMed Central

Golovenko, Dmitrij; Manakova, Elena; Zakrys, Linas; Zaremba, Mindaugas; Sasnauskas, Giedrius; Gražulis, Saulius; Siksnys, Virginijus

2014-01-01

The B3 DNA-binding domains (DBDs) of plant transcription factors (TF) and DBDs of EcoRII and BfiI restriction endonucleases (EcoRII-N and BfiI-C) share a common structural fold, classified as the DNA-binding pseudobarrel. The B3 DBDs in the plant TFs recognize a diverse set of target sequences. The only available co-crystal structure of the B3-like DBD is that of EcoRII-N (recognition sequence 5′-CCTGG-3′). In order to understand the structural and molecular mechanisms of specificity of B3 DBDs, we have solved the crystal structure of BfiI-C (recognition sequence 5′-ACTGGG-3′) complexed with 12-bp cognate oligoduplex. Structural comparison of BfiI-C–DNA and EcoRII-N–DNA complexes reveals a conserved DNA-binding mode and a conserved pattern of interactions with the phosphodiester backbone. The determinants of the target specificity are located in the loops that emanate from the conserved structural core. The BfiI-C–DNA structure presented here expands a range of templates for modeling of the DNA-bound complexes of the B3 family of plant TFs. PMID:24423868
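For readers who want to locate the two recognition sequences quoted above (5′-CCTGG-3′ for EcoRII-N and 5′-ACTGGG-3′ for BfiI-C) in a sequence of their own, a minimal sketch follows. It performs exact string matching on both strands and reports 0-based positions on the forward strand; the function names and the example sequence are hypothetical, and no degenerate bases are handled.

```python
def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_sites(seq, site):
    """Return sorted (position, strand) hits of `site` on either strand."""
    seq = seq.upper()
    motifs = {"+": site}
    rc = reverse_complement(site)
    if rc != site:  # avoid double-counting palindromic sites
        motifs["-"] = rc
    hits = []
    for strand, motif in motifs.items():
        start = seq.find(motif)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(motif, start + 1)  # allow overlapping hits
    return sorted(hits)


example = "TTACTGGGAACCAGGTT"
print(find_sites(example, "ACTGGG"))
print(find_sites(example, "CCTGG"))
```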
"First generation" automated DNA sequencing technology.

PubMed

Slatko, Barton E; Kieleczawa, Jan; Ju, Jingyue; Gardner, Andrew F; Hendrickson, Cynthia L; Ausubel, Frederick M

2011-10-01

Beginning in the 1980s, automation of DNA sequencing has greatly increased throughput, reduced costs, and enabled large projects to be completed more easily. The development of automation technology paralleled the development of other aspects of DNA sequencing: better enzymes and chemistry, separation and imaging technology, sequencing protocols, robotics, and computational advancements (including base-calling algorithms with quality scores, database developments, and sequence analysis programs). Despite the emergence of high-throughput sequencing platforms, automated Sanger sequencing technology remains useful for many applications. This unit provides background and a description of the "First-Generation" automated DNA sequencing technology. It also includes protocols for using the current Applied Biosystems (ABI) automated DNA sequencing machines. © 2011 by John Wiley & Sons, Inc.
A rapid, generally applicable method to engineer zinc fingers illustrated by targeting the HIV-1 promoter.

PubMed

Isalan, M; Klug, A; Choo, Y

2001-07-01

DNA-binding domains with predetermined sequence specificity are engineered by selection of zinc finger modules using phage display, allowing the construction of customized transcription factors. Despite remarkable progress in this field, the available protein-engineering methods are deficient in many respects, thus hampering the applicability of the technique. Here we present a rapid and convenient method that can be used to design zinc finger proteins against a variety of DNA-binding sites. This is based on a pair of pre-made zinc finger phage-display libraries, which are used in parallel to select two DNA-binding domains each of which recognizes given 5 base pair sequences, and whose products are recombined to produce a single protein that recognizes a composite (9 base pair) site of predefined sequence. Engineering using this system can be completed in less than two weeks and yields proteins that bind sequence-specifically to DNA with Kd values in the nanomolar range. To illustrate the technique, we have selected seven different proteins to bind various regions of the human immunodeficiency virus 1 (HIV-1) promoter.
Arduino-based automation of a DNA extraction system.

PubMed

Kim, Kyung-Won; Lee, Mi-So; Ryu, Mun-Ho; Kim, Jong-Won

2015-01-01

There have been many studies to detect infectious diseases with the molecular genetic method. This study presents an automation process for a DNA extraction system based on microfluidics and magnetic bead, which is part of a portable molecular genetic test system. This DNA extraction system consists of a cartridge with chambers, syringes, four linear stepper actuators, and a rotary stepper actuator. The actuators provide a sequence of steps in the DNA extraction process, such as transporting, mixing, and washing for the gene specimen, magnetic bead, and reagent solutions. The proposed automation system consists of a PC-based host application and an Arduino-based controller. The host application compiles a G code sequence file and interfaces with the controller to execute the compiled sequence. The controller executes stepper motor axis motion, time delay, and input-output manipulation. It drives the stepper motor with an open library, which provides a smooth linear acceleration profile. The controller also provides a homing sequence to establish the motor's reference position, and hard limit checking to prevent any over-travelling. The proposed system was implemented and its functionality was investigated, especially regarding positioning accuracy and velocity profile.
Molecular and phylogenetic characterizations of an Eimeria krijgsmanni Yakimoff & Gouseff, 1938 (Apicomplexa: Eimeriidae) mouse intestinal protozoan parasite by partial 18S ribosomal RNA gene sequence analysis.

PubMed

Takeo, Toshinori; Tanaka, Tetsuya; Matsubayashi, Makoto; Maeda, Hiroki; Kusakisako, Kodai; Matsui, Toshihiro; Mochizuki, Masami; Matsuo, Tomohide

2014-08-01

Previously, we characterized an undocumented strain of Eimeria krijgsmanni by morphological and biological features. Here, we present a detailed molecular phylogenetic analysis of this organism. Namely, 18S ribosomal RNA gene (rDNA) sequences of E. krijgsmanni were analyzed to incorporate this species into a comprehensive Eimeria phylogeny. As a result, a partial 18S rDNA sequence from E. krijgsmanni was successfully determined, and two different types, Type A and Type B, that differed by 1 base pair were identified. E. krijgsmanni was originally isolated from a single oocyst, and thus the result shows that the two types might have allelic sequence heterogeneity in the 18S rDNA. Based on phylogenetic analyses, the two types of E. krijgsmanni 18S rDNA formed one of two clades among murine Eimeria spp.; these Eimeria clades reflected morphological similarity among the Eimeria spp. This is the third molecular phylogenetic characterization of a murine Eimeria spp. in addition to E. falciformis and E. papillata. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
In and out of the rRNA genes: characterization of Pokey elements in the sequenced Daphnia genome

PubMed Central

2013-01-01

Background Only a few transposable elements are known to exhibit site-specific insertion patterns, including the well-studied R-element retrotransposons that insert into specific sites within the multigene rDNA. The only known rDNA-specific DNA transposon, Pokey (superfamily: piggyBac) is found in the freshwater microcrustacean, Daphnia pulex. Here, we present a genome-wide analysis of Pokey based on the recently completed whole genome sequencing project for D. pulex. Results Phylogenetic analysis of Pokey elements recovered from the genome sequence revealed the presence of four lineages corresponding to two divergent autonomous families and two related lineages of non-autonomous miniature inverted repeat transposable elements (MITEs). The MITEs are also found at the same 28S rRNA gene insertion site as the Pokey elements, and appear to have arisen as deletion derivatives of autonomous elements. Several copies of the full-length Pokey elements may be capable of producing an active transposase. Surprisingly, both families of Pokey possess a series of 200 bp repeats upstream of the transposase that is derived from the rDNA intergenic spacer (IGS). The IGS sequences within the Pokey elements appear to be evolving in concert with the rDNA units. Finally, analysis of the insertion sites of Pokey elements outside of rDNA showed a target preference for sites similar to the specific sequence that is targeted within rDNA. Conclusions Based on the target site preference of Pokey elements and the concerted evolution of a segment of the element with the rDNA unit, we propose an evolutionary path by which the ancestors of Pokey elements have invaded the rDNA niche. We discuss how specificity for the rDNA unit may have evolved and how this specificity has played a role in the long-term survival of these elements in the subgenus Daphnia. PMID:24059783
High-throughput assays for DNA gyrase and other topoisomerases

PubMed Central

Maxwell, Anthony; Burton, Nicolas P.; O'Hagan, Natasha

2006-01-01

We have developed high-throughput microtitre plate-based assays for DNA gyrase and other DNA topoisomerases. These assays exploit the fact that negatively supercoiled plasmids form intermolecular triplexes more efficiently than when they are relaxed. Two assays are presented, one using capture of a plasmid containing a single triplex-forming sequence by an oligonucleotide tethered to the surface of a microtitre plate and subsequent detection by staining with a DNA-specific fluorescent dye. The other uses capture of a plasmid containing two triplex-forming sequences by an oligonucleotide tethered to the surface of a microtitre plate and subsequent detection by a second oligonucleotide that is radiolabelled. The assays are shown to be appropriate for assaying DNA supercoiling by Escherichia coli DNA gyrase and DNA relaxation by eukaryotic topoisomerases I and II, and E. coli topoisomerase IV. The assays are readily adaptable to other enzymes that change DNA supercoiling (e.g. restriction enzymes) and are suitable for use in a high-throughput format. PMID:16936317
A force-based, parallel assay for the quantification of protein-DNA interactions.

PubMed

Limmer, Katja; Pippig, Diana A; Aschenbrenner, Daniela; Gaub, Hermann E

2014-01-01

Analysis of transcription factor binding to DNA sequences is of utmost importance to understand the intricate regulatory mechanisms that underlie gene expression. Several techniques exist that quantify DNA-protein affinity, but they are either very time-consuming or suffer from possible misinterpretation due to complicated algorithms or approximations like many high-throughput techniques. We present a more direct method to quantify DNA-protein interaction in a force-based assay. In contrast to single-molecule force spectroscopy, our technique, the Molecular Force Assay (MFA), parallelizes force measurements so that it can test one or multiple proteins against several DNA sequences in a single experiment. The interaction strength is quantified by comparison to the well-defined rupture stability of different DNA duplexes. As a proof-of-principle, we measured the interaction of the zinc finger construct Zif268/NRE against six different DNA constructs. We could show the specificity of our approach and quantify the strength of the protein-DNA interaction.
Influence of DNA sequence on the structure of minicircles under torsional stress

PubMed Central

Wang, Qian; Irobalieva, Rossitza N.; Chiu, Wah; Schmid, Michael F.; Fogg, Jonathan M.; Zechiedrich, Lynn

2017-01-01

The sequence dependence of the conformational distribution of DNA under various levels of torsional stress is an important unsolved problem. Combining theory and coarse-grained simulations shows that the DNA sequence and a structural correlation due to topology constraints of a circle are the main factors that dictate the 3D structure of a 336 bp DNA minicircle under torsional stress. We found that DNA minicircle topoisomers can have multiple bend locations under high torsional stress and that the positions of these sharp bends are determined by the sequence, and by a positive mechanical correlation along the sequence. We showed that simulations and theory are able to provide sequence-specific information about individual DNA minicircles observed by cryo-electron tomography (cryo-ET). We provided a sequence-specific cryo-ET tomogram fitting of DNA minicircles, registering the sequence within the geometric features. Our results indicate that the conformational distribution of minicircles under torsional stress can be designed, which has important implications for using minicircle DNA for gene therapy. PMID:28609782

Oligonucleotide indexing of DNA barcodes: identification of tuna and other scombrid species in food products.

PubMed

Botti, Sara; Giuffra, Elisabetta

2010-08-23

DNA barcodes are a global standard for species identification and have countless applications in the medical, forensic and alimentary fields, but few barcoding methods work efficiently in samples in which DNA is degraded, e.g. foods and archival specimens. This limits the choice of target regions harbouring a sufficient number of diagnostic polymorphisms. The method described here uses existing PCR and sequencing methodologies to detect mitochondrial DNA polymorphisms in complex matrices such as foods. The reported application allowed the discrimination among 17 fish species of the Scombridae family with high commercial interest such as mackerels, bonitos and tunas which are often present in processed seafood. The approach can be easily upgraded with the release of new genetic diversity information to increase the range of detected species. Cocktails of primers are designed for PCR using publicly available sequences of the target sequence. They are composed of a fixed 5' region and of variable 3' cocktail portions that allow amplification of any member of a group of species of interest. The population of short amplicons is directly sequenced and indexed using primers containing a longer 5' region and the non-polymorphic portion of the cocktail portion. A 226 bp region of CytB was selected as target after collection and screening of 148 online sequences; 85 SNPs were found, of which 75 were present in at least two sequences. Primers were also designed for two shorter sub-fragments that could be amplified from highly degraded samples. The test was used on 103 samples of seafood (canned tuna and scomber, tuna salad, tuna sauce) and could successfully detect the presence of different or additional species that were not identified on the labelling of canned tuna, tuna salad and sauce samples. The described method is largely independent of the degree of degradation of DNA source and can thus be applied to processed seafood. Moreover, the method is highly flexible: publicly available sequence information on mitochondrial genomes is rapidly increasing for most species, facilitating the choice of target sequences and the improvement of resolution of the test. This is particularly important for discrimination of marine and aquaculture species for which genome information is still limited.
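The screening step reported above (85 SNPs in the 226 bp CytB region, 75 of them present in at least two sequences) amounts to counting polymorphic columns in a set of aligned sequences. The sketch below shows one way to do that counting. It assumes that "present in at least two sequences" means the second most frequent base at a position occurs at least twice, which the abstract does not spell out; the function name and the toy alignment are hypothetical.

```python
from collections import Counter


def count_snps(aligned):
    """Return (polymorphic_sites, sites_with_variant_in_two_or_more_sequences)."""
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    total = shared = 0
    for column in zip(*aligned):
        # Ignore gaps and ambiguity codes when tallying bases in a column.
        counts = Counter(base for base in column if base in "ACGT")
        if len(counts) < 2:
            continue  # monomorphic column
        total += 1
        second_most_common = counts.most_common()[1][1]
        if second_most_common >= 2:
            shared += 1
    return total, shared


toy_alignment = ["ACGT", "ACGA", "ATGA", "ACGT"]
print(count_snps(toy_alignment))
```

Variants seen in only one sequence are the ones most likely to reflect database or sequencing errors, which is presumably why the two counts are reported separately.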
Analysis of DNA Sequences by an Optical Time-Integrating Correlator: Proof-of-Concept Experiments.

DTIC Science & Technology

1992-05-01

DNA ANALYSIS STRATEGY 4 2.1 Representation of DNA Bases 4 2.2 DNA Analysis Strategy 6 3.0 CUSTOM GENERATORS FOR DNA SEQUENCES 10 3.1 Hardware Design 10...of the DNA bases where each base is represented by a 7-bits long pseudorandom sequence. 5 Figure 4: Coarse analysis of a DNA sequence. 7 Figure 5: Fine...a 20-bases long database. 32 xiii LIST OF TABLES PAGE Table 1: Short representations of the DNA bases where each base is represented by 7-bits long
The cDNA sequence of a neutral horseradish peroxidase.

PubMed

Bartonek-Roxå, E; Eriksson, H; Mattiasson, B

1991-02-16

A cDNA clone encoding a horseradish (Armoracia rusticana) peroxidase has been isolated and characterized. The cDNA contains 1378 nucleotides excluding the poly(A) tail and the deduced protein contains 327 amino acids, including a 28 amino acid leader sequence. The predicted amino acid sequence is nine amino acids shorter than the major isoenzyme belonging to the horseradish peroxidase C group (HRP-C) and the sequence shows 53.7% identity with this isoenzyme. The described clone encodes nine cysteines of which eight correspond well with the cysteines found in HRP-C. Five potential N-glycosylation sites with the general sequence Asn-X-Thr/Ser are present in the deduced sequence. Compared to the earlier described HRP-C this is three glycosylation sites fewer. The shorter sequence and fewer N-glycosylation sites give the native isoenzyme a molecular weight several thousand lower than that of the horseradish peroxidase C isoenzymes. Comparison with the net charge value of HRP-C indicates that the described cDNA clone encodes a peroxidase which has either the same or a slightly less basic pI value, depending on whether the encoded protein is N-terminally blocked or not. This excludes the possibility that HRP-n could belong to either the HRP-A, -D or -E groups. The low sequence identity (53.7%) with HRP-C indicates that the described clone does not belong to the HRP-C isoenzyme group and comparison of the total amino acid composition with the HRP-B group does not place the described clone within this isoenzyme group. Our conclusion is that the described cDNA clone encodes a neutral horseradish peroxidase which belongs to a new, previously undescribed horseradish peroxidase group.
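Potential N-glycosylation sites of the general form Asn-X-Thr/Ser, as counted in the abstract above, can be located in a one-letter protein sequence with a short pattern search. The sketch below is an illustration only: the function name and the toy peptide are hypothetical, and the optional exclusion of proline at the X position is a commonly applied refinement that the abstract itself does not mention.

```python
import re


def find_sequons(protein, exclude_proline=False):
    """Return 1-based positions of Asn residues in Asn-X-Thr/Ser motifs."""
    middle = "[^P]" if exclude_proline else "."
    # A lookahead lets overlapping motifs (e.g. N-N-T-S) all be reported.
    pattern = re.compile(rf"(?=N{middle}[ST])")
    return [match.start() + 1 for match in pattern.finditer(protein.upper())]


toy_peptide = "MNGSANPTNKT"
print(find_sequons(toy_peptide))
print(find_sequons(toy_peptide, exclude_proline=True))
```

A match only marks a potential site; whether it is actually glycosylated cannot be decided from the sequence alone.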
Molecular genetic characterization of the RD-114 gene family of endogenous feline retroviral sequences.

PubMed Central

Reeves, R H; O'Brien, S J

1984-01-01

RD-114 is a replication-competent, xenotropic retrovirus which is homologous to a family of moderately repetitive DNA sequences present at ca. 20 copies in the normal cellular genome of domestic cats. To examine the extent and character of genomic divergence of the RD-114 gene family as well as to assess their positional association within the cat genome, we have prepared a series of molecular clones of endogenous RD-114 DNA segments from a genomic library of cat cellular DNA. Their restriction endonuclease maps were compared with each other as well as to that of the prototype-inducible RD-114 which was molecularly cloned from a chronically infected human cell line. The endogenous sequences analyzed were similar to each other in that they were colinear with RD-114 proviral DNA, were bounded by long terminal redundancies, and conserved many restriction sites in the gag and pol regions. However, the env regions of many of the sequences examined were substantially deleted. Several of the endogenous RD-114 genomes contained a novel envelope sequence which was unrelated to the env gene of the prototype RD-114 but which, like RD-114 and endogenous feline leukemia virus provirus, was found only in species of the genus Felis, and not in other closely related Felidae genera. The endogenous RD-114 sequences each had a distinct cellular flank, which indicates that these sequences are not tandem but dispersed nonspecifically throughout the genome. Southern analysis of cat cellular DNA confirmed the conclusions about conserved restriction sites in endogenous sequences and indicated that a single locus may be responsible for the production of the major inducible form of RD-114. PMID:6090693
Detection of environmental DNA of Bigheaded Carps in samples collected from selected locations in the St. Croix River and in the Mississippi River

USGS Publications Warehouse

Amberg, Jon J.; McCalla, S. Grace; Miller, Loren; Sorensen, Peter; Gaikowski, Mark P.

2013-01-01

The use of molecular methods, such as the detection of environmental deoxyribonucleic acid (eDNA), has become an increasingly popular tool in surveillance programs that monitor for the presence of invasive species in aquatic systems. One early application of these methods in aquatic systems was surveillance for DNA of Asian carps (specifically bighead carp Hypophthalmichthys nobilis and silver carp H. molitrix) in water samples taken from the Chicago Area Waterway System. The ability to identify DNA of a species in an environmental sample presents a potentially powerful tool because these sensitive analyses can presumably detect the presence of DNA in water even when the species is not abundant or is difficult to catch or monitor with traditional gear. Prior to research presented in this report, an initial eDNA surveillance effort was completed in selected locations in the Upper Mississippi and St. Croix Rivers in 2011 after the capture of a bighead carp in the St. Croix River near Prescott, WI. Data presented in this report were developed to duplicate the 2011 monitoring results from the Upper Mississippi and St. Croix Rivers and to provide critical insight into the technique to inform future work in these locations. We specifically sought to understand the potential confounding effects of other pathways of eDNA movement (e.g., fish-eating birds, watercraft) on the variation in background DNA by collecting water samples from (1) sites within the St. Croix River and the upper Mississippi River where the DNA of silver carp was previously detected, (2) sites considered to be free of Asian carp, and (3) a site known to have a large population of Asian carp. We also sought to establish a baseline Asian carp eDNA signature to which future eDNA sampling efforts could be compared. All samples taken as part of this effort were processed using conventional polymerase chain reaction (PCR) according to procedures outlined in the U.S. Army Corps of Engineers Quality Assurance Project Plan with minor deviations designed to enhance the rigor of our data. Presence of DNA in PCR-positive samples was confirmed by Sanger sequencing (forward and reverse) and sequences were considered positive only if sequences (forward and reverse) of ≥150 base pairs had a match of ≥95% to those of published sequences for bighead carp or silver carp. The DNA of bighead carp and silver carp was not detected in environmental samples collected above and below St. Croix Falls Dam on the St. Croix River, above and below the Coon Rapids Dam and below Lock and Dam 1 on the Upper Mississippi River, and from two negative control lakes, Square Lake and Lake Riley. The DNA of silver carp was detected in environmental samples collected below Lock and Dam 19 at Keokuk, Iowa, a reach of the river with high silver carp abundance. The proportion (68%) of environmental samples taken below Lock and Dam 19 that were determined to contain the DNA of silver carp was similar to that reported in the scientific literature for other abundant species. The DNA of bighead carp, however, was not detected in environmental samples collected below Lock and Dam 19, a reach of the river known to have bighead carp. Previously reported detections of the DNA of silver carp in samples collected in 2011 were not replicated in this study. Additional analyses are planned for the DNA extracted from the samples collected in 2012. Those analyses may provide additional information regarding the lack of amplification of bighead carp DNA and the lengths of the sequences of silver carp DNA present in samples taken below Lock and Dam 19. These additional analyses may help inform the use of eDNA monitoring in large, complex systems like the Mississippi River.
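The sequence-confirmation rule stated in this abstract (forward and reverse sequences of ≥150 base pairs with a ≥95% match) can be written as a small check. This is a sketch under stated assumptions: the `Read` type and the function names are hypothetical, and the reading that both the forward and the reverse sequence must pass is one interpretation of the abstract's wording.

```python
from dataclasses import dataclass

MIN_LENGTH_BP = 150      # threshold stated in the abstract
MIN_IDENTITY_PCT = 95.0  # threshold stated in the abstract


@dataclass
class Read:
    """One Sanger read, summarized by length and best match (hypothetical type)."""
    length_bp: int
    identity_pct: float  # percent match to a published bighead or silver carp sequence


def read_passes(read: Read) -> bool:
    """True if a single read meets both the length and the identity threshold."""
    return read.length_bp >= MIN_LENGTH_BP and read.identity_pct >= MIN_IDENTITY_PCT


def sample_is_positive(forward: Read, reverse: Read) -> bool:
    """Assumed interpretation: both directions must pass for a positive call."""
    return read_passes(forward) and read_passes(reverse)


if __name__ == "__main__":
    print(sample_is_positive(Read(180, 97.2), Read(165, 96.0)))
    print(sample_is_positive(Read(180, 97.2), Read(120, 99.0)))
```

How the percent match is obtained (for example, from a database search) is outside this sketch.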
Molecular Evidence for Occurrence of Tomato leaf curl New Delhi virus in Ash Gourd (Benincasa hispida) Germplasm Showing a Severe Yellow Stunt Disease in India.

PubMed

Roy, Anirban; Spoorthi, P; Panwar, G; Bag, Manas Kumar; Prasad, T V; Kumar, Gunjeet; Gangopadhyay, K K; Dutta, M

2013-06-01

An evaluation of 70 accessions of ash gourd germplasm grown at the National Bureau of Plant Genetic Resources, New Delhi, India during the Kharif season (2010) showed natural occurrence of a yellow stunt disease in three accessions (IC554690, IC036330 and Pusa Ujjwal). A set of begomovirus-specific primers used in PCR gave the expected amplicon from all the symptomatic plants; however, no betasatellite was detected. The complete genome of the begomovirus (DNA-A and DNA-B), amplified through rolling circle amplification, was cloned and sequenced. The begomovirus under study shared high sequence identities with different isolates of Tomato leaf curl New Delhi virus (ToLCNDV) and clustered with them. Among those isolates, the DNA-A and DNA-B of the present begomovirus isolate showed the highest sequence identities, 99.6% and 96.8%, respectively, with an isolate reported on pumpkin from India (DNA-A: AM286433, DNA-B: AM286435). Based on the sequence analysis, the begomovirus obtained from ash gourd was considered an isolate of ToLCNDV. Thus, the present findings constitute the first report of the occurrence of a new yellow stunt disease in ash gourd from India and demonstrate the association of ToLCNDV with the symptomatic samples. Occurrence of ToLCNDV in ash gourd germplasm not only adds a new cucurbitaceous host for this virus but also raises concern about the perpetuation of this virus in the absence of its main host, tomato, and thus has epidemiological relevance for understanding the rapid spread of this virus in tomato and other hosts in the Indian subcontinent.
Identical mitochondrial somatic mutations unique to chronic periodontitis and coronary artery disease

PubMed Central

Pallavi, Tokala; Chandra, Rampalli Viswa; Reddy, Aileni Amarender; Reddy, Bavigadda Harish; Naveen, Anumala

2016-01-01

Context: The inflammatory processes involved in chronic periodontitis and coronary artery diseases (CADs) are similar and produce reactive oxygen species that may result in similar somatic mutations in mitochondrial deoxyribonucleic acid (mtDNA). Aims: The aims of the present study were to identify somatic mtDNA mutations in periodontal and cardiac tissues from subjects undergoing coronary artery bypass surgery and determine what fraction was identical and unique to these tissues. Settings and Design: The study population consisted of 30 chronic periodontitis subjects who underwent coronary artery surgery after an angiogram had indicated CAD. Materials and Methods: Gingival tissue samples were taken from the site with deepest probing depth; coronary artery tissue samples were taken during the coronary artery bypass grafting procedures, and blood samples were drawn during this surgical procedure. These samples were stored under aseptic conditions and later transported for mtDNA analysis. Statistical Analysis Used: Complete mtDNA sequences were obtained and aligned with the revised Cambridge reference sequence (NC_012920) using sequence analysis and auto assembler tools. Results: Among the complete mtDNA sequences, a total of 162 variations were spread across the whole mitochondrial genome and present only in the coronary artery and the gingival tissue samples but not in the blood samples. Among the 162 variations, 12 were novel and four of the 12 novel variations were found in mitochondrial NADH dehydrogenase subunit 5 complex I gene (33.3%). Conclusions: Analysis of mtDNA mutations indicated 162 variants unique to periodontitis and CAD. Of these, 12 were novel and may have resulted from destructive oxidative forces common to these two diseases. PMID:27041832
Laser mass spectrometry for DNA sequencing, disease diagnosis, and fingerprinting

NASA Astrophysics Data System (ADS)

Chen, C. H. Winston; Taranenko, N. I.; Zhu, Y. F.; Chung, C. N.; Allman, S. L.

1997-05-01

Since laser mass spectrometry has the potential for achieving very fast DNA analysis, we recently applied it to DNA sequencing, DNA typing for fingerprinting, and DNA screening for disease diagnosis. Two different approaches for sequencing DNA have been successfully demonstrated. One is to sequence DNA with DNA ladders produced from Sanger's enzymatic method. The other is to do direct sequencing without DNA ladders. Quick DNA typing for identification purposes is critical for forensic applications. Our preliminary results indicate that laser mass spectrometry can possibly be used for rapid DNA fingerprinting applications at a much lower cost than gel electrophoresis. Population screening for certain genetic diseases can be a very efficient step toward reducing medical costs through prevention. Since laser mass spectrometry can provide very fast DNA analysis, we applied laser mass spectrometry to disease diagnosis. Clinical samples with both base deletion and point mutation have been tested with complete success.
Storing data encoded DNA in living organisms

DOEpatents

Wong, Pak C.; Wong, Kwong K.; Foote, Harlan P. [Richland, WA]

2006-06-06

Current technologies allow the generation of artificial DNA molecules and/or the alteration of the DNA sequences of existing DNA molecules. With a careful coding scheme and arrangement, it is possible to encode important information as an artificial DNA strand and store it in a living host safely and permanently. This inventive technology can be used to identify origins and protect R&D investments. It can also be used in environmental research to track generations of organisms and observe the ecological impact of pollutants. Today, there are microorganisms that can survive under extreme conditions. It is also advantageous to consider multicellular organisms as hosts for stored information. These living organisms can serve as memory housing and provide protection for stored data or information. The present invention provides for data storage in a living organism wherein at least one DNA sequence is encoded to represent data and incorporated into a living organism.
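To make the idea of a "coding scheme" concrete, the sketch below maps every two bits of data to one base. It is purely illustrative: the abstract does not describe the patent's actual scheme, and the mapping (00→A, 01→C, 10→G, 11→T) and the function names `encode` and `decode` are assumptions introduced here.

```python
BASES = "ACGT"  # hypothetical mapping: 00 -> A, 01 -> C, 10 -> G, 11 -> T


def encode(data: bytes) -> str:
    """Encode bytes as a DNA string, four bases per byte, high bits first."""
    return "".join(
        BASES[(byte >> shift) & 0b11]
        for byte in data
        for shift in (6, 4, 2, 0)
    )


def decode(dna: str) -> bytes:
    """Invert encode(); raises ValueError on bad length or unknown bases."""
    if len(dna) % 4 != 0:
        raise ValueError("DNA length must be a multiple of 4 bases")
    out = bytearray()
    for i in range(0, len(dna), 4):
        byte = 0
        for base in dna[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)  # index() raises on non-ACGT
        out.append(byte)
    return bytes(out)


if __name__ == "__main__":
    strand = encode(b"R&D")
    print(strand, decode(strand))
```

A scheme intended for a living host would presumably need more than this, for example avoiding sequences with biological function and tolerating mutation; none of that is modeled here.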
Micronuclear DNA of Oxytricha nova contains sequences with autonomously replicating activity in Saccharomyces cerevisiae.

PubMed Central

Colombo, M M; Swanton, M T; Donini, P; Prescott, D M

1984-01-01

Oxytricha nova is a hypotrichous ciliate with micronuclei and macronuclei. Micronuclei, which contain large, chromosomal-sized DNA, are genetically inert but undergo meiosis and exchange during cell mating. Macronuclei, which contain only small, gene-sized DNA molecules, provide all of the nuclear RNA needed to run the cell. After cell mating the macronucleus is derived from a micronucleus, a derivation that includes excision of the genes from chromosomes and elimination of the remaining DNA. The eliminated DNA includes all of the repetitious sequences and approximately 95% of the unique sequences. We cloned large restriction fragments from the micronucleus that confer replication ability on a replication-deficient plasmid in Saccharomyces cerevisiae. Sequences that confer replication ability are called autonomously replicating sequences. The frequency and effectiveness of autonomously replicating sequences in micronuclear DNA are similar to those reported for DNAs of other organisms introduced into yeast cells. Of the 12 micronuclear fragments with autonomously replicating sequence activity, 9 also showed homology to macronuclear DNA, indicating that they contain a macronuclear gene sequence. We conclude from this that autonomously replicating sequence activity is nonrandomly distributed throughout micronuclear DNA and is preferentially associated with those regions of micronuclear DNA that contain genes. PMID:6092934
DNA sequence-dependent mechanics and protein-assisted bending in repressor-mediated loop formation

PubMed Central

Boedicker, James Q.; Garcia, Hernan G.; Johnson, Stephanie; Phillips, Rob

2014-01-01

As the chief informational molecule of life, DNA is subject to extensive physical manipulations. The energy required to deform double-helical DNA depends on sequence, and this mechanical code of DNA influences gene regulation, such as through nucleosome positioning. Here we examine the sequence-dependent flexibility of DNA in bacterial transcription factor-mediated looping, a context for which the role of sequence remains poorly understood. Using a suite of synthetic constructs repressed by the Lac repressor and two well-known sequences that show large flexibility differences in vitro, we make precise statistical mechanical predictions as to how DNA sequence influences loop formation and test these predictions using in vivo transcription and in vitro single-molecule assays. Surprisingly, sequence-dependent flexibility does not affect in vivo gene regulation. By theoretically and experimentally quantifying the relative contributions of sequence and the DNA-bending protein HU to DNA mechanical properties, we reveal that bending by HU dominates DNA mechanics and masks intrinsic sequence-dependent flexibility. Such a quantitative understanding of how mechanical regulatory information is encoded in the genome will be a key step towards a predictive understanding of gene regulation at single-base pair resolution. PMID:24231252
Molecular identification and phylogenetic analysis of Wuchereria bancrofti from human blood samples in Egypt.

PubMed

Abdel-Shafi, Iman R; Shoieb, Eman Y; Attia, Samar S; Rubio, José M; Ta-Tang, Thuy-Huong; El-Badry, Ayman A

2017-03-01

Lymphatic filariasis (LF) is a serious vector-borne health problem, and Wuchereria bancrofti (W.b) is the major cause of LF worldwide and is focally endemic in Egypt. Identification of filarial infection using traditional morphologic and immunological criteria can be difficult and lead to misdiagnosis. The aim of the present study was molecular detection of W.b in residents of endemic areas in Egypt, sequence variance analysis, and phylogenetic analysis of W.b DNA. Blood samples collected from residents in filariasis-endemic areas in five governorates were subjected to semi-nested PCR targeting a repeated DNA sequence for detection of W.b DNA. PCR products were sequenced; subsequently, a phylogenetic analysis of the obtained sequences was performed. Out of 300 blood samples, W.b DNA was identified in 48 (16%). Sequencing analysis confirmed the PCR results, identifying only W.b species. Sequence alignment and phylogenetic analysis indicated genetically distinct clusters of W.b among the study population. Study results demonstrated that the semi-nested PCR was an effective diagnostic tool for accurate and rapid detection of W.b infections in nano-epidemics and is applicable to samples collected in the daytime as well as at night. Sequencing of PCR products and phylogenetic analysis revealed three different nucleotide sequence variants. Further genetic studies of W.b in Egypt and other endemic areas are needed to distinguish related strains and the various ecological as well as drug effects exerted on them to support W.b elimination.
Reduced-median-network analysis of complete mitochondrial DNA coding-region sequences for the major African, Asian, and European haplogroups.

PubMed

Herrnstadt, Corinna; Elson, Joanna L; Fahy, Eoin; Preston, Gwen; Turnbull, Douglass M; Anderson, Christen; Ghosh, Soumitra S; Olefsky, Jerrold M; Beal, M Flint; Davis, Robert E; Howell, Neil

2002-05-01

The evolution of the human mitochondrial genome is characterized by the emergence of ethnically distinct lineages or haplogroups. Nine European, seven Asian (including Native American), and three African mitochondrial DNA (mtDNA) haplogroups have been identified previously on the basis of the presence or absence of a relatively small number of restriction-enzyme recognition sites or on the basis of nucleotide sequences of the D-loop region. We have used reduced-median-network approaches to analyze 560 complete European, Asian, and African mtDNA coding-region sequences from unrelated individuals to develop a more complete understanding of sequence diversity both within and between haplogroups. A total of 497 haplogroup-associated polymorphisms were identified, 323 (65%) of which were associated with one haplogroup and 174 (35%) of which were associated with two or more haplogroups. Approximately one-half of these polymorphisms are reported for the first time here. Our results confirm and substantially extend the phylogenetic relationships among mitochondrial genomes described elsewhere from the major human ethnic groups. Another important result is that there were numerous instances both of parallel mutations at the same site and of reversion (i.e., homoplasy). It is likely that homoplasy in the coding region will confound evolutionary analysis of small sequence sets. By a linkage-disequilibrium approach, additional evidence for the absence of human mtDNA recombination is presented here.
Molecular Identification of Ectomycorrhizal Mycelium in Soil Horizons

PubMed Central

Landeweert, Renske; Leeflang, Paula; Kuyper, Thom W.; Hoffland, Ellis; Rosling, Anna; Wernars, Karel; Smit, Eric

2003-01-01

Molecular identification techniques based on total DNA extraction provide a unique tool for identification of mycelium in soil. Using molecular identification techniques, the ectomycorrhizal (EM) fungal community under coniferous vegetation was analyzed. Soil samples were taken at different depths from four horizons of a podzol profile. A basidiomycete-specific primer pair (ITS1F-ITS4B) was used to amplify fungal internal transcribed spacer (ITS) sequences from total DNA extracts of the soil horizons. Amplified basidiomycete DNA was cloned and sequenced, and a selection of the obtained clones was analyzed phylogenetically. Based on sequence similarity, the fungal clone sequences were sorted into 25 different fungal groups, or operational taxonomic units (OTUs). Out of 25 basidiomycete OTUs, 7 OTUs showed high nucleotide homology (≥99%) with known EM fungal sequences and 16 were found exclusively in the mineral soil. The taxonomic positions of six OTUs remained unclear. OTU sequences were compared to sequences from morphotyped EM root tips collected from the same sites. Of the 25 OTUs, 10 OTUs had ≥98% sequence similarity with these EM root tip sequences. The present study demonstrates the use of molecular techniques to identify EM hyphae in various soil types. This approach differs from the conventional method of EM root tip identification and provides a novel approach to examine EM fungal communities in soil. PMID:12514012
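Sorting clone sequences into OTUs by similarity, as described in this abstract, can be illustrated with a greedy clustering sketch. This is not the authors' procedure, which the abstract does not specify: the function names are hypothetical, the threshold is left as a required parameter because the grouping cutoff is not given, and `difflib.SequenceMatcher.ratio()` is used only as a crude stand-in for alignment-based sequence identity.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Rough similarity in [0, 1]; a stand-in for alignment-based identity."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def cluster_otus(sequences: list[str], threshold: float) -> list[list[str]]:
    """Greedy clustering: each sequence joins the first OTU whose
    representative (its first member) is at least `threshold` similar,
    otherwise it starts a new OTU."""
    otus: list[list[str]] = []
    for seq in sequences:
        for otu in otus:
            if similarity(seq, otu[0]) >= threshold:
                otu.append(seq)
                break
        else:
            otus.append([seq])
    return otus


if __name__ == "__main__":
    toy = ["ACGTACGTAC", "ACGTACGTAC", "TTTTTTTTTT"]  # made-up sequences
    for i, otu in enumerate(cluster_otus(toy, threshold=0.9), start=1):
        print(f"OTU {i}: {len(otu)} sequence(s)")
```

Greedy clustering depends on input order, so results from this sketch should not be read as equivalent to a phylogenetic grouping.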
Enzymatic DNA molecules

NASA Technical Reports Server (NTRS)

Joyce, Gerald F. (Inventor); Breaker, Ronald R. (Inventor)

1998-01-01

The present invention discloses deoxyribonucleic acid enzymes--catalytic or enzymatic DNA molecules--capable of cleaving nucleic acid sequences or molecules, particularly RNA, in a site-specific manner, as well as compositions including same. Methods of making and using the disclosed enzymes and compositions are also disclosed.
Morphological description and DNA barcoding of Hydrobaenus majus sp. nov. (Diptera: Chironomidae: Orthocladiinae) from the Russian Far East.

PubMed

Makarchenko, Eugenyi A; Makarchenko, Marina A; Semenchenko, Alexander A

2015-08-14

Illustrated descriptions of the adult male, pupa and fourth instar larva, as well as DNA barcoding, of Hydrobaenus majus sp. nov. in comparison with the closely related species H. sikhotealinensis Makarchenko et Makarchenko from the Russian Far East are provided. The species-specificity of H. majus sp. nov. COI sequences is analyzed and the sequences are presented as diagnostic characters--molecular markers of H. majus and H. sikhotealinensis.
Exponential Megapriming PCR (EMP) Cloning—Seamless DNA Insertion into Any Target Plasmid without Sequence Constraints

PubMed Central

Ulrich, Alexander; Andersen, Kasper R.; Schwartz, Thomas U.

2012-01-01

We present a fast, reliable and inexpensive restriction-free cloning method for seamless DNA insertion into any plasmid without sequence limitation. Exponential megapriming PCR (EMP) cloning requires two consecutive PCR steps and can be carried out in one day. We show that EMP cloning has a higher efficiency than restriction-free (RF) cloning, especially for long inserts above 2.5 kb. EMP further enables simultaneous cloning of multiple inserts. PMID:23300917
Affordable hands-on DNA sequencing and genotyping: an exercise for teaching DNA analysis to undergraduates.

PubMed

Shah, Kushani; Thomas, Shelby; Stein, Arnold

2013-01-01

In this report, we describe a 5-week laboratory exercise for undergraduate biology and biochemistry students in which students learn to sequence DNA and to genotype their DNA for selected single nucleotide polymorphisms (SNPs). Students use miniaturized DNA sequencing gels that require approximately 8 min to run. The students perform G, A, T, C Sanger sequencing reactions. They prepare and run the gels, perform Southern blots (which require only 10 min), and detect sequencing ladders using a colorimetric detection system. Students enlarge their sequencing ladders from digital images of their small nylon membranes, and read the sequence manually. They compare their reads with the actual DNA sequence using BLAST2. After mastering the DNA sequencing system, students prepare their own DNA from a cheek swab, polymerase chain reaction-amplify a region of their DNA that encompasses a SNP of interest, and perform sequencing to determine their genotype at the SNP position. A family pedigree can also be constructed. The SNP chosen by the instructor was rs17822931, which is in the ABCC11 gene and is the determinant of human earwax type. Genotypes at the rs17822931 site vary in different ethnic populations. © 2013 by The International Union of Biochemistry and Molecular Biology.
The genomes of many yam species contain transcriptionally active endogenous geminiviral sequences that may be functionally expressed

PubMed Central

Filloux, Denis; Murrell, Sasha; Koohapitagtam, Maneerat; Golden, Michael; Julian, Charlotte; Galzi, Serge; Uzest, Marilyne; Rodier-Goud, Marguerite; D’Hont, Angélique; Vernerey, Marie Stephanie; Wilkin, Paul; Peterschmitt, Michel; Winter, Stephan; Murrell, Ben; Martin, Darren P.; Roumagnac, Philippe

2015-01-01

Endogenous viral sequences are essentially ‘fossil records’ that can sometimes reveal the genomic features of long extinct virus species. Although numerous known instances exist of single-stranded DNA (ssDNA) genomes becoming stably integrated within the genomes of bacteria and animals, there remain very few examples of such integration events in plants. The best studied of these events are those which yielded the geminivirus-related DNA elements found within the nuclear genomes of various Nicotiana species. Although other ssDNA virus-like sequences are included within the draft genomes of various plant species, it is not entirely certain that these are not contaminants. The Nicotiana geminivirus-related DNA elements therefore remain the only definitively proven instances of endogenous plant ssDNA virus sequences. Here, we characterize two new classes of endogenous plant virus sequence that are also apparently derived from ancient geminiviruses in the genus Begomovirus. These two endogenous geminivirus-like elements (EGV1 and EGV2) are present in the Dioscorea spp. of the Enantiophyllum clade. We used fluorescence in situ hybridization to confirm that the EGV1 sequences are integrated in the D. alata genome and showed that one or two ancestral EGV sequences likely became integrated more than 1.4 million years ago during or before the diversification of the Asian and African Enantiophyllum Dioscorea spp. Unexpectedly, we found evidence of natural selection actively favouring the maintenance of EGV-expressed replication-associated protein (Rep) amino acid sequences, which clearly indicates that functional EGV Rep proteins were probably expressed for prolonged periods following endogenization. Further, the detection in D. alata of EGV gene transcripts, small 21–24 nt RNAs that are apparently derived from these transcripts, and expressed Rep proteins, provides evidence that some EGV genes are possibly still functionally expressed in at least some of the Enantiophyllum clade species. PMID:27774276
Phylogenetic characterization of a biogas plant microbial community integrating clone library 16S-rDNA sequences and metagenome sequence data obtained by 454-pyrosequencing.

PubMed

Kröber, Magdalena; Bekel, Thomas; Diaz, Naryttza N; Goesmann, Alexander; Jaenicke, Sebastian; Krause, Lutz; Miller, Dimitri; Runte, Kai J; Viehöver, Prisca; Pühler, Alfred; Schlüter, Andreas

2009-06-01

The phylogenetic structure of the microbial community residing in a fermentation sample from a production-scale biogas plant fed with maize silage, green rye and liquid manure was analysed by an integrated approach using clone library sequences and metagenome sequence data obtained by 454-pyrosequencing. Sequencing of 109 clones from a bacterial and an archaeal 16S-rDNA amplicon library revealed that the obtained nucleotide sequences are similar but not identical to 16S-rDNA database sequences derived from different anaerobic environments including digestors and bioreactors. Most of the bacterial 16S-rDNA sequences could be assigned to the phylum Firmicutes with the most abundant class Clostridia and to the class Bacteroidetes, whereas most archaeal 16S-rDNA sequences cluster close to the methanogen Methanoculleus bourgensis. Further sequences of the archaeal library most probably represent so far non-characterised species within the genus Methanoculleus. A similar result was derived from phylogenetic analysis of mcrA clone sequences. The mcrA gene encodes the alpha-subunit of methyl-coenzyme-M reductase, which is involved in the final step of methanogenesis. BLASTn analysis applying stringent settings resulted in assignment of 16S-rDNA metagenome sequence reads to 62 16S-rDNA amplicon sequences, thus enabling abundance estimates for 16S-rDNA clone library sequences. Ribosomal Database Project (RDP) Classifier processing of metagenome 16S-rDNA reads revealed abundance of the phyla Firmicutes, Bacteroidetes and Euryarchaeota and the orders Clostridiales, Bacteroidales and Methanomicrobiales. Moreover, a large fraction of 16S-rDNA metagenome reads could not be assigned to lower taxonomic ranks, demonstrating that numerous microorganisms in the analysed fermentation sample of the biogas plant are still unclassified or unknown.
Homeologous plastid DNA transformation in tobacco is mediated by multiple recombination events.

PubMed Central

Kavanagh, T A; Thanh, N D; Lao, N T; McGrath, N; Peter, S O; Horváth, E M; Dix, P J; Medgyesy, P

1999-01-01

Efficient plastid transformation has been achieved in Nicotiana tabacum using cloned plastid DNA of Solanum nigrum carrying mutations conferring spectinomycin and streptomycin resistance. The use of the incompletely homologous (homeologous) Solanum plastid DNA as donor resulted in a Nicotiana plastid transformation frequency comparable with that of other experiments where completely homologous plastid DNA was introduced. Physical mapping and nucleotide sequence analysis of the targeted plastid DNA region in the transformants demonstrated efficient site-specific integration of the 7.8-kb Solanum plastid DNA and the exclusion of the vector DNA. The integration of the cloned Solanum plastid DNA into the Nicotiana plastid genome involved multiple recombination events as revealed by the presence of discontinuous tracts of Solanum-specific sequences that were interspersed between Nicotiana-specific markers. Marked position effects resulted in very frequent cointegration of the nonselected peripheral donor markers located adjacent to the vector DNA. Data presented here on the efficiency and features of homeologous plastid DNA recombination are consistent with the existence of an active RecA-mediated, but a diminished mismatch, recombination/repair system in higher-plant plastids. PMID:10388829
Distinctive archaebacterial species associated with anaerobic rumen protozoan Entodinium caudatum.

PubMed

Tóthová, T; Piknová, M; Kisidayová, S; Javorský, P; Pristas, P

2008-01-01

The diversity of archaebacteria associated with the anaerobic rumen protozoan Entodinium caudatum in long-term in vitro culture was investigated by denaturing gradient gel electrophoresis (DGGE) analysis of the hypervariable V3 region of the archaebacterial 16S rRNA gene. PCR was performed directly on DNA extracted from a single protozoal cell and on total community genomic DNA, and the obtained fingerprints were compared. The analysis indicated the presence of a solitary intense band in Entodinium caudatum single-cell DNA, which had no counterpart in the profile from total DNA. The identity of the archaebacterium represented by this band was determined by sequence analysis, which showed that the sequence fell within the cluster of ciliate symbiotic methanogens identified recently by a 16S gene library approach.
Genomics approach to the environmental community of microorganisms

NASA Astrophysics Data System (ADS)

Kawarabayasi, Y.; Maruyama, A.

2004-12-01

Microscopic observation and comparison of 16S rDNA sequences have indicated that many extremophiles survive in hydrothermal environments. However, it is generally said that over 99% of all microbes are currently uncultivable. We therefore planned to identify uncultivable microbes through direct sequencing of environmental DNA. First, shotgun plasmid libraries were constructed directly from DNA prepared from mixed microbes collected from low-temperature hydrothermal water at RM24 on the Southern East Pacific Rise (S-EPR). The sequences of a number of clones showed features similar to eukaryotic introns or to the tandem repetitive sequences identified in some human familial diseases. The results indicated that many microorganisms with eukaryotic features were dominant in the low-temperature water of the S-EPR. Second, shotgun plasmid libraries were constructed from environmental DNA prepared from Beppu hot springs. ORFs were easily identified in all clones whose entire sequence was determined. Thus, hot springs can be said to be a good resource for finding novel genes. Finally, mixed microbes isolated from the Suiyo seamount were used to construct a shotgun library. The clones in this library contained ORFs. From some clones in the hot spring and Suiyo samples, aminoacyl-tRNA synthetase, which is generally present in all organisms, was identified by similarity. Phylogenetic analysis of the aminoacyl-tRNA synthetases identified indicated that novel and unidentified microorganisms should be present in the hot springs or at the Suiyo seamount. The novel genes identified from the Suiyo seamount were also expressed in E. coli. Some gene products were successfully obtained from the E. coli cells as soluble proteins. Some proteins were thermostable up to 70 °C, meaning that the original host cell of the gene should be stable up to the same temperature.
Our work indicates that environmental genomics, including the direct cloning and sequencing of environmental DNA and the expression of the genes identified, is a powerful approach for discovering novel uncultivable microbes and novel active genes.
Transposon-containing DNA cloning vector and uses thereof

DOEpatents

Berg, C.M.; Berg, D.E.; Wang, G.

1997-07-08

The present invention discloses a rapid method of restriction mapping, sequencing or localizing genetic features in a segment of deoxyribonucleic acid (DNA) that is up to 42 kb in size. The method in part comprises cloning of the DNA segment in a specialized cloning vector and then isolating nested deletions in either direction in vivo by intramolecular transposition into the cloned DNA. A plasmid has been prepared and disclosed. 4 figs.
Transposon-containing DNA cloning vector and uses thereof

DOEpatents

Berg, Claire M.; Berg, Douglas E.; Wang, Gan

1997-01-01

The present invention discloses a rapid method of restriction mapping, sequencing or localizing genetic features in a segment of deoxyribonucleic acid (DNA) that is up to 42 kb in size. The method in part comprises cloning of the DNA segment in a specialized cloning vector and then isolating nested deletions in either direction in vivo by intramolecular transposition into the cloned DNA. A plasmid has been prepared and disclosed.
Genome Partitioner: A web tool for multi-level partitioning of large-scale DNA constructs for synthetic biology applications

PubMed Central

Del Medico, Luca; Christen, Heinz; Christen, Beat

2017-01-01

Recent advances in lower-cost DNA synthesis techniques have enabled new innovations in the field of synthetic biology. Still, efficient design and higher-order assembly of genome-scale DNA constructs remains a labor-intensive process. Given the complexity, computer assisted design tools that fragment large DNA sequences into fabricable DNA blocks are needed to pave the way towards streamlined assembly of biological systems. Here, we present the Genome Partitioner software implemented as a web-based interface that permits multi-level partitioning of genome-scale DNA designs. Without the need for specialized computing skills, biologists can submit their DNA designs to a fully automated pipeline that generates the optimal retrosynthetic route for higher-order DNA assembly. To test the algorithm, we partitioned a 783 kb Caulobacter crescentus genome design. We validated the partitioning strategy by assembling a 20 kb test segment encompassing a difficult to synthesize DNA sequence. Successful assembly from 1 kb subblocks into the 20 kb segment highlights the effectiveness of the Genome Partitioner for reducing synthesis costs and timelines for higher-order DNA assembly. The Genome Partitioner is broadly applicable to translate DNA designs into ready to order sequences that can be assembled with standardized protocols, thus offering new opportunities to harness the diversity of microbial genomes for synthetic biology applications. The Genome Partitioner web tool can be accessed at https://christenlab.ethz.ch/GenomePartitioner. PMID:28531174
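The multi-level idea in this abstract can be illustrated with a minimal sketch. It is not the Genome Partitioner algorithm, which the abstract does not describe; it only cuts a sequence into fixed-size segments and each segment into fixed-size subblocks. The function name `partition` and the toy sequence are assumptions for illustration, and the 20 kb and 1 kb defaults simply echo the test segment mentioned above.

```python
def partition(seq, segment_len=20_000, block_len=1_000):
    """Split seq into segments and each segment into subblocks.

    Hypothetical helper (not from the Genome Partitioner): returns a list
    of segments, each a list of half-open (start, end) subblock coordinates.
    """
    levels = []
    for seg_start in range(0, len(seq), segment_len):
        seg_end = min(seg_start + segment_len, len(seq))
        blocks = [
            (b, min(b + block_len, seg_end))
            for b in range(seg_start, seg_end, block_len)
        ]
        levels.append(blocks)
    return levels


design = "ACGT" * 11_250  # 45 kb toy sequence, invented for this example
plan = partition(design)
print(len(plan), [len(segment) for segment in plan])
```

A real design tool would presumably also handle overlaps between neighbouring blocks and sequence features that are hard to synthesize; this sketch ignores both.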
DNA Photo Lithography with Cinnamate-based Photo-Bio-Nano-Glue

NASA Astrophysics Data System (ADS)

Feng, Lang; Li, Minfeng; Romulus, Joy; Sha, Ruojie; Royer, John; Wu, Kun-Ta; Xu, Qin; Seeman, Nadrian; Weck, Marcus; Chaikin, Paul

2013-03-01

We present a technique to make patterned functional surfaces, using a cinnamate photo cross-linker and photolithography. We have designed and modified a complementary set of single DNA strands to incorporate a pair of opposing cinnamate molecules. On exposure to 360nm UV, the cinnamate makes a highly specific covalent bond permanently linking only the complementary strands containing the cinnamates. We have studied this specific and efficient crosslinking with cinnamate-containing DNA in solution and on particles. UV addressability allows us to pattern surfaces functionally. The entire surface is coated with a DNA sequence A incorporating cinnamate. DNA strands A'B with one end containing a complementary cinnamated sequence A' attached to another sequence B, are then hybridized to the surface. UV photolithography is used to bind the A'B strand in a specific pattern. The system is heated and the unbound DNA is washed away. The pattern is then observed by thermo-reversibly hybridizing either fluorescently dyed B' strands complementary to B, or colloids coated with B' strands. Our techniques can be used to reversibly and/or permanently bind, via DNA linkers, an assortment of molecules, proteins and nanostructures. Potential applications range from advanced self-assembly, such as templated self-replication schemes recently reported, to designed physical and chemical patterns, to high-resolution multi-functional DNA surfaces for genetic detection or DNA computing.
Existing and emerging detection technologies for DNA (Deoxyribonucleic Acid) finger printing, sequencing, bio- and analytical chips: a multidisciplinary development unifying molecular biology, chemical and electronics engineering.

PubMed

Kumar Khanna, Vinod

2007-01-01

The current status and research trends of detection techniques for DNA-based analysis such as DNA finger printing, sequencing, biochips and allied fields are examined. An overview of main detectors is presented vis-à-vis these DNA operations. The biochip method is explained, the role of micro- and nanoelectronic technologies in biochip realization is highlighted, various optical and electrical detection principles employed in biochips are indicated, and the operational mechanisms of these detection devices are described. Although a diversity of biochips for diagnostic and therapeutic applications has been demonstrated in research laboratories worldwide, only some of these chips have entered the clinical market, and more chips are awaiting commercialization. The necessity of tagging is eliminated in refractive-index change based devices, but the basic flaw of indirect nature of most detection methodologies can only be overcome by generic and/or reagentless DNA sensors such as the conductance-based approach and the DNA-single electron transistor (DNA-SET) structure. Devices of the electrical detection-based category are expected to pave the pathway for the next-generation DNA chips. The review provides a comprehensive coverage of the detection technologies for DNA finger printing, sequencing and related techniques, encompassing a variety of methods from the primitive art to the state-of-the-art scenario as well as promising methods for the future.
Molecular Phylogenetics of Trichostrongylus Species (Nematoda: Trichostrongylidae) from Humans of Mazandaran Province, Iran.

PubMed

Sharifdini, Meysam; Heidari, Zahra; Hesari, Zahra; Vatandoost, Sajad; Kia, Eshrat Beigom

2017-06-01

The present study was performed to analyze the molecular phylogenetic positions of human-infecting Trichostrongylus species in Mazandaran Province, Iran, which is an endemic area for trichostrongyliasis. DNA from 7 Trichostrongylus-infected stool samples was extracted using an in-house (IH) method. PCR amplification of the ITS2-rDNA region was performed, and the products were sequenced. Phylogenetic analysis of the nucleotide sequence data was performed using MEGA 5.0 software. Six out of 7 isolates had high similarity with Trichostrongylus colubriformis, while the other one showed high homology with Trichostrongylus axei registered in GenBank reference sequences. Intra-specific variations within isolates of T. colubriformis and T. axei amounted to 0-1.8% and 0-0.6%, respectively. Trichostrongylus species obtained in the present study were in a cluster with the relevant reference sequences from previous studies. BLAST analysis indicated that there was 100% homology among all 6 ITS2 sequences of T. colubriformis in the present study and most previously registered sequences of T. colubriformis from human, sheep, and goat isolates from Iran and also human isolates from Laos, Thailand, and France. The ITS2 sequence of T. axei exhibited 99.4% homology with the human isolate of T. axei from Thailand, sheep isolates from New Zealand and Iran, and a cattle isolate from the USA.
A DNA 'barcode blitz': rapid digitization and sequencing of a natural history collection.

PubMed

Hebert, Paul D N; Dewaard, Jeremy R; Zakharov, Evgeny V; Prosser, Sean W J; Sones, Jayme E; McKeown, Jaclyn T A; Mantle, Beth; La Salle, John

2013-01-01

DNA barcoding protocols require the linkage of each sequence record to a voucher specimen that has, whenever possible, been authoritatively identified. Natural history collections would seem an ideal resource for barcode library construction, but they have never seen large-scale analysis because of concerns linked to DNA degradation. The present study examines the strength of this barrier, carrying out a comprehensive analysis of moth and butterfly (Lepidoptera) species in the Australian National Insect Collection. Protocols were developed that enabled tissue samples, specimen data, and images to be assembled rapidly. Using these methods, a five-person team processed 41,650 specimens representing 12,699 species in 14 weeks. Subsequent molecular analysis took about six months, reflecting the need for multiple rounds of PCR as sequence recovery was impacted by age, body size, and collection protocols. Despite these variables and the fact that specimens averaged 30.4 years old, barcode records were obtained from 86% of the species. In fact, one or more barcode compliant sequences (>487 bp) were recovered from virtually all species represented by five or more individuals, even when the youngest was 50 years old. By assembling specimen images, distributional data, and DNA barcode sequences on a web-accessible informatics platform, this study has greatly advanced accessibility to information on thousands of species. Moreover, much of the specimen data became publicly accessible within days of its acquisition, while most sequence results saw release within three months. As such, this study reveals the speed with which DNA barcode workflows can mobilize biodiversity data, often providing the first web-accessible information for a species. These results further suggest that existing collections can enable the rapid development of a comprehensive DNA barcode library for the most diverse compartment of terrestrial biodiversity - insects.
Performance of amplicon-based next generation DNA sequencing for diagnostic gene mutation profiling in oncopathology.

PubMed

Sie, Daoud; Snijders, Peter J F; Meijer, Gerrit A; Doeleman, Marije W; van Moorsel, Marinda I H; van Essen, Hendrik F; Eijk, Paul P; Grünberg, Katrien; van Grieken, Nicole C T; Thunnissen, Erik; Verheul, Henk M; Smit, Egbert F; Ylstra, Bauke; Heideman, Daniëlle A M

2014-10-01

Next generation DNA sequencing (NGS) holds promise for diagnostic applications, yet implementation in routine molecular pathology practice requires performance evaluation on DNA derived from routine formalin-fixed paraffin-embedded (FFPE) tissue specimens. The current study presents a comprehensive analysis of TruSeq Amplicon Cancer Panel-based NGS using a MiSeq Personal sequencer (TSACP-MiSeq-NGS) for somatic mutation profiling. TSACP-MiSeq-NGS (testing 212 hotspot mutation amplicons of 48 genes) and a data analysis pipeline were evaluated in a retrospective learning/test set approach (n = 58/n = 45 FFPE-tumor DNA samples) against 'gold standard' high-resolution-melting (HRM)-sequencing for the genes KRAS, EGFR, BRAF and PIK3CA. Next, the performance of the validated test algorithm was assessed in an independent, prospective cohort of FFPE-tumor DNA samples (n = 75). In the learning set, a number of minimum parameter settings was defined to decide whether a FFPE-DNA sample is qualified for TSACP-MiSeq-NGS and for calling mutations. The resulting test algorithm revealed 82% (37/45) compliance to the quality criteria and 95% (35/37) concordant assay findings for KRAS, EGFR, BRAF and PIK3CA with HRM-sequencing (kappa = 0.92; 95% CI = 0.81-1.03) in the test set. Subsequent application of the validated test algorithm to the prospective cohort yielded a success rate of 84% (63/75), and a high concordance with HRM-sequencing (95% (60/63); kappa = 0.92; 95% CI = 0.84-1.01). TSACP-MiSeq-NGS detected 77 mutations in 29 additional genes. TSACP-MiSeq-NGS is suitable for diagnostic gene mutation profiling in oncopathology.
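The kappa statistic quoted in this abstract measures agreement between two methods beyond what chance alone would produce: kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The sketch below computes Cohen's kappa for two lists of categorical calls. The function name and the toy calls are invented for illustration; the per-sample results of the study are not given in the abstract, so this example is not expected to reproduce the reported values.

```python
from collections import Counter


def cohens_kappa(calls_a, calls_b):
    """Cohen's kappa for two equal-length lists of categorical calls."""
    if len(calls_a) != len(calls_b) or not calls_a:
        raise ValueError("need two non-empty lists of equal length")
    n = len(calls_a)
    # Observed agreement: fraction of samples with identical calls.
    observed = sum(a == b for a, b in zip(calls_a, calls_b)) / n
    # Chance agreement: product of the marginal frequencies per category.
    freq_a, freq_b = Counter(calls_a), Counter(calls_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Invented toy data: 37 samples, two discordant calls.
method_1 = ["mut"] * 20 + ["wt"] * 15 + ["mut", "wt"]
method_2 = ["mut"] * 20 + ["wt"] * 15 + ["wt", "mut"]
kappa = cohens_kappa(method_1, method_2)
print(round(kappa, 3))
```

The function divides by zero when both methods assign every sample to the same single category, since chance agreement is then 1; a production version would need to handle that case.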
Direct Detection and Sequencing of Damaged DNA Bases

PubMed Central

2011-01-01

Products of various forms of DNA damage have been implicated in a variety of important biological processes, such as aging, neurodegenerative diseases, and cancer. Therefore, there exists great interest to develop methods for interrogating damaged DNA in the context of sequencing. Here, we demonstrate that single-molecule, real-time (SMRT®) DNA sequencing can directly detect damaged DNA bases in the DNA template - as a by-product of the sequencing method - through an analysis of the DNA polymerase kinetics that are altered by the presence of a modified base. We demonstrate the sequencing of several DNA templates containing products of DNA damage, including 8-oxoguanine, 8-oxoadenine, O6-methylguanine, 1-methyladenine, O4-methylthymine, 5-hydroxycytosine, 5-hydroxyuracil, 5-hydroxymethyluracil, or thymine dimers, and show that these base modifications can be readily detected with single-modification resolution and DNA strand specificity. We characterize the distinct kinetic signatures generated by these DNA base modifications. PMID:22185597
Direct detection and sequencing of damaged DNA bases.

PubMed

Clark, Tyson A; Spittle, Kristi E; Turner, Stephen W; Korlach, Jonas

2011-12-20

Products of various forms of DNA damage have been implicated in a variety of important biological processes, such as aging, neurodegenerative diseases, and cancer. Therefore, there exists great interest to develop methods for interrogating damaged DNA in the context of sequencing. Here, we demonstrate that single-molecule, real-time (SMRT®) DNA sequencing can directly detect damaged DNA bases in the DNA template - as a by-product of the sequencing method - through an analysis of the DNA polymerase kinetics that are altered by the presence of a modified base. We demonstrate the sequencing of several DNA templates containing products of DNA damage, including 8-oxoguanine, 8-oxoadenine, O6-methylguanine, 1-methyladenine, O4-methylthymine, 5-hydroxycytosine, 5-hydroxyuracil, 5-hydroxymethyluracil, or thymine dimers, and show that these base modifications can be readily detected with single-modification resolution and DNA strand specificity. We characterize the distinct kinetic signatures generated by these DNA base modifications.
Kilo-sequencing: an ordered strategy for rapid DNA sequence data acquisition.

PubMed Central

Barnes, W M; Bevan, M

1983-01-01

A strategy for rapid DNA sequence acquisition in an ordered, nonrandom manner, while retaining all of the conveniences of the dideoxy method with M13 transducing phage DNA template, is described. Target DNA 3 to 14 kb in size can be stably carried by our M13 vectors. Suitable targets are stretches of DNA which lack an enzyme recognition site which is unique on our cloning vectors and adjacent to the sequencing primer; current sites that are so useful when lacking are Pst, Xba, HindIII, BglII, EcoRI. By an in vitro procedure, we cut RF DNA once randomly and once specifically, to create thousands of deletions which start at the unique restriction site adjacent to the dideoxy sequencing primer and extend various distances across the target DNA. Phage carrying a desired size of deletions, whose DNA as template will give rise to DNA sequence data in a desired location along the target DNA, may be purified by electrophoresis alive on agarose gels. Phage running in the same location on the agarose gel thus conveniently give rise to nucleotide sequence data from the same kilobase of target DNA. PMID:6298723
The Past, Present, and Future of Human Centromere Genomics

PubMed Central

Aldrup-MacDonald, Megan E.; Sullivan, Beth A.

2014-01-01

The centromere is the chromosomal locus essential for chromosome inheritance and genome stability. Human centromeres are located at repetitive alpha satellite DNA arrays that compose approximately 5% of the genome. Contiguous alpha satellite DNA sequence is absent from the assembled reference genome, limiting current understanding of centromere organization and function. Here, we review the progress in centromere genomics spanning the discovery of the sequence to its molecular characterization and the work done during the Human Genome Project era to elucidate alpha satellite structure and sequence variation. We discuss exciting recent advances in alpha satellite sequence assembly that have provided important insight into the abundance and complex organization of this sequence on human chromosomes. In light of these new findings, we offer perspectives for future studies of human centromere assembly and function. PMID:24683489
Detecting SNPs and estimating allele frequencies in clonal bacterial populations by sequencing pooled DNA.

PubMed

Holt, Kathryn E; Teo, Yik Y; Li, Heng; Nair, Satheesh; Dougan, Gordon; Wain, John; Parkhill, Julian

2009-08-15

Here, we present a method for estimating the frequencies of SNP alleles present within pooled samples of DNA using high-throughput short-read sequencing. The method was tested on real data from six strains of the highly monomorphic pathogen Salmonella Paratyphi A, sequenced individually and in a pool. A variety of read mapping and quality-weighting procedures were tested to determine the optimal parameters, which afforded ≥80% sensitivity of SNP detection and strong correlation with true SNP frequency at pool-wide read depth of 40x, declining only slightly at read depths 20-40x. The method was implemented in Perl and relies on the open-source software Maq for read mapping and SNP calling. The Perl script is freely available from ftp://ftp.sanger.ac.uk/pub/pathogens/pools/.
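As a rough illustration of quality-weighted allele frequency estimation at a single site, the sketch below weights each base call by its probability of being correct. It is not the authors' Perl/Maq implementation, whose weighting scheme the abstract does not specify. The weight 1 - 10^(-Q/10) follows from the standard Phred quality definition; the function name and the toy pileup are assumptions.

```python
from collections import defaultdict


def allele_frequencies(pileup):
    """Estimate allele frequencies at one site.

    pileup: iterable of (base, phred_quality) pairs, one per read.
    Each call is weighted by 1 - 10 ** (-Q / 10), its probability of
    being correct under the Phred definition (an assumed weighting).
    """
    weights = defaultdict(float)
    for base, qual in pileup:
        weights[base] += 1.0 - 10 ** (-qual / 10)
    total = sum(weights.values())
    if total == 0:
        return {}
    return {base: w / total for base, w in weights.items()}


# Toy pileup at 40x depth: 30 reads support A, 10 support G, all at Q30.
pileup = [("A", 30)] * 30 + [("G", 30)] * 10
freqs = allele_frequencies(pileup)
print(freqs)
```

With uniform qualities the estimate reduces to simple read counting; the weighting only matters when low-quality calls support one allele more than another.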
Identification and analysis of pig chimeric mRNAs using RNA sequencing data

PubMed Central

2012-01-01

Background Gene fusion is ubiquitous over the course of evolution. It is expected to increase the diversity and complexity of transcriptomes and proteomes through chimeric sequence segments or altered regulation. However, chimeric mRNAs in pigs remain unclear. Here we identified some chimeric mRNAs in pigs and analyzed the expression of them across individuals and breeds using RNA-sequencing data. Results The present study identified 669 putative chimeric mRNAs in pigs, of which 251 chimeric candidates were detected in a set of RNA-sequencing data. The 618 candidates had clear trans-splicing sites, 537 of which obeyed the canonical GU-AG splice rule. Only two putative pig chimera variants whose fusion junction was overlapped with that of a known human chimeric mRNA were found. A set of unique chimeric events were considered middle variances in the expression across individuals and breeds, and revealed non-significant variance between sexes. Furthermore, the genomic region of the 5′ partner gene shares a similar DNA sequence with that of the 3′ partner gene for 458 putative chimeric mRNAs. The 81 of those shared DNA sequences significantly matched the known DNA-binding motifs in the JASPAR CORE database. Four DNA motifs shared in parental genomic regions had significant similarity with known human CTCF binding sites. Conclusions The present study provided detailed information on some pig chimeric mRNAs. We proposed a model that trans-acting factors, such as CTCF, induced the spatial organisation of parental genes to the same transcriptional factory so that parental genes were coordinatively transcribed to give birth to chimeric mRNAs. PMID:22925561
Silicene nanoribbon as a new DNA sequencing device

NASA Astrophysics Data System (ADS)

Alesheikh, Sara; Shahtahmassebi, Nasser; Roknabadi, Mahmood Rezaee; Pilevar Shahri, Raheleh

2018-02-01

The importance of DNA sequencing in many fields has driven the search for fast and cheap methods. Nanotechnology contributes to this development by introducing nanostructures that can be used for DNA sequencing. In this work we study the interaction between a zigzag silicene nanoribbon and DNA nucleobases using DFT and the non-equilibrium Green's function approach, to investigate the possibility of using zigzag silicene nanoribbons as a biosensor for DNA sequencing.
Isolation and characterization of target sequences of the chicken CdxA homeobox gene.

PubMed Central

Margalit, Y; Yarus, S; Shapira, E; Gruenbaum, Y; Fainsod, A

1993-01-01

The DNA binding specificity of the chicken homeodomain protein CDXA was studied. Using a CDXA-glutathione-S-transferase fusion protein, DNA fragments containing the binding site for this protein were isolated. The sources of DNA were oligonucleotides with random sequence and chicken genomic DNA. The DNA fragments isolated were sequenced and tested in DNA binding assays. Sequencing revealed that most DNA fragments are AT rich which is a common feature of homeodomain binding sites. By electrophoretic mobility shift assays it was shown that the different target sequences isolated bind to the CDXA protein with different affinities. The specific sequences bound by the CDXA protein in the genomic fragments isolated, were determined by DNase I footprinting. From the footprinted sequences, the CDXA consensus binding site was determined. The CDXA protein binds the consensus sequence A, A/T, T, A/T, A, T, A/G. The CAUDAL binding site in the ftz promoter is also included in this consensus sequence. When tested, some of the genomic target sequences were capable of enhancing the transcriptional activity of reporter plasmids when introduced into CDXA expressing cells. This study determined the DNA sequence specificity of the CDXA protein and it also shows that this protein can further activate transcription in cells in culture. PMID:7909943
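The consensus A, A/T, T, A/T, A, T, A/G reported in this abstract maps directly onto a regular expression, which gives a quick way to scan a sequence for candidate sites. The sketch below searches the forward strand only and reports overlapping matches; the function name and the test sequence are invented for illustration, and a sequence match says nothing about binding affinity.

```python
import re

# Consensus from the abstract: A, A/T, T, A/T, A, T, A/G.
# The lookahead lets overlapping candidate sites be reported.
CDXA_CONSENSUS = re.compile(r"(?=(A[AT]T[AT]AT[AG]))")


def find_sites(seq):
    """Return (0-based start, matched 7-mer) for each forward-strand match."""
    return [(m.start(), m.group(1)) for m in CDXA_CONSENSUS.finditer(seq.upper())]


# Invented test sequence containing two sites that fit the consensus.
sites = find_sites("ggcATTTATGccAATAATAgg")
print(sites)
```

To cover both strands, the same scan could be repeated on the reverse complement of the input.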

Sequence periodicity in nucleosomal DNA and intrinsic curvature.

PubMed

Nair, T Murlidharan

2010-05-17

Most eukaryotic DNA contained in the nucleus is packaged by wrapping DNA around histone octamers. Histones are ubiquitous and bind most regions of chromosomal DNA. In order to achieve smooth wrapping of the DNA around the histone octamer, the DNA duplex should be able to deform and should possess intrinsic curvature. The deformability of DNA is a result of the non-parallelness of base pair stacks. The stacking interaction between base pairs is sequence dependent. The higher the stacking energy the more rigid the DNA helix, thus it is natural to expect that sequences that are involved in wrapping around the histone octamer should be unstacked and possess intrinsic curvature. Intrinsic curvature has been shown to be dictated by the periodic recurrence of certain dinucleotides. Several genome-wide studies directed towards mapping of nucleosome positions have revealed periodicity associated with certain stretches of sequences. In the current study, these sequences have been analyzed with a view to understanding their sequence-dependent structures. Higher order DNA structures and the distribution of molecular bend loci associated with 146-base nucleosome core DNA sequences from C. elegans and chicken have been analyzed using the theoretical model for DNA curvature. The curvature dispersion calculated by cyclically permuting the sequences revealed that the molecular bend loci were delocalized throughout the nucleosome core region and had varying degrees of intrinsic curvature. The higher order structures associated with nucleosomes of C. elegans and chicken calculated from the sequences revealed heterogeneity with respect to the deviation of the DNA axis. The results point to the possibility that context-dependent curvature of varying degrees is associated with nucleosomal DNA.
Validation of the Hirst-Type Spore Trap for Simultaneous Monitoring of Prokaryotic and Eukaryotic Biodiversities in Urban Air Samples by Next-Generation Sequencing.

PubMed

Núñez, Andrés; Amo de Paz, Guillermo; Ferencova, Zuzana; Rastrojo, Alberto; Guantes, Raúl; García, Ana M; Alcamí, Antonio; Gutiérrez-Bustillo, A Montserrat; Moreno, Diego A

2017-07-01

Pollen, fungi, and bacteria are the main microscopic biological entities present in outdoor air, causing allergy symptoms and disease transmission and having a significant role in atmosphere dynamics. Despite their relevance, a method for monitoring simultaneously these biological particles in metropolitan environments has not yet been developed. Here, we assessed the use of the Hirst-type spore trap to characterize the global airborne biota by high-throughput DNA sequencing, selecting regions of the 16S rRNA gene and internal transcribed spacer for the taxonomic assignment. We showed that aerobiological communities are well represented by this approach. The operational taxonomic units (OTUs) of two traps working synchronically compiled >87% of the total relative abundance for bacterial diversity collected in each sampler, >89% for fungi, and >97% for pollen. We found a good correspondence between traditional characterization by microscopy and genetic identification, obtaining more-accurate taxonomic assignments and detecting a greater diversity using the latter. We also demonstrated that DNA sequencing accurately detects differences in biodiversity between samples. We concluded that high-throughput DNA sequencing applied to aerobiological samples obtained with Hirst spore traps provides reliable results and can be easily implemented for monitoring prokaryotic and eukaryotic entities present in the air of urban areas. IMPORTANCE Detection, monitoring, and characterization of the wide diversity of biological entities present in the air are difficult tasks that require time and expertise in different disciplines. We have evaluated the use of the Hirst spore trap (an instrument broadly employed in aerobiological studies) to detect and identify these organisms by DNA-based analyses. 
Our results showed a consistent collection of DNA and a good concordance with traditional methods for identification, suggesting that these devices can be used as a tool for continuous monitoring of the airborne biodiversity, improving taxonomic resolution and characterization together. They are also suitable for acquiring novel DNA amplicon-based information in order to gain a better understanding of the biological particles present in a scarcely known environment such as the air. Copyright © 2017 American Society for Microbiology.
Validation of the Hirst-Type Spore Trap for Simultaneous Monitoring of Prokaryotic and Eukaryotic Biodiversities in Urban Air Samples by Next-Generation Sequencing

PubMed Central

Núñez, Andrés; Amo de Paz, Guillermo; Ferencova, Zuzana; Rastrojo, Alberto; Guantes, Raúl; García, Ana M.; Alcamí, Antonio; Gutiérrez-Bustillo, A. Montserrat

2017-01-01

ABSTRACT Pollen, fungi, and bacteria are the main microscopic biological entities present in outdoor air, causing allergy symptoms and disease transmission and having a significant role in atmosphere dynamics. Despite their relevance, a method for monitoring simultaneously these biological particles in metropolitan environments has not yet been developed. Here, we assessed the use of the Hirst-type spore trap to characterize the global airborne biota by high-throughput DNA sequencing, selecting regions of the 16S rRNA gene and internal transcribed spacer for the taxonomic assignment. We showed that aerobiological communities are well represented by this approach. The operational taxonomic units (OTUs) of two traps working synchronically compiled >87% of the total relative abundance for bacterial diversity collected in each sampler, >89% for fungi, and >97% for pollen. We found a good correspondence between traditional characterization by microscopy and genetic identification, obtaining more-accurate taxonomic assignments and detecting a greater diversity using the latter. We also demonstrated that DNA sequencing accurately detects differences in biodiversity between samples. We concluded that high-throughput DNA sequencing applied to aerobiological samples obtained with Hirst spore traps provides reliable results and can be easily implemented for monitoring prokaryotic and eukaryotic entities present in the air of urban areas. IMPORTANCE Detection, monitoring, and characterization of the wide diversity of biological entities present in the air are difficult tasks that require time and expertise in different disciplines. We have evaluated the use of the Hirst spore trap (an instrument broadly employed in aerobiological studies) to detect and identify these organisms by DNA-based analyses. 
Our results showed a consistent collection of DNA and a good concordance with traditional methods for identification, suggesting that these devices can be used as a tool for continuous monitoring of the airborne biodiversity, improving taxonomic resolution and characterization together. They are also suitable for acquiring novel DNA amplicon-based information in order to gain a better understanding of the biological particles present in a scarcely known environment such as the air. PMID:28455334
The construction and partial characterization of plasmids containing complementary DNA sequences to human calcitonin precursor polyprotein.

PubMed Central

Allison, J; Hall, L; MacIntyre, I; Craig, R K

1981-01-01

(1) Total poly(A)-containing RNA isolated from human thyroid medullary carcinoma tissue was shown to direct the synthesis in the wheat germ cell-free system of a major (Mr 21000) and several minor forms of human calcitonin precursor polyproteins. Evidence for processing of these precursor(s) by the wheat germ cell-free system is also presented. (2) A small complementary DNA (cDNA) plasmid library has been constructed in the PstI site of the plasmid pAT153, using total human thyroid medullary carcinoma poly(A)-containing RNA as the starting material. (3) Plasmids containing abundant cDNA sequences were selected by hybridization in situ, and two of these (phT-B3 and phT-B6) were characterized by hybridization-translation and restriction analysis. Each was shown to contain human calcitonin precursor polyprotein cDNA sequences. (4) RNA blotting techniques demonstrate that the human calcitonin precursor polyprotein is encoded within a mRNA containing 1000 bases. (5) The results demonstrate that human calcitonin is synthesized as a precursor polyprotein. PMID:6896146
Cloning and expression of a cDNA coding for catalase from zebrafish (Danio rerio).

PubMed

Ken, C F; Lin, C T; Wu, J L; Shaw, J F

2000-06-01

A full-length complementary DNA (cDNA) clone encoding a catalase was amplified by the rapid amplification of cDNA ends-polymerase chain reaction (RACE-PCR) technique from zebrafish (Danio rerio) mRNA. Nucleotide sequence analysis of this cDNA clone revealed that it comprised a complete open reading frame coding for 526 amino acid residues and that the encoded protein had a molecular mass of 59 654 Da. The deduced amino acid sequence showed high similarity with the sequences of catalase from swine (86.9%), mouse (85.8%), rat (85%), human (83.7%), fruit fly (75.6%), nematode (71.1%), and yeast (58.6%). The amino acid residues for secondary structures are apparently conserved as they are present in other mammal species. Furthermore, the coding region of zebrafish catalase was introduced into an expression vector, pET-20b(+), and transformed into Escherichia coli expression host BL21(DE3)pLysS. A 60-kDa active catalase protein was expressed and detected by Coomassie blue staining as well as activity staining on polyacrylamide gel following electrophoresis.
Mammalian DNA enriched for replication origins is enriched for snap-back sequences.

PubMed

Zannis-Hadjopoulos, M; Kaufmann, G; Martin, R G

1984-11-15

Using the instability of replication loops as a method for the isolation of double-stranded nascent DNA, extruded DNA enriched for replication origins was obtained and denatured. Snap-back DNA, single-stranded DNA with inverted repeats (palindromic sequences), reassociates rapidly into stem-loop structures with zero-order kinetics when conditions are changed from denaturing to renaturing, and can be assayed by chromatography on hydroxyapatite. Origin-enriched nascent DNA strands from mouse, rat and monkey cells growing either synchronously or asynchronously were purified and assayed for the presence of snap-back sequences. The results show that origin-enriched DNA is also enriched for snap-back sequences, implying that some origins for mammalian DNA replication contain or lie near palindromic sequences.
Molecular characterization of Hepatozoon sp. in cats from São Luís Island, Maranhão, Northeastern Brazil.

PubMed

de Bortoli, Caroline P; André, Marcos R; Braga, Maria do Socorro C; Machado, Rosangela Zacarias

2011-10-01

Few molecular studies have characterized Hepatozoon species among domestic and wild felids. The present work aimed to characterize molecularly the presence of Hepatozoon sp. DNA in cat blood samples from São Luís Island, Maranhão state, Northeastern Brazil. EDTA-whole blood samples were collected from 200 domestic cats with access to outdoor and wooded areas in São Luís, Maranhão, Brazil. Each sample of extracted DNA was used as a template in PCR reactions aiming to amplify a partial sequence of the 18S rRNA gene of Hepatozoon spp. We also performed sequence alignment to establish the identity of the parasite species infecting these animals using DNA sequences based on 18S rRNA. Of the 200 sampled cats, Hepatozoon DNA was found in only one animal (0.5%). This Hepatozoon DNA showed 97% identity with Hepatozoon felis isolates 1 and 2 from Spain. In the phylogenetic tree, it fell in the same clade as the H. felis isolates. Our findings suggest that more than one species of Hepatozoon could infect felids in Brazil.
Cytogenetic and Sequence Analyses of Mitochondrial DNA Insertions in Nuclear Chromosomes of Maize

PubMed Central

Lough, Ashley N.; Faries, Kaitlyn M.; Koo, Dal-Hoe; Hussain, Abid; Roark, Leah M.; Langewisch, Tiffany L.; Backes, Teresa; Kremling, Karl A. G.; Jiang, Jiming; Birchler, James A.; Newton, Kathleen J.

2015-01-01

The transfer of mitochondrial DNA (mtDNA) into nuclear genomes is a regularly occurring process that has been observed in many species. Few studies, however, have focused on the variation of nuclear-mtDNA sequences (NUMTs) within a species. This study examined mtDNA insertions within chromosomes of a diverse set of Zea mays ssp. mays (maize) inbred lines by the use of fluorescence in situ hybridization. A relatively large NUMT on the long arm of chromosome 9 (9L) was identified at approximately the same position in four inbred lines (B73, M825, HP301, and Oh7B). Further examination of the similarly positioned 9L NUMT in two lines, B73 and M825, indicated that the large size of these sites is due to the presence of a majority of the mitochondrial genome; however, only portions of this NUMT (∼252 kb total) were found in the publically available B73 nuclear sequence for chromosome 9. Fiber-fluorescence in situ hybridization analysis estimated the size of the B73 9L NUMT to be ∼1.8 Mb and revealed that the NUMT is methylated. Two regions of mtDNA (2.4 kb and 3.3 kb) within the 9L NUMT are not present in the B73 mitochondrial NB genome; however, these 2.4-kb and 3.3-kb segments are present in other Zea mitochondrial genomes, including that of Zea mays ssp. parviglumis, a progenitor of domesticated maize. PMID:26333837
New record of Ascaridia nymphii (Secernentea: Ascaridiidae) from macaw parrot, Ara chloroptera, in China.

PubMed

Yang, Fang; Zhang, Pan; Shi, Xianli; Li, Kangxin; Wang, Minwei; Fu, Yeqi; Yan, Xinxin; Hang, Jianxiong; Li, Guoqing

2018-06-01

The present study was performed to identify the species of ascarids from the macaw parrot, Ara chloroptera, in China. A total of 6 ascarids (3 males and 3 females) were collected from the feces of 3 macaws at Guangzhou Zoo in Guangdong Province, China. Their morphological characteristics and dimensions were observed under a light microscope, and their genetic characters were analyzed using partial 18S rDNA, ITS rDNA and nad4 gene sequences. Results showed that none of the worms had interlabia, whereas male worms had two alate spicules, a well-developed precloacal sucker and a tail with ventrolateral caudal alae and 11 pairs of papillae. The partial 18S rDNA, ITS rDNA and nad4 sequences were 831 bp, 1015 bp and 394 bp in length, respectively. They showed the highest similarity of 99.8% (18S rDNA) with Ascaridia nymphii, 93.8% identity (ITS rDNA) with A. columbae and 98.5% to 99.5% identity (nad4) with Ascaridia sp. from an infected parrot. All Ascaridia nematodes from the macaws clustered into one clade and formed a monophyletic group of Ascaridia with A. columbae and A. galli in two phylogenetic trees. By combining the morphological data with sequencing data from three loci, the present Ascaridia species was identified as Ascaridia nymphii, which is the first record of A. nymphii from a macaw parrot in China. Copyright © 2018 Elsevier B.V. All rights reserved.
Biorecognition by DNA oligonucleotides after Exposure to Photoresists and Resist Removers

PubMed Central

Dean, Stacey L.; Morrow, Thomas J.; Patrick, Sue; Li, Mingwei; Clawson, Gary; Mayer, Theresa S.; Keating, Christine D.

2013-01-01

Combining biological molecules with integrated circuit technology is of considerable interest for next generation sensors and biomedical devices. Current lithographic microfabrication methods, however, were developed for compatibility with silicon technology rather than bioorganic molecules and consequently it cannot be assumed that biomolecules will remain attached and intact during on-chip processing. Here, we evaluate the effects of three common photoresists (Microposit S1800 series, PMGI SF6, and Megaposit SPR 3012) and two photoresist removers (acetone and 1165 remover) on the ability of surface-immobilized DNA oligonucleotides to selectively recognize their reverse-complementary sequence. Two common DNA immobilization methods were compared: adsorption of 5′-thiolated sequences directly to gold nanowires and covalent attachment of 5′-thiolated sequences to surface amines on silica coated nanowires. We found that acetone had deleterious effects on selective hybridization as compared to 1165 remover, presumably due to incomplete resist removal. Use of the PMGI photoresist, which involves a high temperature bake step, was detrimental to the later performance of nanowire-bound DNA in hybridization assays, especially for DNA attached via thiol adsorption. The other two photoresists did not substantially degrade DNA binding capacity or selectivity for complementary DNA sequences. To determine if the lithographic steps caused more subtle damage, we also tested oligonucleotides containing a single base mismatch. Finally, a two-step photolithographic process was developed and used in combination with dielectrophoretic nanowire assembly to produce an array of doubly-contacted, electrically isolated individual nanowire components on a chip. Post-fabrication fluorescence imaging indicated that nanowire-bound DNA was present and able to selectively bind complementary strands. PMID:23952639
Lesion bypass activity of DNA polymerase θ (POLQ) is an intrinsic property of the pol domain and depends on unique sequence inserts.

PubMed

Hogg, Matthew; Seki, Mineaki; Wood, Richard D; Doublié, Sylvie; Wallace, Susan S

2011-01-21

DNA polymerase θ (POLQ, polθ) is a large, multidomain DNA polymerase encoded in higher eukaryotic genomes. It is important for maintaining genetic stability in cells and helping protect cells from DNA damage caused by ionizing radiation. POLQ contains an N-terminal helicase-like domain, a large central domain of indeterminate function, and a C-terminal polymerase domain with sequence similarity to the A-family of DNA polymerases. The enzyme has several unique properties, including low fidelity and the ability to insert and extend past abasic sites and thymine glycol lesions. It is not known whether the abasic site bypass activity is an intrinsic property of the polymerase domain or whether helicase activity is also required. Three "insertion" sequence elements present in POLQ are not found in any other A-family DNA polymerase, and it has been proposed that they may lend some unique properties to POLQ. Here, we analyzed the activity of the DNA polymerase in the absence of each sequence insertion. We found that the pol domain is capable of highly efficient bypass of abasic sites in the absence of the helicase-like or central domains. Insertion 1 increases the processivity of the polymerase but has little, if any, bearing on the translesion synthesis properties of the enzyme. However, removal of insertions 2 and 3 reduces activity on undamaged DNA and completely abrogates the ability of the enzyme to bypass abasic sites or thymine glycol lesions. Copyright © 2010 Elsevier Ltd. All rights reserved.
DNA sequence determinants controlling affinity, stability and shape of DNA complexes bound by the nucleoid protein Fis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hancock, Stephen P.; Stella, Stefano; Cascio, Duilio

The abundant Fis nucleoid protein selectively binds poorly related DNA sequences with high affinities to regulate diverse DNA reactions. Fis binds DNA primarily through DNA backbone contacts and selects target sites by reading conformational properties of DNA sequences, most prominently intrinsic minor groove widths. High-affinity binding requires Fis-stabilized DNA conformational changes that vary depending on DNA sequence. In order to better understand the molecular basis for high affinity site recognition, we analyzed the effects of DNA sequence within and flanking the core Fis binding site on binding affinity and DNA structure. X-ray crystal structures of Fis-DNA complexes containing variable sequences in the noncontacted center of the binding site or variations within the major groove interfaces show that the DNA can adapt to the Fis dimer surface asymmetrically. We show that the presence and position of pyrimidine-purine base steps within the major groove interfaces affect both local DNA bending and minor groove compression to modulate affinities and lifetimes of Fis-DNA complexes. Sequences flanking the core binding site also modulate complex affinities, lifetimes, and the degree of local and global Fis-induced DNA bending. In particular, a G immediately upstream of the 15 bp core sequence inhibits binding and bending, and A-tracts within the flanking base pairs increase both complex lifetimes and global DNA curvatures. Taken together, our observations support a revised DNA motif specifying high-affinity Fis binding and highlight the range of conformations that Fis-bound DNA can adopt. Lastly, the affinities and DNA conformations of individual Fis-DNA complexes are likely to be tailored to their context-specific biological functions.
DNA sequence determinants controlling affinity, stability and shape of DNA complexes bound by the nucleoid protein Fis

DOE PAGES

Hancock, Stephen P.; Stella, Stefano; Cascio, Duilio; ...

2016-03-09

The abundant Fis nucleoid protein selectively binds poorly related DNA sequences with high affinities to regulate diverse DNA reactions. Fis binds DNA primarily through DNA backbone contacts and selects target sites by reading conformational properties of DNA sequences, most prominently intrinsic minor groove widths. High-affinity binding requires Fis-stabilized DNA conformational changes that vary depending on DNA sequence. In order to better understand the molecular basis for high affinity site recognition, we analyzed the effects of DNA sequence within and flanking the core Fis binding site on binding affinity and DNA structure. X-ray crystal structures of Fis-DNA complexes containing variable sequences in the noncontacted center of the binding site or variations within the major groove interfaces show that the DNA can adapt to the Fis dimer surface asymmetrically. We show that the presence and position of pyrimidine-purine base steps within the major groove interfaces affect both local DNA bending and minor groove compression to modulate affinities and lifetimes of Fis-DNA complexes. Sequences flanking the core binding site also modulate complex affinities, lifetimes, and the degree of local and global Fis-induced DNA bending. In particular, a G immediately upstream of the 15 bp core sequence inhibits binding and bending, and A-tracts within the flanking base pairs increase both complex lifetimes and global DNA curvatures. Taken together, our observations support a revised DNA motif specifying high-affinity Fis binding and highlight the range of conformations that Fis-bound DNA can adopt. Lastly, the affinities and DNA conformations of individual Fis-DNA complexes are likely to be tailored to their context-specific biological functions.
Developmental validation of a Nextera XT mitogenome Illumina MiSeq sequencing method for high-quality samples.

PubMed

Peck, Michelle A; Sturk-Andreaggi, Kimberly; Thomas, Jacqueline T; Oliver, Robert S; Barritt-Ross, Suzanne; Marshall, Charla

2018-05-01

Generating mitochondrial genome (mitogenome) data from reference samples in a rapid and efficient manner is critical to harnessing the greater power of discrimination of the entire mitochondrial DNA (mtDNA) marker. The method of long-range target enrichment, Nextera XT library preparation, and Illumina sequencing on the MiSeq is a well-established technique for generating mitogenome data from high-quality samples. To this end, a validation was conducted for this mitogenome method processing up to 24 samples simultaneously along with analysis in the CLC Genomics Workbench and utilizing the AQME (AFDIL-QIAGEN mtDNA Expert) tool to generate forensic profiles. This validation followed the Federal Bureau of Investigation's Quality Assurance Standards (QAS) for forensic DNA testing laboratories and the Scientific Working Group on DNA Analysis Methods (SWGDAM) validation guidelines. The evaluation of control DNA, non-probative samples, blank controls, mixtures, and nonhuman samples demonstrated the validity of this method. Specifically, the sensitivity was established at ≥25 pg of nuclear DNA input for accurate mitogenome profile generation. Unreproducible low-level variants were observed in samples with low amplicon yields. Further, variant quality was shown to be a useful metric for identifying sequencing error and crosstalk. Success of this method was demonstrated with a variety of reference sample substrates and extract types. These studies further demonstrate the advantages of using NGS techniques by highlighting the quantitative nature of heteroplasmy detection. The results presented herein from more than 175 samples processed in ten sequencing runs, show this mitogenome sequencing method and analysis strategy to be valid for the generation of reference data. Copyright © 2018 Elsevier B.V. All rights reserved.
Specific minor groove solvation is a crucial determinant of DNA binding site recognition

PubMed Central

Harris, Lydia-Ann; Williams, Loren Dean; Koudelka, Gerald B.

2014-01-01

The DNA sequence preferences of nearly all sequence specific DNA binding proteins are influenced by the identities of bases that are not directly contacted by protein. Discrimination between non-contacted base sequences is commonly based on the differential abilities of DNA sequences to allow narrowing of the DNA minor groove. However, the factors that govern the propensity of minor groove narrowing are not completely understood. Here we show that the differential abilities of various DNA sequences to support formation of a highly ordered and stable minor groove solvation network are a key determinant of non-contacted base recognition by a sequence-specific binding protein. In addition, disrupting the solvent network in the non-contacted region of the binding site alters the protein's ability to recognize contacted base sequences at positions 5–6 bases away. This observation suggests that DNA solvent interactions link contacted and non-contacted base recognition by the protein. PMID:25429976
Molecular Detection of Leishmania DNA in Wild-Caught Phlebotomine Sand Flies (Diptera: Psychodidae) From a Cave in the State of Minas Gerais, Brazil.

PubMed

Carvalho, G M L; Brazil, R P; Rêgo, F D; Ramos, M C N F; Zenóbio, A P L A; Andrade Filho, J D

2017-01-01

Leishmania spp. are distributed throughout the world, and different species are associated with varying degrees of disease severity. In Brazil, Leishmania transmission involves several species of phlebotomine sand flies that are closely associated with different parasites and reservoirs, thereby giving rise to different transmission cycles. Infection occurs during the bloodmeals that sand flies obtain from a variety of wild and domestic animals, and sometimes from humans. The present study focused on detection of Leishmania DNA in phlebotomine sand flies from a cave in the state of Minas Gerais. Detection of Leishmania in female sand flies was performed with ITS1 PCR-RFLP (internal transcribed spacer 1) using the HaeIII enzyme and genetic sequencing of the SSUrRNA target. The survey of Leishmania DNA was carried out on 232 pools and the parasite DNA was detected in four: one pool of Lutzomyia cavernicola (Costa Lima, 1932), infected with Le. infantum (ITS1 PCR-RFLP), two pools of Evandromyia sallesi (Galvão & Coutinho, 1939), both infected with Leishmania braziliensis complex (SSUrRNA genetic sequencing analysis), and one pool of Sciopemyia sordellii (Shannon & Del Ponte, 1927), infected with subgenus Leishmania (SSUrRNA genetic sequencing analysis). The present study identified the species for Leishmania DNA detected in four pools of sand flies, all of which were captured inside the cave. These results represent the first molecular detection of Lu. cavernicola with Le. infantum DNA, Sc. sordellii with subgenus Leishmania DNA, and Ev. sallesi with Leishmania braziliensis complex DNA. The infection rate in females captured for this study was 0.17%. © The Authors 2016. Published by Oxford University Press on behalf of Entomological Society of America. All rights reserved.
A Method for Preparing DNA Sequencing Templates Using a DNA-Binding Microplate

PubMed Central

Yang, Yu; Hebron, Haroun R.; Hang, Jun

2009-01-01

A DNA-binding matrix was immobilized on the surface of a 96-well microplate and used for plasmid DNA preparation for DNA sequencing. The same DNA-binding plate was used for bacterial growth, cell lysis, DNA purification, and storage. In a single step using one buffer, bacterial cells were lysed by enzymes, and released DNA was captured on the plate simultaneously. After two wash steps, DNA was eluted and stored in the same plate. Inclusion of phosphates in the culture medium was found to enhance the yield of plasmid significantly. Purified DNA samples were used successfully in DNA sequencing with high consistency and reproducibility. Eleven vectors and nine libraries were tested using this method. In 10 μl sequencing reactions using 3 μl sample and 0.25 μl BigDye Terminator v3.1, the results from a 3730xl sequencer gave a success rate of 90–95% and read-lengths of 700 bases or more. The method is fully automatable and convenient for manual operation as well. It enables reproducible, high-throughput, rapid production of DNA with purity and yields sufficient for high-quality DNA sequencing at a substantially reduced cost. PMID:19568455
Dendritic Cell-Based Immunotherapy of Breast Cancer: Modulation by CpG DNA

DTIC Science & Technology

2005-09-01

tumor-associated antigens and bacterial DNA oligodeoxynucleotides containing unmethylated CpG sequences (CpG DNA) further augment the immune priming...associated antigens by cytotoxic T lymphocytes, and bacterial DNA oligodeoxy- nucleotides containing unmethylated CpG sequences (CpG DNA) can further...further amplify their immunostimulatory capacity and bacterial DNA oligodeoxynucleotides (ODN) containing unmethylated CpG sequences (CpG DNA) provide such
A rapid and cost-effective method for sequencing pooled cDNA clones by using a combination of transposon insertion and Gateway technology.

PubMed

Morozumi, Takeya; Toki, Daisuke; Eguchi-Ogawa, Tomoko; Uenishi, Hirohide

2011-09-01

Large-scale cDNA-sequencing projects require an efficient strategy for mass sequencing. Here we describe a method for sequencing pooled cDNA clones using a combination of transposon insertion and Gateway technology. Our method reduces the number of shotgun clones that are unsuitable for reconstruction of cDNA sequences, and has the advantage of reducing the total costs of the sequencing project.
Biological sequence compression algorithms.

PubMed

Matsumoto, T; Sadakane, K; Imai, H

2000-01-01

Today, more and more DNA sequences are becoming available. The information about DNA sequences is stored in molecular biology databases. The size and importance of these databases will keep growing in the future, so this information must be stored and communicated efficiently. Furthermore, sequence compression can be used to define similarities between biological sequences. Standard compression algorithms such as gzip or compress cannot compress DNA sequences; they only expand them in size. On the other hand, CTW (Context Tree Weighting Method) can compress DNA sequences to less than two bits per symbol. These algorithms do not use the special structures of biological sequences. Two characteristic structures of DNA sequences are known: one is the palindrome (reverse complement) and the other is the approximate repeat. Several algorithms specific to DNA sequences use these structures and can compress them to less than two bits per symbol. In this paper, we improve CTW so that it can exploit the characteristic structures of DNA sequences. Before encoding the next symbol, the algorithm searches for an approximate repeat and a palindrome using hashing and dynamic programming. If there is a palindrome or an approximate repeat of sufficient length, our algorithm represents it by its length and distance. With this preprocessing, the new program achieves a slightly higher compression ratio than existing DNA-oriented compression algorithms. We also describe a new compression algorithm for protein sequences.
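The two ideas in this abstract, the two-bits-per-symbol baseline and the hash-based search for a reverse-complement repeat reported as a length and a position, can be illustrated with a minimal Python sketch. It is not the authors' program: the function names, the seed length k and the exact-match extension are assumptions chosen for illustration, and approximate repeats and the CTW coder itself are left out.

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGT", "TGCA")
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def pack_2bit(seq: str) -> bytes:
    """Baseline encoding: exactly two bits per base."""
    value = 0
    for base in seq:
        value = (value << 2) | CODE[base]
    return value.to_bytes((2 * len(seq) + 7) // 8, "big")


def longest_rc_repeat(seq: str, pos: int, k: int = 4):
    """Find the longest earlier region whose reverse complement matches
    the text starting at pos. Returns (length, start_of_earlier_region),
    or (0, -1) if no seed of length k is found."""
    # Hash every k-mer that lies entirely before pos.
    index = defaultdict(list)
    for s in range(pos - k + 1):
        index[seq[s:s + k]].append(s)

    seed = seq[pos:pos + k]
    if len(seed) < k:
        return (0, -1)

    best = (0, -1)
    for s in index.get(reverse_complement(seed), []):
        end = s + k - 1          # last base of the earlier region
        length = k
        # Extend forward from pos and backward from the seed.
        while (pos + length < len(seq) and end - length >= 0
               and seq[pos + length] == seq[end - length].translate(COMPLEMENT)):
            length += 1
        if length > best[0]:
            best = (length, end - length + 1)
    return best


example = "TTACGGATCA" + "GG" + reverse_complement("ACGGATCA")
length, start = longest_rc_repeat(example, pos=12)
print(length, start, len(pack_2bit(example)))
```

A compressor built on this idea would emit the (length, position) pair only when it costs fewer bits than coding the bases directly; that decision rule is not shown here.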

Mapping Structurally Defined Guanine Oxidation Products along DNA Duplexes: Influence of Local Sequence Context and Endogenous Cytosine Methylation

PubMed Central

2015-01-01

DNA oxidation by reactive oxygen species is nonrandom, potentially leading to accumulation of nucleobase damage and mutations at specific sites within the genome. We now present the first quantitative data for sequence-dependent formation of structurally defined oxidative nucleobase adducts along p53 gene-derived DNA duplexes using a novel isotope labeling-based approach. Our results reveal that local nucleobase sequence context differentially alters the yields of 2,2,4-triamino-2H-oxazol-5-one (Z) and 8-oxo-7,8-dihydro-2′-deoxyguanosine (OG) in double stranded DNA. While both lesions are overproduced within endogenously methylated MeCG dinucleotides and at 5′ Gs in runs of several guanines, the formation of Z (but not OG) is strongly preferred at solvent-exposed guanine nucleobases at duplex ends. Targeted oxidation of MeCG sequences may be caused by a lowered ionization potential of guanine bases paired with MeC and the preferential intercalation of riboflavin photosensitizer adjacent to MeC:G base pairs. Importantly, some of the most frequently oxidized positions coincide with the known p53 lung cancer mutational “hotspots” at codons 245 (GGC), 248 (CGG), and 158 (CGC) respectively, supporting a possible role of oxidative degradation of DNA in the initiation of lung cancer. PMID:24571128
Detecting the borders between coding and non-coding DNA regions in prokaryotes based on recursive segmentation and nucleotide doublets statistics

PubMed Central

2012-01-01

Background: Detecting the borders between coding and non-coding regions is an essential step in genome annotation, and information entropy measures are useful for describing the signals in a genome sequence. However, the accuracy of previous border-finding methods based on entropic segmentation still needs to be improved. Methods: In this study, we first applied a new recursive entropic segmentation method to DNA sequences to obtain preliminary significant cuts. A 22-symbol alphabet is used to capture the differential composition of nucleotide doublets and stop codon patterns along three phases in both DNA strands. This process requires no prior training datasets. Results: Compared with previous segmentation methods, the experimental results on three bacterial genomes, Rickettsia prowazekii, Borrelia burgdorferi and E. coli, show that our approach improves the accuracy of finding the borders between coding and non-coding regions in DNA sequences. Conclusions: This paper presents a new segmentation method for prokaryotes based on Jensen-Rényi divergence with a 22-symbol alphabet. For the three bacterial genomes, our method raised the accuracy of finding the borders between protein coding and non-coding regions in DNA sequences compared with the A12_JR method. PMID:23282225
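As a rough illustration of one segmentation step, the Python sketch below scans every cut point of a sequence and keeps the one that maximizes the Jensen-Rényi divergence between the left and right parts. It is a simplification under stated assumptions: it uses the plain four-letter nucleotide alphabet rather than the 22-symbol alphabet of the abstract, an arbitrary Rényi order alpha = 0.5, length-proportional weights, and no significance test or recursion; all function names are hypothetical.

```python
import math
from collections import Counter


def renyi_entropy(counts: Counter, alpha: float = 0.5) -> float:
    """Renyi entropy (in bits) of the distribution given by symbol counts."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    power_sum = sum((c / total) ** alpha for c in counts.values() if c > 0)
    return math.log2(power_sum) / (1.0 - alpha)


def jensen_renyi(left: str, right: str, alpha: float = 0.5) -> float:
    """Entropy of the pooled segment minus the length-weighted
    entropies of the two parts (non-negative for 0 < alpha < 1)."""
    n_left, n_right = len(left), len(right)
    n = n_left + n_right
    c_left, c_right = Counter(left), Counter(right)
    pooled = c_left + c_right
    return (renyi_entropy(pooled, alpha)
            - (n_left / n) * renyi_entropy(c_left, alpha)
            - (n_right / n) * renyi_entropy(c_right, alpha))


def best_cut(seq: str, alpha: float = 0.5, margin: int = 1):
    """Return (position, divergence) of the cut with maximal divergence."""
    best_pos, best_val = None, -1.0
    for i in range(margin, len(seq) - margin + 1):
        value = jensen_renyi(seq[:i], seq[i:], alpha)
        if value > best_val:
            best_pos, best_val = i, value
    return best_pos, best_val


position, divergence = best_cut("A" * 20 + "C" * 20)
print(position, round(divergence, 3))
```

A recursive version would accept a cut only if its divergence passes some threshold and then repeat the search on each side; the threshold used by the authors is not given in the abstract. The scan above recounts symbols at every position and is quadratic in sequence length, which is acceptable only for a demonstration.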
A novel image encryption algorithm based on the chaotic system and DNA computing

NASA Astrophysics Data System (ADS)

Chai, Xiuli; Gan, Zhihua; Lu, Yang; Chen, Yiran; Han, Daojun

A novel image encryption algorithm using the chaotic system and deoxyribonucleic acid (DNA) computing is presented. Unlike traditional encryption methods, the permutation and diffusion steps of our method operate on a 3D DNA matrix. Firstly, a 3D DNA matrix is obtained through bit plane splitting, bit plane recombination, and DNA encoding of the plain image. Secondly, 3D DNA level permutation based on position sequence group (3DDNALPBPSG) is introduced, and chaotic sequences generated from the chaotic system are employed to permute the positions of the elements of the 3D DNA matrix. Thirdly, 3D DNA level diffusion (3DDNALD) is applied: the confused 3D DNA matrix is split into sub-blocks, and a block-wise XOR operation is performed between each sub-DNA matrix and the key DNA matrix from the chaotic system. Finally, the cipher image is obtained by decoding the diffused DNA matrix. The SHA-256 hash of the plain image is used to calculate the initial values of the chaotic system to resist chosen-plaintext attacks. Experimental results and security analyses show that our scheme is secure against several known attacks, and it can effectively protect the security of the images.
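The building blocks named in this abstract (DNA encoding of bits, XOR at the DNA level, and a SHA-256-derived initial value for the chaotic system) can be sketched in Python as follows. This is an illustrative toy, not the published scheme: the encoding rule 00→A, 01→C, 10→G, 11→T, the logistic map with r = 3.99 standing in for the unnamed chaotic system, and every function name are assumptions, and the 3D matrix construction and the permutation stage are omitted. It should not be used to protect real data.

```python
import hashlib

BASES = "ACGT"  # assumed encoding rule: 00->A, 01->C, 10->G, 11->T


def dna_encode(data: bytes) -> str:
    """Encode each byte as four bases, most significant bit pair first."""
    return "".join(BASES[(byte >> shift) & 3]
                   for byte in data for shift in (6, 4, 2, 0))


def dna_decode(dna: str) -> bytes:
    """Inverse of dna_encode."""
    out = bytearray()
    for i in range(0, len(dna), 4):
        byte = 0
        for base in dna[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


def dna_xor(a: str, b: str) -> str:
    """Base-wise XOR under the assumed encoding rule."""
    return "".join(BASES[BASES.index(x) ^ BASES.index(y)] for x, y in zip(a, b))


def initial_value_from_hash(plain: bytes) -> float:
    """Map the SHA-256 digest of the plain data to a value in (0, 1)."""
    digest = hashlib.sha256(plain).digest()
    return (int.from_bytes(digest[:6], "big") + 1) / (2 ** 48 + 1)


def logistic_keystream(x0: float, n: int, r: float = 3.99) -> bytes:
    """Key bytes from a logistic map (a stand-in chaotic system)."""
    x, out = x0, bytearray()
    for _ in range(n):
        x = r * x * (1.0 - x)
        out.append(int(x * 256) % 256)
    return bytes(out)


plain = bytes([0, 27, 228, 255])            # four hypothetical pixel values
key_dna = dna_encode(logistic_keystream(initial_value_from_hash(plain), len(plain)))
cipher_dna = dna_xor(dna_encode(plain), key_dna)
recovered = dna_decode(dna_xor(cipher_dna, key_dna))
print(cipher_dna, recovered == plain)
```

Because the initial value depends on the plain image, a one-pixel change in the input changes the whole keystream, which is the property the abstract relies on against chosen-plaintext attacks; the receiver therefore needs the hash-derived value as part of the key.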
Detection of DNA Methylation by Whole-Genome Bisulfite Sequencing.

PubMed

Li, Qing; Hermanson, Peter J; Springer, Nathan M

2018-01-01

DNA methylation plays an important role in the regulation of the expression of transposons and genes. Various methods have been developed to assay DNA methylation levels. Bisulfite sequencing is considered to be the "gold standard" for single-base resolution measurement of DNA methylation levels. Coupled with next-generation sequencing, whole-genome bisulfite sequencing (WGBS) allows DNA methylation to be evaluated at a genome-wide scale. Here, we describe a protocol for WGBS in plant species with large genomes. This protocol has been successfully applied to assay genome-wide DNA methylation levels in maize and barley. It has also been successfully coupled with sequence capture technology to assay DNA methylation levels in a targeted set of genomic regions.
Single-Molecule Electrical Random Resequencing of DNA and RNA

NASA Astrophysics Data System (ADS)

Ohshiro, Takahito; Matsubara, Kazuki; Tsutsui, Makusu; Furuhashi, Masayuki; Taniguchi, Masateru; Kawai, Tomoji

2012-07-01

Two paradigm shifts in DNA sequencing technologies--from bulk to single molecules and from optical to electrical detection--are expected to realize label-free, low-cost DNA sequencing that does not require PCR amplification. It will lead to development of high-throughput third-generation sequencing technologies for personalized medicine. Although nanopore devices have been proposed as third-generation DNA-sequencing devices, a significant milestone in these technologies has been attained by demonstrating a novel technique for resequencing DNA using electrical signals. Here we report single-molecule electrical resequencing of DNA and RNA using a hybrid method of identifying single-base molecules via tunneling currents and random sequencing. Our method reads sequences of nine types of DNA oligomers. The complete sequence of 5'-UGAGGUA-3' from the let-7 microRNA family was also identified by creating a composite of overlapping fragment sequences, which was randomly determined using tunneling current conducted by single-base molecules as they passed between a pair of nanoelectrodes.
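The final step described here, building a composite sequence from overlapping fragment reads, can be illustrated with a small greedy overlap merge in Python. The fragment reads below are hypothetical and error-free, the minimum overlap of two bases is an arbitrary choice, and the function names are invented for this sketch; the abstract does not describe the authors' actual compositing procedure, and real tunneling-current reads would need error handling.

```python
def overlap(a: str, b: str, min_len: int = 2) -> int:
    """Length of the longest suffix of a that is a prefix of b
    (0 if shorter than min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(fragments, min_len: int = 2):
    """Repeatedly merge the pair of fragments with the largest overlap."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    # Drop fragments fully contained in another fragment.
    frags = [f for f in frags if not any(f != g and f in g for g in frags)]
    while len(frags) > 1:
        best = (0, None, None)
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:
            break  # nothing left to merge
        merged = frags[i] + frags[j][n:]
        frags = [f for k, f in enumerate(frags) if k not in (i, j)] + [merged]
    return frags


reads = ["GAGG", "UGAG", "GGUA", "AGGU"]  # hypothetical random fragment reads
print(greedy_assemble(reads))
```

Greedy merging can choose the wrong join when a sequence contains repeats longer than the overlap, so for anything beyond short oligomers a consensus over many reads would be the safer design.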
Molecular epidemiologic analysis of a Pneumocystis pneumonia outbreak among renal transplant patients.

PubMed

Urabe, N; Ishii, Y; Hyodo, Y; Aoki, K; Yoshizawa, S; Saga, T; Murayama, S Y; Sakai, K; Homma, S; Tateda, K

2016-04-01

Between 18 November and 3 December 2011, five renal transplant patients at the Department of Nephrology, Toho University Omori Medical Centre, Tokyo, were diagnosed with Pneumocystis pneumonia (PCP). We used molecular epidemiologic methods to determine whether the patients were infected with the same strain of Pneumocystis jirovecii. DNA extracted from the residual bronchoalveolar lavage fluid from the five outbreak cases and from another 20 cases of PCP between 2007 and 2014 were used for multilocus sequence typing to compare the genetic similarity of the P. jirovecii. DNA base sequencing by the Sanger method showed some regions where two bases overlapped and could not be defined. A next-generation sequencer was used to analyse the types and ratios of these overlapping bases. DNA base sequences of P. jirovecii in the bronchoalveolar lavage fluid from four of the five PCP patients in the 2011 outbreak and from another two renal transplant patients who developed PCP in 2013 were highly homologous. The Sanger method revealed 14 genomic regions where two differing DNA bases overlapped and could not be identified. Analyses of the overlapping bases by a next-generation sequencer revealed that the differing types of base were present in almost identical ratios. There is a strong possibility that the PCP outbreak at the Toho University Omori Medical Centre was caused by the same strain of P. jirovecii. Two different types of base present in some regions may be due to P. jirovecii's being a diploid species. Copyright © 2015 European Society of Clinical Microbiology and Infectious Diseases. Published by Elsevier Ltd. All rights reserved.
Multiple horizontal transfers of nuclear ribosomal genes between phylogenetically distinct grass lineages.

PubMed

Mahelka, Václav; Krak, Karol; Kopecký, David; Fehrer, Judith; Šafář, Jan; Bartoš, Jan; Hobza, Roman; Blavet, Nicolas; Blattner, Frank R

2017-02-14

The movement of nuclear DNA from one vascular plant species to another in the absence of fertilization is thought to be rare. Here, nonnative rRNA gene [ribosomal DNA (rDNA)] copies were identified in a set of 16 diploid barley (Hordeum) species; their origin was traceable via their internal transcribed spacer (ITS) sequence to five distinct Panicoideae genera, a lineage that split from the Pooideae about 60 Mya. Phylogenetic, cytogenetic, and genomic analyses implied that the nonnative sequences were acquired between 1 and 5 Mya after a series of multiple events, with the result that some current Hordeum sp. individuals harbor up to five different panicoid rDNA units in addition to the native Hordeum rDNA copies. There was no evidence that any of the nonnative rDNA units were transcribed; some showed indications of having been silenced via pseudogenization. A single copy of a Panicum sp. rDNA unit present in H. bogdanii had been interrupted by a native transposable element and was surrounded by about 70 kbp of mostly noncoding sequence of panicoid origin. The data suggest that horizontal gene transfer between vascular plants is not a rare event, that it is not necessarily restricted to one or a few genes only, and that it can be selectively neutral.
Population and forensic genetic analyses of mitochondrial DNA control region variation from six major provinces in the Korean population.

PubMed

Hong, Seung Beom; Kim, Ki Cheol; Kim, Wook

2015-07-01

We generated complete mitochondrial DNA (mtDNA) control region sequences from 704 unrelated individuals residing in six major provinces in Korea. In addition to our earlier survey of the distribution of mtDNA haplogroup variation, a total of 560 different haplotypes characterized by 271 polymorphic sites were identified, of which 473 haplotypes were unique. The gene diversity and random match probability were 0.9989 and 0.0025, respectively. According to the pairwise comparison of the 704 control region sequences, the mean number of pairwise differences between individuals was 13.47±6.06. Based on the result of mtDNA control region sequences, pairwise FST genetic distances revealed genetic homogeneity of the Korean provinces on a peninsular level, except in samples from Jeju Island. This result indicates there may be a need to formulate a local mtDNA database for Jeju Island, to avoid bias in forensic parameter estimates caused by genetic heterogeneity of the population. Thus, the present data may help not only in personal identification but also in determining maternal lineages to provide an expanded and reliable Korean mtDNA database. These data will be available on the EMPOP database via accession number EMP00661. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
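The two forensic parameters quoted in this abstract are related by a simple formula. The sketch below assumes the commonly used definitions (random match probability as the sum of squared haplotype frequencies, and gene diversity as n/(n-1) times one minus that sum); the abstract does not state which estimators the authors used, and the function name and toy sample are hypothetical.

```python
from collections import Counter


def forensic_parameters(haplotypes):
    """Return (random match probability, gene diversity) for a list of haplotype labels."""
    n = len(haplotypes)
    counts = Counter(haplotypes)
    # Random match probability: chance that two random individuals share a haplotype.
    rmp = sum((c / n) ** 2 for c in counts.values())
    # Gene (haplotype) diversity with the small-sample correction n / (n - 1).
    diversity = n / (n - 1) * (1 - rmp)
    return rmp, diversity


# Toy sample (hypothetical labels): one haplotype seen twice, two seen once.
print(forensic_parameters(["H1", "H1", "H2", "H3"]))

# Consistency check on the figures reported in the abstract:
# n = 704 individuals and a random match probability of 0.0025.
print(round(704 / 703 * (1 - 0.0025), 4))
```

Under these assumed definitions, the reported sample size and random match probability reproduce the reported gene diversity of 0.9989.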
Sequence analysis of three mitochondrial DNA molecules reveals interesting differences among Saccharomyces yeasts

PubMed Central

Langkjær, R. B.; Casaregola, S.; Ussery, D. W.; Gaillardin, C.; Piškur, J.

2003-01-01

The complete sequences of mitochondrial DNA (mtDNA) from the two budding yeasts Saccharomyces castellii and Saccharomyces servazzii, consisting of 25 753 and 30 782 bp, respectively, were analysed and compared to Saccharomyces cerevisiae mtDNA. While some of the traits are very similar among Saccharomyces yeasts, others have highly diverged. The two mtDNAs are much more compact than that of S.cerevisiae and contain fewer introns and intergenic sequences, although they have almost the same coding potential. A few genes contain group I introns, but group II introns, otherwise found in S.cerevisiae mtDNA, are not present. Surprisingly, four genes (ATP6, COX2, COX3 and COB) in the mtDNA of S.servazzii contain, in total, five +1 frameshifts. mtDNAs of S.castellii, S.servazzii and S.cerevisiae contain all genes on the same strand, except for one tRNA gene. On the other hand, the gene order is very different. Several gene rearrangements have taken place upon separation of the Saccharomyces lineages, and even a part of the transcription units have not been preserved. It seems that the mechanism(s) involved in the generation of the rearrangements has had to ensure that all genes stayed encoded by the same DNA strand. PMID:12799436
The effects of metal ions on the DNA damage induced by hydrogen peroxide.

PubMed

Kobayashi, S; Ueda, K; Komano, T

1990-01-01

The effects of metal ions on DNA damage induced by hydrogen peroxide were investigated using two methods, agarose-gel electrophoretic analysis of supercoiled DNA and sequencing-gel analysis of single end-labeled DNA fragments of defined sequences. Hydrogen peroxide induced DNA damage when iron or copper ion was present. At least two classes of DNA damage were induced, one being direct DNA-strand cleavage, and the other being base modification labile to hot piperidine. The investigation of the damaged sites and the inhibitory effects of radical scavengers revealed that hydroxyl radical was the species which attacked DNA in the reaction of H2O2/Fe(II). On the other hand, two types of DNA damage were induced by H2O2/Cu(II). Type I damage was predominant and inhibited by potassium iodide, but type II was not. The sites of the base-modification induced by type I damage were similar to those by lipid peroxidation products and by ascorbate in the presence of Cu(II), suggesting the involvement of radical species other than free hydroxyl radical in the damaging reactions.
Detection of DNA Sequences Refractory to PCR Amplification Using a Biophysical SERRS Assay (Surface Enhanced Resonant Raman Spectroscopy)

PubMed Central

Feuillie, Cécile; Merheb, Maxime M.; Gillet, Benjamin; Montagnac, Gilles; Daniel, Isabelle; Hänni, Catherine

2014-01-01

The analysis of ancient or processed DNA samples is often a great challenge, because traditional polymerase chain reaction (PCR)-based amplification is impeded by DNA damage. Blocking lesions such as abasic sites are known to prevent bypass by DNA polymerases, thus stopping primer elongation. In the present work, we applied the SERRS-hybridization assay, a fully non-enzymatic method, to the detection of DNA refractory to PCR amplification. This method combines specific hybridization with detection by Surface Enhanced Resonant Raman Scattering (SERRS). It allows the detection of a series of double-stranded DNA molecules containing a varying number of abasic sites on both strands, even when PCR failed to detect the most degraded sequences. Our SERRS approach can quickly detect DNA molecules without any need for DNA repair. This assay could be applied as a prerequisite analysis prior to enzymatic repair or amplification. A whole new set of samples, both forensic and archaeological, could then deliver information that was previously unavailable because of a high degree of DNA damage. PMID:25502338
Structure-based Analysis of HU-DNA Binding

DOE Office of Scientific and Technical Information (OSTI.GOV)

Swinger, K.; Rice, P.

2007-01-01

HU and IHF are prokaryotic proteins that induce very large bends in DNA. They are present in high concentrations in the bacterial nucleoid and aid in chromosomal compaction. They also function as regulatory cofactors in many processes, such as site-specific recombination and the initiation of replication and transcription. HU and IHF have become paradigms for understanding DNA bending and indirect readout of sequence. While IHF shows significant sequence specificity, HU binds preferentially to certain damaged or distorted DNAs. However, none of the structurally diverse HU substrates previously studied in vitro is identical with the distorted substrates in the recently published Anabaena HU(AHU)-DNA cocrystal structures. Here, we report binding affinities for AHU and the DNA in the cocrystal structures. The binding free energies for formation of these AHU-DNA complexes range from 10 to 14.5 kcal/mol, representing K_d values in the nanomolar to low picomolar range, and a maximum stabilization of at least 6.3 kcal/mol relative to complexes with undistorted, non-specific DNA. We investigated IHF binding and found that appropriate structural distortions can greatly enhance its affinity. On the basis of the coupling of structural and relevant binding data, we estimate the amount of conformational strain in an IHF-mediated DNA kink that is relieved by a nick (at least 0.76 kcal/mol) and pinpoint the location of the strain. We show that AHU has a sequence preference for an A+T-rich region in the center of its DNA-binding site, correlating with an unusually narrow minor groove. This is similar to sequence preferences shown by the eukaryotic nucleosome.
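The link between the quoted binding free energies and dissociation constants follows from the standard relation between free energy and K_d. The sketch below assumes a temperature of 298.15 K and a 1 M standard state, neither of which is given in the abstract; the function names are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def kd_from_binding_free_energy(dg_kcal_per_mol, temperature_k=298.15):
    """Dissociation constant (mol/L) for a binding free energy given as a positive magnitude."""
    return math.exp(-dg_kcal_per_mol / (R_KCAL * temperature_k))


def fold_change_in_affinity(ddg_kcal_per_mol, temperature_k=298.15):
    """Ratio of dissociation constants corresponding to a free-energy difference."""
    return math.exp(ddg_kcal_per_mol / (R_KCAL * temperature_k))


# Endpoints of the free-energy range reported in the abstract.
for dg in (10.0, 14.5):
    print(f"{dg:5.1f} kcal/mol -> Kd = {kd_from_binding_free_energy(dg):.2e} M")

# Affinity gain implied by the 6.3 kcal/mol stabilization over non-specific DNA.
print(f"fold change: {fold_change_in_affinity(6.3):.2e}")
```

At the assumed temperature, 10 kcal/mol corresponds to roughly 5 × 10^-8 M and 14.5 kcal/mol to roughly 2 × 10^-11 M, which matches the abstract's description of nanomolar to low picomolar K_d values.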
Controlled Assembly of Ag Nanoparticles and Carbon Nanotube Hybrid Structures for Biosensing

DTIC Science & Technology

2010-01-01

to ∼190 kΩ. The same device was again washed with DI water and treated with the thiolated ssDNA in high salt buffer. After a 2 h treatment, the device...after the cleaning only thiolated DNA should be present on the device, whereas the nonspecifically bound DNA as well as the buffer salts should be...ssDNA molecules for 2 h. Specific immobilization of thiolated ssDNA (sequence: 5′thiol_TCATAC AGCTAGATA ACC AAAGA) was carried out in high salt
Differential structural status of the RNA counterpart of an undecamer quasi-palindromic DNA sequence present in LCR of human β-globin gene cluster.

PubMed

Kaushik, Mahima; Kukreti, Shrikant

2015-01-01

Our previous work on structural polymorphism shown at a single nucleotide polymorphism (SNP) (A → G) site located on HS4 region of locus control region (LCR) of β-globin gene has established a hairpin → duplex equilibrium corresponding to A → B like DNA transition (Kaushik M, Kukreti, R., Grover, D., Brahmachari, S.K. and Kukreti S. Nucleic Acids Res. 2003; Kaushik M, Kukreti S. Nucleic Acids Res. 2006). The G-allele of A → G SNP has been shown to be significantly associated with the occurrence of β-thalassemia. Considering the significance of this 11-nt long quasi-palindromic sequence [5'-TGGGG(G/A)CCCCA; HP(G/A)11] of β-globin gene LCR, we further explored the differential behavior of the same DNA sequence with its RNA counterpart, using various biophysical and biochemical techniques. In contrast to its DNA counterpart exhibiting an A → B structural transition and an equilibrium between duplex and hairpin forms, the studied RNA oligonucleotide sequence [5'-UGGGG(G/A)CCCCA; RHP(G/A)11] existed only in duplex form (A-conformation) and did not form a hairpin. The single residue difference from A to G led to the unusual thermal stability of the RNA structure formed by the studied sequence. Because naturally occurring mutations and various SNP sites may stabilize or destabilize the local DNA/RNA secondary structures, these structural transitions may affect gene expression by changing protein-DNA recognition patterns.
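Why the 11-nt sequence is called quasi-palindromic can be seen by comparing each allele with its own reverse complement. The sketch below is illustrative only: the helper names are hypothetical, and it checks sequence symmetry, not the hairpin or duplex structures the authors measured experimentally.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def palindrome_mismatches(seq):
    """Number of positions at which a sequence differs from its reverse complement."""
    return sum(a != b for a, b in zip(seq, reverse_complement(seq)))


# Both alleles of the 11-nt sequence given in the abstract.
for allele in ("TGGGGGCCCCA", "TGGGGACCCCA"):
    print(allele, reverse_complement(allele), palindrome_mismatches(allele))

# A true palindrome (the EcoRI site) for comparison.
print(palindrome_mismatches("GAATTC"))
```

Each allele differs from its reverse complement only at the central SNP position. An odd-length sequence can never be a perfect palindrome in this sense, because the middle base cannot pair with itself.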
Plasmodium falciparum Nucleosomes Exhibit Reduced Stability and Lost Sequence Dependent Nucleosome Positioning

PubMed Central

Silberhorn, Elisabeth; Schwartz, Uwe; Symelka, Anne; de Koning-Ward, Tania; Längst, Gernot

2016-01-01

The packaging and organization of genomic DNA into chromatin represents an additional regulatory layer of gene expression, with specific nucleosome positions that restrict the accessibility of regulatory DNA elements. The mechanisms that position nucleosomes in vivo are thought to depend on the biophysical properties of the histones, sequence patterns, like phased di-nucleotide repeats and the architecture of the histone octamer that folds DNA in 1.65 tight turns. Comparative studies of human and P. falciparum histones reveal that the latter have a strongly reduced ability to recognize internal sequence-dependent nucleosome positioning signals. In contrast, the nucleosomes are positioned by AT-repeat sequences flanking nucleosomes in vivo and in vitro. Further, the strong sequence variations in the plasmodium histones, compared to other mammalian histones, do not present adaptations to its AT-rich genome. Human and parasite histones bind with higher affinity to GC-rich DNA and with lower affinity to AT-rich DNA. However, the plasmodium nucleosomes are overall less stable, with increased temperature-induced mobility, decreased salt stability of the histones H2A and H2B and considerably reduced binding affinity to GC-rich DNA, as compared with the human nucleosomes. In addition, we show that plasmodium histone octamers form the shortest known nucleosome repeat length (155 bp) in vitro and in vivo. Our data suggest that the biochemical properties of the parasite histones are distinct from the typical characteristics of other eukaryotic histones and these properties reflect the increased accessibility of the P. falciparum genome. PMID:28033404
A Delicate Balance Between Repair and Replication Factors Regulates Recombination Between Divergent DNA Sequences in Saccharomyces cerevisiae

PubMed Central

Chakraborty, Ujani; George, Carolyn M.; Lyndaker, Amy M.; Alani, Eric

2016-01-01

Single-strand annealing (SSA) is an important homologous recombination mechanism that repairs DNA double strand breaks (DSBs) occurring between closely spaced repeat sequences. During SSA, the DSB is acted upon by exonucleases to reveal complementary sequences that anneal and are then repaired through tail clipping, DNA synthesis, and ligation steps. In baker’s yeast, the Msh DNA mismatch recognition complex and the Sgs1 helicase act to suppress SSA between divergent sequences by binding to mismatches present in heteroduplex DNA intermediates and triggering a DNA unwinding mechanism known as heteroduplex rejection. Using baker’s yeast as a model, we have identified new factors and regulatory steps in heteroduplex rejection during SSA. First we showed that Top3-Rmi1, a topoisomerase complex that interacts with Sgs1, is required for heteroduplex rejection. Second, we found that the replication processivity clamp proliferating cell nuclear antigen (PCNA) is dispensable for heteroduplex rejection, but is important for repairing mismatches formed during SSA. Third, we showed that modest overexpression of Msh6 results in a significant increase in heteroduplex rejection; this increase is due to a compromise in Msh2-Msh3 function required for the clipping of 3′ tails. Thus 3′ tail clipping during SSA is a critical regulatory step in the repair vs. rejection decision; rejection is favored before the 3′ tails are clipped. Unexpectedly, Msh6 overexpression, through interactions with PCNA, disrupted heteroduplex rejection between divergent sequences in another recombination substrate. These observations illustrate the delicate balance that exists between repair and replication factors to optimize genome stability. PMID:26680658
Length and sequence heterogeneity in 5S rDNA of Populus deltoides.

PubMed

Negi, Madan S; Rajagopal, Jyothi; Chauhan, Neeti; Cronn, Richard; Lakshmikumaran, Malathi

2002-12-01

The 5S rRNA genes and their associated non-transcribed spacer (NTS) regions are present as repeat units arranged in tandem arrays in plant genomes. Length heterogeneity in 5S rDNA repeats was previously identified in Populus deltoides and was also observed in the present study. Primers were designed to amplify the 5S rDNA NTS variants from the P. deltoides genome. The PCR-amplified products from the two accessions of P. deltoides (G3 and G48) suggested the presence of length heterogeneity of 5S rDNA units within and among accessions, and the size of the spacers ranged from 385 to 434 bp. Sequence analysis of the NTS revealed two distinct classes of 5S rDNA within both accessions: class 1, which contained GAA trinucleotide microsatellite repeats, and class 2, which lacked the repeats. The class 1 spacer shows length variation owing to the microsatellite, with two clones exhibiting 10 GAA repeat units and one clone exhibiting 16 such repeat units. However, distance analysis shows that class 1 spacer sequences are highly similar inter se, yielding nucleotide diversity (pi) estimates that are less than 15% of those obtained for class 2 spacers (pi = 0.0183 vs. 0.1433, respectively). The presence of a microsatellite in the NTS region leading to variation in spacer length is reported and discussed for the first time in P. deltoides.
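Nucleotide diversity (pi) is commonly computed as the average number of differences per site over all pairs of aligned sequences. The sketch below assumes that definition and uses a hypothetical toy alignment; the abstract does not describe the authors' distance model or how alignment gaps were handled.

```python
from itertools import combinations


def nucleotide_diversity(aligned):
    """Mean per-site proportion of differences over all pairs of equal-length sequences."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(aligned, 2))
    total_diffs = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs)
    return total_diffs / (len(pairs) * length)


# Toy alignment (hypothetical sequences, not data from the study).
toy = ["GAAGAA", "GAAGAT", "GATGAA"]
print(nucleotide_diversity(toy))

# Ratio of the two pi estimates quoted in the abstract (class 1 vs. class 2).
print(0.0183 / 0.1433)
```

The ratio of the two reported estimates is about 0.13, so class 1 diversity is roughly 13% of the class 2 value.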
Squeezing water from a stone: high-throughput sequencing from a 145-year old holotype resolves (barely) a cryptic species problem in flying lizards.

PubMed

McGuire, Jimmy A; Cotoras, Darko D; O'Connell, Brendan; Lawalata, Shobi Z S; Wang-Claypool, Cynthia Y; Stubbs, Alexander; Huang, Xiaoting; Wogan, Guinevere O U; Hykin, Sarah M; Reilly, Sean B; Bi, Ke; Riyanto, Awal; Arida, Evy; Smith, Lydia L; Milne, Heather; Streicher, Jeffrey W; Iskandar, Djoko T

2018-01-01

We used Massively Parallel High-Throughput Sequencing to obtain genetic data from a 145-year old holotype specimen of the flying lizard, Draco cristatellus. Obtaining genetic data from this holotype was necessary to resolve an otherwise intractable taxonomic problem involving the status of this species relative to closely related sympatric Draco species that cannot otherwise be distinguished from one another on the basis of museum specimens. Initial analyses suggested that the DNA present in the holotype sample was so degraded as to be unusable for sequencing. However, we used a specialized extraction procedure developed for highly degraded ancient DNA samples and MiSeq shotgun sequencing to obtain just enough low-coverage mitochondrial DNA (721 base pairs) to conclusively resolve the species status of the holotype as well as a second known specimen of this species. The holotype was prepared before the advent of formalin-fixation and therefore was most likely originally fixed with ethanol and never exposed to formalin. Whereas conventional wisdom suggests that formalin-fixed samples should be the most challenging for DNA sequencing, we propose that evaporation during long-term alcohol storage and consequent water-exposure may subject older ethanol-fixed museum specimens to hydrolytic damage. If so, this may pose an even greater challenge for sequencing efforts involving historical samples.
Phylogenetic relationships of bears (the Ursidae) inferred from mitochondrial DNA sequences.

PubMed

Zhang, Y P; Ryder, O A

1994-12-01

The phylogenetic relationships among some bear species are still open questions. We present here mitochondrial DNA sequences of D-loop region, cytochrome b, 12S rRNA, tRNA(Pro), and tRNA(Thr) genes from all bear species and the giant panda. A series of evolutionary trees with concordant topology has been derived based on the combined data set of all of the mitochondrial DNA sequences, which may have resolved the evolutionary relationships of all bear species: the ancestor of the spectacled bear diverged first, followed by the sloth bear; the brown bear and polar bear are sister taxa relative to the Asiatic black bear; the closest relative of the American black bear is the sun bear. Primers for forensic identification of the giant panda and bears are proposed. Analysis of these data, in combination with data from primates and antelopes, suggests that relative substitutional rates between different mitochondrial DNA regions may vary greatly among different taxa of the vertebrates.
