Analysis of DNA Sequences by an Optical Time-Integrating Correlator: Proof-of-Concept Experiments.
1992-05-01
DNA ANALYSIS STRATEGY 4 2.1 Representation of DNA Bases 4 2.2 DNA Analysis Strategy 6 3.0 CUSTOM GENERATORS FOR DNA SEQUENCES 10 3.1 Hardware Design 10...of the DNA bases where each base is represented by a 7-bits long pseudorandom sequence. 5 Figure 4: Coarse analysis of a DNA sequence. 7 Figure 5: Fine...a 20-bases long database. 32 xiii LIST OF TABLES PAGE Table 1: Short representations of the DNA bases where each base is represented by 7-bits long
Specific minor groove solvation is a crucial determinant of DNA binding site recognition
Harris, Lydia-Ann; Williams, Loren Dean; Koudelka, Gerald B.
2014-01-01
The DNA sequence preferences of nearly all sequence specific DNA binding proteins are influenced by the identities of bases that are not directly contacted by protein. Discrimination between non-contacted base sequences is commonly based on the differential abilities of DNA sequences to allow narrowing of the DNA minor groove. However, the factors that govern the propensity of minor groove narrowing are not completely understood. Here we show that the differential abilities of various DNA sequences to support formation of a highly ordered and stable minor groove solvation network are a key determinant of non-contacted base recognition by a sequence-specific binding protein. In addition, disrupting the solvent network in the non-contacted region of the binding site alters the protein's ability to recognize contacted base sequences at positions 5–6 bases away. This observation suggests that DNA solvent interactions link contacted and non-contacted base recognition by the protein. PMID:25429976
Direct Detection and Sequencing of Damaged DNA Bases
2011-01-01
Products of various forms of DNA damage have been implicated in a variety of important biological processes, such as aging, neurodegenerative diseases, and cancer. Therefore, there exists great interest to develop methods for interrogating damaged DNA in the context of sequencing. Here, we demonstrate that single-molecule, real-time (SMRT®) DNA sequencing can directly detect damaged DNA bases in the DNA template - as a by-product of the sequencing method - through an analysis of the DNA polymerase kinetics that are altered by the presence of a modified base. We demonstrate the sequencing of several DNA templates containing products of DNA damage, including 8-oxoguanine, 8-oxoadenine, O6-methylguanine, 1-methyladenine, O4-methylthymine, 5-hydroxycytosine, 5-hydroxyuracil, 5-hydroxymethyluracil, or thymine dimers, and show that these base modifications can be readily detected with single-modification resolution and DNA strand specificity. We characterize the distinct kinetic signatures generated by these DNA base modifications. PMID:22185597
Direct detection and sequencing of damaged DNA bases.
Clark, Tyson A; Spittle, Kristi E; Turner, Stephen W; Korlach, Jonas
2011-12-20
Products of various forms of DNA damage have been implicated in a variety of important biological processes, such as aging, neurodegenerative diseases, and cancer. Therefore, there exists great interest to develop methods for interrogating damaged DNA in the context of sequencing. Here, we demonstrate that single-molecule, real-time (SMRT®) DNA sequencing can directly detect damaged DNA bases in the DNA template - as a by-product of the sequencing method - through an analysis of the DNA polymerase kinetics that are altered by the presence of a modified base. We demonstrate the sequencing of several DNA templates containing products of DNA damage, including 8-oxoguanine, 8-oxoadenine, O6-methylguanine, 1-methyladenine, O4-methylthymine, 5-hydroxycytosine, 5-hydroxyuracil, 5-hydroxymethyluracil, or thymine dimers, and show that these base modifications can be readily detected with single-modification resolution and DNA strand specificity. We characterize the distinct kinetic signatures generated by these DNA base modifications.
Analysis of DNA Sequences by An Optical Time-Integrating Correlator: Proof-Of-Concept Experiments.
1992-05-01
TABLES xv LIST OF ABBREVIATIONS xvii 1.0 INTRODUCTION 1 2.0 DNA ANALYSIS STRATEGY 4 2.1 Representation of DNA Bases 4 2.2 DNA Analysis Strategy 6 3.0...Zehnder architecture. 3 Figure 3: Short representations of the DNA bases where each base is represented by a 7-bits long pseudorandom sequence. 5... DNA bases where each base is represented by 7-bits long pseudorandom sequences. 4 Table 2: Long representations of the DNA bases with 255-bits maximum
Enantiospecific recognition of DNA sequences by a proflavine Tröger base.
Bailly, C; Laine, W; Demeunynck, M; Lhomme, J
2000-07-05
The DNA interaction of a chiral Tröger base derived from proflavine was investigated by DNA melting temperature measurements and complementary biochemical assays. DNase I footprinting experiments demonstrate that the binding of the proflavine-based Tröger base is both enantio- and sequence-specific. The (+)-isomer poorly interacts with DNA in a non-sequence-selective fashion. In sharp contrast, the corresponding (-)-isomer recognizes preferentially certain DNA sequences containing both A. T and G. C base pairs, such as the motifs 5'-GTT. AAC and 5'-ATGA. TCAT. This is the first experimental demonstration that acridine-type Tröger bases can be used for enantiospecific recognition of DNA sequences. Copyright 2000 Academic Press.
RDNAnalyzer: A tool for DNA secondary structure prediction and sequence analysis.
Afzal, Muhammad; Shahid, Ahmad Ali; Shehzadi, Abida; Nadeem, Shahid; Husnain, Tayyab
2012-01-01
RDNAnalyzer is an innovative computer based tool designed for DNA secondary structure prediction and sequence analysis. It can randomly generate the DNA sequence or user can upload the sequences of their own interest in RAW format. It uses and extends the Nussinov dynamic programming algorithm and has various application for the sequence analysis. It predicts the DNA secondary structure and base pairings. It also provides the tools for routinely performed sequence analysis by the biological scientists such as DNA replication, reverse compliment generation, transcription, translation, sequence specific information as total number of nucleotide bases, ATGC base contents along with their respective percentages and sequence cleaner. RDNAnalyzer is a unique tool developed in Microsoft Visual Studio 2008 using Microsoft Visual C# and Windows Presentation Foundation and provides user friendly environment for sequence analysis. It is freely available. http://www.cemb.edu.pk/sw.html RDNAnalyzer - Random DNA Analyser, GUI - Graphical user interface, XAML - Extensible Application Markup Language.
Method for sequencing DNA base pairs
Sessler, Andrew M.; Dawson, John
1993-01-01
The base pairs of a DNA structure are sequenced with the use of a scanning tunneling microscope (STM). The DNA structure is scanned by the STM probe tip, and, as it is being scanned, the DNA structure is separately subjected to a sequence of infrared radiation from four different sources, each source being selected to preferentially excite one of the four different bases in the DNA structure. Each particular base being scanned is subjected to such sequence of infrared radiation from the four different sources as that particular base is being scanned. The DNA structure as a whole is separately imaged for each subjection thereof to radiation from one only of each source.
Sequencing of adenine in DNA by scanning tunneling microscopy
NASA Astrophysics Data System (ADS)
Tanaka, Hiroyuki; Taniguchi, Masateru
2017-08-01
The development of DNA sequencing technology utilizing the detection of a tunnel current is important for next-generation sequencer technologies based on single-molecule analysis technology. Using a scanning tunneling microscope, we previously reported that dI/dV measurements and dI/dV mapping revealed that the guanine base (purine base) of DNA adsorbed onto the Cu(111) surface has a characteristic peak at V s = -1.6 V. If, in addition to guanine, the other purine base of DNA, namely, adenine, can be distinguished, then by reading all the purine bases of each single strand of a DNA double helix, the entire base sequence of the original double helix can be determined due to the complementarity of the DNA base pair. Therefore, the ability to read adenine is important from the viewpoint of sequencing. Here, we report on the identification of adenine by STM topographic and spectroscopic measurements using a synthetic DNA oligomer and viral DNA.
NASA Astrophysics Data System (ADS)
Lestari, D.; Bustamam, A.; Novianti, T.; Ardaneswari, G.
2017-07-01
DNA sequence can be defined as a succession of letters, representing the order of nucleotides within DNA, using a permutation of four DNA base codes including adenine (A), guanine (G), cytosine (C), and thymine (T). The precise code of the sequences is determined using DNA sequencing methods and technologies, which have been developed since the 1970s and currently become highly developed, advanced and highly throughput sequencing technologies. So far, DNA sequencing has greatly accelerated biological and medical research and discovery. However, in some cases DNA sequencing could produce any ambiguous and not clear enough sequencing results that make them quite difficult to be determined whether these codes are A, T, G, or C. To solve these problems, in this study we can introduce other representation of DNA codes namely Quaternion Q = (PA, PT, PG, PC), where PA, PT, PG, PC are the probability of A, T, G, C bases that could appear in Q and PA + PT + PG + PC = 1. Furthermore, using Quaternion representations we are able to construct the improved scoring matrix for global sequence alignment processes, by applying a dot product method. Moreover, this scoring matrix produces better and higher quality of the match and mismatch score between two DNA base codes. In implementation, we applied the Needleman-Wunsch global sequence alignment algorithm using Octave, to analyze our target sequence which contains some ambiguous sequence data. The subject sequences are the DNA sequences of Streptococcus pneumoniae families obtained from the Genebank, meanwhile the target DNA sequence are received from our collaborator database. As the results we found the Quaternion representations improve the quality of the sequence alignment score and we can conclude that DNA sequence target has maximum similarity with Streptococcus pneumoniae.
An evolution based biosensor receptor DNA sequence generation algorithm.
Kim, Eungyeong; Lee, Malrey; Gatton, Thomas M; Lee, Jaewan; Zang, Yupeng
2010-01-01
A biosensor is composed of a bioreceptor, an associated recognition molecule, and a signal transducer that can selectively detect target substances for analysis. DNA based biosensors utilize receptor molecules that allow hybridization with the target analyte. However, most DNA biosensor research uses oligonucleotides as the target analytes and does not address the potential problems of real samples. The identification of recognition molecules suitable for real target analyte samples is an important step towards further development of DNA biosensors. This study examines the characteristics of DNA used as bioreceptors and proposes a hybrid evolution-based DNA sequence generating algorithm, based on DNA computing, to identify suitable DNA bioreceptor recognition molecules for stable hybridization with real target substances. The Traveling Salesman Problem (TSP) approach is applied in the proposed algorithm to evaluate the safety and fitness of the generated DNA sequences. This approach improves efficiency and stability for enhanced and variable-length DNA sequence generation and allows extension to generation of variable-length DNA sequences with diverse receptor recognition requirements.
RDNAnalyzer: A tool for DNA secondary structure prediction and sequence analysis
Afzal, Muhammad; Shahid, Ahmad Ali; Shehzadi, Abida; Nadeem, Shahid; Husnain, Tayyab
2012-01-01
RDNAnalyzer is an innovative computer based tool designed for DNA secondary structure prediction and sequence analysis. It can randomly generate the DNA sequence or user can upload the sequences of their own interest in RAW format. It uses and extends the Nussinov dynamic programming algorithm and has various application for the sequence analysis. It predicts the DNA secondary structure and base pairings. It also provides the tools for routinely performed sequence analysis by the biological scientists such as DNA replication, reverse compliment generation, transcription, translation, sequence specific information as total number of nucleotide bases, ATGC base contents along with their respective percentages and sequence cleaner. RDNAnalyzer is a unique tool developed in Microsoft Visual Studio 2008 using Microsoft Visual C# and Windows Presentation Foundation and provides user friendly environment for sequence analysis. It is freely available. Availability http://www.cemb.edu.pk/sw.html Abbreviations RDNAnalyzer - Random DNA Analyser, GUI - Graphical user interface, XAML - Extensible Application Markup Language. PMID:23055611
Church, George M.; Kieffer-Higgins, Stephen
1992-01-01
This invention features vectors and a method for sequencing DNA. The method includes the steps of: a) ligating the DNA into a vector comprising a tag sequence, the tag sequence includes at least 15 bases, wherein the tag sequence will not hybridize to the DNA under stringent hybridization conditions and is unique in the vector, to form a hybrid vector, b) treating the hybrid vector in a plurality of vessels to produce fragments comprising the tag sequence, wherein the fragments differ in length and terminate at a fixed known base or bases, wherein the fixed known base or bases differs in each vessel, c) separating the fragments from each vessel according to their size, d) hybridizing the fragments with an oligonucleotide able to hybridize specifically with the tag sequence, and e) detecting the pattern of hybridization of the tag sequence, wherein the pattern reflects the nucleotide sequence of the DNA.
Method for sequencing DNA base pairs
Sessler, A.M.; Dawson, J.
1993-12-14
The base pairs of a DNA structure are sequenced with the use of a scanning tunneling microscope (STM). The DNA structure is scanned by the STM probe tip, and, as it is being scanned, the DNA structure is separately subjected to a sequence of infrared radiation from four different sources, each source being selected to preferentially excite one of the four different bases in the DNA structure. Each particular base being scanned is subjected to such sequence of infrared radiation from the four different sources as that particular base is being scanned. The DNA structure as a whole is separately imaged for each subjection thereof to radiation from one only of each source. 6 figures.
Development of a Novel Technology for Label Free DNA Sequencing
2012-05-21
of the C-H bond stretch vibrations in the planes of the corresponding DNA bases , and in the higher-frequency side, sequence-identifier region is...composed of the N-H bond stretch vibrations in the planes of the corresponding DNA bases . In addition, the sequence-identifier dividing region almost...regions are localized at the corresponding DNA bases and exhibit a definable dependence on the sequence form of the codons under study. Final
Jun, Goo; Flickinger, Matthew; Hetrick, Kurt N.; Romm, Jane M.; Doheny, Kimberly F.; Abecasis, Gonçalo R.; Boehnke, Michael; Kang, Hyun Min
2012-01-01
DNA sample contamination is a serious problem in DNA sequencing studies and may result in systematic genotype misclassification and false positive associations. Although methods exist to detect and filter out cross-species contamination, few methods to detect within-species sample contamination are available. In this paper, we describe methods to identify within-species DNA sample contamination based on (1) a combination of sequencing reads and array-based genotype data, (2) sequence reads alone, and (3) array-based genotype data alone. Analysis of sequencing reads allows contamination detection after sequence data is generated but prior to variant calling; analysis of array-based genotype data allows contamination detection prior to generation of costly sequence data. Through a combination of analysis of in silico and experimentally contaminated samples, we show that our methods can reliably detect and estimate levels of contamination as low as 1%. We evaluate the impact of DNA contamination on genotype accuracy and propose effective strategies to screen for and prevent DNA contamination in sequencing studies. PMID:23103226
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wu, Liyou; Yi, T. Y.; Van Nostrand, Joy
Phylogenetic analyses were done for the Shewanella strains isolated from Baltic Sea (38 strains), US DOE Hanford Uranium bioremediation site [Hanford Reach of the Columbia River (HRCR), 11 strains], Pacific Ocean and Hawaiian sediments (8 strains), and strains from other resources (16 strains) with three out group strains, Rhodopseudomonas palustris, Clostridium cellulolyticum, and Thermoanaerobacter ethanolicus X514, using DNA relatedness derived from WCGA-based DNA-DNA hybridizations, sequence similarities of 16S rRNA gene and gyrB gene, and sequence similarities of 6 loci of Shewanella genome selected from a shared gene list of the Shewanella strains with whole genome sequenced based on the averagemore » nucleotide identity of them (ANI). The phylogenetic trees based on 16S rRNA and gyrB gene sequences, and DNA relatedness derived from WCGA hybridizations of the tested Shewanella strains share exactly the same sub-clusters with very few exceptions, in which the strains were basically grouped by species. However, the phylogenetic analysis based on DNA relatedness derived from WCGA hybridizations dramatically increased the differentiation resolution at species and strains level within Shewanella genus. When the tree based on DNA relatedness derived from WCGA hybridizations was compared to the tree based on the combined sequences of the selected functional genes (6 loci), we found that the resolutions of both methods are similar, but the clustering of the tree based on DNA relatedness derived from WMGA hybridizations was clearer. These results indicate that WCGA-based DNA-DNA hybridization is an idea alternative of conventional DNA-DNA hybridization methods and it is superior to the phylogenetics methods based on sequence similarities of single genes. Detailed analysis is being performed for the re-classification of the strains examined.« less
Effect of Base Sequence "Defects" on the Electrostatic Potential of Dissolved DNA
NASA Astrophysics Data System (ADS)
Adams, Scott V.; Wagner, Katrina; Kephart, Thomas S.; Edwards, Glenn
1997-11-01
An analytical model of the electrostatic potential surrounding dissolved DNA has been developed. The model consists of an all-atom, mathematically helical structure for DNA, in which the atoms are arranged in infinite lines of discrete point charges on concentric cylindrical surfaces. The surrounding solvent and counterions are treated with the Debye-Huckel approximation (Wagner et al., Biophysical Journal 73, 21-30, 1997). Variation in the electrostatic potential due to structural differences between A, B, and Z conformations and homopolymer base sequence is apparent. The most recent modification to the model exploits the principle of superposition to calculate the potential of DNA with a base sequence containing `defects.' That is, the base sequence is no longer uniform along the polymer. Differences between the potential of homopolymer DNA and the potential of DNA containing base `defects' are immediately obvious. These results may aid in understanding the role of electrostatics in base-sequence specificity exhibited by DNA-binding proteins.
Lee, James W.; Thundat, Thomas G.
2005-06-14
An apparatus and method for performing nucleic acid (DNA and/or RNA) sequencing on a single molecule. The genetic sequence information is obtained by probing through a DNA or RNA molecule base by base at nanometer scale as though looking through a strip of movie film. This DNA sequencing nanotechnology has the theoretical capability of performing DNA sequencing at a maximal rate of about 1,000,000 bases per second. This enhanced performance is made possible by a series of innovations including: novel applications of a fine-tuned nanometer gap for passage of a single DNA or RNA molecule; thin layer microfluidics for sample loading and delivery; and programmable electric fields for precise control of DNA or RNA movement. Detection methods include nanoelectrode-gated tunneling current measurements, dielectric molecular characterization, and atomic force microscopy/electrostatic force microscopy (AFM/EFM) probing for nanoscale reading of the nucleic acid sequences.
USDA-ARS?s Scientific Manuscript database
Analysis of DNA methylation patterns relies increasingly on sequencing-based profiling methods. The four most frequently used sequencing-based technologies are the bisulfite-based methods MethylC-seq and reduced representation bisulfite sequencing (RRBS), and the enrichment-based techniques methylat...
2013-01-01
Background Mitochondrial DNA (mtDNA) typing can be a useful aid for identifying people from compromised samples when nuclear DNA is too damaged, degraded or below detection thresholds for routine short tandem repeat (STR)-based analysis. Standard mtDNA typing, focused on PCR amplicon sequencing of the control region (HVS I and HVS II), is limited by the resolving power of this short sequence, which misses up to 70% of the variation present in the mtDNA genome. Methods We used in-solution hybridisation-based DNA capture (using DNA capture probes prepared from modern human mtDNA) to recover mtDNA from post-mortem human remains in which the majority of DNA is both highly fragmented (<100 base pairs in length) and chemically damaged. The method ‘immortalises’ the finite quantities of DNA in valuable extracts as DNA libraries, which is followed by the targeted enrichment of endogenous mtDNA sequences and characterisation by next-generation sequencing (NGS). Results We sequenced whole mitochondrial genomes for human identification from samples where standard nuclear STR typing produced only partial profiles or demonstrably failed and/or where standard mtDNA hypervariable region sequences lacked resolving power. Multiple rounds of enrichment can substantially improve coverage and sequencing depth of mtDNA genomes from highly degraded samples. The application of this method has led to the reliable mitochondrial sequencing of human skeletal remains from unidentified World War Two (WWII) casualties approximately 70 years old and from archaeological remains (up to 2,500 years old). Conclusions This approach has potential applications in forensic science, historical human identification cases, archived medical samples, kinship analysis and population studies. In particular the methodology can be applied to any case, involving human or non-human species, where whole mitochondrial genome sequences are required to provide the highest level of maternal lineage discrimination. Multiple rounds of in-solution hybridisation-based DNA capture can retrieve whole mitochondrial genome sequences from even the most challenging samples. PMID:24289217
Extracting DNA words based on the sequence features: non-uniform distribution and integrity.
Li, Zhi; Cao, Hongyan; Cui, Yuehua; Zhang, Yanbo
2016-01-25
DNA sequence can be viewed as an unknown language with words as its functional units. Given that most sequence alignment algorithms such as the motif discovery algorithms depend on the quality of background information about sequences, it is necessary to develop an ab initio algorithm for extracting the "words" based only on the DNA sequences. We considered that non-uniform distribution and integrity were two important features of a word, based on which we developed an ab initio algorithm to extract "DNA words" that have potential functional meaning. A Kolmogorov-Smirnov test was used for consistency test of uniform distribution of DNA sequences, and the integrity was judged by the sequence and position alignment. Two random base sequences were adopted as negative control, and an English book was used as positive control to verify our algorithm. We applied our algorithm to the genomes of Saccharomyces cerevisiae and 10 strains of Escherichia coli to show the utility of the methods. The results provide strong evidences that the algorithm is a promising tool for ab initio building a DNA dictionary. Our method provides a fast way for large scale screening of important DNA elements and offers potential insights into the understanding of a genome.
Sequence-dependent DNA deformability studied using molecular dynamics simulations.
Fujii, Satoshi; Kono, Hidetoshi; Takenaka, Shigeori; Go, Nobuhiro; Sarai, Akinori
2007-01-01
Proteins recognize specific DNA sequences not only through direct contact between amino acids and bases, but also indirectly based on the sequence-dependent conformation and deformability of the DNA (indirect readout). We used molecular dynamics simulations to analyze the sequence-dependent DNA conformations of all 136 possible tetrameric sequences sandwiched between CGCG sequences. The deformability of dimeric steps obtained by the simulations is consistent with that by the crystal structures. The simulation results further showed that the conformation and deformability of the tetramers can highly depend on the flanking base pairs. The conformations of xATx tetramers show the most rigidity and are not affected by the flanking base pairs and the xYRx show by contrast the greatest flexibility and change their conformations depending on the base pairs at both ends, suggesting tetramers with the same central dimer can show different deformabilities. These results suggest that analysis of dimeric steps alone may overlook some conformational features of DNA and provide insight into the mechanism of indirect readout during protein-DNA recognition. Moreover, the sequence dependence of DNA conformation and deformability may be used to estimate the contribution of indirect readout to the specificity of protein-DNA recognition as well as nucleosome positioning and large-scale behavior of nucleic acids.
Googling DNA sequences on the World Wide Web.
Hajibabaei, Mehrdad; Singer, Gregory A C
2009-11-10
New web-based technologies provide an excellent opportunity for sharing and accessing information and using web as a platform for interaction and collaboration. Although several specialized tools are available for analyzing DNA sequence information, conventional web-based tools have not been utilized for bioinformatics applications. We have developed a novel algorithm and implemented it for searching species-specific genomic sequences, DNA barcodes, by using popular web-based methods such as Google. We developed an alignment independent character based algorithm based on dividing a sequence library (DNA barcodes) and query sequence to words. The actual search is conducted by conventional search tools such as freely available Google Desktop Search. We implemented our algorithm in two exemplar packages. We developed pre and post-processing software to provide customized input and output services, respectively. Our analysis of all publicly available DNA barcode sequences shows a high accuracy as well as rapid results. Our method makes use of conventional web-based technologies for specialized genetic data. It provides a robust and efficient solution for sequence search on the web. The integration of our search method for large-scale sequence libraries such as DNA barcodes provides an excellent web-based tool for accessing this information and linking it to other available categories of information on the web.
DNA-based watermarks using the DNA-Crypt algorithm.
Heider, Dominik; Barnekow, Angelika
2007-05-29
The aim of this paper is to demonstrate the application of watermarks based on DNA sequences to identify the unauthorized use of genetically modified organisms (GMOs) protected by patents. Predicted mutations in the genome can be corrected by the DNA-Crypt program leaving the encrypted information intact. Existing DNA cryptographic and steganographic algorithms use synthetic DNA sequences to store binary information however, although these sequences can be used for authentication, they may change the target DNA sequence when introduced into living organisms. The DNA-Crypt algorithm and image steganography are based on the same watermark-hiding principle, namely using the least significant base in case of DNA-Crypt and the least significant bit in case of the image steganography. It can be combined with binary encryption algorithms like AES, RSA or Blowfish. DNA-Crypt is able to correct mutations in the target DNA with several mutation correction codes such as the Hamming-code or the WDH-code. Mutations which can occur infrequently may destroy the encrypted information, however an integrated fuzzy controller decides on a set of heuristics based on three input dimensions, and recommends whether or not to use a correction code. These three input dimensions are the length of the sequence, the individual mutation rate and the stability over time, which is represented by the number of generations. In silico experiments using the Ypt7 in Saccharomyces cerevisiae shows that the DNA watermarks produced by DNA-Crypt do not alter the translation of mRNA into protein. The program is able to store watermarks in living organisms and can maintain the original information by correcting mutations itself. Pairwise or multiple sequence alignments show that DNA-Crypt produces few mismatches between the sequences similar to all steganographic algorithms.
DNA-based watermarks using the DNA-Crypt algorithm
Heider, Dominik; Barnekow, Angelika
2007-01-01
Background The aim of this paper is to demonstrate the application of watermarks based on DNA sequences to identify the unauthorized use of genetically modified organisms (GMOs) protected by patents. Predicted mutations in the genome can be corrected by the DNA-Crypt program leaving the encrypted information intact. Existing DNA cryptographic and steganographic algorithms use synthetic DNA sequences to store binary information however, although these sequences can be used for authentication, they may change the target DNA sequence when introduced into living organisms. Results The DNA-Crypt algorithm and image steganography are based on the same watermark-hiding principle, namely using the least significant base in case of DNA-Crypt and the least significant bit in case of the image steganography. It can be combined with binary encryption algorithms like AES, RSA or Blowfish. DNA-Crypt is able to correct mutations in the target DNA with several mutation correction codes such as the Hamming-code or the WDH-code. Mutations which can occur infrequently may destroy the encrypted information, however an integrated fuzzy controller decides on a set of heuristics based on three input dimensions, and recommends whether or not to use a correction code. These three input dimensions are the length of the sequence, the individual mutation rate and the stability over time, which is represented by the number of generations. In silico experiments using the Ypt7 in Saccharomyces cerevisiae shows that the DNA watermarks produced by DNA-Crypt do not alter the translation of mRNA into protein. Conclusion The program is able to store watermarks in living organisms and can maintain the original information by correcting mutations itself. Pairwise or multiple sequence alignments show that DNA-Crypt produces few mismatches between the sequences similar to all steganographic algorithms. PMID:17535434
[Current applications of high-throughput DNA sequencing technology in antibody drug research].
Yu, Xin; Liu, Qi-Gang; Wang, Ming-Rong
2012-03-01
Since the publication of a high-throughput DNA sequencing technology based on PCR reaction was carried out in oil emulsions in 2005, high-throughput DNA sequencing platforms have been evolved to a robust technology in sequencing genomes and diverse DNA libraries. Antibody libraries with vast numbers of members currently serve as a foundation of discovering novel antibody drugs, and high-throughput DNA sequencing technology makes it possible to rapidly identify functional antibody variants with desired properties. Herein we present a review of current applications of high-throughput DNA sequencing technology in the analysis of antibody library diversity, sequencing of CDR3 regions, identification of potent antibodies based on sequence frequency, discovery of functional genes, and combination with various display technologies, so as to provide an alternative approach of discovery and development of antibody drugs.
Xie, Guosen; Mo, Zhongxi
2011-01-21
In this article, we introduce three 3D graphical representations of DNA primary sequences, which we call RY-curve, MK-curve and SW-curve, based on three classifications of the DNA bases. The advantages of our representations are that (i) these 3D curves are strictly non-degenerate and there is no loss of information when transferring a DNA sequence to its mathematical representation and (ii) the coordinates of every node on these 3D curves have clear biological implication. Two applications of these 3D curves are presented: (a) a simple formula is derived to calculate the content of the four bases (A, G, C and T) from the coordinates of nodes on the curves; and (b) a 12-component characteristic vector is constructed to compare similarity among DNA sequences from different species based on the geometrical centers of the 3D curves. As examples, we examine similarity among the coding sequences of the first exon of beta-globin gene from eleven species and validate similarity of cDNA sequences of beta-globin gene from eight species. Copyright © 2010 Elsevier Ltd. All rights reserved.
Star, Bastiaan; Nederbragt, Alexander J.; Hansen, Marianne H. S.; Skage, Morten; Gilfillan, Gregor D.; Bradbury, Ian R.; Pampoulie, Christophe; Stenseth, Nils Chr; Jakobsen, Kjetill S.; Jentoft, Sissel
2014-01-01
Degradation-specific processes and variation in laboratory protocols can bias the DNA sequence composition from samples of ancient or historic origin. Here, we identify a novel artifact in sequences from historic samples of Atlantic cod (Gadus morhua), which forms interrupted palindromes consisting of reverse complementary sequence at the 5′ and 3′-ends of sequencing reads. The palindromic sequences themselves have specific properties – the bases at the 5′-end align well to the reference genome, whereas extensive misalignments exists among the bases at the terminal 3′-end. The terminal 3′ bases are artificial extensions likely caused by the occurrence of hairpin loops in single stranded DNA (ssDNA), which can be ligated and amplified in particular library creation protocols. We propose that such hairpin loops allow the inclusion of erroneous nucleotides, specifically at the 3′-end of DNA strands, with the 5′-end of the same strand providing the template. We also find these palindromes in previously published ancient DNA (aDNA) datasets, albeit at varying and substantially lower frequencies. This artifact can negatively affect the yield of endogenous DNA in these types of samples and introduces sequence bias. PMID:24608104
Local alignment of two-base encoded DNA sequence
Homer, Nils; Merriman, Barry; Nelson, Stanley F
2009-01-01
Background DNA sequence comparison is based on optimal local alignment of two sequences using a similarity score. However, some new DNA sequencing technologies do not directly measure the base sequence, but rather an encoded form, such as the two-base encoding considered here. In order to compare such data to a reference sequence, the data must be decoded into sequence. The decoding is deterministic, but the possibility of measurement errors requires searching among all possible error modes and resulting alignments to achieve an optimal balance of fewer errors versus greater sequence similarity. Results We present an extension of the standard dynamic programming method for local alignment, which simultaneously decodes the data and performs the alignment, maximizing a similarity score based on a weighted combination of errors and edits, and allowing an affine gap penalty. We also present simulations that demonstrate the performance characteristics of our two base encoded alignment method and contrast those with standard DNA sequence alignment under the same conditions. Conclusion The new local alignment algorithm for two-base encoded data has substantial power to properly detect and correct measurement errors while identifying underlying sequence variants, and facilitating genome re-sequencing efforts based on this form of sequence data. PMID:19508732
DNABIT Compress - Genome compression algorithm.
Rajarajeswari, Pothuraju; Apparao, Allam
2011-01-22
Data compression is concerned with how information is organized in data. Efficient storage means removal of redundancy from the data being stored in the DNA molecule. Data compression algorithms remove redundancy and are used to understand biologically important molecules. We present a compression algorithm, "DNABIT Compress" for DNA sequences based on a novel algorithm of assigning binary bits for smaller segments of DNA bases to compress both repetitive and non repetitive DNA sequence. Our proposed algorithm achieves the best compression ratio for DNA sequences for larger genome. Significantly better compression results show that "DNABIT Compress" algorithm is the best among the remaining compression algorithms. While achieving the best compression ratios for DNA sequences (Genomes),our new DNABIT Compress algorithm significantly improves the running time of all previous DNA compression programs. Assigning binary bits (Unique BIT CODE) for (Exact Repeats, Reverse Repeats) fragments of DNA sequence is also a unique concept introduced in this algorithm for the first time in DNA compression. This proposed new algorithm could achieve the best compression ratio as much as 1.58 bits/bases where the existing best methods could not achieve a ratio less than 1.72 bits/bases.
Sequence periodicity in nucleosomal DNA and intrinsic curvature.
Nair, T Murlidharan
2010-05-17
Most eukaryotic DNA contained in the nucleus is packaged by wrapping DNA around histone octamers. Histones are ubiquitous and bind most regions of chromosomal DNA. In order to achieve smooth wrapping of the DNA around the histone octamer, the DNA duplex should be able to deform and should possess intrinsic curvature. The deformability of DNA is a result of the non-parallelness of base pair stacks. The stacking interaction between base pairs is sequence dependent. The higher the stacking energy the more rigid the DNA helix, thus it is natural to expect that sequences that are involved in wrapping around the histone octamer should be unstacked and possess intrinsic curvature. Intrinsic curvature has been shown to be dictated by the periodic recurrence of certain dinucleotides. Several genome-wide studies directed towards mapping of nucleosome positions have revealed periodicity associated with certain stretches of sequences. In the current study, these sequences have been analyzed with a view to understand their sequence-dependent structures. Higher order DNA structures and the distribution of molecular bend loci associated with 146 base nucleosome core DNA sequence from C. elegans and chicken have been analyzed using the theoretical model for DNA curvature. The curvature dispersion calculated by cyclically permuting the sequences revealed that the molecular bend loci were delocalized throughout the nucleosome core region and had varying degrees of intrinsic curvature. The higher order structures associated with nucleosomes of C.elegans and chicken calculated from the sequences revealed heterogeneity with respect to the deviation of the DNA axis. The results points to the possibility of context dependent curvature of varying degrees to be associated with nucleosomal DNA.
Schneider, T D
2001-12-01
The sequence logo for DNA binding sites of the bacteriophage P1 replication protein RepA shows unusually high sequence conservation ( approximately 2 bits) at a minor groove that faces RepA. However, B-form DNA can support only 1 bit of sequence conservation via contacts into the minor groove. The high conservation in RepA sites therefore implies a distorted DNA helix with direct or indirect contacts to the protein. Here I show that a high minor groove conservation signature also appears in sequence logos of sites for other replication origin binding proteins (Rts1, DnaA, P4 alpha, EBNA1, ORC) and promoter binding proteins (sigma(70), sigma(D) factors). This finding implies that DNA binding proteins generally use non-B-form DNA distortion such as base flipping to initiate replication and transcription.
Sequential addition of short DNA oligos in DNA-polymerase-based synthesis reactions
Gardner, Shea N; Mariella, Jr., Raymond P; Christian, Allen T; Young, Jennifer A; Clague, David S
2013-06-25
A method of preselecting a multiplicity of DNA sequence segments that will comprise the DNA molecule of user-defined sequence, separating the DNA sequence segments temporally, and combining the multiplicity of DNA sequence segments with at least one polymerase enzyme wherein the multiplicity of DNA sequence segments join to produce the DNA molecule of user-defined sequence. Sequence segments may be of length n, where n is an odd integer. In one embodiment the length of desired hybridizing overlap is specified by the user and the sequences and the protocol for combining them are guided by computational (bioinformatics) predictions. In one embodiment sequence segments are combined from multiple reading frames to span the same region of a sequence, so that multiple desired hybridizations may occur with different overlap lengths.
Method for rapid base sequencing in DNA and RNA
Jett, J.H.; Keller, R.A.; Martin, J.C.; Moyzis, R.K.; Ratliff, R.L.; Shera, E.B.; Stewart, C.C.
1987-10-07
A method is provided for the rapid base sequencing of DNA or RNA fragments wherein a single fragment of DNA or RNA is provided with identifiable bases and suspended in a moving flow stream. An exonuclease sequentially cleaves individual bases from the end of the suspended fragment. The moving flow stream maintains the cleaved bases in an orderly train for subsequent detection and identification. In a particular embodiment, individual bases forming the DNA or RNA fragments are individually tagged with a characteristic fluorescent dye. The train of bases is then excited to fluorescence with an output spectrum characteristic of the individual bases. Accordingly, the base sequence of the original DNA or RNA fragment can be reconstructed. 2 figs.
Method for rapid base sequencing in DNA and RNA
Jett, J.H.; Keller, R.A.; Martin, J.C.; Moyzis, R.K.; Ratliff, R.L.; Shera, E.B.; Stewart, C.C.
1990-10-09
A method is provided for the rapid base sequencing of DNA or RNA fragments wherein a single fragment of DNA or RNA is provided with identifiable bases and suspended in a moving flow stream. An exonuclease sequentially cleaves individual bases from the end of the suspended fragment. The moving flow stream maintains the cleaved bases in an orderly train for subsequent detection and identification. In a particular embodiment, individual bases forming the DNA or RNA fragments are individually tagged with a characteristic fluorescent dye. The train of bases is then excited to fluorescence with an output spectrum characteristic of the individual bases. Accordingly, the base sequence of the original DNA or RNA fragment can be reconstructed. 2 figs.
Method for rapid base sequencing in DNA and RNA
Jett, James H.; Keller, Richard A.; Martin, John C.; Moyzis, Robert K.; Ratliff, Robert L.; Shera, E. Brooks; Stewart, Carleton C.
1990-01-01
A method is provided for the rapid base sequencing of DNA or RNA fragments wherein a single fragment of DNA or RNA is provided with identifiable bases and suspended in a moving flow stream. An exonuclease sequentially cleaves individual bases from the end of the suspended fragment. The moving flow stream maintains the cleaved bases in an orderly train for subsequent detection and identification. In a particular embodiment, individual bases forming the DNA or RNA fragments are individually tagged with a characteristic fluorescent dye. The train of bases is then excited to fluorescence with an output spectrum characteristic of the individual bases. Accordingly, the base sequence of the original DNA or RNA fragment can be reconstructed.
DNA barcode goes two-dimensions: DNA QR code web server.
Liu, Chang; Shi, Linchun; Xu, Xiaolan; Li, Huan; Xing, Hang; Liang, Dong; Jiang, Kun; Pang, Xiaohui; Song, Jingyuan; Chen, Shilin
2012-01-01
The DNA barcoding technology uses a standard region of DNA sequence for species identification and discovery. At present, "DNA barcode" actually refers to DNA sequences, which are not amenable to information storage, recognition, and retrieval. Our aim is to identify the best symbology that can represent DNA barcode sequences in practical applications. A comprehensive set of sequences for five DNA barcode markers ITS2, rbcL, matK, psbA-trnH, and CO1 was used as the test data. Fifty-three different types of one-dimensional and ten two-dimensional barcode symbologies were compared based on different criteria, such as coding capacity, compression efficiency, and error detection ability. The quick response (QR) code was found to have the largest coding capacity and relatively high compression ratio. To facilitate the further usage of QR code-based DNA barcodes, a web server was developed and is accessible at http://qrfordna.dnsalias.org. The web server allows users to retrieve the QR code for a species of interests, convert a DNA sequence to and from a QR code, and perform species identification based on local and global sequence similarities. In summary, the first comprehensive evaluation of various barcode symbologies has been carried out. The QR code has been found to be the most appropriate symbology for DNA barcode sequences. A web server has also been constructed to allow biologists to utilize QR codes in practical DNA barcoding applications.
Single-Molecule Electrical Random Resequencing of DNA and RNA
NASA Astrophysics Data System (ADS)
Ohshiro, Takahito; Matsubara, Kazuki; Tsutsui, Makusu; Furuhashi, Masayuki; Taniguchi, Masateru; Kawai, Tomoji
2012-07-01
Two paradigm shifts in DNA sequencing technologies--from bulk to single molecules and from optical to electrical detection--are expected to realize label-free, low-cost DNA sequencing that does not require PCR amplification. It will lead to development of high-throughput third-generation sequencing technologies for personalized medicine. Although nanopore devices have been proposed as third-generation DNA-sequencing devices, a significant milestone in these technologies has been attained by demonstrating a novel technique for resequencing DNA using electrical signals. Here we report single-molecule electrical resequencing of DNA and RNA using a hybrid method of identifying single-base molecules via tunneling currents and random sequencing. Our method reads sequences of nine types of DNA oligomers. The complete sequence of 5'-UGAGGUA-3' from the let-7 microRNA family was also identified by creating a composite of overlapping fragment sequences, which was randomly determined using tunneling current conducted by single-base molecules as they passed between a pair of nanoelectrodes.
Osmylated DNA, a novel concept for sequencing DNA using nanopores
NASA Astrophysics Data System (ADS)
Kanavarioti, Anastassia
2015-03-01
Saenger sequencing has led the advances in molecular biology, while faster and cheaper next generation technologies are urgently needed. A newer approach exploits nanopores, natural or solid-state, set in an electrical field, and obtains base sequence information from current variations due to the passage of a ssDNA molecule through the pore. A hurdle in this approach is the fact that the four bases are chemically comparable to each other which leads to small differences in current obstruction. ‘Base calling’ becomes even more challenging because most nanopores sense a short sequence and not individual bases. Perhaps sequencing DNA via nanopores would be more manageable, if only the bases were two, and chemically very different from each other; a sequence of 1s and 0s comes to mind. Osmylated DNA comes close to such a sequence of 1s and 0s. Osmylation is the addition of osmium tetroxide bipyridine across the C5-C6 double bond of the pyrimidines. Osmylation adds almost 400% mass to the reactive base, creates a sterically and electronically notably different molecule, labeled 1, compared to the unreactive purines, labeled 0. If osmylated DNA were successfully sequenced, the result would be a sequence of osmylated pyrimidines (1), and purines (0), and not of the actual nucleobases. To solve this problem we studied the osmylation reaction with short oligos and with M13mp18, a long ssDNA, developed a UV-vis assay to measure extent of osmylation, and designed two protocols. Protocol A uses mild conditions and yields osmylated thymidines (1), while leaving the other three bases (0) practically intact. Protocol B uses harsher conditions and effectively osmylates both pyrimidines, but not the purines. Applying these two protocols also to the complementary of the target polynucleotide yields a total of four osmylated strands that collectively could define the actual base sequence of the target DNA.
UV-Visible Spectroscopy-Based Quantification of Unlabeled DNA Bound to Gold Nanoparticles.
Baldock, Brandi L; Hutchison, James E
2016-12-20
DNA-functionalized gold nanoparticles have been increasingly applied as sensitive and selective analytical probes and biosensors. The DNA ligands bound to a nanoparticle dictate its reactivity, making it essential to know the type and number of DNA strands bound to the nanoparticle surface. Existing methods used to determine the number of DNA strands per gold nanoparticle (AuNP) require that the sequences be fluorophore-labeled, which may affect the DNA surface coverage and reactivity of the nanoparticle and/or require specialized equipment and other fluorophore-containing reagents. We report a UV-visible-based method to conveniently and inexpensively determine the number of DNA strands attached to AuNPs of different core sizes. When this method is used in tandem with a fluorescence dye assay, it is possible to determine the ratio of two unlabeled sequences of different lengths bound to AuNPs. Two sizes of citrate-stabilized AuNPs (5 and 12 nm) were functionalized with mixtures of short (5 base) and long (32 base) disulfide-terminated DNA sequences, and the ratios of sequences bound to the AuNPs were determined using the new method. The long DNA sequence was present as a lower proportion of the ligand shell than in the ligand exchange mixture, suggesting it had a lower propensity to bind the AuNPs than the short DNA sequence. The ratio of DNA sequences bound to the AuNPs was not the same for the large and small AuNPs, which suggests that the radius of curvature had a significant influence on the assembly of DNA strands onto the AuNPs.
Sequence periodicity in nucleosomal DNA and intrinsic curvature
2010-01-01
Background Most eukaryotic DNA contained in the nucleus is packaged by wrapping DNA around histone octamers. Histones are ubiquitous and bind most regions of chromosomal DNA. In order to achieve smooth wrapping of the DNA around the histone octamer, the DNA duplex should be able to deform and should possess intrinsic curvature. The deformability of DNA is a result of the non-parallelness of base pair stacks. The stacking interaction between base pairs is sequence dependent. The higher the stacking energy the more rigid the DNA helix, thus it is natural to expect that sequences that are involved in wrapping around the histone octamer should be unstacked and possess intrinsic curvature. Intrinsic curvature has been shown to be dictated by the periodic recurrence of certain dinucleotides. Several genome-wide studies directed towards mapping of nucleosome positions have revealed periodicity associated with certain stretches of sequences. In the current study, these sequences have been analyzed with a view to understand their sequence-dependent structures. Results Higher order DNA structures and the distribution of molecular bend loci associated with 146 base nucleosome core DNA sequence from C. elegans and chicken have been analyzed using the theoretical model for DNA curvature. The curvature dispersion calculated by cyclically permuting the sequences revealed that the molecular bend loci were delocalized throughout the nucleosome core region and had varying degrees of intrinsic curvature. Conclusions The higher order structures associated with nucleosomes of C.elegans and chicken calculated from the sequences revealed heterogeneity with respect to the deviation of the DNA axis. The results points to the possibility of context dependent curvature of varying degrees to be associated with nucleosomal DNA. PMID:20487515
2000-08-01
4). Sequence recognition of all four DNA bases is achieved by positioning an N- methylimidazole opposite guanine or N-methylpyrrole opposite...unique sequences of DNA based upon selective binding motifs to all four DNA bases , although relatively little is known about the ability of these agents to
DNABIT Compress – Genome compression algorithm
Rajarajeswari, Pothuraju; Apparao, Allam
2011-01-01
Data compression is concerned with how information is organized in data. Efficient storage means removal of redundancy from the data being stored in the DNA molecule. Data compression algorithms remove redundancy and are used to understand biologically important molecules. We present a compression algorithm, “DNABIT Compress” for DNA sequences based on a novel algorithm of assigning binary bits for smaller segments of DNA bases to compress both repetitive and non repetitive DNA sequence. Our proposed algorithm achieves the best compression ratio for DNA sequences for larger genome. Significantly better compression results show that “DNABIT Compress” algorithm is the best among the remaining compression algorithms. While achieving the best compression ratios for DNA sequences (Genomes),our new DNABIT Compress algorithm significantly improves the running time of all previous DNA compression programs. Assigning binary bits (Unique BIT CODE) for (Exact Repeats, Reverse Repeats) fragments of DNA sequence is also a unique concept introduced in this algorithm for the first time in DNA compression. This proposed new algorithm could achieve the best compression ratio as much as 1.58 bits/bases where the existing best methods could not achieve a ratio less than 1.72 bits/bases. PMID:21383923
Sequential addition of short DNA oligos in DNA-polymerase-based synthesis reactions
Gardner, Shea N [San Leandro, CA; Mariella, Jr., Raymond P.; Christian, Allen T [Tracy, CA; Young, Jennifer A [Berkeley, CA; Clague, David S [Livermore, CA
2011-01-18
A method of fabricating a DNA molecule of user-defined sequence. The method comprises the steps of preselecting a multiplicity of DNA sequence segments that will comprise the DNA molecule of user-defined sequence, separating the DNA sequence segments temporally, and combining the multiplicity of DNA sequence segments with at least one polymerase enzyme wherein the multiplicity of DNA sequence segments join to produce the DNA molecule of user-defined sequence. Sequence segments may be of length n, where n is an even or odd integer. In one embodiment the length of desired hybridizing overlap is specified by the user and the sequences and the protocol for combining them are guided by computational (bioinformatics) predictions. In one embodiment sequence segments are combined from multiple reading frames to span the same region of a sequence, so that multiple desired hybridizations may occur with different overlap lengths. In one embodiment starting sequence fragments are of different lengths, n, n+1, n+2, etc.
DNA Barcode Goes Two-Dimensions: DNA QR Code Web Server
Li, Huan; Xing, Hang; Liang, Dong; Jiang, Kun; Pang, Xiaohui; Song, Jingyuan; Chen, Shilin
2012-01-01
The DNA barcoding technology uses a standard region of DNA sequence for species identification and discovery. At present, “DNA barcode” actually refers to DNA sequences, which are not amenable to information storage, recognition, and retrieval. Our aim is to identify the best symbology that can represent DNA barcode sequences in practical applications. A comprehensive set of sequences for five DNA barcode markers ITS2, rbcL, matK, psbA-trnH, and CO1 was used as the test data. Fifty-three different types of one-dimensional and ten two-dimensional barcode symbologies were compared based on different criteria, such as coding capacity, compression efficiency, and error detection ability. The quick response (QR) code was found to have the largest coding capacity and relatively high compression ratio. To facilitate the further usage of QR code-based DNA barcodes, a web server was developed and is accessible at http://qrfordna.dnsalias.org. The web server allows users to retrieve the QR code for a species of interests, convert a DNA sequence to and from a QR code, and perform species identification based on local and global sequence similarities. In summary, the first comprehensive evaluation of various barcode symbologies has been carried out. The QR code has been found to be the most appropriate symbology for DNA barcode sequences. A web server has also been constructed to allow biologists to utilize QR codes in practical DNA barcoding applications. PMID:22574113
Analysis of DNA Sequences by an Optical Time-Integrating Correlator: Proposal
1991-11-01
OF THE PROBLEM AND CURRENT TECHNOLOGY 2 3.0 TIME-INTEGRATING CORRELATOR 2 4.0 REPRESENTATIONS OF THE DNA BASES 8 5.0 DNA ANALYSIS STRATEGY 8 6.0... DNA bases where each base is represented by a 7-bits long pseudorandom sequence. 9 Figure 5: The flow of data in a DNA analysis system based on an...logarithmic scale and a linear scale. 15 x LIST OF TABLES PAGE Table 1: Short representations of the DNA bases where each base is represented by 7-bits
cgDNAweb: a web interface to the cgDNA sequence-dependent coarse-grain model of double-stranded DNA.
De Bruin, Lennart; Maddocks, John H
2018-06-14
The sequence-dependent statistical mechanical properties of fragments of double-stranded DNA is believed to be pertinent to its biological function at length scales from a few base pairs (or bp) to a few hundreds of bp, e.g. indirect read-out protein binding sites, nucleosome positioning sequences, phased A-tracts, etc. In turn, the equilibrium statistical mechanics behaviour of DNA depends upon its ground state configuration, or minimum free energy shape, as well as on its fluctuations as governed by its stiffness (in an appropriate sense). We here present cgDNAweb, which provides browser-based interactive visualization of the sequence-dependent ground states of double-stranded DNA molecules, as predicted by the underlying cgDNA coarse-grain rigid-base model of fragments with arbitrary sequence. The cgDNAweb interface is specifically designed to facilitate comparison between ground state shapes of different sequences. The server is freely available at cgDNAweb.epfl.ch with no login requirement.
Mapping Base Modifications in DNA by Transverse-Current Sequencing
NASA Astrophysics Data System (ADS)
Alvarez, Jose R.; Skachkov, Dmitry; Massey, Steven E.; Kalitsov, Alan; Velev, Julian P.
2018-02-01
Sequencing DNA modifications and lesions, such as methylation of cytosine and oxidation of guanine, is even more important and challenging than sequencing the genome itself. The traditional methods for detecting DNA modifications are either insensitive to these modifications or require additional processing steps to identify a particular type of modification. Transverse-current sequencing in nanopores can potentially identify the canonical bases and base modifications in the same run. In this work, we demonstrate that the most common DNA epigenetic modifications and lesions can be detected with any predefined accuracy based on their tunneling current signature. Our results are based on simulations of the nanopore tunneling current through DNA molecules, calculated using nonequilibrium electron-transport methodology within an effective multiorbital model derived from first-principles calculations, followed by a base-calling algorithm accounting for neighbor current-current correlations. This methodology can be integrated with existing experimental techniques to improve base-calling fidelity.
Method for rapid base sequencing in DNA and RNA with two base labeling
Jett, J.H.; Keller, R.A.; Martin, J.C.; Posner, R.G.; Marrone, B.L.; Hammond, M.L.; Simpson, D.J.
1995-04-11
A method is described for rapid-base sequencing in DNA and RNA with two-base labeling and employing fluorescent detection of single molecules at two wavelengths. Bases modified to accept fluorescent labels are used to replicate a single DNA or RNA strand to be sequenced. The bases are then sequentially cleaved from the replicated strand, excited with a chosen spectrum of electromagnetic radiation, and the fluorescence from individual, tagged bases detected in the order of cleavage from the strand. 4 figures.
Method for rapid base sequencing in DNA and RNA with two base labeling
Jett, James H.; Keller, Richard A.; Martin, John C.; Posner, Richard G.; Marrone, Babetta L.; Hammond, Mark L.; Simpson, Daniel J.
1995-01-01
Method for rapid-base sequencing in DNA and RNA with two-base labeling and employing fluorescent detection of single molecules at two wavelengths. Bases modified to accept fluorescent labels are used to replicate a single DNA or RNA strand to be sequenced. The bases are then sequentially cleaved from the replicated strand, excited with a chosen spectrum of electromagnetic radiation, and the fluorescence from individual, tagged bases detected in the order of cleavage from the strand.
Ludgate, Jackie L; Wright, James; Stockwell, Peter A; Morison, Ian M; Eccles, Michael R; Chatterjee, Aniruddha
2017-08-31
Formalin fixed paraffin embedded (FFPE) tumor samples are a major source of DNA from patients in cancer research. However, FFPE is a challenging material to work with due to macromolecular fragmentation and nucleic acid crosslinking. FFPE tissue particularly possesses challenges for methylation analysis and for preparing sequencing-based libraries relying on bisulfite conversion. Successful bisulfite conversion is a key requirement for sequencing-based methylation analysis. Here we describe a complete and streamlined workflow for preparing next generation sequencing libraries for methylation analysis from FFPE tissues. This includes, counting cells from FFPE blocks and extracting DNA from FFPE slides, testing bisulfite conversion efficiency with a polymerase chain reaction (PCR) based test, preparing reduced representation bisulfite sequencing libraries and massively parallel sequencing. The main features and advantages of this protocol are: An optimized method for extracting good quality DNA from FFPE tissues. An efficient bisulfite conversion and next generation sequencing library preparation protocol that uses 50 ng DNA from FFPE tissue. Incorporation of a PCR-based test to assess bisulfite conversion efficiency prior to sequencing. We provide a complete workflow and an integrated protocol for performing DNA methylation analysis at the genome-scale and we believe this will facilitate clinical epigenetic research that involves the use of FFPE tissue.
Dendritic Cell-Based Immunotherapy of Breast Cancer: Modulation by CpG DNA
2005-09-01
tumor-associated antigens and bacterial DNA oligodeoxynucleotides containing unmethylated CpG sequences (CpG DNA) further augment the immune priming...associated antigens by cytotoxic T lymphocytes, and bacterial DNA oligodeoxy- nucleotides containing unmethylated CpG sequences (CpG DNA) can further...further amplify their immunostimulatory capacity and bacterial DNA oligodeoxynucleotides (ODN) containing unmethylated CpG sequences (CpG DNA) provide such
Su, Jiao; Zhang, Haijie; Jiang, Bingying; Zheng, Huzhi; Chai, Yaqin; Yuan, Ruo; Xiang, Yun
2011-11-15
We report an ultrasensitive electrochemical approach for the detection of uropathogen sequence-specific DNA target. The sensing strategy involves a dual signal amplification process, which combines the signal enhancement by the enzymatic target recycling technique with the sensitivity improvement by the quantum dot (QD) layer-by-layer (LBL) assembled labels. The enzyme-based catalytic target DNA recycling process results in the use of each target DNA sequence for multiple times and leads to direct amplification of the analytical signal. Moreover, the LBL assembled QD labels can further enhance the sensitivity of the sensing system. The coupling of these two effective signal amplification strategies thus leads to low femtomolar (5fM) detection of the target DNA sequences. The proposed strategy also shows excellent discrimination between the target DNA and the single-base mismatch sequences. The advantageous intrinsic sequence-independent property of exonuclease III over other sequence-dependent enzymes makes our new dual signal amplification system a general sensing platform for monitoring ultralow level of various types of target DNA sequences. Copyright © 2011 Elsevier B.V. All rights reserved.
An Optimal Seed Based Compression Algorithm for DNA Sequences
Gopalakrishnan, Gopakumar; Karunakaran, Muralikrishnan
2016-01-01
This paper proposes a seed based lossless compression algorithm to compress a DNA sequence which uses a substitution method that is similar to the LempelZiv compression scheme. The proposed method exploits the repetition structures that are inherent in DNA sequences by creating an offline dictionary which contains all such repeats along with the details of mismatches. By ensuring that only promising mismatches are allowed, the method achieves a compression ratio that is at par or better than the existing lossless DNA sequence compression algorithms. PMID:27555868
Thomas, W. Kelley; Vida, J. T.; Frisse, Linda M.; Mundo, Manuel; Baldwin, James G.
1997-01-01
To effectively integrate DNA sequence analysis and classical nematode taxonomy, we must be able to obtain DNA sequences from formalin-fixed specimens. Microdissected sections of nematodes were removed from specimens fixed in formalin, using standard protocols and without destroying morphological features. The fixed sections provided sufficient template for multiple polymerase chain reaction-based DNA sequence analyses. PMID:19274156
Van Kreijl, C F; Bos, J L
1977-01-01
The repeating nucleotide sequence of 68 base pairs in the mtDNA from an ethidium-induced cytoplasmic petite mutant of yeast has been determined. For sequence analysis specifically primed and terminated RNA copies, obtained by in vitro transcription of the separated strands, were use. The sequence consists of 66 consecutive AT base pairs flanked by two GC pairs and comprises nearly all of the mutant mitochondrial genome. The sequence, moreover, also represents the first part of wild-type mtDNA sequence so far. Images PMID:198740
DNA sequence analysis with droplet-based microfluidics
Abate, Adam R.; Hung, Tony; Sperling, Ralph A.; Mary, Pascaline; Rotem, Assaf; Agresti, Jeremy J.; Weiner, Michael A.; Weitz, David A.
2014-01-01
Droplet-based microfluidic techniques can form and process micrometer scale droplets at thousands per second. Each droplet can house an individual biochemical reaction, allowing millions of reactions to be performed in minutes with small amounts of total reagent. This versatile approach has been used for engineering enzymes, quantifying concentrations of DNA in solution, and screening protein crystallization conditions. Here, we use it to read the sequences of DNA molecules with a FRET-based assay. Using probes of different sequences, we interrogate a target DNA molecule for polymorphisms. With a larger probe set, additional polymorphisms can be interrogated as well as targets of arbitrary sequence. PMID:24185402
DOE Office of Scientific and Technical Information (OSTI.GOV)
Benasutti, M.; Ejadi, S.; Whitlow, M.D.
The mutagenic and carcinogenic chemical aflatoxin B/sub 1/ (AFB/sub 1/) reacts almost exclusively at the N(7)-position of guanine following activation to its reactive form, the 8,9-epoxide (AFB/sub 1/ oxide). In general N(7)-guanine adducts yield DNA strand breaks when heated in base, a property that serves as the basis for the Maxam-Gilbert DNA sequencing reaction specific for guanine. Using DNA sequencing methods, other workers have shown that AFB/sub 1/ oxide gives strand breaks at positions of guanines; however, the guanine bands varied in intensity. This phenomenon has been used to infer that AFB/sub 1/ oxide prefers to react with guanines inmore » some sequence contexts more than in others and has been referred to as sequence specificity of binding. Herein, data on the reaction of AFB/sub 1/ oxide with several synthetic DNA polymers with different sequences are presented, and (following hydrolysis) adduct levels are determine by high-pressure liquid chromatography. These results reveal that for AFB/sub 1/ oxide (1) the N(7)-guanine adduct is the major adduct found in all of the DNA polymers, (2) adduct levels vary in different sequences, and, thus, sequence specificity is also observed by this more direct method, and (3) the intensity of bands in DNA sequencing gels is likely to reflect adduct levels formed at the N(7)-position of guanine. Knowing this, a reinvestigation of the reactivity of guanines in different DNA sequences using DNA sequencing methods was undertaken. Methods are developed to determine the X (5'-side) base and the Y (3'-side) base are most influential in determining guanine reactivity. These rules in conjunction with molecular modeling studies were used to assess the binding sites that might be utilized by AFB/sub 1/ oxide in its reaction with DNA.« less
Image Encryption Algorithm Based on Hyperchaotic Maps and Nucleotide Sequences Database
2017-01-01
Image encryption technology is one of the main means to ensure the safety of image information. Using the characteristics of chaos, such as randomness, regularity, ergodicity, and initial value sensitiveness, combined with the unique space conformation of DNA molecules and their unique information storage and processing ability, an efficient method for image encryption based on the chaos theory and a DNA sequence database is proposed. In this paper, digital image encryption employs a process of transforming the image pixel gray value by using chaotic sequence scrambling image pixel location and establishing superchaotic mapping, which maps quaternary sequences and DNA sequences, and by combining with the logic of the transformation between DNA sequences. The bases are replaced under the displaced rules by using DNA coding in a certain number of iterations that are based on the enhanced quaternary hyperchaotic sequence; the sequence is generated by Chen chaos. The cipher feedback mode and chaos iteration are employed in the encryption process to enhance the confusion and diffusion properties of the algorithm. Theoretical analysis and experimental results show that the proposed scheme not only demonstrates excellent encryption but also effectively resists chosen-plaintext attack, statistical attack, and differential attack. PMID:28392799
"First generation" automated DNA sequencing technology.
Slatko, Barton E; Kieleczawa, Jan; Ju, Jingyue; Gardner, Andrew F; Hendrickson, Cynthia L; Ausubel, Frederick M
2011-10-01
Beginning in the 1980s, automation of DNA sequencing has greatly increased throughput, reduced costs, and enabled large projects to be completed more easily. The development of automation technology paralleled the development of other aspects of DNA sequencing: better enzymes and chemistry, separation and imaging technology, sequencing protocols, robotics, and computational advancements (including base-calling algorithms with quality scores, database developments, and sequence analysis programs). Despite the emergence of high-throughput sequencing platforms, automated Sanger sequencing technology remains useful for many applications. This unit provides background and a description of the "First-Generation" automated DNA sequencing technology. It also includes protocols for using the current Applied Biosystems (ABI) automated DNA sequencing machines. © 2011 by John Wiley & Sons, Inc.
Fiannaca, Antonino; La Rosa, Massimo; Rizzo, Riccardo; Urso, Alfonso
2015-07-01
In this paper, an alignment-free method for DNA barcode classification that is based on both a spectral representation and a neural gas network for unsupervised clustering is proposed. In the proposed methodology, distinctive words are identified from a spectral representation of DNA sequences. A taxonomic classification of the DNA sequence is then performed using the sequence signature, i.e., the smallest set of k-mers that can assign a DNA sequence to its proper taxonomic category. Experiments were then performed to compare our method with other supervised machine learning classification algorithms, such as support vector machine, random forest, ripper, naïve Bayes, ridor, and classification tree, which also consider short DNA sequence fragments of 200 and 300 base pairs (bp). The experimental tests were conducted over 10 real barcode datasets belonging to different animal species, which were provided by the on-line resource "Barcode of Life Database". The experimental results showed that our k-mer-based approach is directly comparable, in terms of accuracy, recall and precision metrics, with the other classifiers when considering full-length sequences. In addition, we demonstrate the robustness of our method when a classification is performed task with a set of short DNA sequences that were randomly extracted from the original data. For example, the proposed method can reach the accuracy of 64.8% at the species level with 200-bp fragments. Under the same conditions, the best other classifier (random forest) reaches the accuracy of 20.9%. Our results indicate that we obtained a clear improvement over the other classifiers for the study of short DNA barcode sequence fragments. Copyright © 2015 Elsevier B.V. All rights reserved.
A DNA sequence obtained by replacement of the dopamine RNA aptamer bases is not an aptamer.
Álvarez-Martos, Isabel; Ferapontova, Elena E
2017-08-05
A unique specificity of the aptamer-ligand biorecognition and binding facilitates bioanalysis and biosensor development, contributing to discrimination of structurally related molecules, such as dopamine and other catecholamine neurotransmitters. The aptamer sequence capable of specific binding of dopamine is a 57 nucleotides long RNA sequence reported in 1997 (Biochemistry, 1997, 36, 9726). Later, it was suggested that the DNA homologue of the RNA aptamer retains the specificity of dopamine binding (Biochem. Biophys. Res. Commun., 2009, 388, 732). Here, we show that the DNA sequence obtained by the replacement of the RNA aptamer bases for their DNA analogues is not able of specific biorecognition of dopamine, in contrast to the original RNA aptamer sequence. This DNA sequence binds dopamine and structurally related catecholamine neurotransmitters non-specifically, as any DNA sequence, and, thus, is not an aptamer and cannot be used neither for in vivo nor in situ analysis of dopamine in the presence of structurally related neurotransmitters. Copyright © 2017 Elsevier Inc. All rights reserved.
Shinozuka, Hiroshi; Cogan, Noel O I; Shinozuka, Maiko; Marshall, Alexis; Kay, Pippa; Lin, Yi-Han; Spangenberg, German C; Forster, John W
2015-04-11
Fragmentation at random nucleotide locations is an essential process for preparation of DNA libraries to be used on massively parallel short-read DNA sequencing platforms. Although instruments for physical shearing, such as the Covaris S2 focused-ultrasonicator system, and products for enzymatic shearing, such as the Nextera technology and NEBNext dsDNA Fragmentase kit, are commercially available, a simple and inexpensive method is desirable for high-throughput sequencing library preparation. MspJI is a recently characterised restriction enzyme which recognises the sequence motif CNNR (where R = G or A) when the first base is modified to 5-methylcytosine or 5-hydroxymethylcytosine. A semi-random enzymatic DNA amplicon fragmentation method was developed based on the unique cleavage properties of MspJI. In this method, random incorporation of 5-methyl-2'-deoxycytidine-5'-triphosphate is achieved through DNA amplification with DNA polymerase, followed by DNA digestion with MspJI. Due to the recognition sequence of the enzyme, DNA amplicons are fragmented in a relatively sequence-independent manner. The size range of the resulting fragments was capable of control through optimisation of 5-methyl-2'-deoxycytidine-5'-triphosphate concentration in the reaction mixture. A library suitable for sequencing using the Illumina MiSeq platform was prepared and processed using the proposed method. Alignment of generated short reads to a reference sequence demonstrated a relatively high level of random fragmentation. The proposed method may be performed with standard laboratory equipment. Although the uniformity of coverage was slightly inferior to the Covaris physical shearing procedure, due to efficiencies of cost and labour, the method may be more suitable than existing approaches for implementation in large-scale sequencing activities, such as bacterial artificial chromosome (BAC)-based genome sequence assembly, pan-genomic studies and locus-targeted genotyping-by-sequencing.
Gene sequence analyses and other DNA-based methods for yeast species recognition
USDA-ARS?s Scientific Manuscript database
DNA sequence analyses, as well as other DNA-based methodologies, have transformed the way in which yeasts are identified. The focus of this chapter will be on the resolution of species using various types of DNA comparisons. In other chapters in this book, Rozpedowska, Piškur and Wolfe discuss mul...
Petkevičiūtė, D; Pasi, M; Gonzalez, O; Maddocks, J H
2014-11-10
cgDNA is a package for the prediction of sequence-dependent configuration-space free energies for B-form DNA at the coarse-grain level of rigid bases. For a fragment of any given length and sequence, cgDNA calculates the configuration of the associated free energy minimizer, i.e. the relative positions and orientations of each base, along with a stiffness matrix, which together govern differences in free energies. The model predicts non-local (i.e. beyond base-pair step) sequence dependence of the free energy minimizer. Configurations can be input or output in either the Curves+ definition of the usual helical DNA structural variables, or as a PDB file of coordinates of base atoms. We illustrate the cgDNA package by comparing predictions of free energy minimizers from (a) the cgDNA model, (b) time-averaged atomistic molecular dynamics (or MD) simulations, and (c) NMR or X-ray experimental observation, for (i) the Dickerson-Drew dodecamer and (ii) three oligomers containing A-tracts. The cgDNA predictions are rather close to those of the MD simulations, but many orders of magnitude faster to compute. Both the cgDNA and MD predictions are in reasonable agreement with the available experimental data. Our conclusion is that cgDNA can serve as a highly efficient tool for studying structural variations in B-form DNA over a wide range of sequences. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hancock, Stephen P.; Stella, Stefano; Cascio, Duilio
The abundant Fis nucleoid protein selectively binds poorly related DNA sequences with high affinities to regulate diverse DNA reactions. Fis binds DNA primarily through DNA backbone contacts and selects target sites by reading conformational properties of DNA sequences, most prominently intrinsic minor groove widths. High-affinity binding requires Fis-stabilized DNA conformational changes that vary depending on DNA sequence. In order to better understand the molecular basis for high affinity site recognition, we analyzed the effects of DNA sequence within and flanking the core Fis binding site on binding affinity and DNA structure. X-ray crystal structures of Fis-DNA complexes containing variable sequencesmore » in the noncontacted center of the binding site or variations within the major groove interfaces show that the DNA can adapt to the Fis dimer surface asymmetrically. We show that the presence and position of pyrimidine-purine base steps within the major groove interfaces affect both local DNA bending and minor groove compression to modulate affinities and lifetimes of Fis-DNA complexes. Sequences flanking the core binding site also modulate complex affinities, lifetimes, and the degree of local and global Fis-induced DNA bending. In particular, a G immediately upstream of the 15 bp core sequence inhibits binding and bending, and A-tracts within the flanking base pairs increase both complex lifetimes and global DNA curvatures. Taken together, our observations support a revised DNA motif specifying high-affinity Fis binding and highlight the range of conformations that Fis-bound DNA can adopt. Lastly, the affinities and DNA conformations of individual Fis-DNA complexes are likely to be tailored to their context-specific biological functions.« less
Hancock, Stephen P.; Stella, Stefano; Cascio, Duilio; ...
2016-03-09
The abundant Fis nucleoid protein selectively binds poorly related DNA sequences with high affinities to regulate diverse DNA reactions. Fis binds DNA primarily through DNA backbone contacts and selects target sites by reading conformational properties of DNA sequences, most prominently intrinsic minor groove widths. High-affinity binding requires Fis-stabilized DNA conformational changes that vary depending on DNA sequence. In order to better understand the molecular basis for high affinity site recognition, we analyzed the effects of DNA sequence within and flanking the core Fis binding site on binding affinity and DNA structure. X-ray crystal structures of Fis-DNA complexes containing variable sequencesmore » in the noncontacted center of the binding site or variations within the major groove interfaces show that the DNA can adapt to the Fis dimer surface asymmetrically. We show that the presence and position of pyrimidine-purine base steps within the major groove interfaces affect both local DNA bending and minor groove compression to modulate affinities and lifetimes of Fis-DNA complexes. Sequences flanking the core binding site also modulate complex affinities, lifetimes, and the degree of local and global Fis-induced DNA bending. In particular, a G immediately upstream of the 15 bp core sequence inhibits binding and bending, and A-tracts within the flanking base pairs increase both complex lifetimes and global DNA curvatures. Taken together, our observations support a revised DNA motif specifying high-affinity Fis binding and highlight the range of conformations that Fis-bound DNA can adopt. Lastly, the affinities and DNA conformations of individual Fis-DNA complexes are likely to be tailored to their context-specific biological functions.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Torella, JP; Lienert, F; Boehm, CR
2014-08-07
Recombination-based DNA construction methods, such as Gibson assembly, have made it possible to easily and simultaneously assemble multiple DNA parts, and they hold promise for the development and optimization of metabolic pathways and functional genetic circuits. Over time, however, these pathways and circuits have become more complex, and the increasing need for standardization and insulation of genetic parts has resulted in sequence redundancies-for example, repeated terminator and insulator sequences-that complicate recombination-based assembly. We and others have recently developed DNA assembly methods, which we refer to collectively as unique nucleotide sequence (UNS)-guided assembly, in which individual DNA parts are flanked withmore » UNSs to facilitate the ordered, recombination-based assembly of repetitive sequences. Here we present a detailed protocol for UNS-guided assembly that enables researchers to convert multiple DNA parts into sequenced, correctly assembled constructs, or into high-quality combinatorial libraries in only 2-3 d. If the DNA parts must be generated from scratch, an additional 2-5 d are necessary. This protocol requires no specialized equipment and can easily be implemented by a student with experience in basic cloning techniques.« less
Torella, Joseph P.; Lienert, Florian; Boehm, Christian R.; Chen, Jan-Hung; Way, Jeffrey C.; Silver, Pamela A.
2016-01-01
Recombination-based DNA construction methods, such as Gibson assembly, have made it possible to easily and simultaneously assemble multiple DNA parts and hold promise for the development and optimization of metabolic pathways and functional genetic circuits. Over time, however, these pathways and circuits have become more complex, and the increasing need for standardization and insulation of genetic parts has resulted in sequence redundancies — for example repeated terminator and insulator sequences — that complicate recombination-based assembly. We and others have recently developed DNA assembly methods that we refer to collectively as unique nucleotide sequence (UNS)-guided assembly, in which individual DNA parts are flanked with UNSs to facilitate the ordered, recombination-based assembly of repetitive sequences. Here we present a detailed protocol for UNS-guided assembly that enables researchers to convert multiple DNA parts into sequenced, correctly-assembled constructs, or into high-quality combinatorial libraries in only 2–3 days. If the DNA parts must be generated from scratch, an additional 2–5 days are necessary. This protocol requires no specialized equipment and can easily be implemented by a student with experience in basic cloning techniques. PMID:25101822
Saladino, R; Crestini, C; Mincione, E; Costanzo, G; Di Mauro, E; Negri, R
1997-11-01
We describe the reaction of formamide with 2'-deoxycytidine to give pyrimidine ring opening by nucleophilic addition on the electrophilic C(6) and C(4) positions. This information is confirmed by the analysis of the products of formamide attack on 2'-deoxycytidine, 5-methyl-2'-deoxycytidine, and 5-bromo-2'-deoxycytidine, residues when the latter are incorporated into oligonucleotides by DNA polymerase-driven polymerization and solid-phase phosphoramidite procedure. The increased sensitivity of 5-bromo-2'-deoxycytidine relative to that of 2'-deoxycytidine is pivotal for the improvement of the one-lane chemical DNA sequencing procedure based on the base-selective reaction of formamide with DNA. In many DNA sequencing cases it will in fact be possible to incorporate this base analogue into the DNA to be sequenced, thus providing a complete discrimination between its UV absorption signal and that of the thymidine residues. The wide spectrum of different sensitivities to formamide displayed by the 2'-deoxycytidine analogues solves, in the DNA single-lane chemical sequencing procedure, the possible source of errors due to low discrimination between C and T residues.
Sun, Xiaofan; Chen, Haohan; Wang, Shuling; Zhang, Yiping; Tian, Yaping; Zhou, Nandi
2018-08-27
A high-sensitive detection of sequence-specific DNA was established based on the formation of G-quadruplex-hemin complex through continuous hybridization chain reaction (HCR). Taking HIV DNA sequence as an example, a capture probe complementary to part of HIV DNA was firstly self-assembled onto the surface of Au electrode. Then a specially designed assistant probe with both terminals complementary to the target DNA and a G-quadruplex-forming sequence in the center was introduced into the detection solution. In the presence of both the target DNA and the assistant probe, the target DNA can be captured on the electrode surface and then a continuous HCR can be conducted due to the mutual recognition of the target DNA and the assistant probe, leading to the formation of a large number of G-quadruplex on the electrode surface. With the help of hemin, a pronounced electrochemical signal can be observed in differential pulse voltammetry (DPV), due to the formation of G-quadruplex-hemin complex. The peak current is linearly related with the logarithm of the concentration of the target DNA in the range from 10 fM to 10 pM. The electrochemical sensor has high selectivity to clearly discriminate single-base mismatched and three-base mismatched sequences from the original HIV DNA sequence. Moreover, the established DNA sensor was challenged by detection of HIV DNA in human serum samples, which showed the low detection limit of 6.3 fM. Thus it has great application prospect in the field of clinical diagnosis and environmental monitoring. Copyright © 2018 Elsevier B.V. All rights reserved.
Guo, Y C; Wang, H; Wu, H P; Zhang, M Q
2015-12-21
Aimed to address the defects of the large mean square error (MSE), and the slow convergence speed in equalizing the multi-modulus signals of the constant modulus algorithm (CMA), a multi-modulus algorithm (MMA) based on global artificial fish swarm (GAFS) intelligent optimization of DNA encoding sequences (GAFS-DNA-MMA) was proposed. To improve the convergence rate and reduce the MSE, this proposed algorithm adopted an encoding method based on DNA nucleotide chains to provide a possible solution to the problem. Furthermore, the GAFS algorithm, with its fast convergence and global search ability, was used to find the best sequence. The real and imaginary parts of the initial optimal weight vector of MMA were obtained through DNA coding of the best sequence. The simulation results show that the proposed algorithm has a faster convergence speed and smaller MSE in comparison with the CMA, the MMA, and the AFS-DNA-MMA.
Stewart, Mikaela; Dunlap, Tori; Dourlain, Elizabeth; Grant, Bryce; McFail-Isom, Lori
2013-01-01
The fine conformational subtleties of DNA structure modulate many fundamental cellular processes including gene activation/repression, cellular division, and DNA repair. Most of these cellular processes rely on the conformational heterogeneity of specific DNA sequences. Factors including those structural characteristics inherent in the particular base sequence as well as those induced through interaction with solvent components combine to produce fine DNA structural variation including helical flexibility and conformation. Cation-pi interactions between solvent cations or their first hydration shell waters and the faces of DNA bases form sequence selectively and contribute to DNA structural heterogeneity. In this paper, we detect and characterize the binding patterns found in cation-pi interactions between solvent cations and DNA bases in a set of high resolution x-ray crystal structures. Specifically, we found that monovalent cations (Tl+) and the polarized first hydration shell waters of divalent cations (Mg2+, Ca2+) form cation-pi interactions with DNA bases stabilizing unstacked conformations. When these cation-pi interactions are combined with electrostatic interactions a pattern of specific binding motifs is formed within the grooves. PMID:23940752
Stewart, Mikaela; Dunlap, Tori; Dourlain, Elizabeth; Grant, Bryce; McFail-Isom, Lori
2013-01-01
The fine conformational subtleties of DNA structure modulate many fundamental cellular processes including gene activation/repression, cellular division, and DNA repair. Most of these cellular processes rely on the conformational heterogeneity of specific DNA sequences. Factors including those structural characteristics inherent in the particular base sequence as well as those induced through interaction with solvent components combine to produce fine DNA structural variation including helical flexibility and conformation. Cation-pi interactions between solvent cations or their first hydration shell waters and the faces of DNA bases form sequence selectively and contribute to DNA structural heterogeneity. In this paper, we detect and characterize the binding patterns found in cation-pi interactions between solvent cations and DNA bases in a set of high resolution x-ray crystal structures. Specifically, we found that monovalent cations (Tl⁺) and the polarized first hydration shell waters of divalent cations (Mg²⁺, Ca²⁺) form cation-pi interactions with DNA bases stabilizing unstacked conformations. When these cation-pi interactions are combined with electrostatic interactions a pattern of specific binding motifs is formed within the grooves.
M. -S. Kim; N. B. Klopfenstein; J. W. Hanna; G. I. McDonald
2006-01-01
Phylogenetic and genetic relationships among 10 North American Armillaria species were analysed using sequence data from ribosomal DNA (rDNA), including intergenic spacer (IGS-1), internal transcribed spacers with associated 5.8S (ITS + 5.8S), and nuclear large subunit rDNA (nLSU), and amplified fragment length polymorphism (AFLP) markers. Based on rDNA sequence data,...
Hendrickson, Edwin R.; Payne, Jo Ann; Young, Roslyn M.; Starr, Mark G.; Perry, Michael P.; Fahnestock, Stephen; Ellis, David E.; Ebersole, Richard C.
2002-01-01
The environmental distribution of Dehalococcoides group organisms and their association with chloroethene-contaminated sites were examined. Samples from 24 chloroethene-dechlorinating sites scattered throughout North America and Europe were tested for the presence of members of the Dehalococcoides group by using a PCR assay developed to detect Dehalococcoides 16S rRNA gene (rDNA) sequences. Sequences identified by sequence analysis as sequences of members of the Dehalococcoides group were detected at 21 sites. Full dechlorination of chloroethenes to ethene occurred at these sites. Dehalococcoides sequences were not detected in samples from three sites at which partial dechlorination of chloroethenes occurred, where dechlorination appeared to stop at 1,2-cis-dichloroethene. Phylogenetic analysis of the 16S rDNA amplicons confirmed that Dehalococcoides sequences formed a unique 16S rDNA group. These 16S rDNA sequences were divided into three subgroups based on specific base substitution patterns in variable regions 2 and 6 of the Dehalococcoides 16S rDNA sequence. Analyses also demonstrated that specific base substitution patterns were signature patterns. The specific base substitutions distinguished the three sequence subgroups phylogenetically. These results demonstrated that members of the Dehalococcoides group are widely distributed in nature and can be found in a variety of geological formations and in different climatic zones. Furthermore, the association of these organisms with full dechlorination of chloroethenes suggests that they are promising candidates for engineered bioremediation and may be important contributors to natural attenuation of chloroethenes. PMID:11823182
Xu, Yi-Hua; Manoharan, Herbert T; Pitot, Henry C
2007-09-01
The bisulfite genomic sequencing technique is one of the most widely used techniques to study sequence-specific DNA methylation because of its unambiguous ability to reveal DNA methylation status to the order of a single nucleotide. One characteristic feature of the bisulfite genomic sequencing technique is that a number of sample sequence files will be produced from a single DNA sample. The PCR products of bisulfite-treated DNA samples cannot be sequenced directly because they are heterogeneous in nature; therefore they should be cloned into suitable plasmids and then sequenced. This procedure generates an enormous number of sample DNA sequence files as well as adding extra bases belonging to the plasmids to the sequence, which will cause problems in the final sequence comparison. Finding the methylation status for each CpG in each sample sequence is not an easy job. As a result CpG PatternFinder was developed for this purpose. The main functions of the CpG PatternFinder are: (i) to analyze the reference sequence to obtain CpG and non-CpG-C residue position information. (ii) To tailor sample sequence files (delete insertions and mark deletions from the sample sequence files) based on a configuration of ClustalW multiple alignment. (iii) To align sample sequence files with a reference file to obtain bisulfite conversion efficiency and CpG methylation status. And, (iv) to produce graphics, highlighted aligned sequence text and a summary report which can be easily exported to Microsoft Office suite. CpG PatternFinder is designed to operate cooperatively with BioEdit, a freeware on the internet. It can handle up to 100 files of sample DNA sequences simultaneously, and the total CpG pattern analysis process can be finished in minutes. CpG PatternFinder is an ideal software tool for DNA methylation studies to determine the differential methylation pattern in a large number of individuals in a population. Previously we developed the CpG Analyzer program; CpG PatternFinder is our further effort to create software tools for DNA methylation studies.
Wang, Yongjie; Kleespies, Regina G; Ramle, Moslim B; Jehle, Johannes A
2008-09-01
The genomic sequence analysis of many large dsDNA viruses is hampered by the lack of enough sample materials. Here, we report a whole genome amplification of the Oryctes rhinoceros nudivirus (OrNV) isolate Ma07 starting from as few as about 10 ng of purified viral DNA by application of phi29 DNA polymerase- and exonuclease-resistant random hexamer-based multiple displacement amplification (MDA) method. About 60 microg of high molecular weight DNA with fragment sizes of up to 25 kbp was amplified. A genomic DNA clone library was generated using the product DNA. After 8-fold sequencing coverage, the 127,615 bp of OrNV whole genome was sequenced successfully. The results demonstrate that the MDA-based whole genome amplification enables rapid access to genomic information from exiguous virus samples.
Shi, Liang; Khandurina, Julia; Ronai, Zsolt; Li, Bi-Yu; Kwan, Wai King; Wang, Xun; Guttman, András
2003-01-01
A capillary gel electrophoresis based automated DNA fraction collection technique was developed to support a novel DNA fragment-pooling strategy for expressed sequence tag (EST) library construction. The cDNA population is first cleaved by BsaJ I and EcoR I restriction enzymes, and then subpooled by selective ligation with specific adapters followed by polymerase chain reaction (PCR) amplification and labeling. Combination of this cDNA fingerprinting method with high-resolution capillary gel electrophoresis separation and precise fractionation of individual cDNA transcript representatives avoids redundant fragment selection and concomitant repetitive sequencing of abundant transcripts. Using a computer-controlled capillary electrophoresis device the transcript representatives were separated by their size and fractions were automatically collected in every 30 s into 96-well plates. The high resolving power of the sieving matrix ensured sequencing grade separation of the DNA fragments (i.e., single-base resolution) and successful fraction collection. Performance and precision of the fraction collection procedure was validated by PCR amplification of the collected DNA fragments followed by capillary electrophoresis analysis for size and purity verification. The collected and PCR-amplified transcript representatives, ranging up to several hundred base pairs, were then sequenced to create an EST library.
Detection of DNA Methylation by Whole-Genome Bisulfite Sequencing.
Li, Qing; Hermanson, Peter J; Springer, Nathan M
2018-01-01
DNA methylation plays an important role in the regulation of the expression of transposons and genes. Various methods have been developed to assay DNA methylation levels. Bisulfite sequencing is considered to be the "gold standard" for single-base resolution measurement of DNA methylation levels. Coupled with next-generation sequencing, whole-genome bisulfite sequencing (WGBS) allows DNA methylation to be evaluated at a genome-wide scale. Here, we described a protocol for WGBS in plant species with large genomes. This protocol has been successfully applied to assay genome-wide DNA methylation levels in maize and barley. This protocol has also been successfully coupled with sequence capture technology to assay DNA methylation levels in a targeted set of genomic regions.
Charge transport through DNA based electronic barriers
NASA Astrophysics Data System (ADS)
Patil, Sunil R.; Chawda, Vivek; Qi, Jianqing; Anantram, M. P.; Sinha, Niraj
2018-05-01
We report charge transport in electronic 'barriers' constructed by sequence engineering in DNA. Considering the ionization potentials of Thymine-Adenine (AT) and Guanine-Cytosine (GC) base pairs, we treat AT as 'barriers'. The effect of DNA conformation (A and B form) on charge transport is also investigated. Particularly, the effect of width of 'barriers' on hole transport is investigated. Density functional theory (DFT) calculations are performed on energy minimized DNA structures to obtain the electronic Hamiltonian. The quantum transport calculations are performed using the Landauer-Buttiker framework. Our main findings are contrary to previous studies. We find that a longer A-DNA with more AT base pairs can conduct better than shorter A-DNA with a smaller number of AT base pairs. We also find that some sequences of A-DNA can conduct better than a corresponding B-DNA with the same sequence. The counterions mediated charge transport and long range interactions are speculated to be responsible for counter-intuitive length and AT content dependence of conductance of A-DNA.
DNA cross-linking by dehydromonocrotaline lacks apparent base sequence preference.
Rieben, W Kurt; Coulombe, Roger A
2004-12-01
Pyrrolizidine alkaloids (PAs) are ubiquitous plant toxins, many of which, upon oxidation by hepatic mixed-function oxidases, become reactive bifunctional pyrrolic electrophiles that form DNA-DNA and DNA-protein cross-links. The anti-mitotic, toxic, and carcinogenic action of PAs is thought to be caused, at least in part, by these cross-links. We wished to determine whether the activated PA pyrrole dehydromonocrotaline (DHMO) exhibits base sequence preferences when cross-linked to a set of model duplex poly A-T 14-mer oligonucleotides with varying internal and/or end 5'-d(CG), 5'-d(GC), 5'-d(TA), 5'-d(CGCG), or 5'-d(GCGC) sequences. DHMO-DNA cross-links were assessed by electrophoretic mobility shift assay (EMSA) of 32P endlabeled oligonucleotides and by HPLC analysis of cross-linked DNAs enzymatically digested to their constituent deoxynucleosides. The degree of DNA cross-links depended upon the concentration of the pyrrole, but not on the base sequence of the oligonucleotide target. Likewise, HPLC chromatograms of cross-linked and digested DNAs showed no discernible sequence preference for any nucleotide. Added glutathione, tyrosine, cysteine, and aspartic acid, but not phenylalanine, threonine, serine, lysine, or methionine competed with DNA as alternate nucleophiles for cross-linking by DHMO. From these data it appears that DHMO exhibits no strong base preference when forming cross-links with DNA, and that some cellular nucleophiles can inhibit DNA cross-link formation.
Yin, Changchuan
2015-04-01
To apply digital signal processing (DSP) methods to analyze DNA sequences, the sequences first must be specially mapped into numerical sequences. Thus, effective numerical mappings of DNA sequences play key roles in the effectiveness of DSP-based methods such as exon prediction. Despite numerous mappings of symbolic DNA sequences to numerical series, the existing mapping methods do not include the genetic coding features of DNA sequences. We present a novel numerical representation of DNA sequences using genetic codon context (GCC) in which the numerical values are optimized by simulation annealing to maximize the 3-periodicity signal to noise ratio (SNR). The optimized GCC representation is then applied in exon and intron prediction by Short-Time Fourier Transform (STFT) approach. The results show the GCC method enhances the SNR values of exon sequences and thus increases the accuracy of predicting protein coding regions in genomes compared with the commonly used 4D binary representation. In addition, this study offers a novel way to reveal specific features of DNA sequences by optimizing numerical mappings of symbolic DNA sequences.
Manlig, Erika; Wahlberg, Per
2017-01-01
Abstract Sodium bisulphite treatment of DNA combined with next generation sequencing (NGS) is a powerful combination for the interrogation of genome-wide DNA methylation profiles. Library preparation for whole genome bisulphite sequencing (WGBS) is challenging due to side effects of the bisulphite treatment, which leads to extensive DNA damage. Recently, a new generation of methods for bisulphite sequencing library preparation have been devised. They are based on initial bisulphite treatment of the DNA, followed by adaptor tagging of single stranded DNA fragments, and enable WGBS using low quantities of input DNA. In this study, we present a novel approach for quick and cost effective WGBS library preparation that is based on splinted adaptor tagging (SPLAT) of bisulphite-converted single-stranded DNA. Moreover, we validate SPLAT against three commercially available WGBS library preparation techniques, two of which are based on bisulphite treatment prior to adaptor tagging and one is a conventional WGBS method. PMID:27899585
Hu, Yuhua; Xu, Xueqin; Liu, Qionghua; Wang, Ling; Lin, Zhenyu; Chen, Guonan
2014-09-02
A simple, ultrasensitive, and specific electrochemical biosensor was designed to determine the given DNA sequence of Bacillus subtilis by coupling target-induced strand displacement and nicking endonuclease signal amplification. The target DNA (TD, the DNA sequence from the hypervarient region of 16S rDNA of Bacillus subtilis) could be detected by the differential pulse voltammetry (DPV) in a range from 0.1 fM to 20 fM with the detection limit down to 0.08 fM at the 3s(blank) level. This electrochemical biosensor exhibits high distinction ability to single-base mismatch, double-bases mismatch, and noncomplementary DNA sequence, which may be expected to detect single-base mismatch and single nucleotide polymorphisms (SNPs). Moreover, the applicability of the designed biosensor for detecting the given DNA sequence from Bacillus subtilis was investigated. The result obtained by electrochemical method is approximately consistent with that by a real-time quantitative polymerase chain reaction detecting system (QPCR) with SYBR Green.
Berger, C; Berger, B; Parson, W
2012-01-01
In recent years, evidence from domestic dogs has increasingly been analyzed by forensic DNA testing. Especially, canine hairs have proved most suitable and practical due to the high rate of hair transfer occurring between dogs and humans. Starting with the description of a contamination-free sample handling procedure, we give a detailed workflow for sequencing hypervariable segments (HVS) of the mtDNA control region from canine evidence. After the hair material is lysed and the DNA extracted by Phenol/Chloroform, the amplification and sequencing strategy comprises the HVS I and II of the canine control region and is optimized for DNA of medium-to-low quality and quantity. The sequencing procedure is based on the Sanger Big-dye deoxy-terminator method and the separation of the sequencing reaction products is performed on a conventional multicolor fluorescence detection capillary electrophoresis platform. Finally, software-aided base calling and sequence interpretation are addressed exemplarily.
Bertolini, Francesca; Ghionda, Marco Ciro; D'Alessandro, Enrico; Geraci, Claudia; Chiofalo, Vincenzo; Fontanesi, Luca
2015-01-01
The identification of the species of origin of meat and meat products is an important issue to prevent and detect frauds that might have economic, ethical and health implications. In this paper we evaluated the potential of the next generation semiconductor based sequencing technology (Ion Torrent Personal Genome Machine) for the identification of DNA from meat species (pig, horse, cattle, sheep, rabbit, chicken, turkey, pheasant, duck, goose and pigeon) as well as from human and rat in DNA mixtures through the sequencing of PCR products obtained from different couples of universal primers that amplify 12S and 16S rRNA mitochondrial DNA genes. Six libraries were produced including PCR products obtained separately from 13 species or from DNA mixtures containing DNA from all species or only avian or only mammalian species at equimolar concentration or at 1:10 or 1:50 ratios for pig and horse DNA. Sequencing obtained a total of 33,294,511 called nucleotides of which 29,109,688 with Q20 (87.43%) in a total of 215,944 reads. Different alignment algorithms were used to assign the species based on sequence data. Error rate calculated after confirmation of the obtained sequences by Sanger sequencing ranged from 0.0003 to 0.02 for the different species. Correlation about the number of reads per species between different libraries was high for mammalian species (0.97) and lower for avian species (0.70). PCR competition limited the efficiency of amplification and sequencing for avian species for some primer pairs. Detection of low level of pig and horse DNA was possible with reads obtained from different primer pairs. The sequencing of the products obtained from different universal PCR primers could be a useful strategy to overcome potential problems of amplification. Based on these results, the Ion Torrent technology can be applied for the identification of meat species in DNA mixtures.
Bertolini, Francesca; Ghionda, Marco Ciro; D’Alessandro, Enrico; Geraci, Claudia; Chiofalo, Vincenzo; Fontanesi, Luca
2015-01-01
The identification of the species of origin of meat and meat products is an important issue to prevent and detect frauds that might have economic, ethical and health implications. In this paper we evaluated the potential of the next generation semiconductor based sequencing technology (Ion Torrent Personal Genome Machine) for the identification of DNA from meat species (pig, horse, cattle, sheep, rabbit, chicken, turkey, pheasant, duck, goose and pigeon) as well as from human and rat in DNA mixtures through the sequencing of PCR products obtained from different couples of universal primers that amplify 12S and 16S rRNA mitochondrial DNA genes. Six libraries were produced including PCR products obtained separately from 13 species or from DNA mixtures containing DNA from all species or only avian or only mammalian species at equimolar concentration or at 1:10 or 1:50 ratios for pig and horse DNA. Sequencing obtained a total of 33,294,511 called nucleotides of which 29,109,688 with Q20 (87.43%) in a total of 215,944 reads. Different alignment algorithms were used to assign the species based on sequence data. Error rate calculated after confirmation of the obtained sequences by Sanger sequencing ranged from 0.0003 to 0.02 for the different species. Correlation about the number of reads per species between different libraries was high for mammalian species (0.97) and lower for avian species (0.70). PCR competition limited the efficiency of amplification and sequencing for avian species for some primer pairs. Detection of low level of pig and horse DNA was possible with reads obtained from different primer pairs. The sequencing of the products obtained from different universal PCR primers could be a useful strategy to overcome potential problems of amplification. Based on these results, the Ion Torrent technology can be applied for the identification of meat species in DNA mixtures. PMID:25923709
Laser mass spectrometry for DNA sequencing, disease diagnosis, and fingerprinting
NASA Astrophysics Data System (ADS)
Chen, C. H. Winston; Taranenko, N. I.; Zhu, Y. F.; Chung, C. N.; Allman, S. L.
1997-05-01
Since laser mass spectrometry has the potential for achieving very fast DNA analysis, we recently applied it to DNA sequencing, DNA typing for fingerprinting, and DNA screening for disease diagnosis. Two different approaches for sequencing DNA have been successfully demonstrated. One is to sequence DNA with DNA ladders produced from Sanger's enzymatic method. The other is to do direct sequencing without DNA ladders. The need for quick DNA typing for identification purposes is critical for forensic application. Our preliminary results indicate laser mass spectrometry can possible be used for rapid DNA fingerprinting applications at a much lower cost than gel electrophoresis. Population screening for certain genetic disease can be a very efficient step to reducing medical costs through prevention. Since laser mass spectrometry can provide very fast DNA analysis, we applied laser mass spectrometry to disease diagnosis. Clinical samples with both base deletion and point mutation have been tested with complete success.
Fuller, Carl W.; Kumar, Shiv; Porel, Mintu; Chien, Minchen; Bibillo, Arek; Stranges, P. Benjamin; Dorwart, Michael; Tao, Chuanjuan; Li, Zengmin; Guo, Wenjing; Shi, Shundi; Korenblum, Daniel; Trans, Andrew; Aguirre, Anne; Liu, Edward; Harada, Eric T.; Pollard, James; Bhat, Ashwini; Cech, Cynthia; Yang, Alexander; Arnold, Cleoma; Palla, Mirkó; Hovis, Jennifer; Chen, Roger; Morozova, Irina; Kalachikov, Sergey; Russo, James J.; Kasianowicz, John J.; Davis, Randy; Roever, Stefan; Church, George M.; Ju, Jingyue
2016-01-01
DNA sequencing by synthesis (SBS) offers a robust platform to decipher nucleic acid sequences. Recently, we reported a single-molecule nanopore-based SBS strategy that accurately distinguishes four bases by electronically detecting and differentiating four different polymer tags attached to the 5′-phosphate of the nucleotides during their incorporation into a growing DNA strand catalyzed by DNA polymerase. Further developing this approach, we report here the use of nucleotides tagged at the terminal phosphate with oligonucleotide-based polymers to perform nanopore SBS on an α-hemolysin nanopore array platform. We designed and synthesized several polymer-tagged nucleotides using tags that produce different electrical current blockade levels and verified they are active substrates for DNA polymerase. A highly processive DNA polymerase was conjugated to the nanopore, and the conjugates were complexed with primer/template DNA and inserted into lipid bilayers over individually addressable electrodes of the nanopore chip. When an incoming complementary-tagged nucleotide forms a tight ternary complex with the primer/template and polymerase, the tag enters the pore, and the current blockade level is measured. The levels displayed by the four nucleotides tagged with four different polymers captured in the nanopore in such ternary complexes were clearly distinguishable and sequence-specific, enabling continuous sequence determination during the polymerase reaction. Thus, real-time single-molecule electronic DNA sequencing data with single-base resolution were obtained. The use of these polymer-tagged nucleotides, combined with polymerase tethering to nanopores and multiplexed nanopore sensors, should lead to new high-throughput sequencing methods. PMID:27091962
DOE Office of Scientific and Technical Information (OSTI.GOV)
Randall, Graham L.; Zechiedrich, E. L.; Pettitt, Bernard M.
2009-09-01
To understand how underwinding and overwinding the DNA helix affects its structure, we simulated 19 independent DNA systems with fixed degrees of twist using molecular dynamics in a system that does not allow writhe. Underwinding DNA induced spontaneous, sequence-dependent base flipping and local denaturation, while overwinding DNA induced the formation of Pauling-like DNA (P-DNA). The winding resulted in a bimodal state simultaneously including local structural failure and B-form DNA for both underwinding and extreme overwinding. Our simulations suggest that base flipping and local denaturation may provide a landscape influencing protein recognition of DNA sequence to affect, for examples, replication, transcriptionmore » and recombination. Additionally, our findings help explain results from singlemolecule experiments and demonstrate that elastic rod models are strictly valid on average only for unstressed or overwound DNA up to P-DNA formation. Finally, our data support a model in which base flipping can result from torsional stress.« less
A High-Throughput Process for the Solid-Phase Purification of Synthetic DNA Sequences
Grajkowski, Andrzej; Cieślak, Jacek; Beaucage, Serge L.
2017-01-01
An efficient process for the purification of synthetic phosphorothioate and native DNA sequences is presented. The process is based on the use of an aminopropylated silica gel support functionalized with aminooxyalkyl functions to enable capture of DNA sequences through an oximation reaction with the keto function of a linker conjugated to the 5′-terminus of DNA sequences. Deoxyribonucleoside phosphoramidites carrying this linker, as a 5′-hydroxyl protecting group, have been synthesized for incorporation into DNA sequences during the last coupling step of a standard solid-phase synthesis protocol executed on a controlled pore glass (CPG) support. Solid-phase capture of the nucleobase- and phosphate-deprotected DNA sequences released from the CPG support is demonstrated to proceed near quantitatively. Shorter than full-length DNA sequences are first washed away from the capture support; the solid-phase purified DNA sequences are then released from this support upon reaction with tetra-n-butylammonium fluoride in dry dimethylsulfoxide (DMSO) and precipitated in tetrahydrofuran (THF). The purity of solid-phase-purified DNA sequences exceeds 98%. The simulated high-throughput and scalability features of the solid-phase purification process are demonstrated without sacrificing purity of the DNA sequences. PMID:28628204
Mohammed, Monzoorul Haque; Ghosh, Tarini Shankar; Chadaram, Sudha; Mande, Sharmila S
2011-11-30
Obtaining accurate estimates of microbial diversity using rDNA profiling is the first step in most metagenomics projects. Consequently, most metagenomic projects spend considerable amounts of time, money and manpower for experimentally cloning, amplifying and sequencing the rDNA content in a metagenomic sample. In the second step, the entire genomic content of the metagenome is extracted, sequenced and analyzed. Since DNA sequences obtained in this second step also contain rDNA fragments, rapid in silico identification of these rDNA fragments would drastically reduce the cost, time and effort of current metagenomic projects by entirely bypassing the experimental steps of primer based rDNA amplification, cloning and sequencing. In this study, we present an algorithm called i-rDNA that can facilitate the rapid detection of 16S rDNA fragments from amongst millions of sequences in metagenomic data sets with high detection sensitivity. Performance evaluation with data sets/database variants simulating typical metagenomic scenarios indicates the significantly high detection sensitivity of i-rDNA. Moreover, i-rDNA can process a million sequences in less than an hour on a simple desktop with modest hardware specifications. In addition to the speed of execution, high sensitivity and low false positive rate, the utility of the algorithmic approach discussed in this paper is immense given that it would help in bypassing the entire experimental step of primer-based rDNA amplification, cloning and sequencing. Application of this algorithmic approach would thus drastically reduce the cost, time and human efforts invested in all metagenomic projects. A web-server for the i-rDNA algorithm is available at http://metagenomics.atc.tcs.com/i-rDNA/
USDA-ARS?s Scientific Manuscript database
A reassociation kinetics-based approach was used to reduce the complexity of genomic DNA from the Deutsch laboratory strain of the cattle tick, Rhipicephalus microplus, to facilitate genome sequencing. Selected genomic DNA (Cot value = 660) was sequenced using 454 GS FLX technology, resulting in 356...
Widespread recombination in published animal mtDNA sequences.
Tsaousis, A D; Martin, D P; Ladoukakis, E D; Posada, D; Zouros, E
2005-04-01
Mitochondrial DNA (mtDNA) recombination has been observed in several animal species, but there are doubts as to whether it is common or only occurs under special circumstances. Animal mtDNA sequences retrieved from public databases were unambiguously aligned and rigorously tested for evidence of recombination. At least 30 recombination events were detected among 186 alignments examined. Recombinant sequences were found in invertebrates and vertebrates, including primates. It appears that mtDNA recombination may occur regularly in the animal cell but rarely produces new haplotypes because of homoplasmy. Common animal mtDNA recombination would necessitate a reexamination of phylogenetic and biohistorical inference based on the assumption of clonal mtDNA transmission. Recombination may also have an important role in producing and purging mtDNA mutations and thus in mtDNA-based diseases and senescence.
Hurst, Sarah J; Han, Min Su; Lytton-Jean, Abigail K R; Mirkin, Chad A
2007-09-15
We have developed a novel competition assay that uses a gold nanoparticle (Au NP)-based, high-throughput colorimetric approach to screen the sequence selectivity of DNA-binding molecules. This assay hinges on the observation that the melting behavior of DNA-functionalized Au NP aggregates is sensitive to the concentration of the DNA-binding molecule in solution. When short, oligomeric hairpin DNA sequences were added to a reaction solution consisting of DNA-functionalized Au NP aggregates and DNA-binding molecules, these molecules may either bind to the Au NP aggregate interconnects or the hairpin stems based on their relative affinity for each. This relative affinity can be measured as a change in the melting temperature (Tm) of the DNA-modified Au NP aggregates in solution. As a proof of concept, we evaluated the selectivity of 4',6-diamidino-2-phenylindone (an AT-specific binder), ethidium bromide (a nonspecific binder), and chromomycin A (a GC-specific binder) for six sequences of hairpin DNA having different numbers of AT pairs in a five-base pair variable stem region. Our assay accurately and easily confirmed the known trends in selectivity for the DNA binders in question without the use of complicated instrumentation. This novel assay will be useful in assessing large libraries of potential drug candidates that work by binding DNA to form a drug/DNA complex.
Widespread Transient Hoogsteen Base-Pairs in Canonical Duplex DNA with Variable Energetics
Alvey, Heidi S.; Gottardo, Federico L.; Nikolova, Evgenia N.; Al-Hashimi, Hashim M.
2015-01-01
Hoogsteen base-pairing involves a 180 degree rotation of the purine base relative to Watson-Crick base-pairing within DNA duplexes, creating alternative DNA conformations that can play roles in recognition, damage induction, and replication. Here, using Nuclear Magnetic Resonance R1ρ relaxation dispersion, we show that transient Hoogsteen base-pairs occur across more diverse sequence and positional contexts than previously anticipated. We observe sequence-specific variations in Hoogsteen base-pair energetic stabilities that are comparable to variations in Watson-Crick base-pair stability, with Hoogsteen base-pairs being more abundant for energetically less favorable Watson-Crick base-pairs. Our results suggest that the variations in Hoogsteen stabilities and rates of formation are dominated by variations in Watson-Crick base pair stability, suggesting a late transition state for the Watson-Crick to Hoogsteen conformational switch. The occurrence of sequence and position-dependent Hoogsteen base-pairs provide a new potential mechanism for achieving sequence-dependent DNA transactions. PMID:25185517
ABI Base Recall: Automatic Correction and Ends Trimming of DNA Sequences.
Elyazghi, Zakaria; Yazouli, Loubna El; Sadki, Khalid; Radouani, Fouzia
2017-12-01
Automated DNA sequencers produce chromatogram files in ABI format. When viewing chromatograms, some ambiguities are shown at various sites along the DNA sequences, because the program implemented in the sequencing machine and used to call bases cannot always precisely determine the right nucleotide, especially when it is represented by either a broad peak or a set of overlaying peaks. In such cases, a letter other than A, C, G, or T is recorded, most commonly N. Thus, DNA sequencing chromatograms need manual examination: checking for mis-calls and truncating the sequence when errors become too frequent. The purpose of this paper is to develop a program allowing the automatic correction of these ambiguities. This application is a Web-based program powered by Shiny and runs under R platform for an easy exploitation. As a part of the interface, we added the automatic ends clipping option, alignment against reference sequences, and BLAST. To develop and test our tool, we collected several bacterial DNA sequences from different laboratories within Institut Pasteur du Maroc and performed both manual and automatic correction. The comparison between the two methods was carried out. As a result, we note that our program, ABI base recall, accomplishes good correction with a high accuracy. Indeed, it increases the rate of identity and coverage and minimizes the number of mismatches and gaps, hence it provides solution to sequencing ambiguities and saves biologists' time and labor.
Admir J. Giachini; Kentaro Hosaka; Eduardo Nouhra; Joseph Spatafora; James M. Trappe
2010-01-01
Phylogenetic relationships among Geastrales, Gomphales, Hysterangiales, and Phallales were estimated via combined sequences: nuclear large subunit ribosomal DNA (nuc-25S-rDNA), mitochondrial small subunit ribosomal DNA (mit-12S-rDNA), and mitochondrial atp6 DNA (mit-atp6-DNA). Eighty-one taxa comprising 19 genera and 58 species...
Fredlake, Christopher P; Hert, Daniel G; Kan, Cheuk-Wai; Chiesl, Thomas N; Root, Brian E; Forster, Ryan E; Barron, Annelise E
2008-01-15
To realize the immense potential of large-scale genomic sequencing after the completion of the second human genome (Venter's), the costs for the complete sequencing of additional genomes must be dramatically reduced. Among the technologies being developed to reduce sequencing costs, microchip electrophoresis is the only new technology ready to produce the long reads most suitable for the de novo sequencing and assembly of large and complex genomes. Compared with the current paradigm of capillary electrophoresis, microchip systems promise to reduce sequencing costs dramatically by increasing throughput, reducing reagent consumption, and integrating the many steps of the sequencing pipeline onto a single platform. Although capillary-based systems require approximately 70 min to deliver approximately 650 bases of contiguous sequence, we report sequencing up to 600 bases in just 6.5 min by microchip electrophoresis with a unique polymer matrix/adsorbed polymer wall coating combination. This represents a two-thirds reduction in sequencing time over any previously published chip sequencing result, with comparable read length and sequence quality. We hypothesize that these ultrafast long reads on chips can be achieved because the combined polymer system engenders a recently discovered "hybrid" mechanism of DNA electromigration, in which DNA molecules alternate rapidly between repeating through the intact polymer network and disrupting network entanglements to drag polymers through the solution, similar to dsDNA dynamics we observe in single-molecule DNA imaging studies. Most importantly, these results reveal the surprisingly powerful ability of microchip electrophoresis to provide ultrafast Sanger sequencing, which will translate to increased system throughput and reduced costs.
Fredlake, Christopher P.; Hert, Daniel G.; Kan, Cheuk-Wai; Chiesl, Thomas N.; Root, Brian E.; Forster, Ryan E.; Barron, Annelise E.
2008-01-01
To realize the immense potential of large-scale genomic sequencing after the completion of the second human genome (Venter's), the costs for the complete sequencing of additional genomes must be dramatically reduced. Among the technologies being developed to reduce sequencing costs, microchip electrophoresis is the only new technology ready to produce the long reads most suitable for the de novo sequencing and assembly of large and complex genomes. Compared with the current paradigm of capillary electrophoresis, microchip systems promise to reduce sequencing costs dramatically by increasing throughput, reducing reagent consumption, and integrating the many steps of the sequencing pipeline onto a single platform. Although capillary-based systems require ≈70 min to deliver ≈650 bases of contiguous sequence, we report sequencing up to 600 bases in just 6.5 min by microchip electrophoresis with a unique polymer matrix/adsorbed polymer wall coating combination. This represents a two-thirds reduction in sequencing time over any previously published chip sequencing result, with comparable read length and sequence quality. We hypothesize that these ultrafast long reads on chips can be achieved because the combined polymer system engenders a recently discovered “hybrid” mechanism of DNA electromigration, in which DNA molecules alternate rapidly between reptating through the intact polymer network and disrupting network entanglements to drag polymers through the solution, similar to dsDNA dynamics we observe in single-molecule DNA imaging studies. Most importantly, these results reveal the surprisingly powerful ability of microchip electrophoresis to provide ultrafast Sanger sequencing, which will translate to increased system throughput and reduced costs. PMID:18184818
Khan, A S
1984-01-01
The sequence of 363 nucleotides near the 3' end of the pol gene and 564 nucleotides from the 5' terminus of the env gene in an endogenous murine leukemia viral (MuLV) DNA segment, cloned from AKR/J mouse DNA and designated as A-12, was obtained. For comparison, the nucleotide sequence in an analogous portion of AKR mink cell focus-forming (MCF) 247 MuLV provirus was also determined. Sequence features unique to MCF247 MuLV DNA in the 3' pol and 5' env regions were identified by comparison with nucleotide sequences in analogous regions of NFS -Th-1 xenotropic and AKR ecotropic MuLV proviruses. These included (i) an insertion of 12 base pairs encoding four amino acids located 60 base pairs from the 3' terminus of the pol gene and immediately preceding the env gene, (ii) the deletion of 12 base pairs (encoding four amino acids) and the insertion of 3 base pairs (encoding one amino acid) in the 5' portion of the env gene, and (iii) single base substitutions resulting in 2 MCF247 -specific amino acids in the 3' pol and 23 in the 5' env regions. Nucleotide sequence comparison involving the 3' pol and 5' env regions of AKR MCF247 , NFS xenotropic, and AKR ecotropic MuLV proviruses with the cloned endogenous MuLV DNA indicated that MCF247 proviral DNA sequences were conserved in the cloned endogenous MuLV proviral segment. In fact, total nucleotide sequence identity existed between the endogenous MuLV DNA and the MCF247 MuLV provirus in the 3' portion of the pol gene. In the 5' env region, only 4 of 564 nucleotides were different, resulting in three amino acid changes between AKR MCF247 MuLV DNA and the endogenous MuLV DNA present in clone A-12. In addition, nucleotide sequence comparison indicated that Moloney-and Friend-MCF MuLVs were also highly related in the 3' pol and 5' env regions to the cloned endogenous MuLV DNA. These results establish the role of endogenous MuLV DNA segments in generation of recombinant MCF viruses. PMID:6328017
Extending the spectrum of DNA sequences retrieved from ancient bones and teeth
Glocke, Isabelle; Meyer, Matthias
2017-01-01
The number of DNA fragments surviving in ancient bones and teeth is known to decrease with fragment length. Recent genetic analyses of Middle Pleistocene remains have shown that the recovery of extremely short fragments can prove critical for successful retrieval of sequence information from particularly degraded ancient biological material. Current sample preparation techniques, however, are not optimized to recover DNA sequences from fragments shorter than ∼35 base pairs (bp). Here, we show that much shorter DNA fragments are present in ancient skeletal remains but lost during DNA extraction. We present a refined silica-based DNA extraction method that not only enables efficient recovery of molecules as short as 25 bp but also doubles the yield of sequences from longer fragments due to improved recovery of molecules with single-strand breaks. Furthermore, we present strategies for monitoring inefficiencies in library preparation that may result from co-extraction of inhibitory substances during DNA extraction. The combination of DNA extraction and library preparation techniques described here substantially increases the yield of DNA sequences from ancient remains and provides access to a yet unexploited source of highly degraded DNA fragments. Our work may thus open the door for genetic analyses on even older material. PMID:28408382
Recognition of platinum-DNA adducts by HMGB1a.
Ramachandran, Srinivas; Temple, Brenda; Alexandrova, Anastassia N; Chaney, Stephen G; Dokholyan, Nikolay V
2012-09-25
Cisplatin (CP) and oxaliplatin (OX), platinum-based drugs used widely in chemotherapy, form adducts on intrastrand guanines (5'GG) in genomic DNA. DNA damage recognition proteins, transcription factors, mismatch repair proteins, and DNA polymerases discriminate between CP- and OX-GG DNA adducts, which could partly account for differences in the efficacy, toxicity, and mutagenicity of CP and OX. In addition, differential recognition of CP- and OX-GG adducts is highly dependent on the sequence context of the Pt-GG adduct. In particular, DNA binding protein domain HMGB1a binds to CP-GG DNA adducts with up to 53-fold greater affinity than to OX-GG adducts in the TGGA sequence context but shows much smaller differences in binding in the AGGC or TGGT sequence contexts. Here, simulations of the HMGB1a-Pt-DNA complex in the three sequence contexts revealed a higher number of interface contacts for the CP-DNA complex in the TGGA sequence context than in the OX-DNA complex. However, the number of interface contacts was similar in the TGGT and AGGC sequence contexts. The higher number of interface contacts in the CP-TGGA sequence context corresponded to a larger roll of the Pt-GG base pair step. Furthermore, geometric analysis of stacking of phenylalanine 37 in HMGB1a (Phe37) with the platinated guanines revealed more favorable stacking modes correlated with a larger roll of the Pt-GG base pair step in the TGGA sequence context. These data are consistent with our previous molecular dynamics simulations showing that the CP-TGGA complex was able to sample larger roll angles than the OX-TGGA complex or either CP- or OX-DNA complexes in the AGGC or TGGT sequences. We infer that the high binding affinity of HMGB1a for CP-TGGA is due to the greater flexibility of CP-TGGA compared to OX-TGGA and other Pt-DNA adducts. This increased flexibility is reflected in the ability of CP-TGGA to sample larger roll angles, which allows for a higher number of interface contacts between the Pt-DNA adduct and HMGB1a.
NMR studies on the structure and dynamics of lac operator DNA
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lee, S.C.
Nuclear Magnetic Resonance spectroscopy was used to elucidate the relationships between structure, dynamics and function of the gene regulatory sequence corresponding to the lactose operon operator of Escherichia coli. The length of the DNA fragments examined varied from 13 to 36 base pair, containing all or part of the operator sequence. These DNA fragments are either derived genetically or synthesized chemically. Resonances of the imino protons were assigned by one dimensional inter-base pair nuclear Overhauser enhancement (NOE) measurements. Imino proton exchange rates were measured by saturation recovery methods. Results from the kinetic measurements show an interesting dynamic heterogeneity with amore » maximum opening rate centered about a GTG/CAC sequence which correlates with the biological function of the operator DNA. This particular three base pair sequence occurs frequently and often symmetrically in prokaryotic nd eukaryotic DNA sites where one anticipates specific protein interaction for gene regulation. The observed sequence dependent imino proton exchange rate may be a reflection of variation of the local structure of regulatory DNA. The results also indicate that the observed imino proton exchange rates are length dependent.« less
Methods for sequencing GC-rich and CCT repeat DNA templates
Robinson, Donna L.
2007-02-20
The present invention is directed to a PCR-based method of cycle sequencing DNA and other polynucleotide sequences having high CG content and regions of high GC content, and includes for example DNA strands with a high Cytosine and/or Guanosine content and repeated motifs such as CCT repeats.
Pilotte, Nils; Papaiakovou, Marina; Grant, Jessica R; Bierwert, Lou Ann; Llewellyn, Stacey; McCarthy, James S; Williams, Steven A
2016-03-01
The soil transmitted helminths are a group of parasitic worms responsible for extensive morbidity in many of the world's most economically depressed locations. With growing emphasis on disease mapping and eradication, the availability of accurate and cost-effective diagnostic measures is of paramount importance to global control and elimination efforts. While real-time PCR-based molecular detection assays have shown great promise, to date, these assays have utilized sub-optimal targets. By performing next-generation sequencing-based repeat analyses, we have identified high copy-number, non-coding DNA sequences from a series of soil transmitted pathogens. We have used these repetitive DNA elements as targets in the development of novel, multi-parallel, PCR-based diagnostic assays. Utilizing next-generation sequencing and the Galaxy-based RepeatExplorer web server, we performed repeat DNA analysis on five species of soil transmitted helminths (Necator americanus, Ancylostoma duodenale, Trichuris trichiura, Ascaris lumbricoides, and Strongyloides stercoralis). Employing high copy-number, non-coding repeat DNA sequences as targets, novel real-time PCR assays were designed, and assays were tested against established molecular detection methods. Each assay provided consistent detection of genomic DNA at quantities of 2 fg or less, demonstrated species-specificity, and showed an improved limit of detection over the existing, proven PCR-based assay. The utilization of next-generation sequencing-based repeat DNA analysis methodologies for the identification of molecular diagnostic targets has the ability to improve assay species-specificity and limits of detection. By exploiting such high copy-number repeat sequences, the assays described here will facilitate soil transmitted helminth diagnostic efforts. We recommend similar analyses when designing PCR-based diagnostic tests for the detection of other eukaryotic pathogens.
Investigation of a Sybr-Green-Based Method to Validate DNA Sequences for DNA Computing
2005-05-01
OF A SYBR-GREEN-BASED METHOD TO VALIDATE DNA SEQUENCES FOR DNA COMPUTING 6. AUTHOR(S) Wendy Pogozelski, Salvatore Priore, Matthew Bernard ...simulated annealing. Biochemistry, 35, 14077-14089. 15 Pogozelski, W.K., Bernard , M.P. and Macula, A. (2004) DNA code validation using...and Clark, B.F.C. (eds) In RNA Biochemistry and Biotechnology, NATO ASI Series, Kluwer Academic Publishers. Zucker, M. and Stiegler , P. (1981
Lavery, Richard; Zakrzewska, Krystyna; Beveridge, David; Bishop, Thomas C.; Case, David A.; Cheatham, Thomas; Dixit, Surjit; Jayaram, B.; Lankas, Filip; Laughton, Charles; Maddocks, John H.; Michon, Alexis; Osman, Roman; Orozco, Modesto; Perez, Alberto; Singh, Tanya; Spackova, Nada; Sponer, Jiri
2010-01-01
It is well recognized that base sequence exerts a significant influence on the properties of DNA and plays a significant role in protein–DNA interactions vital for cellular processes. Understanding and predicting base sequence effects requires an extensive structural and dynamic dataset which is currently unavailable from experiment. A consortium of laboratories was consequently formed to obtain this information using molecular simulations. This article describes results providing information not only on all 10 unique base pair steps, but also on all possible nearest-neighbor effects on these steps. These results are derived from simulations of 50–100 ns on 39 different DNA oligomers in explicit solvent and using a physiological salt concentration. We demonstrate that the simulations are converged in terms of helical and backbone parameters. The results show that nearest-neighbor effects on base pair steps are very significant, implying that dinucleotide models are insufficient for predicting sequence-dependent behavior. Flanking base sequences can notably lead to base pair step parameters in dynamic equilibrium between two conformational sub-states. Although this study only provides limited data on next-nearest-neighbor effects, we suggest that such effects should be analyzed before attempting to predict the sequence-dependent behavior of DNA. PMID:19850719
Long-range correlations and charge transport properties of DNA sequences
NASA Astrophysics Data System (ADS)
Liu, Xiao-liang; Ren, Yi; Xie, Qiong-tao; Deng, Chao-sheng; Xu, Hui
2010-04-01
By using Hurst's analysis and transfer approach, the rescaled range functions and Hurst exponents of human chromosome 22 and enterobacteria phage lambda DNA sequences are investigated and the transmission coefficients, Landauer resistances and Lyapunov coefficients of finite segments based on above genomic DNA sequences are calculated. In a comparison with quasiperiodic and random artificial DNA sequences, we find that λ-DNA exhibits anticorrelation behavior characterized by a Hurst exponent 0.5
Dubbed "Tom's T" by Dhruba Chattoraj, the unusually conserved thymine at position +7 in bacteriophage P1 plasmid RepA DNA binding sites rises above repressor and acceptor sequence logos. The T appears to represent base flipping prior to helix opening in this DNA replication initation protein.
An algebraic hypothesis about the primeval genetic code architecture.
Sánchez, Robersy; Grau, Ricardo
2009-09-01
A plausible architecture of an ancient genetic code is derived from an extended base triplet vector space over the Galois field of the extended base alphabet {D,A,C,G,U}, where symbol D represents one or more hypothetical bases with unspecific pairings. We hypothesized that the high degeneration of a primeval genetic code with five bases and the gradual origin and improvement of a primeval DNA repair system could make possible the transition from ancient to modern genetic codes. Our results suggest that the Watson-Crick base pairing G identical with C and A=U and the non-specific base pairing of the hypothetical ancestral base D used to define the sum and product operations are enough features to determine the coding constraints of the primeval and the modern genetic code, as well as, the transition from the former to the latter. Geometrical and algebraic properties of this vector space reveal that the present codon assignment of the standard genetic code could be induced from a primeval codon assignment. Besides, the Fourier spectrum of the extended DNA genome sequences derived from the multiple sequence alignment suggests that the called period-3 property of the present coding DNA sequences could also exist in the ancient coding DNA sequences. The phylogenetic analyses achieved with metrics defined in the N-dimensional vector space (B(3))(N) of DNA sequences and with the new evolutionary model presented here also suggest that an ancient DNA coding sequence with five or more bases does not contradict the expected evolutionary history.
Diffusion modulation of DNA by toehold exchange
NASA Astrophysics Data System (ADS)
Rodjanapanyakul, Thanapop; Takabatake, Fumi; Abe, Keita; Kawamata, Ibuki; Nomura, Shinichiro M.; Murata, Satoshi
2018-05-01
We propose a method to control the diffusion speed of DNA molecules with a target sequence in a polymer solution. The interaction between solute DNA and diffusion-suppressing DNA that has been anchored to a polymer matrix is modulated by the concentration of the third DNA molecule called the competitor by a mechanism called toehold exchange. Experimental results show that the sequence-specific modulation of the diffusion coefficient is successfully achieved. The diffusion coefficient can be modulated up to sixfold by changing the concentration of the competitor. The specificity of the modulation is also verified under the coexistence of a set of DNA with noninteracting base sequences. With this mechanism, we are able to control the diffusion coefficient of individual DNA species by the concentration of another DNA species. This methodology introduces a programmability to a DNA-based reaction-diffusion system.
Liu, Ning; Tian, Ru; Loeb, Daniel D
2003-02-18
Synthesis of the relaxed-circular (RC) DNA genome of hepadnaviruses requires two template switches during plus-strand DNA synthesis: primer translocation and circularization. Although primer translocation and circularization use different donor and acceptor sequences, and are distinct temporally, they share the common theme of switching from one end of the minus-strand template to the other end. Studies of duck hepatitis B virus have indicated that, in addition to the donor and acceptor sequences, three other cis-acting sequences, named 3E, M, and 5E, are required for the synthesis of RC DNA by contributing to primer translocation and circularization. The mechanism by which 3E, M, and 5E act was not known. We present evidence that these sequences function by base pairing with each other within the minus-strand template. 3E base-pairs with one portion of M (M3) and 5E base-pairs with an adjacent portion of M (M5). We found that disrupting base pairing between 3E and M3 and between 5E and M5 inhibited primer translocation and circularization. More importantly, restoring base pairing with mutant sequences restored the production of RC DNA. These results are consistent with the model that, within duck hepatitis B virus capsids, the ends of the minus-strand template are juxtaposed via base pairing to facilitate the two template switches during plus-strand DNA synthesis.
Brain cDNA clone for human cholinesterase
DOE Office of Scientific and Technical Information (OSTI.GOV)
McTiernan, C.; Adkins, S.; Chatonnet, A.
1987-10-01
A cDNA library from human basal ganglia was screened with oligonucleotide probes corresponding to portions of the amino acid sequence of human serum cholinesterase. Five overlapping clones, representing 2.4 kilobases, were isolated. The sequenced cDNA contained 207 base pairs of coding sequence 5' to the amino terminus of the mature protein in which there were four ATG translation start sites in the same reading frame as the protein. Only the ATG coding for Met-(-28) lay within a favorable consensus sequence for functional initiators. There were 1722 base pairs of coding sequence corresponding to the protein found circulating in human serum.more » The amino acid sequence deduced from the cDNA exactly matched the 574 amino acid sequence of human serum cholinesterase, as previously determined by Edman degradation. Therefore, our clones represented cholinesterase rather than acetylcholinesterase. It was concluded that the amino acid sequences of cholinesterase from two different tissues, human brain and human serum, were identical. Hybridization of genomic DNA blots suggested that a single gene, or very few genes coded for cholinesterase.« less
NASA Technical Reports Server (NTRS)
Ho, P. S.; Ellison, M. J.; Quigley, G. J.; Rich, A.
1986-01-01
The ease with which a particular DNA segment adopts the left-handed Z-conformation depends largely on the sequence and on the degree of negative supercoiling to which it is subjected. We describe a computer program (Z-hunt) that is designed to search long sequences of naturally occurring DNA and retrieve those nucleotide combinations of up to 24 bp in length which show a strong propensity for Z-DNA formation. Incorporated into Z-hunt is a statistical mechanical model based on empirically determined energetic parameters for the B to Z transition accumulated to date. The Z-forming potential of a sequence is assessed by ranking its behavior as a function of negative superhelicity relative to the behavior of similar sized randomly generated nucleotide sequences assembled from over 80,000 combinations. The program makes it possible to compare directly the Z-forming potential of sequences with different base compositions and different sequence lengths. Using Z-hunt, we have analyzed the DNA sequences of the bacteriophage phi X174, plasmid pBR322, the animal virus SV40 and the replicative form of the eukaryotic adenovirus-2. The results are compared with those previously obtained by others from experiments designed to locate Z-DNA forming regions in these sequences using probes which show specificity for the left-handed DNA conformation.
Research on Image Encryption Based on DNA Sequence and Chaos Theory
NASA Astrophysics Data System (ADS)
Tian Zhang, Tian; Yan, Shan Jun; Gu, Cheng Yan; Ren, Ran; Liao, Kai Xin
2018-04-01
Nowadays encryption is a common technique to protect image data from unauthorized access. In recent years, many scientists have proposed various encryption algorithms based on DNA sequence to provide a new idea for the design of image encryption algorithm. Therefore, a new method of image encryption based on DNA computing technology is proposed in this paper, whose original image is encrypted by DNA coding and 1-D logistic chaotic mapping. First, the algorithm uses two modules as the encryption key. The first module uses the real DNA sequence, and the second module is made by one-dimensional logistic chaos mapping. Secondly, the algorithm uses DNA complementary rules to encode original image, and uses the key and DNA computing technology to compute each pixel value of the original image, so as to realize the encryption of the whole image. Simulation results show that the algorithm has good encryption effect and security.
Cadmium sulfide nanocluster-based electrochemical stripping detection of DNA hybridization.
Zhu, Ningning; Zhang, Aiping; He, Pingang; Fang, Yuzhi
2003-03-01
A novel, sensitive electrochemical DNA hybridization detection assay, using cadmium sulfide (CdS) nanoclusters as the oligonucleotide labeling tag, is described. The assay relies on the hybridization of the target DNA with the CdS nanocluster oligonucleotide DNA probe, followed by the dissolution of the CdS nanoclusters anchored on the hybrids and the indirect determination of the dissolved cadmium ions by sensitive anodic stripping voltammetry (ASV) at a mercury-coated glassy carbon electrode (GCE). The results showed that only a complementary sequence could form a double-stranded dsDNA-CdS with the DNA probe and give an obvious electrochemical response. A three-base mismatch sequence and non-complementary sequence had negligible response. The combination of the large number of cadmium ions released from each dsDNA hybrid with the remarkable sensitivity of the electrochemical stripping analysis for cadmium at mercury-film GCE allows detection at levels as low as 0.2 pmol L(-1) of the complementary sequence of DNA.
DNA sequence-dependent mechanics and protein-assisted bending in repressor-mediated loop formation
Boedicker, James Q.; Garcia, Hernan G.; Johnson, Stephanie; Phillips, Rob
2014-01-01
As the chief informational molecule of life, DNA is subject to extensive physical manipulations. The energy required to deform double-helical DNA depends on sequence, and this mechanical code of DNA influences gene regulation, such as through nucleosome positioning. Here we examine the sequence-dependent flexibility of DNA in bacterial transcription factor-mediated looping, a context for which the role of sequence remains poorly understood. Using a suite of synthetic constructs repressed by the Lac repressor and two well-known sequences that show large flexibility differences in vitro, we make precise statistical mechanical predictions as to how DNA sequence influences loop formation and test these predictions using in vivo transcription and in vitro single-molecule assays. Surprisingly, sequence-dependent flexibility does not affect in vivo gene regulation. By theoretically and experimentally quantifying the relative contributions of sequence and the DNA-bending protein HU to DNA mechanical properties, we reveal that bending by HU dominates DNA mechanics and masks intrinsic sequence-dependent flexibility. Such a quantitative understanding of how mechanical regulatory information is encoded in the genome will be a key step towards a predictive understanding of gene regulation at single-base pair resolution. PMID:24231252
Size-based molecular diagnostics using plasma DNA for noninvasive prenatal testing.
Yu, Stephanie C Y; Chan, K C Allen; Zheng, Yama W L; Jiang, Peiyong; Liao, Gary J W; Sun, Hao; Akolekar, Ranjit; Leung, Tak Y; Go, Attie T J I; van Vugt, John M G; Minekawa, Ryoko; Oudejans, Cees B M; Nicolaides, Kypros H; Chiu, Rossa W K; Lo, Y M Dennis
2014-06-10
Noninvasive prenatal testing using fetal DNA in maternal plasma is an actively researched area. The current generation of tests using massively parallel sequencing is based on counting plasma DNA sequences originating from different genomic regions. In this study, we explored a different approach that is based on the use of DNA fragment size as a diagnostic parameter. This approach is dependent on the fact that circulating fetal DNA molecules are generally shorter than the corresponding maternal DNA molecules. First, we performed plasma DNA size analysis using paired-end massively parallel sequencing and microchip-based capillary electrophoresis. We demonstrated that the fetal DNA fraction in maternal plasma could be deduced from the overall size distribution of maternal plasma DNA. The fetal DNA fraction is a critical parameter affecting the accuracy of noninvasive prenatal testing using maternal plasma DNA. Second, we showed that fetal chromosomal aneuploidy could be detected by observing an aberrant proportion of short fragments from an aneuploid chromosome in the paired-end sequencing data. Using this approach, we detected fetal trisomy 21 and trisomy 18 with 100% sensitivity (T21: 36/36; T18: 27/27) and 100% specificity (non-T21: 88/88; non-T18: 97/97). For trisomy 13, the sensitivity and specificity were 95.2% (20/21) and 99% (102/103), respectively. For monosomy X, the sensitivity and specificity were both 100% (10/10 and 8/8). Thus, this study establishes the principle of size-based molecular diagnostics using plasma DNA. This approach has potential applications beyond noninvasive prenatal testing to areas such as oncology and transplantation monitoring.
Size-based molecular diagnostics using plasma DNA for noninvasive prenatal testing
Yu, Stephanie C. Y.; Chan, K. C. Allen; Zheng, Yama W. L.; Jiang, Peiyong; Liao, Gary J. W.; Sun, Hao; Akolekar, Ranjit; Leung, Tak Y.; Go, Attie T. J. I.; van Vugt, John M. G.; Minekawa, Ryoko; Oudejans, Cees B. M.; Nicolaides, Kypros H.; Chiu, Rossa W. K.; Lo, Y. M. Dennis
2014-01-01
Noninvasive prenatal testing using fetal DNA in maternal plasma is an actively researched area. The current generation of tests using massively parallel sequencing is based on counting plasma DNA sequences originating from different genomic regions. In this study, we explored a different approach that is based on the use of DNA fragment size as a diagnostic parameter. This approach is dependent on the fact that circulating fetal DNA molecules are generally shorter than the corresponding maternal DNA molecules. First, we performed plasma DNA size analysis using paired-end massively parallel sequencing and microchip-based capillary electrophoresis. We demonstrated that the fetal DNA fraction in maternal plasma could be deduced from the overall size distribution of maternal plasma DNA. The fetal DNA fraction is a critical parameter affecting the accuracy of noninvasive prenatal testing using maternal plasma DNA. Second, we showed that fetal chromosomal aneuploidy could be detected by observing an aberrant proportion of short fragments from an aneuploid chromosome in the paired-end sequencing data. Using this approach, we detected fetal trisomy 21 and trisomy 18 with 100% sensitivity (T21: 36/36; T18: 27/27) and 100% specificity (non-T21: 88/88; non-T18: 97/97). For trisomy 13, the sensitivity and specificity were 95.2% (20/21) and 99% (102/103), respectively. For monosomy X, the sensitivity and specificity were both 100% (10/10 and 8/8). Thus, this study establishes the principle of size-based molecular diagnostics using plasma DNA. This approach has potential applications beyond noninvasive prenatal testing to areas such as oncology and transplantation monitoring. PMID:24843150
Lee, Hwan Young; Song, Injee; Ha, Eunho; Cho, Sung-Bae; Yang, Woo Ick; Shin, Kyoung-Jin
2008-01-01
Background For the past few years, scientific controversy has surrounded the large number of errors in forensic and literature mitochondrial DNA (mtDNA) data. However, recent research has shown that using mtDNA phylogeny and referring to known mtDNA haplotypes can be useful for checking the quality of sequence data. Results We developed a Web-based bioinformatics resource "mtDNAmanager" that offers a convenient interface supporting the management and quality analysis of mtDNA sequence data. The mtDNAmanager performs computations on mtDNA control-region sequences to estimate the most-probable mtDNA haplogroups and retrieves similar sequences from a selected database. By the phased designation of the most-probable haplogroups (both expected and estimated haplogroups), mtDNAmanager enables users to systematically detect errors whilst allowing for confirmation of the presence of clear key diagnostic mutations and accompanying mutations. The query tools of mtDNAmanager also facilitate database screening with two options of "match" and "include the queried nucleotide polymorphism". In addition, mtDNAmanager provides Web interfaces for users to manage and analyse their own data in batch mode. Conclusion The mtDNAmanager will provide systematic routines for mtDNA sequence data management and analysis via easily accessible Web interfaces, and thus should be very useful for population, medical and forensic studies that employ mtDNA analysis. mtDNAmanager can be accessed at . PMID:19014619
Statistical properties of DNA sequences
NASA Technical Reports Server (NTRS)
Peng, C. K.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Mantegna, R. N.; Simons, M.; Stanley, H. E.
1995-01-01
We review evidence supporting the idea that the DNA sequence in genes containing non-coding regions is correlated, and that the correlation is remarkably long range--indeed, nucleotides thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the gene. We resolve the problem of the "non-stationarity" feature of the sequence of base pairs by applying a new algorithm called detrended fluctuation analysis (DFA). We address the claim of Voss that there is no difference in the statistical properties of coding and non-coding regions of DNA by systematically applying the DFA algorithm, as well as standard FFT analysis, to every DNA sequence (33301 coding and 29453 non-coding) in the entire GenBank database. Finally, we describe briefly some recent work showing that the non-coding sequences have certain statistical features in common with natural and artificial languages. Specifically, we adapt to DNA the Zipf approach to analyzing linguistic texts. These statistical properties of non-coding sequences support the possibility that non-coding regions of DNA may carry biological information.
Identifying active foraminifera in the Sea of Japan using metatranscriptomic approach
NASA Astrophysics Data System (ADS)
Lejzerowicz, Franck; Voltsky, Ivan; Pawlowski, Jan
2013-02-01
Metagenetics represents an efficient and rapid tool to describe environmental diversity patterns of microbial eukaryotes based on ribosomal DNA sequences. However, the results of metagenetic studies are often biased by the presence of extracellular DNA molecules that are persistent in the environment, especially in deep-sea sediment. As an alternative, short-lived RNA molecules constitute a good proxy for the detection of active species. Here, we used a metatranscriptomic approach based on RNA-derived (cDNA) sequences to study the diversity of the deep-sea benthic foraminifera and compared it to the metagenetic approach. We analyzed 257 ribosomal DNA and cDNA sequences obtained from seven sediments samples collected in the Sea of Japan at depths ranging from 486 to 3665 m. The DNA and RNA-based approaches gave a similar view of the taxonomic composition of foraminiferal assemblage, but differed in some important points. First, the cDNA dataset was dominated by sequences of rotaliids and robertiniids, suggesting that these calcareous species, some of which have been observed in Rose Bengal stained samples, are the most active component of foraminiferal community. Second, the richness of monothalamous (single-chambered) foraminifera was particularly high in DNA extracts from the deepest samples, confirming that this group of foraminifera is abundant but not necessarily very active in the deep-sea sediments. Finally, the high divergence of undetermined sequences in cDNA dataset indicate the limits of our database and lack of knowledge about some active but possibly rare species. Our study demonstrates the capability of the metatranscriptomic approach to detect active foraminiferal species and prompt its use in future high-throughput sequencing-based environmental surveys.
Automated sample-preparation technologies in genome sequencing projects.
Hilbert, H; Lauber, J; Lubenow, H; Düsterhöft, A
2000-01-01
A robotic workstation system (BioRobot 96OO, QIAGEN) and a 96-well UV spectrophotometer (Spectramax 250, Molecular Devices) were integrated in to the process of high-throughput automated sequencing of double-stranded plasmid DNA templates. An automated 96-well miniprep kit protocol (QIAprep Turbo, QIAGEN) provided high-quality plasmid DNA from shotgun clones. The DNA prepared by this procedure was used to generate more than two mega bases of final sequence data for two genomic projects (Arabidopsis thaliana and Schizosaccharomyces pombe), three thousand expressed sequence tags (ESTs) plus half a mega base of human full-length cDNA clones, and approximately 53,000 single reads for a whole genome shotgun project (Pseudomonas putida).
Jiang, Haojun; Xie, Yifan; Li, Xuchao; Ge, Huijuan; Deng, Yongqiang; Mu, Haofang; Feng, Xiaoli; Yin, Lu; Du, Zhou; Chen, Fang; He, Nongyue
2016-01-01
Short tandem repeats (STRs) and single nucleotide polymorphisms (SNPs) have been already used to perform noninvasive prenatal paternity testing from maternal plasma DNA. The frequently used technologies were PCR followed by capillary electrophoresis and SNP typing array, respectively. Here, we developed a noninvasive prenatal paternity testing (NIPAT) based on SNP typing with maternal plasma DNA sequencing. We evaluated the influence factors (minor allele frequency (MAF), the number of total SNP, fetal fraction and effective sequencing depth) and designed three different selective SNP panels in order to verify the performance in clinical cases. Combining targeted deep sequencing of selective SNP and informative bioinformatics pipeline, we calculated the combined paternity index (CPI) of 17 cases to determine paternity. Sequencing-based NIPAT results fully agreed with invasive prenatal paternity test using STR multiplex system. Our study here proved that the maternal plasma DNA sequencing-based technology is feasible and accurate in determining paternity, which may provide an alternative in forensic application in the future.
DNA Base-Calling from a Nanopore Using a Viterbi Algorithm
Timp, Winston; Comer, Jeffrey; Aksimentiev, Aleksei
2012-01-01
Nanopore-based DNA sequencing is the most promising third-generation sequencing method. It has superior read length, speed, and sample requirements compared with state-of-the-art second-generation methods. However, base-calling still presents substantial difficulty because the resolution of the technique is limited compared with the measured signal/noise ratio. Here we demonstrate a method to decode 3-bp-resolution nanopore electrical measurements into a DNA sequence using a Hidden Markov model. This method shows tremendous potential for accuracy (∼98%), even with a poor signal/noise ratio. PMID:22677395
Bergman, C M; Kreitman, M
2001-08-01
Comparative genomic approaches to gene and cis-regulatory prediction are based on the principle that differential DNA sequence conservation reflects variation in functional constraint. Using this principle, we analyze noncoding sequence conservation in Drosophila for 40 loci with known or suspected cis-regulatory function encompassing >100 kb of DNA. We estimate the fraction of noncoding DNA conserved in both intergenic and intronic regions and describe the length distribution of ungapped conserved noncoding blocks. On average, 22%-26% of noncoding sequences surveyed are conserved in Drosophila, with median block length approximately 19 bp. We show that point substitution in conserved noncoding blocks exhibits transition bias as well as lineage effects in base composition, and occurs more than an order of magnitude more frequently than insertion/deletion (indel) substitution. Overall, patterns of noncoding DNA structure and evolution differ remarkably little between intergenic and intronic conserved blocks, suggesting that the effects of transcription per se contribute minimally to the constraints operating on these sequences. The results of this study have implications for the development of alignment and prediction algorithms specific to noncoding DNA, as well as for models of cis-regulatory DNA sequence evolution.
Constructing DNA Barcode Sets Based on Particle Swarm Optimization.
Wang, Bin; Zheng, Xuedong; Zhou, Shihua; Zhou, Changjun; Wei, Xiaopeng; Zhang, Qiang; Wei, Ziqi
2018-01-01
Following the completion of the human genome project, a large amount of high-throughput bio-data was generated. To analyze these data, massively parallel sequencing, namely next-generation sequencing, was rapidly developed. DNA barcodes are used to identify the ownership between sequences and samples when they are attached at the beginning or end of sequencing reads. Constructing DNA barcode sets provides the candidate DNA barcodes for this application. To increase the accuracy of DNA barcode sets, a particle swarm optimization (PSO) algorithm has been modified and used to construct the DNA barcode sets in this paper. Compared with the extant results, some lower bounds of DNA barcode sets are improved. The results show that the proposed algorithm is effective in constructing DNA barcode sets.
Ng'endo, R.N.; Osiemo, Z.B.; Brandl, R.
2013-01-01
DNA sequencing is increasingly being used to assist in species identification in order to overcome taxonomic impediment. However, few studies attempt to compare the results of these molecular studies with a more traditional species delineation approach based on morphological characters. Mitochondrial DNA Cytochrome oxidase subunit 1 (CO1) gene was sequenced, measuring 636 base pairs, from 47 ants of the genus Pheidole (Formicidae: Myrmicinae) collected in the Brazilian Atlantic Forest to test whether the morphology-based assignment of individuals into species is supported by DNA-based species delimitation. Twenty morphospecies were identified, whereas the barcoding analysis identified 19 Molecular Operational Taxonomic Units (MOTUs). Fifteen out of the 19 DNA-based clusters allocated, using sequence divergence thresholds of 2% and 3%, matched with morphospecies. Both thresholds yielded the same number of MOTUs. Only one MOTU was successfully identified to species level using the CO1 sequences of Pheidole species already in the Genbank. The average pairwise sequence divergence for all 47 sequences was 19%, ranging between 0–25%. In some cases, however, morphology and molecular based methods differed in their assignment of individuals to morphospecies or MOTUs. The occurrence of distinct mitochondrial lineages within morphological species highlights groups for further detailed genetic and morphological studies, and therefore a pluralistic approach using several methods to understand the taxonomy of difficult lineages is advocated. PMID:23902257
Winnowing DNA for rare sequences: highly specific sequence and methylation based enrichment.
Thompson, Jason D; Shibahara, Gosuke; Rajan, Sweta; Pel, Joel; Marziali, Andre
2012-01-01
Rare mutations in cell populations are known to be hallmarks of many diseases and cancers. Similarly, differential DNA methylation patterns arise in rare cell populations with diagnostic potential such as fetal cells circulating in maternal blood. Unfortunately, the frequency of alleles with diagnostic potential, relative to wild-type background sequence, is often well below the frequency of errors in currently available methods for sequence analysis, including very high throughput DNA sequencing. We demonstrate a DNA preparation and purification method that through non-linear electrophoretic separation in media containing oligonucleotide probes, achieves 10,000 fold enrichment of target DNA with single nucleotide specificity, and 100 fold enrichment of unmodified methylated DNA differing from the background by the methylation of a single cytosine residue.
Sawamura, Jitsuki; Morishita, Shigeru; Ishigooka, Jun
2014-05-07
Previously, we suggested prototypal models that describe some clinical states based on group postulates. Here, we demonstrate a group/category theory-like model for molecular/genetic biology as an alternative application of our previous model. Specifically, we focus on deoxyribonucleic acid (DNA) base sequences. We construct a wallpaper pattern based on a five-letter cruciform motif with letters C, A, T, G, and E. Whereas the first four letters represent the standard DNA bases, the fifth is introduced for ease in formulating group operations that reproduce insertions and deletions of DNA base sequences. A basic group Z5 = {r, u, d, l, n} of operations is defined for the wallpaper pattern, with which a sequence of points can be generated corresponding to changes of a base in a DNA sequence by following the orbit of a point of the pattern under operations in group Z5. Other manipulations of DNA sequence can be treated using a vector-like notation 'Dj' corresponding to a DNA sequence but based on the five-letter base set; also, 'Dj's are expressed graphically. Insertions and deletions of a series of letters 'E' are admitted to assist in describing DNA recombination. Likewise, a vector-like notation Rj can be constructed for sequences of ribonucleic acid (RNA). The wallpaper group B = {Z5×∞, ●} (an ∞-fold Cartesian product of Z5) acts on Dj (or Rj) yielding changes to Dj (or Rj) denoted by 'Dj◦B(j→k) = Dk' (or 'Rj◦B(j→k) = Rk'). Based on the operations of this group, two types of groups-a modulo 5 linear group and a rotational group over the Gaussian plane, acting on the five bases-are linked as parts of the wallpaper group for broader applications. As a result, changes, insertions/deletions and DNA (RNA) recombination (partial/total conversion) are described. As an exploratory study, a notation for the canonical "central dogma" via a category theory-like way is presented for future developments. Despite the large incompleteness of our methodology, there is fertile ground to consider a symmetry model for genetic coding based on our specific wallpaper group. A more integrated formulation containing "central dogma" for future molecular/genetic biology remains to be explored.
Nagaki, Kiyotaka; Shibata, Fukashi; Kanatani, Asaka; Kashihara, Kazunari; Murata, Minoru
2012-04-01
The centromere is a multi-functional complex comprising centromeric DNA and a number of proteins. To isolate unidentified centromeric DNA sequences, centromere-specific histone H3 variants (CENH3) and chromatin immunoprecipitation (ChIP) have been utilized in some plant species. However, anti-CENH3 antibody for ChIP must be raised in each species because of its species specificity. Production of the antibodies is time-consuming and costly, and it is not easy to produce ChIP-grade antibodies. In this study, we applied a HaloTag7-based chromatin affinity purification system to isolate centromeric DNA sequences in tobacco. This system required no specific antibody, and made it possible to apply a highly stringent wash to remove contaminated DNA. As a result, we succeeded in isolating five tandem repetitive DNA sequences in addition to the centromeric retrotransposons that were previously identified by ChIP. Three of the tandem repeats were centromere-specific sequences located on different chromosomes. These results confirm the validity of the HaloTag7-based chromatin affinity purification system as an alternative method to ChIP for isolating unknown centromeric DNA sequences. The discovery of more than two chromosome-specific centromeric DNA sequences indicates the mosaic structure of tobacco centromeres. © Springer-Verlag 2011
Recent patents of nanopore DNA sequencing technology: progress and challenges.
Zhou, Jianfeng; Xu, Bingqian
2010-11-01
DNA sequencing techniques witnessed fast development in the last decades, primarily driven by the Human Genome Project. Among the proposed new techniques, Nanopore was considered as a suitable candidate for the single DNA sequencing with ultrahigh speed and very low cost. Several fabrication and modification techniques have been developed to produce robust and well-defined nanopore devices. Many efforts have also been done to apply nanopore to analyze the properties of DNA molecules. By comparing with traditional sequencing techniques, nanopore has demonstrated its distinctive superiorities in main practical issues, such as sample preparation, sequencing speed, cost-effective and read-length. Although challenges still remain, recent researches in improving the capabilities of nanopore have shed a light to achieve its ultimate goal: Sequence individual DNA strand at single nucleotide level. This patent review briefly highlights recent developments and technological achievements for DNA analysis and sequencing at single molecule level, focusing on nanopore based methods.
Elder, Robert M; Jayaraman, Arthi
2013-10-10
Gene therapy relies on the delivery of DNA into cells, and polycations are one class of vectors enabling efficient DNA delivery. Nuclear localization sequences (NLS), cationic oligopeptides that target molecules for nuclear entry, can be incorporated into polycations to improve their gene delivery efficiency. We use simulations to study the effect of peptide chemistry and sequence on the DNA-binding behavior of NLS-grafted polycations by systematically mutating the residues in the grafts, which are based on the SV40 NLS (peptide sequence PKKKRKV). Replacing arginine (R) with lysine (K) reduces binding strength by eliminating arginine-DNA interactions, but placing R in a less hindered location (e.g., farther from the grafting point to the polycation backbone) has surprisingly little effect on polycation-DNA binding strength. Changing the positions of the hydrophobic proline (P) and valine (V) residues relative to the polycation backbone changes hydrophobic aggregation within the polycation and, consequently, changes the conformational entropy loss that occurs upon polycation-DNA binding. Since conformational entropy loss affects the free energy of binding, the positions of P and V in the grafts affect DNA binding affinity. The insight from this work guides synthesis of polycations with tailored DNA binding affinity and, in turn, efficient DNA delivery.
DNA Sequence-Dependent Ionic Currents in Ultra-Small Solid-State Nanopores†
Comer, Jeffrey
2016-01-01
Measurements of ionic currents through nanopores partially blocked by DNA have emerged as a powerful method for characterization of the DNA nucleotide sequence. Although the effect of the nucleotide sequence on the nanopore blockade current has been experimentally demonstrated, prediction and interpretation of such measurements remain a formidable challenge. Using atomic resolution computational approaches, here we show how the sequence, molecular conformation, and pore geometry affect the blockade ionic current in model solid-state nanopores. We demonstrate that the blockade current from a DNA molecule is determined by the chemical identities and conformations of at least three consecutive nucleotides. We find the blockade currents produced by the nucleotide triplets to vary considerably with their nucleotide sequence despite having nearly identical molecular conformations. Encouragingly, we find blockade current differences as large as 25% for single-base substitutions in ultra small (1.6 nm × 1.1 nm cross section; 2 nm length) solid-state nanopores. Despite the complex dependence of the blockade current on the sequence and conformation of the DNA triplets, we find that, under many conditions, the number of thymine bases is positively correlated with the current, whereas the number of purine bases and the presence of both purine and pyrimidines in the triplet are negatively correlated with the current. Based on these observations, we construct a simple theoretical model that relates the ion current to the base content of a solid-state nanopore. Furthermore, we show that compact conformations of DNA in narrow pores provide the greatest signal-to-noise ratio for single base detection, whereas reduction of the nanopore length increases the ionic current noise. Thus, the sequence dependence of nanopore blockade current can be theoretically rationalized, although the predictions will likely need to be customized for each nanopore type. PMID:27103233
Marck, C
1988-01-01
DNA Strider is a new integrated DNA and Protein sequence analysis program written with the C language for the Macintosh Plus, SE and II computers. It has been designed as an easy to learn and use program as well as a fast and efficient tool for the day-to-day sequence analysis work. The program consists of a multi-window sequence editor and of various DNA and Protein analysis functions. The editor may use 4 different types of sequences (DNA, degenerate DNA, RNA and one-letter coded protein) and can handle simultaneously 6 sequences of any type up to 32.5 kB each. Negative numbering of the bases is allowed for DNA sequences. All classical restriction and translation analysis functions are present and can be performed in any order on any open sequence or part of a sequence. The main feature of the program is that the same analysis function can be repeated several times on different sequences, thus generating multiple windows on the screen. Many graphic capabilities have been incorporated such as graphic restriction map, hydrophobicity profile and the CAI plot- codon adaptation index according to Sharp and Li. The restriction sites search uses a newly designed fast hexamer look-ahead algorithm. Typical runtime for the search of all sites with a library of 130 restriction endonucleases is 1 second per 10,000 bases. The circular graphic restriction map of the pBR322 plasmid can be therefore computed from its sequence and displayed on the Macintosh Plus screen within 2 seconds and its multiline restriction map obtained in a scrolling window within 5 seconds. PMID:2832831
Sequence Dependent Interactions Between DNA and Single-Walled Carbon Nanotubes
NASA Astrophysics Data System (ADS)
Roxbury, Daniel
It is known that single-stranded DNA adopts a helical wrap around a single-walled carbon nanotube (SWCNT), forming a water-dispersible hybrid molecule. The ability to sort mixtures of SWCNTs based on chirality (electronic species) has recently been demonstrated using special short DNA sequences that recognize certain matching SWCNTs of specific chirality. This thesis investigates the intricacies of DNA-SWCNT sequence-specific interactions through both experimental and molecular simulation studies. The DNA-SWCNT binding strengths were experimentally quantified by studying the kinetics of DNA replacement by a surfactant on the surface of particular SWCNTs. Recognition ability was found to correlate strongly with measured binding strength, e.g. DNA sequence (TAT)4 was found to bind 20 times stronger to the (6,5)-SWCNT than sequence (TAT)4T. Next, using replica exchange molecular dynamics (REMD) simulations, equilibrium structures formed by (a) single-strands and (b) multiple-strands of 12-mer oligonucleotides adsorbed on various SWCNTs were explored. A number of structural motifs were discovered in which the DNA strand wraps around the SWCNT and 'stitches' to itself via hydrogen bonding. Great variability among equilibrium structures was observed and shown to be directly influenced by DNA sequence and SWCNT type. For example, the (6,5)-SWCNT DNA recognition sequence, (TAT)4, was found to wrap in a tight single-stranded right-handed helical conformation. In contrast, DNA sequence T12 forms a beta-barrel left-handed structure on the same SWCNT. These are the first theoretical indications that DNA-based SWCNT selectivity can arise on a molecular level. In a biomedical collaboration with the Mayo Clinic, pathways for DNA-SWCNT internalization into healthy human endothelial cells were explored. Through absorbance spectroscopy, TEM imaging, and confocal fluorescence microscopy, we showed that intracellular concentrations of SWCNTs far exceeded those of the incubation solution, which suggested an energy-dependent pathway. Additionally, by means of pharmacological inhibition and vector-induced gene knockout studies, the DNA-SWCNTs were shown to enter the cells via Rac1-mediated macropinocytosis.
Yanagi, Tomohiro; Shirasawa, Kenta; Terachi, Mayuko; Isobe, Sachiko
2017-01-01
Cultivated strawberry ( Fragaria × ananassa Duch.) has homoeologous chromosomes because of allo-octoploidy. For example, two homoeologous chromosomes that belong to different sub-genome of allopolyploids have similar base sequences. Thus, when conducting de novo assembly of DNA sequences, it is difficult to determine whether these sequences are derived from the same chromosome. To avoid the difficulties associated with homoeologous chromosomes and demonstrate the possibility of sequencing allopolyploids using single chromosomes, we conducted sequence analysis using microdissected single somatic chromosomes of cultivated strawberry. Three hundred and ten somatic chromosomes of the Japanese octoploid strawberry 'Reiko' were individually selected under a light microscope using a microdissection system. DNA from 288 of the dissected chromosomes was successfully amplified using a DNA amplification kit. Using next-generation sequencing, we decoded the base sequences of the amplified DNA segments, and on the basis of mapping, we identified DNA sequences from 144 samples that were best matched to the reference genomes of the octoploid strawberry, F. × ananassa , and the diploid strawberry, F. vesca . The 144 samples were classified into seven pseudo-molecules of F. vesca . The coverage rates of the DNA sequences from the single chromosome onto all pseudo-molecular sequences varied from 3 to 29.9%. We demonstrated an efficient method for sequence analysis of allopolyploid plants using microdissected single chromosomes. On the basis of our results, we believe that whole-genome analysis of allopolyploid plants can be enhanced using methodology that employs microdissected single chromosomes.
DNA sequence+shape kernel enables alignment-free modeling of transcription factor binding.
Ma, Wenxiu; Yang, Lin; Rohs, Remo; Noble, William Stafford
2017-10-01
Transcription factors (TFs) bind to specific DNA sequence motifs. Several lines of evidence suggest that TF-DNA binding is mediated in part by properties of the local DNA shape: the width of the minor groove, the relative orientations of adjacent base pairs, etc. Several methods have been developed to jointly account for DNA sequence and shape properties in predicting TF binding affinity. However, a limitation of these methods is that they typically require a training set of aligned TF binding sites. We describe a sequence + shape kernel that leverages DNA sequence and shape information to better understand protein-DNA binding preference and affinity. This kernel extends an existing class of k-mer based sequence kernels, based on the recently described di-mismatch kernel. Using three in vitro benchmark datasets, derived from universal protein binding microarrays (uPBMs), genomic context PBMs (gcPBMs) and SELEX-seq data, we demonstrate that incorporating DNA shape information improves our ability to predict protein-DNA binding affinity. In particular, we observe that (i) the k-spectrum + shape model performs better than the classical k-spectrum kernel, particularly for small k values; (ii) the di-mismatch kernel performs better than the k-mer kernel, for larger k; and (iii) the di-mismatch + shape kernel performs better than the di-mismatch kernel for intermediate k values. The software is available at https://bitbucket.org/wenxiu/sequence-shape.git. rohs@usc.edu or william-noble@uw.edu. Supplementary data are available at Bioinformatics online. © The Author(s) 2017. Published by Oxford University Press.
A Method for Preparing DNA Sequencing Templates Using a DNA-Binding Microplate
Yang, Yu; Hebron, Haroun R.; Hang, Jun
2009-01-01
A DNA-binding matrix was immobilized on the surface of a 96-well microplate and used for plasmid DNA preparation for DNA sequencing. The same DNA-binding plate was used for bacterial growth, cell lysis, DNA purification, and storage. In a single step using one buffer, bacterial cells were lysed by enzymes, and released DNA was captured on the plate simultaneously. After two wash steps, DNA was eluted and stored in the same plate. Inclusion of phosphates in the culture medium was found to enhance the yield of plasmid significantly. Purified DNA samples were used successfully in DNA sequencing with high consistency and reproducibility. Eleven vectors and nine libraries were tested using this method. In 10 μl sequencing reactions using 3 μl sample and 0.25 μl BigDye Terminator v3.1, the results from a 3730xl sequencer gave a success rate of 90–95% and read-lengths of 700 bases or more. The method is fully automatable and convenient for manual operation as well. It enables reproducible, high-throughput, rapid production of DNA with purity and yields sufficient for high-quality DNA sequencing at a substantially reduced cost. PMID:19568455
Laser Desorption Mass Spectrometry for DNA Sequencing and Analysis
NASA Astrophysics Data System (ADS)
Chen, C. H. Winston; Taranenko, N. I.; Golovlev, V. V.; Isola, N. R.; Allman, S. L.
1998-03-01
Rapid DNA sequencing and/or analysis is critically important for biomedical research. In the past, gel electrophoresis has been the primary tool to achieve DNA analysis and sequencing. However, gel electrophoresis is a time-consuming and labor-extensive process. Recently, we have developed and used laser desorption mass spectrometry (LDMS) to achieve sequencing of ss-DNA longer than 100 nucleotides. With LDMS, we succeeded in sequencing DNA in seconds instead of hours or days required by gel electrophoresis. In addition to sequencing, we also applied LDMS for the detection of DNA probes for hybridization LDMS was also used to detect short tandem repeats for forensic applications. Clinical applications for disease diagnosis such as cystic fibrosis caused by base deletion and point mutation have also been demonstrated. Experimental details will be presented in the meeting. abstract.
Ong, Hui San; Rahim, Mohd Syafiq; Firdaus-Raih, Mohd; Ramlan, Effirul Ikhwan
2015-01-01
The unique programmability of nucleic acids offers alternative in constructing excitable and functional nanostructures. This work introduces an autonomous protocol to construct DNA Tetris shapes (L-Shape, B-Shape, T-Shape and I-Shape) using modular DNA blocks. The protocol exploits the rich number of sequence combinations available from the nucleic acid alphabets, thus allowing for diversity to be applied in designing various DNA nanostructures. Instead of a deterministic set of sequences corresponding to a particular design, the protocol promotes a large pool of DNA shapes that can assemble to conform to any desired structures. By utilising evolutionary programming in the design stage, DNA blocks are subjected to processes such as sequence insertion, deletion and base shifting in order to enrich the diversity of the resulting shapes based on a set of cascading filters. The optimisation algorithm allows mutation to be exerted indefinitely on the candidate sequences until these sequences complied with all the four fitness criteria. Generated candidates from the protocol are in agreement with the filter cascades and thermodynamic simulation. Further validation using gel electrophoresis indicated the formation of the designed shapes. Thus, supporting the plausibility of constructing DNA nanostructures in a more hierarchical, modular, and interchangeable manner.
Sunflower centromeres consist of a centromere-specific LINE and a chromosome-specific tandem repeat.
Nagaki, Kiyotaka; Tanaka, Keisuke; Yamaji, Naoki; Kobayashi, Hisato; Murata, Minoru
2015-01-01
The kinetochore is a protein complex including kinetochore-specific proteins that plays a role in chromatid segregation during mitosis and meiosis. The complex associates with centromeric DNA sequences that are usually species-specific. In plant species, tandem repeats including satellite DNA sequences and retrotransposons have been reported as centromeric DNA sequences. In this study on sunflowers, a cDNA-encoding centromere-specific histone H3 (CENH3) was isolated from a cDNA pool from a seedling, and an antibody was raised against a peptide synthesized from the deduced cDNA. The antibody specifically recognized the sunflower CENH3 (HaCENH3) and showed centromeric signals by immunostaining and immunohistochemical staining analysis. The antibody was also applied in chromatin immunoprecipitation (ChIP)-Seq to isolate centromeric DNA sequences and two different types of repetitive DNA sequences were identified. One was a long interspersed nuclear element (LINE)-like sequence, which showed centromere-specific signals on almost all chromosomes in sunflowers. This is the first report of a centromeric LINE sequence, suggesting possible centromere targeting ability. Another type of identified repetitive DNA was a tandem repeat sequence with a 187-bp unit that was found only on a pair of chromosomes. The HaCENH3 content of the tandem repeats was estimated to be much higher than that of the LINE, which implies centromere evolution from LINE-based centromeres to more stable tandem-repeat-based centromeres. In addition, the epigenetic status of the sunflower centromeres was investigated by immunohistochemical staining and ChIP, and it was found that centromeres were heterochromatic.
Eastman, Alexander W.; Yuan, Ze-Chun
2015-01-01
Advances in sequencing technology have drastically increased the depth and feasibility of bacterial genome sequencing. However, little information is available that details the specific techniques and procedures employed during genome sequencing despite the large numbers of published genomes. Shotgun approaches employed by second-generation sequencing platforms has necessitated the development of robust bioinformatics tools for in silico assembly, and complete assembly is limited by the presence of repetitive DNA sequences and multi-copy operons. Typically, re-sequencing with multiple platforms and laborious, targeted Sanger sequencing are employed to finish a draft bacterial genome. Here we describe a novel strategy based on the identification and targeted sequencing of repetitive rDNA operons to expedite bacterial genome assembly and finishing. Our strategy was validated by finishing the genome of Paenibacillus polymyxa strain CR1, a bacterium with potential in sustainable agriculture and bio-based processes. An analysis of the 38 contigs contained in the P. polymyxa strain CR1 draft genome revealed 12 repetitive rDNA operons with varied intragenic and flanking regions of variable length, unanimously located at contig boundaries and within contig gaps. These highly similar but not identical rDNA operons were experimentally verified and sequenced simultaneously with multiple, specially designed primer sets. This approach also identified and corrected significant sequence rearrangement generated during the initial in silico assembly of sequencing reads. Our approach reduces the required effort associated with blind primer walking for contig assembly, increasing both the speed and feasibility of genome finishing. Our study further reinforces the notion that repetitive DNA elements are major limiting factors for genome finishing. Moreover, we provided a step-by-step workflow for genome finishing, which may guide future bacterial genome finishing projects. PMID:25653642
Cartwright, Joseph F; Anderson, Karin; Longworth, Joseph; Lobb, Philip; James, David C
2018-06-01
High-fidelity replication of biologic-encoding recombinant DNA sequences by engineered mammalian cell cultures is an essential pre-requisite for the development of stable cell lines for the production of biotherapeutics. However, immortalized mammalian cells characteristically exhibit an increased point mutation frequency compared to mammalian cells in vivo, both across their genomes and at specific loci (hotspots). Thus unforeseen mutations in recombinant DNA sequences can arise and be maintained within producer cell populations. These may affect both the stability of recombinant gene expression and give rise to protein sequence variants with variable bioactivity and immunogenicity. Rigorous quantitative assessment of recombinant DNA integrity should therefore form part of the cell line development process and be an essential quality assurance metric for instances where synthetic/multi-component assemblies are utilized to engineer mammalian cells, such as the assessment of recombinant DNA fidelity or the mutability of single-site integration target loci. Based on Pacific Biosciences (Menlo Park, CA) single molecule real-time (SMRT™) circular consensus sequencing (CCS) technology we developed a rDNA sequence analysis tool to process the multi-parallel sequencing of ∼40,000 single recombinant DNA molecules. After statistical filtering of raw sequencing data, we show that this analytical method is capable of detecting single point mutations in rDNA to a minimum single mutation frequency of 0.0042% (<1/24,000 bases). Using a stable CHO transfectant pool harboring a randomly integrated 5 kB plasmid construct encoding GFP we found that 28% of recombinant plasmid copies contained at least one low frequency (<0.3%) point mutation. These mutations were predominantly found in GC base pairs (85%) and that there was no positional bias in mutation across the plasmid sequence. There was no discernable difference between the mutation frequencies of coding and non-coding DNA. The putative ratio of non-synonymous and synonymous changes within the open reading frames (ORFs) in the plasmid sequence indicates that natural selection does not impact upon the prevalence of these mutations. Here we have demonstrated the abundance of mutations that fall outside of the reported range of detection of next generation sequencing (NGS) and second generation sequencing (SGS) platforms, providing a methodology capable of being utilized in cell line development platforms to identify the fidelity of recombinant genes throughout the production process. © 2018 Wiley Periodicals, Inc.
Binladen, Jonas; Gilbert, M Thomas P; Bollback, Jonathan P; Panitz, Frank; Bendixen, Christian; Nielsen, Rasmus; Willerslev, Eske
2007-02-14
The invention of the Genome Sequence 20 DNA Sequencing System (454 parallel sequencing platform) has enabled the rapid and high-volume production of sequence data. Until now, however, individual emulsion PCR (emPCR) reactions and subsequent sequencing runs have been unable to combine template DNA from multiple individuals, as homologous sequences cannot be subsequently assigned to their original sources. We use conventional PCR with 5'-nucleotide tagged primers to generate homologous DNA amplification products from multiple specimens, followed by sequencing through the high-throughput Genome Sequence 20 DNA Sequencing System (GS20, Roche/454 Life Sciences). Each DNA sequence is subsequently traced back to its individual source through 5'tag-analysis. We demonstrate that this new approach enables the assignment of virtually all the generated DNA sequences to the correct source once sequencing anomalies are accounted for (miss-assignment rate<0.4%). Therefore, the method enables accurate sequencing and assignment of homologous DNA sequences from multiple sources in single high-throughput GS20 run. We observe a bias in the distribution of the differently tagged primers that is dependent on the 5' nucleotide of the tag. In particular, primers 5' labelled with a cytosine are heavily overrepresented among the final sequences, while those 5' labelled with a thymine are strongly underrepresented. A weaker bias also exists with regards to the distribution of the sequences as sorted by the second nucleotide of the dinucleotide tags. As the results are based on a single GS20 run, the general applicability of the approach requires confirmation. However, our experiments demonstrate that 5'primer tagging is a useful method in which the sequencing power of the GS20 can be applied to PCR-based assays of multiple homologous PCR products. The new approach will be of value to a broad range of research areas, such as those of comparative genomics, complete mitochondrial analyses, population genetics, and phylogenetics.
Fukunaga, Kenji; Ichitani, Katsuyuki; Taura, Satoru; Sato, Muneharu; Kawase, Makoto
2005-02-01
We determined the sequence of ribosomal DNA (rDNA) intergenic spacer (IGS) of foxtail millet isolated in our previous study, and identified subrepeats in the polymorphic region. We also developed a PCR-based method for identifying rDNA types based on sequence information and assessed 153 accessions of foxtail millet. Results were congruent with our previous works. This study provides new findings regarding the geographical distribution of rDNA variants. This new method facilitates analyses of numerous foxtail millet accessions. It is helpful for typing of foxtail millet germplasms and elucidating the evolution of this millet.
NASA Astrophysics Data System (ADS)
Sun, S. M.; Slightom, J. L.; Hall, T. C.
1981-01-01
A plant gene coding for the major storage protein (phaseolin, G1-globulin) of the French bean was isolated from a genomic library constructed in the phage vector Charon 24A. Comparison of the nucleotide sequence of part of the gene with that of the cloned messenger RNA (cDNA) revealed the presence of three intervening sequences, all beginning with GTand ending with AG. The 5' and 3' boundaries of intervening sequences TVS-A (88 base pairs) and IVS-B (124 base pairs) are similar to those described for animal and viral genes, but the 3' boundary of IVS-C (129 base pairs) shows some differences. A sequence of 185 amino acids deduced from the cloned DMAs represents about 40% of a phaseolin polypeptide.
Liu, Ning; Tian, Ru; Loeb, Daniel D.
2003-01-01
Synthesis of the relaxed-circular (RC) DNA genome of hepadnaviruses requires two template switches during plus-strand DNA synthesis: primer translocation and circularization. Although primer translocation and circularization use different donor and acceptor sequences, and are distinct temporally, they share the common theme of switching from one end of the minus-strand template to the other end. Studies of duck hepatitis B virus have indicated that, in addition to the donor and acceptor sequences, three other cis-acting sequences, named 3E, M, and 5E, are required for the synthesis of RC DNA by contributing to primer translocation and circularization. The mechanism by which 3E, M, and 5E act was not known. We present evidence that these sequences function by base pairing with each other within the minus-strand template. 3E base-pairs with one portion of M (M3) and 5E base-pairs with an adjacent portion of M (M5). We found that disrupting base pairing between 3E and M3 and between 5E and M5 inhibited primer translocation and circularization. More importantly, restoring base pairing with mutant sequences restored the production of RC DNA. These results are consistent with the model that, within duck hepatitis B virus capsids, the ends of the minus-strand template are juxtaposed via base pairing to facilitate the two template switches during plus-strand DNA synthesis. PMID:12578983
DNA copy number, including telomeres and mitochondria, assayed using next-generation sequencing.
Castle, John C; Biery, Matthew; Bouzek, Heather; Xie, Tao; Chen, Ronghua; Misura, Kira; Jackson, Stuart; Armour, Christopher D; Johnson, Jason M; Rohl, Carol A; Raymond, Christopher K
2010-04-16
DNA copy number variations occur within populations and aberrations can cause disease. We sought to develop an improved lab-automatable, cost-efficient, accurate platform to profile DNA copy number. We developed a sequencing-based assay of nuclear, mitochondrial, and telomeric DNA copy number that draws on the unbiased nature of next-generation sequencing and incorporates techniques developed for RNA expression profiling. To demonstrate this platform, we assayed UMC-11 cells using 5 million 33 nt reads and found tremendous copy number variation, including regions of single and homogeneous deletions and amplifications to 29 copies; 5 times more mitochondria and 4 times less telomeric sequence than a pool of non-diseased, blood-derived DNA; and that UMC-11 was derived from a male individual. The described assay outputs absolute copy number, outputs an error estimate (p-value), and is more accurate than array-based platforms at high copy number. The platform enables profiling of mitochondrial levels and telomeric length. The assay is lab-automatable and has a genomic resolution and cost that are tunable based on the number of sequence reads.
Portable and Error-Free DNA-Based Data Storage.
Yazdi, S M Hossein Tabatabaei; Gabrys, Ryan; Milenkovic, Olgica
2017-07-10
DNA-based data storage is an emerging nonvolatile memory technology of potentially unprecedented density, durability, and replication efficiency. The basic system implementation steps include synthesizing DNA strings that contain user information and subsequently retrieving them via high-throughput sequencing technologies. Existing architectures enable reading and writing but do not offer random-access and error-free data recovery from low-cost, portable devices, which is crucial for making the storage technology competitive with classical recorders. Here we show for the first time that a portable, random-access platform may be implemented in practice using nanopore sequencers. The novelty of our approach is to design an integrated processing pipeline that encodes data to avoid costly synthesis and sequencing errors, enables random access through addressing, and leverages efficient portable sequencing via new iterative alignment and deletion error-correcting codes. Our work represents the only known random access DNA-based data storage system that uses error-prone nanopore sequencers, while still producing error-free readouts with the highest reported information rate/density. As such, it represents a crucial step towards practical employment of DNA molecules as storage media.
DNA copy number, including telomeres and mitochondria, assayed using next-generation sequencing
2010-01-01
Background DNA copy number variations occur within populations and aberrations can cause disease. We sought to develop an improved lab-automatable, cost-efficient, accurate platform to profile DNA copy number. Results We developed a sequencing-based assay of nuclear, mitochondrial, and telomeric DNA copy number that draws on the unbiased nature of next-generation sequencing and incorporates techniques developed for RNA expression profiling. To demonstrate this platform, we assayed UMC-11 cells using 5 million 33 nt reads and found tremendous copy number variation, including regions of single and homogeneous deletions and amplifications to 29 copies; 5 times more mitochondria and 4 times less telomeric sequence than a pool of non-diseased, blood-derived DNA; and that UMC-11 was derived from a male individual. Conclusion The described assay outputs absolute copy number, outputs an error estimate (p-value), and is more accurate than array-based platforms at high copy number. The platform enables profiling of mitochondrial levels and telomeric length. The assay is lab-automatable and has a genomic resolution and cost that are tunable based on the number of sequence reads. PMID:20398377
Torque measurements reveal sequence-specific cooperative transitions in supercoiled DNA
Oberstrass, Florian C.; Fernandes, Louis E.; Bryant, Zev
2012-01-01
B-DNA becomes unstable under superhelical stress and is able to adopt a wide range of alternative conformations including strand-separated DNA and Z-DNA. Localized sequence-dependent structural transitions are important for the regulation of biological processes such as DNA replication and transcription. To directly probe the effect of sequence on structural transitions driven by torque, we have measured the torsional response of a panel of DNA sequences using single molecule assays that employ nanosphere rotational probes to achieve high torque resolution. The responses of Z-forming d(pGpC)n sequences match our predictions based on a theoretical treatment of cooperative transitions in helical polymers. “Bubble” templates containing 50–100 bp mismatch regions show cooperative structural transitions similar to B-DNA, although less torque is required to disrupt strand–strand interactions. Our mechanical measurements, including direct characterization of the torsional rigidity of strand-separated DNA, establish a framework for quantitative predictions of the complex torsional response of arbitrary sequences in their biological context. PMID:22474350
Creation of a type IIS restriction endonuclease with a long recognition sequence
Lippow, Shaun M.; Aha, Patti M.; Parker, Matthew H.; Blake, William J.; Baynes, Brian M.; Lipovšek, Daša
2009-01-01
Type IIS restriction endonucleases cleave DNA outside their recognition sequences, and are therefore particularly useful in the assembly of DNA from smaller fragments. A limitation of type IIS restriction endonucleases in assembly of long DNA sequences is the relative abundance of their target sites. To facilitate ligation-based assembly of extremely long pieces of DNA, we have engineered a new type IIS restriction endonuclease that combines the specificity of the homing endonuclease I-SceI with the type IIS cleavage pattern of FokI. We linked a non-cleaving mutant of I-SceI, which conveys to the chimeric enzyme its specificity for an 18-bp DNA sequence, to the catalytic domain of FokI, which cuts DNA at a defined site outside the target site. Whereas previously described chimeric endonucleases do not produce type IIS-like precise DNA overhangs suitable for ligation, our chimeric endonuclease cleaves double-stranded DNA exactly 2 and 6 nt from the target site to generate homogeneous, 5′, four-base overhangs, which can be ligated with 90% fidelity. We anticipate that these enzymes will be particularly useful in manipulation of DNA fragments larger than a thousand bases, which are very likely to contain target sites for all natural type IIS restriction endonucleases. PMID:19304757
What Information is Stored in DNA: Does it Contain Digital Error Correcting Codes?
NASA Astrophysics Data System (ADS)
Liebovitch, Larry
1998-03-01
The longest term correlations in living systems are the information stored in DNA which reflects the evolutionary history of an organism. The 4 bases (A,T,G,C) encode sequences of amino acids as well as locations of binding sites for proteins that regulate DNA. The fidelity of this important information is maintained by ANALOG error check mechanisms. When a single strand of DNA is replicated the complementary base is inserted in the new strand. Sometimes the wrong base is inserted that sticks out disrupting the phosphate backbone. The new base is not yet methylated, so repair enzymes, that slide along the DNA, can tear out the wrong base and replace it with the right one. The bases in DNA form a sequence of 4 different symbols and so the information is encoded in a DIGITAL form. All the digital codes in our society (ISBN book numbers, UPC product codes, bank account numbers, airline ticket numbers) use error checking code, where some digits are functions of other digits to maintain the fidelity of transmitted informaiton. Does DNA also utitlize a DIGITAL error chekcing code to maintain the fidelity of its information and increase the accuracy of replication? That is, are some bases in DNA functions of other bases upstream or downstream? This raises the interesting mathematical problem: How does one determine whether some symbols in a sequence of symbols are a function of other symbols. It also bears on the issue of determining algorithmic complexity: What is the function that generates the shortest algorithm for reproducing the symbol sequence. The error checking codes most used in our technology are linear block codes. We developed an efficient method to test for the presence of such codes in DNA. We coded the 4 bases as (0,1,2,3) and used Gaussian elimination, modified for modulus 4, to test if some bases are linear combinations of other bases. We used this method to analyze the base sequence in the genes from the lac operon and cytochrome C. We did not find evidence for such error correcting codes in these genes. However, we analyzed only a small amount of DNA and if digitial error correcting schemes are present in DNA, they may be more subtle than such simple linear block codes. The basic issue we raise here, is how information is stored in DNA and an appreciation that digital symbol sequences, such as DNA, admit of interesting schemes to store and protect the fidelity of their information content. Liebovitch, Tao, Todorov, Levine. 1996. Biophys. J. 71:1539-1544. Supported by NIH grant EY6234.
PuLSE: Quality control and quantification of peptide sequences explored by phage display libraries.
Shave, Steven; Mann, Stefan; Koszela, Joanna; Kerr, Alastair; Auer, Manfred
2018-01-01
The design of highly diverse phage display libraries is based on assumption that DNA bases are incorporated at similar rates within the randomized sequence. As library complexity increases and expected copy numbers of unique sequences decrease, the exploration of library space becomes sparser and the presence of truly random sequences becomes critical. We present the program PuLSE (Phage Library Sequence Evaluation) as a tool for assessing randomness and therefore diversity of phage display libraries. PuLSE runs on a collection of sequence reads in the fastq file format and generates tables profiling the library in terms of unique DNA sequence counts and positions, translated peptide sequences, and normalized 'expected' occurrences from base to residue codon frequencies. The output allows at-a-glance quantitative quality control of a phage library in terms of sequence coverage both at the DNA base and translated protein residue level, which has been missing from toolsets and literature. The open source program PuLSE is available in two formats, a C++ source code package for compilation and integration into existing bioinformatics pipelines and precompiled binaries for ease of use.
Liu, Bin; Wang, Shanyi; Dong, Qiwen; Li, Shumin; Liu, Xuan
2016-04-20
DNA-binding proteins play a pivotal role in various intra- and extra-cellular activities ranging from DNA replication to gene expression control. With the rapid development of next generation of sequencing technique, the number of protein sequences is unprecedentedly increasing. Thus it is necessary to develop computational methods to identify the DNA-binding proteins only based on the protein sequence information. In this study, a novel method called iDNA-KACC is presented, which combines the Support Vector Machine (SVM) and the auto-cross covariance transformation. The protein sequences are first converted into profile-based protein representation, and then converted into a series of fixed-length vectors by the auto-cross covariance transformation with Kmer composition. The sequence order effect can be effectively captured by this scheme. These vectors are then fed into Support Vector Machine (SVM) to discriminate the DNA-binding proteins from the non DNA-binding ones. iDNA-KACC achieves an overall accuracy of 75.16% and Matthew correlation coefficient of 0.5 by a rigorous jackknife test. Its performance is further improved by employing an ensemble learning approach, and the improved predictor is called iDNA-KACC-EL. Experimental results on an independent dataset shows that iDNA-KACC-EL outperforms all the other state-of-the-art predictors, indicating that it would be a useful computational tool for DNA binding protein identification. .
Flow cytometric detection method for DNA samples
Nasarabadi, Shanavaz [Livermore, CA; Langlois, Richard G [Livermore, CA; Venkateswaran, Kodumudi S [Round Rock, TX
2011-07-05
Disclosed herein are two methods for rapid multiplex analysis to determine the presence and identity of target DNA sequences within a DNA sample. Both methods use reporting DNA sequences, e.g., modified conventional Taqman.RTM. probes, to combine multiplex PCR amplification with microsphere-based hybridization using flow cytometry means of detection. Real-time PCR detection can also be incorporated. The first method uses a cyanine dye, such as, Cy3.TM., as the reporter linked to the 5' end of a reporting DNA sequence. The second method positions a reporter dye, e.g., FAM.TM. on the 3' end of the reporting DNA sequence and a quencher dye, e.g., TAMRA.TM., on the 5' end.
Flow cytometric detection method for DNA samples
Nasarabadi, Shanavaz [Livermore, CA; Langlois, Richard G [Livermore, CA; Venkateswaran, Kodumudi S [Livermore, CA
2006-08-01
Disclosed herein are two methods for rapid multiplex analysis to determine the presence and identity of target DNA sequences within a DNA sample. Both methods use reporting DNA sequences, e.g., modified conventional Taqman.RTM. probes, to combine multiplex PCR amplification with microsphere-based hybridization using flow cytometry means of detection. Real-time PCR detection can also be incorporated. The first method uses a cyanine dye, such as, Cy3.TM., as the reporter linked to the 5' end of a reporting DNA sequence. The second method positions a reporter dye, e.g., FAM, on the 3' end of the reporting DNA sequence and a quencher dye, e.g., TAMRA, on the 5' end.
Highly multiplexed targeted DNA sequencing from single nuclei.
Leung, Marco L; Wang, Yong; Kim, Charissa; Gao, Ruli; Jiang, Jerry; Sei, Emi; Navin, Nicholas E
2016-02-01
Single-cell DNA sequencing methods are challenged by poor physical coverage, high technical error rates and low throughput. To address these issues, we developed a single-cell DNA sequencing protocol that combines flow-sorting of single nuclei, time-limited multiple-displacement amplification (MDA), low-input library preparation, DNA barcoding, targeted capture and next-generation sequencing (NGS). This approach represents a major improvement over our previous single nucleus sequencing (SNS) Nature Protocols paper in terms of generating higher-coverage data (>90%), thereby enabling the detection of genome-wide variants in single mammalian cells at base-pair resolution. Furthermore, by pooling 48-96 single-cell libraries together for targeted capture, this approach can be used to sequence many single-cell libraries in parallel in a single reaction. This protocol greatly reduces the cost of single-cell DNA sequencing, and it can be completed in 5-6 d by advanced users. This single-cell DNA sequencing protocol has broad applications for studying rare cells and complex populations in diverse fields of biological research and medicine.
Protein Crystal Eco R1 Endonulease-DNA Complex
NASA Technical Reports Server (NTRS)
1998-01-01
Type II restriction enzymes, such as Eco R1 endonulease, present a unique advantage for the study of sequence-specific recognition because they leave a record of where they have been in the form of the cleaved ends of the DNA sites where they were bound. The differential behavior of a sequence -specific protein at sites of differing base sequence is the essence of the sequence-specificity; the core question is how do these proteins discriminate between different DNA sequences especially when the two sequences are very similar. Principal Investigator: Dan Carter/New Century Pharmaceuticals
Winnowing DNA for Rare Sequences: Highly Specific Sequence and Methylation Based Enrichment
Thompson, Jason D.; Shibahara, Gosuke; Rajan, Sweta; Pel, Joel; Marziali, Andre
2012-01-01
Rare mutations in cell populations are known to be hallmarks of many diseases and cancers. Similarly, differential DNA methylation patterns arise in rare cell populations with diagnostic potential such as fetal cells circulating in maternal blood. Unfortunately, the frequency of alleles with diagnostic potential, relative to wild-type background sequence, is often well below the frequency of errors in currently available methods for sequence analysis, including very high throughput DNA sequencing. We demonstrate a DNA preparation and purification method that through non-linear electrophoretic separation in media containing oligonucleotide probes, achieves 10,000 fold enrichment of target DNA with single nucleotide specificity, and 100 fold enrichment of unmodified methylated DNA differing from the background by the methylation of a single cytosine residue. PMID:22355378
Hykin, Sarah M.; Bi, Ke; McGuire, Jimmy A.
2015-01-01
For 150 years or more, specimens were routinely collected and deposited in natural history collections without preserving fresh tissue samples for genetic analysis. In the case of most herpetological specimens (i.e. amphibians and reptiles), attempts to extract and sequence DNA from formalin-fixed, ethanol-preserved specimens—particularly for use in phylogenetic analyses—has been laborious and largely ineffective due to the highly fragmented nature of the DNA. As a result, tens of thousands of specimens in herpetological collections have not been available for sequence-based phylogenetic studies. Massively parallel High-Throughput Sequencing methods and the associated bioinformatics, however, are particularly suited to recovering meaningful genetic markers from severely degraded/fragmented DNA sequences such as DNA damaged by formalin-fixation. In this study, we compared previously published DNA extraction methods on three tissue types subsampled from formalin-fixed specimens of Anolis carolinensis, followed by sequencing. Sufficient quality DNA was recovered from liver tissue, making this technique minimally destructive to museum specimens. Sequencing was only successful for the more recently collected specimen (collected ~30 ybp). We suspect this could be due either to the conditions of preservation and/or the amount of tissue used for extraction purposes. For the successfully sequenced sample, we found a high rate of base misincorporation. After rigorous trimming, we successfully mapped 27.93% of the cleaned reads to the reference genome, were able to reconstruct the complete mitochondrial genome, and recovered an accurate phylogenetic placement for our specimen. We conclude that the amount of DNA available, which can vary depending on specimen age and preservation conditions, will determine if sequencing will be successful. The technique described here will greatly improve the value of museum collections by making many formalin-fixed specimens available for genetic analysis. PMID:26505622
Hykin, Sarah M; Bi, Ke; McGuire, Jimmy A
2015-01-01
For 150 years or more, specimens were routinely collected and deposited in natural history collections without preserving fresh tissue samples for genetic analysis. In the case of most herpetological specimens (i.e. amphibians and reptiles), attempts to extract and sequence DNA from formalin-fixed, ethanol-preserved specimens-particularly for use in phylogenetic analyses-has been laborious and largely ineffective due to the highly fragmented nature of the DNA. As a result, tens of thousands of specimens in herpetological collections have not been available for sequence-based phylogenetic studies. Massively parallel High-Throughput Sequencing methods and the associated bioinformatics, however, are particularly suited to recovering meaningful genetic markers from severely degraded/fragmented DNA sequences such as DNA damaged by formalin-fixation. In this study, we compared previously published DNA extraction methods on three tissue types subsampled from formalin-fixed specimens of Anolis carolinensis, followed by sequencing. Sufficient quality DNA was recovered from liver tissue, making this technique minimally destructive to museum specimens. Sequencing was only successful for the more recently collected specimen (collected ~30 ybp). We suspect this could be due either to the conditions of preservation and/or the amount of tissue used for extraction purposes. For the successfully sequenced sample, we found a high rate of base misincorporation. After rigorous trimming, we successfully mapped 27.93% of the cleaned reads to the reference genome, were able to reconstruct the complete mitochondrial genome, and recovered an accurate phylogenetic placement for our specimen. We conclude that the amount of DNA available, which can vary depending on specimen age and preservation conditions, will determine if sequencing will be successful. The technique described here will greatly improve the value of museum collections by making many formalin-fixed specimens available for genetic analysis.
BiQ Analyzer HT: locus-specific analysis of DNA methylation by high-throughput bisulfite sequencing
Lutsik, Pavlo; Feuerbach, Lars; Arand, Julia; Lengauer, Thomas; Walter, Jörn; Bock, Christoph
2011-01-01
Bisulfite sequencing is a widely used method for measuring DNA methylation in eukaryotic genomes. The assay provides single-base pair resolution and, given sufficient sequencing depth, its quantitative accuracy is excellent. High-throughput sequencing of bisulfite-converted DNA can be applied either genome wide or targeted to a defined set of genomic loci (e.g. using locus-specific PCR primers or DNA capture probes). Here, we describe BiQ Analyzer HT (http://biq-analyzer-ht.bioinf.mpi-inf.mpg.de/), a user-friendly software tool that supports locus-specific analysis and visualization of high-throughput bisulfite sequencing data. The software facilitates the shift from time-consuming clonal bisulfite sequencing to the more quantitative and cost-efficient use of high-throughput sequencing for studying locus-specific DNA methylation patterns. In addition, it is useful for locus-specific visualization of genome-wide bisulfite sequencing data. PMID:21565797
Simulations Using Random-Generated DNA and RNA Sequences
ERIC Educational Resources Information Center
Bryce, C. F. A.
1977-01-01
Using a very simple computer program written in BASIC, a very large number of random-generated DNA or RNA sequences are obtained. Students use these sequences to predict complementary sequences and translational products, evaluate base compositions, determine frequencies of particular triplet codons, and suggest possible secondary structures.…
Accurate multiplex polony sequencing of an evolved bacterial genome.
Shendure, Jay; Porreca, Gregory J; Reppas, Nikos B; Lin, Xiaoxia; McCutcheon, John P; Rosenbaum, Abraham M; Wang, Michael D; Zhang, Kun; Mitra, Robi D; Church, George M
2005-09-09
We describe a DNA sequencing technology in which a commonly available, inexpensive epifluorescence microscope is converted to rapid nonelectrophoretic DNA sequencing automation. We apply this technology to resequence an evolved strain of Escherichia coli at less than one error per million consensus bases. A cell-free, mate-paired library provided single DNA molecules that were amplified in parallel to 1-micrometer beads by emulsion polymerase chain reaction. Millions of beads were immobilized in a polyacrylamide gel and subjected to automated cycles of sequencing by ligation and four-color imaging. Cost per base was roughly one-ninth as much as that of conventional sequencing. Our protocols were implemented with off-the-shelf instrumentation and reagents.
DNA base-calling from a nanopore using a Viterbi algorithm.
Timp, Winston; Comer, Jeffrey; Aksimentiev, Aleksei
2012-05-16
Nanopore-based DNA sequencing is the most promising third-generation sequencing method. It has superior read length, speed, and sample requirements compared with state-of-the-art second-generation methods. However, base-calling still presents substantial difficulty because the resolution of the technique is limited compared with the measured signal/noise ratio. Here we demonstrate a method to decode 3-bp-resolution nanopore electrical measurements into a DNA sequence using a Hidden Markov model. This method shows tremendous potential for accuracy (~98%), even with a poor signal/noise ratio. Copyright © 2012 Biophysical Society. Published by Elsevier Inc. All rights reserved.
DNA-based random number generation in security circuitry.
Gearheart, Christy M; Arazi, Benjamin; Rouchka, Eric C
2010-06-01
DNA-based circuit design is an area of research in which traditional silicon-based technologies are replaced by naturally occurring phenomena taken from biochemistry and molecular biology. This research focuses on further developing DNA-based methodologies to mimic digital data manipulation. While exhibiting fundamental principles, this work was done in conjunction with the vision that DNA-based circuitry, when the technology matures, will form the basis for a tamper-proof security module, revolutionizing the meaning and concept of tamper-proofing and possibly preventing it altogether based on accurate scientific observations. A paramount part of such a solution would be self-generation of random numbers. A novel prototype schema employs solid phase synthesis of oligonucleotides for random construction of DNA sequences; temporary storage and retrieval is achieved through plasmid vectors. A discussion of how to evaluate sequence randomness is included, as well as how these techniques are applied to a simulation of the random number generation circuitry. Simulation results show generated sequences successfully pass three selected NIST random number generation tests specified for security applications.
Blastocystis phylogeny among various isolates from humans to insects.
Yoshikawa, Hisao; Koyama, Yukiko; Tsuchiya, Erika; Takami, Kazutoshi
2016-12-01
Blastocystis is a common unicellular eukaryotic parasite found not only in humans, but also in various kinds of animal species worldwide. Since Blastocystis isolates are morphologically indistinguishable, many molecular biological approaches have been applied to classify these isolates. The complete or partial sequences of the small subunit rRNA gene (SSU rDNA) are mainly used for comparisons and phylogenetic analyses among Blastocystis isolates. However, various lengths of the partial SSU rDNA sequence have been used for phylogenetic inference among genetically different isolates. Based on the complete SSU rDNA sequences, consensus terminology of nine subtypes (STs) of Blastocystis sp. that were supported by phylogenetically monophyletic nine clades was proposed in 2007. Thereafter, eight additional kinds of STs comprising non-human mammalian Blastocystis isolates have been reported based on the phylogeny of SSU rDNA sequences, while STs 11 and 12 were only proposed on the base of partial sequences. Although many sequence data from mammalian and avian Blastocystis are registered in GenBank, only limited data on SSU rDNA are available for poikilotherm-derived Blastocystis isolates. Therefore, the phylogenetic positions of the reptilian/amphibian Blastocystis clades are unstable. The phylogenetic inference of various STs comprising mammalian and/or avian Blastocystis isolates was verified herein based on comparisons between partial and complete SSU rDNA sequences, and the phylogenetic positions of reptilian and amphibian Blastocystis isolates were also investigated using 14 new Blastocystis isolates from reptiles with all known isolates from other reptilians, amphibians, and insects registered in GenBank. Copyright © 2016. Published by Elsevier Ireland Ltd.
Raman-based system for DNA sequencing-mapping and other separations
Vo-Dinh, Tuan
1994-01-01
DNA sequencing and mapping are performed by using a Raman spectrometer with a surface enhanced Raman scattering (SERS) substrate to enhance the Raman signal. A SERS label is attached to a DNA fragment and then analyzed with the Raman spectrometer to identify the DNA fragment according to characteristics of the Raman spectrum generated.
Enlightenment of Yeast Mitochondrial Homoplasmy: Diversified Roles of Gene Conversion
Ling, Feng; Mikawa, Tsutomu; Shibata, Takehiko
2011-01-01
Mitochondria have their own genomic DNA. Unlike the nuclear genome, each cell contains hundreds to thousands of copies of mitochondrial DNA (mtDNA). The copies of mtDNA tend to have heterogeneous sequences, due to the high frequency of mutagenesis, but are quickly homogenized within a cell (“homoplasmy”) during vegetative cell growth or through a few sexual generations. Heteroplasmy is strongly associated with mitochondrial diseases, diabetes and aging. Recent studies revealed that the yeast cell has the machinery to homogenize mtDNA, using a common DNA processing pathway with gene conversion; i.e., both genetic events are initiated by a double-stranded break, which is processed into 3′ single-stranded tails. One of the tails is base-paired with the complementary sequence of the recipient double-stranded DNA to form a D-loop (homologous pairing), in which repair DNA synthesis is initiated to restore the sequence lost by the breakage. Gene conversion generates sequence diversity, depending on the divergence between the donor and recipient sequences, especially when it occurs among a number of copies of a DNA sequence family with some sequence variations, such as in immunoglobulin diversification in chicken. MtDNA can be regarded as a sequence family, in which the members tend to be diversified by a high frequency of spontaneous mutagenesis. Thus, it would be interesting to determine why and how double-stranded breakage and D-loop formation induce sequence homogenization in mitochondria and sequence diversification in nuclear DNA. We will review the mechanisms and roles of mtDNA homoplasmy, in contrast to nuclear gene conversion, which diversifies gene and genome sequences, to provide clues toward understanding how the common DNA processing pathway results in such divergent outcomes. PMID:24710143
Stranges, P. Benjamin; Palla, Mirkó; Kalachikov, Sergey; Nivala, Jeff; Dorwart, Michael; Trans, Andrew; Kumar, Shiv; Porel, Mintu; Chien, Minchen; Tao, Chuanjuan; Morozova, Irina; Li, Zengmin; Shi, Shundi; Aberra, Aman; Arnold, Cleoma; Yang, Alexander; Aguirre, Anne; Harada, Eric T.; Korenblum, Daniel; Pollard, James; Bhat, Ashwini; Gremyachinskiy, Dmitriy; Bibillo, Arek; Chen, Roger; Davis, Randy; Russo, James J.; Fuller, Carl W.; Roever, Stefan; Ju, Jingyue; Church, George M.
2016-01-01
Scalable, high-throughput DNA sequencing is a prerequisite for precision medicine and biomedical research. Recently, we presented a nanopore-based sequencing-by-synthesis (Nanopore-SBS) approach, which used a set of nucleotides with polymer tags that allow discrimination of the nucleotides in a biological nanopore. Here, we designed and covalently coupled a DNA polymerase to an α-hemolysin (αHL) heptamer using the SpyCatcher/SpyTag conjugation approach. These porin–polymerase conjugates were inserted into lipid bilayers on a complementary metal oxide semiconductor (CMOS)-based electrode array for high-throughput electrical recording of DNA synthesis. The designed nanopore construct successfully detected the capture of tagged nucleotides complementary to a DNA base on a provided template. We measured over 200 tagged-nucleotide signals for each of the four bases and developed a classification method to uniquely distinguish them from each other and background signals. The probability of falsely identifying a background event as a true capture event was less than 1.2%. In the presence of all four tagged nucleotides, we observed sequential additions in real time during polymerase-catalyzed DNA synthesis. Single-polymerase coupling to a nanopore, in combination with the Nanopore-SBS approach, can provide the foundation for a low-cost, single-molecule, electronic DNA-sequencing platform. PMID:27729524
Chung, Jongsuk; Son, Dae-Soon; Jeon, Hyo-Jeong; Kim, Kyoung-Mee; Park, Gahee; Ryu, Gyu Ha; Park, Woong-Yang; Park, Donghyun
2016-01-01
Targeted capture massively parallel sequencing is increasingly being used in clinical settings, and as costs continue to decline, use of this technology may become routine in health care. However, a limited amount of tissue has often been a challenge in meeting quality requirements. To offer a practical guideline for the minimum amount of input DNA for targeted sequencing, we optimized and evaluated the performance of targeted sequencing depending on the input DNA amount. First, using various amounts of input DNA, we compared commercially available library construction kits and selected Agilent’s SureSelect-XT and KAPA Biosystems’ Hyper Prep kits as the kits most compatible with targeted deep sequencing using Agilent’s SureSelect custom capture. Then, we optimized the adapter ligation conditions of the Hyper Prep kit to improve library construction efficiency and adapted multiplexed hybrid selection to reduce the cost of sequencing. In this study, we systematically evaluated the performance of the optimized protocol depending on the amount of input DNA, ranging from 6.25 to 200 ng, suggesting the minimal input DNA amounts based on coverage depths required for specific applications. PMID:27220682
Cooper, David N.; Bacolla, Albino; Férec, Claude; Vasquez, Karen M.; Kehrer-Sawatzki, Hildegard; Chen, Jian-Min
2011-01-01
Different types of human gene mutation may vary in size, from structural variants (SVs) to single base-pair substitutions, but what they all have in common is that their nature, size and location are often determined either by specific characteristics of the local DNA sequence environment or by higher-order features of the genomic architecture. The human genome is now recognized to contain ‘pervasive architectural flaws’ in that certain DNA sequences are inherently mutation-prone by virtue of their base composition, sequence repetitivity and/or epigenetic modification. Here we explore how the nature, location and frequency of different types of mutation causing inherited disease are shaped in large part, and often in remarkably predictable ways, by the local DNA sequence environment. The mutability of a given gene or genomic region may also be influenced indirectly by a variety of non-canonical (non-B) secondary structures whose formation is facilitated by the underlying DNA sequence. Since these non-B DNA structures can interfere with subsequent DNA replication and repair, and may serve to increase mutation frequencies in generalized fashion (i.e. both in the context of subtle mutations and SVs), they have the potential to serve as a unifying concept in studies of mutational mechanisms underlying human inherited disease. PMID:21853507
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ton, H.; Yeung, E.S.
1997-02-15
An integrated on-line prototype for coupling a microreactor to capillary electrophoresis for DNA sequencing has been demonstrated. A dye-labeled terminator cycle-sequencing reaction is performed in a fused-silica capillary. Subsequently, the sequencing ladder is directly injected into a size-exclusion chromatographic column operated at nearly 95{degree}C for purification. On-line injection to a capillary for electrophoresis is accomplished at a junction set at nearly 70{degree}C. High temperature at the purification column and injection junction prevents the renaturation of DNA fragments during on-line transfer without affecting the separation. The high solubility of DNA in and the relatively low ionic strength of 1 x TEmore » buffer permit both effective purification and electrokinetic injection of the DNA sample. The system is compatible with highly efficient separations by a replaceable poly(ethylene oxide) polymer solution in uncoated capillary tubes. Future automation and adaptation to a multiple-capillary array system should allow high-speed, high-throughput DNA sequencing from templates to called bases in one step. 32 refs., 5 figs.« less
Sakthivelkumar, S; Ramaraj, P; Veeramani, V; Janarthanan, S
2015-09-01
The basis of the present study was to distinguish the existence of any genetic variability among populations of Culex quinquefasciatus which would be a valuable tool in the management of mosquito control programmes. In the present study, population of Cx. quinquefasciatus collected at different locations in Tamil Nadu were analyzed for their genetic variation based on 28S rDNA D2 region nucleotide sequences. A high degree of genetic polymorphism was detected in the sequences of D2 region of 28S rDNA on the predicted secondary structures in spite of high nucleotide sequence similarity. The findings based on secondary structure using rDNA sequences suggested the existence of a complex genotypic diversity of Cx. quinquefasciatus population collected at different locations of Tamil Nadu, India. This complexity in genetic diversity in a single mosquito population collected at different locations is considered an important issue towards their influence and nature of vector potential of these mosquitoes.
Zhao, Ya-E; Xu, Ji-Ru; Hu, Li; Wu, Li-Ping; Wang, Zheng-Hang
2012-05-01
The study for the first time attempted to accomplish 18S ribosomal DNA (rDNA) complete sequence amplification and analysis for three Demodex species (Demodex folliculorum, Demodex brevis and Demodex canis) based on gDNA extraction from individual mites. The mites were treated by DNA Release Additive and Hot Start II DNA Polymerase so as to promote mite disruption and increase PCR specificity. Determination of D. folliculorum gDNA showed that the gDNA yield reached the highest at 1 mite, tending to descend with the increase of mite number. The individual mite gDNA was successfully used for 18S rDNA fragment (about 900 bp) amplification examination. The alignments of 18S rDNA complete sequences of individual mite samples and those of pooled mite samples ( ≥ 1000mites/sample) showed over 97% identities for each species, indicating that the gDNA extracted from a single individual mite was as satisfactory as that from pooled mites for PCR amplification. Further pairwise sequence analyses showed that average divergence, genetic distance, transition/transversion or phylogenetic tree could not effectively identify the three Demodex species, largely due to the differentiation in the D. canis isolates. It can be concluded that the individual Demodex mite gDNA can satisfy the molecular study of Demodex. 18S rDNA complete sequence is suitable for interfamily identification in Cheyletoidea, but whether it is suitable for intrafamily identification cannot be confirmed until the ascertainment of the types of Demodex mites parasitizing in dogs. Copyright © 2012 Elsevier Inc. All rights reserved.
Effects of sequence on DNA wrapping around histones
NASA Astrophysics Data System (ADS)
Ortiz, Vanessa
2011-03-01
A central question in biophysics is whether the sequence of a DNA strand affects its mechanical properties. In epigenetics, these are thought to influence nucleosome positioning and gene expression. Theoretical and experimental attempts to answer this question have been hindered by an inability to directly resolve DNA structure and dynamics at the base-pair level. In our previous studies we used a detailed model of DNA to measure the effects of sequence on the stability of naked DNA under bending. Sequence was shown to influence DNA's ability to form kinks, which arise when certain motifs slide past others to form non-native contacts. Here, we have now included histone-DNA interactions to see if the results obtained for naked DNA are transferable to the problem of nucleosome positioning. Different DNA sequences interacting with the histone protein complex are studied, and their equilibrium and mechanical properties are compared among themselves and with the naked case. NLM training grant to the Computation and Informatics in Biology and Medicine Training Program (NLM T15LM007359).
Zhu, X Q; Gasser, R B
1998-06-01
In this study, we assessed single-strand conformation polymorphism (SSCP)-based approaches for their capacity to fingerprint sequence variation in ribosomal DNA (rDNA) of ascaridoid nematodes of veterinary and/or human health significance. The second internal transcribed spacer region (ITS-2) of rDNA was utilised as the target region because it is known to provide species-specific markers for this group of parasites. ITS-2 was amplified by PCR from genomic DNA derived from individual parasites and subjected to analysis. Direct SSCP analysis of amplicons from seven taxa (Toxocara vitulorum, Toxocara cati, Toxocara canis, Toxascaris leonina, Baylisascaris procyonis, Ascaris suum and Parascaris equorum) showed that the single-strand (ss) ITS-2 patterns produced allowed their unequivocal identification to species. While no variation in SSCP patterns was detected in the ITS-2 within four species for which multiple samples were available, the method allowed the direct display of four distinct sequence types of ITS-2 among individual worms of T. cati. Comparison of SSCP/sequencing with the methods of dideoxy fingerprinting (ddF) and restriction endonuclease fingerprinting (REF) revealed that also ddF allowed the definition of the four sequence types, whereas REF displayed three of four. The findings indicate the usefulness of the SSCP-based approaches for the identification of ascaridoid nematodes to species, the direct display of sequence variation in rDNA and the detection of population variation. The ability to fingerprint microheterogeneity in ITS-2 rDNA using such approaches also has implications for studying fundamental aspects relating to mutational change in rDNA.
Swenson, Luke C; Moores, Andrew; Low, Andrew J; Thielen, Alexander; Dong, Winnie; Woods, Conan; Jensen, Mark A; Wynhoven, Brian; Chan, Dennison; Glascock, Christopher; Harrigan, P Richard
2010-08-01
Tropism testing should rule out CXCR4-using HIV before treatment with CCR5 antagonists. Currently, the recombinant phenotypic Trofile assay (Monogram) is most widely utilized; however, genotypic tests may represent alternative methods. Independent triplicate amplifications of the HIV gp120 V3 region were made from either plasma HIV RNA or proviral DNA. These underwent standard, population-based sequencing with an ABI3730 (RNA n = 63; DNA n = 40), or "deep" sequencing with a Roche/454 Genome Sequencer-FLX (RNA n = 12; DNA n = 12). Position-specific scoring matrices (PSSMX4/R5) (-6.96 cutoff) and geno2pheno[coreceptor] (5% false-positive rate) inferred tropism from V3 sequence. These methods were then independently validated with a separate, blinded dataset (n = 278) of screening samples from the maraviroc MOTIVATE trials. Standard sequencing of HIV RNA with PSSM yielded 69% sensitivity and 91% specificity, relative to Trofile. The validation dataset gave 75% sensitivity and 83% specificity. Proviral DNA plus PSSM gave 77% sensitivity and 71% specificity. "Deep" sequencing of HIV RNA detected >2% inferred-CXCR4-using virus in 8/8 samples called non-R5 by Trofile, and <2% in 4/4 samples called R5. Triplicate analyses of V3 standard sequence data detect greater proportions of CXCR4-using samples than previously achieved. Sequencing proviral DNA and "deep" V3 sequencing may also be useful tools for assessing tropism.
NASA Technical Reports Server (NTRS)
Nakayama, S.; Kretsinger, R. H.
1993-01-01
In the first report in this series we presented dendrograms based on 152 individual proteins of the EF-hand family. In the second we used sequences from 228 proteins, containing 835 domains, and showed that eight of the 29 subfamilies are congruent and that the EF-hand domains of the remaining 21 subfamilies have diverse evolutionary histories. In this study we have computed dendrograms within and among the EF-hand subfamilies using the encoding DNA sequences. In most instances the dendrograms based on protein and on DNA sequences are very similar. Significant differences between protein and DNA trees for calmodulin remain unexplained. In our fourth report we evaluate the sequences and the distribution of introns within the EF-hand family and conclude that exon shuffling did not play a significant role in its evolution.
Automated one-step DNA sequencing based on nanoliter reaction volumes and capillary electrophoresis.
Pang, H M; Yeung, E S
2000-08-01
An integrated system with a nano-reactor for cycle-sequencing reaction coupled to on-line purification and capillary gel electrophoresis has been demonstrated. Fifty nanoliters of reagent solution, which includes dye-labeled terminators, polymerase, BSA and template, was aspirated and mixed with the template inside the nano-reactor followed by cycle-sequencing reaction. The reaction products were then purified by a size-exclusion chromatographic column operated at 50 degrees C followed by room temperature on-line injection of the DNA fragments into a capillary for gel electrophoresis. Over 450 bases of DNA can be separated and identified. As little as 25 nl reagent solution can be used for the cycle-sequencing reaction with a slightly shorter read length. Significant savings on reagent cost is achieved because the remaining stock solution can be reused without contamination. The steps of cycle sequencing, on-line purification, injection, DNA separation, capillary regeneration, gel-filling and fluidic manipulation were performed with complete automation. This system can be readily multiplexed for high-throughput DNA sequencing or PCR analysis directly from templates or even biological materials.
2014-01-01
Background Next-generation DNA sequencing (NGS) technologies have made huge impacts in many fields of biological research, but especially in evolutionary biology. One area where NGS has shown potential is for high-throughput sequencing of complete mtDNA genomes (of humans and other animals). Despite the increasing use of NGS technologies and a better appreciation of their importance in answering biological questions, there remain significant obstacles to the successful implementation of NGS-based projects, especially for new users. Results Here we present an ‘A to Z’ protocol for obtaining complete human mitochondrial (mtDNA) genomes – from DNA extraction to consensus sequence. Although designed for use on humans, this protocol could also be used to sequence small, organellar genomes from other species, and also nuclear loci. This protocol includes DNA extraction, PCR amplification, fragmentation of PCR products, barcoding of fragments, sequencing using the 454 GS FLX platform, and a complete bioinformatics pipeline (primer removal, reference-based mapping, output of coverage plots and SNP calling). Conclusions All steps in this protocol are designed to be straightforward to implement, especially for researchers who are undertaking next-generation sequencing for the first time. The molecular steps are scalable to large numbers (hundreds) of individuals and all steps post-DNA extraction can be carried out in 96-well plate format. Also, the protocol has been assembled so that individual ‘modules’ can be swapped out to suit available resources. PMID:24460871
Cloning, sequencing, and expression of cDNA for human. beta. -glucuronidase
DOE Office of Scientific and Technical Information (OSTI.GOV)
Oshima, A.; Kyle, J.W.; Miller, R.D.
1987-02-01
The authors report here the cDNA sequence for human placental ..beta..-glucuronidase (..beta..-D-glucuronoside glucuronosohydrolase, EC 3.2.1.31) and demonstrate expression of the human enzyme in transfected COS cells. They also sequenced a partial cDNA clone from human fibroblasts that contained a 153-base-pair deletion within the coding sequence and found a second type of cDNA clone from placenta that contained the same deletion. Nuclease S1 mapping studies demonstrated two types of mRNAs in human placenta that corresponded to the two types of cDNA clones isolated. The NH/sub 2/-terminal amino acid sequence determined for human spleen ..beta..-glucuronidase agreed with that inferred from the DNAmore » sequence of the two placental clones, beginning at amino acid 23, suggesting a cleaved signal sequence of 22 amino acids. When transfected into COS cells, plasmids containing either placental clone expressed an immunoprecipitable protein that contained N-linked oligosaccharides as evidenced by sensitivity to endoglycosidase F. However, only transfection with the clone containing the 153-base-pair segment led to expression of human ..beta..-glucuronidase activity. These studies provide the sequence for the full-length cDNA for human ..beta..-glucuronidase, demonstrate the existence of two populations of mRNA for ..beta..-glucuronidase in human placenta, only one of which specifies a catalytically active enzyme, and illustrate the importance of expression studies in verifying that a cDNA is functionally full-length.« less
Ned B. Klopfenstein; John W. Hanna; Amy L. Ross-Davis; Jane E. Stewart; Yuko Ota; Rosario Medel-Ortiz; Miguel Armando Lopez-Ramirez; Ruben Damian Elias-Roman; Dionicio Alvarado-Rosales; Mee-Sook Kim
2013-01-01
Armillaria plays diverse ecological roles in forests worldwide, which has inspired interest in understanding phylogenetic relationships within and among species of this genus. Previous rDNA sequence-based phylogenetic analyses of Armillaria have shown general relationships among widely divergent taxa, but rDNA sequences were not reliable for separating closely related...
Fantin, Yuri S.; Neverov, Alexey D.; Favorov, Alexander V.; Alvarez-Figueroa, Maria V.; Braslavskaya, Svetlana I.; Gordukova, Maria A.; Karandashova, Inga V.; Kuleshov, Konstantin V.; Myznikova, Anna I.; Polishchuk, Maya S.; Reshetov, Denis A.; Voiciehovskaya, Yana A.; Mironov, Andrei A.; Chulanov, Vladimir P.
2013-01-01
Sanger sequencing is a common method of reading DNA sequences. It is less expensive than high-throughput methods, and it is appropriate for numerous applications including molecular diagnostics. However, sequencing mixtures of similar DNA of pathogens with this method is challenging. This is important because most clinical samples contain such mixtures, rather than pure single strains. The traditional solution is to sequence selected clones of PCR products, a complicated, time-consuming, and expensive procedure. Here, we propose the base-calling with vocabulary (BCV) method that computationally deciphers Sanger chromatograms obtained from mixed DNA samples. The inputs to the BCV algorithm are a chromatogram and a dictionary of sequences that are similar to those we expect to obtain. We apply the base-calling function on a test dataset of chromatograms without ambiguous positions, as well as one with 3–14% sequence degeneracy. Furthermore, we use BCV to assemble a consensus sequence for an HIV genome fragment in a sample containing a mixture of viral DNA variants and to determine the positions of the indels. Finally, we detect drug-resistant Mycobacterium tuberculosis strains carrying frameshift mutations mixed with wild-type bacteria in the pncA gene, and roughly characterize bacterial communities in clinical samples by direct 16S rRNA sequencing. PMID:23382983
Statistical and linguistic features of DNA sequences
NASA Technical Reports Server (NTRS)
Havlin, S.; Buldyrev, S. V.; Goldberger, A. L.; Mantegna, R. N.; Peng, C. K.; Simons, M.; Stanley, H. E.
1995-01-01
We present evidence supporting the idea that the DNA sequence in genes containing noncoding regions is correlated, and that the correlation is remarkably long range--indeed, base pairs thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the gene. We resolve the problem of the "non-stationary" feature of the sequence of base pairs by applying a new algorithm called Detrended Fluctuation Analysis (DFA). We address the claim of Voss that there is no difference in the statistical properties of coding and noncoding regions of DNA by systematically applying the DFA algorithm, as well as standard FFT analysis, to all eukaryotic DNA sequences (33 301 coding and 29 453 noncoding) in the entire GenBank database. We describe a simple model to account for the presence of long-range power-law correlations which is based upon a generalization of the classic Levy walk. Finally, we describe briefly some recent work showing that the noncoding sequences have certain statistical features in common with natural languages. Specifically, we adapt to DNA the Zipf approach to analyzing linguistic texts, and the Shannon approach to quantifying the "redundancy" of a linguistic text in terms of a measurable entropy function. We suggest that noncoding regions in plants and invertebrates may display a smaller entropy and larger redundancy than coding regions, further supporting the possibility that noncoding regions of DNA may carry biological information.
Nanopore-CMOS Interfaces for DNA Sequencing
Magierowski, Sebastian; Huang, Yiyun; Wang, Chengjie; Ghafar-Zadeh, Ebrahim
2016-01-01
DNA sequencers based on nanopore sensors present an opportunity for a significant break from the template-based incumbents of the last forty years. Key advantages ushered by nanopore technology include a simplified chemistry and the ability to interface to CMOS technology. The latter opportunity offers substantial promise for improvement in sequencing speed, size and cost. This paper reviews existing and emerging means of interfacing nanopores to CMOS technology with an emphasis on massively-arrayed structures. It presents this in the context of incumbent DNA sequencing techniques, reviews and quantifies nanopore characteristics and models and presents CMOS circuit methods for the amplification of low-current nanopore signals in such interfaces. PMID:27509529
Nanopore-CMOS Interfaces for DNA Sequencing.
Magierowski, Sebastian; Huang, Yiyun; Wang, Chengjie; Ghafar-Zadeh, Ebrahim
2016-08-06
DNA sequencers based on nanopore sensors present an opportunity for a significant break from the template-based incumbents of the last forty years. Key advantages ushered by nanopore technology include a simplified chemistry and the ability to interface to CMOS technology. The latter opportunity offers substantial promise for improvement in sequencing speed, size and cost. This paper reviews existing and emerging means of interfacing nanopores to CMOS technology with an emphasis on massively-arrayed structures. It presents this in the context of incumbent DNA sequencing techniques, reviews and quantifies nanopore characteristics and models and presents CMOS circuit methods for the amplification of low-current nanopore signals in such interfaces.
Flow cytometry for enrichment and titration in massively parallel DNA sequencing
Sandberg, Julia; Ståhl, Patrik L.; Ahmadian, Afshin; Bjursell, Magnus K.; Lundeberg, Joakim
2009-01-01
Massively parallel DNA sequencing is revolutionizing genomics research throughout the life sciences. However, the reagent costs and labor requirements in current sequencing protocols are still substantial, although improvements are continuously being made. Here, we demonstrate an effective alternative to existing sample titration protocols for the Roche/454 system using Fluorescence Activated Cell Sorting (FACS) technology to determine the optimal DNA-to-bead ratio prior to large-scale sequencing. Our method, which eliminates the need for the costly pilot sequencing of samples during titration is capable of rapidly providing accurate DNA-to-bead ratios that are not biased by the quantification and sedimentation steps included in current protocols. Moreover, we demonstrate that FACS sorting can be readily used to highly enrich fractions of beads carrying template DNA, with near total elimination of empty beads and no downstream sacrifice of DNA sequencing quality. Automated enrichment by FACS is a simple approach to obtain pure samples for bead-based sequencing systems, and offers an efficient, low-cost alternative to current enrichment protocols. PMID:19304748
BiRen: predicting enhancers with a deep-learning-based model using the DNA sequence alone.
Yang, Bite; Liu, Feng; Ren, Chao; Ouyang, Zhangyi; Xie, Ziwei; Bo, Xiaochen; Shu, Wenjie
2017-07-01
Enhancer elements are noncoding stretches of DNA that play key roles in controlling gene expression programmes. Despite major efforts to develop accurate enhancer prediction methods, identifying enhancer sequences continues to be a challenge in the annotation of mammalian genomes. One of the major issues is the lack of large, sufficiently comprehensive and experimentally validated enhancers for humans or other species. Thus, the development of computational methods based on limited experimentally validated enhancers and deciphering the transcriptional regulatory code encoded in the enhancer sequences is urgent. We present a deep-learning-based hybrid architecture, BiRen, which predicts enhancers using the DNA sequence alone. Our results demonstrate that BiRen can learn common enhancer patterns directly from the DNA sequence and exhibits superior accuracy, robustness and generalizability in enhancer prediction relative to other state-of-the-art enhancer predictors based on sequence characteristics. Our BiRen will enable researchers to acquire a deeper understanding of the regulatory code of enhancer sequences. Our BiRen method can be freely accessed at https://github.com/wenjiegroup/BiRen . shuwj@bmi.ac.cn or boxc@bmi.ac.cn. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
DNA barcoding Indian freshwater fishes.
Lakra, Wazir Singh; Singh, M; Goswami, Mukunda; Gopalakrishnan, A; Lal, K K; Mohindra, V; Sarkar, U K; Punia, P P; Singh, K V; Bhatt, J P; Ayyappan, S
2016-11-01
DNA barcoding is a promising technique for species identification using a short mitochondrial DNA sequence of cytochrome c oxidase I (COI) gene. In the present study, DNA barcodes were generated from 72 species of freshwater fish covering the Orders Cypriniformes, Siluriformes, Perciformes, Synbranchiformes, and Osteoglossiformes representing 50 genera and 19 families. All the samples were collected from diverse sites except the species endemic to a particular location. Species were represented by multiple specimens in the great majority of the barcoded species. A total of 284 COI sequences were generated. After amplification and sequencing of 700 base pair fragment of COI, primers were trimmed which invariably generated a 655 base pair barcode sequence. The average Kimura two-parameter (K2P) distances within-species, genera, families, and orders were 0.40%, 9.60%, 13.10%, and 17.16%, respectively. DNA barcode discriminated congeneric species without any confusion. The study strongly validated the efficiency of COI as an ideal marker for DNA barcoding of Indian freshwater fishes.
Mutation detection using automated fluorescence-based sequencing.
Montgomery, Kate T; Iartchouck, Oleg; Li, Li; Perera, Anoja; Yassin, Yosuf; Tamburino, Alex; Loomis, Stephanie; Kucherlapati, Raju
2008-04-01
The development of high-throughput DNA sequencing techniques has made direct DNA sequencing of PCR-amplified genomic DNA a rapid and economical approach to the identification of polymorphisms that may play a role in disease. Point mutations as well as small insertions or deletions are readily identified by DNA sequencing. The mutations may be heterozygous (occurring in one allele while the other allele retains the normal sequence) or homozygous (occurring in both alleles). Sequencing alone cannot discriminate between true homozygosity and apparent homozygosity due to the loss of one allele due to a large deletion. In this unit, strategies are presented for using PCR amplification and automated fluorescence-based sequencing to identify sequence variation. The size of the project and laboratory preference and experience will dictate how the data is managed and which software tools are used for analysis. A high-throughput protocol is given that has been used to search for mutations in over 200 different genes at the Harvard Medical School - Partners Center for Genetics and Genomics (HPCGG, http://www.hpcgg.org/). Copyright 2008 by John Wiley & Sons, Inc.
Ahmed, Saami; Kaushik, Mahima; Chaudhary, Swati; Kukreti, Shrikant
2018-05-01
Sequence recognition and conformational polymorphism enable DNA to emerge out as a substantial tool in fabricating the devices within nano-dimensions. These DNA associated nano devices work on the principle of conformational switches, which can be facilitated by many factors like sequence of DNA/RNA strand, change in pH or temperature, enzyme or ligand interactions etc. Thus, controlling these DNA conformational changes to acquire the desired function is significant for evolving DNA hybridization biosensor, used in genetic screening and molecular diagnosis. For exploring this conformational switching ability of cytosine-rich DNA oligonucleotides as a function of pH for their potential usage as biosensors, this study has been designed. A C-rich stretch of DNA sequence (5'-TCCCCCAATTAATTCCCCCA-3'; SG20c) has been investigated using UV-Thermal denaturation, poly-acrylamide gel electrophoresis and CD spectroscopy. The SG20c sequence is shown to adopt various topologies of i-motif structure at low pH. This pH dependent transition of SG20c from unstructured single strand to unimolecular and bimolecular i-motif structures can further be exploited for its utilization as switching on/off pH-based biosensors. Copyright © 2018. Published by Elsevier B.V.
Molecular architecture of classical cytological landmarks: Centromeres and telomeres
DOE Office of Scientific and Technical Information (OSTI.GOV)
Meyne, J.
1994-11-01
Both the human telomere repeat and the pericentromeric repeat sequence (GGAAT)n were isolated based on evolutionary conservation. Their isolation was based on the premise that chromosomal features as structurally and functionally important as telomeres and centromeres should be highly conserved. Both sequences were isolated by high stringency screening of a human repetitive DNA library with rodent repetitive DNA. The pHuR library (plasmid Human Repeat) used for this project was enriched for repetitive DNA by using a modification of the standard DNA library preparation method. Usually DNA for a library is cut with restriction enzymes, packaged, infected, and the library ismore » screened. A problem with this approach is that many tandem repeats don`t have any (or many) common restriction sites. Therefore, many of the repeat sequences will not be represented in the library because they are not restricted to a viable length for the vector used. To prepare the pHuR library, human DNA was mechanically sheared to a small size. These relatively short DNA fragments were denatured and then renatured to C{sub o}t 50. Theoretically only repetitive DNA sequences should renature under C{sub o}t 50 conditions. The single-stranded regions were digested using S1 nuclease, leaving the double-stranded, renatured repeat sequences.« less
[Whole Genome Sequencing of Human mtDNA Based on Ion Torrent PGM™ Platform].
Cao, Y; Zou, K N; Huang, J P; Ma, K; Ping, Y
2017-08-01
To analyze and detect the whole genome sequence of human mitochondrial DNA (mtDNA) by Ion Torrent PGM™ platform and to study the differences of mtDNA sequence in different tissues. Samples were collected from 6 unrelated individuals by forensic postmortem examination, including chest blood, hair, costicartilage, nail, skeletal muscle and oral epithelium. Amplification of whole genome sequence of mtDNA was performed by 4 pairs of primer. Libraries were constructed with Ion Shear™ Plus Reagents kit and Ion Plus Fragment Library kit. Whole genome sequencing of mtDNA was performed using Ion Torrent PGM™ platform. Sanger sequencing was used to determine the heteroplasmy positions and the mutation positions on HVⅠ region. The whole genome sequence of mtDNA from all samples were amplified successfully. Six unrelated individuals belonged to 6 different haplotypes. Different tissues in one individual had heteroplasmy difference. The heteroplasmy positions and the mutation positions on HVⅠ region were verified by Sanger sequencing. After a consistency check by the Kappa method, it was found that the results of mtDNA sequence had a high consistency in different tissues. The testing method used in present study for sequencing the whole genome sequence of human mtDNA can detect the heteroplasmy difference in different tissues, which have good consistency. The results provide guidance for the further applications of mtDNA in forensic science. Copyright© by the Editorial Department of Journal of Forensic Medicine
Improvements on a privacy-protection algorithm for DNA sequences with generalization lattices.
Li, Guang; Wang, Yadong; Su, Xiaohong
2012-10-01
When developing personal DNA databases, there must be an appropriate guarantee of anonymity, which means that the data cannot be related back to individuals. DNA lattice anonymization (DNALA) is a successful method for making personal DNA sequences anonymous. However, it uses time-consuming multiple sequence alignment and a low-accuracy greedy clustering algorithm. Furthermore, DNALA is not an online algorithm, and so it cannot quickly return results when the database is updated. This study improves the DNALA method. Specifically, we replaced the multiple sequence alignment in DNALA with global pairwise sequence alignment to save time, and we designed a hybrid clustering algorithm comprised of a maximum weight matching (MWM)-based algorithm and an online algorithm. The MWM-based algorithm is more accurate than the greedy algorithm in DNALA and has the same time complexity. The online algorithm can process data quickly when the database is updated. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.
Tan, Cheng; Takada, Shoji
2017-01-01
While nucleosome positioning on eukaryotic genome play important roles for genetic regulation, molecular mechanisms of nucleosome positioning and sliding along DNA are not well understood. Here we investigated thermally-activated spontaneous nucleosome sliding mechanisms developing and applying a coarse-grained molecular simulation method that incorporates both long-range electrostatic and short-range hydrogen-bond interactions between histone octamer and DNA. The simulations revealed two distinct sliding modes depending on the nucleosomal DNA sequence. A uniform DNA sequence showed frequent sliding with one base pair step in a rotation-coupled manner, akin to screw-like motions. On the contrary, a strong positioning sequence, the so-called 601 sequence, exhibits rare, abrupt transitions of five and ten base pair steps without rotation. Moreover, we evaluated the importance of hydrogen bond interactions on the sliding mode, finding that strong and weak bonds favor respectively the rotation-coupled and -uncoupled sliding movements. PMID:29194442
NASA Astrophysics Data System (ADS)
Moreland, Blythe; Oman, Kenji; Curfman, John; Yan, Pearlly; Bundschuh, Ralf
Methyl-binding domain (MBD) protein pulldown experiments have been a valuable tool in measuring the levels of methylated CpG dinucleotides. Due to the frequent use of this technique, high-throughput sequencing data sets are available that allow a detailed quantitative characterization of the underlying interaction between methylated DNA and MBD proteins. Analyzing such data sets, we first found that two such proteins cannot bind closer to each other than 2 bp, consistent with structural models of the DNA-protein interaction. Second, the large amount of sequencing data allowed us to find rather weak but nevertheless clearly statistically significant sequence preferences for several bases around the required CpG. These results demonstrate that pulldown sequencing is a high-precision tool in characterizing DNA-protein interactions. This material is based upon work supported by the National Science Foundation under Grant No. DMR-1410172.
Functional interrogation of non-coding DNA through CRISPR genome editing
Canver, Matthew C.; Bauer, Daniel E.; Orkin, Stuart H.
2017-01-01
Methodologies to interrogate non-coding regions have lagged behind coding regions despite comprising the vast majority of the genome. However, the rapid evolution of clustered regularly interspaced short palindromic repeats (CRISPR)-based genome editing has provided a multitude of novel techniques for laboratory investigation including significant contributions to the toolbox for studying non-coding DNA. CRISPR-mediated loss-of-function strategies rely on direct disruption of the underlying sequence or repression of transcription without modifying the targeted DNA sequence. CRISPR-mediated gain-of-function approaches similarly benefit from methods to alter the targeted sequence through integration of customized sequence into the genome as well as methods to activate transcription. Here we review CRISPR-based loss- and gain-of-function techniques for the interrogation of non-coding DNA. PMID:28288828
Gopal, J; Yebra, M J; Bhagwat, A S
1994-01-01
The methyltransferase (MTase) in the DsaV restriction--modification system methylates within 5'-CCNGG sequences. We have cloned the gene for this MTase and determined its sequence. The predicted sequence of the MTase protein contains sequence motifs conserved among all cytosine-5 MTases and is most similar to other MTases that methylate CCNGG sequences, namely M.ScrFI and M.SsoII. All three MTases methylate the internal cytosine within their recognition sequence. The 'variable' region within the three enzymes that methylate CCNGG can be aligned with the sequences of two enzymes that methylate CCWGG sequences. Remarkably, two segments within this region contain significant similarity with the region of M.HhaI that is known to contact DNA bases. These alignments suggest that many cytosine-5 MTases are likely to interact with DNA using a similar structural framework. Images PMID:7971279
Raman-based system for DNA sequencing-mapping and other separations
Vo-Dinh, T.
1994-04-26
DNA sequencing and mapping are performed by using a Raman spectrometer with a surface enhanced Raman scattering (SERS) substrate to enhance the Raman signal. A SERS label is attached to a DNA fragment and then analyzed with the Raman spectrometer to identify the DNA fragment according to characteristics of the Raman spectrum generated. 11 figures.
Ancient DNA analysis reveals woolly rhino evolutionary relationships.
Orlando, Ludovic; Leonard, Jennifer A; Thenot, Aurélie; Laudet, Vincent; Guerin, Claude; Hänni, Catherine
2003-09-01
With ancient DNA technology, DNA sequences have been added to the list of characters available to infer the phyletic position of extinct species in evolutionary trees. We have sequenced the entire 12S rRNA and partial cytochrome b (cyt b) genes of one 60-70,000-year-old sample, and partial 12S rRNA and cyt b sequences of two 40-45,000-year-old samples of the extinct woolly rhinoceros (Coelodonta antiquitatis). Based on these two mitochondrial markers, phylogenetic analyses show that C. antiquitatis is most closely related to one of the three extant Asian rhinoceros species, Dicerorhinus sumatrensis. Calculations based on a molecular clock suggest that the lineage leading to C. antiquitatis and D. sumatrensis diverged in the Oligocene, 21-26 MYA. Both results agree with morphological models deduced from palaeontological data. Nuclear inserts of mitochondrial DNA were identified in the ancient specimens. These data should encourage the use of nuclear DNA in future ancient DNA studies. It also further establishes that the degraded nature of ancient DNA does not completely protect ancient DNA studies based on mitochondrial data from the problems associated with nuclear inserts.
NGS-based likelihood ratio for identifying contributors in two- and three-person DNA mixtures.
Chan Mun Wei, Joshua; Zhao, Zicheng; Li, Shuai Cheng; Ng, Yen Kaow
2018-06-01
DNA fingerprinting, also known as DNA profiling, serves as a standard procedure in forensics to identify a person by the short tandem repeat (STR) loci in their DNA. By comparing the STR loci between DNA samples, practitioners can calculate a probability of match to identity the contributors of a DNA mixture. Most existing methods are based on 13 core STR loci which were identified by the Federal Bureau of Investigation (FBI). Analyses based on these loci of DNA mixture for forensic purposes are highly variable in procedures, and suffer from subjectivity as well as bias in complex mixture interpretation. With the emergence of next-generation sequencing (NGS) technologies, the sequencing of billions of DNA molecules can be parallelized, thus greatly increasing throughput and reducing the associated costs. This allows the creation of new techniques that incorporate more loci to enable complex mixture interpretation. In this paper, we propose a computation for likelihood ratio that uses NGS (next generation sequencing) data for DNA testing on mixed samples. We have applied the method to 4480 simulated DNA mixtures, which consist of various mixture proportions of 8 unrelated whole-genome sequencing data. The results confirm the feasibility of utilizing NGS data in DNA mixture interpretations. We observed an average likelihood ratio as high as 285,978 for two-person mixtures. Using our method, all 224 identity tests for two-person mixtures and three-person mixtures were correctly identified. Copyright © 2018 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Peng, Jun; Ling, Jian; Zhang, Xiu-Qing; Bai, Hui-Ping; Zheng, Liyan; Cao, Qiu-E.; Ding, Zhong-Tao
2015-02-01
In this work, we designed a new fluorescent oligonucleotides-stabilized silver nanoclusters (DNA/AgNCs) probe for sensitive detection of mercury and copper ions. This probe contains two tailored DNA sequence. One is a signal probe contains a cytosine-rich sequence template for AgNCs synthesis and link sequence at both ends. The other is a guanine-rich sequence for signal enhancement and link sequence complementary to the link sequence of the signal probe. After hybridization, the fluorescence of hybridized double-strand DNA/AgNCs is 200-fold enhanced based on the fluorescence enhancement effect of DNA/AgNCs in proximity of guanine-rich DNA sequence. The double-strand DNA/AgNCs probe is brighter and stable than that of single-strand DNA/AgNCs, and more importantly, can be used as novel fluorescent probes for detecting mercury and copper ions. Mercury and copper ions in the range of 6.0-160.0 and 6-240 nM, can be linearly detected with the detection limits of 2.1 and 3.4 nM, respectively. Our results indicated that the analytical parameters of the method for mercury and copper ions detection are much better than which using a single-strand DNA/AgNCs.
Shao, Zhiyong; Graf, Shannon; Chaga, Oleg Y; Lavrov, Dennis V
2006-10-15
The 16,937-nuceotide sequence of the linear mitochondrial DNA (mt-DNA) molecule of the moon jelly Aurelia aurita (Cnidaria, Scyphozoa) - the first mtDNA sequence from the class Scypozoa and the first sequence of a linear mtDNA from Metazoa - has been determined. This sequence contains genes for 13 energy pathway proteins, small and large subunit rRNAs, and methionine and tryptophan tRNAs. In addition, two open reading frames of 324 and 969 base pairs in length have been found. The deduced amino-acid sequence of one of them, ORF969, displays extensive sequence similarity with the polymerase [but not the exonuclease] domain of family B DNA polymerases, and this ORF has been tentatively identified as dnab. This is the first report of dnab in animal mtDNA. The genes in A. aurita mtDNA are arranged in two clusters with opposite transcriptional polarities; transcription proceeding toward the ends of the molecule. The determined sequences at the ends of the molecule are nearly identical but inverted and lack any obvious potential secondary structures or telomere-like repeat elements. The acquisition of mitochondrial genomic data for the second class of Cnidaria allows us to reconstruct characteristic features of mitochondrial evolution in this animal phylum.
TaxI: a software tool for DNA barcoding using distance methods
Steinke, Dirk; Vences, Miguel; Salzburger, Walter; Meyer, Axel
2005-01-01
DNA barcoding is a promising approach to the diagnosis of biological diversity in which DNA sequences serve as the primary key for information retrieval. Most existing software for evolutionary analysis of DNA sequences was designed for phylogenetic analyses and, hence, those algorithms do not offer appropriate solutions for the rapid, but precise analyses needed for DNA barcoding, and are also unable to process the often large comparative datasets. We developed a flexible software tool for DNA taxonomy, named TaxI. This program calculates sequence divergences between a query sequence (taxon to be barcoded) and each sequence of a dataset of reference sequences defined by the user. Because the analysis is based on separate pairwise alignments this software is also able to work with sequences characterized by multiple insertions and deletions that are difficult to align in large sequence sets (i.e. thousands of sequences) by multiple alignment algorithms because of computational restrictions. Here, we demonstrate the utility of this approach with two datasets of fish larvae and juveniles from Lake Constance and juvenile land snails under different models of sequence evolution. Sets of ribosomal 16S rRNA sequences, characterized by multiple indels, performed as good as or better than cox1 sequence sets in assigning sequences to species, demonstrating the suitability of rRNA genes for DNA barcoding. PMID:16214755
Assessing the Fidelity of Ancient DNA Sequences Amplified From Nuclear Genes
Binladen, Jonas; Wiuf, Carsten; Gilbert, M. Thomas P.; Bunce, Michael; Barnett, Ross; Larson, Greger; Greenwood, Alex D.; Haile, James; Ho, Simon Y. W.; Hansen, Anders J.; Willerslev, Eske
2006-01-01
To date, the field of ancient DNA has relied almost exclusively on mitochondrial DNA (mtDNA) sequences. However, a number of recent studies have reported the successful recovery of ancient nuclear DNA (nuDNA) sequences, thereby allowing the characterization of genetic loci directly involved in phenotypic traits of extinct taxa. It is well documented that postmortem damage in ancient mtDNA can lead to the generation of artifactual sequences. However, as yet no one has thoroughly investigated the damage spectrum in ancient nuDNA. By comparing clone sequences from 23 fossil specimens, recovered from environments ranging from permafrost to desert, we demonstrate the presence of miscoding lesion damage in both the mtDNA and nuDNA, resulting in insertion of erroneous bases during amplification. Interestingly, no significant differences in the frequency of miscoding lesion damage are recorded between mtDNA and nuDNA despite great differences in cellular copy numbers. For both mtDNA and nuDNA, we find significant positive correlations between total sequence heterogeneity and the rates of type 1 transitions (adenine → guanine and thymine → cytosine) and type 2 transitions (cytosine → thymine and guanine → adenine), respectively. Type 2 transitions are by far the most dominant and increase relative to those of type 1 with damage load. The results suggest that the deamination of cytosine (and 5-methyl cytosine) to uracil (and thymine) is the main cause of miscoding lesions in both ancient mtDNA and nuDNA sequences. We argue that the problems presented by postmortem damage, as well as problems with contamination from exogenous sources of conserved nuclear genes, allelic variation, and the reliance on single nucleotide polymorphisms, call for great caution in studies relying on ancient nuDNA sequences. PMID:16299392
Directed evolution of the TALE N-terminal domain for recognition of all 5' bases.
Lamb, Brian M; Mercer, Andrew C; Barbas, Carlos F
2013-11-01
Transcription activator-like effector (TALE) proteins can be designed to bind virtually any DNA sequence. General guidelines for design of TALE DNA-binding domains suggest that the 5'-most base of the DNA sequence bound by the TALE (the N0 base) should be a thymine. We quantified the N0 requirement by analysis of the activities of TALE transcription factors (TALE-TF), TALE recombinases (TALE-R) and TALE nucleases (TALENs) with each DNA base at this position. In the absence of a 5' T, we observed decreases in TALE activity up to >1000-fold in TALE-TF activity, up to 100-fold in TALE-R activity and up to 10-fold reduction in TALEN activity compared with target sequences containing a 5' T. To develop TALE architectures that recognize all possible N0 bases, we used structure-guided library design coupled with TALE-R activity selections to evolve novel TALE N-terminal domains to accommodate any N0 base. A G-selective domain and broadly reactive domains were isolated and characterized. The engineered TALE domains selected in the TALE-R format demonstrated modularity and were active in TALE-TF and TALEN architectures. Evolved N-terminal domains provide effective and unconstrained TALE-based targeting of any DNA sequence as TALE binding proteins and designer enzymes.
Unexpected substrate specificity of T4 DNA ligase revealed by in vitro selection
NASA Technical Reports Server (NTRS)
Harada, Kazuo; Orgel, Leslie E.
1993-01-01
We have used in vitro selection techniques to characterize DNA sequences that are ligated efficiently by T4 DNA ligase. We find that the ensemble of selected sequences ligates about 50 times as efficiently as the random mixture of sequences used as the input for selection. Surprisingly many of the selected sequences failed to produce a match at or close to the ligation junction. None of the 20 selected oligomers that we sequenced produced a match two bases upstream from the ligation junction.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sobottka, Marcelo, E-mail: sobottka@mtm.ufsc.br; Hart, Andrew G., E-mail: ahart@dim.uchile.cl
Highlights: {yields} We propose a simple stochastic model to construct primitive DNA sequences. {yields} The model provide an explanation for Chargaff's second parity rule in primitive DNA sequences. {yields} The model is also used to predict a novel type of strand symmetry in primitive DNA sequences. {yields} We extend the results for bacterial DNA sequences and compare distributional properties intrinsic to the model to statistical estimates from 1049 bacterial genomes. {yields} We find out statistical evidences that the novel type of strand symmetry holds for bacterial DNA sequences. -- Abstract: Chargaff's second parity rule for short oligonucleotides states that themore » frequency of any short nucleotide sequence on a strand is approximately equal to the frequency of its reverse complement on the same strand. Recent studies have shown that, with the exception of organellar DNA, this parity rule generally holds for double-stranded DNA genomes and fails to hold for single-stranded genomes. While Chargaff's first parity rule is fully explained by the Watson-Crick pairing in the DNA double helix, a definitive explanation for the second parity rule has not yet been determined. In this work, we propose a model based on a hidden Markov process for approximating the distributional structure of primitive DNA sequences. Then, we use the model to provide another possible theoretical explanation for Chargaff's second parity rule, and to predict novel distributional aspects of bacterial DNA sequences.« less
BATTLE: Biomarker-Based Approaches of Targeted Therapy for Lung Cancer Elimination
2008-04-01
although a grade 3 neutropenia was dose-limiting in one importance. Th th ubstrate of the CYP3A4 isoenzyme and P-gp. Its metabolism is sensitive to...tratification in clinis Molecular Pathway Biomarkers Type of Analysis EGFR EGFR Mutation ( exons 18 to 21) DNA sequencing EGFR Increased Copy Number...polysomy/am 1plification) DNA FISH K-Ras/B-Raf K-RAS Mutation (codons 12,13, 61) DNA sequencing B-RAF Mutations ( exons 11 and 15) DNA sequencing
Dabney, Jesse; Knapp, Michael; Glocke, Isabelle; Gansauge, Marie-Theres; Weihmann, Antje; Nickel, Birgit; Valdiosera, Cristina; García, Nuria; Pääbo, Svante; Arsuaga, Juan-Luis; Meyer, Matthias
2013-09-24
Although an inverse relationship is expected in ancient DNA samples between the number of surviving DNA fragments and their length, ancient DNA sequencing libraries are strikingly deficient in molecules shorter than 40 bp. We find that a loss of short molecules can occur during DNA extraction and present an improved silica-based extraction protocol that enables their efficient retrieval. In combination with single-stranded DNA library preparation, this method enabled us to reconstruct the mitochondrial genome sequence from a Middle Pleistocene cave bear (Ursus deningeri) bone excavated at Sima de los Huesos in the Sierra de Atapuerca, Spain. Phylogenetic reconstructions indicate that the U. deningeri sequence forms an early diverging sister lineage to all Western European Late Pleistocene cave bears. Our results prove that authentic ancient DNA can be preserved for hundreds of thousand years outside of permafrost. Moreover, the techniques presented enable the retrieval of phylogenetically informative sequences from samples in which virtually all DNA is diminished to fragments shorter than 50 bp.
Dabney, Jesse; Knapp, Michael; Glocke, Isabelle; Gansauge, Marie-Theres; Weihmann, Antje; Nickel, Birgit; Valdiosera, Cristina; García, Nuria; Pääbo, Svante; Arsuaga, Juan-Luis; Meyer, Matthias
2013-01-01
Although an inverse relationship is expected in ancient DNA samples between the number of surviving DNA fragments and their length, ancient DNA sequencing libraries are strikingly deficient in molecules shorter than 40 bp. We find that a loss of short molecules can occur during DNA extraction and present an improved silica-based extraction protocol that enables their efficient retrieval. In combination with single-stranded DNA library preparation, this method enabled us to reconstruct the mitochondrial genome sequence from a Middle Pleistocene cave bear (Ursus deningeri) bone excavated at Sima de los Huesos in the Sierra de Atapuerca, Spain. Phylogenetic reconstructions indicate that the U. deningeri sequence forms an early diverging sister lineage to all Western European Late Pleistocene cave bears. Our results prove that authentic ancient DNA can be preserved for hundreds of thousand years outside of permafrost. Moreover, the techniques presented enable the retrieval of phylogenetically informative sequences from samples in which virtually all DNA is diminished to fragments shorter than 50 bp. PMID:24019490
Crystal structure of MboIIA methyltransferase.
Osipiuk, Jerzy; Walsh, Martin A; Joachimiak, Andrzej
2003-09-15
DNA methyltransferases (MTases) are sequence-specific enzymes which transfer a methyl group from S-adenosyl-L-methionine (AdoMet) to the amino group of either cytosine or adenine within a recognized DNA sequence. Methylation of a base in a specific DNA sequence protects DNA from nucleolytic cleavage by restriction enzymes recognizing the same DNA sequence. We have determined at 1.74 A resolution the crystal structure of a beta-class DNA MTase MboIIA (M.MboIIA) from the bacterium Moraxella bovis, the smallest DNA MTase determined to date. M.MboIIA methylates the 3' adenine of the pentanucleotide sequence 5'-GAAGA-3'. The protein crystallizes with two molecules in the asymmetric unit which we propose to resemble the dimer when M.MboIIA is not bound to DNA. The overall structure of the enzyme closely resembles that of M.RsrI. However, the cofactor-binding pocket in M.MboIIA forms a closed structure which is in contrast to the open-form structures of other known MTases.
Wolffe, E J; Gause, W C; Pelfrey, C M; Holland, S M; Steinberg, A D; August, J T
1990-01-05
We describe the isolation and sequencing of a cDNA encoding mouse Pgp-1. An oligonucleotide probe corresponding to the NH2-terminal sequence of the purified protein was synthesized by the polymerase chain reaction and used to screen a mouse macrophage lambda gt11 library. A cDNA clone with an insert of 1.2 kilobases was selected and sequenced. In Northern blot analysis, only cells expressing Pgp-1 contained mRNA species that hybridized with this Pgp-1 cDNA. The nucleotide sequence of the cDNA has a single open reading frame that yields a protein-coding sequence of 1076 base pairs followed by a 132-base pair 3'-untranslated sequence that includes a putative polyadenylation signal but no poly(A) tail. The translated sequence comprises a 13-amino acid signal peptide followed by a polypeptide core of 345 residues corresponding to an Mr of 37,800. Portions of the deduced amino acid sequence were identical to those obtained by amino acid sequence analysis from the purified glycoprotein, confirming that the cDNA encodes Pgp-1. The predicted structure of Pgp-1 includes an NH2-terminal extracellular domain (residues 14-265), a transmembrane domain (residues 266-286), and a cytoplasmic tail (residues 287-358). Portions of the mouse Pgp-1 sequence are highly similar to that of the human CD44 cell surface glycoprotein implicated in cell adhesion. The protein also shows sequence similarity to the proteoglycan tandem repeat sequences found in cartilage link protein and cartilage proteoglycan core protein which are thought to be involved in binding to hyaluronic acid.
Chwialkowska, Karolina; Korotko, Urszula; Kosinska, Joanna; Szarejko, Iwona; Kwasniewski, Miroslaw
2017-01-01
Epigenetic mechanisms, including histone modifications and DNA methylation, mutually regulate chromatin structure, maintain genome integrity, and affect gene expression and transposon mobility. Variations in DNA methylation within plant populations, as well as methylation in response to internal and external factors, are of increasing interest, especially in the crop research field. Methylation Sensitive Amplification Polymorphism (MSAP) is one of the most commonly used methods for assessing DNA methylation changes in plants. This method involves gel-based visualization of PCR fragments from selectively amplified DNA that are cleaved using methylation-sensitive restriction enzymes. In this study, we developed and validated a new method based on the conventional MSAP approach called Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq). We improved the MSAP-based approach by replacing the conventional separation of amplicons on polyacrylamide gels with direct, high-throughput sequencing using Next Generation Sequencing (NGS) and automated data analysis. MSAP-Seq allows for global sequence-based identification of changes in DNA methylation. This technique was validated in Hordeum vulgare . However, MSAP-Seq can be straightforwardly implemented in different plant species, including crops with large, complex and highly repetitive genomes. The incorporation of high-throughput sequencing into MSAP-Seq enables parallel and direct analysis of DNA methylation in hundreds of thousands of sites across the genome. MSAP-Seq provides direct genomic localization of changes and enables quantitative evaluation. We have shown that the MSAP-Seq method specifically targets gene-containing regions and that a single analysis can cover three-quarters of all genes in large genomes. Moreover, MSAP-Seq's simplicity, cost effectiveness, and high-multiplexing capability make this method highly affordable. Therefore, MSAP-Seq can be used for DNA methylation analysis in crop plants with large and complex genomes.
Chwialkowska, Karolina; Korotko, Urszula; Kosinska, Joanna; Szarejko, Iwona; Kwasniewski, Miroslaw
2017-01-01
Epigenetic mechanisms, including histone modifications and DNA methylation, mutually regulate chromatin structure, maintain genome integrity, and affect gene expression and transposon mobility. Variations in DNA methylation within plant populations, as well as methylation in response to internal and external factors, are of increasing interest, especially in the crop research field. Methylation Sensitive Amplification Polymorphism (MSAP) is one of the most commonly used methods for assessing DNA methylation changes in plants. This method involves gel-based visualization of PCR fragments from selectively amplified DNA that are cleaved using methylation-sensitive restriction enzymes. In this study, we developed and validated a new method based on the conventional MSAP approach called Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq). We improved the MSAP-based approach by replacing the conventional separation of amplicons on polyacrylamide gels with direct, high-throughput sequencing using Next Generation Sequencing (NGS) and automated data analysis. MSAP-Seq allows for global sequence-based identification of changes in DNA methylation. This technique was validated in Hordeum vulgare. However, MSAP-Seq can be straightforwardly implemented in different plant species, including crops with large, complex and highly repetitive genomes. The incorporation of high-throughput sequencing into MSAP-Seq enables parallel and direct analysis of DNA methylation in hundreds of thousands of sites across the genome. MSAP-Seq provides direct genomic localization of changes and enables quantitative evaluation. We have shown that the MSAP-Seq method specifically targets gene-containing regions and that a single analysis can cover three-quarters of all genes in large genomes. Moreover, MSAP-Seq's simplicity, cost effectiveness, and high-multiplexing capability make this method highly affordable. Therefore, MSAP-Seq can be used for DNA methylation analysis in crop plants with large and complex genomes. PMID:29250096
2014-01-01
Background Previously, we suggested prototypal models that describe some clinical states based on group postulates. Here, we demonstrate a group/category theory-like model for molecular/genetic biology as an alternative application of our previous model. Specifically, we focus on deoxyribonucleic acid (DNA) base sequences. Results We construct a wallpaper pattern based on a five-letter cruciform motif with letters C, A, T, G, and E. Whereas the first four letters represent the standard DNA bases, the fifth is introduced for ease in formulating group operations that reproduce insertions and deletions of DNA base sequences. A basic group Z5 = {r, u, d, l, n} of operations is defined for the wallpaper pattern, with which a sequence of points can be generated corresponding to changes of a base in a DNA sequence by following the orbit of a point of the pattern under operations in group Z5. Other manipulations of DNA sequence can be treated using a vector-like notation ‘Dj’ corresponding to a DNA sequence but based on the five-letter base set; also, ‘Dj’s are expressed graphically. Insertions and deletions of a series of letters ‘E’ are admitted to assist in describing DNA recombination. Likewise, a vector-like notation Rj can be constructed for sequences of ribonucleic acid (RNA). The wallpaper group B = {Z5×∞, ●} (an ∞-fold Cartesian product of Z5) acts on Dj (or Rj) yielding changes to Dj (or Rj) denoted by ‘Dj◦B(j→k) = Dk’ (or ‘Rj◦B(j→k) = Rk’). Based on the operations of this group, two types of groups—a modulo 5 linear group and a rotational group over the Gaussian plane, acting on the five bases—are linked as parts of the wallpaper group for broader applications. As a result, changes, insertions/deletions and DNA (RNA) recombination (partial/total conversion) are described. As an exploratory study, a notation for the canonical “central dogma” via a category theory-like way is presented for future developments. Conclusions Despite the large incompleteness of our methodology, there is fertile ground to consider a symmetry model for genetic coding based on our specific wallpaper group. A more integrated formulation containing “central dogma” for future molecular/genetic biology remains to be explored. PMID:24885369
Sun, Zhongyue; Liao, Tangbin; Zhang, Yulin; Shu, Jing; Zhang, Hong; Zhang, Guo-Jun
2016-12-15
A very simple sensing device based on biomimetic nanochannels has been developed for label-free, ultrasensitive and highly sequence-specific detection of DNA. Probe DNA was modified on the inner wall of the nanochannel surface by layer-by-layer (LBL) assembly. After probe DNA immobilization, DNA detection was realized by monitoring the rectified ion current when hybridization occurred. Due to three dimensional (3D) nanoscale environment of the nanochannel, this special geometry dramatically increased the surface area of the nanochannel for immobilization of probe molecules on the inner-surface and enlarged contact area between probes and target-molecules. Thus, the unique sensor reached a reliable detection limit of 10 fM for target DNA. In addition, this DNA sensor could discriminate complementary DNA (c-DNA) from non-complementary DNA (nc-DNA), two-base mismatched DNA (2bm-DNA) and one-base mismatched DNA (1bm-DNA) with high specificity. Moreover, the nanochannel-based biosensor was also able to detect target DNA even in an interfering environment and serum samples. This approach will provide a novel biosensing platform for detection and discrimination of disease-related molecular targets and unknown sequence DNA. Copyright © 2016 Elsevier B.V. All rights reserved.
Annotation, submission and screening of repetitive elements in Repbase: RepbaseSubmitter and Censor.
Kohany, Oleksiy; Gentles, Andrew J; Hankus, Lukasz; Jurka, Jerzy
2006-10-25
Repbase is a reference database of eukaryotic repetitive DNA, which includes prototypic sequences of repeats and basic information described in annotations. Updating and maintenance of the database requires specialized tools, which we have created and made available for use with Repbase, and which may be useful as a template for other curated databases. We describe the software tools RepbaseSubmitter and Censor, which are designed to facilitate updating and screening the content of Repbase. RepbaseSubmitter is a java-based interface for formatting and annotating Repbase entries. It eliminates many common formatting errors, and automates actions such as calculation of sequence lengths and composition, thus facilitating curation of Repbase sequences. In addition, it has several features for predicting protein coding regions in sequences; searching and including Pubmed references in Repbase entries; and searching the NCBI taxonomy database for correct inclusion of species information and taxonomic position. Censor is a tool to rapidly identify repetitive elements by comparison to known repeats. It uses WU-BLAST for speed and sensitivity, and can conduct DNA-DNA, DNA-protein, or translated DNA-translated DNA searches of genomic sequence. Defragmented output includes a map of repeats present in the query sequence, with the options to report masked query sequence(s), repeat sequences found in the query, and alignments. Censor and RepbaseSubmitter are available as both web-based services and downloadable versions. They can be found at http://www.girinst.org/repbase/submission.html (RepbaseSubmitter) and http://www.girinst.org/censor/index.php (Censor).
Jakubec, David; Laskowski, Roman A.; Vondrasek, Jiri
2016-01-01
Decades of intensive experimental studies of the recognition of DNA sequences by proteins have provided us with a view of a diverse and complicated world in which few to no features are shared between individual DNA-binding protein families. The originally conceived direct readout of DNA residue sequences by amino acid side chains offers very limited capacity for sequence recognition, while the effects of the dynamic properties of the interacting partners remain difficult to quantify and almost impossible to generalise. In this work we investigated the energetic characteristics of all DNA residue—amino acid side chain combinations in the conformations found at the interaction interface in a very large set of protein—DNA complexes by the means of empirical potential-based calculations. General specificity-defining criteria were derived and utilised to look beyond the binding motifs considered in previous studies. Linking energetic favourability to the observed geometrical preferences, our approach reveals several additional amino acid motifs which can distinguish between individual DNA bases. Our results remained valid in environments with various dielectric properties. PMID:27384774
Bakhori, Noremylia Mohd; Yusof, Nor Azah; Abdullah, Abdul Halim; Hussein, Mohd Zobir
2013-12-12
An optical DNA biosensor based on fluorescence resonance energy transfer (FRET) utilizing synthesized quantum dot (QD) has been developed for the detection of specific-sequence of DNA for Ganoderma boninense, an oil palm pathogen. Modified QD that contained carboxylic groups was conjugated with a single-stranded DNA probe (ssDNA) via amide-linkage. Hybridization of the target DNA with conjugated QD-ssDNA and reporter probe labeled with Cy5 allows for the detection of related synthetic DNA sequence of Ganoderma boninense gene based on FRET signals. Detection of FRET emission before and after hybridization was confirmed through the capability of the system to produce FRET at 680 nm for hybridized sandwich with complementary target DNA. No FRET emission was observed for non-complementary system. Hybridization time, temperature and effect of different concentration of target DNA were studied in order to optimize the developed system. The developed biosensor has shown high sensitivity with detection limit of 3.55 × 10-9 M. TEM results show that the particle size of QD varies in the range between 5 to 8 nm after ligand modification and conjugation with ssDNA. This approach is capable of providing a simple, rapid and sensitive method for detection of related synthetic DNA sequence of Ganoderma boninense.
Mohd Bakhori, Noremylia; Yusof, Nor Azah; Abdullah, Abdul Halim; Hussein, Mohd Zobir
2013-12-01
An optical DNA biosensor based on fluorescence resonance energy transfer (FRET) utilizing synthesized quantum dot (QD) has been developed for the detection of specific-sequence of DNA for Ganoderma boninense, an oil palm pathogen. Modified QD that contained carboxylic groups was conjugated with a single-stranded DNA probe (ssDNA) via amide-linkage. Hybridization of the target DNA with conjugated QD-ssDNA and reporter probe labeled with Cy5 allows for the detection of related synthetic DNA sequence of Ganoderma boninense gene based on FRET signals. Detection of FRET emission before and after hybridization was confirmed through the capability of the system to produce FRET at 680 nm for hybridized sandwich with complementary target DNA. No FRET emission was observed for non-complementary system. Hybridization time, temperature and effect of different concentration of target DNA were studied in order to optimize the developed system. The developed biosensor has shown high sensitivity with detection limit of 3.55 × 10(-9) M. TEM results show that the particle size of QD varies in the range between 5 to 8 nm after ligand modification and conjugation with ssDNA. This approach is capable of providing a simple, rapid and sensitive method for detection of related synthetic DNA sequence of Ganoderma boninense.
Type III restriction-modification enzymes: a historical perspective.
Rao, Desirazu N; Dryden, David T F; Bheemanaik, Shivakumara
2014-01-01
Restriction endonucleases interact with DNA at specific sites leading to cleavage of DNA. Bacterial DNA is protected from restriction endonuclease cleavage by modifying the DNA using a DNA methyltransferase. Based on their molecular structure, sequence recognition, cleavage position and cofactor requirements, restriction-modification (R-M) systems are classified into four groups. Type III R-M enzymes need to interact with two separate unmethylated DNA sequences in inversely repeated head-to-head orientations for efficient cleavage to occur at a defined location (25-27 bp downstream of one of the recognition sites). Like the Type I R-M enzymes, Type III R-M enzymes possess a sequence-specific ATPase activity for DNA cleavage. ATP hydrolysis is required for the long-distance communication between the sites before cleavage. Different models, based on 1D diffusion and/or 3D-DNA looping, exist to explain how the long-distance interaction between the two recognition sites takes place. Type III R-M systems are found in most sequenced bacteria. Genome sequencing of many pathogenic bacteria also shows the presence of a number of phase-variable Type III R-M systems, which play a role in virulence. A growing number of these enzymes are being subjected to biochemical and genetic studies, which, when combined with ongoing structural analyses, promise to provide details for mechanisms of DNA recognition and catalysis.
Environmental Control Of A Genetic Process
NASA Technical Reports Server (NTRS)
Khosla, Chaitan; Bailey, James E.
1991-01-01
E. coli bacteria altered to contain DNA sequence encoding production of hemoglobin made to produce hemoglobin at rates decreasing with increases in concentration of oxygen in culture media. Represents amplification of part of method described in "Cloned Hemoglobin Genes Enhance Growth Of Cells" (NPO-17517). Manipulation of promoter/regulator DNA sequences opens promising new subfield of recombinant-DNA technology for environmental control of expression of selected DNA sequences. New recombinant-DNA fusion gene products, expression vectors, and nucleotide-base sequences will emerge. Likely applications include such aerobic processes as manufacture of cloned proteins and synthesis of metabolites, production of chemicals by fermentation, enzymatic degradation, treatment of wastes, brewing, and variety of oxidative chemical reactions.
Cesarman, E; Chang, Y; Moore, P S; Said, J W; Knowles, D M
1995-05-04
DNA fragments that appeared to belong to an unidentified human herpesvirus were recently found in more than 90 percent of Kaposi's sarcoma lesions associated with the acquired immunodeficiency syndrome (AIDS). These fragments were also found in 6 of 39 tissue samples without Kaposi's sarcoma, including 3 malignant lymphomas, from patients with AIDS, but not in samples from patients without AIDS. We examined the DNA of 193 lymphomas from 42 patients with AIDS and 151 patients who did not have AIDS. We searched the DNA for sequences of Kaposi's sarcoma-associated herpesvirus (KSHV) by Southern blot hybridization, the polymerase chain reaction (PCR), or both. The PCR products in the positive samples were sequences and compared with the KSHV sequences in Kaposi's sarcoma tissues from patients with AIDS. KSHV sequences were identified in eight lymphomas in patients infected with the human immunodeficiency virus. All eight, and only these eight, were body-cavity-based lymphomas--that is, they were characterized by pleural, pericardial, or peritoneal lymphomatous effusions. All eight lymphomas also contained the Epstein-Barr viral genome. KSHV sequences were not found in the other 185 lymphomas. KSHV sequences were 40 to 80 times more abundant in the body-cavity-based lymphomas than in the Kaposi's sarcoma lesions. A high degree of conservation of KSHV sequences in Kaposi's sarcoma and in the eight lymphomas suggests the presence of the same agent in both lesions. The recently discovered KSHV DNA sequences occur in an unusual subgroup of AIDS-related B-cell lymphomas, but not in any other lymphoid neoplasm studied thus far. Our finding strongly suggests that a novel herpesvirus has a pathogenic role in AIDS-related body-cavity-based lymphomas.
Optimisation of DNA extraction from the crustacean Daphnia
Athanasio, Camila Gonçalves; Chipman, James K.; Viant, Mark R.
2016-01-01
Daphnia are key model organisms for mechanistic studies of phenotypic plasticity, adaptation and microevolution, which have led to an increasing demand for genomics resources. A key step in any genomics analysis, such as high-throughput sequencing, is the availability of sufficient and high quality DNA. Although commercial kits exist to extract genomic DNA from several species, preparation of high quality DNA from Daphnia spp. and other chitinous species can be challenging. Here, we optimise methods for tissue homogenisation, DNA extraction and quantification customised for different downstream analyses (e.g., LC-MS/MS, Hiseq, mate pair sequencing or Nanopore). We demonstrate that if Daphnia magna are homogenised as whole animals (including the carapace), absorbance-based DNA quantification methods significantly over-estimate the amount of DNA, resulting in using insufficient starting material for experiments, such as preparation of sequencing libraries. This is attributed to the high refractive index of chitin in Daphnia’s carapace at 260 nm. Therefore, unless the carapace is removed by overnight proteinase digestion, the extracted DNA should be quantified with fluorescence-based methods. However, overnight proteinase digestion will result in partial fragmentation of DNA therefore the prepared DNA is not suitable for downstream methods that require high molecular weight DNA, such as PacBio, mate pair sequencing and Nanopore. In conclusion, we found that the MasterPure DNA purification kit, coupled with grinding of frozen tissue, is the best method for extraction of high molecular weight DNA as long as the extracted DNA is quantified with fluorescence-based methods. This method generated high yield and high molecular weight DNA (3.10 ± 0.63 ng/µg dry mass, fragments >60 kb), free of organic contaminants (phenol, chloroform) and is suitable for large number of downstream analyses. PMID:27190714
The Essential Component in DNA-Based Information Storage System: Robust Error-Tolerating Module
Yim, Aldrin Kay-Yuen; Yu, Allen Chi-Shing; Li, Jing-Woei; Wong, Ada In-Chun; Loo, Jacky F. C.; Chan, King Ming; Kong, S. K.; Yip, Kevin Y.; Chan, Ting-Fung
2014-01-01
The size of digital data is ever increasing and is expected to grow to 40,000 EB by 2020, yet the estimated global information storage capacity in 2011 is <300 EB, indicating that most of the data are transient. DNA, as a very stable nano-molecule, is an ideal massive storage device for long-term data archive. The two most notable illustrations are from Church et al. and Goldman et al., whose approaches are well-optimized for most sequencing platforms – short synthesized DNA fragments without homopolymer. Here, we suggested improvements on error handling methodology that could enable the integration of DNA-based computational process, e.g., algorithms based on self-assembly of DNA. As a proof of concept, a picture of size 438 bytes was encoded to DNA with low-density parity-check error-correction code. We salvaged a significant portion of sequencing reads with mutations generated during DNA synthesis and sequencing and successfully reconstructed the entire picture. A modular-based programing framework – DNAcodec with an eXtensible Markup Language-based data format was also introduced. Our experiments demonstrated the practicability of long DNA message recovery with high error tolerance, which opens the field to biocomputing and synthetic biology. PMID:25414846
Recognising promoter sequences using an artificial immune system
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cooke, D.E.; Hunt, J.E.
1995-12-31
We have developed an artificial immune system (AIS) which is based on the human immune system. The AIS possesses an adaptive learning mechanism which enables antibodies to emerge which can be used for classification tasks. In this paper, we describe how the AIS has been used to evolve antibodies which can classify promoter containing and promoter negative DNA sequences. The DNA sequences used for teaching were 57 nucleotides in length and contained procaryotic promoters. The system classified previously unseen DNA sequences with an accuracy of approximately 90%.
Genomic signal processing methods for computation of alignment-free distances from DNA sequences.
Borrayo, Ernesto; Mendizabal-Ruiz, E Gerardo; Vélez-Pérez, Hugo; Romo-Vázquez, Rebeca; Mendizabal, Adriana P; Morales, J Alejandro
2014-01-01
Genomic signal processing (GSP) refers to the use of digital signal processing (DSP) tools for analyzing genomic data such as DNA sequences. A possible application of GSP that has not been fully explored is the computation of the distance between a pair of sequences. In this work we present GAFD, a novel GSP alignment-free distance computation method. We introduce a DNA sequence-to-signal mapping function based on the employment of doublet values, which increases the number of possible amplitude values for the generated signal. Additionally, we explore the use of three DSP distance metrics as descriptors for categorizing DNA signal fragments. Our results indicate the feasibility of employing GAFD for computing sequence distances and the use of descriptors for characterizing DNA fragments.
DNA Nucleotide Sequence Restricted by the RI Endonuclease
Hedgpeth, Joe; Goodman, Howard M.; Boyer, Herbert W.
1972-01-01
The sequence of DNA base pairs adjacent to the phosphodiester bonds cleaved by the RI restriction endonuclease in unmodified DNA from coliphage λ has been determined. The 5′-terminal nucleotide labeled with 32P and oligonucleotides up to the heptamer were analyzed from a pancreatic DNase digest. The following sequence of nucleotides adjacent to the RI break made in λ DNA was deduced from these data and from the 3′-dinucleotide sequence and nearest-neighbor analysis obtained from repair synthesis with the DNA polymerase of Rous sarcoma virus [Formula: see text] The RI endonuclease cleavage of the phosphodiester bonds (indicated by arrows) generates 5′-phosphoryls and short cohesive termini of four nucleotides, pApApTpT. The most striking feature of the sequence is its symmetry. PMID:4343974
Genomic Signal Processing Methods for Computation of Alignment-Free Distances from DNA Sequences
Borrayo, Ernesto; Mendizabal-Ruiz, E. Gerardo; Vélez-Pérez, Hugo; Romo-Vázquez, Rebeca; Mendizabal, Adriana P.; Morales, J. Alejandro
2014-01-01
Genomic signal processing (GSP) refers to the use of digital signal processing (DSP) tools for analyzing genomic data such as DNA sequences. A possible application of GSP that has not been fully explored is the computation of the distance between a pair of sequences. In this work we present GAFD, a novel GSP alignment-free distance computation method. We introduce a DNA sequence-to-signal mapping function based on the employment of doublet values, which increases the number of possible amplitude values for the generated signal. Additionally, we explore the use of three DSP distance metrics as descriptors for categorizing DNA signal fragments. Our results indicate the feasibility of employing GAFD for computing sequence distances and the use of descriptors for characterizing DNA fragments. PMID:25393409
Yamada, Kazuhiko; Nishida-Umehara, Chizuko; Matsuda, Yoichi
2004-03-01
We isolated a new family of satellite DNA sequences from HaeIII- and EcoRI-digested genomic DNA of the Blakiston's fish owl ( Ketupa blakistoni). The repetitive sequences were organized in tandem arrays of the 174 bp element, and localized to the centromeric regions of all macrochromosomes, including the Z and W chromosomes, and microchromosomes. This hybridization pattern was consistent with the distribution of C-band-positive centromeric heterochromatin, and the satellite DNA sequences occupied 10% of the total genome as a major component of centromeric heterochromatin. The sequences were homogenized between macro- and microchromosomes in this species, and therefore intraspecific divergence of the nucleotide sequences was low. The 174 bp element cross-hybridized to the genomic DNA of six other Strigidae species, but not to that of the Tytonidae, suggesting that the satellite DNA sequences are conserved in the same family but fairly divergent between the different families in the Strigiformes. Secondly, the centromeric satellite DNAs were cloned from eight Strigidae species, and the nucleotide sequences of 41 monomer fragments were compared within and between species. Molecular phylogenetic relationships of the nucleotide sequences were highly correlated with both the taxonomy based on morphological traits and the phylogenetic tree constructed by DNA-DNA hybridization. These results suggest that the satellite DNA sequence has evolved by concerted evolution in the Strigidae and that it is a good taxonomic and phylogenetic marker to examine genetic diversity between Strigiformes species.
Walker, M D; Park, C W; Rosen, A; Aronheim, A
1990-01-01
Cell specific expression of the insulin gene is achieved through transcriptional mechanisms operating on multiple DNA sequence elements located in the 5' flanking region of the gene. Of particular importance in the rat insulin I gene are two closely similar 9 bp sequences (IEB1 and IEB2): mutation of either of these leads to 5-10 fold reduction in transcriptional activity. We have screened an expression cDNA library derived from mouse pancreatic endocrine beta cells with a radioactive DNA probe containing multiple copies of the IEB1 sequence. A cDNA clone (A1) isolated by this procedure encodes a protein which shows efficient binding to the IEB1 probe, but much weaker binding to either an unrelated DNA probe or to a probe bearing a single base pair insertion within the recognition sequence. DNA sequence analysis indicates a protein belonging to the helix-loop-helix family of DNA-binding proteins. The ability of the protein encoded by clone A1 to recognize a number of wild type and mutant DNA sequences correlates closely with the ability of each sequence element to support transcription in vivo in the context of the insulin 5' flanking DNA. We conclude that the isolated cDNA may encode a transcription factor that participates in control of insulin gene expression. Images PMID:2181401
Mitochondrial sequence analysis for forensic identification using pyrosequencing technology.
Andréasson, H; Asp, A; Alderborn, A; Gyllensten, U; Allen, M
2002-01-01
Over recent years, requests for mtDNA analysis in the field of forensic medicine have notably increased, and the results of such analyses have proved to be very useful in forensic cases where nuclear DNA analysis cannot be performed. Traditionally, mtDNA has been analyzed by DNA sequencing of the two hypervariable regions, HVI and HVII, in the D-loop. DNA sequence analysis using the conventional Sanger sequencing is very robust but time consuming and labor intensive. By contrast, mtDNA analysis based on the pyrosequencing technology provides fast and accurate results from the human mtDNA present in many types of evidence materials in forensic casework. The assay has been developed to determine polymorphic sites in the mitochondrial D-loop as well as the coding region to further increase the discrimination power of mtDNA analysis. The pyrosequencing technology for analysis of mtDNA polymorphisms has been tested with regard to sensitivity, reproducibility, and success rate when applied to control samples and actual casework materials. The results show that the method is very accurate and sensitive; the results are easily interpreted and provide a high success rate on casework samples. The panel of pyrosequencing reactions for the mtDNA polymorphisms were chosen to result in an optimal discrimination power in relation to the number of bases determined.
Kiesler, Kevin M; Coble, Michael D; Hall, Thomas A; Vallone, Peter M
2014-01-01
A set of 711 samples from four U.S. population groups was analyzed using a novel mass spectrometry based method for mitochondrial DNA (mtDNA) base composition profiling. Comparison of the mass spectrometry results with Sanger sequencing derived data yielded a concordance rate of 99.97%. Length heteroplasmy was identified in 46% of samples and point heteroplasmy was observed in 6.6% of samples in the combined mass spectral and Sanger data set. Using discrimination capacity as a metric, Sanger sequencing of the full control region had the highest discriminatory power, followed by the mass spectrometry base composition method, which was more discriminating than Sanger sequencing of just the hypervariable regions. This trend is in agreement with the number of nucleotides covered by each of the three assays. Published by Elsevier Ireland Ltd.
Repair of DNA damage caused by cytosine deamination in mitochondrial DNA of forensic case samples.
Gorden, Erin M; Sturk-Andreaggi, Kimberly; Marshall, Charla
2018-05-01
DNA sequence damage from cytosine deamination is well documented in degraded samples, such as those from ancient and forensic contexts. This study examined the effect of a DNA repair treatment on mitochondrial DNA (mtDNA) from aged and degraded skeletal samples. DNA extracts from 21 non-probative, degraded skeletal samples (aged 50-70 years) were utilized for the analysis. A portion of each sample extract was subjected to DNA repair using a commercial repair kit, the New England BioLabs' NEBNext FFPE DNA Repair Kit (Ipswich, MA). MtDNA was enriched using PCR and targeted capture in a side-by-side experiment of untreated and repaired DNA. Sequencing was performed using both traditional (Sanger-type; STS) and next-generation sequencing (NGS) methods Although cytosine deamination was evident in the mtDNA sequence data, the observed level of damaged bases varied by sequencing method as well as by enrichment type. The STS PCR amplicon data did not show evidence of cytosine deamination that could be distinguished from background signal in either the untreated or repaired sample set. However, the same PCR amplicons showed 850 C → T/G → A substitutions consistent with cytosine deamination with variant frequencies (VFs) of up to 25% when sequenced using NGS methods The occurrence of base misincorporation due to cytosine deamination was reduced by 98% (to 10) in the NGS amplicon data after repair. The NGS capture data indicated low levels (1-2%) of cytosine deamination in mtDNA fragments that was effectively mitigated by DNA repair. The observed difference in the level of cytosine deamination between the PCR and capture enrichment methods can be attributed to the greater propensity for stochastic effects from the PCR enrichment technique employed (e.g., low template input, increased PCR cycles). Altogether these results indicate that DNA repair may be required when sequencing PCR-amplified DNA from degraded forensic case samples with NGS methods. Copyright © 2018 The Authors. Published by Elsevier B.V. All rights reserved.
Long-range barcode labeling-sequencing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen, Feng; Zhang, Tao; Singh, Kanwar K.
Methods for sequencing single large DNA molecules by clonal multiple displacement amplification using barcoded primers. Sequences are binned based on barcode sequences and sequenced using a microdroplet-based method for sequencing large polynucleotide templates to enable assembly of haplotype-resolved complex genomes and metagenomes.
Sequence-based prediction of protein-binding sites in DNA: comparative study of two SVM models.
Park, Byungkyu; Im, Jinyong; Tuvshinjargal, Narankhuu; Lee, Wook; Han, Kyungsook
2014-11-01
As many structures of protein-DNA complexes have been known in the past years, several computational methods have been developed to predict DNA-binding sites in proteins. However, its inverse problem (i.e., predicting protein-binding sites in DNA) has received much less attention. One of the reasons is that the differences between the interaction propensities of nucleotides are much smaller than those between amino acids. Another reason is that DNA exhibits less diverse sequence patterns than protein. Therefore, predicting protein-binding DNA nucleotides is much harder than predicting DNA-binding amino acids. We computed the interaction propensity (IP) of nucleotide triplets with amino acids using an extensive dataset of protein-DNA complexes, and developed two support vector machine (SVM) models that predict protein-binding nucleotides from sequence data alone. One SVM model predicts protein-binding nucleotides using DNA sequence data alone, and the other SVM model predicts protein-binding nucleotides using both DNA and protein sequences. In a 10-fold cross-validation with 1519 DNA sequences, the SVM model that uses DNA sequence data only predicted protein-binding nucleotides with an accuracy of 67.0%, an F-measure of 67.1%, and a Matthews correlation coefficient (MCC) of 0.340. With an independent dataset of 181 DNAs that were not used in training, it achieved an accuracy of 66.2%, an F-measure 66.3% and a MCC of 0.324. Another SVM model that uses both DNA and protein sequences achieved an accuracy of 69.6%, an F-measure of 69.6%, and a MCC of 0.383 in a 10-fold cross-validation with 1519 DNA sequences and 859 protein sequences. With an independent dataset of 181 DNAs and 143 proteins, it showed an accuracy of 67.3%, an F-measure of 66.5% and a MCC of 0.329. Both in cross-validation and independent testing, the second SVM model that used both DNA and protein sequence data showed better performance than the first model that used DNA sequence data. To the best of our knowledge, this is the first attempt to predict protein-binding nucleotides in a given DNA sequence from the sequence data alone. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Intrinsic flexibility of B-DNA: the experimental TRX scale.
Heddi, Brahim; Oguey, Christophe; Lavelle, Christophe; Foloppe, Nicolas; Hartmann, Brigitte
2010-01-01
B-DNA flexibility, crucial for DNA-protein recognition, is sequence dependent. Free DNA in solution would in principle be the best reference state to uncover the relation between base sequences and their intrinsic flexibility; however, this has long been hampered by a lack of suitable experimental data. We investigated this relationship by compiling and analyzing a large dataset of NMR (31)P chemical shifts in solution. These measurements reflect the BI <--> BII equilibrium in DNA, intimately correlated to helicoidal descriptors of the curvature, winding and groove dimensions. Comparing the ten complementary DNA dinucleotide steps indicates that some steps are much more flexible than others. This malleability is primarily controlled at the dinucleotide level, modulated by the tetranucleotide environment. Our analyses provide an experimental scale called TRX that quantifies the intrinsic flexibility of the ten dinucleotide steps in terms of Twist, Roll, and X-disp (base pair displacement). Applying the TRX scale to DNA sequences optimized for nucleosome formation reveals a 10 base-pair periodic alternation of stiff and flexible regions. Thus, DNA flexibility captured by the TRX scale is relevant to nucleosome formation, suggesting that this scale may be of general interest to better understand protein-DNA recognition.
Decoding DNA labels by melting curve analysis using real-time PCR.
Balog, József A; Fehér, Liliána Z; Puskás, László G
2017-12-01
Synthetic DNA has been used as an authentication code for a diverse number of applications. However, existing decoding approaches are based on either DNA sequencing or the determination of DNA length variations. Here, we present a simple alternative protocol for labeling different objects using a small number of short DNA sequences that differ in their melting points. Code amplification and decoding can be done in two steps using quantitative PCR (qPCR). To obtain a DNA barcode with high complexity, we defined 8 template groups, each having 4 different DNA templates, yielding 158 (>2.5 billion) combinations of different individual melting temperature (Tm) values and corresponding ID codes. The reproducibility and specificity of the decoding was confirmed by using the most complex template mixture, which had 32 different products in 8 groups with different Tm values. The industrial applicability of our protocol was also demonstrated by labeling a drone with an oil-based paint containing a predefined DNA code, which was then successfully decoded. The method presented here consists of a simple code system based on a small number of synthetic DNA sequences and a cost-effective, rapid decoding protocol using a few qPCR reactions, enabling a wide range of authentication applications.
Aspergillus section Versicolores: nine new species and multilocus DNA sequence based phylogeny
USDA-ARS?s Scientific Manuscript database
ß-tubulin, calmodulin, internal transcribed spacer and partial lsu-rDNA, RNA polymerase, DNA replication licensing factor Mcm7, and pre-rRNA processing protein Tsr1 were amplified and sequenced from 62 A. versicolor clade isolates and analyzed phylogenetically using the concordance model to establis...
Aspergillus section Versicolores, nine new species and multilocus DNA sequence based phylogeny
USDA-ARS?s Scientific Manuscript database
ß-tubulin, calmodulin, internal transcribed spacer and partial lsu-rDNA, RNA polymerase, DNA replication licensing factor Mcm7, and pre-rRNA processing protein Tsr1 were amplified and sequenced from 62 A. versicolor clade isolates and analyzed phylogenetically using the concordance model to establis...
Modeling DNA bubble formation at the atomic scale
DOE Office of Scientific and Technical Information (OSTI.GOV)
Beleva, V; Rasmussen, K. O.; Garcia, A. E.
We describe the fluctuations of double stranded DNA molecules using a minimalist Go model over a wide range of temperatures. Minimalist models allow us to describe, at the atomic level, the opening and formation of bubbles in DNA double helices. This model includes all the geometrical constraints in helix melting imposed by the 3D structure of the molecule. The DNA forms melted bubbles within double helices. These bubbles form and break as a function of time. The equilibrium average number of broken base pairs shows a sharp change as a function of T. We observe a temperature profile of sequencemore » dependent bubble formation similar to those measured by Zeng et al. Long nuclei acid molecules melt partially through the formations of bubbles. It is known that CG rich sequences melt at higher temperatures than AT rich sequences. The melting temperature, however, is not solely determined by the CG content, but by the sequence through base stacking and solvent interactions. Recently, models that incorporate the sequence and nonlinear dynamics of DNA double strands have shown that DNA exhibits a very rich dynamics. Recent extensions of the Bishop-Peyrard model show that fluctuations in the DNA structure lead to opening in localized regions, and that these regions in the DNA are associated with transcription initiation sites. 1D and 2D models of DNA may contain enough information about stacking and base pairing interactions, but lack the coupling between twisting, bending and base pair opening imposed by the double helical structure of DNA that all atom models easily describe. However, the complexity of the energy function used in all atom simulations (including solvent, ions, etc) does not allow for the description of DNA folding/unfolding events that occur in the microsecond time scale.« less
Quantum mechanical calculations related to ionization and charge transfer in DNA
NASA Astrophysics Data System (ADS)
Cauët, E.; Valiev, M.; Weare, J. H.; Liévin, J.
2012-07-01
Ionization and charge migration in DNA play crucial roles in mechanisms of DNA damage caused by ionizing radiation, oxidizing agents and photo-irradiation. Therefore, an evaluation of the ionization properties of the DNA bases is central to the full interpretation and understanding of the elementary reactive processes that occur at the molecular level during the initial exposure and afterwards. Ab initio quantum mechanical (QM) methods have been successful in providing highly accurate evaluations of key parameters, such as ionization energies (IE) of DNA bases. Hence, in this study, we performed high-level QM calculations to characterize the molecular energy levels and potential energy surfaces, which shed light on ionization and charge migration between DNA bases. In particular, we examined the IEs of guanine, the most easily oxidized base, isolated and embedded in base clusters, and investigated the mechanism of charge migration over two and three stacked guanines. The IE of guanine in the human telomere sequence has also been evaluated. We report a simple molecular orbital analysis to explain how modifications in the base sequence are expected to change the efficiency of the sequence as a hole trap. Finally, the application of a hybrid approach combining quantum mechanics with molecular mechanics brings an interesting discussion as to how the native aqueous DNA environment affects the IE threshold of nucleobases.
Exploring the Limits of DNA Size: Naphtho-homologated DNA Bases and Pairs
Lee, Alex H. F.; Kool, Eric T.
2008-01-01
A new design for DNA bases and base pairs is described in which the pyrimidine bases are widened by naphtho-homologation. Two naphtho-homologated deoxyribosides, dyyT (1) and dyyC (2) were synthesized and could be incorporated into oligonucleotides as suitably protected phosphoramidite derivatives. The deoxyribosides were found to be fluorescent, with emission maxima at 446 and 433 nm, respectively. Studies with single substitutions of 1 and 2 in the natural DNA context revealed exceptionally strong base stacking propensity for both. Sequences containing multiple substitutions of 1 and 2 paired opposite adenine and guanine were subsequently mixed and studied by several analytical methods. Data from UV mixing experiments, FRET measurements, fluorescence quenching experiments, and hybridizations on beads suggest that complementary “doublewide DNA” (yyDNA) strands may self-assemble into helical complexes with 1:1 stoichiometry. Data from thermal denaturation plots and CD spectra were less conclusive. Control experiments in one sequence context gave evidence that yyDNA helices, if formed, are preferentially antiparallel and are sequence selective. Hypothesized base pairing schemes are analogous to Watson-Crick pairing, but with glycosidic C1′-C1′ distances widened by over 45%, to ca. 15.2 Å. The possible self-assembly of the double-wide DNA helix establishes a new limit for the size of information-encoding, DNA-like molecules, and the fluorescence of yyDNA bases suggests uses as reporters in monomeric and oligomeric forms. PMID:16834396
Resolution of model Holliday junctions by yeast endonuclease: effect of DNA structure and sequence.
Parsons, C A; Murchie, A I; Lilley, D M; West, S C
1989-01-01
The resolution of Holliday junctions in DNA involves specific cleavage at or close to the site of the junction. A nuclease from Saccharomyces cerevisiae cleaves model Holliday junctions in vitro by the introduction of nicks in regions of duplex DNA adjacent to the crossover point. In previous studies [Parsons and West (1988) Cell, 52, 621-629] it was shown that cleavage occurred within homologous arm sequences with precise symmetry across the junction. In contrast, junctions with heterologous arm sequences were cleaved asymmetrically. In this work, we have studied the effect of sequence changes and base modification upon the site of cleavage. It is shown that the specificity of cleavage is unchanged providing that perfect homology is maintained between opposing arm sequences. However, in the absence of homology, cleavage depends upon sequence context and is affected by minor changes such as base modification. These data support the proposed mechanism for cleavage of a Holliday junction, which requires homologous alignment of arm sequences in an enzyme--DNA complex as a prerequisite for symmetrical cleavage by the yeast endonuclease. Images PMID:2653810
Zhu, X Q; Chilton, N B; Gasser, R B
1998-05-01
This study evaluated the use of a commercially available DNA intercalating agent (Resolver Gold) in agarose gels for the direct detection of sequence variation in ribosomal DNA (rDNA). This agent binds preferentially to AT sequence motifs in DNA. Regions of nuclear rDNA, known to provide genetic markers for the identification of species of parasitic ascarid nematodes (order Ascaridida), were amplified by polymerase chain reaction (PCR) and subjected to electrophoresis in standard agarose gels versus gels supplemented with Resolver Gold. Individual taxa examined could not be distinguished reliably based on the size of their amplicons in standard agarose gels, whereas they could be readily delineated based on mobility using Resolver Gold-supplemented gels. The latter was achieved because of differences (approximately 0.1-8.2%) in the AT content of the fragments among different taxa, which were associated with significant interspecific differences (approximately 11-39%) in the rDNA sequences employed. There was a tendency for fragments with higher AT content to migrate slower in supplemented agarose gels compared with those of lower AT content. The results indicate the usefulness of this electrophoretic approach to rapidly screen for sequence variability within or among PCR-amplified rDNA fragments of similar sizes but differing AT contents. Although evaluated on rDNA of parasites, the approach has potential to be applied to a range of genes of different groups of infectious organisms.
Malecka, Kamila; Michalczuk, Lech; Radecka, Hanna; Radecki, Jerzy
2014-10-09
A DNA biosensor for detection of specific oligonucleotides sequences of Plum Pox Virus (PPV) in plant extracts and buffer is proposed. The working principles of a genosensor are based on the ion-channel mechanism. The NH2-ssDNA probe was deposited onto a glassy carbon electrode surface to form an amide bond between the carboxyl group of oxidized electrode surface and amino group from ssDNA probe. The analytical signals generated as a result of hybridization were registered in Osteryoung square wave voltammetry in the presence of [Fe(CN)6]3-/4- as a redox marker. The 22-mer and 42-mer complementary ssDNA sequences derived from PPV and DNA samples from plants infected with PPV were used as targets. Similar detection limits of 2.4 pM (31.0 pg/mL) and 2.3 pM (29.5 pg/mL) in the concentration range 1-8 pM were observed in the presence of the 22-mer ssDNA and 42-mer complementary ssDNA sequences of PPV, respectively. The genosensor was capable of discriminating between samples consisting of extracts from healthy plants and leaf extracts from infected plants in the concentration range 10-50 pg/mL. The detection limit was 12.8 pg/mL. The genosensor displayed good selectivity and sensitivity. The 20-mer partially complementary DNA sequences with four complementary bases and DNA samples from healthy plants used as negative controls generated low signal.
NASA Astrophysics Data System (ADS)
Drozd, Marcin; Pietrzak, Mariusz D.; Malinowska, Elżbieta
2018-05-01
The framework of presented study covers the development and examination of the analytical performance of surface plasmon resonance-based (SPR) DNA biosensors dedicated for a detection of model target oligonucleotide sequence. For this aim, various strategies of immobilization of DNA probes on gold transducers were tested. Besides the typical approaches: chemisorption of thiolated ssDNA (DNA-thiol) and physisorption of non-functionalized oligonucleotides, relatively new method based on chemisorption of dithiocarbamate-functionalized ssDNA (DNA-DTC) was applied for the first time for preparation of DNA-based SPR biosensor. The special emphasis was put on the correlation between the method of DNA immobilization and the composition of obtained receptor layer. The carried out studies focused on the examination of the capability of developed receptors layers to interact with both target DNA and DNA-functionalized AuNPs. It was found, that the detection limit of target DNA sequence (27 nb length) depends on the strategy of probe immobilization and backfilling method, and in the best case it amounted to 0,66 nM. Moreover, the application of ssDNA-functionalized gold nanoparticles (AuNPs) as plasmonic labels for secondary enhancement of SPR response is presented. The influence of spatial organization and surface density of a receptor layer on the ability to interact with DNA-functionalized AuNPs is discussed. Due to the best compatibility of receptors immobilized via DTC chemisorption: 1.47 ± 0.4 ·1012 molecules • cm-2 (with the calculated area occupied by single nanoparticle label of 132.7 nm2), DNA chemisorption based on DTCs is pointed as especially promising for DNA biosensors utilizing indirect detection in competitive assays.
Drozd, Marcin; Pietrzak, Mariusz D; Malinowska, Elżbieta
2018-01-01
The framework of presented study covers the development and examination of the analytical performance of surface plasmon resonance-based (SPR) DNA biosensors dedicated for a detection of model target oligonucleotide sequence. For this aim, various strategies of immobilization of DNA probes on gold transducers were tested. Besides the typical approaches: chemisorption of thiolated ssDNA (DNA-thiol) and physisorption of non-functionalized oligonucleotides, relatively new method based on chemisorption of dithiocarbamate-functionalized ssDNA (DNA-DTC) was applied for the first time for preparation of DNA-based SPR biosensor. The special emphasis was put on the correlation between the method of DNA immobilization and the composition of obtained receptor layer. The carried out studies focused on the examination of the capability of developed receptors layers to interact with both target DNA and DNA-functionalized AuNPs. It was found, that the detection limit of target DNA sequence (27 nb length) depends on the strategy of probe immobilization and backfilling method, and in the best case it amounted to 0.66 nM. Moreover, the application of ssDNA-functionalized gold nanoparticles (AuNPs) as plasmonic labels for secondary enhancement of SPR response is presented. The influence of spatial organization and surface density of a receptor layer on the ability to interact with DNA-functionalized AuNPs is discussed. Due to the best compatibility of receptors immobilized via DTC chemisorption: 1.47 ± 0.4 · 10 12 molecules · cm -2 (with the calculated area occupied by single nanoparticle label of ~132.7 nm 2 ), DNA chemisorption based on DTCs is pointed as especially promising for DNA biosensors utilizing indirect detection in competitive assays.
Method for performing site-specific affinity fractionation for use in DNA sequencing
Mirzabekov, Andrei Darievich; Lysov, Yuri Petrovich; Dubley, Svetlana A.
1999-01-01
A method for fractionating and sequencing DNA via affinity interaction is provided comprising contacting cleaved DNA to a first array of oligonucleotide molecules to facilitate hybridization between said cleaved DNA and the molecules; extracting the hybridized DNA from the molecules; contacting said extracted hybridized DNA with a second array of oligonucleotide molecules, wherein the oligonucleotide molecules in the second array have specified base sequences that are complementary to said extracted hybridized DNA; and attaching labeled DNA to the second array of oligonucleotide molecules, wherein the labeled re-hybridized DNA have sequences that are complementary to the oligomers. The invention further provides a method for performing multi-step conversions of the chemical structure of compounds comprising supplying an array of polyacrylamide vessels separated by hydrophobic surfaces; immobilizing a plurality of reactants, such as enzymes, in the vessels so that each vessel contains one reactant; contacting the compounds to each of the vessels in a predetermined sequence and for a sufficient time to convert the compounds to a desired state; and isolating the converted compounds from said array.
Mirzabekov, Andrei Darievich; Lysov, Yuri Petrovich; Dubley, Svetlana A.
2000-01-01
A method for fractionating and sequencing DNA via affinity interaction is provided comprising contacting cleaved DNA to a first array of oligonucleotide molecules to facilitate hybridization between said cleaved DNA and the molecules; extracting the hybridized DNA from the molecules; contacting said extracted hybridized DNA with a second array of oligonucleotide molecules, wherein the oligonucleotide molecules in the second array have specified base sequences that are complementary to said extracted hybridized DNA; and attaching labeled DNA to the second array of oligonucleotide molecules, wherein the labeled re-hybridized DNA have sequences that are complementary to the oligomers. The invention further provides a method for performing multi-step conversions of the chemical structure of compounds comprising supplying an array of polyacrylamide vessels separated by hydrophobic surfaces; immobilizing a plurality of reactants, such as enzymes, in the vessels so that each vessel contains one reactant; contacting the compounds to each of the vessels in a predetermined sequence and for a sufficient time to convert the compounds to a desired state; and isolating the converted compounds from said array.
Method for performing site-specific affinity fractionation for use in DNA sequencing
Mirzabekov, A.D.; Lysov, Y.P.; Dubley, S.A.
1999-05-18
A method for fractionating and sequencing DNA via affinity interaction is provided comprising contacting cleaved DNA to a first array of oligonucleotide molecules to facilitate hybridization between the cleaved DNA and the molecules; extracting the hybridized DNA from the molecules; contacting the extracted hybridized DNA with a second array of oligonucleotide molecules, wherein the oligonucleotide molecules in the second array have specified base sequences that are complementary to the extracted hybridized DNA; and attaching labeled DNA to the second array of oligonucleotide molecules, wherein the labeled re-hybridized DNA have sequences that are complementary to the oligomers. The invention further provides a method for performing multi-step conversions of the chemical structure of compounds comprising supplying an array of polyacrylamide vessels separated by hydrophobic surfaces; immobilizing a plurality of reactants, such as enzymes, in the vessels so that each vessel contains one reactant; contacting the compounds to each of the vessels in a predetermined sequence and for a sufficient time to convert the compounds to a desired state; and isolating the converted compounds from the array. 14 figs.
Hajibabaei, Mehrdad; Shokralla, Shadi; Zhou, Xin; Singer, Gregory A. C.; Baird, Donald J.
2011-01-01
Timely and accurate biodiversity analysis poses an ongoing challenge for the success of biomonitoring programs. Morphology-based identification of bioindicator taxa is time consuming, and rarely supports species-level resolution especially for immature life stages. Much work has been done in the past decade to develop alternative approaches for biodiversity analysis using DNA sequence-based approaches such as molecular phylogenetics and DNA barcoding. On-going assembly of DNA barcode reference libraries will provide the basis for a DNA-based identification system. The use of recently introduced next-generation sequencing (NGS) approaches in biodiversity science has the potential to further extend the application of DNA information for routine biomonitoring applications to an unprecedented scale. Here we demonstrate the feasibility of using 454 massively parallel pyrosequencing for species-level analysis of freshwater benthic macroinvertebrate taxa commonly used for biomonitoring. We designed our experiments in order to directly compare morphology-based, Sanger sequencing DNA barcoding, and next-generation environmental barcoding approaches. Our results show the ability of 454 pyrosequencing of mini-barcodes to accurately identify all species with more than 1% abundance in the pooled mixture. Although the approach failed to identify 6 rare species in the mixture, the presence of sequences from 9 species that were not represented by individuals in the mixture provides evidence that DNA based analysis may yet provide a valuable approach in finding rare species in bulk environmental samples. We further demonstrate the application of the environmental barcoding approach by comparing benthic macroinvertebrates from an urban region to those obtained from a conservation area. Although considerable effort will be required to robustly optimize NGS tools to identify species from bulk environmental samples, our results indicate the potential of an environmental barcoding approach for biomonitoring programs. PMID:21533287
NMR and enzymology of modified DNA/protein interactions
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kennedy, M.A.
1994-12-31
We have found distinct DNA structure and base dynamics precisely at the TpA cleavage site in the TTTAAA AHA III endonuclease restriction sequence. Hence, the unusual base stacking and mobility found in this sequence may be important to the mechanism of enzymatic cleavage of the phophodiester bond.
Multiplexed Sequence Encoding: A Framework for DNA Communication.
Zakeri, Bijan; Carr, Peter A; Lu, Timothy K
2016-01-01
Synthetic DNA has great propensity for efficiently and stably storing non-biological information. With DNA writing and reading technologies rapidly advancing, new applications for synthetic DNA are emerging in data storage and communication. Traditionally, DNA communication has focused on the encoding and transfer of complete sets of information. Here, we explore the use of DNA for the communication of short messages that are fragmented across multiple distinct DNA molecules. We identified three pivotal points in a communication-data encoding, data transfer & data extraction-and developed novel tools to enable communication via molecules of DNA. To address data encoding, we designed DNA-based individualized keyboards (iKeys) to convert plaintext into DNA, while reducing the occurrence of DNA homopolymers to improve synthesis and sequencing processes. To address data transfer, we implemented a secret-sharing system-Multiplexed Sequence Encoding (MuSE)-that conceals messages between multiple distinct DNA molecules, requiring a combination key to reveal messages. To address data extraction, we achieved the first instance of chromatogram patterning through multiplexed sequencing, thereby enabling a new method for data extraction. We envision these approaches will enable more widespread communication of information via DNA.
Peters, R; King, C Y; Ukiyama, E; Falsafi, S; Donahoe, P K; Weiss, M A
1995-04-11
SRY, a genetic "master switch" for male development in mammals, exhibits two biochemical activities: sequence-specific recognition of duplex DNA and sequence-independent binding to the sharp angles of four-way DNA junctions. Here, we distinguish between these activities by analysis of a mutant SRY associated with human sex reversal (46, XY female with pure gonadal dysgenesis). The substitution (168T in human SRY) alters a nonpolar side chain in the minor-groove DNA recognition alpha-helix of the HMG box [Haqq, C.M., King, C.-Y., Ukiyama, E., Haqq, T.N., Falsalfi, S., Donahoe, P.K., & Weiss, M.A. (1994) Science 266, 1494-1500]. The native (but not mutant) side chain inserts between specific base pairs in duplex DNA, interrupting base stacking at a site of induced DNA bending. Isotope-aided 1H-NMR spectroscopy demonstrates that analogous side-chain insertion occurs on binding of SRY to a four-way junction, establishing a shared mechanism of sequence- and structure-specific DNA binding. Although the mutant DNA-binding domain exhibits > 50-fold reduction in sequence-specific DNA recognition, near wild-type affinity for four-way junctions is retained. Our results (i) identify a shared SRY-DNA contact at a site of either induced or intrinsic DNA bending, (ii) demonstrate that this contact is not required to bind an intrinsically bent DNA target, and (iii) rationalize patterns of sequence conservation or diversity among HMG boxes. Clinical association of the I68T mutation with human sex reversal supports the hypothesis that specific DNA recognition by SRY is required for male sex determination.
Sequence dependency of canonical base pair opening in the DNA double helix
Villa, Alessandra
2017-01-01
The flipping-out of a DNA base from the double helical structure is a key step of many cellular processes, such as DNA replication, modification and repair. Base pair opening is the first step of base flipping and the exact mechanism is still not well understood. We investigate sequence effects on base pair opening using extensive classical molecular dynamics simulations targeting the opening of 11 different canonical base pairs in two DNA sequences. Two popular biomolecular force fields are applied. To enhance sampling and calculate free energies, we bias the simulation along a simple distance coordinate using a newly developed adaptive sampling algorithm. The simulation is guided back and forth along the coordinate, allowing for multiple opening pathways. We compare the calculated free energies with those from an NMR study and check assumptions of the model used for interpreting the NMR data. Our results further show that the neighboring sequence is an important factor for the opening free energy, but also indicates that other sequence effects may play a role. All base pairs are observed to have a propensity for opening toward the major groove. The preferred opening base is cytosine for GC base pairs, while for AT there is sequence dependent competition between the two bases. For AT opening, we identify two non-canonical base pair interactions contributing to a local minimum in the free energy profile. For both AT and CG we observe long-lived interactions with water and with sodium ions at specific sites on the open base pair. PMID:28369121
Discrete Ramanujan transform for distinguishing the protein coding regions from other regions.
Hua, Wei; Wang, Jiasong; Zhao, Jian
2014-01-01
Based on the study of Ramanujan sum and Ramanujan coefficient, this paper suggests the concepts of discrete Ramanujan transform and spectrum. Using Voss numerical representation, one maps a symbolic DNA strand as a numerical DNA sequence, and deduces the discrete Ramanujan spectrum of the numerical DNA sequence. It is well known that of discrete Fourier power spectrum of protein coding sequence has an important feature of 3-base periodicity, which is widely used for DNA sequence analysis by the technique of discrete Fourier transform. It is performed by testing the signal-to-noise ratio at frequency N/3 as a criterion for the analysis, where N is the length of the sequence. The results presented in this paper show that the property of 3-base periodicity can be only identified as a prominent spike of the discrete Ramanujan spectrum at period 3 for the protein coding regions. The signal-to-noise ratio for discrete Ramanujan spectrum is defined for numerical measurement. Therefore, the discrete Ramanujan spectrum and the signal-to-noise ratio of a DNA sequence can be used for distinguishing the protein coding regions from the noncoding regions. All the exon and intron sequences in whole chromosomes 1, 2, 3 and 4 of Caenorhabditis elegans have been tested and the histograms and tables from the computational results illustrate the reliability of our method. In addition, we have analyzed theoretically and gotten the conclusion that the algorithm for calculating discrete Ramanujan spectrum owns the lower computational complexity and higher computational accuracy. The computational experiments show that the technique by using discrete Ramanujan spectrum for classifying different DNA sequences is a fast and effective method. Copyright © 2014 Elsevier Ltd. All rights reserved.
Biswas, Sovan; Sen, Suman; Im, JongOne; Biswas, Sudipta; Krstic, Predrag; Ashcroft, Brian; Borges, Chad; Zhao, Yanan; Lindsay, Stuart; Zhang, Peiming
2016-12-27
A reader molecule, which recognizes all the naturally occurring nucleobases in an electron tunnel junction, is required for sequencing DNA by a recognition tunneling (RT) technique, referred to as a universal reader. In the present study, we have designed a series of heterocyclic carboxamides based on hydrogen bonding and a large-sized pyrene ring based on a π-π stacking interaction as universal reader candidates. Each of these compounds was synthesized to bear a thiolated linker for attachment to metal electrodes and examined for their interactions with naturally occurring DNA nucleosides and nucleotides by 1 H NMR, ESI-MS, computational calculations, and surface plasmon resonance. RT measurements were carried out in a scanning tunnel microscope. All of these molecules generated electrical signals with DNA nucleotides in tunneling junctions under physiological conditions (phosphate buffered aqueous solution, pH 7.4). Using a support vector machine as a tool for data analysis, we found that these candidates distinguished among naturally occurring DNA nucleotides with the accuracy of pyrene (by π-π stacking interactions) > azole carboxamides (by hydrogen-bonding interactions). In addition, the pyrene reader operated efficiently in a larger tunnel junction. However, the azole carboxamide could read abasic (AP) monophosphate, a product from spontaneous base hydrolysis or an intermediate of base excision repair. Thus, we envision that sequencing DNA using both π-π stacking and hydrogen-bonding-based universal readers in parallel should generate more comprehensive genome sequences than sequencing based on either reader molecule alone.
Functional interrogation of non-coding DNA through CRISPR genome editing.
Canver, Matthew C; Bauer, Daniel E; Orkin, Stuart H
2017-05-15
Methodologies to interrogate non-coding regions have lagged behind coding regions despite comprising the vast majority of the genome. However, the rapid evolution of clustered regularly interspaced short palindromic repeats (CRISPR)-based genome editing has provided a multitude of novel techniques for laboratory investigation including significant contributions to the toolbox for studying non-coding DNA. CRISPR-mediated loss-of-function strategies rely on direct disruption of the underlying sequence or repression of transcription without modifying the targeted DNA sequence. CRISPR-mediated gain-of-function approaches similarly benefit from methods to alter the targeted sequence through integration of customized sequence into the genome as well as methods to activate transcription. Here we review CRISPR-based loss- and gain-of-function techniques for the interrogation of non-coding DNA. Copyright © 2017 Elsevier Inc. All rights reserved.
Species-specific Typing of DNA Based on Palindrome Frequency Patterns
Lamprea-Burgunder, Estelle; Ludin, Philipp; Mäser, Pascal
2011-01-01
DNA in its natural, double-stranded form may contain palindromes, sequences which read the same from either side because they are identical to their reverse complement on the sister strand. Short palindromes are underrepresented in all kinds of genomes. The frequency distribution of short palindromes exhibits more than twice the inter-species variance of non-palindromic sequences, which renders palindromes optimally suited for the typing of DNA. Here, we show that based on palindrome frequency, DNA sequences can be discriminated to the level of species of origin. By plotting the ratios of actual occurrence to expectancy, we generate palindrome frequency patterns that allow to cluster different sequences of the same genome and to assign plasmids, and in some cases even viruses to their respective host genomes. This finding will be of use in the growing field of metagenomics. PMID:21429991
The sequence specificity of UV-induced DNA damage in a systematically altered DNA sequence.
Khoe, Clairine V; Chung, Long H; Murray, Vincent
2018-06-01
The sequence specificity of UV-induced DNA damage was investigated in a specifically designed DNA plasmid using two procedures: end-labelling and linear amplification. Absorption of UV photons by DNA leads to dimerisation of pyrimidine bases and produces two major photoproducts, cyclobutane pyrimidine dimers (CPDs) and pyrimidine(6-4)pyrimidone photoproducts (6-4PPs). A previous study had determined that two hexanucleotide sequences, 5'-GCTC*AC and 5'-TATT*AA, were high intensity UV-induced DNA damage sites. The UV clone plasmid was constructed by systematically altering each nucleotide of these two hexanucleotide sequences. One of the main goals of this study was to determine the influence of single nucleotide alterations on the intensity of UV-induced DNA damage. The sequence 5'-GCTC*AC was designed to examine the sequence specificity of 6-4PPs and the highest intensity 6-4PP damage sites were found at 5'-GTTC*CC nucleotides. The sequence 5'-TATT*AA was devised to investigate the sequence specificity of CPDs and the highest intensity CPD damage sites were found at 5'-TTTT*CG nucleotides. It was proposed that the tetranucleotide DNA sequence, 5'-YTC*Y (where Y is T or C), was the consensus sequence for the highest intensity UV-induced 6-4PP adduct sites; while it was 5'-YTT*C for the highest intensity UV-induced CPD damage sites. These consensus tetranucleotides are composed entirely of consecutive pyrimidines and must have a DNA conformation that is highly productive for the absorption of UV photons. Crown Copyright © 2018. Published by Elsevier B.V. All rights reserved.
Promoter Sequences Prediction Using Relational Association Rule Mining
Czibula, Gabriela; Bocicor, Maria-Iuliana; Czibula, Istvan Gergely
2012-01-01
In this paper we are approaching, from a computational perspective, the problem of promoter sequences prediction, an important problem within the field of bioinformatics. As the conditions for a DNA sequence to function as a promoter are not known, machine learning based classification models are still developed to approach the problem of promoter identification in the DNA. We are proposing a classification model based on relational association rules mining. Relational association rules are a particular type of association rules and describe numerical orderings between attributes that commonly occur over a data set. Our classifier is based on the discovery of relational association rules for predicting if a DNA sequence contains or not a promoter region. An experimental evaluation of the proposed model and comparison with similar existing approaches is provided. The obtained results show that our classifier overperforms the existing techniques for identifying promoter sequences, confirming the potential of our proposal. PMID:22563233
Structure-affinity relationships for the binding of actinomycin D to DNA
NASA Astrophysics Data System (ADS)
Gallego, José; Ortiz, Angel R.; de Pascual-Teresa, Beatriz; Gago, Federico
1997-03-01
Molecular models of the complexes between actinomycin D and 14 different DNA hexamers were built based on the X-ray crystal structure of the actinomycin-d(GAAGCTTC)2 complex. The DNA sequences included the canonical GpC binding step flanked by different base pairs, nonclassical binding sites such as GpG and GpT, and sites containing 2,6-diamino- purine. A good correlation was found between the intermolecular interaction energies calculated for the refined complexes and the relative preferences of actinomycin binding to standard and modified DNA. A detailed energy decomposition into van der Waals and electrostatic components for the interactions between the DNA base pairs and either the chromophore or the peptidic part of the antibiotic was performed for each complex. The resulting energy matrix was then subjected to principal component analysis, which showed that actinomycin D discriminates among different DNA sequences by an interplay of hydrogen bonding and stacking interactions. The structure-affinity relationships for this important antitumor drug are thus rationalized and may be used to advantage in the design of novel sequence-specific DNA-binding agents.
Wysoczynski, Christina L.; Roemer, Sarah C.; Dostal, Vishantie; Barkley, Robert M.; Churchill, Mair E. A.; Malarkey, Christopher S.
2013-01-01
Obtaining quantities of highly pure duplex DNA is a bottleneck in the biophysical analysis of protein–DNA complexes. In traditional DNA purification methods, the individual cognate DNA strands are purified separately before annealing to form DNA duplexes. This approach works well for palindromic sequences, in which top and bottom strands are identical and duplex formation is typically complete. However, in cases where the DNA is non-palindromic, excess of single-stranded DNA must be removed through additional purification steps to prevent it from interfering in further experiments. Here we describe and apply a novel reversed-phase ion-pair liquid chromatography purification method for double-stranded DNA ranging in lengths from 17 to 51 bp. Both palindromic and non-palindromic DNA can be readily purified. This method has the unique ability to separate blunt double-stranded DNA from pre-attenuated (n-1, n-2, etc) synthesis products, and from DNA duplexes with single base pair overhangs. Additionally, palindromic DNA sequences with only minor differences in the central spacer sequence of the DNA can be separated, and the purified DNA is suitable for co-crystallization of protein–DNA complexes. Thus, double-stranded ion-pair liquid chromatography is a useful approach for duplex DNA purification for many applications. PMID:24013567
Benabdelkrim Filali, Oumama; Kabine, Mostafa; El Hamouchi, Adil; Lemrani, Meryem; Debboun, Mustapha; Sarih, M'hammed
2018-06-05
Anopheles sergentii known as the "oasis vector" or the "desert malaria vector" is considered the main vector of malaria in the southern parts of Morocco. Its presence in Morocco is confirmed for the first time through sequencing of mitochondrial DNA (mDNA) cytochrome c oxidase subunit I (COI) barcodes and nuclear ribosomal DNA (rDNA) second internal transcribed spacer (ITS2) sequences and direct comparison with specimens of A. sergentii of other countries. The DNA barcodes (n = 39) obtained from A. sergentii collected in 2015 and 2016 showed more diversity with 10 haplotypes, compared with 3 haplotypes obtained from ITS2 sequences (n = 59). Moreover, the comparison using the ITS2 sequences showed closer evolutionary relationship between the Moroccan and Egyptian strains than the Iranian strain. Nevertheless, genetic differences due to geographical segregation were also observed. This study provides the first report on the sequence of rDNA-ITS2 and mtDNA COI, which could be used to better understand the biodiversity of A. sergentii.
Albayrak, Levent; Khanipov, Kamil; Pimenova, Maria; Golovko, George; Rojas, Mark; Pavlidis, Ioannis; Chumakov, Sergei; Aguilar, Gerardo; Chávez, Arturo; Widger, William R; Fofanov, Yuriy
2016-12-12
Low-abundance mutations in mitochondrial populations (mutations with minor allele frequency ≤ 1%), are associated with cancer, aging, and neurodegenerative disorders. While recent progress in high-throughput sequencing technology has significantly improved the heteroplasmy identification process, the ability of this technology to detect low-abundance mutations can be affected by the presence of similar sequences originating from nuclear DNA (nDNA). To determine to what extent nDNA can cause false positive low-abundance heteroplasmy calls, we have identified mitochondrial locations of all subsequences that are common or similar (one mismatch allowed) between nDNA and mitochondrial DNA (mtDNA). Performed analysis revealed up to a 25-fold variation in the lengths of longest common and longest similar (one mismatch allowed) subsequences across the mitochondrial genome. The size of the longest subsequences shared between nDNA and mtDNA in several regions of the mitochondrial genome were found to be as low as 11 bases, which not only allows using these regions to design new, very specific PCR primers, but also supports the hypothesis of the non-random introduction of mtDNA into the human nuclear DNA. Analysis of the mitochondrial locations of the subsequences shared between nDNA and mtDNA suggested that even very short (36 bases) single-end sequencing reads can be used to identify low-abundance variation in 20.4% of the mitochondrial genome. For longer (76 and 150 bases) reads, the proportion of the mitochondrial genome where nDNA presence will not interfere found to be 44.5 and 67.9%, when low-abundance mutations at 100% of locations can be identified using 417 bases long single reads. This observation suggests that the analysis of low-abundance variations in mitochondria population can be extended to a variety of large data collections such as NCBI Sequence Read Archive, European Nucleotide Archive, The Cancer Genome Atlas, and International Cancer Genome Consortium.
Chen, Y. C.; Eisner, J. D.; Kattar, M. M.; Rassoulian-Barrett, S. L.; LaFe, K.; Yarfitz, S. L.; Limaye, A. P.; Cookson, B. T.
2000-01-01
Identification of medically relevant yeasts can be time-consuming and inaccurate with current methods. We evaluated PCR-based detection of sequence polymorphisms in the internal transcribed spacer 2 (ITS2) region of the rRNA genes as a means of fungal identification. Clinical isolates (401), reference strains (6), and type strains (27), representing 34 species of yeasts were examined. The length of PCR-amplified ITS2 region DNA was determined with single-base precision in less than 30 min by using automated capillary electrophoresis. Unique, species-specific PCR products ranging from 237 to 429 bp were obtained from 92% of the clinical isolates. The remaining 8%, divided into groups with ITS2 regions which differed by ≤2 bp in mean length, all contained species-specific DNA sequences easily distinguishable by restriction enzyme analysis. These data, and the specificity of length polymorphisms for identifying yeasts, were confirmed by DNA sequence analysis of the ITS2 region from 93 isolates. Phenotypic and ITS2-based identification was concordant for 427 of 434 yeast isolates examined using sequence identity of ≥99%. Seven clinical isolates contained ITS2 sequences that did not agree with their phenotypic identification, and ITS2-based phylogenetic analyses indicate the possibility of new or clinically unusual species in the Rhodotorula and Candida genera. This work establishes an initial database, validated with over 400 clinical isolates, of ITS2 length and sequence polymorphisms for 34 species of yeasts. We conclude that size and restriction analysis of PCR-amplified ITS2 region DNA is a rapid and reliable method to identify clinically significant yeasts, including potentially new or emerging pathogenic species. PMID:10834993
Guérin, Frédéric; Arnaiz, Olivier; Boggetto, Nicole; Denby Wilkes, Cyril; Meyer, Eric; Sperling, Linda; Duharcourt, Sandra
2017-04-26
DNA elimination is developmentally programmed in a wide variety of eukaryotes, including unicellular ciliates, and leads to the generation of distinct germline and somatic genomes. The ciliate Paramecium tetraurelia harbors two types of nuclei with different functions and genome structures. The transcriptionally inactive micronucleus contains the complete germline genome, while the somatic macronucleus contains a reduced genome streamlined for gene expression. During development of the somatic macronucleus, the germline genome undergoes massive and reproducible DNA elimination events. Availability of both the somatic and germline genomes is essential to examine the genome changes that occur during programmed DNA elimination and ultimately decipher the mechanisms underlying the specific removal of germline-limited sequences. We developed a novel experimental approach that uses flow cell imaging and flow cytometry to sort subpopulations of nuclei to high purity. We sorted vegetative micronuclei and macronuclei during development of P. tetraurelia. We validated the method by flow cell imaging and by high throughput DNA sequencing. Our work establishes the proof of principle that developing somatic macronuclei can be sorted from a complex biological sample to high purity based on their size, shape and DNA content. This method enabled us to sequence, for the first time, the germline DNA from pure micronuclei and to identify novel transposable elements. Sequencing the germline DNA confirms that the Pgm domesticated transposase is required for the excision of all ~45,000 Internal Eliminated Sequences. Comparison of the germline DNA and unrearranged DNA obtained from PGM-silenced cells reveals that the latter does not provide a faithful representation of the germline genome. We developed a flow cytometry-based method to purify P. tetraurelia nuclei to high purity and provided quality control with flow cell imaging and high throughput DNA sequencing. We identified 61 germline transposable elements including the first Paramecium retrotransposons. This approach paves the way to sequence the germline genomes of P. aurelia sibling species for future comparative genomic studies.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gottlieb, J.; Muzyczka, N.
1988-06-01
When circular recombinant plasmids containing adeno-associated virus (AAV) DNA sequences are transfected into human cells, the AAV provirus is rescued. Using these circular AAV plasmids as substrates, the authors isolated an enzyme fraction from HeLa cell nuclear extracts that excises intact AAV DNA in vitro from vector DNA and produces linear DNA products. The recognition signal for the enzyme is a polypurine-polypyrimidine sequence which is at least 9 residues long and rich in G . C base pairs. Such sequences are present in AAV recombinant plasmids as part of the first 15 base pairs of the AAV terminal repeat andmore » in some cases as the result of cloning the AAV genome by G . C tailing. The isolated enzyme fraction does not have significant endonucleolytic activity on single-stranded or double-stranded DNA. Plasmid DNA that is transfected into tissue culture cells is cleaved in vivo to produce a pattern of DNA fragments similar to that seen with purified enzyme in vitro. The activity has been called endo R for rescue, and its behavior suggests that it may have a role in recombination of cellular chromosomes.« less
Utro, Filippo; Di Benedetto, Valeria; Corona, Davide F V; Giancarlo, Raffaele
2016-03-15
Thanks to research spanning nearly 30 years, two major models have emerged that account for nucleosome organization in chromatin: statistical and sequence specific. The first is based on elegant, easy to compute, closed-form mathematical formulas that make no assumptions of the physical and chemical properties of the underlying DNA sequence. Moreover, they need no training on the data for their computation. The latter is based on some sequence regularities but, as opposed to the statistical model, it lacks the same type of closed-form formulas that, in this case, should be based on the DNA sequence only. We contribute to close this important methodological gap between the two models by providing three very simple formulas for the sequence specific one. They are all based on well-known formulas in Computer Science and Bioinformatics, and they give different quantifications of how complex a sequence is. In view of how remarkably well they perform, it is very surprising that measures of sequence complexity have not even been considered as candidates to close the mentioned gap. We provide experimental evidence that the intrinsic level of combinatorial organization and information-theoretic content of subsequences within a genome are strongly correlated to the level of DNA encoded nucleosome organization discovered by Kaplan et al Our results establish an important connection between the intrinsic complexity of subsequences in a genome and the intrinsic, i.e. DNA encoded, nucleosome organization of eukaryotic genomes. It is a first step towards a mathematical characterization of this latter 'encoding'. Supplementary data are available at Bioinformatics online. futro@us.ibm.com. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Karas, Vlad O; Sinnott-Armstrong, Nicholas A; Varghese, Vici; Shafer, Robert W; Greenleaf, William J; Sherlock, Gavin
2018-01-01
Abstract Much of the within species genetic variation is in the form of single nucleotide polymorphisms (SNPs), typically detected by whole genome sequencing (WGS) or microarray-based technologies. However, WGS produces mostly uninformative reads that perfectly match the reference, while microarrays require genome-specific reagents. We have developed Diff-seq, a sequencing-based mismatch detection assay for SNP discovery without the requirement for specialized nucleic-acid reagents. Diff-seq leverages the Surveyor endonuclease to cleave mismatched DNA molecules that are generated after cross-annealing of a complex pool of DNA fragments. Sequencing libraries enriched for Surveyor-cleaved molecules result in increased coverage at the variant sites. Diff-seq detected all mismatches present in an initial test substrate, with specific enrichment dependent on the identity and context of the variation. Application to viral sequences resulted in increased observation of variant alleles in a biologically relevant context. Diff-Seq has the potential to increase the sensitivity and efficiency of high-throughput sequencing in the detection of variation. PMID:29361139
Sequencing of cDNA Clones from the Genetic Map of Tomato (Lycopersicon esculentum)
Ganal, Martin W.; Czihal, Rosemarie; Hannappel, Ulrich; Kloos, Dorothee-U.; Polley, Andreas; Ling, Hong-Qing
1998-01-01
The dense RFLP linkage map of tomato (Lycopersicon esculentum) contains >300 anonymous cDNA clones. Of those clones, 272 were partially or completely sequenced. The sequences were compared at the DNA and protein level to known genes in databases. For 57% of the clones, a significant match to previously described genes was found. The information will permit the conversion of those markers to STS markers and allow their use in PCR-based mapping experiments. Furthermore, it will facilitate the comparative mapping of genes across distantly related plant species by direct comparison of DNA sequences and map positions. [cDNA sequence data reported in this paper have been submitted to the EMBL database under accession nos. AA824695–AA825005 and the dbEST_Id database under accession nos. 1546519–1546862.] PMID:9724330
Tin, Mandy Man-Ying; Economo, Evan Philip; Mikheyev, Alexander Sergeyevich
2014-01-01
Ancient and archival DNA samples are valuable resources for the study of diverse historical processes. In particular, museum specimens provide access to biotas distant in time and space, and can provide insights into ecological and evolutionary changes over time. However, archival specimens are difficult to handle; they are often fragile and irreplaceable, and typically contain only short segments of denatured DNA. Here we present a set of tools for processing such samples for state-of-the-art genetic analysis. First, we report a protocol for minimally destructive DNA extraction of insect museum specimens, which produced sequenceable DNA from all of the samples assayed. The 11 specimens analyzed had fragmented DNA, rarely exceeding 100 bp in length, and could not be amplified by conventional PCR targeting the mitochondrial cytochrome oxidase I gene. Our approach made these samples amenable to analysis with commonly used next-generation sequencing-based molecular analytic tools, including RAD-tagging and shotgun genome re-sequencing. First, we used museum ant specimens from three species, each with its own reference genome, for RAD-tag mapping. Were able to use the degraded DNA sequences, which were sequenced in full, to identify duplicate reads and filter them prior to base calling. Second, we re-sequenced six Hawaiian Drosophila species, with millions of years of divergence, but with only a single available reference genome. Despite a shallow coverage of 0.37 ± 0.42 per base, we could recover a sufficient number of overlapping SNPs to fully resolve the species tree, which was consistent with earlier karyotypic studies, and previous molecular studies, at least in the regions of the tree that these studies could resolve. Although developed for use with degraded DNA, all of these techniques are readily applicable to more recent tissue, and are suitable for liquid handling automation.
Clarification of the Concept of Ganoderma orbiforme with High Morphological Plasticity
Wang, Dong-Mei; Wu, Sheng-Hua; Yao, Yi-Jian
2014-01-01
Ganoderma has been considered a very difficult genus among the polypores to classify and is currently in a state of taxonomic chaos. In a study of Ganoderma collections including numerous type specimens, we found that six species namely G. cupreum, G. densizonatum, G. limushanense, G. mastoporum, G. orbiforme, G. subtornatum, and records of G. fornicatum from Mainland China and Taiwan are very similar to one another in basidiocarp texture, pilear cuticle structure, context color, pore color and basidiospore characteristics. Further, we sequenced the nrDNA ITS region (ITS1 and ITS2) and partial mtDNA SSU region of the studied materials, and performed phylogenetic analyses based on these sequence data. The nrDNA ITS sequence analysis results show that the eight nrDNA ITS sequences derived from this study have single-nucleotide polymorphisms in ITS1 and/or ITS2 at inter- and intra-individual levels. In the nrDNA ITS phylogenetic trees, all the sequences from this study are grouped together with those of G. cupreum and G. mastoporum retrieved from GenBank to form a distinct clade. The mtDNA SSU sequence analysis results reveal that the five mtDNA SSU sequences derived from this study are clustered together with those of G. cupreum retrieved from GenBank and also form a distinct clade in the mtDNA SSU phylogenetic trees. Based on morphological and molecular data, we conclude that the studied taxa are conspecific. Among the names assigned to this species, G. fornicatum given to Asian collections has nomenclatural priority over the others. However, the type of G. fornicatum from Brazil is probably lost and a modern description based on the type lacks. The identification of the Asian collections to G. fornicatum therefore cannot be confirmed. To the best of our knowledge, G. orbiforme is the earliest valid name for use. PMID:24875218
ANN modeling of DNA sequences: new strategies using DNA shape code.
Parbhane, R V; Tambe, S S; Kulkarni, B D
2000-09-01
Two new encoding strategies, namely, wedge and twist codes, which are based on the DNA helical parameters, are introduced to represent DNA sequences in artificial neural network (ANN)-based modeling of biological systems. The performance of the new coding strategies has been evaluated by conducting three case studies involving mapping (modeling) and classification applications of ANNs. The proposed coding schemes have been compared rigorously and shown to outperform the existing coding strategies especially in situations wherein limited data are available for building the ANN models.
Mapping DNA polymerase errors by single-molecule sequencing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lee, David F.; Lu, Jenny; Chang, Seungwoo
Genomic integrity is compromised by DNA polymerase replication errors, which occur in a sequence-dependent manner across the genome. Accurate and complete quantification of a DNA polymerase's error spectrum is challenging because errors are rare and difficult to detect. We report a high-throughput sequencing assay to map in vitro DNA replication errors at the single-molecule level. Unlike previous methods, our assay is able to rapidly detect a large number of polymerase errors at base resolution over any template substrate without quantification bias. To overcome the high error rate of high-throughput sequencing, our assay uses a barcoding strategy in which each replicationmore » product is tagged with a unique nucleotide sequence before amplification. Here, this allows multiple sequencing reads of the same product to be compared so that sequencing errors can be found and removed. We demonstrate the ability of our assay to characterize the average error rate, error hotspots and lesion bypass fidelity of several DNA polymerases.« less
Mapping DNA polymerase errors by single-molecule sequencing
Lee, David F.; Lu, Jenny; Chang, Seungwoo; ...
2016-05-16
Genomic integrity is compromised by DNA polymerase replication errors, which occur in a sequence-dependent manner across the genome. Accurate and complete quantification of a DNA polymerase's error spectrum is challenging because errors are rare and difficult to detect. We report a high-throughput sequencing assay to map in vitro DNA replication errors at the single-molecule level. Unlike previous methods, our assay is able to rapidly detect a large number of polymerase errors at base resolution over any template substrate without quantification bias. To overcome the high error rate of high-throughput sequencing, our assay uses a barcoding strategy in which each replicationmore » product is tagged with a unique nucleotide sequence before amplification. Here, this allows multiple sequencing reads of the same product to be compared so that sequencing errors can be found and removed. We demonstrate the ability of our assay to characterize the average error rate, error hotspots and lesion bypass fidelity of several DNA polymerases.« less
Correcting for Sample Contamination in Genotype Calling of DNA Sequence Data
Flickinger, Matthew; Jun, Goo; Abecasis, Gonçalo R.; Boehnke, Michael; Kang, Hyun Min
2015-01-01
DNA sample contamination is a frequent problem in DNA sequencing studies and can result in genotyping errors and reduced power for association testing. We recently described methods to identify within-species DNA sample contamination based on sequencing read data, showed that our methods can reliably detect and estimate contamination levels as low as 1%, and suggested strategies to identify and remove contaminated samples from sequencing studies. Here we propose methods to model contamination during genotype calling as an alternative to removal of contaminated samples from further analyses. We compare our contamination-adjusted calls to calls that ignore contamination and to calls based on uncontaminated data. We demonstrate that, for moderate contamination levels (5%–20%), contamination-adjusted calls eliminate 48%–77% of the genotyping errors. For lower levels of contamination, our contamination correction methods produce genotypes nearly as accurate as those based on uncontaminated data. Our contamination correction methods are useful generally, but are particularly helpful for sample contamination levels from 2% to 20%. PMID:26235984
Klobutcher, L A; Swanton, M T; Donini, P; Prescott, D M
1981-01-01
In hypotrichous ciliates, all of the macronuclear DNA is in the form of low molecular weight molecules with an average size of approximately 2200 base pairs. Total macronuclear DNA from four hypotrichs has been shown to have inverted terminal repeats by direct sequence analysis. In Oxytricha nova, Oxytricha sp., and Stylonychia pustulata, this terminal sequence may be written as 5'-C4A4C4A4C4 ... 3'-G4T4G4T4G4T4G4T4G4 ... In Euplotes aediculatus, the sequences is similar but differs in the lengths of the duplex region (28 base pairs) and of the putative 3' extension (14 base pairs). Also in Euplotes, a second common sequence of 5 base pairs (A-A-C-T-T-T-T-G-A-A) occurs internal to the terminal repeat and a 17-base-pair heterogeneous region: 5'-C4A4C4A4C4A4C4(X)17T-T-G-A-A ... 3'-G2T4G4T4G4T4G4T4G4T4G4(X)17A-A-C-T-T ... The length of the terminal repeat sequence for O. nova was confirmed in cloned macronuclear DNA molecules. Images PMID:6265931
Directed evolution of the TALE N-terminal domain for recognition of all 5′ bases
Lamb, Brian M.; Mercer, Andrew C.; Barbas, Carlos F.
2013-01-01
Transcription activator-like effector (TALE) proteins can be designed to bind virtually any DNA sequence. General guidelines for design of TALE DNA-binding domains suggest that the 5′-most base of the DNA sequence bound by the TALE (the N0 base) should be a thymine. We quantified the N0 requirement by analysis of the activities of TALE transcription factors (TALE-TF), TALE recombinases (TALE-R) and TALE nucleases (TALENs) with each DNA base at this position. In the absence of a 5′ T, we observed decreases in TALE activity up to >1000-fold in TALE-TF activity, up to 100-fold in TALE-R activity and up to 10-fold reduction in TALEN activity compared with target sequences containing a 5′ T. To develop TALE architectures that recognize all possible N0 bases, we used structure-guided library design coupled with TALE-R activity selections to evolve novel TALE N-terminal domains to accommodate any N0 base. A G-selective domain and broadly reactive domains were isolated and characterized. The engineered TALE domains selected in the TALE-R format demonstrated modularity and were active in TALE-TF and TALEN architectures. Evolved N-terminal domains provide effective and unconstrained TALE-based targeting of any DNA sequence as TALE binding proteins and designer enzymes. PMID:23980031
Phylogenetic Position of a Copper Age Sheep (Ovis aries) Mitochondrial DNA
Olivieri, Cristina; Ermini, Luca; Rizzi, Ermanno; Corti, Giorgio; Luciani, Stefania; Marota, Isolina; De Bellis, Gianluca; Rollo, Franco
2012-01-01
Background Sheep (Ovis aries) were domesticated in the Fertile Crescent region about 9,000-8,000 years ago. Currently, few mitochondrial (mt) DNA studies are available on archaeological sheep. In particular, no data on archaeological European sheep are available. Methodology/Principal Findings Here we describe the first portion of mtDNA sequence of a Copper Age European sheep. DNA was extracted from hair shafts which were part of the clothes of the so-called Tyrolean Iceman or Ötzi (5,350 - 5,100 years before present). Mitochondrial DNA (a total of 2,429 base pairs, encompassing a portion of the control region, tRNAPhe, a portion of the 12S rRNA gene, and the whole cytochrome B gene) was sequenced using a mixed sequencing procedure based on PCR amplification and 454 sequencing of pooled amplification products. We have compared the sequence with the corresponding sequence of 334 extant lineages. Conclusions/Significance A phylogenetic network based on a new cladistic notation for the mitochondrial diversity of domestic sheep shows that the Ötzi's sheep falls within haplogroup B, thus demonstrating that sheep belonging to this haplogroup were already present in the Alps more than 5,000 years ago. On the other hand, the lineage of the Ötzi's sheep is defined by two transitions (16147, and 16440) which, assembled together, define a motif that has not yet been identified in modern sheep populations. PMID:22457789
Pasi, Marco; Maddocks, John H.; Lavery, Richard
2015-01-01
Microsecond molecular dynamics simulations of B-DNA oligomers carried out in an aqueous environment with a physiological salt concentration enable us to perform a detailed analysis of how potassium ions interact with the double helix. The oligomers studied contain all 136 distinct tetranucleotides and we are thus able to make a comprehensive analysis of base sequence effects. Using a recently developed curvilinear helicoidal coordinate method we are able to analyze the details of ion populations and densities within the major and minor grooves and in the space surrounding DNA. The results show higher ion populations than have typically been observed in earlier studies and sequence effects that go beyond the nature of individual base pairs or base pair steps. We also show that, in some special cases, ion distributions converge very slowly and, on a microsecond timescale, do not reflect the symmetry of the corresponding base sequence. PMID:25662221
[Genome-scale sequence data processing and epigenetic analysis of DNA methylation].
Wang, Ting-Zhang; Shan, Gao; Xu, Jian-Hong; Xue, Qing-Zhong
2013-06-01
A new approach recently developed for detecting cytosine DNA methylation (mC) and analyzing the genome-scale DNA methylation profiling, is called BS-Seq which is based on bisulfite conversion of genomic DNA combined with next-generation sequencing. The method can not only provide an insight into the difference of genome-scale DNA methylation among different organisms, but also reveal the conservation of DNA methylation in all contexts and nucleotide preference for different genomic regions, including genes, exons, and repetitive DNA sequences. It will be helpful to under-stand the epigenetic impacts of cytosine DNA methylation on the regulation of gene expression and maintaining silence of repetitive sequences, such as transposable elements. In this paper, we introduce the preprocessing steps of DNA methylation data, by which cytosine (C) and guanine (G) in the reference sequence are transferred to thymine (T) and adenine (A), and cytosine in reads is transferred to thymine, respectively. We also comprehensively review the main content of the DNA methylation analysis on the genomic scale: (1) the cytosine methylation under the context of different sequences; (2) the distribution of genomic methylcytosine; (3) DNA methylation context and the preference for the nucleotides; (4) DNA- protein interaction sites of DNA methylation; (5) degree of methylation of cytosine in the different structural elements of genes. DNA methylation analysis technique provides a powerful tool for the epigenome study in human and other species, and genes and environment interaction, and founds the theoretical basis for further development of disease diagnostics and therapeutics in human.
Liang, Feng; Lindsay, Stuart; Zhang, Peiming
2012-11-21
With the aid of Density Functional Theory (DFT), we designed 1,8-naphthyridine-2,7-diamine as a recognition molecule to read DNA base pairs for genomic sequencing by electron tunneling. NMR studies show that it can form stable triplets with both A : T and G : C base pairs through hydrogen bonding. Our results suggest that the naphthyridine molecule should be able to function as a universal base pair reader in a tunneling gap, generating distinguishable signatures under electrical bias for each of DNA base pairs.
Liang, Feng; Lindsay, Stuart; Zhang, Peiming
2013-01-01
With the aid of Density Functional Theory (DFT), we designed 1,8-naphthyridine-2,7-diamine as a recognition molecule to read the DNA base pairs for genomic sequencing by electron tunneling. NMR studies show that it can form stable triplets with both A:T and G:C base pairs through hydrogen bonding. Our results suggest that the naphthyridine molecule should be able to function as a universal base pair reader in a tunneling gap, generating distinguishable signatures under electrical bias for each of DNA base pairs. PMID:23038027
Mohd Bakhori, Noremylia; Yusof, Nor Azah; Abdullah, Abdul Halim; Hussein, Mohd Zobir
2013-01-01
An optical DNA biosensor based on fluorescence resonance energy transfer (FRET) utilizing synthesized quantum dot (QD) has been developed for the detection of specific-sequence of DNA for Ganoderma boninense, an oil palm pathogen. Modified QD that contained carboxylic groups was conjugated with a single-stranded DNA probe (ssDNA) via amide-linkage. Hybridization of the target DNA with conjugated QD-ssDNA and reporter probe labeled with Cy5 allows for the detection of related synthetic DNA sequence of Ganoderma boninense gene based on FRET signals. Detection of FRET emission before and after hybridization was confirmed through the capability of the system to produce FRET at 680 nm for hybridized sandwich with complementary target DNA. No FRET emission was observed for non-complementary system. Hybridization time, temperature and effect of different concentration of target DNA were studied in order to optimize the developed system. The developed biosensor has shown high sensitivity with detection limit of 3.55 × 10−9 M. TEM results show that the particle size of QD varies in the range between 5 to 8 nm after ligand modification and conjugation with ssDNA. This approach is capable of providing a simple, rapid and sensitive method for detection of related synthetic DNA sequence of Ganoderma boninense. PMID:25587406
A novel model for DNA sequence similarity analysis based on graph theory.
Qi, Xingqin; Wu, Qin; Zhang, Yusen; Fuller, Eddie; Zhang, Cun-Quan
2011-01-01
Determination of sequence similarity is one of the major steps in computational phylogenetic studies. As we know, during evolutionary history, not only DNA mutations for individual nucleotide but also subsequent rearrangements occurred. It has been one of major tasks of computational biologists to develop novel mathematical descriptors for similarity analysis such that various mutation phenomena information would be involved simultaneously. In this paper, different from traditional methods (eg, nucleotide frequency, geometric representations) as bases for construction of mathematical descriptors, we construct novel mathematical descriptors based on graph theory. In particular, for each DNA sequence, we will set up a weighted directed graph. The adjacency matrix of the directed graph will be used to induce a representative vector for DNA sequence. This new approach measures similarity based on both ordering and frequency of nucleotides so that much more information is involved. As an application, the method is tested on a set of 0.9-kb mtDNA sequences of twelve different primate species. All output phylogenetic trees with various distance estimations have the same topology, and are generally consistent with the reported results from early studies, which proves the new method's efficiency; we also test the new method on a simulated data set, which shows our new method performs better than traditional global alignment method when subsequent rearrangements happen frequently during evolutionary history.
Uchoi, Ajit; Malik, Surendra Kumar; Choudhary, Ravish; Kumar, Susheel; Rohini, M R; Pal, Digvender; Ercisli, Sezai; Chaudhury, Rekha
2016-06-01
Phylogenetic relationships of Indian Citron (Citrus medica L.) with other important Citrus species have been inferred through sequence analyses of rbcL and matK gene region of chloroplast DNA. The study was based on 23 accessions of Citrus genotypes representing 15 taxa of Indian Citrus, collected from wild, semi-wild, and domesticated stocks. The phylogeny was inferred using the maximum parsimony (MP) and neighbor-joining (NJ) methods. Both MP and NJ trees separated all the 23 accessions of Citrus into five distinct clusters. The chloroplast DNA (cpDNA) analysis based on rbcL and matK sequence data carried out in Indian taxa of Citrus was useful in differentiating all the true species and species/varieties of probable hybrid origin in distinct clusters or groups. Sequence analysis based on rbcL and matK gene provided unambiguous identification and disposition of true species like C. maxima, C. medica, C. reticulata, and related hybrids/cultivars. The separation of C. maxima, C. medica, and C. reticulata in distinct clusters or sub-clusters supports their distinctiveness as the basic species of edible Citrus. However, the cpDNA sequence analysis of rbcL and matK gene could not find any clear cut differentiation between subgenera Citrus and Papeda as proposed in Swingle's system of classification.
Carpenter, Meredith L.; Buenrostro, Jason D.; Valdiosera, Cristina; Schroeder, Hannes; Allentoft, Morten E.; Sikora, Martin; Rasmussen, Morten; Gravel, Simon; Guillén, Sonia; Nekhrizov, Georgi; Leshtakov, Krasimir; Dimitrova, Diana; Theodossiev, Nikola; Pettener, Davide; Luiselli, Donata; Sandoval, Karla; Moreno-Estrada, Andrés; Li, Yingrui; Wang, Jun; Gilbert, M. Thomas P.; Willerslev, Eske; Greenleaf, William J.; Bustamante, Carlos D.
2013-01-01
Most ancient specimens contain very low levels of endogenous DNA, precluding the shotgun sequencing of many interesting samples because of cost. Ancient DNA (aDNA) libraries often contain <1% endogenous DNA, with the majority of sequencing capacity taken up by environmental DNA. Here we present a capture-based method for enriching the endogenous component of aDNA sequencing libraries. By using biotinylated RNA baits transcribed from genomic DNA libraries, we are able to capture DNA fragments from across the human genome. We demonstrate this method on libraries created from four Iron Age and Bronze Age human teeth from Bulgaria, as well as bone samples from seven Peruvian mummies and a Bronze Age hair sample from Denmark. Prior to capture, shotgun sequencing of these libraries yielded an average of 1.2% of reads mapping to the human genome (including duplicates). After capture, this fraction increased substantially, with up to 59% of reads mapped to human and enrichment ranging from 6- to 159-fold. Furthermore, we maintained coverage of the majority of regions sequenced in the precapture library. Intersection with the 1000 Genomes Project reference panel yielded an average of 50,723 SNPs (range 3,062–147,243) for the postcapture libraries sequenced with 1 million reads, compared with 13,280 SNPs (range 217–73,266) for the precapture libraries, increasing resolution in population genetic analyses. Our whole-genome capture approach makes it less costly to sequence aDNA from specimens containing very low levels of endogenous DNA, enabling the analysis of larger numbers of samples. PMID:24568772
DNA sequence alignment by microhomology sampling during homologous recombination
Qi, Zhi; Redding, Sy; Lee, Ja Yil; Gibb, Bryan; Kwon, YoungHo; Niu, Hengyao; Gaines, William A.; Sung, Patrick
2015-01-01
Summary Homologous recombination (HR) mediates the exchange of genetic information between sister or homologous chromatids. During HR, members of the RecA/Rad51 family of recombinases must somehow search through vast quantities of DNA sequence to align and pair ssDNA with a homologous dsDNA template. Here we use single-molecule imaging to visualize Rad51 as it aligns and pairs homologous DNA sequences in real-time. We show that Rad51 uses a length-based recognition mechanism while interrogating dsDNA, enabling robust kinetic selection of 8-nucleotide (nt) tracts of microhomology, which kinetically confines the search to sites with a high probability of being a homologous target. Successful pairing with a 9th nucleotide coincides with an additional reduction in binding free energy and subsequent strand exchange occurs in precise 3-nt steps, reflecting the base triplet organization of the presynaptic complex. These findings provide crucial new insights into the physical and evolutionary underpinnings of DNA recombination. PMID:25684365
Crystal structure of MboIIA methyltransferase.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Osipiuk, J.; Walsh, M. A.; Joachimiak, A.
2003-09-15
DNA methyltransferases (MTases) are sequence-specific enzymes which transfer a methyl group from S-adenosyl-L-methionine (AdoMet) to the amino group of either cytosine or adenine within a recognized DNA sequence. Methylation of a base in a specific DNA sequence protects DNA from nucleolytic cleavage by restriction enzymes recognizing the same DNA sequence. We have determined at 1.74 {angstrom} resolution the crystal structure of a {beta}-class DNA MTase MboIIA (M {center_dot} MboIIA) from the bacterium Moraxella bovis, the smallest DNA MTase determined to date. M {center_dot} MboIIA methylates the 3' adenine of the pentanucleotide sequence 5'-GAAGA-3'. The protein crystallizes with two molecules inmore » the asymmetric unit which we propose to resemble the dimer when M {center_dot} MboIIA is not bound to DNA. The overall structure of the enzyme closely resembles that of M {center_dot} RsrI. However, the cofactor-binding pocket in M {center_dot} MboIIA forms a closed structure which is in contrast to the open-form structures of other known MTases.« less
DNA/RNA transverse current sequencing: intrinsic structural noise from neighboring bases
Alvarez, Jose R.; Skachkov, Dmitry; Massey, Steven E.; Kalitsov, Alan; Velev, Julian P.
2015-01-01
Nanopore DNA sequencing via transverse current has emerged as a promising candidate for third-generation sequencing technology. It produces long read lengths which could alleviate problems with assembly errors inherent in current technologies. However, the high error rates of nanopore sequencing have to be addressed. A very important source of the error is the intrinsic noise in the current arising from carrier dispersion along the chain of the molecule, i.e., from the influence of neighboring bases. In this work we perform calculations of the transverse current within an effective multi-orbital tight-binding model derived from first-principles calculations of the DNA/RNA molecules, to study the effect of this structural noise on the error rates in DNA/RNA sequencing via transverse current in nanopores. We demonstrate that a statistical technique, utilizing not only the currents through the nucleotides but also the correlations in the currents, can in principle reduce the error rate below any desired precision. PMID:26150827
Arduino-based automation of a DNA extraction system.
Kim, Kyung-Won; Lee, Mi-So; Ryu, Mun-Ho; Kim, Jong-Won
2015-01-01
There have been many studies to detect infectious diseases with the molecular genetic method. This study presents an automation process for a DNA extraction system based on microfluidics and magnetic bead, which is part of a portable molecular genetic test system. This DNA extraction system consists of a cartridge with chambers, syringes, four linear stepper actuators, and a rotary stepper actuator. The actuators provide a sequence of steps in the DNA extraction process, such as transporting, mixing, and washing for the gene specimen, magnetic bead, and reagent solutions. The proposed automation system consists of a PC-based host application and an Arduino-based controller. The host application compiles a G code sequence file and interfaces with the controller to execute the compiled sequence. The controller executes stepper motor axis motion, time delay, and input-output manipulation. It drives the stepper motor with an open library, which provides a smooth linear acceleration profile. The controller also provides a homing sequence to establish the motor's reference position, and hard limit checking to prevent any over-travelling. The proposed system was implemented and its functionality was investigated, especially regarding positioning accuracy and velocity profile.
Jalili, Seifollah; Karami, Leila; Schofield, Jeremy
2013-06-01
Proline-rich homeodomain (PRH) is a regulatory protein controlling transcription and gene expression processes by binding to the specific sequence of DNA, especially to the sequence 5'-TAATNN-3'. The impact of base pair mutations on the binding between the PRH protein and DNA is investigated using molecular dynamics and free energy simulations to identify DNA sequences that form stable complexes with PRH. Three 20-ns molecular dynamics simulations (PRH-TAATTG, PRH-TAATTA and PRH-TAATGG complexes) in explicit solvent water were performed to investigate three complexes structurally. Structural analysis shows that the native TAATTG sequence forms a complex that is more stable than complexes with base pair mutations. It is also observed that upon mutation, the number and occupancy of the direct and water-mediated hydrogen bonds decrease. Free energy calculations performed with the thermodynamic integration method predict relative binding free energies of 0.64 and 2 kcal/mol for GC to AT and TA to GC mutations, respectively, suggesting that among the three DNA sequences, the PRH-TAATTG complex is more stable than the two mutated complexes. In addition, it is demonstrated that the stability of the PRH-TAATTA complex is greater than that of the PRH-TAATGG complex.
Chaotic Image Encryption Algorithm Based on Bit Permutation and Dynamic DNA Encoding.
Zhang, Xuncai; Han, Feng; Niu, Ying
2017-01-01
With the help of the fact that chaos is sensitive to initial conditions and pseudorandomness, combined with the spatial configurations in the DNA molecule's inherent and unique information processing ability, a novel image encryption algorithm based on bit permutation and dynamic DNA encoding is proposed here. The algorithm first uses Keccak to calculate the hash value for a given DNA sequence as the initial value of a chaotic map; second, it uses a chaotic sequence to scramble the image pixel locations, and the butterfly network is used to implement the bit permutation. Then, the image is coded into a DNA matrix dynamic, and an algebraic operation is performed with the DNA sequence to realize the substitution of the pixels, which further improves the security of the encryption. Finally, the confusion and diffusion properties of the algorithm are further enhanced by the operation of the DNA sequence and the ciphertext feedback. The results of the experiment and security analysis show that the algorithm not only has a large key space and strong sensitivity to the key but can also effectively resist attack operations such as statistical analysis and exhaustive analysis.
Chaotic Image Encryption Algorithm Based on Bit Permutation and Dynamic DNA Encoding
2017-01-01
With the help of the fact that chaos is sensitive to initial conditions and pseudorandomness, combined with the spatial configurations in the DNA molecule's inherent and unique information processing ability, a novel image encryption algorithm based on bit permutation and dynamic DNA encoding is proposed here. The algorithm first uses Keccak to calculate the hash value for a given DNA sequence as the initial value of a chaotic map; second, it uses a chaotic sequence to scramble the image pixel locations, and the butterfly network is used to implement the bit permutation. Then, the image is coded into a DNA matrix dynamic, and an algebraic operation is performed with the DNA sequence to realize the substitution of the pixels, which further improves the security of the encryption. Finally, the confusion and diffusion properties of the algorithm are further enhanced by the operation of the DNA sequence and the ciphertext feedback. The results of the experiment and security analysis show that the algorithm not only has a large key space and strong sensitivity to the key but can also effectively resist attack operations such as statistical analysis and exhaustive analysis. PMID:28912802
Greenberg, Jay R.; Perry, Robert P.
1971-01-01
The relationship of the DNA sequences from which polyribosomal messenger RNA (mRNA) and heterogeneous nuclear RNA (NRNA) of mouse L cells are transcribed was investigated by means of hybridization kinetics and thermal denaturation of the hybrids. Hybridization was performed in formamide solutions at DNA excess. Under these conditions most of the hybridizing mRNA and NRNA react at values of Dot (DNA concentration multiplied by time) expected for RNA transcribed from the nonrepeated or rarely repeated fraction of the genome. However, a fraction of both mRNA and NRNA hybridize at values of Dot about 10,000 times lower, and therefore must be transcribed from highly redundant DNA sequences. The fraction of NRNA hybridizing to highly repeated sequences is about 1.7 times greater than the corresponding fraction of mRNA. The hybrids formed by the rapidly reacting fractions of both NRNA and mRNA melt over a narrow temperature range with a midpoint about 11°C below that of native L cell DNA. This indicates that these hybrids consist of partially complementary sequences with approximately 11% mismatching of bases. Hybrids formed by the slowly reacting fraction of NRNA melt within 4°–6°C of native DNA, indicating very little, if any, mismatching of bases. Hybrids of the slowly reacting components of mRNA, formed under conditions of sufficiently low RNA input, have a high thermal stability, similar to that observed for hybrids of the slowly reacting NRNA component. However, when higher inputs of mRNA are used, hybrids are formed which have a strikingly lower thermal stability. This observation can be explained by assuming that there is sufficient similarity among the relatively rare DNA sequences coding for mRNA so that under hybridization conditions, in which these DNA sequences are not truly in excess, reversible hybrids exhibiting a considerable amount of mispairing are formed. The fact that a comparable phenomenon has not been observed for NRNA may mean that there is less similarity among the relatively rare DNA sequences coding for NRNA than there is among the rare sequences coding for mRNA. PMID:4999767
Structural mechanics of DNA wrapping in the nucleosome.
Battistini, Federica; Hunter, Christopher A; Gardiner, Eleanor J; Packer, Martin J
2010-02-19
Experimental X-ray crystal structures and a database of calculated structural parameters of DNA octamers were used in combination to analyse the mechanics of DNA bending in the nucleosome core complex. The 1kx5 X-ray crystal structure of the nucleosome core complex was used to determine the relationship between local structure at the base-step level and the global superhelical conformation observed for nucleosome-bound DNA. The superhelix is characterised by a large curvature (597 degrees) in one plane and very little curvature (10 degrees) in the orthogonal plane. Analysis of the curvature at the level of 10-step segments shows that there is a uniform curvature of 30 degrees per helical turn throughout most of the structure but that there are two sharper kinks of 50 degrees at +/-2 helical turns from the central dyad base pair. The curvature is due almost entirely to the base-step parameter roll. There are large periodic variations in roll, which are in phase with the helical twist and account for 500 degrees of the total curvature. Although variations in the other base-step parameters perturb the local path of the DNA, they make minimal contributions to the total curvature. This implies that DNA bending in the nucleosome is achieved using the roll-slide-twist degree of freedom previously identified as the major degree of freedom in naked DNA oligomers. The energetics of bending into a nucleosome-bound conformation were therefore analysed using a database of structural parameters that we have previously developed for naked DNA oligomers. The minimum energy roll, the roll flexibility force constant and the maximum and minimum accessible roll values were obtained for each base step in the relevant octanucleotide context to account for the effects of conformational coupling that vary with sequence context. The distribution of base-step roll values and corresponding strain energy required to bend DNA into the nucleosome-bound conformation defined by the 1kx5 structure were obtained by applying a constant bending moment. When a single bending moment was applied to the entire sequence, the local details of the calculated structure did not match the experiment. However, when local 10-step bending moments were applied separately, the calculated structure showed excellent agreement with experiment. This implies that the protein applies variable bending forces along the DNA to maintain the superhelical path required for nucleosome wrapping. In particular, the 50 degrees kinks are constraints imposed by the protein rather than a feature of the 1kx5 DNA sequence. The kinks coincide with a relatively flexible region of the sequence, and this is probably a prerequisite for high-affinity nucleosome binding, but the bending strain energy is significantly higher at these points than for the rest of the sequence. In the most rigid regions of the sequence, a higher strain energy is also required to achieve the standard 30 degrees curvature per helical turn. We conclude that matching of the DNA sequence to the local roll periodicity required to achieve bending, together with the increased flexibility required at the kinks, determines the sequence selectivity of DNA wrapping in the nucleosome. 2009 Elsevier Ltd. All rights reserved.
Normand, A C; Packeu, A; Cassagne, C; Hendrickx, M; Ranque, S; Piarroux, R
2018-05-01
Conventional dermatophyte identification is based on morphological features. However, recent studies have proposed to use the nucleotide sequences of the rRNA internal transcribed spacer (ITS) region as an identification barcode of all fungi, including dermatophytes. Several nucleotide databases are available to compare sequences and thus identify isolates; however, these databases often contain mislabeled sequences that impair sequence-based identification. We evaluated five of these databases on a clinical isolate panel. We selected 292 clinical dermatophyte strains that were prospectively subjected to an ITS2 nucleotide sequence analysis. Sequences were analyzed against the databases, and the results were compared to clusters obtained via DNA alignment of sequence segments. The DNA tree served as the identification standard throughout the study. According to the ITS2 sequence identification, the majority of strains (255/292) belonged to the genus Trichophyton , mainly T. rubrum complex ( n = 184), T. interdigitale ( n = 40), T. tonsurans ( n = 26), and T. benhamiae ( n = 5). Other genera included Microsporum (e.g., M. canis [ n = 21], M. audouinii [ n = 10], Nannizzia gypsea [ n = 3], and Epidermophyton [ n = 3]). Species-level identification of T. rubrum complex isolates was an issue. Overall, ITS DNA sequencing is a reliable tool to identify dermatophyte species given that a comprehensive and correctly labeled database is consulted. Since many inaccurate identification results exist in the DNA databases used for this study, reference databases must be verified frequently and amended in line with the current revisions of fungal taxonomy. Before describing a new species or adding a new DNA reference to the available databases, its position in the phylogenetic tree must be verified. Copyright © 2018 American Society for Microbiology.
Global DNA methylation analysis using methyl-sensitive amplification polymorphism (MSAP).
Yaish, Mahmoud W; Peng, Mingsheng; Rothstein, Steven J
2014-01-01
DNA methylation is a crucial epigenetic process which helps control gene transcription activity in eukaryotes. Information regarding the methylation status of a regulatory sequence of a particular gene provides important knowledge of this transcriptional control. DNA methylation can be detected using several methods, including sodium bisulfite sequencing and restriction digestion using methylation-sensitive endonucleases. Methyl-Sensitive Amplification Polymorphism (MSAP) is a technique used to study the global DNA methylation status of an organism and hence to distinguish between two individuals based on the DNA methylation status determined by the differential digestion pattern. Therefore, this technique is a useful method for DNA methylation mapping and positional cloning of differentially methylated genes. In this technique, genomic DNA is first digested with a methylation-sensitive restriction enzyme such as HpaII, and then the DNA fragments are ligated to adaptors in order to facilitate their amplification. Digestion using a methylation-insensitive isoschizomer of HpaII, MspI is used in a parallel digestion reaction as a loading control in the experiment. Subsequently, these fragments are selectively amplified by fluorescently labeled primers. PCR products from different individuals are compared, and once an interesting polymorphic locus is recognized, the desired DNA fragment can be isolated from a denaturing polyacrylamide gel, sequenced and identified based on DNA sequence similarity to other sequences available in the database. We will use analysis of met1, ddm1, and atmbd9 mutants and wild-type plants treated with a cytidine analogue, 5-azaC, or zebularine to demonstrate how to assess the genetic modulation of DNA methylation in Arabidopsis. It should be noted that despite the fact that MSAP is a reliable technique used to fish for polymorphic methylated loci, its power is limited to the restriction recognition sites of the enzymes used in the genomic DNA digestion.
Varela, Eduardo S; Lima, João P M S; Galdino, Alexsandro S; Pinto, Luciano da S; Bezerra, Walderly M; Nunes, Edson P; Alves, Maria A O; Grangeiro, Thalles B
2004-01-01
The complete sequences of nuclear ribosomal DNA (nrDNA) internal transcribed spacer regions (ITS/5.8S) were determined for species belonging to six genera from the subtribe Diocleinae as well as for the anomalous genera Calopogonium and Pachyrhizus. Phylogenetic trees constructed by distance matrix, maximum parsimony and maximum likelihood methods showed that Calopogonium and Pachyrhizus were outside the clade Diocleinae (Canavalia, Camptosema, Cratylia, Dioclea, Cymbosema, and Galactia). This finding supports previous morphological, phytochemical, and molecular evidence that Calopogonium and Pachyrhizus do not belong to the subtribe Diocleinae. Within the true Diocleinae clade, the clustering of genera and species were congruent with morphology-based classifications, suggesting that ITS/5.8S sequences can provide enough informative sites to allow resolution below the genus level. This is the first evidence of the phylogeny of subtribe Diocleinae based on nuclear DNA sequences.
Xiong, Ai-Sheng; Yao, Quan-Hong; Peng, Ri-He; Li, Xian; Fan, Hui-Qin; Cheng, Zong-Ming; Li, Yi
2004-07-07
Chemical synthesis of DNA sequences provides a powerful tool for modifying genes and for studying gene function, structure and expression. Here, we report a simple, high-fidelity and cost-effective PCR-based two-step DNA synthesis (PTDS) method for synthesis of long segments of DNA. The method involves two steps. (i) Synthesis of individual fragments of the DNA of interest: ten to twelve 60mer oligonucleotides with 20 bp overlap are mixed and a PCR reaction is carried out with high-fidelity DNA polymerase Pfu to produce DNA fragments that are approximately 500 bp in length. (ii) Synthesis of the entire sequence of the DNA of interest: five to ten PCR products from the first step are combined and used as the template for a second PCR reaction using high-fidelity DNA polymerase pyrobest, with the two outermost oligonucleotides as primers. Compared with the previously published methods, the PTDS method is rapid (5-7 days) and suitable for synthesizing long segments of DNA (5-6 kb) with high G + C contents, repetitive sequences or complex secondary structures. Thus, the PTDS method provides an alternative tool for synthesizing and assembling long genes with complex structures. Using the newly developed PTDS method, we have successfully obtained several genes of interest with sizes ranging from 1.0 to 5.4 kb.
Charge transport and ac response under light illumination in gate-modulated DNA molecular junctions.
Zhang, Yan; Zhu, Wen-Huan; Ding, Guo-Hui; Dong, Bing; Wang, Xue-Feng
2015-05-22
Using a two-strand tight-binding model and within nonequilibrium Green's function approach, we study charge transport through DNA sequences (GC)NGC and (GC)1(TA)NTA (GC)3 sandwiched between two Pt electrodes. We show that at low temperature DNA sequence (GC)NGC exhibits coherent charge carrier transport at very small bias, since the highest occupied molecular orbital in the GC base pair can be aligned with the Fermi energy of the metallic electrodes by a gate voltage. A weak distance dependent conductance is found in DNA sequence (GC)1(TA)NTA (GC)3 with large NTA. Different from the mechanism of thermally induced hopping of charges proposed by the previous experiments, we find that this phenomenon is dominated by quantum tunnelling through discrete quantum well states in the TA base pairs. In addition, ac response of this DNA junction under light illumination is also investigated. The suppression of ac conductances of the left and right lead of DNA sequences at some particular frequencies is attributed to the excitation of electrons in the DNA to the lead Fermi surface by ac potential, or the excitation of electrons in deep DNA energy levels to partially occupied energy levels in the transport window. Therefore, measuring ac response of DNA junctions can reveal a wealth of information about the intrinsic dynamics of DNA molecules.
The phylogenetic relationship of Alexandrium monilatum to other Alexandrium spp. was explored using 18S rDNA sequences. Maximum likelilhood phylogenetic analysis of the combined rDNA sequences established that A. monilatum paired with Alexandrium taylori and that the pair was the...
USDA-ARS?s Scientific Manuscript database
High-throughput next-generation sequencing was used to scan the genome and generate reliable sequence of high copy number regions. Using this method, we examined whole plastid genomes as well as nearly 6000 bases of nuclear ribosomal DNA sequences for nine genotypes of Theobroma cacao and an indivi...
The phylogenetic relationship of Alexandrium monilatum to other Alexandrium spp. was explored using 18S rDNA sequences. Maximum likelihood phylogenetic analysis of the combined rDNA sequences established that A. monilatum paired with Alexandrium taylori and that the pair was the ...
Quantitative analysis and prediction of G-quadruplex forming sequences in double-stranded DNA
Kim, Minji; Kreig, Alex; Lee, Chun-Ying; Rube, H. Tomas; Calvert, Jacob; Song, Jun S.; Myong, Sua
2016-01-01
Abstract G-quadruplex (GQ) is a four-stranded DNA structure that can be formed in guanine-rich sequences. GQ structures have been proposed to regulate diverse biological processes including transcription, replication, translation and telomere maintenance. Recent studies have demonstrated the existence of GQ DNA in live mammalian cells and a significant number of potential GQ forming sequences in the human genome. We present a systematic and quantitative analysis of GQ folding propensity on a large set of 438 GQ forming sequences in double-stranded DNA by integrating fluorescence measurement, single-molecule imaging and computational modeling. We find that short minimum loop length and the thymine base are two main factors that lead to high GQ folding propensity. Linear and Gaussian process regression models further validate that the GQ folding potential can be predicted with high accuracy based on the loop length distribution and the nucleotide content of the loop sequences. Our study provides important new parameters that can inform the evaluation and classification of putative GQ sequences in the human genome. PMID:27095201
Kumar Khanna, Vinod
2007-01-01
The current status and research trends of detection techniques for DNA-based analysis such as DNA finger printing, sequencing, biochips and allied fields are examined. An overview of main detectors is presented vis-à-vis these DNA operations. The biochip method is explained, the role of micro- and nanoelectronic technologies in biochip realization is highlighted, various optical and electrical detection principles employed in biochips are indicated, and the operational mechanisms of these detection devices are described. Although a diversity of biochips for diagnostic and therapeutic applications has been demonstrated in research laboratories worldwide, only some of these chips have entered the clinical market, and more chips are awaiting commercialization. The necessity of tagging is eliminated in refractive-index change based devices, but the basic flaw of indirect nature of most detection methodologies can only be overcome by generic and/or reagentless DNA sensors such as the conductance-based approach and the DNA-single electron transistor (DNA-SET) structure. Devices of the electrical detection-based category are expected to pave the pathway for the next-generation DNA chips. The review provides a comprehensive coverage of the detection technologies for DNA finger printing, sequencing and related techniques, encompassing a variety of methods from the primitive art to the state-of-the-art scenario as well as promising methods for the future.
Nanopore-based fourth-generation DNA sequencing technology.
Feng, Yanxiao; Zhang, Yuechuan; Ying, Cuifeng; Wang, Deqiang; Du, Chunlei
2015-02-01
Nanopore-based sequencers, as the fourth-generation DNA sequencing technology, have the potential to quickly and reliably sequence the entire human genome for less than $1000, and possibly for even less than $100. The single-molecule techniques used by this technology allow us to further study the interaction between DNA and protein, as well as between protein and protein. Nanopore analysis opens a new door to molecular biology investigation at the single-molecule scale. In this article, we have reviewed academic achievements in nanopore technology from the past as well as the latest advances, including both biological and solid-state nanopores, and discussed their recent and potential applications. Copyright © 2015 The Authors. Production and hosting by Elsevier Ltd.. All rights reserved.
Synthesis and DNA interaction of a mixed proflavine-phenanthroline Tröger base.
Baldeyrou, Brigitte; Tardy, Christelle; Bailly, Christian; Colson, Pierre; Houssier, Claude; Charmantray, Franck; Demeunynck, Martine
2002-04-01
We report the synthesis of an asymmetric Tröger base containing the two well characterised DNA binding chromophores, proflavine and phenanthroline. The mode of interaction of the hybrid molecule was investigated by circular and linear dichroism experiments and a biochemical assay using DNA topoisomerase I. The data are compatible with a model in which the proflavine moiety intercalates between DNA base pairs and the phenanthroline ring occupies the DNA groove. DNase I cleavage experiments were carried out to investigate the sequence preference of the hybrid ligand and a well resolved footprint was detected at a site encompassing two adjacent 5'-GTC.5-GAC triplets. The sequence preference of the asymmetric molecule is compared to that of the symmetric analogues.
Plasmonic SERS nanochips and nanoprobes for medical diagnostics and bio-energy applications
NASA Astrophysics Data System (ADS)
Ngo, Hoan T.; Wang, Hsin-Neng; Crawford, Bridget M.; Fales, Andrew M.; Vo-Dinh, Tuan
2017-02-01
The development of rapid, easy-to-use, cost-effective, high accuracy, and high sensitive DNA detection methods for molecular diagnostics has been receiving increasing interest. Over the last five years, our laboratory has developed several chip-based DNA detection techniques including the molecular sentinel-on-chip (MSC), the multiplex MSC, and the inverse molecular sentinel-on-chip (iMS-on-Chip). In these techniques, plasmonic surface-enhanced Raman scattering (SERS) Nanowave chips were functionalized with DNA probes for single-step DNA detection. Sensing mechanisms were based on hybridization of target sequences and DNA probes, resulting in a distance change between SERS reporters and the Nanowave chip's gold surface. This distance change resulted in change in SERS intensity, thus indicating the presence and capture of the target sequences. Our techniques were single-step DNA detection techniques. Target sequences were detected by simple delivery of sample solutions onto DNA probe-functionalized Nanowave chips and SERS signals were measured after 1h - 2h incubation. Target sequence labeling or washing to remove unreacted components was not required, making the techniques simple, easy-to-use, and cost effective. The usefulness of the techniques for medical diagnostics was illustrated by the detection of genetic biomarkers for respiratory viral infection and of dengue virus 4 DNA.
Sequence of retrovirus provirus resembles that of bacterial transposable elements
NASA Astrophysics Data System (ADS)
Shimotohno, Kunitada; Mizutani, Satoshi; Temin, Howard M.
1980-06-01
The nucleotide sequences of the terminal regions of an infectious integrated retrovirus cloned in the modified λ phage cloning vector Charon 4A have been elucidated. There is a 569-base pair direct repeat at both ends of the viral DNA. The cell-virus junctions at each end consist of a 5-base pair direct repeat of cell DNA next to a 3-base pair inverted repeat of viral DNA. This structure resembles that of a transposable element and is consistent with the protovirus hypothesis that retroviruses evolved from the cell genome.
Applications of statistical physics and information theory to the analysis of DNA sequences
NASA Astrophysics Data System (ADS)
Grosse, Ivo
2000-10-01
DNA carries the genetic information of most living organisms, and the of genome projects is to uncover that genetic information. One basic task in the analysis of DNA sequences is the recognition of protein coding genes. Powerful computer programs for gene recognition have been developed, but most of them are based on statistical patterns that vary from species to species. In this thesis I address the question if there exist universal statistical patterns that are different in coding and noncoding DNA of all living species, regardless of their phylogenetic origin. In search for such species-independent patterns I study the mutual information function of genomic DNA sequences, and find that it shows persistent period-three oscillations. To understand the biological origin of the observed period-three oscillations, I compare the mutual information function of genomic DNA sequences to the mutual information function of stochastic model sequences. I find that the pseudo-exon model is able to reproduce the mutual information function of genomic DNA sequences. Moreover, I find that a generalization of the pseudo-exon model can connect the existence and the functional form of long-range correlations to the presence and the length distributions of coding and noncoding regions. Based on these theoretical studies I am able to find an information-theoretical quantity, the average mutual information (AMI), whose probability distributions are significantly different in coding and noncoding DNA, while they are almost identical in all studied species. These findings show that there exist universal statistical patterns that are different in coding and noncoding DNA of all studied species, and they suggest that the AMI may be used to identify genes in different living species, irrespective of their taxonomic origin.
Khrapko, Konstantin R [Moscow, RU; Khorlin, Alexandr A [Moscow, RU; Ivanov, Igor B [Moskovskaya, RU; Ershov, Gennady M [Moscow, RU; Lysov, Jury P [Moscow, RU; Florentiev, Vladimir L [Moscow, RU; Mirzabekov, Andrei D [Moscow, RU
1996-09-03
A method for sequencing DNA by hybridization that includes the following steps: forming an array of oligonucleotides at such concentrations that either ensure the same dissociation temperature for all fully complementary duplexes or allows hybridization and washing of such duplexes to be conducted at the same temperature; hybridizing said oligonucleotide array with labeled test DNA; washing in duplex dissociation conditions; identifying single-base substitutions in the test DNA by analyzing the distribution of the dissociation temperatures and reconstructing the DNA nucleotide sequence based on the above analysis. A device for carrying out the method comprises a solid substrate and a matrix rigidly bound to the substrate. The matrix contains the oligonucleotide array and consists of a multiplicity of gel portions. Each gel portion contains one oligonucleotide of desired length. The gel portions are separated from one another by interstices and have a thickness not exceeding 30 .mu.m.
Ishiguro, Naotaka; Inoshima, Yasuo; Yanai, Tokuma; Sasaki, Motoki; Matsui, Akira; Kikuchi, Hiroki; Maruyama, Masashi; Hongo, Hitomi; Vostretsov, Yuri E; Gasilin, Viatcheslav; Kosintsev, Pavel A; Quanjia, Chen; Chunxue, Wang
2016-02-01
The mitochondrial DNA (mtDNA) control region (198- to 598-bp) of four ancient Canis specimens (two Canis mandibles, a cranium, and a first phalanx) was examined, and each specimen was genetically identified as Japanese wolf. Two unique nucleotide substitutions, the 78-C insertion and the 482-G deletion, both of which are specific for Japanese wolf, were observed in each sample. Based on the mtDNA sequences analyzed, these four specimens and 10 additional Japanese wolf samples could be classified into two groups- Group A (10 samples) and Group B (4 samples)-which contain or lack an 8-bp insertion/deletion (indel), respectively. Interestingly, three dogs (Akita-b, Kishu 25, and S-husky 102) that each contained Japanese wolf-specific features were also classified into Group A or B based on the 8-bp indel. To determine the origin or ancestor of the Japanese wolf, mtDNA control regions of ancient continental Canis specimens were examined; 84 specimens were from Russia, and 29 were from China. However, none of these 113 specimens contained Japanese wolf-specific sequences. Moreover, none of 426 Japanese modern hunting dogs examined contained these Japanese wolf-specific mtDNA sequences. The mtDNA control region sequences of Groups A and B appeared to be unique to grey wolf and dog populations.
SAM: String-based sequence search algorithm for mitochondrial DNA database queries
Röck, Alexander; Irwin, Jodi; Dür, Arne; Parsons, Thomas; Parson, Walther
2011-01-01
The analysis of the haploid mitochondrial (mt) genome has numerous applications in forensic and population genetics, as well as in disease studies. Although mtDNA haplotypes are usually determined by sequencing, they are rarely reported as a nucleotide string. Traditionally they are presented in a difference-coded position-based format relative to the corrected version of the first sequenced mtDNA. This convention requires recommendations for standardized sequence alignment that is known to vary between scientific disciplines, even between laboratories. As a consequence, database searches that are vital for the interpretation of mtDNA data can suffer from biased results when query and database haplotypes are annotated differently. In the forensic context that would usually lead to underestimation of the absolute and relative frequencies. To address this issue we introduce SAM, a string-based search algorithm that converts query and database sequences to position-free nucleotide strings and thus eliminates the possibility that identical sequences will be missed in a database query. The mere application of a BLAST algorithm would not be a sufficient remedy as it uses a heuristic approach and does not address properties specific to mtDNA, such as phylogenetically stable but also rapidly evolving insertion and deletion events. The software presented here provides additional flexibility to incorporate phylogenetic data, site-specific mutation rates, and other biologically relevant information that would refine the interpretation of mitochondrial DNA data. The manuscript is accompanied by freeware and example data sets that can be used to evaluate the new software (http://stringvalidation.org). PMID:21056022
Sequence analysis of Leukemia DNA
NASA Astrophysics Data System (ADS)
Nacong, Nasria; Lusiyanti, Desy; Irawan, Muhammad. Isa
2018-03-01
Cancer is a very deadly disease, one of which is leukemia disease or better known as blood cancer. The cancer cell can be detected by taking DNA in laboratory test. This study focused on local alignment of leukemia and non leukemia data resulting from NCBI in the form of DNA sequences by using Smith-Waterman algorithm. SmithWaterman algorithm was invented by TF Smith and MS Waterman in 1981. These algorithms try to find as much as possible similarity of a pair of sequences, by giving a negative value to the unequal base pair (mismatch), and positive values on the same base pair (match). So that will obtain the maximum positive value as the end of the alignment, and the minimum value as the initial alignment. This study will use sequences of leukemia and 3 sequences of non leukemia.
Sequence specificity of single-stranded DNA-binding proteins: a novel DNA microarray approach
Morgan, Hugh P.; Estibeiro, Peter; Wear, Martin A.; Max, Klaas E.A.; Heinemann, Udo; Cubeddu, Liza; Gallagher, Maurice P.; Sadler, Peter J.; Walkinshaw, Malcolm D.
2007-01-01
We have developed a novel DNA microarray-based approach for identification of the sequence-specificity of single-stranded nucleic-acid-binding proteins (SNABPs). For verification, we have shown that the major cold shock protein (CspB) from Bacillus subtilis binds with high affinity to pyrimidine-rich sequences, with a binding preference for the consensus sequence, 5′-GTCTTTG/T-3′. The sequence was modelled onto the known structure of CspB and a cytosine-binding pocket was identified, which explains the strong preference for a cytosine base at position 3. This microarray method offers a rapid high-throughput approach for determining the specificity and strength of ss DNA–protein interactions. Further screening of this newly emerging family of transcription factors will help provide an insight into their cellular function. PMID:17488853
2010-01-01
Background Cryptic species complexes are common among anophelines. Previous phylogenetic analysis based on the complete mtDNA COI gene sequences detected paraphyly in the Neotropical malaria vector Anopheles marajoara. The "Folmer region" detects a single taxon using a 3% divergence threshold. Methods To test the paraphyletic hypothesis and examine the utility of the Folmer region, genealogical trees based on a concatenated (white + 3' COI sequences) dataset and pairwise differentiation of COI fragments were examined. The population structure and demographic history were based on partial COI sequences for 294 individuals from 14 localities in Amazonian Brazil. 109 individuals from 12 localities were sequenced for the nDNA white gene, and 57 individuals from 11 localities were sequenced for the ribosomal DNA (rDNA) internal transcribed spacer 2 (ITS2). Results Distinct A. marajoara lineages were detected by combined genealogical analysis and were also supported among COI haplotypes using a median joining network and AMOVA, with time since divergence during the Pleistocene (<100,000 ya). COI sequences at the 3' end were more variable, demonstrating significant pairwise differentiation (3.82%) compared to the more moderate 2.92% detected by the Folmer region. Lineage 1 was present in all localities, whereas lineage 2 was restricted mainly to the west. Mismatch distributions for both lineages were bimodal, likely due to multiple colonization events and spatial expansion (~798 - 81,045 ya). There appears to be gene flow within, not between lineages, and a partial barrier was detected near Rio Jari in Amapá state, separating western and eastern populations. In contrast, both nDNA data sets (white gene sequences with or without the retention of the 4th intron, and ITS2 sequences and length) detected a single A. marajoara lineage. Conclusions Strong support for combined data with significant differentiation detected in the COI and absent in the nDNA suggest that the divergence is recent, and detectable only by the faster evolving mtDNA. A within subgenus threshold of >2% may be more appropriate among sister taxa in cryptic anopheline complexes than the standard 3%. Differences in demographic history and climatic changes may have contributed to mtDNA lineage divergence in A. marajoara. PMID:20929572
Didelot, Audrey; Kotsopoulos, Steve K; Lupo, Audrey; Pekin, Deniz; Li, Xinyu; Atochin, Ivan; Srinivasan, Preethi; Zhong, Qun; Olson, Jeff; Link, Darren R; Laurent-Puig, Pierre; Blons, Hélène; Hutchison, J Brian; Taly, Valerie
2013-05-01
Assessment of DNA integrity and quantity remains a bottleneck for high-throughput molecular genotyping technologies, including next-generation sequencing. In particular, DNA extracted from paraffin-embedded tissues, a major potential source of tumor DNA, varies widely in quality, leading to unpredictable sequencing data. We describe a picoliter droplet-based digital PCR method that enables simultaneous detection of DNA integrity and the quantity of amplifiable DNA. Using a multiplex assay, we detected 4 different target lengths (78, 159, 197, and 550 bp). Assays were validated with human genomic DNA fragmented to sizes of 170 bp to 3000 bp. The technique was validated with DNA quantities as low as 1 ng. We evaluated 12 DNA samples extracted from paraffin-embedded lung adenocarcinoma tissues. One sample contained no amplifiable DNA. The fractions of amplifiable DNA for the 11 other samples were between 0.05% and 10.1% for 78-bp fragments and ≤1% for longer fragments. Four samples were chosen for enrichment and next-generation sequencing. The quality of the sequencing data was in agreement with the results of the DNA-integrity test. Specifically, DNA with low integrity yielded sequencing results with lower levels of coverage and uniformity and had higher levels of false-positive variants. The development of DNA-quality assays will enable researchers to downselect samples or process more DNA to achieve reliable genome sequencing with the highest possible efficiency of cost and effort, as well as minimize the waste of precious samples. © 2013 American Association for Clinical Chemistry.
Ávila-Arcos, María C.; Cappellini, Enrico; Romero-Navarro, J. Alberto; Wales, Nathan; Moreno-Mayar, J. Víctor; Rasmussen, Morten; Fordyce, Sarah L.; Montiel, Rafael; Vielle-Calzada, Jean-Philippe; Willerslev, Eske; Gilbert, M. Thomas P.
2011-01-01
The development of second-generation sequencing technologies has greatly benefitted the field of ancient DNA (aDNA). Its application can be further exploited by the use of targeted capture-enrichment methods to overcome restrictions posed by low endogenous and contaminating DNA in ancient samples. We tested the performance of Agilent's SureSelect and Mycroarray's MySelect in-solution capture systems on Illumina sequencing libraries built from ancient maize to identify key factors influencing aDNA capture experiments. High levels of clonality as well as the presence of multiple-copy sequences in the capture targets led to biases in the data regardless of the capture method. Neither method consistently outperformed the other in terms of average target enrichment, and no obvious difference was observed either when two tiling designs were compared. In addition to demonstrating the plausibility of capturing aDNA from ancient plant material, our results also enable us to provide useful recommendations for those planning targeted-sequencing on aDNA. PMID:22355593
Multiplexed Sequence Encoding: A Framework for DNA Communication
Zakeri, Bijan; Carr, Peter A.; Lu, Timothy K.
2016-01-01
Synthetic DNA has great propensity for efficiently and stably storing non-biological information. With DNA writing and reading technologies rapidly advancing, new applications for synthetic DNA are emerging in data storage and communication. Traditionally, DNA communication has focused on the encoding and transfer of complete sets of information. Here, we explore the use of DNA for the communication of short messages that are fragmented across multiple distinct DNA molecules. We identified three pivotal points in a communication—data encoding, data transfer & data extraction—and developed novel tools to enable communication via molecules of DNA. To address data encoding, we designed DNA-based individualized keyboards (iKeys) to convert plaintext into DNA, while reducing the occurrence of DNA homopolymers to improve synthesis and sequencing processes. To address data transfer, we implemented a secret-sharing system—Multiplexed Sequence Encoding (MuSE)—that conceals messages between multiple distinct DNA molecules, requiring a combination key to reveal messages. To address data extraction, we achieved the first instance of chromatogram patterning through multiplexed sequencing, thereby enabling a new method for data extraction. We envision these approaches will enable more widespread communication of information via DNA. PMID:27050646
HLA genotyping by next-generation sequencing of complementary DNA.
Segawa, Hidenobu; Kukita, Yoji; Kato, Kikuya
2017-11-28
Genotyping of the human leucocyte antigen (HLA) is indispensable for various medical treatments. However, unambiguous genotyping is technically challenging due to high polymorphism of the corresponding genomic region. Next-generation sequencing is changing the landscape of genotyping. In addition to high throughput of data, its additional advantage is that DNA templates are derived from single molecules, which is a strong merit for the phasing problem. Although most currently developed technologies use genomic DNA, use of cDNA could enable genotyping with reduced costs in data production and analysis. We thus developed an HLA genotyping system based on next-generation sequencing of cDNA. Each HLA gene was divided into 3 or 4 target regions subjected to PCR amplification and subsequent sequencing with Ion Torrent PGM. The sequence data were then subjected to an automated analysis. The principle of the analysis was to construct candidate sequences generated from all possible combinations of variable bases and arrange them in decreasing order of the number of reads. Upon collecting candidate sequences from all target regions, 2 haplotypes were usually assigned. Cases not assigned 2 haplotypes were forwarded to 4 additional processes: selection of candidate sequences applying more stringent criteria, removal of artificial haplotypes, selection of candidate sequences with a relaxed threshold for sequence matching, and countermeasure for incomplete sequences in the HLA database. The genotyping system was evaluated using 30 samples; the overall accuracy was 97.0% at the field 3 level and 98.3% at the G group level. With one sample, genotyping of DPB1 was not completed due to short read size. We then developed a method for complete sequencing of individual molecules of the DPB1 gene, using the molecular barcode technology. The performance of the automatic genotyping system was comparable to that of systems developed in previous studies. Thus, next-generation sequencing of cDNA is a viable option for HLA genotyping.
DNA microdevice for electrochemical detection of Escherichia coli 0157:H7 molecular markers.
Berganza, J; Olabarria, G; García, R; Verdoy, D; Rebollo, A; Arana, S
2007-04-15
An electrochemical DNA sensor based on the hybridization recognition of a single-stranded DNA (ssDNA) probe immobilized onto a gold electrode to its complementary ssDNA is presented. The DNA probe is bound on gold surface electrode by using self-assembled monolayer (SAM) technology. An optimized mixed SAM with a blocking molecule preventing the nonspecific adsorption on the electrode surface has been prepared. In this paper, a DNA biosensor is designed by means of the immobilization of a single stranded DNA probe on an electrochemical transducer surface to recognize specifically Escherichia coli (E. coli) 0157:H7 complementary target DNA sequence via cyclic voltammetry experiments. The 21 mer DNA probe including a C6 alkanethiol group at the 5' phosphate end has been synthesized to form the SAM onto the gold surface through the gold sulfur bond. The goal of this paper has been to design, characterise and optimise an electrochemical DNA sensor. In order to investigate the oligonucleotide probe immobilization and the hybridization detection, experiments with different concentration of DNA and mismatch sequences have been performed. This microdevice has demonstrated the suitability of oligonucleotide Self-assembled monolayers (SAMs) on gold as immobilization method. The DNA probes deposited on gold surface have been functional and able to detect changes in bases sequence in a 21-mer oligonucleotide.
Using Playing Cards to Simulate a Molecular Clock
ERIC Educational Resources Information Center
Westerling, Karin E.
2008-01-01
Changes in DNA base-repair may serve as an indicator of the time elapsed since divergence from a common ancestor. DNA sequences can now be analyzed. The simulation presented in this article allows students to observe the accumulation of changes in a randomly mutating sequence of playing cards. The cards are analogous to DNA nucleotide or protein…
USDA-ARS?s Scientific Manuscript database
DNA sequencing and other DNA-based methods, such as PCR, are now broadly used for detection and identification of bacterial foodborne pathogens. For the identification of foodborne bacterial pathogens, it is important to make taxonomic assignments to the species, or even subspecies level. Long-read ...
A laboratory information management system for DNA barcoding workflows.
Vu, Thuy Duong; Eberhardt, Ursula; Szöke, Szániszló; Groenewald, Marizeth; Robert, Vincent
2012-07-01
This paper presents a laboratory information management system for DNA sequences (LIMS) created and based on the needs of a DNA barcoding project at the CBS-KNAW Fungal Biodiversity Centre (Utrecht, the Netherlands). DNA barcoding is a global initiative for species identification through simple DNA sequence markers. We aim at generating barcode data for all strains (or specimens) included in the collection (currently ca. 80 k). The LIMS has been developed to better manage large amounts of sequence data and to keep track of the whole experimental procedure. The system has allowed us to classify strains more efficiently as the quality of sequence data has improved, and as a result, up-to-date taxonomic names have been given to strains and more accurate correlation analyses have been carried out.
Nair, Maya S; D'Mello, Samar; Pant, Rashmi; Poluri, Krishna Mohan
2017-05-01
Interactions of a natural stilbene compound, resveratrol with two DNA sequences containing AATT/TTAA segments have been studied. Resveratrol is found to interact with both the sequences. The mode of interaction has been studied using absorption, steady state fluorescence and circular dichroism spectroscopic techniques. UV-visible absorption and fluorescence studies provided the information regarding the binding constants and the stoichiometry of binding, whereas circular dichroism studies depicted the structural changes in DNA upon resveratrol binding. Our results evidenced that, though resveratrol showed similar affinity to both the sequences, the mode of interactions was different. The binding constants of resveratrol to AATT/TTAA sequences were found to be 7.55×10 5 M -1 and 5.42×10 5 M -1 respectively. Spectroscopic data evidenced for a groove binding interaction. Melting studies showed that the binding of resveratrol induces differential stability to the DNA sequences d(CGTTAACG) 2 and d(CGAATTCG) 2 . Fluorescence data showed a stoichiometry of 1:1 for d(CGAATTCG) 2 -resveratrol complex and 1:4 for d(CGTTAACG) 2 -resveratrol complex. Molecular docking studies demonstrated that resveratrol binds to the minor groove region of both the sequences to form stable complexes with varied atomic contacts to the DNA bases or backbone. Both the complexes are stabilized by hydrogen bond formation. Our results evidenced that modulation of DNA sequence within the same bases can greatly alter the binding geometry and stability of the complex upon binding to small molecule inhibitor compounds like resveratrol. Copyright © 2017 Elsevier B.V. All rights reserved.
Enzyme-free detection and quantification of double-stranded nucleic acids.
Feuillie, Cécile; Merheb, Maxime Mohamad; Gillet, Benjamin; Montagnac, Gilles; Hänni, Catherine; Daniel, Isabelle
2012-08-01
We have developed a fully enzyme-free SERRS hybridization assay for specific detection of double-stranded DNA sequences. Although all DNA detection methods ranging from PCR to high-throughput sequencing rely on enzymes, this method is unique for being totally non-enzymatic. The efficiency of enzymatic processes is affected by alterations, modifications, and/or quality of DNA. For instance, a limitation of most DNA polymerases is their inability to process DNA damaged by blocking lesions. As a result, enzymatic amplification and sequencing of degraded DNA often fail. In this study we succeeded in detecting and quantifying, within a mixture, relative amounts of closely related double-stranded DNA sequences from Rupicapra rupicapra (chamois) and Capra hircus (goat). The non-enzymatic SERRS assay presented here is the corner stone of a promising approach to overcome the failure of DNA polymerase when DNA is too degraded or when the concentration of polymerase inhibitors is too high. It is the first time double-stranded DNA has been detected with a truly non-enzymatic SERRS-based method. This non-enzymatic, inexpensive, rapid assay is therefore a breakthrough in nucleic acid detection.
Molecular Identification and Databases in Fusarium
USDA-ARS?s Scientific Manuscript database
DNA sequence-based methods for identifying pathogenic and mycotoxigenic Fusarium isolates have become the gold standard worldwide. Moreover, fusarial DNA sequence data are increasing rapidly in several web-accessible databases for comparative purposes. Unfortunately, the use of Basic Alignment Sea...
Ultra-low background DNA cloning system.
Goto, Kenta; Nagano, Yukio
2013-01-01
Yeast-based in vivo cloning is useful for cloning DNA fragments into plasmid vectors and is based on the ability of yeast to recombine the DNA fragments by homologous recombination. Although this method is efficient, it produces some by-products. We have developed an "ultra-low background DNA cloning system" on the basis of yeast-based in vivo cloning, by almost completely eliminating the generation of by-products and applying the method to commonly used Escherichia coli vectors, particularly those lacking yeast replication origins and carrying an ampicillin resistance gene (Amp(r)). First, we constructed a conversion cassette containing the DNA sequences in the following order: an Amp(r) 5' UTR (untranslated region) and coding region, an autonomous replication sequence and a centromere sequence from yeast, a TRP1 yeast selectable marker, and an Amp(r) 3' UTR. This cassette allowed conversion of the Amp(r)-containing vector into the yeast/E. coli shuttle vector through use of the Amp(r) sequence by homologous recombination. Furthermore, simultaneous transformation of the desired DNA fragment into yeast allowed cloning of this DNA fragment into the same vector. We rescued the plasmid vectors from all yeast transformants, and by-products containing the E. coli replication origin disappeared. Next, the rescued vectors were transformed into E. coli and the by-products containing the yeast replication origin disappeared. Thus, our method used yeast- and E. coli-specific "origins of replication" to eliminate the generation of by-products. Finally, we successfully cloned the DNA fragment into the vector with almost 100% efficiency.
Basic quantitative polymerase chain reaction using real-time fluorescence measurements.
Ares, Manuel
2014-10-01
This protocol uses quantitative polymerase chain reaction (qPCR) to measure the number of DNA molecules containing a specific contiguous sequence in a sample of interest (e.g., genomic DNA or cDNA generated by reverse transcription). The sample is subjected to fluorescence-based PCR amplification and, theoretically, during each cycle, two new duplex DNA molecules are produced for each duplex DNA molecule present in the sample. The progress of the reaction during PCR is evaluated by measuring the fluorescence of dsDNA-dye complexes in real time. In the early cycles, DNA duplication is not detected because inadequate amounts of DNA are made. At a certain threshold cycle, DNA-dye complexes double each cycle for 8-10 cycles, until the DNA concentration becomes so high and the primer concentration so low that the reassociation of the product strands blocks efficient synthesis of new DNA and the reaction plateaus. There are two types of measurements: (1) the relative change of the target sequence compared to a reference sequence and (2) the determination of molecule number in the starting sample. The first requires a reference sequence, and the second requires a sample of the target sequence with known numbers of the molecules of sequence to generate a standard curve. By identifying the threshold cycle at which a sample first begins to accumulate DNA-dye complexes exponentially, an estimation of the numbers of starting molecules in the sample can be extrapolated. © 2014 Cold Spring Harbor Laboratory Press.
Schadt, Eric E.; Banerjee, Onureena; Fang, Gang; Feng, Zhixing; Wong, Wing H.; Zhang, Xuegong; Kislyuk, Andrey; Clark, Tyson A.; Luong, Khai; Keren-Paz, Alona; Chess, Andrew; Kumar, Vipin; Chen-Plotkin, Alice; Sondheimer, Neal; Korlach, Jonas; Kasarskis, Andrew
2013-01-01
Current generation DNA sequencing instruments are moving closer to seamlessly sequencing genomes of entire populations as a routine part of scientific investigation. However, while significant inroads have been made identifying small nucleotide variation and structural variations in DNA that impact phenotypes of interest, progress has not been as dramatic regarding epigenetic changes and base-level damage to DNA, largely due to technological limitations in assaying all known and unknown types of modifications at genome scale. Recently, single-molecule real time (SMRT) sequencing has been reported to identify kinetic variation (KV) events that have been demonstrated to reflect epigenetic changes of every known type, providing a path forward for detecting base modifications as a routine part of sequencing. However, to date no statistical framework has been proposed to enhance the power to detect these events while also controlling for false-positive events. By modeling enzyme kinetics in the neighborhood of an arbitrary location in a genomic region of interest as a conditional random field, we provide a statistical framework for incorporating kinetic information at a test position of interest as well as at neighboring sites that help enhance the power to detect KV events. The performance of this and related models is explored, with the best-performing model applied to plasmid DNA isolated from Escherichia coli and mitochondrial DNA isolated from human brain tissue. We highlight widespread kinetic variation events, some of which strongly associate with known modification events, while others represent putative chemically modified sites of unknown types. PMID:23093720
Schadt, Eric E; Banerjee, Onureena; Fang, Gang; Feng, Zhixing; Wong, Wing H; Zhang, Xuegong; Kislyuk, Andrey; Clark, Tyson A; Luong, Khai; Keren-Paz, Alona; Chess, Andrew; Kumar, Vipin; Chen-Plotkin, Alice; Sondheimer, Neal; Korlach, Jonas; Kasarskis, Andrew
2013-01-01
Current generation DNA sequencing instruments are moving closer to seamlessly sequencing genomes of entire populations as a routine part of scientific investigation. However, while significant inroads have been made identifying small nucleotide variation and structural variations in DNA that impact phenotypes of interest, progress has not been as dramatic regarding epigenetic changes and base-level damage to DNA, largely due to technological limitations in assaying all known and unknown types of modifications at genome scale. Recently, single-molecule real time (SMRT) sequencing has been reported to identify kinetic variation (KV) events that have been demonstrated to reflect epigenetic changes of every known type, providing a path forward for detecting base modifications as a routine part of sequencing. However, to date no statistical framework has been proposed to enhance the power to detect these events while also controlling for false-positive events. By modeling enzyme kinetics in the neighborhood of an arbitrary location in a genomic region of interest as a conditional random field, we provide a statistical framework for incorporating kinetic information at a test position of interest as well as at neighboring sites that help enhance the power to detect KV events. The performance of this and related models is explored, with the best-performing model applied to plasmid DNA isolated from Escherichia coli and mitochondrial DNA isolated from human brain tissue. We highlight widespread kinetic variation events, some of which strongly associate with known modification events, while others represent putative chemically modified sites of unknown types.
Evaluating the role of coherent delocalized phonon-like modes in DNA cyclization
Alexandrov, Ludmil B.; Rasmussen, Kim Ã.; Bishop, Alan R.; ...
2017-08-29
The innate flexibility of a DNA sequence is quantified by the Jacobson-Stockmayer’s J-factor, which measures the propensity for DNA loop formation. Recent studies of ultra-short DNA sequences revealed a discrepancy of up to six orders of magnitude between experimentally measured and theoretically predicted J-factors. These large differences suggest that, in addition to the elastic moduli of the double helix, other factors contribute to loop formation. We develop a new theoretical model that explores how coherent delocalized phonon-like modes in DNA provide single-stranded ”flexible hinges” to assist in loop formation. We also combine the Czapla-Swigon-Olson structural model of DNA with ourmore » extended Peyrard-Bishop-Dauxois model and, without changing any of the parameters of the two models, apply this new computational framework to 86 experimentally characterized DNA sequences. Our results demonstrate that the new computational framework can predict J-factors within an order of magnitude of experimental measurements for most ultra-short DNA sequences, while continuing to accurately describe the J-factors of longer sequences. Furthermore, we demonstrate that our computational framework can be used to describe the cyclization of DNA sequences that contain a base pair mismatch. Overall, our results support the conclusion that coherent delocalized phonon-like modes play an important role in DNA cyclization.« less
Evaluating the role of coherent delocalized phonon-like modes in DNA cyclization
DOE Office of Scientific and Technical Information (OSTI.GOV)
Alexandrov, Ludmil B.; Rasmussen, Kim Ã.; Bishop, Alan R.
The innate flexibility of a DNA sequence is quantified by the Jacobson-Stockmayer’s J-factor, which measures the propensity for DNA loop formation. Recent studies of ultra-short DNA sequences revealed a discrepancy of up to six orders of magnitude between experimentally measured and theoretically predicted J-factors. These large differences suggest that, in addition to the elastic moduli of the double helix, other factors contribute to loop formation. We develop a new theoretical model that explores how coherent delocalized phonon-like modes in DNA provide single-stranded ”flexible hinges” to assist in loop formation. We also combine the Czapla-Swigon-Olson structural model of DNA with ourmore » extended Peyrard-Bishop-Dauxois model and, without changing any of the parameters of the two models, apply this new computational framework to 86 experimentally characterized DNA sequences. Our results demonstrate that the new computational framework can predict J-factors within an order of magnitude of experimental measurements for most ultra-short DNA sequences, while continuing to accurately describe the J-factors of longer sequences. Furthermore, we demonstrate that our computational framework can be used to describe the cyclization of DNA sequences that contain a base pair mismatch. Overall, our results support the conclusion that coherent delocalized phonon-like modes play an important role in DNA cyclization.« less
Spreadsheet-based program for alignment of overlapping DNA sequences.
Anbazhagan, R; Gabrielson, E
1999-06-01
Molecular biology laboratories frequently face the challenge of aligning small overlapping DNA sequences derived from a long DNA segment. Here, we present a short program that can be used to adapt Excel spreadsheets as a tool for aligning DNA sequences, regardless of their orientation. The program runs on any Windows or Macintosh operating system computer with Excel 97 or Excel 98. The program is available for use as an Excel file, which can be downloaded from the BioTechniques Web site. Upon execution, the program opens a specially designed customized workbook and is capable of identifying overlapping regions between two sequence fragments and displaying the sequence alignment. It also performs a number of specialized functions such as recognition of restriction enzyme cutting sites and CpG island mapping without costly specialized software.
Yuko Ota; Mee-Sook Kim; Hitoshi Neda; Ned B. Klopfenstein; Eri Hasegawa
2011-01-01
An undetermined Armillaria species was collected on Amami-Oshima, a subtropical island of Japan. The phylogenetic position of the Armillaria sp. was determined using sequences of the elongation factor-1a (EF-1a) gene and the internal transcribed spacer (ITS) region (ITS1-5.8S-ITS2) of ribosomal DNA (rDNA). The phylogenetic analyses based on EF-1a and ITS sequences...
Cloning and sequence analysis of Hemonchus contortus HC58cDNA.
Muleke, Charles I; Ruofeng, Yan; Lixin, Xu; Xinwen, Bo; Xiangrui, Li
2007-06-01
The complete coding sequence of Hemonchus contortus HC58cDNA was generated by rapid amplification of cDNA ends and polymerase chain reaction using primers based on the 5' and 3' ends of the parasite mRNA, accession no. AF305964. The HC58cDNA gene was 851 bp long, with open reading frame of 717 bp, precursors to 239 amino acids coding for approximately 27 kDa protein. Analysis of amino acid sequence revealed conserved residues of cysteine, histidine, asparagine, occluding loop pattern, hemoglobinase motif and glutamine of the oxyanion hole characteristic of cathepsin B like proteases (CBL). Comparison of the predicted amino acid sequences showed the protein shared 33.5-58.7% identity to cathepsin B homologues in the papain clan CA family (family C1). Phylogenetic analysis revealed close evolutionary proximity of the protein sequence to counterpart sequences in the CBL, suggesting that HC58cDNA was a member of the papain family.
USDA-ARS?s Scientific Manuscript database
Reconstructing the phylogeny of Pyrus has been difficult due to the wide distribution of the genus and lack of informative data. In this study, we collected 110 accessions representing 25 Pyrus species and constructed both phylogenetic trees and phylogenetic networks based on multiple DNA sequence d...
A novel chaos-based image encryption algorithm using DNA sequence operations
NASA Astrophysics Data System (ADS)
Chai, Xiuli; Chen, Yiran; Broyde, Lucie
2017-01-01
An image encryption algorithm based on chaotic system and deoxyribonucleic acid (DNA) sequence operations is proposed in this paper. First, the plain image is encoded into a DNA matrix, and then a new wave-based permutation scheme is performed on it. The chaotic sequences produced by 2D Logistic chaotic map are employed for row circular permutation (RCP) and column circular permutation (CCP). Initial values and parameters of the chaotic system are calculated by the SHA 256 hash of the plain image and the given values. Then, a row-by-row image diffusion method at DNA level is applied. A key matrix generated from the chaotic map is used to fuse the confused DNA matrix; also the initial values and system parameters of the chaotic system are renewed by the hamming distance of the plain image. Finally, after decoding the diffused DNA matrix, we obtain the cipher image. The DNA encoding/decoding rules of the plain image and the key matrix are determined by the plain image. Experimental results and security analyses both confirm that the proposed algorithm has not only an excellent encryption result but also resists various typical attacks.
NASA Astrophysics Data System (ADS)
Walker, David Lee
1999-12-01
This study uses dynamical analysis to examine in a quantitative fashion the information coding mechanism in DNA sequences. This exceeds the simple dichotomy of either modeling the mechanism by comparing DNA sequence walks as Fractal Brownian Motion (fbm) processes. The 2-D mappings of the DNA sequences for this research are from Iterated Function System (IFS) (Also known as the ``Chaos Game Representation'' (CGR)) mappings of the DNA sequences. This technique converts a 1-D sequence into a 2-D representation that preserves subsequence structure and provides a visual representation. The second step of this analysis involves the application of Wavelet Packet Transforms, a recently developed technique from the field of signal processing. A multi-fractal model is built by using wavelet transforms to estimate the Hurst exponent, H. The Hurst exponent is a non-parametric measurement of the dynamism of a system. This procedure is used to evaluate gene- coding events in the DNA sequence of cystic fibrosis mutations. The H exponent is calculated for various mutation sites in this gene. The results of this study indicate the presence of anti-persistent, random walks and persistent ``sub-periods'' in the sequence. This indicates the hypothesis of a multi-fractal model of DNA information encoding warrants further consideration. This work examines the model's behavior in both pathological (mutations) and non-pathological (healthy) base pair sequences of the cystic fibrosis gene. These mutations both natural and synthetic were introduced by computer manipulation of the original base pair text files. The results show that disease severity and system ``information dynamics'' correlate. These results have implications for genetic engineering as well as in mathematical biology. They suggest that there is scope for more multi-fractal models to be developed.
NASA Astrophysics Data System (ADS)
Millard, Julie T.; Pilon, André M.
2003-04-01
A recent forensic approach for identification of unknown biological samples is mitochondrial DNA (mtDNA) sequencing. We describe a laboratory exercise suitable for an undergraduate biochemistry course in which the polymerase chain reaction is used to amplify a 440 base pair hypervariable region of human mtDNA from a variety of "crime scene" samples (e.g., teeth, hair, nails, cigarettes, envelope flaps, toothbrushes, and chewing gum). Amplification is verified via agarose gel electrophoresis and then samples are subjected to cycle sequencing. Sequence alignments are made via the program CLUSTAL W, allowing students to compare samples and solve the "crime."
Fluorogenic DNA Sequencing in PDMS Microreactors
Sims, Peter A.; Greenleaf, William J.; Duan, Haifeng; Xie, X. Sunney
2012-01-01
We have developed a multiplex sequencing-by-synthesis method combining terminal-phosphate labeled fluorogenic nucleotides (TPLFNs) and resealable microreactors. In the presence of phosphatase, the incorporation of a non-fluorescent TPLFN into a DNA primer by DNA polymerase results in a fluorophore. We immobilize DNA templates within polydimethylsiloxane (PDMS) microreactors, sequentially introduce one of the four identically labeled TPLFNs, seal the microreactors, allow template-directed TPLFN incorporation, and measure the signal from the fluorophores trapped in the microreactors. This workflow allows sequencing in a manner akin to pyrosequencing but without constant monitoring of each microreactor. With cycle times of <10 minutes, we demonstrate 30 base reads with ∼99% raw accuracy. “Fluorogenic pyrosequencing” combines benefits of pyrosequencing, such as rapid turn-around, native DNA generation, and single-color detection, with benefits of fluorescence-based approaches, such as highly sensitive detection and simple parallelization. PMID:21666670
Nanofluidic Device with Embedded Nanopore
NASA Astrophysics Data System (ADS)
Zhang, Yuning; Reisner, Walter
2014-03-01
Nanofluidic based devices are robust methods for biomolecular sensing and single DNA manipulation. Nanopore-based DNA sensing has attractive features that make it a leading candidate as a single-molecule DNA sequencing technology. Nanochannel based extension of DNA, combined with enzymatic or denaturation-based barcoding schemes, is already a powerful approach for genome analysis. We believe that there is revolutionary potential in devices that combine nanochannels with nanpore detectors. In particular, due to the fast translocation of a DNA molecule through a standard nanopore configuration, there is an unfavorable trade-off between signal and sequence resolution. With a combined nanochannel-nanopore device, based on embedding a nanopore inside a nanochannel, we can in principle gain independent control over both DNA translocation speed and sensing signal, solving the key draw-back of the standard nanopore configuration. We demonstrate that we can detect - using fluorescent microscopy - successful translocation of DNA from the nanochannel out through the nanopore, a possible method to 'select' a given barcode for further analysis. We also show that in equilibrium DNA will not escape through an embedded sub-persistence length nanopore until a certain voltage bias is added.
Cai, Wei; Xie, Shunbi; Zhang, Jin; Tang, Dianyong; Tang, Ying
2018-06-08
We presented a novel dual-DNAzyme feedback amplification (DDFA) strategy for Pb 2+ detection based on a micropipette tip-based miniaturized homogeneous electrochemical device. The DDFA system involves two rolling circle amplification (RCA) processes in which two circular DNA templates (C1 and C2) have been designed with a Pb 2+ -DNAzyme sequence (8-17 DNAzyme, anti-GR-5 DNAzyme) and an antisense sequence of G-quadruplex. And a linear DNA (L-DNA), which consists of a primer sequence and a Pb 2+ -DNAzyme substrate sequence, could hybridize with C1 and C2 to form two DNA complexes. In presence of Pb 2+ , the Pb 2+ -DNAzyme exhibited excellent cleavage specificity toward the substrate sequence in L-DNA, leaving primer sequence to trigger two paths of RCA process and finally resulting in massive long nanosolo DNA strands with reduplicated G-quadruplex sequences. And then, methylene blue (MB) could selectively intercalate into G-quadruplex to reduce the free MB concentration in the solution. Thereafter, a carbon fiber microelectrode-based miniaturized electrochemical device was constructed to record the decrease of electrochemical signal due to the much lower diffusion rate of MB/G-quadruplex complex than that of free MB. Therefore, the concentration of Pb 2+ could be correctively and sensitively determined in a homogeneous solution by combining DDFA with miniaturized electrochemical device. This protocol not only exhibited high selectivity and sensitivity toward Pb 2+ with a detection limit of 0.048 pM, but also reduced sample volume to 10 µL. In addition, this sensing system has been successfully applied to Pb 2+ detection in Yangtze River with desirable quantitative manners, which matched well with the atomic absorption spectrometry (AAS). Copyright © 2018 Elsevier B.V. All rights reserved.
Methylsorb: a simple method for quantifying DNA methylation using DNA-gold affinity interactions.
Sina, Abu Ali Ibn; Carrascosa, Laura G; Palanisamy, Ramkumar; Rauf, Sakandar; Shiddiky, Muhammad J A; Trau, Matt
2014-10-21
The analysis of DNA methylation is becoming increasingly important both in the clinic and also as a research tool to unravel key epigenetic molecular mechanisms in biology. Current methodologies for the quantification of regional DNA methylation (i.e., the average methylation over a region of DNA in the genome) are largely affected by comprehensive DNA sequencing methodologies which tend to be expensive, tedious, and time-consuming for many applications. Herein, we report an alternative DNA methylation detection method referred to as "Methylsorb", which is based on the inherent affinity of DNA bases to the gold surface (i.e., the trend of the affinity interactions is adenine > cytosine ≥ guanine > thymine).1 Since the degree of gold-DNA affinity interaction is highly sequence dependent, it provides a new capability to detect DNA methylation by simply monitoring the relative adsorption of bisulfite treated DNA sequences onto a gold chip. Because the selective physical adsorption of DNA fragments to gold enable a direct read-out of regional DNA methylation, the current requirement for DNA sequencing is obviated. To demonstrate the utility of this method, we present data on the regional methylation status of two CpG clusters located in the EN1 and MIR200B genes in MCF7 and MDA-MB-231 cells. The methylation status of these regions was obtained from the change in relative mass on gold surface with respect to relative adsorption of an unmethylated DNA source and this was detected using surface plasmon resonance (SPR) in a label-free and real-time manner. We anticipate that the simplicity of this method, combined with the high level of accuracy for identifying the methylation status of cytosines in DNA, could find broad application in biology and diagnostics.
Cimino, Matthew T
2010-03-01
Twenty-four herbal dietary supplement powder and extract reference standards provided by the National Institute of Standards and Technology (NIST) were investigated using three different commercially available DNA extraction kits to evaluate DNA availability for downstream nucleotide-based applications. The material included samples of Camellia, Citrus, Ephedra, Ginkgo, Hypericum, Serenoa, And Vaccinium. Protocols from Qiagen, MoBio, and Phytopure were used to isolate and purify DNA from the NIST standards. The resulting DNA concentration was quantified using SYBR Green fluorometry. Each of the 24 samples yielded DNA, though the concentration of DNA from each approach was notably different. The Phytopure method consistently yielded more DNA. The average yield ratio was 22 : 3 : 1 (ng/microL; Phytopure : Qiagen : MoBio). Amplification of the internal transcribed spacer II region using PCR was ultimately successful in 22 of the 24 samples. Direct sequencing chromatograms of the amplified material suggested that most of the samples were comprised of mixtures. However, the sequencing chromatograms of 12 of the 24 samples were sufficient to confirm the identity of the target material. The successful extraction, amplification, and sequencing of DNA from these herbal dietary supplement extracts and powders supports a continued effort to explore nucleotide sequence-based tools for the authentication and identification of plants in dietary supplements. (c) Georg Thieme Verlag KG Stuttgart . New York.
DNA Sequencing by Capillary Electrophoresis
Karger, Barry L.; Guttman, Andras
2009-01-01
Sequencing of human and other genomes has been at the center of interest in the biomedical field over the past several decades and is now leading toward an era of personalized medicine. During this time, DNA sequencing methods have evolved from the labor intensive slab gel electrophoresis, through automated multicapillary electrophoresis systems using fluorophore labeling with multispectral imaging, to the “next generation” technologies of cyclic array, hybridization based, nanopore and single molecule sequencing. Deciphering the genetic blueprint and follow-up confirmatory sequencing of Homo sapiens and other genomes was only possible by the advent of modern sequencing technologies that was a result of step by step advances with a contribution of academics, medical personnel and instrument companies. While next generation sequencing is moving ahead at break-neck speed, the multicapillary electrophoretic systems played an essential role in the sequencing of the Human Genome, the foundation of the field of genomics. In this prospective, we wish to overview the role of capillary electrophoresis in DNA sequencing based in part of several of our articles in this journal. PMID:19517496
Sequence Dependencies of DNA Deformability and Hydration in the Minor Groove
Yonetani, Yoshiteru; Kono, Hidetoshi
2009-01-01
Abstract DNA deformability and hydration are both sequence-dependent and are essential in specific DNA sequence recognition by proteins. However, the relationship between the two is not well understood. Here, systematic molecular dynamics simulations of 136 DNA sequences that differ from each other in their central tetramer revealed that sequence dependence of hydration is clearly correlated with that of deformability. We show that this correlation can be illustrated by four typical cases. Most rigid basepair steps are highly likely to form an ordered hydration pattern composed of one water molecule forming a bridge between the bases of distinct strands, but a few exceptions favor another ordered hydration composed of two water molecules forming such a bridge. Steps with medium deformability can display both of these hydration patterns with frequent transition. Highly flexible steps do not have any stable hydration pattern. A detailed picture of this correlation demonstrates that motions of hydration water molecules and DNA bases are tightly coupled with each other at the atomic level. These results contribute to our understanding of the entropic contribution from water molecules in protein or drug binding and could be applied for the purpose of predicting binding sites. PMID:19686662
Tang, Songsong; Gu, Yuan; Lu, Huiting; Dong, Haifeng; Zhang, Kai; Dai, Wenhao; Meng, Xiangdan; Yang, Fan; Zhang, Xueji
2018-04-03
Herein, a highly-sensitive microRNA (miRNA) detection strategy was developed by combining bio-bar-code assay (BBA) with catalytic hairpin assembly (CHA). In the proposed system, two nanoprobes of magnetic nanoparticles functionalized with DNA probes (MNPs-DNA) and gold nanoparticles with numerous barcode DNA (AuNPs-DNA) were designed. In the presence of target miRNA, the MNP-DNA and AuNP-DNA hybridized with target miRNA to form a "sandwich" structure. After "sandwich" structures were separated from the solution by the magnetic field and dehybridized by high temperature, the barcode DNA sequences were released by dissolving AuNPs. The released barcode DNA sequences triggered the toehold strand displacement assembly of two hairpin probes, leading to recycle of barcode DNA sequences and producing numerous fluorescent CHA products for miRNA detection. Under the optimal experimental conditions, the proposed two-stage amplification system could sensitively detect target miRNA ranging from 10 pM to 10 aM with a limit of detection (LOD) down to 97.9 zM. It displayed good capability to discriminate single base and three bases mismatch due to the unique sandwich structure. Notably, it presented good feasibility for selective multiplexed detection of various combinations of synthetic miRNA sequences and miRNAs extracted from different cell lysates, which were in agreement with the traditional polymerase chain reaction analysis. The two-stage amplification strategy may be significant implication in the biological detection and clinical diagnosis. Copyright © 2017 Elsevier B.V. All rights reserved.
Toward rules relating zinc finger protein sequences and DNA binding site preferences.
Desjarlais, J R; Berg, J M
1992-08-15
Zinc finger proteins of the Cys2-His2 type consist of tandem arrays of domains, where each domain appears to contact three adjacent base pairs of DNA through three key residues. We have designed and prepared a series of variants of the central zinc finger within the DNA binding domain of Sp1 by using information from an analysis of a large data base of zinc finger protein sequences. Through systematic variations at two of the three contact positions (underlined), relatively specific recognition of sequences of the form 5'-GGGGN(G or T)GGG-3' has been achieved. These results provide the basis for rules that may develop into a code that will allow the design of zinc finger proteins with preselected DNA site specificity.
Huang, Xuan; Zheng, Jing; Chen, Min; Zhao, Yangyu; Zhang, Chunlei; Liu, Lifu; Xie, Weiwei; Shi, Shuqiong; Wei, Yuan; Lei, Dongzhu; Xu, Chenming; Wu, Qichang; Guo, Xiaoling; Shi, Xiaomei; Zhou, Yi; Liu, Qiufang; Gao, Ya; Jiang, Fuman; Zhang, Hongyun; Su, Fengxia; Ge, Huijuan; Li, Xuchao; Pan, Xiaoyu; Chen, Shengpei; Chen, Fang; Fang, Qun; Jiang, Hui; Lau, Tze Kin; Wang, Wei
2014-04-01
The objective of this study is to assess the performance of noninvasive prenatal testing for trisomies 21 and 18 on the basis of massively parallel sequencing of cell-free DNA from maternal plasma in twin pregnancies. A double-blind study was performed over 12 months. A total of 189 pregnant women carrying twins were recruited from seven hospitals. Maternal plasma DNA sequencing was performed to detect trisomies 21 and 18. The fetal karyotype was used as gold standard to estimate the sensitivity and specificity of sequencing-based noninvasive prenatal test. There were nine cases of trisomy 21 and two cases of trisomy 18 confirmed by karyotyping. Plasma DNA sequencing correctly identified nine cases of trisomy 21 and one case of trisomy 18. The discordant case of trisomy 18 was an unusual case of monozygotic twin with discordant fetal karyotype (one normal and the other trisomy 18). The sensitivity and specificity of maternal plasma DNA sequencing for fetal trisomy 21 were both 100% and for fetal trisomy 18 were 50% and 100%, respectively. Our study further supported that sequencing-based noninvasive prenatal testing of trisomy 21 in twin pregnancies could be achieved with a high accuracy, which could effectively avoid almost 95% of invasive prenatal diagnosis procedures. © 2013 John Wiley & Sons, Ltd.
Characterization of an In Vivo Z-DNA Detection Probe Based on a Cell Nucleus Accumulating Intrabody.
Gulis, Galina; Silva, Izabel Cristina Rodrigues; Sousa, Herdson Renney; Sousa, Isabel Garcia; Bezerra, Maryani Andressa Gomes; Quilici, Luana Salgado; Maranhao, Andrea Queiroz; Brigido, Marcelo Macedo
2016-09-01
Left-handed Z-DNA is a physiologically unstable DNA conformation, and its existence in vivo can be attributed to localized torsional distress. Despite evidence for the existence of Z-DNA in vivo, its precise role in the control of gene expression is not fully understood. Here, an in vivo probe based on an anti-Z-DNA intrabody is proposed for native Z-DNA detection. The probe was used for chromatin immunoprecipitation of potential Z-DNA-forming sequences in the human genome. One of the isolated putative Z-DNA-forming sequences was cloned upstream of a reporter gene expression cassette under control of the CMV promoter. The reporter gene encoded an antibody fragment fused to GFP. Transient co-transfection of this vector along with the Z-probe coding vector improved reporter gene expression. This improvement was demonstrated by measuring reporter gene mRNA and protein levels and the amount of fluorescence in co-transfected CHO-K1 cells. These results suggest that the presence of the anti-Z-DNA intrabody can interfere with a Z-DNA-containing reporter gene expression. Therefore, this in vivo probe for the detection of Z-DNA could be used for global correlation of Z-DNA-forming sequences and gene expression regulation.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tan, H.
1999-03-31
The purpose of this research is to develop a multiplexed sample processing system in conjunction with multiplexed capillary electrophoresis for high-throughput DNA sequencing. The concept from DNA template to called bases was first demonstrated with a manually operated single capillary system. Later, an automated microfluidic system with 8 channels based on the same principle was successfully constructed. The instrument automatically processes 8 templates through reaction, purification, denaturation, pre-concentration, injection, separation and detection in a parallel fashion. A multiplexed freeze/thaw switching principle and a distribution network were implemented to manage flow direction and sample transportation. Dye-labeled terminator cycle-sequencing reactions are performedmore » in an 8-capillary array in a hot air thermal cycler. Subsequently, the sequencing ladders are directly loaded into a corresponding size-exclusion chromatographic column operated at {approximately} 60 C for purification. On-line denaturation and stacking injection for capillary electrophoresis is simultaneously accomplished at a cross assembly set at {approximately} 70 C. Not only the separation capillary array but also the reaction capillary array and purification columns can be regenerated after every run. DNA sequencing data from this system allow base calling up to 460 bases with accuracy of 98%.« less
Suyama, Yoshihisa; Matsuki, Yu
2015-01-01
Restriction-enzyme (RE)-based next-generation sequencing methods have revolutionized marker-assisted genetic studies; however, the use of REs has limited their widespread adoption, especially in field samples with low-quality DNA and/or small quantities of DNA. Here, we developed a PCR-based procedure to construct reduced representation libraries without RE digestion steps, representing de novo single-nucleotide polymorphism discovery, and its genotyping using next-generation sequencing. Using multiplexed inter-simple sequence repeat (ISSR) primers, thousands of genome-wide regions were amplified effectively from a wide variety of genomes, without prior genetic information. We demonstrated: 1) Mendelian gametic segregation of the discovered variants; 2) reproducibility of genotyping by checking its applicability for individual identification; and 3) applicability in a wide variety of species by checking standard population genetic analysis. This approach, called multiplexed ISSR genotyping by sequencing, should be applicable to many marker-assisted genetic studies with a wide range of DNA qualities and quantities. PMID:26593239
ERIC Educational Resources Information Center
Miner, Carol; della Villa, Paula
1997-01-01
Describes an activity in which students reverse-translate proteins from their amino acid sequences back to their DNA sequences then assign musical notes to represent the adenine, guanine, cytosine, and thymine bases. Data is obtained from the National Institutes of Health (NIH) on the Internet. (DDR)
Sequence dependence of electron-induced DNA strand breakage revealed by DNA nanoarrays
Keller, Adrian; Rackwitz, Jenny; Cauët, Emilie; Liévin, Jacques; Körzdörfer, Thomas; Rotaru, Alexandru; Gothelf, Kurt V.; Besenbacher, Flemming; Bald, Ilko
2014-01-01
The electronic structure of DNA is determined by its nucleotide sequence, which is for instance exploited in molecular electronics. Here we demonstrate that also the DNA strand breakage induced by low-energy electrons (18 eV) depends on the nucleotide sequence. To determine the absolute cross sections for electron induced single strand breaks in specific 13 mer oligonucleotides we used atomic force microscopy analysis of DNA origami based DNA nanoarrays. We investigated the DNA sequences 5′-TT(XYX)3TT with X = A, G, C and Y = T, BrU 5-bromouracil and found absolute strand break cross sections between 2.66 · 10−14 cm2 and 7.06 · 10−14 cm2. The highest cross section was found for 5′-TT(ATA)3TT and 5′-TT(ABrUA)3TT, respectively. BrU is a radiosensitizer, which was discussed to be used in cancer radiation therapy. The replacement of T by BrU into the investigated DNA sequences leads to a slight increase of the absolute strand break cross sections resulting in sequence-dependent enhancement factors between 1.14 and 1.66. Nevertheless, the variation of strand break cross sections due to the specific nucleotide sequence is considerably higher. Thus, the present results suggest the development of targeted radiosensitizers for cancer radiation therapy. PMID:25487346
Formation of rings from segments of HeLa-cell nuclear deoxyribonucleic acid
Hardman, Norman
1974-01-01
Duplex segments of HeLa-cell nuclear DNA were generated by cleavage with DNA restriction endonuclease from Haemophilus influenzae. About 20–25% of the DNA segments produced, when partly degraded with exonuclease III and annealed, were found to form rings visible in the electron microscope. A further 5% of the DNA segments formed structures that were branched in configuration. Similar structures were generated from HeLa-cell DNA, without prior treatment with restriction endonuclease, when the complementary polynucleotide chains were exposed by exonuclease III action at single-chain nicks. After exposure of an average single-chain length of 1400 nucleotides per terminus at nicks in HeLa-cell DNA by exonuclease III, followed by annealing, the physical length of ring closures was estimated and found to be 0.02–0.1μm, or 50–300 base pairs. An almost identical distribution of lengths was recorded for the regions of complementary base sequence responsible for branch formation. It is proposed that most of the rings and branches are formed from classes of reiterated base sequence with an average length of 180 base pairs arranged intermittenly in HeLa-cell DNA. From the rate of formation of branched structures when HeLa-cell DNA segments were heat-denatured and annealed, it is estimated that the reiterated sequences are in families containing approximately 2400–24000 copies. ImagesPLATE 2PLATE 1 PMID:4462738
NMR studies of DNA oligomers and their interactions with minor groove binding ligands
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fagan, Patricia A.
1996-05-01
The cationic peptide ligands distamycin and netropsin bind noncovalently to the minor groove of DNA. The binding site, orientation, stoichiometry, and qualitative affinity of distamycin binding to several short DNA oligomers were investigated by NMR spectroscopy. The oligomers studied contain A,T-rich or I,C-rich binding sites, where I = 2-desaminodeoxyguanosine. I•C base pairs are functional analogs of A•T base pairs in the minor groove. The different behaviors exhibited by distamycin and netropsin binding to various DNA sequences suggested that these ligands are sensitive probes of DNA structure. For sites of five or more base pairs, distamycin can form 1:1 or 2:1more » ligand:DNA complexes. Cooperativity in distamycin binding is low in sites such as AAAAA which has narrow minor grooves, and is higher in sites with wider minor grooves such as ATATAT. The distamycin binding and base pair opening lifetimes of I,C-containing DNA oligomers suggest that the I,C minor groove is structurally different from the A,T minor groove. Molecules which direct chemistry to a specific DNA sequence could be used as antiviral compounds, diagnostic probes, or molecular biology tools. The author studied two ligands in which reactive groups were tethered to a distamycin to increase the sequence specificity of the reactive agent.« less
Anton, Brian P; Mongodin, Emmanuel F; Agrawal, Sonia; Fomenkov, Alexey; Byrd, Devon R; Roberts, Richard J; Raleigh, Elisabeth A
2015-01-01
We report the complete sequence of ER2796, a laboratory strain of Escherichia coli K-12 that is completely defective in DNA methylation. Because of its lack of any native methylation, it is extremely useful as a host into which heterologous DNA methyltransferase genes can be cloned and the recognition sequences of their products deduced by Pacific Biosciences Single-Molecule Real Time (SMRT) sequencing. The genome was itself sequenced from a long-insert library using the SMRT platform, resulting in a single closed contig devoid of methylated bases. Comparison with K-12 MG1655, the first E. coli K-12 strain to be sequenced, shows an essentially co-linear relationship with no major rearrangements despite many generations of laboratory manipulation. The comparison revealed a total of 41 insertions and deletions, and 228 single base pair substitutions. In addition, the long-read approach facilitated the surprising discovery of four gene conversion events, three involving rRNA operons and one between two cryptic prophages. Such events thus contribute both to genomic homogenization and to bacteriophage diversification. As one of relatively few laboratory strains of E. coli to be sequenced, the genome also reveals the sequence changes underlying a number of classical mutant alleles including those affecting the various native DNA methylation systems.
Anton, Brian P.; Mongodin, Emmanuel F.; Agrawal, Sonia; Fomenkov, Alexey; Byrd, Devon R.; Roberts, Richard J.; Raleigh, Elisabeth A.
2015-01-01
We report the complete sequence of ER2796, a laboratory strain of Escherichia coli K-12 that is completely defective in DNA methylation. Because of its lack of any native methylation, it is extremely useful as a host into which heterologous DNA methyltransferase genes can be cloned and the recognition sequences of their products deduced by Pacific Biosciences Single-Molecule Real Time (SMRT) sequencing. The genome was itself sequenced from a long-insert library using the SMRT platform, resulting in a single closed contig devoid of methylated bases. Comparison with K-12 MG1655, the first E. coli K-12 strain to be sequenced, shows an essentially co-linear relationship with no major rearrangements despite many generations of laboratory manipulation. The comparison revealed a total of 41 insertions and deletions, and 228 single base pair substitutions. In addition, the long-read approach facilitated the surprising discovery of four gene conversion events, three involving rRNA operons and one between two cryptic prophages. Such events thus contribute both to genomic homogenization and to bacteriophage diversification. As one of relatively few laboratory strains of E. coli to be sequenced, the genome also reveals the sequence changes underlying a number of classical mutant alleles including those affecting the various native DNA methylation systems. PMID:26010885
Ali, M A; Al-Hemaid, F M; Lee, J; Hatamleh, A A; Gyulai, G; Rahman, M O
2015-10-02
The present study explored the systematic inventory of Echinops L. (Asteraceae) of Saudi Arabia, with special reference to the molecular typing of Echinops abuzinadianus Chaudhary, an endemic species to Saudi Arabia, based on the internal transcribed spacer (ITS) sequences (ITS1-5.8S-ITS2) of nuclear ribosomal DNA. A sequence similarity search using BLAST and a phylogenetic analysis of the ITS sequence of E. abuzinadianus revealed a high level of sequence similarity with E. glaberrimus DC. (section Ritropsis). The novel primary sequence and the secondary structure of ITS2 of E. abuzinadianus could potentially be used for molecular genotyping.
Thomas, Austen C; Jarman, Simon N; Haman, Katherine H; Trites, Andrew W; Deagle, Bruce E
2014-08-01
Ecologists are increasingly interested in quantifying consumer diets based on food DNA in dietary samples and high-throughput sequencing of marker genes. It is tempting to assume that food DNA sequence proportions recovered from diet samples are representative of consumer's diet proportions, despite the fact that captive feeding studies do not support that assumption. Here, we examine the idea of sequencing control materials of known composition along with dietary samples in order to correct for technical biases introduced during amplicon sequencing and biological biases such as variable gene copy number. Using the Ion Torrent PGM(©) , we sequenced prey DNA amplified from scats of captive harbour seals (Phoca vitulina) fed a constant diet including three fish species in known proportions. Alongside, we sequenced a prey tissue mix matching the seals' diet to generate tissue correction factors (TCFs). TCFs improved the diet estimates (based on sequence proportions) for all species and reduced the average estimate error from 28 ± 15% (uncorrected) to 14 ± 9% (TCF-corrected). The experimental design also allowed us to infer the magnitude of prey-specific digestion biases and calculate digestion correction factors (DCFs). The DCFs were compared with possible proxies for differential digestion (e.g. fish protein%, fish lipid%) revealing a strong relationship between the DCFs and percent lipid of the fish prey, suggesting prey-specific corrections based on lipid content would produce accurate diet estimates in this study system. These findings demonstrate the value of parallel sequencing of food tissue mixtures in diet studies and offer new directions for future research in quantitative DNA diet analysis. © 2013 John Wiley & Sons Ltd.
Reed, K M; Dorschner, M O; Todd, T N; Phillips, R B
1998-09-01
Sequence variation in the control region (D-loop) of the mitochondrial DNA (mtDNA) was examined to assess the genetic distinctiveness of the shortjaw cisco (Coregonus zenithicus). Individuals from within the Great Lakes Basin as well as inland lakes outside the basin were sampled. DNA fragments containing the entire D-loop were amplified by PCR from specimens of C. zenithicus and the related species C. artedi, C. hoyi, C. kiyi, and C. clupeaformis. DNA sequence analysis revealed high similarity within and among species and shared polymorphism for length variants. Based on this analysis, the shortjaw cisco is not genetically distinct from other cisco species.
Dynamics and control of DNA sequence amplification
DOE Office of Scientific and Technical Information (OSTI.GOV)
Marimuthu, Karthikeyan; Chakrabarti, Raj, E-mail: raj@pmc-group.com, E-mail: rajc@andrew.cmu.edu; Division of Fundamental Research, PMC Advanced Technology, Mount Laurel, New Jersey 08054
2014-10-28
DNA amplification is the process of replication of a specified DNA sequence in vitro through time-dependent manipulation of its external environment. A theoretical framework for determination of the optimal dynamic operating conditions of DNA amplification reactions, for any specified amplification objective, is presented based on first-principles biophysical modeling and control theory. Amplification of DNA is formulated as a problem in control theory with optimal solutions that can differ considerably from strategies typically used in practice. Using the Polymerase Chain Reaction as an example, sequence-dependent biophysical models for DNA amplification are cast as control systems, wherein the dynamics of the reactionmore » are controlled by a manipulated input variable. Using these control systems, we demonstrate that there exists an optimal temperature cycling strategy for geometric amplification of any DNA sequence and formulate optimal control problems that can be used to derive the optimal temperature profile. Strategies for the optimal synthesis of the DNA amplification control trajectory are proposed. Analogous methods can be used to formulate control problems for more advanced amplification objectives corresponding to the design of new types of DNA amplification reactions.« less
Trebitz, Anett S; Hoffman, Joel C; Grant, George W; Billehus, Tyler M; Pilgrim, Erik M
2015-07-22
DNA-based identification of mixed-organism samples offers the potential to greatly reduce the need for resource-intensive morphological identification, which would be of value both to bioassessment and non-native species monitoring. The ability to assign species identities to DNA sequences found depends on the availability of comprehensive DNA reference libraries. Here, we compile inventories for aquatic metazoans extant in or threatening to invade the Laurentian Great Lakes and examine the availability of reference mitochondrial COI DNA sequences (barcodes) in the Barcode of Life Data System for them. We found barcode libraries largely complete for extant and threatening-to-invade vertebrates (100% of reptile, 99% of fish, and 92% of amphibian species had barcodes). In contrast, barcode libraries remain poorly developed for precisely those organisms where morphological identification is most challenging; 46% of extant invertebrates lacked reference barcodes with rates especially high among rotifers, oligochaetes, and mites. Lack of species-level identification for many aquatic invertebrates also is a barrier to matching DNA sequences with physical specimens. Attaining the potential for DNA-based identification of mixed-organism samples covering the breadth of aquatic fauna requires a concerted effort to build supporting barcode libraries and voucher collections.
NASA Astrophysics Data System (ADS)
Trebitz, Anett S.; Hoffman, Joel C.; Grant, George W.; Billehus, Tyler M.; Pilgrim, Erik M.
2015-07-01
DNA-based identification of mixed-organism samples offers the potential to greatly reduce the need for resource-intensive morphological identification, which would be of value both to bioassessment and non-native species monitoring. The ability to assign species identities to DNA sequences found depends on the availability of comprehensive DNA reference libraries. Here, we compile inventories for aquatic metazoans extant in or threatening to invade the Laurentian Great Lakes and examine the availability of reference mitochondrial COI DNA sequences (barcodes) in the Barcode of Life Data System for them. We found barcode libraries largely complete for extant and threatening-to-invade vertebrates (100% of reptile, 99% of fish, and 92% of amphibian species had barcodes). In contrast, barcode libraries remain poorly developed for precisely those organisms where morphological identification is most challenging; 46% of extant invertebrates lacked reference barcodes with rates especially high among rotifers, oligochaetes, and mites. Lack of species-level identification for many aquatic invertebrates also is a barrier to matching DNA sequences with physical specimens. Attaining the potential for DNA-based identification of mixed-organism samples covering the breadth of aquatic fauna requires a concerted effort to build supporting barcode libraries and voucher collections.
Phylogenetic Network for European mtDNA
Finnilä, Saara; Lehtonen, Mervi S.; Majamaa, Kari
2001-01-01
The sequence in the first hypervariable segment (HVS-I) of the control region has been used as a source of evolutionary information in most phylogenetic analyses of mtDNA. Population genetic inference would benefit from a better understanding of the variation in the mtDNA coding region, but, thus far, complete mtDNA sequences have been rare. We determined the nucleotide sequence in the coding region of mtDNA from 121 Finns, by conformation-sensitive gel electrophoresis and subsequent sequencing and by direct sequencing of the D loop. Furthermore, 71 sequences from our previous reports were included, so that the samples represented all the mtDNA haplogroups present in the Finnish population. We found a total of 297 variable sites in the coding region, which allowed the compilation of unambiguous phylogenetic networks. The D loop harbored 104 variable sites, and, in most cases, these could be localized within the coding-region networks, without discrepancies. Interestingly, many homoplasies were detected in the coding region. Nucleotide variation in the rRNA and tRNA genes was 6%, and that in the third nucleotide positions of structural genes amounted to 22% of that in the HVS-I. The complete networks enabled the relationships between the mtDNA haplogroups to be analyzed. Phylogenetic networks based on the entire coding-region sequence in mtDNA provide a rich source for further population genetic studies, and complete sequences make it easier to differentiate between disease-causing mutations and rare polymorphisms. PMID:11349229
COI (cytochrome oxidase-I) sequence based studies of Carangid fishes from Kakinada coast, India.
Persis, M; Chandra Sekhar Reddy, A; Rao, L M; Khedkar, G D; Ravinder, K; Nasruddin, K
2009-09-01
Mitochondrial DNA, cytochrome oxidase-1 gene sequences were analyzed for species identification and phylogenetic relationship among the very high food value and commercially important Indian carangid fish species. Sequence analysis of COI gene very clearly indicated that all the 28 fish species fell into five distinct groups, which are genetically distant from each other and exhibited identical phylogenetic reservation. All the COI gene sequences from 28 fishes provide sufficient phylogenetic information and evolutionary relationship to distinguish the carangid species unambiguously. This study proves the utility of mtDNA COI gene sequence based approach in identifying fish species at a faster pace.
Cai, Sheng; Cao, Zhijuan; Lau, Choiwan; Lu, Jianzhong
2014-11-21
By using the allosteric hairpin DNA switch, a novel assay for the detection of microRNA (miRNA) let-7a via a hybridization chain reaction (HCR) was introduced. Briefly, the hairpin DNA switch probe is a single-stranded DNA consisting of a streptavidin (SA) aptamer sequence, a target binding sequence and a certain sequence that acts as a trigger of the HCR. In the presence of target let-7a, the hairpin DNA switch would open and expose the stem region sequences, where a part of this sequence acts as initiator sequence strands for the HCR and triggers a cascade of hybridization events that yields nicked double helices analogous to alternating copolymers, another part is the SA aptamer sequence which activates its binding affinity to SA on SA-coated magnetic particles. The hybridization event could be sensitively detected via an instantaneous derivatization reaction between a special chemiluminescence (CL) reagent, 3,4,5-trimethoxylphenylglyoxal (TMPG) and the guanine nucleotides within the target, the hairpin DNA switch probe, and HCR helices to form an unstable CL intermediate for the generation of light. Our results show that the coupling of the hairpin DNA switch probe and the HCR for the amplified detection of let-7a achieves a better performance (e.g. wide linear response range: 0.1-1000 fmol, low detection limit: 0.1 fmol, and high specificity). Furthermore, this approach could be easily applied to the detection of let-7a in human lung cells, and extended to detect other types of miRNA and proteins such as PDGF based on aptamers. We believe such advancements will represent a significant step towards improved diagnostics and more personalized medical treatment.
Cloud-based adaptive exon prediction for DNA analysis.
Putluri, Srinivasareddy; Zia Ur Rahman, Md; Fathima, Shaik Yasmeen
2018-02-01
Cloud computing offers significant research and economic benefits to healthcare organisations. Cloud services provide a safe place for storing and managing large amounts of such sensitive data. Under conventional flow of gene information, gene sequence laboratories send out raw and inferred information via Internet to several sequence libraries. DNA sequencing storage costs will be minimised by use of cloud service. In this study, the authors put forward a novel genomic informatics system using Amazon Cloud Services, where genomic sequence information is stored and accessed for processing. True identification of exon regions in a DNA sequence is a key task in bioinformatics, which helps in disease identification and design drugs. Three base periodicity property of exons forms the basis of all exon identification techniques. Adaptive signal processing techniques found to be promising in comparison with several other methods. Several adaptive exon predictors (AEPs) are developed using variable normalised least mean square and its maximum normalised variants to reduce computational complexity. Finally, performance evaluation of various AEPs is done based on measures such as sensitivity, specificity and precision using various standard genomic datasets taken from National Center for Biotechnology Information genomic sequence database.
Earl, P L; Jones, E V; Moss, B
1986-01-01
A 5400-base-pair segment of the vaccinia virus genome was sequenced and an open reading frame of 938 codons was found precisely where the DNA polymerase had been mapped by transfer of a phosphonoacetate-resistance marker. A single nucleotide substitution changing glycine at position 347 to aspartic acid accounts for the drug resistance of the mutant vaccinia virus. The 5' end of the DNA polymerase mRNA was located 80 base pairs before the methionine codon initiating the open reading frame. Correspondence between the predicted Mr 108,577 polypeptide and the 110,000 purified enzyme indicates that little or no proteolytic processing occurs. Extensive homology, extending over 435 amino acids, was found upon comparing the DNA polymerase of vaccinia virus and DNA polymerase of Epstein-Barr virus. A highly conserved sequence of 14 amino acids in the carboxyl-terminal regions of the above DNA polymerases is also present at a similar location in adenovirus DNA polymerase. This structure, which is predicted to form a turn flanked by beta-pleated sheets, may form part of an essential binding or catalytic site that accounts for its presence in DNA polymerases of poxviruses, herpesviruses, and adenoviruses. Images PMID:3012524
Analysis of intraspecific patterns in genetic diversity of stream fishes provides a potentially powerful method for assessing the status and trends in the condition of aquatic ecosystems. We analyzed mitochondrial DNA (mtDNA) sequences (590 bases of cytochrome B) and nuclear DNA...
A Sequence-Dependent DNA Condensation Induced by Prion Protein
2018-01-01
Different studies indicated that the prion protein induces hybridization of complementary DNA strands. Cell culture studies showed that the scrapie isoform of prion protein remained bound with the chromosome. In present work, we used an oxazole dye, YOYO, as a reporter to quantitative characterization of the DNA condensation by prion protein. We observe that the prion protein induces greater fluorescence quenching of YOYO intercalated in DNA containing only GC bases compared to the DNA containing four bases whereas the effect of dye bound to DNA containing only AT bases is marginal. DNA-condensing biological polyamines are less effective than prion protein in quenching of DNA-bound YOYO fluorescence. The prion protein induces marginal quenching of fluorescence of the dye bound to oligonucleotides, which are resistant to condensation. The ultrastructural studies with electron microscope also validate the biophysical data. The GC bases of the target DNA are probably responsible for increased condensation in the presence of prion protein. To our knowledge, this is the first report of a human cellular protein inducing a sequence-dependent DNA condensation. The increased condensation of GC-rich DNA by prion protein may suggest a biological function of the prion protein and a role in its pathogenesis. PMID:29657864
A Sequence-Dependent DNA Condensation Induced by Prion Protein.
Bera, Alakesh; Biring, Sajal
2018-01-01
Different studies indicated that the prion protein induces hybridization of complementary DNA strands. Cell culture studies showed that the scrapie isoform of prion protein remained bound with the chromosome. In present work, we used an oxazole dye, YOYO, as a reporter to quantitative characterization of the DNA condensation by prion protein. We observe that the prion protein induces greater fluorescence quenching of YOYO intercalated in DNA containing only GC bases compared to the DNA containing four bases whereas the effect of dye bound to DNA containing only AT bases is marginal. DNA-condensing biological polyamines are less effective than prion protein in quenching of DNA-bound YOYO fluorescence. The prion protein induces marginal quenching of fluorescence of the dye bound to oligonucleotides, which are resistant to condensation. The ultrastructural studies with electron microscope also validate the biophysical data. The GC bases of the target DNA are probably responsible for increased condensation in the presence of prion protein. To our knowledge, this is the first report of a human cellular protein inducing a sequence-dependent DNA condensation. The increased condensation of GC-rich DNA by prion protein may suggest a biological function of the prion protein and a role in its pathogenesis.
Effects of 16S rDNA sampling on estimates of the number of endosymbiont lineages in sucking lice
Burleigh, J. Gordon; Light, Jessica E.; Reed, David L.
2016-01-01
Phylogenetic trees can reveal the origins of endosymbiotic lineages of bacteria and detect patterns of co-evolution with their hosts. Although taxon sampling can greatly affect phylogenetic and co-evolutionary inference, most hypotheses of endosymbiont relationships are based on few available bacterial sequences. Here we examined how different sampling strategies of Gammaproteobacteria sequences affect estimates of the number of endosymbiont lineages in parasitic sucking lice (Insecta: Phthirapatera: Anoplura). We estimated the number of louse endosymbiont lineages using both newly obtained and previously sequenced 16S rDNA bacterial sequences and more than 42,000 16S rDNA sequences from other Gammaproteobacteria. We also performed parametric and nonparametric bootstrapping experiments to examine the effects of phylogenetic error and uncertainty on these estimates. Sampling of 16S rDNA sequences affects the estimates of endosymbiont diversity in sucking lice until we reach a threshold of genetic diversity, the size of which depends on the sampling strategy. Sampling by maximizing the diversity of 16S rDNA sequences is more efficient than randomly sampling available 16S rDNA sequences. Although simulation results validate estimates of multiple endosymbiont lineages in sucking lice, the bootstrap results suggest that the precise number of endosymbiont origins is still uncertain. PMID:27547523
On site DNA barcoding by nanopore sequencing
Menegon, Michele; Cantaloni, Chiara; Rodriguez-Prieto, Ana; Centomo, Cesare; Abdelfattah, Ahmed; Rossato, Marzia; Bernardi, Massimo; Xumerle, Luciano; Loader, Simon; Delledonne, Massimo
2017-01-01
Biodiversity research is becoming increasingly dependent on genomics, which allows the unprecedented digitization and understanding of the planet’s biological heritage. The use of genetic markers i.e. DNA barcoding, has proved to be a powerful tool in species identification. However, full exploitation of this approach is hampered by the high sequencing costs and the absence of equipped facilities in biodiversity-rich countries. In the present work, we developed a portable sequencing laboratory based on the portable DNA sequencer from Oxford Nanopore Technologies, the MinION. Complementary laboratory equipment and reagents were selected to be used in remote and tough environmental conditions. The performance of the MinION sequencer and the portable laboratory was tested for DNA barcoding in a mimicking tropical environment, as well as in a remote rainforest of Tanzania lacking electricity. Despite the relatively high sequencing error-rate of the MinION, the development of a suitable pipeline for data analysis allowed the accurate identification of different species of vertebrates including amphibians, reptiles and mammals. In situ sequencing of a wild frog allowed us to rapidly identify the species captured, thus confirming that effective DNA barcoding in the field is possible. These results open new perspectives for real-time-on-site DNA sequencing thus potentially increasing opportunities for the understanding of biodiversity in areas lacking conventional laboratory facilities. PMID:28977016
Liu, Bin; Liu, Fule; Fang, Longyun; Wang, Xiaolong; Chou, Kuo-Chen
2015-04-15
In order to develop powerful computational predictors for identifying the biological features or attributes of DNAs, one of the most challenging problems is to find a suitable approach to effectively represent the DNA sequences. To facilitate the studies of DNAs and nucleotides, we developed a Python package called representations of DNAs (repDNA) for generating the widely used features reflecting the physicochemical properties and sequence-order effects of DNAs and nucleotides. There are three feature groups composed of 15 features. The first group calculates three nucleic acid composition features describing the local sequence information by means of kmers; the second group calculates six autocorrelation features describing the level of correlation between two oligonucleotides along a DNA sequence in terms of their specific physicochemical properties; the third group calculates six pseudo nucleotide composition features, which can be used to represent a DNA sequence with a discrete model or vector yet still keep considerable sequence-order information via the physicochemical properties of its constituent oligonucleotides. In addition, these features can be easily calculated based on both the built-in and user-defined properties via using repDNA. The repDNA Python package is freely accessible to the public at http://bioinformatics.hitsz.edu.cn/repDNA/. bliu@insun.hit.edu.cn or kcchou@gordonlifescience.org Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Yang, Cheng-Hong; Wu, Kuo-Chuan; Chuang, Li-Yeh; Chang, Hsueh-Wei
2018-01-01
DNA barcode sequences are accumulating in large data sets. A barcode is generally a sequence larger than 1000 base pairs and generates a computational burden. Although the DNA barcode was originally envisioned as straightforward species tags, the identification usage of barcode sequences is rarely emphasized currently. Single-nucleotide polymorphism (SNP) association studies provide us an idea that the SNPs may be the ideal target of feature selection to discriminate between different species. We hypothesize that SNP-based barcodes may be more effective than the full length of DNA barcode sequences for species discrimination. To address this issue, we tested a r ibulose diphosphate carboxylase ( rbcL ) S NP b arcoding (RSB) strategy using a decision tree algorithm. After alignment and trimming, 31 SNPs were discovered in the rbcL sequences from 38 Brassicaceae plant species. In the decision tree construction, these SNPs were computed to set up the decision rule to assign the sequences into 2 groups level by level. After algorithm processing, 37 nodes and 31 loci were required for discriminating 38 species. Finally, the sequence tags consisting of 31 rbcL SNP barcodes were identified for discriminating 38 Brassicaceae species based on the decision tree-selected SNP pattern using RSB method. Taken together, this study provides the rational that the SNP aspect of DNA barcode for rbcL gene is a useful and effective sequence for tagging 38 Brassicaceae species.
Herzner, Anna-Maria; Hagmann, Cristina Amparo; Goldeck, Marion; Wolter, Steven; Kübler, Kirsten; Wittmann, Sabine; Gramberg, Thomas; Andreeva, Liudmila; Hopfner, Karl-Peter; Mertens, Christina; Zillinger, Thomas; Jin, Tengchuan; Xiao, Tsan Sam; Bartok, Eva; Coch, Christoph; Ackermann, Damian; Hornung, Veit; Ludwig, Janos; Barchet, Winfried; Hartmann, Gunther; Schlee, Martin
2015-10-01
Cytosolic DNA that emerges during infection with a retrovirus or DNA virus triggers antiviral type I interferon responses. So far, only double-stranded DNA (dsDNA) over 40 base pairs (bp) in length has been considered immunostimulatory. Here we found that unpaired DNA nucleotides flanking short base-paired DNA stretches, as in stem-loop structures of single-stranded DNA (ssDNA) derived from human immunodeficiency virus type 1 (HIV-1), activated the type I interferon-inducing DNA sensor cGAS in a sequence-dependent manner. DNA structures containing unpaired guanosines flanking short (12- to 20-bp) dsDNA (Y-form DNA) were highly stimulatory and specifically enhanced the enzymatic activity of cGAS. Furthermore, we found that primary HIV-1 reverse transcripts represented the predominant viral cytosolic DNA species during early infection of macrophages and that these ssDNAs were highly immunostimulatory. Collectively, our study identifies unpaired guanosines in Y-form DNA as a highly active, minimal cGAS recognition motif that enables detection of HIV-1 ssDNA.
Yazdi, Soroush H; Giles, Kristen L; White, Ian M
2013-11-05
We demonstrate sensitive and multiplexed detection of DNA sequences through a surface enhanced resonance Raman spectroscopy (SERRS)-based competitive displacement assay in an integrated microsystem. The use of the competitive displacement scheme, in which the target DNA sequence displaces a Raman-labeled reporter sequence that has lower affinity for the immobilized probe, enables detection of unlabeled target DNA sequences with a simple single-step procedure. In our implementation, the displacement reaction occurs in a microporous packed column of silica beads prefunctionalized with probe-reporter pairs. The use of a functionalized packed-bead column in a microfluidic channel provides two major advantages: (i) immobilization surface chemistry can be performed as a batch process instead of on a chip-by-chip basis, and (ii) the microporous network eliminates the diffusion limitations of a typical biological assay, which increases the sensitivity. Packed silica beads are also leveraged to improve the SERRS detection of the Raman-labeled reporter. Following displacement, the reporter adsorbs onto aggregated silver nanoparticles in a microfluidic mixer; the nanoparticle-reporter conjugates are then trapped and concentrated in the silica bead matrix, which leads to a significant increase in plasmonic nanoparticles and adsorbed Raman reporters within the detection volume as compared to an open microfluidic channel. The experimental results reported here demonstrate detection down to 100 pM of the target DNA sequence, and the experiments are shown to be specific, repeatable, and quantitative. Furthermore, we illustrate the advantage of using SERRS by demonstrating multiplexed detection. The sensitivity of the assay, combined with the advantages of multiplexed detection and single-step operation with unlabeled target sequences makes this method attractive for practical applications. Importantly, while we illustrate DNA sequence detection, the SERRS-based competitive displacement assay is applicable to detection of a variety of biological macromolecules, including proteins and proteolytic enzymes.
Isalan, M; Klug, A; Choo, Y
2001-07-01
DNA-binding domains with predetermined sequence specificity are engineered by selection of zinc finger modules using phage display, allowing the construction of customized transcription factors. Despite remarkable progress in this field, the available protein-engineering methods are deficient in many respects, thus hampering the applicability of the technique. Here we present a rapid and convenient method that can be used to design zinc finger proteins against a variety of DNA-binding sites. This is based on a pair of pre-made zinc finger phage-display libraries, which are used in parallel to select two DNA-binding domains each of which recognizes given 5 base pair sequences, and whose products are recombined to produce a single protein that recognizes a composite (9 base pair) site of predefined sequence. Engineering using this system can be completed in less than two weeks and yields proteins that bind sequence-specifically to DNA with Kd values in the nanomolar range. To illustrate the technique, we have selected seven different proteins to bind various regions of the human immunodeficiency virus 1 (HIV-1) promoter.
Sequence-dependent base pair stepping dynamics in XPD helicase unwinding
Qi, Zhi; Pugh, Robert A; Spies, Maria; Chemla, Yann R
2013-01-01
Helicases couple the chemical energy of ATP hydrolysis to directional translocation along nucleic acids and transient duplex separation. Understanding helicase mechanism requires that the basic physicochemical process of base pair separation be understood. This necessitates monitoring helicase activity directly, at high spatio-temporal resolution. Using optical tweezers with single base pair (bp) resolution, we analyzed DNA unwinding by XPD helicase, a Superfamily 2 (SF2) DNA helicase involved in DNA repair and transcription initiation. We show that monomeric XPD unwinds duplex DNA in 1-bp steps, yet exhibits frequent backsteps and undergoes conformational transitions manifested in 5-bp backward and forward steps. Quantifying the sequence dependence of XPD stepping dynamics with near base pair resolution, we provide the strongest and most direct evidence thus far that forward, single-base pair stepping of a helicase utilizes the spontaneous opening of the duplex. The proposed unwinding mechanism may be a universal feature of DNA helicases that move along DNA phosphodiester backbones. DOI: http://dx.doi.org/10.7554/eLife.00334.001 PMID:23741615
Inferring Higher Functional Information for RIKEN Mouse Full-Length cDNA Clones With FACTS
Nagashima, Takeshi; Silva, Diego G.; Petrovsky, Nikolai; Socha, Luis A.; Suzuki, Harukazu; Saito, Rintaro; Kasukawa, Takeya; Kurochkin, Igor V.; Konagaya, Akihiko; Schönbach, Christian
2003-01-01
FACTS (Functional Association/Annotation of cDNA Clones from Text/Sequence Sources) is a semiautomated knowledge discovery and annotation system that integrates molecular function information derived from sequence analysis results (sequence inferred) with functional information extracted from text. Text-inferred information was extracted from keyword-based retrievals of MEDLINE abstracts and by matching of gene or protein names to OMIM, BIND, and DIP database entries. Using FACTS, we found that 47.5% of the 60,770 RIKEN mouse cDNA FANTOM2 clone annotations were informative for text searches. MEDLINE queries yielded molecular interaction-containing sentences for 23.1% of the clones. When disease MeSH and GO terms were matched with retrieved abstracts, 22.7% of clones were associated with potential diseases, and 32.5% with GO identifiers. A significant number (23.5%) of disease MeSH-associated clones were also found to have a hereditary disease association (OMIM Morbidmap). Inferred neoplastic and nervous system disease represented 49.6% and 36.0% of disease MeSH-associated clones, respectively. A comparison of sequence-based GO assignments with informative text-based GO assignments revealed that for 78.2% of clones, identical GO assignments were provided for that clone by either method, whereas for 21.8% of clones, the assignments differed. In contrast, for OMIM assignments, only 28.5% of clones had identical sequence-based and text-based OMIM assignments. Sequence, sentence, and term-based functional associations are included in the FACTS database (http://facts.gsc.riken.go.jp/), which permits results to be annotated and explored through web-accessible keyword and sequence search interfaces. The FACTS database will be a critical tool for investigating the functional complexity of the mouse transcriptome, cDNA-inferred interactome (molecular interactions), and pathome (pathologies). PMID:12819151
Mesoscopic modeling of DNA denaturation rates: Sequence dependence and experimental comparison
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dahlen, Oda, E-mail: oda.dahlen@ntnu.no; Erp, Titus S. van, E-mail: titus.van.erp@ntnu.no
Using rare event simulation techniques, we calculated DNA denaturation rate constants for a range of sequences and temperatures for the Peyrard-Bishop-Dauxois (PBD) model with two different parameter sets. We studied a larger variety of sequences compared to previous studies that only consider DNA homopolymers and DNA sequences containing an equal amount of weak AT- and strong GC-base pairs. Our results show that, contrary to previous findings, an even distribution of the strong GC-base pairs does not always result in the fastest possible denaturation. In addition, we applied an adaptation of the PBD model to study hairpin denaturation for which experimentalmore » data are available. This is the first quantitative study in which dynamical results from the mesoscopic PBD model have been compared with experiments. Our results show that present parameterized models, although giving good results regarding thermodynamic properties, overestimate denaturation rates by orders of magnitude. We believe that our dynamical approach is, therefore, an important tool for verifying DNA models and for developing next generation models that have higher predictive power than present ones.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Iovannisci, D.; Brown, C.; Winn-Deen, E.
1994-09-01
The cloning and sequencing of the gene associated with cystic fibrosis (CF) now provides the opportunity for earlier detection and carrier screening through DNA-based detection schemes. To date, over 300 mutations have been reported to the CF Consortium; however, only 30 mutations have been observed frequently enough world-wide to warrant routine screening. Many of these mutations are not available as cloned material or as established tissue culture cell lines to aid in the development of DNA-based detection assays. We have therefore cloned the 30 most frequently reported mutations, plus the mutation R347H due to its association with male infertility (31more » mutations, total). Two approaches were employed: direct PCR amplification, where mutations were available from patient sources, and site-directed PCR mutagenesis of normal genomic DNA to generate the remaining mutations. After amplification, products were cloned into a sequencing vector, bacterial transformants were screened by a novel method (PCR/oligonucleotide litigation assay/sequence-coded separation), and plamid DNA sequences determined by automated fluorescent methods on the Applied Biosystems 373A. Mixing of the clones allows the construction of artificial genotypes useful as positive control material for assay validation. A second round of mutagenesis, resulting in the construction of plasmids bearing multiple mutations, will be evaluated for their utility as reagent control materials in kit development.« less
Characterization of proviruses cloned from mink cell focus-forming virus-infected cellular DNA.
Khan, A S; Repaske, R; Garon, C F; Chan, H W; Rowe, W P; Martin, M A
1982-01-01
Two proviruses were cloned from EcoRI-digested DNA extracted from mink cells chronically infected with AKR mink cell focus-forming (MCF) 247 murine leukemia virus (MuLV), using a lambda phage host vector system. One cloned MuLV DNA fragment (designated MCF 1) contained sequences extending 6.8 kilobases from an EcoRI restriction site in the 5' long terminal repeat (LTR) to an EcoRI site located in the envelope (env) region and was indistinguishable by restriction endonuclease mapping for 5.1 kilobases (except for the EcoRI site in the LTR) from the 5' end of AKR ecotropic proviral DNA. The DNA segment extending from 5.1 to 6.8 kilobases contained several restriction sites that were not present in the AKR ecotropic provirus. A 0.5-kilobase DNA segment located at the 3' end of MCF 1 DNA contained sequences which hybridized to a xenotropic env-specific DNA probe but not to labeled ecotropic env-specific DNA. This dual character of MCF 1 proviral DNA was also confirmed by analyzing heteroduplex molecules by electron microscopy. The second cloned proviral DNA (designated MCF 2) was a 6.9-kilobase EcoRI DNA fragment which contained LTR sequences at each end and a 2.0-kilobase deletion encompassing most of the env region. The MCF 2 proviral DNA proved to be a useful reagent for detecting LTRs electron microscopically due to the presence of nonoverlapping, terminally located LTR sequences which effected its circularization with DNAs containing homologous LTR sequences. Nucleotide sequence analysis demonstrated the presence of a 104-base-pair direct repeat in the LTR of MCF 2 DNA. In contrast, only a single copy of the reiterated component of the direct repeat was present in MCF 1 DNA. Images PMID:6281459
Method for phosphorothioate antisense DNA sequencing by capillary electrophoresis with UV detection.
Froim, D; Hopkins, C E; Belenky, A; Cohen, A S
1997-11-01
The progress of antisense DNA therapy demands development of reliable and convenient methods for sequencing short single-stranded oligonucleotides. A method of phosphorothioate antisense DNA sequencing analysis using UV detection coupled to capillary electrophoresis (CE) has been developed based on a modified chain termination sequencing method. The proposed method reduces the sequencing cost since it uses affordable CE-UV instrumentation and requires no labeling with minimal sample processing before analysis. Cycle sequencing with ThermoSequenase generates quantities of sequencing products that are readily detectable by UV. Discrimination of undesired components from sequencing products in the reaction mixture, previously accomplished by fluorescent or radioactive labeling, is now achieved by bringing concentrations of undesired components below the UV detection range which yields a 'clean', well defined sequence. UV detection coupled with CE offers additional conveniences for sequencing since it can be accomplished with commercially available CE-UV equipment and is readily amenable to automation.
Method for phosphorothioate antisense DNA sequencing by capillary electrophoresis with UV detection.
Froim, D; Hopkins, C E; Belenky, A; Cohen, A S
1997-01-01
The progress of antisense DNA therapy demands development of reliable and convenient methods for sequencing short single-stranded oligonucleotides. A method of phosphorothioate antisense DNA sequencing analysis using UV detection coupled to capillary electrophoresis (CE) has been developed based on a modified chain termination sequencing method. The proposed method reduces the sequencing cost since it uses affordable CE-UV instrumentation and requires no labeling with minimal sample processing before analysis. Cycle sequencing with ThermoSequenase generates quantities of sequencing products that are readily detectable by UV. Discrimination of undesired components from sequencing products in the reaction mixture, previously accomplished by fluorescent or radioactive labeling, is now achieved by bringing concentrations of undesired components below the UV detection range which yields a 'clean', well defined sequence. UV detection coupled with CE offers additional conveniences for sequencing since it can be accomplished with commercially available CE-UV equipment and is readily amenable to automation. PMID:9336449
Loreille, Odile; Ratnayake, Shashikala; Stockwell, Timothy B.; Mallick, Swapan; Skoglund, Pontus; Onorato, Anthony J.; Bergman, Nicholas H.; Reich, David; Irwin, Jodi A.
2018-01-01
High throughput sequencing (HTS) has been used for a number of years in the field of paleogenomics to facilitate the recovery of small DNA fragments from ancient specimens. Recently, these techniques have also been applied in forensics, where they have been used for the recovery of mitochondrial DNA sequences from samples where traditional PCR-based assays fail because of the very short length of endogenous DNA molecules. Here, we describe the biological sexing of a ~4000-year-old Egyptian mummy using shotgun sequencing and two established methods of biological sex determination (RX and RY), by way of mitochondrial genome analysis as a means of sequence data authentication. This particular case of historical interest increases the potential utility of HTS techniques for forensic purposes by demonstrating that data from the more discriminatory nuclear genome can be recovered from the most damaged specimens, even in cases where mitochondrial DNA cannot be recovered with current PCR-based forensic technologies. Although additional work remains to be done before nuclear DNA recovered via these methods can be used routinely in operational casework for individual identification purposes, these results indicate substantial promise for the retrieval of probative individually identifying DNA data from the most limited and degraded forensic specimens. PMID:29494531
High-Throughput Analysis of T-DNA Location and Structure Using Sequence Capture.
Inagaki, Soichi; Henry, Isabelle M; Lieberman, Meric C; Comai, Luca
2015-01-01
Agrobacterium-mediated transformation of plants with T-DNA is used both to introduce transgenes and for mutagenesis. Conventional approaches used to identify the genomic location and the structure of the inserted T-DNA are laborious and high-throughput methods using next-generation sequencing are being developed to address these problems. Here, we present a cost-effective approach that uses sequence capture targeted to the T-DNA borders to select genomic DNA fragments containing T-DNA-genome junctions, followed by Illumina sequencing to determine the location and junction structure of T-DNA insertions. Multiple probes can be mixed so that transgenic lines transformed with different T-DNA types can be processed simultaneously, using a simple, index-based pooling approach. We also developed a simple bioinformatic tool to find sequence read pairs that span the junction between the genome and T-DNA or any foreign DNA. We analyzed 29 transgenic lines of Arabidopsis thaliana, each containing inserts from 4 different T-DNA vectors. We determined the location of T-DNA insertions in 22 lines, 4 of which carried multiple insertion sites. Additionally, our analysis uncovered a high frequency of unconventional and complex T-DNA insertions, highlighting the needs for high-throughput methods for T-DNA localization and structural characterization. Transgene insertion events have to be fully characterized prior to use as commercial products. Our method greatly facilitates the first step of this characterization of transgenic plants by providing an efficient screen for the selection of promising lines.
Smith, J K; Parry, J D; Day, J G; Smith, R J
1998-10-01
The use of primers based on the Hip1 sequence as a typing technique for cyanobacteria has been investigated. The discovery of short repetitive sequence structures in bacterial DNA during the last decade has led to the development of PCR-based methods for typing, i.e., distinguishing and identifying, bacterial species and strains. An octameric palindromic sequence known as Hip1 has been shown to be present in the chromosomal DNA of many species of cyanobacteria as a highly repetitious interspersed sequence. PCR primers were constructed that extended the Hip1 sequence at the 3' end by two bases. Five of the 16 possible extended primers were tested. Each of the five primers produced a different set of products when used to prime PCR from cyanobacterial genomic DNA. Each primer produced a distinct set of products for each of the 15 cyanobacterial species tested. The ability of Hip1-based PCR to resolve taxonomic differences was assessed by analysis of independent isolates of Anabaena flos-aquae and Nostoc ellipsosporum obtained from the CCAP (Culture Collection of Algae and Protozoa, IFE, Cumbria, UK). A PCR-based RFLP analysis of products amplified from the 23S-16S rDNA intergenic region was used to characterize the isolates and to compare with the Hip1 typing data. The RFLP and Hip1 typing yielded similar results and both techniques were able to distinguish different strains. On the basis of these results it is suggested that the Hip1 PCR technique may assist in distinguishing cyanobacterial species and strains.
Pan, Hong-zhi; Yu, Hong-wei; Wang, Na; Zhang, Ze; Wan, Guang-cai; Liu, Hao; Guan, Xue; Chang, Dong
2015-11-20
We describe the fabrication of a sensitive electrochemical DNA biosensor for determination of Klebsiella pneumoniae carbapenemase (KPC). The highly sensitive and selective electrochemical biosensor for DNA detection was constructed based on a glassy carbon electrode (GCE) modified with gold nanoparticles (Au-NPs) and graphene (Gr). Then Au-NPs/Gr/GCE was characterized by scanning electro microscope (SEM), cyclic voltammetry (CV) and electrochemical impedance spectroscopy (EIS). The hybridization detection was measured by diffierential pulse voltammetry (DPV) using methylene blue (MB) as the hybridization indicator. The dynamic range of detection of the sensor for the target DNA sequences was from 1 × 10(-12) to 1 × 10(-7)mol/L, with a detection limit of 2 × 10(-13)mol/L. The DNA biosensor had excellent specificity for distinguishing complementary DNA sequence in the presence of non-complementary and mismatched DNA sequence. The results demonstrated that the Au-NPs/Gr nanocomposite was a promising substrate for the development of high-performance electrocatalysts for determination of KPC. Copyright © 2015 Elsevier B.V. All rights reserved.
DNA-based methods have considerably increased our understanding of the bacterial diversity of water distribution systems (WDS). However, as DNA may persist after cell death, the use of DNA-based methods cannot be used to describe metabolically-active microbes. In contrast, intra...
Sequence Effect on the Formation of DNA Minidumbbells.
Liu, Yuan; Lam, Sik Lok
2017-11-16
The DNA minidumbbell (MDB) is a recently identified non-B structure. The reported MDBs contain two TTTA, CCTG, or CTTG type II loops. At present, the knowledge and understanding of the sequence criteria for MDB formation are still limited. In this study, we performed a systematic high-resolution nuclear magnetic resonance (NMR) and native gel study to investigate the effect of sequence variations in tandem repeats on the formation of MDBs. Our NMR results reveal the importance of hydrogen bonds, base-base stacking, and hydrophobic interactions from each of the participating residues. We conclude that in the MDBs formed by tandem repeats, C-G loop-closing base pairs are more stabilizing than T-A loop-closing base pairs, and thymine residues in both the second and third loop positions are more stabilizing than cytosine residues. The results from this study enrich our knowledge on the sequence criteria for the formation of MDBs, paving a path for better exploring their potential roles in biological systems and DNA nanotechnology.
2012-01-01
Background In the last 30 years, a number of DNA fingerprinting methods such as RFLP, RAPD, AFLP, SSR, DArT, have been extensively used in marker development for molecular plant breeding. However, it remains a daunting task to identify highly polymorphic and closely linked molecular markers for a target trait for molecular marker-assisted selection. The next-generation sequencing (NGS) technology is far more powerful than any existing generic DNA fingerprinting methods in generating DNA markers. In this study, we employed a grain legume crop Lupinus angustifolius (lupin) as a test case, and examined the utility of an NGS-based method of RAD (restriction-site associated DNA) sequencing as DNA fingerprinting for rapid, cost-effective marker development tagging a disease resistance gene for molecular breeding. Results Twenty informative plants from a cross of RxS (disease resistant x susceptible) in lupin were subjected to RAD single-end sequencing by multiplex identifiers. The entire RAD sequencing products were resolved in two lanes of the 16-lanes per run sequencing platform Solexa HiSeq2000. A total of 185 million raw reads, approximately 17 Gb of sequencing data, were collected. Sequence comparison among the 20 test plants discovered 8207 SNP markers. Filtration of DNA sequencing data with marker identification parameters resulted in the discovery of 38 molecular markers linked to the disease resistance gene Lanr1. Five randomly selected markers were converted into cost-effective, simple PCR-based markers. Linkage analysis using marker genotyping data and disease resistance phenotyping data on a F8 population consisting of 186 individual plants confirmed that all these five markers were linked to the R gene. Two of these newly developed sequence-specific PCR markers, AnSeq3 and AnSeq4, flanked the target R gene at a genetic distance of 0.9 centiMorgan (cM), and are now replacing the markers previously developed by a traditional DNA fingerprinting method for marker-assisted selection in the Australian national lupin breeding program. Conclusions We demonstrated that more than 30 molecular markers linked to a target gene of agronomic trait of interest can be identified from a small portion (1/8) of one sequencing run on HiSeq2000 by applying NGS based RAD sequencing in marker development. The markers developed by the strategy described in this study are all co-dominant SNP markers, which can readily be converted into high throughput multiplex format or low-cost, simple PCR-based markers desirable for large scale marker implementation in plant breeding programs. The high density and closely linked molecular markers associated with a target trait help to overcome a major bottleneck for implementation of molecular markers on a wide range of germplasm in breeding programs. We conclude that application of NGS based RAD sequencing as DNA fingerprinting is a very rapid and cost-effective strategy for marker development in molecular plant breeding. The strategy does not require any prior genome knowledge or molecular information for the species under investigation, and it is applicable to other plant species. PMID:22805587
Yang, Huaan; Tao, Ye; Zheng, Zequn; Li, Chengdao; Sweetingham, Mark W; Howieson, John G
2012-07-17
In the last 30 years, a number of DNA fingerprinting methods such as RFLP, RAPD, AFLP, SSR, DArT, have been extensively used in marker development for molecular plant breeding. However, it remains a daunting task to identify highly polymorphic and closely linked molecular markers for a target trait for molecular marker-assisted selection. The next-generation sequencing (NGS) technology is far more powerful than any existing generic DNA fingerprinting methods in generating DNA markers. In this study, we employed a grain legume crop Lupinus angustifolius (lupin) as a test case, and examined the utility of an NGS-based method of RAD (restriction-site associated DNA) sequencing as DNA fingerprinting for rapid, cost-effective marker development tagging a disease resistance gene for molecular breeding. Twenty informative plants from a cross of RxS (disease resistant x susceptible) in lupin were subjected to RAD single-end sequencing by multiplex identifiers. The entire RAD sequencing products were resolved in two lanes of the 16-lanes per run sequencing platform Solexa HiSeq2000. A total of 185 million raw reads, approximately 17 Gb of sequencing data, were collected. Sequence comparison among the 20 test plants discovered 8207 SNP markers. Filtration of DNA sequencing data with marker identification parameters resulted in the discovery of 38 molecular markers linked to the disease resistance gene Lanr1. Five randomly selected markers were converted into cost-effective, simple PCR-based markers. Linkage analysis using marker genotyping data and disease resistance phenotyping data on a F8 population consisting of 186 individual plants confirmed that all these five markers were linked to the R gene. Two of these newly developed sequence-specific PCR markers, AnSeq3 and AnSeq4, flanked the target R gene at a genetic distance of 0.9 centiMorgan (cM), and are now replacing the markers previously developed by a traditional DNA fingerprinting method for marker-assisted selection in the Australian national lupin breeding program. We demonstrated that more than 30 molecular markers linked to a target gene of agronomic trait of interest can be identified from a small portion (1/8) of one sequencing run on HiSeq2000 by applying NGS based RAD sequencing in marker development. The markers developed by the strategy described in this study are all co-dominant SNP markers, which can readily be converted into high throughput multiplex format or low-cost, simple PCR-based markers desirable for large scale marker implementation in plant breeding programs. The high density and closely linked molecular markers associated with a target trait help to overcome a major bottleneck for implementation of molecular markers on a wide range of germplasm in breeding programs. We conclude that application of NGS based RAD sequencing as DNA fingerprinting is a very rapid and cost-effective strategy for marker development in molecular plant breeding. The strategy does not require any prior genome knowledge or molecular information for the species under investigation, and it is applicable to other plant species.
Sastre, Natalia; Ravera, Ivan; Villanueva, Sergio; Altet, Laura; Bardagí, Mar; Sánchez, Armand; Francino, Olga; Ferrer, Lluís
2012-12-01
The historical classification of Demodex mites has been based on their hosts and morphological features. Genome sequencing has proved to be a very effective taxonomic tool in phylogenetic studies and has been applied in the classification of Demodex. Mitochondrial 16S rDNA has been demonstrated to be an especially useful marker to establish phylogenetic relationships. To amplify and sequence a segment of the mitochondrial 16S rDNA from Demodex canis and Demodex injai, as well as from the short-bodied mite called, unofficially, D. cornei and to determine their genetic proximity. Demodex mites were examined microscopically and classified as Demodex folliculorum (one sample), D. canis (four samples), D. injai (two samples) or the short-bodied species D. cornei (three samples). DNA was extracted, and a 338 bp fragment of the 16S rDNA was amplified and sequenced. The sequences of the four D. canis mites were identical and shared 99.6 and 97.3% identity with two D. canis sequences available at GenBank. The sequences of the D. cornei isolates were identical and showed 97.8, 98.2 and 99.6% identity with the D. canis isolates. The sequences of the two D. injai isolates were also identical and showed 76.6% identity with the D. canis sequence. Demodex canis and D. injai are two different species, with a genetic distance of 23.3%. It would seem that the short-bodied Demodex mite D. cornei is a morphological variant of D. canis. © 2012 The Authors. Veterinary Dermatology © 2012 ESVD and ACVD.
ERIC Educational Resources Information Center
Militello, Kevin T.
2013-01-01
Epigenetic inheritance is the inheritance of genetic information that is not based on DNA sequence alone. One type of epigenetic information that has come to the forefront in the last few years is modified DNA bases. The most common modified DNA base in nature is 5-methylcytosine. Herein, we describe a laboratory experiment that combines…
Stacked-unstacked equilibrium at the nick site of DNA.
Protozanova, Ekaterina; Yakovchuk, Peter; Frank-Kamenetskii, Maxim D
2004-09-17
Stability of duplex DNA with respect to separation of complementary strands is crucial for DNA executing its major functions in the cell and it also plays a central role in major biotechnology applications of DNA: DNA sequencing, polymerase chain reaction, and DNA microarrays. Two types of interaction are well known to contribute to DNA stability: stacking between adjacent base-pairs and pairing between complementary bases. However, their contribution into the duplex stability is yet to be determined. Now we fill this fundamental gap in our knowledge of the DNA double helix. We have prepared a series of 32, 300 bp-long DNA fragments with solitary nicks in the same position differing only in base-pairs flanking the nick. Electrophoretic mobility of these fragments in the gel has been studied. Assuming the equilibrium between stacked and unstacked conformations at the nick site, all 32 stacking free energy parameters have been obtained. Only ten of them are essential and they govern the stacking interactions between adjacent base-pairs in intact DNA double helix. A full set of DNA stacking parameters has been determined for the first time. From these data and from a well-known dependence of DNA melting temperature on G.C content, the contribution of base-pairing into duplex stability has been estimated. The obtained energy parameters of the DNA double helix are of paramount importance for understanding sequence-dependent DNA flexibility and for numerous biotechnology applications.
Amicarelli, Giulia; Adlerstein, Daniel; Shehi, Erlet; Wang, Fengfei; Makrigiorgos, G Mike
2006-10-01
Genotyping methods that reveal single-nucleotide differences are useful for a wide range of applications. We used digestion of 3-way DNA junctions in a novel technology, OneCutEventAmplificatioN (OCEAN) that allows sequence-specific signal generation and amplification. We combined OCEAN with peptide-nucleic-acid (PNA)-based variant enrichment to detect and simultaneously genotype v-Ki-ras2 Kirsten rat sarcoma viral oncogene homolog (KRAS) codon 12 sequence variants in human tissue specimens. We analyzed KRAS codon 12 sequence variants in 106 lung cancer surgical specimens. We conducted a PNA-PCR reaction that suppresses wild-type KRAS amplification and genotyped the product with a set of OCEAN reactions carried out in fluorescence microplate format. The isothermal OCEAN assay enabled a 3-way DNA junction to form between the specific target nucleic acid, a fluorescently labeled "amplifier", and an "anchor". The amplifier-anchor contact contains the recognition site for a restriction enzyme. Digestion produces a cleaved amplifier and generation of a fluorescent signal. The cleaved amplifier dissociates from the 3-way DNA junction, allowing a new amplifier to bind and propagate the reaction. The system detected and genotyped KRAS sequence variants down to approximately 0.3% variant-to-wild-type alleles. PNA-PCR/OCEAN had a concordance rate with PNA-PCR/sequencing of 93% to 98%, depending on the exact implementation. Concordance rate with restriction endonuclease-mediated selective-PCR/sequencing was 89%. OCEAN is a practical and low-cost novel technology for sequence-specific signal generation. Reliable analysis of KRAS sequence alterations in human specimens circumvents the requirement for sequencing. Application is expected in genotyping KRAS codon 12 sequence variants in surgical specimens or in bodily fluids, as well as single-base variations and sequence alterations in other genes.
Linear and Nonlinear Statistical Characterization of DNA
NASA Astrophysics Data System (ADS)
Norio Oiwa, Nestor; Goldman, Carla; Glazier, James
2002-03-01
We find spatial order in the distribution of protein-coding (including RNAs) and control segments of GenBank genomic sequences, irrespective of ATCG content. This is achieved by correlations, histograms, fractal dimensions and singularity spectra. Estimates of these quantities in complete nuclear genome indicate that coding sequences are long-range correlated and their disposition are self-similar (multifractal) for eukaryotes. These characteristics are absent in prokaryotes, where there are few noncoding sequences, suggesting the `junk' DNA play a relevant role to the genome structure and function. Concerning the genetic message of ATCG sequences, we build a random walk (Levy flight), using DNA symmetry arguments, where we associate A, T, C and G as left, right, down and up steps, respectively. Nonlinear analysis of mitochondrial DNA walks reveal multifractal pattern based on palindromic sequences, which fold in hairpins and loops.
Biological nanopore MspA for DNA sequencing
NASA Astrophysics Data System (ADS)
Manrao, Elizabeth A.
Unlocking the information hidden in the human genome provides insight into the inner workings of complex biological systems and can be used to greatly improve health-care. In order to allow for widespread sequencing, new technologies are required that provide fast and inexpensive readings of DNA. Nanopore sequencing is a third generation DNA sequencing technology that is currently being developed to fulfill this need. In nanopore sequencing, a voltage is applied across a small pore in an electrolyte solution and the resulting ionic current is recorded. When DNA passes through the channel, the ionic current is partially blocked. If the DNA bases uniquely modulate the ionic current flowing through the channel, the time trace of the current can be related to the sequence of DNA passing through the pore. There are two main challenges to realizing nanopore sequencing: identifying a pore with sensitivity to single nucleotides and controlling the translocation of DNA through the pore so that the small single nucleotide current signatures are distinguishable from background noise. In this dissertation, I explore the use of Mycobacterium smegmatis porin A (MspA) for nanopore sequencing. In order to determine MspA's sensitivity to single nucleotides, DNA strands of various compositions are held in the pore as the resulting ionic current is measured. DNA is immobilized in MspA by attaching it to a large molecule which acts as an anchor. This technique confirms the single nucleotide resolution of the pore and additionally shows that MspA is sensitive to epigenetic modifications and single nucleotide polymorphisms. The forces from the electric field within MspA, the effective charge of nucleotides, and elasticity of DNA are estimated using a Freely Jointed Chain model of single stranded DNA. These results offer insight into the interactions of DNA within the pore. With the nucleotide sensitivity of MspA confirmed, a method is introduced to controllably pass DNA through the pore. Using a DNA polymerase, DNA strands are stepped through MspA one nucleotide at a time. The steps are observable as distinct levels on the ionic-current time-trace and are related to the DNA sequence. These experiments overcome the two fundamental challenges to realizing MspA nanopore sequencing and pave the way to the development of a commercial technology.
Chiron: translating nanopore raw signal directly into nucleotide sequence using deep learning.
Teng, Haotian; Cao, Minh Duc; Hall, Michael B; Duarte, Tania; Wang, Sheng; Coin, Lachlan J M
2018-05-01
Sequencing by translocating DNA fragments through an array of nanopores is a rapidly maturing technology that offers faster and cheaper sequencing than other approaches. However, accurately deciphering the DNA sequence from the noisy and complex electrical signal is challenging. Here, we report Chiron, the first deep learning model to achieve end-to-end basecalling and directly translate the raw signal to DNA sequence without the error-prone segmentation step. Trained with only a small set of 4,000 reads, we show that our model provides state-of-the-art basecalling accuracy, even on previously unseen species. Chiron achieves basecalling speeds of more than 2,000 bases per second using desktop computer graphics processing units.
PMS2 gene mutational analysis: direct cDNA sequencing to circumvent pseudogene interference.
Wimmer, Katharina; Wernstedt, Annekatrin
2014-01-01
The presence of highly homologous pseudocopies can compromise the mutation analysis of a gene of interest. In particular, when using PCR-based strategies, pseudogene co-amplification has to be effectively prevented. This is often achieved by using primers designed to be parental gene specific according to the reference sequence and by applying stringent PCR conditions. However, there are cases in which this approach is of limited utility. For example, it has been shown that the PMS2 gene exchanges sequences with one of its pseudogenes, named PMS2CL. This results in functional PMS2 alleles containing pseudogene-derived sequences at their 3'-end and in nonfunctional PMS2CL pseudogene alleles that contain gene-derived sequences. Hence, the paralogues cannot be distinguished according to the reference sequence. This shortcoming can be effectively circumvented by using direct cDNA sequencing. This approach is based on the selective amplification of PMS2 transcripts in two overlapping 1.6-kb RT-PCR products. In addition to avoiding pseudogene co-amplification and allele dropout, this method has also the advantage that it allows to effectively identify deletions, splice mutations, and de novo retrotransposon insertions that escape the detection of most DNA-based mutation analysis protocols.
BLAST and FASTA similarity searching for multiple sequence alignment.
Pearson, William R
2014-01-01
BLAST, FASTA, and other similarity searching programs seek to identify homologous proteins and DNA sequences based on excess sequence similarity. If two sequences share much more similarity than expected by chance, the simplest explanation for the excess similarity is common ancestry-homology. The most effective similarity searches compare protein sequences, rather than DNA sequences, for sequences that encode proteins, and use expectation values, rather than percent identity, to infer homology. The BLAST and FASTA packages of sequence comparison programs provide programs for comparing protein and DNA sequences to protein databases (the most sensitive searches). Protein and translated-DNA comparisons to protein databases routinely allow evolutionary look back times from 1 to 2 billion years; DNA:DNA searches are 5-10-fold less sensitive. BLAST and FASTA can be run on popular web sites, but can also be downloaded and installed on local computers. With local installation, target databases can be customized for the sequence data being characterized. With today's very large protein databases, search sensitivity can also be improved by searching smaller comprehensive databases, for example, a complete protein set from an evolutionarily neighboring model organism. By default, BLAST and FASTA use scoring strategies target for distant evolutionary relationships; for comparisons involving short domains or queries, or searches that seek relatively close homologs (e.g. mouse-human), shallower scoring matrices will be more effective. Both BLAST and FASTA provide very accurate statistical estimates, which can be used to reliably identify protein sequences that diverged more than 2 billion years ago.
Liu, Guo-Hua; Li, Chun; Li, Jia-Yuan; Zhou, Dong-Hui; Xiong, Rong-Chuan; Lin, Rui-Qing; Zou, Feng-Cai; Zhu, Xing-Quan
2012-01-01
Sparganosis, caused by the plerocercoid larvae of members of the genus Spirometra, can cause significant public health problem and considerable economic losses. In the present study, the complete mitochondrial DNA (mtDNA) sequence of Spirometra erinaceieuropaei from China was determined, characterized and compared with that of S. erinaceieuropaei from Japan. The gene arrangement in the mt genome sequences of S. erinaceieuropaei from China and Japan is identical. The identity of the mt genomes was 99.1% between S. erinaceieuropaei from China and Japan, and the complete mtDNA sequence of S. erinaceieuropaei from China is slightly shorter (2 bp) than that from Japan. Phylogenetic analysis of S. erinaceieuropaei with other representative cestodes using two different computational algorithms [Bayesian inference (BI) and maximum likelihood (ML)] based on concatenated amino acid sequences of 12 protein-coding genes, revealed that S. erinaceieuropaei is closely related to Diphyllobothrium spp., supporting classification based on morphological features. The present study determined the complete mtDNA sequences of S. erinaceieuropaei from China that provides novel genetic markers for studying the population genetics and molecular epidemiology of S. erinaceieuropaei in humans and animals. PMID:22553464
Tramontano, A; Macchiato, M F
1986-01-01
An algorithm to determine the probability that a reading frame codifies for a protein is presented. It is based on the results of our previous studies on the thermodynamic characteristics of a translated reading frame. We also develop a prediction procedure to distinguish between coding and non-coding reading frames. The procedure is based on the characteristics of the putative product of the DNA sequence and not on periodicity characteristics of the sequence, so the prediction is not biased by the presence of overlapping translated reading frames or by the presence of translated reading frames on the complementary DNA strand. PMID:3753761
NASA Technical Reports Server (NTRS)
Nordheim, A.; Rich, A.
1983-01-01
Three 8-base pair (bp) segments of alternating purine-pyrimidine from the simian virus 40 enhancer region form Z-DNA on negative supercoiling; minichromosome DNase I-hypersensitive sites determined by others bracket these three segments. A survey of transcriptional enhancer sequences reveals a pattern of potential Z-DNA-forming regions which occur in pairs 50-80 bp apart. This may influence local chromatin structure and may be related to transcriptional activation.
Yang, Xiang; Yang, Ke; Zhao, Xiang; Lin, Zhongquan; Liu, Zhiyong; Luo, Sha; Zhang, Yang; Wang, Yunxia; Fu, Weiling
2017-12-04
The demand for rapid and sensitive bacterial detection is continuously increasing due to the significant requirements of various applications. In this study, a terahertz (THz) biosensor based on rolling circle amplification (RCA) was developed for the isothermal detection of bacterial DNA. The synthetic bacterium-specific sequence of 16S rDNA hybridized with a padlock probe (PLP) that contains a sequence fully complementary to the target sequence at the 5' and 3' ends. The linear PLP was circularized by ligation to form a circular PLP upon recognition of the target sequence; then the capture probe (CP) immobilized on magnetic beads (MBs) acted as a primer to initialize RCA. As DNA molecules are much less absorptive than water molecules in the THz range, the RCA products on the surface of the MBs cause a significant decrease in THz absorption, which can be sensitively probed by THz spectroscopy. Our results showed that 0.12 fmol of synthetic bacterial DNA and 0.05 ng μL -1 of genomic DNA could be effectively detected using this assay. In addition, the specificity of this strategy was demonstrated by its low signal response to interfering bacteria. The proposed strategy not only represents a new method for the isothermal detection of the target bacterial DNA but also provides a general methodology for sensitive and specific DNA biosensing using THz spectroscopy.
Biswas, Kristi; Taylor, Michael W.; Gear, Kim
2017-01-01
The application of high-throughput, next-generation sequencing technologies has greatly improved our understanding of the human oral microbiome. While deciphering this diverse microbial community using such approaches is more accurate than traditional culture-based methods, experimental bias introduced during critical steps such as DNA extraction may compromise the results obtained. Here, we systematically evaluate four commonly used microbial DNA extraction methods (MoBio PowerSoil® DNA Isolation Kit, QIAamp® DNA Mini Kit, Zymo Bacterial/Fungal DNA Mini PrepTM, phenol:chloroform-based DNA isolation) based on the following criteria: DNA quality and yield, and microbial community structure based on Illumina amplicon sequencing of the V3–V4 region of the 16S rRNA gene of bacteria and the internal transcribed spacer (ITS) 1 region of fungi. Our results indicate that DNA quality and yield varied significantly with DNA extraction method. Representation of bacterial genera in plaque and saliva samples did not significantly differ across DNA extraction methods and DNA extraction method showed no effect on the recovery of fungal genera from plaque. By contrast, fungal diversity from saliva was affected by DNA extraction method, suggesting that not all protocols are suitable to study the salivary mycobiome. PMID:28099455
Ahmed, Ikhlak; Sarazin, Alexis; Bowler, Chris; Colot, Vincent; Quesneville, Hadi
2011-09-01
Transposable elements (TEs) and their relics play major roles in genome evolution. However, mobilization of TEs is usually deleterious and strongly repressed. In plants and mammals, this repression is typically associated with DNA methylation, but the relationship between this epigenetic mark and TE sequences has not been investigated systematically. Here, we present an improved annotation of TE sequences and use it to analyze genome-wide DNA methylation maps obtained at single-nucleotide resolution in Arabidopsis. We show that although the majority of TE sequences are methylated, ∼26% are not. Moreover, a significant fraction of TE sequences densely methylated at CG, CHG and CHH sites (where H = A, T or C) have no or few matching small interfering RNA (siRNAs) and are therefore unlikely to be targeted by the RNA-directed DNA methylation (RdDM) machinery. We provide evidence that these TE sequences acquire DNA methylation through spreading from adjacent siRNA-targeted regions. Further, we show that although both methylated and unmethylated TE sequences located in euchromatin tend to be more abundant closer to genes, this trend is least pronounced for methylated, siRNA-targeted TE sequences located 5' to genes. Based on these and other findings, we propose that spreading of DNA methylation through promoter regions explains at least in part the negative impact of siRNA-targeted TE sequences on neighboring gene expression.
Sequence-dependent response of DNA to torsional stress: a potential biological regulation mechanism.
Reymer, Anna; Zakrzewska, Krystyna; Lavery, Richard
2018-02-28
Torsional restraints on DNA change in time and space during the life of the cell and are an integral part of processes such as gene expression, DNA repair and packaging. The mechanical behavior of DNA under torsional stress has been studied on a mesoscopic scale, but little is known concerning its response at the level of individual base pairs and the effects of base pair composition. To answer this question, we have developed a geometrical restraint that can accurately control the total twist of a DNA segment during all-atom molecular dynamics simulations. By applying this restraint to four different DNA oligomers, we are able to show that DNA responds to both under- and overtwisting in a very heterogeneous manner. Certain base pair steps, in specific sequence environments, are able to absorb most of the torsional stress, leaving other steps close to their relaxed conformation. This heterogeneity also affects the local torsional modulus of DNA. These findings suggest that modifying torsional stress on DNA could act as a modulator for protein binding via the heterogeneous changes in local DNA structure.
Sequence-dependent response of DNA to torsional stress: a potential biological regulation mechanism
Reymer, Anna; Zakrzewska, Krystyna; Lavery, Richard
2018-01-01
Abstract Torsional restraints on DNA change in time and space during the life of the cell and are an integral part of processes such as gene expression, DNA repair and packaging. The mechanical behavior of DNA under torsional stress has been studied on a mesoscopic scale, but little is known concerning its response at the level of individual base pairs and the effects of base pair composition. To answer this question, we have developed a geometrical restraint that can accurately control the total twist of a DNA segment during all-atom molecular dynamics simulations. By applying this restraint to four different DNA oligomers, we are able to show that DNA responds to both under- and overtwisting in a very heterogeneous manner. Certain base pair steps, in specific sequence environments, are able to absorb most of the torsional stress, leaving other steps close to their relaxed conformation. This heterogeneity also affects the local torsional modulus of DNA. These findings suggest that modifying torsional stress on DNA could act as a modulator for protein binding via the heterogeneous changes in local DNA structure. PMID:29267977
Secondary structure prediction and structure-specific sequence analysis of single-stranded DNA.
Dong, F; Allawi, H T; Anderson, T; Neri, B P; Lyamichev, V I
2001-08-01
DNA sequence analysis by oligonucleotide binding is often affected by interference with the secondary structure of the target DNA. Here we describe an approach that improves DNA secondary structure prediction by combining enzymatic probing of DNA by structure-specific 5'-nucleases with an energy minimization algorithm that utilizes the 5'-nuclease cleavage sites as constraints. The method can identify structural differences between two DNA molecules caused by minor sequence variations such as a single nucleotide mutation. It also demonstrates the existence of long-range interactions between DNA regions separated by >300 nt and the formation of multiple alternative structures by a 244 nt DNA molecule. The differences in the secondary structure of DNA molecules revealed by 5'-nuclease probing were used to design structure-specific probes for mutation discrimination that target the regions of structural, rather than sequence, differences. We also demonstrate the performance of structure-specific 'bridge' probes complementary to non-contiguous regions of the target molecule. The structure-specific probes do not require the high stringency binding conditions necessary for methods based on mismatch formation and permit mutation detection at temperatures from 4 to 37 degrees C. Structure-specific sequence analysis is applied for mutation detection in the Mycobacterium tuberculosis katG gene and for genotyping of the hepatitis C virus.
Buchmueller, Karen L; Staples, Andrew M; Howard, Cameron M; Horick, Sarah M; Uthe, Peter B; Le, N Minh; Cox, Kari K; Nguyen, Binh; Pacheco, Kimberly A O; Wilson, W David; Lee, Moses
2005-01-19
Pyrrole (Py) and imidazole (Im) polyamides can be designed to target specific DNA sequences. The effect that the pyrrole and imidazole arrangement, plus DNA sequence, have on sequence specificity and binding affinity has been investigated using DNA melting (DeltaT(M)), circular dichroism (CD), and surface plasmon resonance (SPR) studies. SPR results obtained from a complete set of triheterocyclic polyamides show a dramatic difference in the affinity of f-ImPyIm for its cognate DNA (K(eq) = 1.9 x 10(8) M(-1)) and f-PyPyIm for its cognate DNA (K(eq) = 5.9 x 10(5) M(-1)), which could not have been anticipated prior to characterization of these compounds. Moreover, f-ImPyIm has a 10-fold greater affinity for CGCG than distamycin A has for its cognate, AATT. To understand this difference, the triamide dimers are divided into two structural groupings: central and terminal pairings. The four possible central pairings show decreasing selectivity and affinity for their respective cognate sequences: -ImPy > -PyPy- > -PyIm- approximately -ImIm-. These results extend the language of current design motifs for polyamide sequence recognition to include the use of "words" for recognizing two adjacent base pairs, rather than "letters" for binding to single base pairs. Thus, polyamides designed to target Watson-Crick base pairs should utilize the strength of -ImPy- and -PyPy- central pairings. The f/Im and f/Py terminal groups yielded no advantage for their respective C/G or T/A base pairs. The exception is with the -ImPy- central pairing, for which f/Im has a 10-fold greater affinity for C/G than f/Py has for T/A.
Scherer, N M; Basso, D M
2008-09-16
DNATagger is a web-based tool for coloring and editing DNA, RNA and protein sequences and alignments. It is dedicated to the visualization of protein coding sequences and also protein sequence alignments to facilitate the comprehension of evolutionary processes in sequence analysis. The distinctive feature of DNATagger is the use of codons as informative units for coloring DNA and RNA sequences. The codons are colored according to their corresponding amino acids. It is the first program that colors codons in DNA sequences without being affected by "out-of-frame" gaps of alignments. It can handle single gaps and gaps inside the triplets. The program also provides the possibility to edit the alignments and change color patterns and translation tables. DNATagger is a JavaScript application, following the W3C guidelines, designed to work on standards-compliant web browsers. It therefore requires no installation and is platform independent. The web-based DNATagger is available as free and open source software at http://www.inf.ufrgs.br/~dmbasso/dnatagger/.
An Alu-based, MGB Eclipse real-time PCR method for quantitation of human DNA in forensic samples.
Nicklas, Janice A; Buel, Eric
2005-09-01
The forensic community needs quick, reliable methods to quantitate human DNA in crime scene samples to replace the laborious and imprecise slot blot method. A real-time PCR based method has the possibility of allowing development of a faster and more quantitative assay. Alu sequences are primate-specific and are found in many copies in the human genome, making these sequences an excellent target or marker for human DNA. This paper describes the development of a real-time Alu sequence-based assay using MGB Eclipse primers and probes. The advantages of this assay are simplicity, speed, less hands-on-time and automated quantitation, as well as a large dynamic range (128 ng/microL to 0.5 pg/microL).
Farjami, Elaheh; Clima, Lilia; Gothelf, Kurt V; Ferapontova, Elena E
2010-06-01
A DNA molecular beacon approach was used for the analysis of interactions between DNA and Methylene Blue (MB) as a redox indicator of a hybridization event. DNA hairpin structures of different length and guanine (G) content were immobilized onto gold electrodes in their folded states through the alkanethiol linker at the 5'-end. Binding of MB to the folded hairpin DNA was electrochemically studied and compared with binding to the duplex structure formed by hybridization of the hairpin DNA to a complementary DNA strand. Variation of the electrochemical signal from the DNA-MB complex was shown to depend primarily on the DNA length and sequence used: the G-C base pairs were the preferential sites of MB binding in the duplex. For short 20 nts long DNA sequences, the increased electrochemical response from MB bound to the duplex structure was consistent with the increased amount of bound and electrochemically readable MB molecules (i.e. MB molecules that are available for the electron transfer (ET) reaction with the electrode). With longer DNA sequences, the balance between the amounts of the electrochemically readable MB molecules bound to the hairpin DNA and to the hybrid was opposite: a part of the MB molecules bound to the long-sequence DNA duplex seem to be electrochemically mute due to long ET distance. The increasing electrochemical response from MB bound to the short-length DNA hybrid contrasts with the decreasing signal from MB bound to the long-length DNA hybrid and allows an "off"-"on" genosensor development.
1998-12-01
Type II restriction enzymes, such as Eco R1 endonulease, present a unique advantage for the study of sequence-specific recognition because they leave a record of where they have been in the form of the cleaved ends of the DNA sites where they were bound. The differential behavior of a sequence -specific protein at sites of differing base sequence is the essence of the sequence-specificity; the core question is how do these proteins discriminate between different DNA sequences especially when the two sequences are very similar. Principal Investigator: Dan Carter/New Century Pharmaceuticals
Assessing the utility of the Oxford Nanopore MinION for snake venom gland cDNA sequencing.
Hargreaves, Adam D; Mulley, John F
2015-01-01
Portable DNA sequencers such as the Oxford Nanopore MinION device have the potential to be truly disruptive technologies, facilitating new approaches and analyses and, in some cases, taking sequencing out of the lab and into the field. However, the capabilities of these technologies are still being revealed. Here we show that single-molecule cDNA sequencing using the MinION accurately characterises venom toxin-encoding genes in the painted saw-scaled viper, Echis coloratus. We find the raw sequencing error rate to be around 12%, improved to 0-2% with hybrid error correction and 3% with de novo error correction. Our corrected data provides full coding sequences and 5' and 3' UTRs for 29 of 33 candidate venom toxins detected, far superior to Illumina data (13/40 complete) and Sanger-based ESTs (15/29). We suggest that, should the current pace of improvement continue, the MinION will become the default approach for cDNA sequencing in a variety of species.
Assessing the utility of the Oxford Nanopore MinION for snake venom gland cDNA sequencing
Hargreaves, Adam D.
2015-01-01
Portable DNA sequencers such as the Oxford Nanopore MinION device have the potential to be truly disruptive technologies, facilitating new approaches and analyses and, in some cases, taking sequencing out of the lab and into the field. However, the capabilities of these technologies are still being revealed. Here we show that single-molecule cDNA sequencing using the MinION accurately characterises venom toxin-encoding genes in the painted saw-scaled viper, Echis coloratus. We find the raw sequencing error rate to be around 12%, improved to 0–2% with hybrid error correction and 3% with de novo error correction. Our corrected data provides full coding sequences and 5′ and 3′ UTRs for 29 of 33 candidate venom toxins detected, far superior to Illumina data (13/40 complete) and Sanger-based ESTs (15/29). We suggest that, should the current pace of improvement continue, the MinION will become the default approach for cDNA sequencing in a variety of species. PMID:26623194
Sequencing to Station in 12 Months (Targeting Orbital 5 Launch, March 30th)
NASA Technical Reports Server (NTRS)
Smith, David J.; Burton, Aaron Steven
2015-01-01
The Biomolecule Sequencer is a Commercial Off-The-Shelf device developed by Oxford Nanopore Technologies and implements a method of DNA sequencing unlike any other current sequencers. The device measures changes in electrical current through a nanopore depending on the sequence of the DNA strand that is passing through it. Since the technology is built on nanometer-scale ion pores, the hardware itself is exceptionally small (3 x 1 x 58 inches), lightweight (less than 120 grams with USB cable), and powered only by a USB connection. The sequencing device is permanent, while the flow cells, to which the samples are added, are periodically replaced. The goal of our upcoming technology demonstration on ISS is to provide evidence that DNA sequencing in space is possible, which holds the exciting potential to enable the identification of microorganisms, monitor changes in microbes and humans in response to spaceflight, and possibly aid in the detection of DNA-based life elsewhere in the universe.
To Clone or Not To Clone: Method Analysis for Retrieving Consensus Sequences In Ancient DNA Samples
Winters, Misa; Barta, Jodi Lynn; Monroe, Cara; Kemp, Brian M.
2011-01-01
The challenges associated with the retrieval and authentication of ancient DNA (aDNA) evidence are principally due to post-mortem damage which makes ancient samples particularly prone to contamination from “modern” DNA sources. The necessity for authentication of results has led many aDNA researchers to adopt methods considered to be “gold standards” in the field, including cloning aDNA amplicons as opposed to directly sequencing them. However, no standardized protocol has emerged regarding the necessary number of clones to sequence, how a consensus sequence is most appropriately derived, or how results should be reported in the literature. In addition, there has been no systematic demonstration of the degree to which direct sequences are affected by damage or whether direct sequencing would provide disparate results from a consensus of clones. To address this issue, a comparative study was designed to examine both cloned and direct sequences amplified from ∼3,500 year-old ancient northern fur seal DNA extracts. Majority rules and the Consensus Confidence Program were used to generate consensus sequences for each individual from the cloned sequences, which exhibited damage at 31 of 139 base pairs across all clones. In no instance did the consensus of clones differ from the direct sequence. This study demonstrates that, when appropriate, cloning need not be the default method, but instead, should be used as a measure of authentication on a case-by-case basis, especially when this practice adds time and cost to studies where it may be superfluous. PMID:21738625
2017-01-01
Amplicon (targeted) sequencing by massively parallel sequencing (PCR-MPS) is a potential method for use in forensic DNA analyses. In this application, PCR-MPS may supplement or replace other instrumental analysis methods such as capillary electrophoresis and Sanger sequencing for STR and mitochondrial DNA typing, respectively. PCR-MPS also may enable the expansion of forensic DNA analysis methods to include new marker systems such as single nucleotide polymorphisms (SNPs) and insertion/deletions (indels) that currently are assayable using various instrumental analysis methods including microarray and quantitative PCR. Acceptance of PCR-MPS as a forensic method will depend in part upon developing protocols and criteria that define the limitations of a method, including a defensible analytical threshold or method detection limit. This paper describes an approach to establish objective analytical thresholds suitable for multiplexed PCR-MPS methods. A definition is proposed for PCR-MPS method background noise, and an analytical threshold based on background noise is described. PMID:28542338
Young, Brian; King, Jonathan L; Budowle, Bruce; Armogida, Luigi
2017-01-01
Amplicon (targeted) sequencing by massively parallel sequencing (PCR-MPS) is a potential method for use in forensic DNA analyses. In this application, PCR-MPS may supplement or replace other instrumental analysis methods such as capillary electrophoresis and Sanger sequencing for STR and mitochondrial DNA typing, respectively. PCR-MPS also may enable the expansion of forensic DNA analysis methods to include new marker systems such as single nucleotide polymorphisms (SNPs) and insertion/deletions (indels) that currently are assayable using various instrumental analysis methods including microarray and quantitative PCR. Acceptance of PCR-MPS as a forensic method will depend in part upon developing protocols and criteria that define the limitations of a method, including a defensible analytical threshold or method detection limit. This paper describes an approach to establish objective analytical thresholds suitable for multiplexed PCR-MPS methods. A definition is proposed for PCR-MPS method background noise, and an analytical threshold based on background noise is described.
Entropic fluctuations in DNA sequences
NASA Astrophysics Data System (ADS)
Thanos, Dimitrios; Li, Wentian; Provata, Astero
2018-03-01
The Local Shannon Entropy (LSE) in blocks is used as a complexity measure to study the information fluctuations along DNA sequences. The LSE of a DNA block maps the local base arrangement information to a single numerical value. It is shown that despite this reduction of information, LSE allows to extract meaningful information related to the detection of repetitive sequences in whole chromosomes and is useful in finding evolutionary differences between organisms. More specifically, large regions of tandem repeats, such as centromeres, can be detected based on their low LSE fluctuations along the chromosome. Furthermore, an empirical investigation of the appropriate block sizes is provided and the relationship of LSE properties with the structure of the underlying repetitive units is revealed by using both computational and mathematical methods. Sequence similarity between the genomic DNA of closely related species also leads to similar LSE values at the orthologous regions. As an application, the LSE covariance function is used to measure the evolutionary distance between several primate genomes.
NASA Astrophysics Data System (ADS)
Zhou, Hong; Zhang, Zhinan; Chen, Haiyan; Sun, Renhua; Wang, Hui; Guo, Lei; Pan, Haijian
2010-07-01
In this study, we integrated a DNA barcoding project with an ecological survey on intertidal polychaete communities and investigated the utility of CO1 gene sequence as a DNA barcode for the classification of the intertidal polychaetes. Using 16S rDNA as a complementary marker and combining morphological and ecological characterization, some of dominant and common polychaete species from Chinese coasts were assessed for their taxonomic status. We obtained 22 haplotype gene sequences of 13 taxa, including 10 CO1 sequences and 12 16S rDNA sequences. Based on intra- and inter-specific distances, we built phylogenetic trees using the neighbor-joining method. Our study suggested that the mitochondrial CO1 gene was a valid DNA barcoding marker for species identification in polychaetes, but other genes, such as 16S rDNA, could be used as a complementary genetic marker. For more accurate species identification and effective testing of species hypothesis, DNA barcoding should be incorporated with morphological, ecological, biogeographical, and phylogenetic information. The application of DNA barcoding and molecular identification in the ecological survey on the intertidal polychaete communities demonstrated the feasibility of integrating DNA taxonomy and ecology.
MSuPDA: A Memory Efficient Algorithm for Sequence Alignment.
Khan, Mohammad Ibrahim; Kamal, Md Sarwar; Chowdhury, Linkon
2016-03-01
Space complexity is a million dollar question in DNA sequence alignments. In this regard, memory saving under pushdown automata can help to reduce the occupied spaces in computer memory. Our proposed process is that anchor seed (AS) will be selected from given data set of nucleotide base pairs for local sequence alignment. Quick splitting techniques will separate the AS from all the DNA genome segments. Selected AS will be placed to pushdown automata's (PDA) input unit. Whole DNA genome segments will be placed into PDA's stack. AS from input unit will be matched with the DNA genome segments from stack of PDA. Match, mismatch and indel of nucleotides will be popped from the stack under the control unit of pushdown automata. During the POP operation on stack, it will free the memory cell occupied by the nucleotide base pair.
High-resolution biophysical analysis of the dynamics of nucleosome formation
Hatakeyama, Akiko; Hartmann, Brigitte; Travers, Andrew; Nogues, Claude; Buckle, Malcolm
2016-01-01
We describe a biophysical approach that enables changes in the structure of DNA to be followed during nucleosome formation in in vitro reconstitution with either the canonical “Widom” sequence or a judiciously mutated sequence. The rapid non-perturbing photochemical analysis presented here provides ‘snapshots’ of the DNA configuration at any given moment in time during nucleosome formation under a very broad range of reaction conditions. Changes in DNA photochemical reactivity upon protein binding are interpreted as being mainly induced by alterations in individual base pair roll angles. The results strengthen the importance of the role of an initial (H3/H4)2 histone tetramer-DNA interaction and highlight the modulation of this early event by the DNA sequence. (H3/H4)2 binding precedes and dictates subsequent H2A/H2B-DNA interactions, which are less affected by the DNA sequence, leading to the final octameric nucleosome. Overall, our results provide a novel, exciting way to investigate those biophysical properties of DNA that constitute a crucial component in nucleosome formation and stabilization. PMID:27263658
Potapov, Vladimir; Ong, Jennifer L; Langhorst, Bradley W; Bilotti, Katharina; Cahoon, Dan; Canton, Barry; Knight, Thomas F; Evans, Thomas C; Lohman, Gregory Js
2018-05-08
DNA ligases are key enzymes in molecular and synthetic biology that catalyze the joining of breaks in duplex DNA and the end-joining of DNA fragments. Ligation fidelity (discrimination against the ligation of substrates containing mismatched base pairs) and bias (preferential ligation of particular sequences over others) have been well-studied in the context of nick ligation. However, almost no data exist for fidelity and bias in end-joining ligation contexts. In this study, we applied Pacific Biosciences Single-Molecule Real-Time sequencing technology to directly sequence the products of a highly multiplexed ligation reaction. This method has been used to profile the ligation of all three-base 5'-overhangs by T4 DNA ligase under typical ligation conditions in a single experiment. We report the relative frequency of all ligation products with or without mismatches, the position-dependent frequency of each mismatch, and the surprising observation that 5'-TNA overhangs ligate extremely inefficiently compared to all other Watson-Crick pairings. The method can easily be extended to profile other ligases, end-types (e.g. blunt ends and overhangs of different lengths), and the effect of adjacent sequence on the ligation results. Further, the method has the potential to provide new insights into the thermodynamics of annealing and the kinetics of end-joining reactions.
Understanding the mechanisms of protein-DNA interactions
NASA Astrophysics Data System (ADS)
Lavery, Richard
2004-03-01
Structural, biochemical and thermodynamic data on protein-DNA interactions show that specific recognition cannot be reduced to a simple set of binary interactions between the partners (such as hydrogen bonds, ion pairs or steric contacts). The mechanical properties of the partners also play a role and, in the case of DNA, variations in both conformation and flexibility as a function of base sequence can be a significant factor in guiding a protein to the correct binding site. All-atom molecular modeling offers a means of analyzing the role of different binding mechanisms within protein-DNA complexes of known structure. This however requires estimating the binding strengths for the full range of sequences with which a given protein can interact. Since this number grows exponentially with the length of the binding site it is necessary to find a method to accelerate the calculations. We have achieved this by using a multi-copy approach (ADAPT) which allows us to build a DNA fragment with a variable base sequence. The results obtained with this method correlate well with experimental consensus binding sequences. They enable us to show that indirect recognition mechanisms involving the sequence dependent properties of DNA play a significant role in many complexes. This approach also offers a means of predicting protein binding sites on the basis of binding energies, which is complementary to conventional lexical techniques.
Wickramaarachchi, W A R T; Shankarappa, K S; Rangaswamy, K T; Maruthi, M N; Rajapakse, R G A S; Ghosh, Saptarshi
2016-06-01
Bunchy top disease of banana caused by Banana bunchy top virus (BBTV, genus Babuvirus family Nanoviridae) is one of the most important constraints in production of banana in the different parts of the world. Six genomic DNA components of BBTV isolate from Kandy, Sri Lanka (BBTV-K) were amplified by polymerase chain reaction (PCR) with specific primers using total DNA extracted from banana tissues showing typical symptoms of bunchy top disease. The amplicons were of expected size of 1.0-1.1 kb, which were cloned and sequenced. Analysis of sequence data revealed the presence of six DNA components; DNA-R, DNA-U3, DNA-S, DNA-N, DNA-M and DNA-C for Sri Lanka isolate. Comparisons of sequence data of DNA components followed by the phylogenetic analysis, grouped Sri Lanka-(Kandy) isolate in the Pacific Indian Oceans (PIO) group. Sri Lanka-(Kandy) isolate of BBTV is classified a new member of PIO group based on analysis of six components of the virus.
Fluorescence-tunable Ag-DNA biosensor with tailored cytotoxicity for live-cell applications
NASA Astrophysics Data System (ADS)
Bossert, Nelli; de Bruin, Donny; Götz, Maria; Bouwmeester, Dirk; Heinrich, Doris
2016-11-01
DNA-stabilized silver clusters (Ag-DNA) show excellent promise as a multi-functional nanoagent for molecular investigations in living cells. The unique properties of these fluorescent nanomaterials allow for intracellular optical sensors with tunable cytotoxicity based on simple modifications of the DNA sequences. Three Ag-DNA nanoagent designs are investigated, exhibiting optical responses to the intracellular environments and sensing-capability of ions, functional inside living cells. Their sequence-dependent fluorescence responses inside living cells include (1) a strong splitting of the fluorescence peak for a DNA hairpin construct, (2) an excitation and emission shift of up to 120 nm for a single-stranded DNA construct, and (3) a sequence robust in fluorescence properties. Additionally, the cytotoxicity of these Ag-DNA constructs is tunable, ranging from highly cytotoxic to biocompatible Ag-DNA, independent of their optical sensing capability. Thus, Ag-DNA represents a versatile live-cell nanoagent addressable towards anti-cancer, patient-specific and anti-bacterial applications.
Hirsch, B; Endris, V; Lassmann, S; Weichert, W; Pfarr, N; Schirmacher, P; Kovaleva, V; Werner, M; Bonzheim, I; Fend, F; Sperveslage, J; Kaulich, K; Zacher, A; Reifenberger, G; Köhrer, K; Stepanow, S; Lerke, S; Mayr, T; Aust, D E; Baretton, G; Weidner, S; Jung, A; Kirchner, T; Hansmann, M L; Burbat, L; von der Wall, E; Dietel, M; Hummel, M
2018-04-01
The simultaneous detection of multiple somatic mutations in the context of molecular diagnostics of cancer is frequently performed by means of amplicon-based targeted next-generation sequencing (NGS). However, only few studies are available comparing multicenter testing of different NGS platforms and gene panels. Therefore, seven partner sites of the German Cancer Consortium (DKTK) performed a multicenter interlaboratory trial for targeted NGS using the same formalin-fixed, paraffin-embedded (FFPE) specimen of molecularly pre-characterized tumors (n = 15; each n = 5 cases of Breast, Lung, and Colon carcinoma) and a colorectal cancer cell line DNA dilution series. Detailed information regarding pre-characterized mutations was not disclosed to the partners. Commercially available and custom-designed cancer gene panels were used for library preparation and subsequent sequencing on several devices of two NGS different platforms. For every case, centrally extracted DNA and FFPE tissue sections for local processing were delivered to each partner site to be sequenced with the commercial gene panel and local bioinformatics. For cancer-specific panel-based sequencing, only centrally extracted DNA was analyzed at seven sequencing sites. Subsequently, local data were compiled and bioinformatics was performed centrally. We were able to demonstrate that all pre-characterized mutations were re-identified correctly, irrespective of NGS platform or gene panel used. However, locally processed FFPE tissue sections disclosed that the DNA extraction method can affect the detection of mutations with a trend in favor of magnetic bead-based DNA extraction methods. In conclusion, targeted NGS is a very robust method for simultaneous detection of various mutations in FFPE tissue specimens if certain pre-analytical conditions are carefully considered.
NASA Astrophysics Data System (ADS)
Alvarez, Jose; Massey, Steven; Kalitsov, Alan; Velev, Julian
Nanopore sequencing via transverse current has emerged as a competitive candidate for mapping DNA methylation without needed bisulfite-treatment, fluorescent tag, or PCR amplification. By eliminating the error producing amplification step, long read lengths become feasible, which greatly simplifies the assembly process and reduces the time and the cost inherent in current technologies. However, due to the large error rates of nanopore sequencing, single base resolution has not been reached. A very important source of noise is the intrinsic structural noise in the electric signature of the nucleotide arising from the influence of neighboring nucleotides. In this work we perform calculations of the tunneling current through DNA molecules in nanopores using the non-equilibrium electron transport method within an effective multi-orbital tight-binding model derived from first-principles calculations. We develop a base-calling algorithm accounting for the correlations of the current through neighboring bases, which in principle can reduce the error rate below any desired precision. Using this method we show that we can clearly distinguish DNA methylation and other base modifications based on the reading of the tunneling current.
Pairagon: a highly accurate, HMM-based cDNA-to-genome aligner.
Lu, David V; Brown, Randall H; Arumugam, Manimozhiyan; Brent, Michael R
2009-07-01
The most accurate way to determine the intron-exon structures in a genome is to align spliced cDNA sequences to the genome. Thus, cDNA-to-genome alignment programs are a key component of most annotation pipelines. The scoring system used to choose the best alignment is a primary determinant of alignment accuracy, while heuristics that prevent consideration of certain alignments are a primary determinant of runtime and memory usage. Both accuracy and speed are important considerations in choosing an alignment algorithm, but scoring systems have received much less attention than heuristics. We present Pairagon, a pair hidden Markov model based cDNA-to-genome alignment program, as the most accurate aligner for sequences with high- and low-identity levels. We conducted a series of experiments testing alignment accuracy with varying sequence identity. We first created 'perfect' simulated cDNA sequences by splicing the sequences of exons in the reference genome sequences of fly and human. The complete reference genome sequences were then mutated to various degrees using a realistic mutation simulator and the perfect cDNAs were aligned to them using Pairagon and 12 other aligners. To validate these results with natural sequences, we performed cross-species alignment using orthologous transcripts from human, mouse and rat. We found that aligner accuracy is heavily dependent on sequence identity. For sequences with 100% identity, Pairagon achieved accuracy levels of >99.6%, with one quarter of the errors of any other aligner. Furthermore, for human/mouse alignments, which are only 85% identical, Pairagon achieved 87% accuracy, higher than any other aligner. Pairagon source and executables are freely available at http://mblab.wustl.edu/software/pairagon/
Pollier, Jacob; González-Guzmán, Miguel; Ardiles-Diaz, Wilson; Geelen, Danny; Goossens, Alain
2011-01-01
cDNA-Amplified Fragment Length Polymorphism (cDNA-AFLP) is a commonly used technique for genome-wide expression analysis that does not require prior sequence knowledge. Typically, quantitative expression data and sequence information are obtained for a large number of differentially expressed gene tags. However, most of the gene tags do not correspond to full-length (FL) coding sequences, which is a prerequisite for subsequent functional analysis. A medium-throughput screening strategy, based on integration of polymerase chain reaction (PCR) and colony hybridization, was developed that allows in parallel screening of a cDNA library for FL clones corresponding to incomplete cDNAs. The method was applied to screen for the FL open reading frames of a selection of 163 cDNA-AFLP tags from three different medicinal plants, leading to the identification of 109 (67%) FL clones. Furthermore, the protocol allows for the use of multiple probes in a single hybridization event, thus significantly increasing the throughput when screening for rare transcripts. The presented strategy offers an efficient method for the conversion of incomplete expressed sequence tags (ESTs), such as cDNA-AFLP tags, to FL-coding sequences.
A modular DNA signal translator for the controlled release of a protein by an aptamer.
Beyer, Stefan; Simmel, Friedrich C
2006-01-01
Owing to the intimate linkage of sequence and structure in nucleic acids, DNA is an extremely attractive molecule for the development of molecular devices, in particular when a combination of information processing and chemomechanical tasks is desired. Many of the previously demonstrated devices are driven by hybridization between DNA 'effector' strands and specific recognition sequences on the device. For applications it is of great interest to link several of such molecular devices together within artificial reaction cascades. Often it will not be possible to choose DNA sequences freely, e.g. when functional nucleic acids such as aptamers are used. In such cases translation of an arbitrary 'input' sequence into a desired effector sequence may be required. Here we demonstrate a molecular 'translator' for information encoded in DNA and show how it can be used to control the release of a protein by an aptamer using an arbitrarily chosen DNA input strand. The function of the translator is based on branch migration and the action of the endonuclease FokI. The modular design of the translator facilitates the adaptation of the device to various input or output sequences.
A modular DNA signal translator for the controlled release of a protein by an aptamer
Beyer, Stefan; Simmel, Friedrich C.
2006-01-01
Owing to the intimate linkage of sequence and structure in nucleic acids, DNA is an extremely attractive molecule for the development of molecular devices, in particular when a combination of information processing and chemomechanical tasks is desired. Many of the previously demonstrated devices are driven by hybridization between DNA ‘effector’ strands and specific recognition sequences on the device. For applications it is of great interest to link several of such molecular devices together within artificial reaction cascades. Often it will not be possible to choose DNA sequences freely, e.g. when functional nucleic acids such as aptamers are used. In such cases translation of an arbitrary ‘input’ sequence into a desired effector sequence may be required. Here we demonstrate a molecular ‘translator’ for information encoded in DNA and show how it can be used to control the release of a protein by an aptamer using an arbitrarily chosen DNA input strand. The function of the translator is based on branch migration and the action of the endonuclease FokI. The modular design of the translator facilitates the adaptation of the device to various input or output sequences. PMID:16547201
Urabe, N; Ishii, Y; Hyodo, Y; Aoki, K; Yoshizawa, S; Saga, T; Murayama, S Y; Sakai, K; Homma, S; Tateda, K
2016-04-01
Between 18 November and 3 December 2011, five renal transplant patients at the Department of Nephrology, Toho University Omori Medical Centre, Tokyo, were diagnosed with Pneumocystis pneumonia (PCP). We used molecular epidemiologic methods to determine whether the patients were infected with the same strain of Pneumocystis jirovecii. DNA extracted from the residual bronchoalveolar lavage fluid from the five outbreak cases and from another 20 cases of PCP between 2007 and 2014 were used for multilocus sequence typing to compare the genetic similarity of the P. jirovecii. DNA base sequencing by the Sanger method showed some regions where two bases overlapped and could not be defined. A next-generation sequencer was used to analyse the types and ratios of these overlapping bases. DNA base sequences of P. jirovecii in the bronchoalveolar lavage fluid from four of the five PCP patients in the 2011 outbreak and from another two renal transplant patients who developed PCP in 2013 were highly homologous. The Sanger method revealed 14 genomic regions where two differing DNA bases overlapped and could not be identified. Analyses of the overlapping bases by a next-generation sequencer revealed that the differing types of base were present in almost identical ratios. There is a strong possibility that the PCP outbreak at the Toho University Omori Medical Centre was caused by the same strain of P. jirovecii. Two different types of base present in some regions may be due to P. jirovecii's being a diploid species. Copyright © 2015 European Society of Clinical Microbiology and Infectious Diseases. Published by Elsevier Ltd. All rights reserved.
Quantum Point Contact Single-Nucleotide Conductance for DNA and RNA Sequence Identification.
Afsari, Sepideh; Korshoj, Lee E; Abel, Gary R; Khan, Sajida; Chatterjee, Anushree; Nagpal, Prashant
2017-11-28
Several nanoscale electronic methods have been proposed for high-throughput single-molecule nucleic acid sequence identification. While many studies display a large ensemble of measurements as "electronic fingerprints" with some promise for distinguishing the DNA and RNA nucleobases (adenine, guanine, cytosine, thymine, and uracil), important metrics such as accuracy and confidence of base calling fall well below the current genomic methods. Issues such as unreliable metal-molecule junction formation, variation of nucleotide conformations, insufficient differences between the molecular orbitals responsible for single-nucleotide conduction, and lack of rigorous base calling algorithms lead to overlapping nanoelectronic measurements and poor nucleotide discrimination, especially at low coverage on single molecules. Here, we demonstrate a technique for reproducible conductance measurements on conformation-constrained single nucleotides and an advanced algorithmic approach for distinguishing the nucleobases. Our quantum point contact single-nucleotide conductance sequencing (QPICS) method uses combed and electrostatically bound single DNA and RNA nucleotides on a self-assembled monolayer of cysteamine molecules. We demonstrate that by varying the applied bias and pH conditions, molecular conductance can be switched ON and OFF, leading to reversible nucleotide perturbation for electronic recognition (NPER). We utilize NPER as a method to achieve >99.7% accuracy for DNA and RNA base calling at low molecular coverage (∼12×) using unbiased single measurements on DNA/RNA nucleotides, which represents a significant advance compared to existing sequencing methods. These results demonstrate the potential for utilizing simple surface modifications and existing biochemical moieties in individual nucleobases for a reliable, direct, single-molecule, nanoelectronic DNA and RNA nucleotide identification method for sequencing.
Ceccarelli, Marcello; Galluzzi, Luca; Diotallevi, Aurora; Andreoni, Francesca; Fowler, Hailie; Petersen, Christine; Vitale, Fabrizio; Magnani, Mauro
2017-05-16
Leishmaniasis is a neglected disease caused by many Leishmania species, belonging to subgenera Leishmania (Leishmania) and Leishmania (Viannia). Several qPCR-based molecular diagnostic approaches have been reported for detection and quantification of Leishmania species. Many of these approaches use the kinetoplast DNA (kDNA) minicircles as the target sequence. These assays had potential cross-species amplification, due to sequence similarity between Leishmania species. Previous works demonstrated discrimination between L. (Leishmania) and L. (Viannia) by SYBR green-based qPCR assays designed on kDNA, followed by melting or high-resolution melt (HRM) analysis. Importantly, these approaches cannot fully distinguish L. (L.) infantum from L. (L.) amazonensis, which can coexist in the same geographical area. DNA from 18 strains/isolates of L. (L.) infantum, L. (L.) amazonensis, L. (V.) braziliensis, L. (V.) panamensis, L. (V.) guyanensis, and 62 clinical samples from L. (L.) infantum-infected dogs were amplified by a previously developed qPCR (qPCR-ML) and subjected to HRM analysis; selected PCR products were sequenced using an ABI PRISM 310 Genetic Analyzer. Based on the obtained sequences, a new SYBR-green qPCR assay (qPCR-ama) intended to amplify a minicircle subclass more abundant in L. (L.) amazonensis was designed. The qPCR-ML followed by HRM analysis did not allow discrimination between L. (L.) amazonensis and L. (L.) infantum in 53.4% of cases. Hence, the novel SYBR green-based qPCR (qPCR-ama) has been tested. This assay achieved a detection limit of 0.1 pg of parasite DNA in samples spiked with host DNA and did not show cross amplification with Trypanosoma cruzi or host DNA. Although the qPCR-ama also amplified L. (L.) infantum strains, the C q values were dramatically increased compared to qPCR-ML. Therefore, the combined analysis of C q values from qPCR-ML and qPCR-ama allowed to distinguish L. (L.) infantum and L. (L.) amazonensis in 100% of tested samples. A new and affordable SYBR-green qPCR-based approach to distinguish between L. (L.) infantum and L. (L.) amazonensis was developed exploiting the major abundance of a minicircle sequence rather than targeting a hypothetical species-specific sequence. The fast and accurate discrimination between these species can be useful to provide adequate prognosis and treatment.
Mizas, Ch; Sirakoulis, G Ch; Mardiris, V; Karafyllidis, I; Glykos, N; Sandaltzopoulos, R
2008-04-01
Change of DNA sequence that fuels evolution is, to a certain extent, a deterministic process because mutagenesis does not occur in an absolutely random manner. So far, it has not been possible to decipher the rules that govern DNA sequence evolution due to the extreme complexity of the entire process. In our attempt to approach this issue we focus solely on the mechanisms of mutagenesis and deliberately disregard the role of natural selection. Hence, in this analysis, evolution refers to the accumulation of genetic alterations that originate from mutations and are transmitted through generations without being subjected to natural selection. We have developed a software tool that allows modelling of a DNA sequence as a one-dimensional cellular automaton (CA) with four states per cell which correspond to the four DNA bases, i.e. A, C, T and G. The four states are represented by numbers of the quaternary number system. Moreover, we have developed genetic algorithms (GAs) in order to determine the rules of CA evolution that simulate the DNA evolution process. Linear evolution rules were considered and square matrices were used to represent them. If DNA sequences of different evolution steps are available, our approach allows the determination of the underlying evolution rule(s). Conversely, once the evolution rules are deciphered, our tool may reconstruct the DNA sequence in any previous evolution step for which the exact sequence information was unknown. The developed tool may be used to test various parameters that could influence evolution. We describe a paradigm relying on the assumption that mutagenesis is governed by a near-neighbour-dependent mechanism. Based on the satisfactory performance of our system in the deliberately simplified example, we propose that our approach could offer a starting point for future attempts to understand the mechanisms that govern evolution. The developed software is open-source and has a user-friendly graphical input interface.
2015-01-01
DNA oxidation by reactive oxygen species is nonrandom, potentially leading to accumulation of nucleobase damage and mutations at specific sites within the genome. We now present the first quantitative data for sequence-dependent formation of structurally defined oxidative nucleobase adducts along p53 gene-derived DNA duplexes using a novel isotope labeling-based approach. Our results reveal that local nucleobase sequence context differentially alters the yields of 2,2,4-triamino-2H-oxal-5-one (Z) and 8-oxo-7,8-dihydro-2′-deoxyguanosine (OG) in double stranded DNA. While both lesions are overproduced within endogenously methylated MeCG dinucleotides and at 5′ Gs in runs of several guanines, the formation of Z (but not OG) is strongly preferred at solvent-exposed guanine nucleobases at duplex ends. Targeted oxidation of MeCG sequences may be caused by a lowered ionization potential of guanine bases paired with MeC and the preferential intercalation of riboflavin photosensitizer adjacent to MeC:G base pairs. Importantly, some of the most frequently oxidized positions coincide with the known p53 lung cancer mutational “hotspots” at codons 245 (GGC), 248 (CGG), and 158 (CGC) respectively, supporting a possible role of oxidative degradation of DNA in the initiation of lung cancer. PMID:24571128
2012-01-01
Background Detecting the borders between coding and non-coding regions is an essential step in the genome annotation. And information entropy measures are useful for describing the signals in genome sequence. However, the accuracies of previous methods of finding borders based on entropy segmentation method still need to be improved. Methods In this study, we first applied a new recursive entropic segmentation method on DNA sequences to get preliminary significant cuts. A 22-symbol alphabet is used to capture the differential composition of nucleotide doublets and stop codon patterns along three phases in both DNA strands. This process requires no prior training datasets. Results Comparing with the previous segmentation methods, the experimental results on three bacteria genomes, Rickettsia prowazekii, Borrelia burgdorferi and E.coli, show that our approach improves the accuracy for finding the borders between coding and non-coding regions in DNA sequences. Conclusions This paper presents a new segmentation method in prokaryotes based on Jensen-Rényi divergence with a 22-symbol alphabet. For three bacteria genomes, comparing to A12_JR method, our method raised the accuracy of finding the borders between protein coding and non-coding regions in DNA sequences. PMID:23282225
Zhao, Ya-E; Wu, Li-Ping
2012-09-01
To confirm phylogenetic relationships in Demodex mites based on mitochondrial 16S rDNA partial sequences, mtDNA 16S partial sequences of ten isolates of three Demodex species from China were amplified, recombined, and sequenced and then analyzed with two Demodex folliculorum isolates from Spain. Lastly, genetic distance was computed, and phylogenetic tree was reconstructed. MEGA 4.0 analysis showed high sequence identity among 16S rDNA partial sequences of three Demodex species, which were 95.85 % in D. folliculorum, 98.53 % in Demodex canis, and 99.71 % in Demodex brevis. The divergence, genetic distance, and transition/transversions of the three Demodex species reached interspecies level, whereas there was no significant difference of the divergence (1.1 %), genetic distance (0.011), and transition/transversions (3/1) of the two geographic D. folliculorum isolates (Spain and China). Phylogenetic trees reveal that the three Demodex species formed three separate branches of one clade, where D. folliculorum and D. canis gathered first, and then gathered with D. brevis. The two Spain and five China D. folliculorum isolates did not form sister clades. In conclusion, 16S mtDNA are suitable for phylogenetic relationship analysis in low taxa (genus or species), but not for intraspecies determination of Demodex. The differentiation among the three Demodex species has reached interspecies level.
TFBSshape: a motif database for DNA shape features of transcription factor binding sites.
Yang, Lin; Zhou, Tianyin; Dror, Iris; Mathelier, Anthony; Wasserman, Wyeth W; Gordân, Raluca; Rohs, Remo
2014-01-01
Transcription factor binding sites (TFBSs) are most commonly characterized by the nucleotide preferences at each position of the DNA target. Whereas these sequence motifs are quite accurate descriptions of DNA binding specificities of transcription factors (TFs), proteins recognize DNA as a three-dimensional object. DNA structural features refine the description of TF binding specificities and provide mechanistic insights into protein-DNA recognition. Existing motif databases contain extensive nucleotide sequences identified in binding experiments based on their selection by a TF. To utilize DNA shape information when analysing the DNA binding specificities of TFs, we developed a new tool, the TFBSshape database (available at http://rohslab.cmb.usc.edu/TFBSshape/), for calculating DNA structural features from nucleotide sequences provided by motif databases. The TFBSshape database can be used to generate heat maps and quantitative data for DNA structural features (i.e., minor groove width, roll, propeller twist and helix twist) for 739 TF datasets from 23 different species derived from the motif databases JASPAR and UniPROBE. As demonstrated for the basic helix-loop-helix and homeodomain TF families, our TFBSshape database can be used to compare, qualitatively and quantitatively, the DNA binding specificities of closely related TFs and, thus, uncover differential DNA binding specificities that are not apparent from nucleotide sequence alone.
TFBSshape: a motif database for DNA shape features of transcription factor binding sites
Yang, Lin; Zhou, Tianyin; Dror, Iris; Mathelier, Anthony; Wasserman, Wyeth W.; Gordân, Raluca; Rohs, Remo
2014-01-01
Transcription factor binding sites (TFBSs) are most commonly characterized by the nucleotide preferences at each position of the DNA target. Whereas these sequence motifs are quite accurate descriptions of DNA binding specificities of transcription factors (TFs), proteins recognize DNA as a three-dimensional object. DNA structural features refine the description of TF binding specificities and provide mechanistic insights into protein–DNA recognition. Existing motif databases contain extensive nucleotide sequences identified in binding experiments based on their selection by a TF. To utilize DNA shape information when analysing the DNA binding specificities of TFs, we developed a new tool, the TFBSshape database (available at http://rohslab.cmb.usc.edu/TFBSshape/), for calculating DNA structural features from nucleotide sequences provided by motif databases. The TFBSshape database can be used to generate heat maps and quantitative data for DNA structural features (i.e., minor groove width, roll, propeller twist and helix twist) for 739 TF datasets from 23 different species derived from the motif databases JASPAR and UniPROBE. As demonstrated for the basic helix-loop-helix and homeodomain TF families, our TFBSshape database can be used to compare, qualitatively and quantitatively, the DNA binding specificities of closely related TFs and, thus, uncover differential DNA binding specificities that are not apparent from nucleotide sequence alone. PMID:24214955
Kim, W J; Ji, Y; Choi, G; Kang, Y M; Yang, S; Moon, B C
2016-08-05
This study was performed to identify and analyze the phylogenetic relationship among four herbaceous species of the genus Paeonia, P. lactiflora, P. japonica, P. veitchii, and P. suffruticosa, using DNA barcodes. These four species, which are commonly used in traditional medicine as Paeoniae Radix and Moutan Radicis Cortex, are pharmaceutically defined in different ways in the national pharmacopoeias in Korea, Japan, and China. To authenticate the different species used in these medicines, we evaluated rDNA-internal transcribed spacers (ITS), matK and rbcL regions, which provide information capable of effectively distinguishing each species from one another. Seventeen samples were collected from different geographic regions in Korea and China, and DNA barcode regions were amplified using universal primers. Comparative analyses of these DNA barcode sequences revealed species-specific nucleotide sequences capable of discriminating the four Paeonia species. Among the entire sequences of three barcodes, marker nucleotides were identified at three positions in P. lactiflora, eleven in P. japonica, five in P. veitchii, and 25 in P. suffruticosa. Phylogenetic analyses also revealed four distinct clusters showing homogeneous clades with high resolution at the species level. The results demonstrate that the analysis of these three DNA barcode sequences is a reliable method for identifying the four Paeonia species and can be used to authenticate Paeoniae Radix and Moutan Radicis Cortex at the species level. Furthermore, based on the assessment of amplicon sizes, inter/intra-specific distances, marker nucleotides, and phylogenetic analysis, rDNA-ITS was the most suitable DNA barcode for identification of these species.
Next-Generation Sequencing in Oncology: Genetic Diagnosis, Risk Prediction and Cancer Classification
Kamps, Rick; Brandão, Rita D.; van den Bosch, Bianca J.; Paulussen, Aimee D. C.; Xanthoulea, Sofia; Blok, Marinus J.; Romano, Andrea
2017-01-01
Next-generation sequencing (NGS) technology has expanded in the last decades with significant improvements in the reliability, sequencing chemistry, pipeline analyses, data interpretation and costs. Such advances make the use of NGS feasible in clinical practice today. This review describes the recent technological developments in NGS applied to the field of oncology. A number of clinical applications are reviewed, i.e., mutation detection in inherited cancer syndromes based on DNA-sequencing, detection of spliceogenic variants based on RNA-sequencing, DNA-sequencing to identify risk modifiers and application for pre-implantation genetic diagnosis, cancer somatic mutation analysis, pharmacogenetics and liquid biopsy. Conclusive remarks, clinical limitations, implications and ethical considerations that relate to the different applications are provided. PMID:28146134
Guard, Jean; Sanchez-Ingunza, Roxana; Morales, Cesar; Stewart, Tod; Liljebjelke, Karen; Kessel, JoAnn; Ingram, Kim; Jones, Deana; Jackson, Charlene; Fedorka-Cray, Paula; Frye, Jonathan; Gast, Richard; Hinton, Arthur
2012-01-01
Two DNA-based methods were compared for the ability to assign serotype to 139 isolates of Salmonella enterica ssp. I. Intergenic sequence ribotyping (ISR) evaluated single nucleotide polymorphisms occurring in a 5S ribosomal gene region and flanking sequences bordering the gene dkgB. A DNA microarray hybridization method that assessed the presence and the absence of sets of genes was the second method. Serotype was assigned for 128 (92.1%) of submissions by the two DNA methods. ISR detected mixtures of serotypes within single colonies and it cost substantially less than Kauffmann–White serotyping and DNA microarray hybridization. Decreasing the cost of serotyping S. enterica while maintaining reliability may encourage routine testing and research. PMID:22998607
Reed, Kent M.; Dorschner, Michael O.; Todd, Thomas N.; Phillips, Ruth B.
1998-01-01
Sequence variation in the control region (D-loop) of the mitochondrial DNA (mtDNA) was examined to assess the genetic distinctiveness of the shortjaw cisco (Coregonus zenithicus). Individuals from within the Great Lakes Basin as well as inland lakes outside the basin were sampled. DNA fragments containing the entire D-loop were amplified by PCR from specimens ofC. zenithicus and the related species C. artedi, C. hoyi, C. kiyi, and C. clupeaformis. DNA sequence analysis revealed high similarity within and among species and shared polymorphism for length variants. Based on this analysis, the shortjaw cisco is not genetically distinct from other cisco species.
Substrate sequence selectivity of APOBEC3A implicates intra-DNA interactions.
Silvas, Tania V; Hou, Shurong; Myint, Wazo; Nalivaika, Ellen; Somasundaran, Mohan; Kelch, Brian A; Matsuo, Hiroshi; Kurt Yilmaz, Nese; Schiffer, Celia A
2018-05-14
The APOBEC3 (A3) family of human cytidine deaminases is renowned for providing a first line of defense against many exogenous and endogenous retroviruses. However, the ability of these proteins to deaminate deoxycytidines in ssDNA makes A3s a double-edged sword. When overexpressed, A3s can mutate endogenous genomic DNA resulting in a variety of cancers. Although the sequence context for mutating DNA varies among A3s, the mechanism for substrate sequence specificity is not well understood. To characterize substrate specificity of A3A, a systematic approach was used to quantify the affinity for substrate as a function of sequence context, length, secondary structure, and solution pH. We identified the A3A ssDNA binding motif as (T/C)TC(A/G), which correlated with enzymatic activity. We also validated that A3A binds RNA in a sequence specific manner. A3A bound tighter to substrate binding motif within a hairpin loop compared to linear oligonucleotide, suggesting A3A affinity is modulated by substrate structure. Based on these findings and previously published A3A-ssDNA co-crystal structures, we propose a new model with intra-DNA interactions for the molecular mechanism underlying A3A sequence preference. Overall, the sequence and structural preferences identified for A3A leads to a new paradigm for identifying A3A's involvement in mutation of endogenous or exogenous DNA.
Recombination of polynucleotide sequences using random or defined primers
Arnold, Frances H.; Shao, Zhixin; Affholter, Joseph A.; Zhao, Huimin H; Giver, Lorraine J.
2000-01-01
A method for in vitro mutagenesis and recombination of polynucleotide sequences based on polymerase-catalyzed extension of primer oligonucleotides is disclosed. The method involves priming template polynucleotide(s) with random-sequences or defined-sequence primers to generate a pool of short DNA fragments with a low level of point mutations. The DNA fragments are subjected to denaturization followed by annealing and further enzyme-catalyzed DNA polymerization. This procedure is repeated a sufficient number of times to produce full-length genes which comprise mutants of the original template polynucleotides. These genes can be further amplified by the polymerase chain reaction and cloned into a vector for expression of the encoded proteins.
Recombination of polynucleotide sequences using random or defined primers
Arnold, Frances H.; Shao, Zhixin; Affholter, Joseph A.; Zhao, Huimin; Giver, Lorraine J.
2001-01-01
A method for in vitro mutagenesis and recombination of polynucleotide sequences based on polymerase-catalyzed extension of primer oligonucleotides is disclosed. The method involves priming template polynucleotide(s) with random-sequences or defined-sequence primers to generate a pool of short DNA fragments with a low level of point mutations. The DNA fragments are subjected to denaturization followed by annealing and further enzyme-catalyzed DNA polymerization. This procedure is repeated a sufficient number of times to produce full-length genes which comprise mutants of the original template polynucleotides. These genes can be further amplified by the polymerase chain reaction and cloned into a vector for expression of the encoded proteins.
Breathing dynamics based parameter sensitivity analysis of hetero-polymeric DNA
DOE Office of Scientific and Technical Information (OSTI.GOV)
Talukder, Srijeeta; Sen, Shrabani; Chaudhury, Pinaki, E-mail: pinakc@rediffmail.com
We study the parameter sensitivity of hetero-polymeric DNA within the purview of DNA breathing dynamics. The degree of correlation between the mean bubble size and the model parameters is estimated for this purpose for three different DNA sequences. The analysis leads us to a better understanding of the sequence dependent nature of the breathing dynamics of hetero-polymeric DNA. Out of the 14 model parameters for DNA stability in the statistical Poland-Scheraga approach, the hydrogen bond interaction ε{sub hb}(AT) for an AT base pair and the ring factor ξ turn out to be the most sensitive parameters. In addition, the stackingmore » interaction ε{sub st}(TA-TA) for an TA-TA nearest neighbor pair of base-pairs is found to be the most sensitive one among all stacking interactions. Moreover, we also establish that the nature of stacking interaction has a deciding effect on the DNA breathing dynamics, not the number of times a particular stacking interaction appears in a sequence. We show that the sensitivity analysis can be used as an effective measure to guide a stochastic optimization technique to find the kinetic rate constants related to the dynamics as opposed to the case where the rate constants are measured using the conventional unbiased way of optimization.« less
Comparison of Methods of Detection of Exceptional Sequences in Prokaryotic Genomes.
Rusinov, I S; Ershova, A S; Karyagina, A S; Spirin, S A; Alexeevski, A V
2018-02-01
Many proteins need recognition of specific DNA sequences for functioning. The number of recognition sites and their distribution along the DNA might be of biological importance. For example, the number of restriction sites is often reduced in prokaryotic and phage genomes to decrease the probability of DNA cleavage by restriction endonucleases. We call a sequence an exceptional one if its frequency in a genome significantly differs from one predicted by some mathematical model. An exceptional sequence could be either under- or over-represented, depending on its frequency in comparison with the predicted one. Exceptional sequences could be considered biologically meaningful, for example, as targets of DNA-binding proteins or as parts of abundant repetitive elements. Several methods to predict frequency of a short sequence in a genome, based on actual frequencies of certain its subsequences, are used. The most popular are methods based on Markov chain models. But any rigorous comparison of the methods has not previously been performed. We compared three methods for the prediction of short sequence frequencies: the maximum-order Markov chain model-based method, the method that uses geometric mean of extended Markovian estimates, and the method that utilizes frequencies of all subsequences including discontiguous ones. We applied them to restriction sites in complete genomes of 2500 prokaryotic species and demonstrated that the results depend greatly on the method used: lists of 5% of the most under-represented sites differed by up to 50%. The method designed by Burge and coauthors in 1992, which utilizes all subsequences of the sequence, showed a higher precision than the other two methods both on prokaryotic genomes and randomly generated sequences after computational imitation of selective pressure. We propose this method as the first choice for detection of exceptional sequences in prokaryotic genomes.
Mikkelsen, Martin; Frank-Hansen, Rune; Hansen, Anders J; Morling, Niels
2014-09-01
of sequencing of whole mitochondrial genome, HV1 and HV2 DNA with the second generation system (SGS) Roche 454 GS Junior were compared with results of Sanger sequencing and SNP typing with SNaPshot single base extension detected with MALDI-TOF and capillary electrophoresis. We investigated the performance of the software analysis of the data, reproducibility, ability to sequence homopolymeric regions, detection of mixtures and heteroplasmy as well as the implications of the depth of coverage. We found full reproducibility between samples sequenced twice with SGS. We found close to full concordance between the mtDNA sequences of 26 samples obtained with (1) the 454 SGS method using a depth of coverage above 100 and (2) Sanger sequencing and SNP typing. The discrepancies were primarily observed in homopolymeric regions. The 454 SGS method was able to sequence 95% of the reads correctly in homopolymers up to 4 bases, and up to 6 bases could be sequenced with similar success if the results were carefully, visually inspected. The 454 technology was able to detect mixtures or heteroplasmy of approximately 10%. We detected previously unreported heteroplasmy in the GM9947A component of the NIST human mitochondrial DNA SRM-2392 standard reference material. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Herrera, Alex F.; Kim, Haesook T.; Kong, Katherine A.; Faham, Malek; Sun, Heather; Sohani, Aliyah R.; Alyea, Edwin P.; Carlton, Victoria E.; Chen, Yi-Bin; Cutler, Corey S.; Ho, Vincent T.; Koreth, John; Kotwaliwale, Chitra; Nikiforow, Sarah; Ritz, Jerome; Rodig, Scott J.; Soiffer, Robert J.; Antin, Joseph H.; Armand, Philippe
2016-01-01
Summary Next-generation sequencing (NGS)-based circulating tumour DNA (ctDNA) detection is a promising monitoring tool for lymphoid malignancies. We evaluated whether the presence of ctDNA was associated with outcome after allogeneic haematopoietic stem cell transplantation (HSCT) in lymphoma patients. We studied 88 patients drawn from a phase 3 clinical trial of reduced-intensity conditioning HSCT in lymphoma. Conventional restaging and collection of peripheral blood samples occurred at pre-specified time points before and after HSCT and were assayed for ctDNA by sequencing of the immunoglobulin or T-cell receptor genes. Tumour clonotypes were identified in 87% of patients with adequate tumour samples. Sixteen of 19 (84%) patients with disease progression after HSCT had detectable ctDNA prior to progression at a median of 3.7 months prior to relapse/progression. Patients with detectable ctDNA 3 months after HSCT had inferior progression-free survival (PFS) (2-year PFS 58% versus 84% in ctDNA-negative patients, p=0.033). In multivariate models, detectable ctDNA was associated with increased risk of progression/death (Hazard ratio 3.9, p=0.003) and increased risk of relapse/progression (Hazard ratio 10.8, p=0.0006). Detectable ctDNA is associated with an increased risk of relapse/progression, but further validation studies are necessary to confirm these findings and determine the clinical utility of NGS-based minimal residual disease monitoring in lymphoma patients after HSCT. PMID:27711974
Formation of (DNA)2-LNA triplet with recombinant base recognition: A quantum mechanical study
NASA Astrophysics Data System (ADS)
Mall, Vijaya Shri; Tiwari, Rakesh Kumar
2018-05-01
The formation of DNA triple helix offers the verity of new possibilities in molecular biology. However its applications are limited to purine and pyrimidine rich sequences recognized by forming Hoogsteen/Reverse Hoogsteen triplets in major groove sites of DNA duplex. To overcome this drawback modification in bases backbone and glucose of nucleotide unit of DNA have been proposed so that the third strand base recognized by both the bases of DNA duplex by forming Recombinant type(R-type) of bonding in mixed sequences. Here we performed Quanrum Mechanical (Hartree-Fock and DFT) methodology on natural DNA and Locked Nucleic Acids(LNA) triplets using 6-31G and some other new advance basis sets. Study suggests energetically stable conformation has been observed for recombinant triplets in order of G-C*G > A-T*A > G-C*C > T-A*T for both type of triplets. Interestingly LNA leads to more stable conformation in all set of triplets, clearly suggests an important biological tool to overcome above mentioned drawbacks.
Schwelm, Arne; Berney, Cédric; Dixelius, Christina; Bass, David; Neuhauser, Sigrid
2016-01-01
Clubroot disease caused by Plasmodiophora brassicae is one of the most important diseases of cultivated brassicas. P. brassicae occurs in pathotypes which differ in the aggressiveness towards their Brassica host plants. To date no DNA based method to distinguish these pathotypes has been described. In 2011 polymorphism within the 28S rDNA of P. brassicae was reported which potentially could allow to distinguish pathotypes without the need of time-consuming bioassays. However, isolates of P. brassicae from around the world analysed in this study do not show polymorphism in their LSU rDNA sequences. The previously described polymorphism most likely derived from soil inhabiting Cercozoa more specifically Neoheteromita-like glissomonads. Here we correct the LSU rDNA sequence of P. brassicae. By using FISH we demonstrate that our newly generated sequence belongs to the causal agent of clubroot disease. PMID:27750174
Nagle, Padraic S; McKeever, Caitriona; Rodriguez, Fernando; Nguyen, Binh; Wilson, W David; Rozas, Isabel
2014-09-25
In this paper we report the design and biophysical evaluation of novel rigid-core symmetric and asymmetric dicationic DNA binders containing 9H-fluorene and 9,10-dihydroanthracene cores as well as the synthesis of one of these fluorene derivatives. First, the affinity toward particular DNA sequences of these compounds and flexible core derivatives was evaluated by means of surface plasmon resonance and thermal denaturation experiments finding that the position of the cations significantly influence the binding strength. Then their affinity and mode of binding were further studied by performing circular dichroism and UV studies and the results obtained were rationalized by means of DFT calculations. We found that the fluorene derivatives prepared have the ability to bind to the minor groove of certain DNA sequences and intercalate to others, whereas the dihydroanthracene compounds bind via intercalation to all the DNA sequences studied here.
Pfeiffer, H; Hühne, J; Ortmann, C; Waterkamp, K; Brinkmann, B
1999-01-01
The analysis of mitochondrial DNA (mtDNA) from shed hairs has gained high importance in forensic casework since telogen hairs are one of the most common types of evidence left at the crime scene. In this systematic study of hair shafts from 20 individuals, the correlation of mtDNA recovery with hair morphology (length, diameter, volume, colour), with sex, and with body localisation (head, armpit, pubis) was investigated. The highest average success rate of hypervariable region 1 (HV 1) sequencing was found in head hair shafts (75%) followed by pubic (66%) and axillary hair shafts (52%). No statistically significant correlation between morphological parameters or sex and the success rate of sequencing was found. MtDNA sequences of buccal cells, head, pubic and axillary hair shafts did not show intraindividual differences. Heteroplasmic base positions were observed neither in the hair shafts nor in control samples of buccal cells.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tuskan, Gerald A; Gunter, Lee E; DiFazio, Stephen P
The 18S-28S rDNA and 5S rDNA loci in Populus trichocarpa were localized using fluorescent in situ hybridization (FISH). Two 18S-28S rDNA sites and one 5S rDNA site were identified and located at the ends of 3 different chromosomes. FISH signals from the Arabidopsis -type telomere repeat sequence were observed at the distal ends of each chromosome. Six BAC clones selected from 2 linkage groups based on genome sequence assembly (LG-I and LG-VI) were localized on 2 chromosomes, as expected. BACs from LG-I hybridized to the longest chromosome in the complement. All BAC positions were found to be concordant with sequencemore » assembly positions. BAC-FISH will be useful for delineating each of the Populus trichocarpa chromosomes and improving the sequence assembly of this model angiosperm tree species.« less
Tan, Lianjiang; Liu, Yazhi; Li, Xiaowei; Wu, Xin-Yan; Gong, Bing; Shen, Yu-Mei; Shao, Zhifeng
2016-02-11
An acid-cleavable linker based on a dimethylketal moiety was synthesized and used to connect a nucleotide with a fluorophore to produce a 3'-OH unblocked nucleotide analogue as an excellent reversible terminator for DNA sequencing by synthesis.
Development of a Green Fluorescent Protein-Based Laboratory Curriculum
ERIC Educational Resources Information Center
Larkin, Patrick D.; Hartberg, Yasha
2005-01-01
A laboratory curriculum has been designed for an undergraduate biochemistry course that focuses on the investigation of the green fluorescent protein (GFP). The sequence of procedures extends from analysis of the DNA sequence through PCR amplification, recombinant plasmid DNA synthesis, bacterial transformation, expression, isolation, and…
Deng, Jiajia; Toh, Chee-Seng
2013-06-17
A novel and integrated membrane sensing platform for DNA detection is developed based on an anodic aluminum oxide (AAO) membrane. Platinum electrodes (~50-100 nm thick) are coated directly on both sides of the alumina membrane to eliminate the solution resistance outside the nanopores. The electrochemical impedance technique is employed to monitor the impedance changes within the nanopores upon DNA binding. Pore resistance (Rp) linearly increases in response towards the increasing concentration of the target DNA in the range of 1 × 10⁻¹² to 1 × 10⁻⁶ M. Moreover, the biosensor selectively differentiates the complementary sequence from single base mismatched (MM-1) strands and non-complementary strands. This study reveals a simple, selective and sensitive method to fabricate a label-free DNA biosensor.
Zhu, Jing; Ding, Yongshun; Liu, Xingti; Wang, Lei; Jiang, Wei
2014-09-15
Highly sensitive and selective detection strategy for single-base mutations is essential for risk assessment of malignancy and disease prognosis. In this work, a fluorescent detection method for single-base mutation was proposed based on high selectivity of toehold-mediated strand displacement reaction (TSDR) and powerful signal amplification capability of isothermal DNA amplification. A discrimination probe was specially designed with a stem-loop structure and an overhanging toehold domain. Hybridization between the toehold domain and the perfect matched target initiated the TSDR along with the unfolding of the discrimination probe. Subsequently, the target sequence acted as a primer to initiate the polymerization and nicking reactions, which released a great abundant of short sequences. Finally, the released strands were annealed with the reporter probe, launching another polymerization and nicking reaction to produce lots of G-quadruplex DNA, which could bind the N-methyl mesoporphyrin IX to yield an enhanced fluorescence response. However, when there was even a single base mismatch in the target DNA, the TSDR was suppressed and so subsequent isothermal DNA amplification and fluorescence response process could not occur. The proposed approach has been successfully implemented for the identification of the single-base mutant sequences in the human KRAS gene with a detection limit of 1.8 pM. Furthermore, a recovery of 90% was obtained when detecting the target sequence in spiked HeLa cells lysate, demonstrating the feasibility of this detection strategy for single-base mutations in biological samples. Copyright © 2014 Elsevier B.V. All rights reserved.
Tharmatha, T; Gajapathy, K; Ramasamy, R; Surendran, S N
2017-02-01
The correct identification of sand fly vectors of leishmaniasis is important for controlling the disease. Genetic, particularly DNA sequence data, has lately become an important adjunct to the use of morphological criteria for this purpose. A recent DNA sequencing study revealed the presence of two cryptic species in the Sergentomyia bailyi species complex in India. The present study was undertaken to ascertain the presence of cryptic species in the Se. bailyi complex in Sri Lanka using morphological characteristics and DNA sequences from cytochrome c oxidase subunits. Sand flies were collected from leishmaniasis endemic and non-endemic dry zone districts of Sri Lanka. A total of 175 Se. bailyi specimens were initially screened for morphological variations and the identified samples formed two groups, tentatively termed as Se. bailyi species A and B, based on the relative length of the sensilla chaeticum and antennal flagellomere. DNA sequences from the mitochondrial cytochrome c oxidase subunit I (COI) and subunit II (COII) genes of morphologically identified Se. bailyi species A and B were subsequently analyzed. The two species showed differences in the COI and COII gene sequences and were placed in two separate clades by phylogenetic analysis. An allele specific polymerase chain reaction assay based on sequence variation in the COI gene accurately differentiated species A and B. The study therefore describes the first morphological and genetic evidence for the presence of two cryptic species within the Se. bailyi complex in Sri Lanka and a DNA-based laboratory technique for differentiating them.
McMeel, O M; Hoey, E M; Ferguson, A
2001-01-01
The cDNA nucleotide sequences of the lactate dehydrogenase alleles LDH-C1*90 and *100 of brown trout (Salmo trutta) were found to differ at position 308 where an A is present in the *100 allele but a G is present in the *90 allele. This base substitution results in an amino acid change from aspartic acid at position 82 in the LDH-C1 100 allozyme to a glycine in the 90 allozyme. Since aspartic acid has a net negative charge whilst glycine is uncharged, this is consistent with the electrophoretic observation that the LDH-C1 100 allozyme has a more anodal mobility relative to the LDH-C1 90 allozyme. Based on alignment of the cDNA sequence with the mouse genomic sequence, a local primer set was designed, incorporating the variable position, and was found to give very good amplification with brown trout genomic DNA. Sequencing of this fragment confirmed the difference in both homozygous and heterozygous individuals. Digestion of the polymerase chain reaction products with BslI, a restriction enzyme specific for the site difference, gave one, two and three fragments for the two homozygotes and the heterozygote, respectively, following electrophoretic separation. This provides a DNA-based means of routine screening of the highly informative LDH-C1* polymorphism in brown trout population genetic studies. Primer sets presented could be used to sequence cDNA of other LDH* genes of brown trout and other species.
Analysis of DNA Sequences by an Optical ime-Integrating Correlator: Proposal
1991-11-01
CURRENT TECHNOLOGY 2 3.0 TIME-INTEGRATING CORRELATOR 2 4.0 REPRESENTATIONS OF THE DNA BASES 8 5.0 DNA ANALYSIS STRATEGY 8 6.0 STRATEGY FOR COARSE...1)-correlation peak formed by the AxB term and (2)-pedestal formed by the A + B terms. 7 Figure 4: Short representations of the DNA bases where each...linear scale. 15 x LIST OF TABLES PAGE Table 1: Short representations of the DNA bases where each base is represented by 7-bits long pseudorandom
Repetitive sequences in plant nuclear DNA: types, distribution, evolution and function.
Mehrotra, Shweta; Goyal, Vinod
2014-08-01
Repetitive DNA sequences are a major component of eukaryotic genomes and may account for up to 90% of the genome size. They can be divided into minisatellite, microsatellite and satellite sequences. Satellite DNA sequences are considered to be a fast-evolving component of eukaryotic genomes, comprising tandemly-arrayed, highly-repetitive and highly-conserved monomer sequences. The monomer unit of satellite DNA is 150-400 base pairs (bp) in length. Repetitive sequences may be species- or genus-specific, and may be centromeric or subtelomeric in nature. They exhibit cohesive and concerted evolution caused by molecular drive, leading to high sequence homogeneity. Repetitive sequences accumulate variations in sequence and copy number during evolution, hence they are important tools for taxonomic and phylogenetic studies, and are known as "tuning knobs" in the evolution. Therefore, knowledge of repetitive sequences assists our understanding of the organization, evolution and behavior of eukaryotic genomes. Repetitive sequences have cytoplasmic, cellular and developmental effects and play a role in chromosomal recombination. In the post-genomics era, with the introduction of next-generation sequencing technology, it is possible to evaluate complex genomes for analyzing repetitive sequences and deciphering the yet unknown functional potential of repetitive sequences. Copyright © 2014 The Authors. Production and hosting by Elsevier Ltd.. All rights reserved.
Yang, Xiaoxia; Wang, Jia; Sun, Jun; Liu, Rong
2015-01-01
Protein-nucleic acid interactions are central to various fundamental biological processes. Automated methods capable of reliably identifying DNA- and RNA-binding residues in protein sequence are assuming ever-increasing importance. The majority of current algorithms rely on feature-based prediction, but their accuracy remains to be further improved. Here we propose a sequence-based hybrid algorithm SNBRFinder (Sequence-based Nucleic acid-Binding Residue Finder) by merging a feature predictor SNBRFinderF and a template predictor SNBRFinderT. SNBRFinderF was established using the support vector machine whose inputs include sequence profile and other complementary sequence descriptors, while SNBRFinderT was implemented with the sequence alignment algorithm based on profile hidden Markov models to capture the weakly homologous template of query sequence. Experimental results show that SNBRFinderF was clearly superior to the commonly used sequence profile-based predictor and SNBRFinderT can achieve comparable performance to the structure-based template methods. Leveraging the complementary relationship between these two predictors, SNBRFinder reasonably improved the performance of both DNA- and RNA-binding residue predictions. More importantly, the sequence-based hybrid prediction reached competitive performance relative to our previous structure-based counterpart. Our extensive and stringent comparisons show that SNBRFinder has obvious advantages over the existing sequence-based prediction algorithms. The value of our algorithm is highlighted by establishing an easy-to-use web server that is freely accessible at http://ibi.hzau.edu.cn/SNBRFinder.
NASA Technical Reports Server (NTRS)
Smith, David J.; Burton, Aaron; Castro-Wallace, Sarah; John, Kristen; Stahl, Sarah E.; Dworkin, Jason Peter; Lupisella, Mark L.
2016-01-01
On the International Space Station (ISS), technologies capable of rapid microbial identification and disease diagnostics are not currently available. NASA still relies upon sample return for comprehensive, molecular-based sample characterization. Next-generation DNA sequencing is a powerful approach for identifying microorganisms in air, water, and surfaces onboard spacecraft. The Biomolecule Sequencer payload, manifested to SpaceX-9 and scheduled on the Increment 4748 research plan (June 2016), will assess the functionality of a commercially-available next-generation DNA sequencer in the microgravity environment of ISS. The MinION device from Oxford Nanopore Technologies (Oxford, UK) measures picoamp changes in electrical current dependent on nucleotide sequences of the DNA strand migrating through nanopores in the system. The hardware is exceptionally small (9.5 x 3.2 x 1.6 cm), lightweight (120 grams), and powered only by a USB connection. For the ISS technology demonstration, the Biomolecule Sequencer will be powered by a Microsoft Surface Pro3. Ground-prepared samples containing lambda bacteriophage, Escherichia coli, and mouse genomic DNA, will be launched and stored frozen on the ISS until experiment initiation. Immediately prior to sequencing, a crew member will collect and thaw frozen DNA samples, connect the sequencer to the Surface Pro3, inject thawed samples into a MinION flow cell, and initiate sequencing. At the completion of the sequencing run, data will be downlinked for ground analysis. Identical, synchronous ground controls will be used for data comparisons to determine sequencer functionality, run-time sequence, current dynamics, and overall accuracy. We will present our latest results from the ISS flight experiment the first time DNA has ever been sequenced in space and discuss the many potential applications of the Biomolecule Sequencer for environmental monitoring, medical diagnostics, higher fidelity and more adaptable Space Biology Human Research Program investigations, and even life detection experiments for astrobiology missions.
DNA nanomapping using CRISPR-Cas9 as a programmable nanoparticle.
Mikheikin, Andrey; Olsen, Anita; Leslie, Kevin; Russell-Pavier, Freddie; Yacoot, Andrew; Picco, Loren; Payton, Oliver; Toor, Amir; Chesney, Alden; Gimzewski, James K; Mishra, Bud; Reed, Jason
2017-11-21
Progress in whole-genome sequencing using short-read (e.g., <150 bp), next-generation sequencing technologies has reinvigorated interest in high-resolution physical mapping to fill technical gaps that are not well addressed by sequencing. Here, we report two technical advances in DNA nanotechnology and single-molecule genomics: (1) we describe a labeling technique (CRISPR-Cas9 nanoparticles) for high-speed AFM-based physical mapping of DNA and (2) the first successful demonstration of using DVD optics to image DNA molecules with high-speed AFM. As a proof of principle, we used this new "nanomapping" method to detect and map precisely BCL2-IGH translocations present in lymph node biopsies of follicular lymphoma patents. This HS-AFM "nanomapping" technique can be complementary to both sequencing and other physical mapping approaches.
Nanopore Kinetic Proofreading of DNA Sequences
NASA Astrophysics Data System (ADS)
Ling, Xinsheng Sean
The concept of DNA sequencing using the time dependence of the nanopore ionic current was proposed in 1996 by Kasianowicz, Brandin, Branton, and Deamer (KBBD). The KBBD concept has generated tremendous amount interests in recent decade. In this talk, I will review the current understanding of the DNA ``translocation'' dynamics and how it can be described by Schrodinger's 1915 paper on first-passage-time distribution function. Schrodinger's distribution function can be used to give a rigorous criterion for achieving nanopore DNA sequencing which turns out to be identical to that of gel electrophoresis used by Sanger in the first-generation Sanger method. A nanopore DNA sequencing technology also requires discrimination of bases with high accuracies. I will describe a solid-state nanopore sandwich structure that can function as a proofreading device capable of discriminating between correct and incorrect hybridization probes with an accuracy rivaling that of high-fidelity DNA polymerases. The latest results from Nanjing will be presented. This work is supported by China 1000-Talent Program at Southeast University, Nanjing, China.
Equilibrious Strand Exchange Promoted by DNA Conformational Switching
NASA Astrophysics Data System (ADS)
Wu, Zhiguo; Xie, Xiao; Li, Puzhen; Zhao, Jiayi; Huang, Lili; Zhou, Xiang
2013-01-01
Most of DNA strand exchange reactions in vitro are based on toehold strategy which is generally nonequilibrium, and intracellular strand exchange mediated by proteins shows little sequence specificity. Herein, a new strand exchange promoted by equilibrious DNA conformational switching is verified. Duplexes containing c-myc sequence which is potentially converted into G-quadruplex are designed in this strategy. The dynamic equilibrium between duplex and G4-DNA is response to the specific exchange of homologous single-stranded DNA (ssDNA). The SER is enzyme free and sequence specific. No ATP is needed and the displaced ssDNAs are identical to the homologous ssDNAs. The SER products and exchange kenetics are analyzed by PAGE and the RecA mediated SER is performed as the contrast. This SER is a new feature of G4-DNAs and a novel strategy to utilize the dynamic equilibrium of DNA conformations.
2000 Year-old ancient equids: an ancient-DNA lesson from pompeii remains.
Di Bernardo, Giovanni; Del Gaudio, Stefania; Galderisi, Umberto; Cipollaro, Marilena
2004-11-15
Ancient DNA extracted from 2000 year-old equine bones was examined in order to amplify mitochondrial and nuclear DNA fragments. A specific equine satellite-type sequence representing 3.7%-11% of the entire equine genome, proved to be a suitable target to address the question of the presence of aDNA in ancient bones. The PCR strategy designed to investigate this specific target also allowed us to calculate the molecular weight of amplifiable DNA fragments. Sequencing of a 370 bp DNA fragment of mitochondrial control region allowed the comparison of ancient DNA sequences with those of modern horses to assess their genetic relationship. The 16S rRNA mitochondrial gene was also examined to unravel the post-mortem base modification feature and to test the status of Pompeian equids taxon on the basis of a Mae III restriction site polymorphism. Copyright 2004 Wiley-Liss, Inc.
DNA-based approaches to identify forest fungi in Pacific Islands: A pilot study
Anna E. Case; Sara M. Ashiglar; Phil G. Cannon; Ernesto P. Militante; Edwin R. Tadiosa; Mutya Quintos-Manalo; Nelson M. Pampolina; John W. Hanna; Fred E. Brooks; Amy L. Ross-Davis; Mee-Sook Kim; Ned B. Klopfenstein
2013-01-01
DNA-based diagnostics have been successfully used to characterize diverse forest fungi (e.g., Hoff et al. 2004, Kim et al. 2006, Glaeser & Lindner 2011). DNA sequencing of the internal transcribed spacer (ITS) and large subunit (LSU) regions of nuclear ribosomal DNA (rDNA) has proved especially useful (Sonnenberg et al. 2007, Seifert 2009, Schoch et al. 2012) for...
Oligo Design: a computer program for development of probes for oligonucleotide microarrays.
Herold, Keith E; Rasooly, Avraham
2003-12-01
Oligonucleotide microarrays have demonstrated potential for the analysis of gene expression, genotyping, and mutational analysis. Our work focuses primarily on the detection and identification of bacteria based on known short sequences of DNA. Oligo Design, the software described here, automates several design aspects that enable the improved selection of oligonucleotides for use with microarrays for these applications. Two major features of the program are: (i) a tiling algorithm for the design of short overlapping temperature-matched oligonucleotides of variable length, which are useful for the analysis of single nucleotide polymorphisms and (ii) a set of tools for the analysis of multiple alignments of gene families and related short DNA sequences, which allow for the identification of conserved DNA sequences for PCR primer selection and variable DNA sequences for the selection of unique probes for identification. Note that the program does not address the full genome perspective but, instead, is focused on the genetic analysis of short segments of DNA. The program is Internet-enabled and includes a built-in browser and the automated ability to download sequences from GenBank by specifying the GI number. The program also includes several utilities, including audio recital of a DNA sequence (useful for verifying sequences against a written document), a random sequence generator that provides insight into the relationship between melting temperature and GC content, and a PCR calculator.
Thai, Quan Ke; Chung, Dung Anh; Tran, Hoang-Dung
2017-06-26
Canine and wolf mitochondrial DNA haplotypes, which can be used for forensic or phylogenetic analyses, have been defined in various schemes depending on the region analyzed. In recent studies, the 582 bp fragment of the HV1 region is most commonly used. 317 different canine HV1 haplotypes have been reported in the rapidly growing public database GenBank. These reported haplotypes contain several inconsistencies in their haplotype information. To overcome this issue, we have developed a Canis mtDNA HV1 database. This database collects data on the HV1 582 bp region in dog mitochondrial DNA from the GenBank to screen and correct the inconsistencies. It also supports users in detection of new novel mutation profiles and assignment of new haplotypes. The Canis mtDNA HV1 database (CHD) contains 5567 nucleotide entries originating from 15 subspecies in the species Canis lupus. Of these entries, 3646 were haplotypes and grouped into 804 distinct sequences. 319 sequences were recognized as previously assigned haplotypes, while the remaining 485 sequences had new mutation profiles and were marked as new haplotype candidates awaiting further analysis for haplotype assignment. Of the 3646 nucleotide entries, only 414 were annotated with correct haplotype information, while 3232 had insufficient or lacked haplotype information and were corrected or modified before storing in the CHD. The CHD can be accessed at http://chd.vnbiology.com . It provides sequences, haplotype information, and a web-based tool for mtDNA HV1 haplotyping. The CHD is updated monthly and supplies all data for download. The Canis mtDNA HV1 database contains information about canine mitochondrial DNA HV1 sequences with reconciled annotation. It serves as a tool for detection of inconsistencies in GenBank and helps identifying new HV1 haplotypes. Thus, it supports the scientific community in naming new HV1 haplotypes and to reconcile existing annotation of HV1 582 bp sequences.
[cDNA library construction from panicle meristem of finger millet].
Radchuk, V; Pirko, Ia V; Isaenkov, S V; Emets, A I; Blium, Ia B
2014-01-01
The protocol for production of full-size cDNA using SuperScript Full-Length cDNA Library Construction Kit II (Invitrogen) was tested and high quality cDNA library from meristematic tissue of finger millet panicle (Eleusine coracana (L.) Gaertn) was created. The titer of obtained cDNA library comprised 3.01 x 10(5) CFU/ml in avarage. In average the length of cDNA insertion consisted about 1070 base pairs, the effectivity of cDNA fragment insertions--99.5%. The selective sequencing of cDNA clones from created library was performed. The sequences of cDNA clones were identified with usage of BLAST-search. The results of cDNA library analysis and selective sequencing represents prove good functionality and full length character of inserted cDNA clones. Obtained cDNA library from meristematic tissue of finger millet panicle represents good and valuable source for isolation and identification of key genes regulating metabolism and meristematic development and for mining of new molecular markers to conduct out high quality genetic investigations and molecular breeding as well.
Phylogenetic tree of 16s rRNA sequences from sulfate-reducing bacteria in a sandy marine sediment
DOE Office of Scientific and Technical Information (OSTI.GOV)
Devereux, R.; Mundfrom, G.W.
1994-01-01
Phylogenetic divergence among sulfate-reducing bateria in an estuarine sediment sample was investigated by PCR amplification and comparison of partial 16S rDNA sequences. Twenty unique 16S rDNA sequences were found, 12 from delta subclass bacteria based on overall sequence similarity (82-91%). Two successive PCR amplifications were used to obtain and clone the 16S rDNA. The first reaction used templates derived from phosphate-buffered saline washed sediment with primers designed to amplify nearly full-length bacterial domain 16S rDNA. A produce from a first reaction was used as template in a second reaction with primers designed to selectivity amplify a region of 16S rDNAmore » genes of sulfate-reducing bacteria. A phylogenetic tree incorporating the cloned sequences suggests the presence of yet to be cultivated lines of sulfate-reducing bacteria within the sediment sample.« less
SeqCompress: an algorithm for biological sequence compression.
Sardaraz, Muhammad; Tahir, Muhammad; Ikram, Ataul Aziz; Bajwa, Hassan
2014-10-01
The growth of Next Generation Sequencing technologies presents significant research challenges, specifically to design bioinformatics tools that handle massive amount of data efficiently. Biological sequence data storage cost has become a noticeable proportion of total cost in the generation and analysis. Particularly increase in DNA sequencing rate is significantly outstripping the rate of increase in disk storage capacity, which may go beyond the limit of storage capacity. It is essential to develop algorithms that handle large data sets via better memory management. This article presents a DNA sequence compression algorithm SeqCompress that copes with the space complexity of biological sequences. The algorithm is based on lossless data compression and uses statistical model as well as arithmetic coding to compress DNA sequences. The proposed algorithm is compared with recent specialized compression tools for biological sequences. Experimental results show that proposed algorithm has better compression gain as compared to other existing algorithms. Copyright © 2014 Elsevier Inc. All rights reserved.
An integrated semiconductor device enabling non-optical genome sequencing.
Rothberg, Jonathan M; Hinz, Wolfgang; Rearick, Todd M; Schultz, Jonathan; Mileski, William; Davey, Mel; Leamon, John H; Johnson, Kim; Milgrew, Mark J; Edwards, Matthew; Hoon, Jeremy; Simons, Jan F; Marran, David; Myers, Jason W; Davidson, John F; Branting, Annika; Nobile, John R; Puc, Bernard P; Light, David; Clark, Travis A; Huber, Martin; Branciforte, Jeffrey T; Stoner, Isaac B; Cawley, Simon E; Lyons, Michael; Fu, Yutao; Homer, Nils; Sedova, Marina; Miao, Xin; Reed, Brian; Sabina, Jeffrey; Feierstein, Erika; Schorn, Michelle; Alanjary, Mohammad; Dimalanta, Eileen; Dressman, Devin; Kasinskas, Rachel; Sokolsky, Tanya; Fidanza, Jacqueline A; Namsaraev, Eugeni; McKernan, Kevin J; Williams, Alan; Roth, G Thomas; Bustillo, James
2011-07-20
The seminal importance of DNA sequencing to the life sciences, biotechnology and medicine has driven the search for more scalable and lower-cost solutions. Here we describe a DNA sequencing technology in which scalable, low-cost semiconductor manufacturing techniques are used to make an integrated circuit able to directly perform non-optical DNA sequencing of genomes. Sequence data are obtained by directly sensing the ions produced by template-directed DNA polymerase synthesis using all-natural nucleotides on this massively parallel semiconductor-sensing device or ion chip. The ion chip contains ion-sensitive, field-effect transistor-based sensors in perfect register with 1.2 million wells, which provide confinement and allow parallel, simultaneous detection of independent sequencing reactions. Use of the most widely used technology for constructing integrated circuits, the complementary metal-oxide semiconductor (CMOS) process, allows for low-cost, large-scale production and scaling of the device to higher densities and larger array sizes. We show the performance of the system by sequencing three bacterial genomes, its robustness and scalability by producing ion chips with up to 10 times as many sensors and sequencing a human genome.
Wang, Jing; McCord, Bruce
2011-06-01
A common problem in the analysis of forensic DNA evidence is the presence of environmentally degraded and inhibited DNA. Such samples produce a variety of interpretational problems such as allele imbalance, allele dropout and sequence specific inhibition. In an attempt to develop methods to enhance the recovery of this type of evidence, magnetic bead hybridization has been applied to extract and preconcentrate DNA sequences containing short tandem repeat (STR) alleles of interest. In this work, genomic DNA was fragmented by heating, and sequences associated with STR alleles were selectively hybridized to allele-specific biotinylated probes. Each particular biotinylated probe-DNA complex was bound to streptavidin-coated magnetic beads using enabling enrichment of target DNA sequences. Experiments conducted using degraded DNA samples, as well as samples containing a large concentration of inhibitory substances, showed good specificity and recovery of missing alleles. Based on the favorable results obtained with these specific probes, this method should prove useful as a tool to improve the recovery of alleles from degraded and inhibited DNA samples. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
A novel chaotic image encryption scheme using DNA sequence operations
NASA Astrophysics Data System (ADS)
Wang, Xing-Yuan; Zhang, Ying-Qian; Bao, Xue-Mei
2015-10-01
In this paper, we propose a novel image encryption scheme based on DNA (Deoxyribonucleic acid) sequence operations and chaotic system. Firstly, we perform bitwise exclusive OR operation on the pixels of the plain image using the pseudorandom sequences produced by the spatiotemporal chaos system, i.e., CML (coupled map lattice). Secondly, a DNA matrix is obtained by encoding the confused image using a kind of DNA encoding rule. Then we generate the new initial conditions of the CML according to this DNA matrix and the previous initial conditions, which can make the encryption result closely depend on every pixel of the plain image. Thirdly, the rows and columns of the DNA matrix are permuted. Then, the permuted DNA matrix is confused once again. At last, after decoding the confused DNA matrix using a kind of DNA decoding rule, we obtain the ciphered image. Experimental results and theoretical analysis show that the scheme is able to resist various attacks, so it has extraordinarily high security.
Hawlitschek, Oliver; Porch, Nick; Hendrich, Lars; Balke, Michael
2011-02-09
DNA sequencing techniques used to estimate biodiversity, such as DNA barcoding, may reveal cryptic species. However, disagreements between barcoding and morphological data have already led to controversy. Species delimitation should therefore not be based on mtDNA alone. Here, we explore the use of nDNA and bioclimatic modelling in a new species of aquatic beetle revealed by mtDNA sequence data. The aquatic beetle fauna of Australia is characterised by high degrees of endemism, including local radiations such as the genus Antiporus. Antiporus femoralis was previously considered to exist in two disjunct, but morphologically indistinguishable populations in south-western and south-eastern Australia. We constructed a phylogeny of Antiporus and detected a deep split between these populations. Diagnostic characters from the highly variable nuclear protein encoding arginine kinase gene confirmed the presence of two isolated populations. We then used ecological niche modelling to examine the climatic niche characteristics of the two populations. All results support the status of the two populations as distinct species. We describe the south-western species as Antiporus occidentalis sp.n. In addition to nDNA sequence data and extended use of mitochondrial sequences, ecological niche modelling has great potential for delineating morphologically cryptic species.
Hawlitschek, Oliver; Porch, Nick; Hendrich, Lars; Balke, Michael
2011-01-01
Background DNA sequencing techniques used to estimate biodiversity, such as DNA barcoding, may reveal cryptic species. However, disagreements between barcoding and morphological data have already led to controversy. Species delimitation should therefore not be based on mtDNA alone. Here, we explore the use of nDNA and bioclimatic modelling in a new species of aquatic beetle revealed by mtDNA sequence data. Methodology/Principal Findings The aquatic beetle fauna of Australia is characterised by high degrees of endemism, including local radiations such as the genus Antiporus. Antiporus femoralis was previously considered to exist in two disjunct, but morphologically indistinguishable populations in south-western and south-eastern Australia. We constructed a phylogeny of Antiporus and detected a deep split between these populations. Diagnostic characters from the highly variable nuclear protein encoding arginine kinase gene confirmed the presence of two isolated populations. We then used ecological niche modelling to examine the climatic niche characteristics of the two populations. All results support the status of the two populations as distinct species. We describe the south-western species as Antiporus occidentalis sp.n. Conclusion/Significance In addition to nDNA sequence data and extended use of mitochondrial sequences, ecological niche modelling has great potential for delineating morphologically cryptic species. PMID:21347370
Performing SELEX experiments in silico
NASA Astrophysics Data System (ADS)
Wondergem, J. A. J.; Schiessel, H.; Tompitak, M.
2017-11-01
Due to the sequence-dependent nature of the elasticity of DNA, many protein-DNA complexes and other systems in which DNA molecules must be deformed have preferences for the type of DNA sequence they interact with. SELEX (Systematic Evolution of Ligands by EXponential enrichment) experiments and similar sequence selection experiments have been used extensively to examine the (indirect readout) sequence preferences of, e.g., nucleosomes (protein spools around which DNA is wound for compactification) and DNA rings. We show how recently developed computational and theoretical tools can be used to emulate such experiments in silico. Opening up this possibility comes with several benefits. First, it allows us a better understanding of our models and systems, specifically about the roles played by the simulation temperature and the selection pressure on the sequences. Second, it allows us to compare the predictions made by the model of choice with experimental results. We find agreement on important features between predictions of the rigid base-pair model and experimental results for DNA rings and interesting differences that point out open questions in the field. Finally, our simulations allow application of the SELEX methodology to systems that are experimentally difficult to realize because they come with high energetic costs and are therefore unlikely to form spontaneously, such as very short or overwound DNA rings.
Noncoding sequence classification based on wavelet transform analysis: part I
NASA Astrophysics Data System (ADS)
Paredes, O.; Strojnik, M.; Romo-Vázquez, R.; Vélez Pérez, H.; Ranta, R.; Garcia-Torales, G.; Scholl, M. K.; Morales, J. A.
2017-09-01
DNA sequences in human genome can be divided into the coding and noncoding ones. Coding sequences are those that are read during the transcription. The identification of coding sequences has been widely reported in literature due to its much-studied periodicity. Noncoding sequences represent the majority of the human genome. They play an important role in gene regulation and differentiation among the cells. However, noncoding sequences do not exhibit periodicities that correlate to their functions. The ENCODE (Encyclopedia of DNA elements) and Epigenomic Roadmap Project projects have cataloged the human noncoding sequences into specific functions. We study characteristics of noncoding sequences with wavelet analysis of genomic signals.